Releases: semihalev/sdns
v1.6.5
Patch release for 1.6.3. Major focus on RFC 5011 trust-anchor correctness, DNSSEC validation hardening, and listener lifecycle. Also closes a build-tag bug that prevented 1.6.4 from releasing on FreeBSD/NetBSD/OpenBSD/DragonFly.
Note: 1.6.4 was tagged but never published: the goreleaser pipeline failed on `freebsd_amd64` because of the `reuseport_*` build constraint bug fixed in this release. 1.6.5 is the first published release that includes the trust-anchor work below.
What's Changed
Trust Anchors (RFC 5011)
A full pass over middleware/resolver/auto_trust_anchor.go to bring the resolver into alignment with RFC 5011 §2 / §4 and to harden persistence against partial failures. Highlights:
- `verifyFetchedKeys` is now correct under KSK rollover. At-least-one-trusted-anchor RRSIG semantics with a narrow revoked-bootstrap carve-out (RFC 5011 §2.1: a revoked key may authenticate the RRset that contains it, but only for the purpose of validating its own revocation). Returns a split-mode flag so a revoked-only proof can tombstone the matching key but cannot seed AddPend or mark other anchors missing.
- Revocation requires a self-signed RRSIG and key-material match, not just key-tag arithmetic. Defends against 16-bit tag collisions where an unrelated self-signed key could otherwise be admitted as a revocation of the real anchor.
- Tombstones moved out of `kskCurrent` into a material-keyed store with its own state file. Tag collisions with future legitimate KSKs can no longer suppress them.
- Missing keys remain valid trust-point keys until the remove hold-down expires (§4.2). Missing→Valid restoration on KeyPres. AddPend reset on KeyRem (the hold-down aborts cleanly instead of drifting through Missing). Missing-aged-out simply deletes (§2.4.2 is bookkeeping); only RevBit revocations tombstone permanently.
- Configured-merge uses an immutable `cfg.RootKeys` snapshot taken at startup, honours tombstones by key material, filters seeded entries on load, and refuses to resurrect a stale admin-config anchor that the root has revoked.
- Atomic gob writes (CreateTemp + fsync + rename + parent dirsync) with tombstones-first ordering. New revocations dual-write to a `StateRevoked` marker so a tombstone-write failure survives across retries; selective fail-closed only when an actual contraction would otherwise be lost on disk.
- New `errTrustAnchorsUnavailable` gates `answer` / `authority` / `validateDelegation` and the two delegation-cache `Set` sites, so an empty trust set fails closed with SERVFAIL instead of slipping into the "unsigned delegation" branch. AutoTA's own DNSKEY query runs CD=true so it doesn't depend on `r.rootKeys`.
Bug Fixes
- Build reuseport file on all BSDs, not just darwin. The file was named `reuseport_darwin.go`, which Go treats as an implicit GOOS=darwin build constraint. The explicit `//go:build darwin || freebsd || netbsd || openbsd || dragonfly` was ANDed against that, so freebsd/netbsd/openbsd/dragonfly all failed to link with `undefined: defaultUDPWorkers / kernelLoadBalances / reusePortControl`. Renamed to `reuseport_bsds.go` so the explicit tag governs.
- Windows compatibility for AutoTA persistence. Split the post-rename directory fsync into platform-specific helpers (POSIX does it; Windows is a no-op since FlushFileBuffers on a directory handle requires GENERIC_WRITE which `os.Open` doesn't grant). Tombstones-file open errors now distinguish Windows sharing violations from real corruption; only decode failures fail closed.
- DNSSEC validation hardening + DNAME correctness. SERVFAIL when a signed zone omits RRSIG. Multiple correctness fixes around DNAME synthesis and parallel lookup paths.
- Five correctness bugs in the parallel lookup path.
- Listener lifecycle: explicit fail-fast binds. Listener startup now fails immediately if a bind cannot be established, instead of silently degrading.
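The interaction between the file name and the explicit tag in the reuseport fix can be checked with the standard `go/build` package. The sketch below asks `build.Context.MatchFile` whether a file would be compiled for a given GOOS; the `header` constant and the package name in it are stand-ins for the real file, which is not reproduced here.

```go
package main

import (
	"fmt"
	"go/build"
	"io"
	"strings"
)

// header stands in for the top of the reuseport file (assumed contents).
const header = "//go:build darwin || freebsd || netbsd || openbsd || dragonfly\n\npackage reuseport\n"

// included reports whether the go tool would compile a file with the given
// name and the header above when targeting goos.
func included(filename, goos string) bool {
	ctxt := build.Default // copy, so the global default stays untouched
	ctxt.GOOS = goos
	ctxt.GOARCH = "amd64"
	// Serve the header from memory instead of reading a real file.
	ctxt.OpenFile = func(string) (io.ReadCloser, error) {
		return io.NopCloser(strings.NewReader(header)), nil
	}
	ok, err := ctxt.MatchFile("", filename)
	return err == nil && ok
}

func main() {
	for _, name := range []string{"reuseport_darwin.go", "reuseport_bsds.go"} {
		for _, goos := range []string{"darwin", "freebsd", "linux"} {
			fmt.Printf("%-20s GOOS=%-8s included=%v\n", name, goos, included(name, goos))
		}
	}
}
```

With the `_darwin` suffix the file is rejected on freebsd before the `//go:build` line is even consulted; with the neutral `_bsds` suffix only the explicit expression decides.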
Performance
- Pool `net.Dialer` and bypass `DialContext` on UDP upstream. Per-query allocation cut on the recursive hot path.
- Pre-build hostsfile answer RRs at load time instead of constructing them per query.
Refactor / Naming
- Package `authcache` → `authority` (split into `server.go` + `cache.go`). Type renames: `AuthServer`/`AuthServers` → `Server`/`Servers`; `NSCache` → `Cache`; `NS` → `Delegation`; `DSRR` → `DSSet`; `Version` → `IPVersion`.
- `parentDSRR`/`parentdsrr` → `parentDS`.
- `accesslist.AccessList` → `accesslist.List`; `accesslog.AccessLog` → `accesslog.Log` (config field names preserved).
- `r.ncache` → `r.delegations`; `nameservers` map type → `hostSet`; `nameserverInfo` → `delegationInfo` with `hosts` field.
- `rootservers`/`rootkeys` (smashed lowercase) → `rootServers`/`rootKeys`.
- `ipv4cache`/`ipv6cache` → `glueV4`/`glueV6`.
- Internal sub-pipeline now flows through the `Queryer` interface; `util.ExchangeInternal` retired.
Dependencies
- `github.com/semihalev/zlog/v2` → v2.0.6.
- `k8s.io/apimachinery` → 0.36.0.
- `k8s.io/client-go` → 0.36.0.
- `codecov/codecov-action` → v6.
Full Changelog: v1.6.3...v1.6.5
v1.6.3
Patch release for 1.6.2. Recommended upgrade for anyone on 1.6.2 — it fixes a startup panic affecting any setup whose config disables a middleware that sits before blocklist in the chain (e.g. reflex, kubernetes, hostsfile).
What's Changed
Bug Fixes
- Fix `middleware.Get("name")` returning the wrong handler when a disabled (typed-nil) middleware sits earlier in the chain. In 1.6.2 this caused `panic: interface conversion: middleware.Handler is *kubernetes.Kubernetes, not *blocklist.BlockList` at startup whenever the API was enabled. The lookup now resolves by name against an O(1) map on an immutable compiled pipeline, so the index can no longer drift.
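For readers unfamiliar with the typed-nil pitfall behind this bug, the sketch below shows why a disabled middleware stored in an interface is not equal to `nil`, and how a name-keyed lookup that skips such entries avoids handing back the wrong handler. All types and helpers here (`Handler`, `newKubernetes`, `isNil`, `index`) are simplified, hypothetical stand-ins, not the sdns types.

```go
package main

import (
	"fmt"
	"reflect"
)

// Handler is a hypothetical stand-in for a middleware interface.
type Handler interface{ Name() string }

type Kubernetes struct{}
type BlockList struct{}

func (*Kubernetes) Name() string { return "kubernetes" }
func (*BlockList) Name() string  { return "blocklist" }

// newKubernetes models a constructor that returns a nil pointer when the
// middleware is disabled in the config.
func newKubernetes(enabled bool) *Kubernetes {
	if !enabled {
		return nil
	}
	return &Kubernetes{}
}

// isNil reports true for a nil interface and for an interface that wraps a
// nil pointer (a "typed nil").
func isNil(h Handler) bool {
	if h == nil {
		return true
	}
	v := reflect.ValueOf(h)
	return v.Kind() == reflect.Pointer && v.IsNil()
}

// index builds a name-keyed lookup table, skipping disabled handlers.
func index(chain []Handler) map[string]Handler {
	byName := make(map[string]Handler, len(chain))
	for _, h := range chain {
		if isNil(h) {
			continue
		}
		byName[h.Name()] = h
	}
	return byName
}

func main() {
	chain := []Handler{newKubernetes(false), &BlockList{}}

	// The interface value carries a type, so the plain nil check misses it.
	fmt.Println("chain[0] == nil:", chain[0] == nil)
	fmt.Println("isNil(chain[0]):", isNil(chain[0]))

	byName := index(chain)
	_, ok := byName["blocklist"].(*BlockList)
	fmt.Println("blocklist resolves to *BlockList:", ok)
}
```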
Performance
Redesigned the middleware package around an immutable Pipeline published via atomic.Pointer, and eliminated the per-query allocation inside responseWriter.Reset.
Microbenchmarks on Apple M4 / Go 1.26 for the realistic 15-handler production chain (BenchmarkChainNext/15):
| Version | ns/op | B/op | allocs/op |
|---|---|---|---|
| 1.6.2 | 79.3 | 32 | 2 |
| 1.6.3 | 32.8 | 0 | 0 |
That's ~58% less time per chain walk (roughly 2.4x faster), with zero allocations on the hot path. At 100k QPS that removes ~200k allocs/s and ~3.2 MB/s of GC pressure per core. middleware.Get is ~4x faster (21.9 → 5.3 ns/op) though that's off the request path.
Internal
- Middleware package redesigned: `Registry` + immutable `Pipeline` + lock-free `atomic.Pointer` reads; constructors no longer run under a write lock; `Handlers()` / `Get()` / `Ready()` are race-free by construction. All middleware packages kept their existing Handler API, so no downstream changes are needed.
- `Chain.Next` drops the unreachable modulo and the unused `tail` field.
- `responseWriter.Reset` replaces `RemoteAddr().String() == "127.0.0.255:0"` with a direct IP+port compare against a preparsed sentinel.
- Build badge in the README now points at `ci.yml` after the workflow consolidation.
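The general shape of an immutable pipeline published through `atomic.Pointer` looks roughly like the sketch below. It illustrates the pattern only: the `Registry`, `Pipeline`, and `Handler` types here are simplified stand-ins with assumed fields and methods, not the actual sdns definitions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Handler is a hypothetical stand-in for a middleware handler.
type Handler interface{ Name() string }

type named string

func (n named) Name() string { return string(n) }

// Pipeline is never mutated after newPipeline returns, so any number of
// goroutines may read it without locking.
type Pipeline struct {
	handlers []Handler
	byName   map[string]Handler
}

func newPipeline(hs ...Handler) *Pipeline {
	p := &Pipeline{
		handlers: append([]Handler(nil), hs...), // private copy
		byName:   make(map[string]Handler, len(hs)),
	}
	for _, h := range p.handlers {
		p.byName[h.Name()] = h
	}
	return p
}

// Registry publishes the current pipeline with a single atomic store.
type Registry struct {
	current atomic.Pointer[Pipeline]
}

// Publish swaps in a fully built pipeline; readers see old or new, never a mix.
func (r *Registry) Publish(p *Pipeline) { r.current.Store(p) }

// Get resolves a handler by name with one atomic load and one map lookup.
func (r *Registry) Get(name string) Handler {
	p := r.current.Load()
	if p == nil {
		return nil
	}
	return p.byName[name]
}

func main() {
	var r Registry
	r.Publish(newPipeline(named("cache"), named("resolver")))
	fmt.Println(r.Get("resolver").Name())
	fmt.Println(r.Get("missing") == nil)
}
```

Because a rebuild constructs a new `Pipeline` off to the side and swaps the pointer at the end, constructors do not need to hold a lock that readers contend on.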
New Tests
- `Test_Get_SkipsDisabled`: regression guard for the 1.6.2 crash.
- `Test_Registry_Build_ConcurrentReads`: race coverage for lock-free pipeline reads.
- `BenchmarkChainNext`, `BenchmarkGet`, `BenchmarkGet_Legacy`: so the performance numbers are reproducible.
Full Changelog: v1.6.2...v1.6.3
v1.6.2
What's Changed
New Features
- Add `Cache.ForEachEntry()` and `Cache.ForEach()` for iterating cache entries (#452)
Bug Fixes
- DNSSEC: return SERVFAIL when a signed zone omits RRSIG records (#447)
- DNSSEC: correctly determine zone security for missing-signature responses by extracting `isZoneSecure()` and probing the actual delegation point instead of arbitrary internal names, per RFC 4034 §5 and RFC 4035 §5.3.3 (#450)
- middleware: skip typed-nil handlers in `Setup` so disabled middlewares (reflex, accesslist, kubernetes, hostsfile, blocklist, etc.) can no longer crash `ServeDNS` with a nil-pointer dereference on the first request (#455, fixes #453)
- Fix gosec G118 warning in API server shutdown goroutine
- Fix gosec G118 and G122 linter warnings
CI / Infrastructure
- Consolidate GitHub Actions workflows from 8 to 5 (`ci.yml`, `docker.yml`, `codeql.yml`, `release.yml`, `claude.yml`)
- Migrate Docker publish from the retired `docker.pkg.github.com` to `ghcr.io`; a single buildx invocation now pushes multi-arch images to both Docker Hub and GHCR
- Pin GoReleaser CLI to `~> v2` and bump `goreleaser-action` to v7.1.0 for stable releases
- All workflows now track `go-version: stable`, declare least-privilege `permissions:`, and use `concurrency` groups
- Fix gosec linter failures for golangci-lint v2 compatibility
Testing
- Add fuzz tests for cache, config, blocklist, hostsfile, resolver, doh, and util packages
Dependencies
- Bump `github.com/miekg/dns` from 1.1.68 to 1.1.72
- Bump `github.com/quic-go/quic-go` from 0.57.1 to 0.59.0
- Bump `k8s.io/{api,apimachinery,client-go}` from 0.34.2 to 0.35.4
- Bump `github.com/BurntSushi/toml` from 1.5.0 to 1.6.0
- Bump `github.com/spf13/cobra` from 1.10.1 to 1.10.2
- Bump `golang.org/x/net` from 0.47.0 to 0.53.0
- Bump `golang.org/x/crypto` from 0.45.0 to 0.50.0
- Other dependency updates
New Contributors
- @linkdata made their first contribution in #452
- @MaciejTe made their first contribution in #447
- @SAY-5 made their first contribution in #455
Full Changelog: v1.6.1...v1.6.2
v1.6.1
What's Changed
New Features
- Add Reflex middleware for DNS amplification/reflection attack detection (#438)
  - Behavioral scoring based on query patterns (rate, types, amplification)
  - TCP connection proves real IP (clears suspicion)
  - Bounded memory with automatic cleanup (100K IPs max)
  - Learning mode for threshold tuning
  - Prometheus metrics for monitoring
- Add config validation flag `-t`/`--test` for CI/CD pipelines (#439)
- Add cache Prometheus metrics (hits, misses, evictions, prefetches, size, hit rate) (#439)
Bug Fixes
- Fix rate limiter CPU spike (140% at 800 QPS) under random IP attacks (#403)
- Fix concurrent CopyTo panic in resolver (#425)
- Fix response silently truncated without TC bit being set
- Fix MaxConcurrentQueries default value (#425)
- Fix Windows console ANSI escape codes issue (#401)
Improvements
- Add circuit breaker pattern to prevent querying known-down servers (#425)
- Add goroutine limiting (MaxConcurrentQueries) to prevent resource exhaustion (#425)
- Update zlog to v2 with improved Windows terminal support
Testing
- Add comprehensive tests for util, cache, resolver, kubernetes, blocklist packages
- Improved overall test coverage
Dependencies
- Bump github.com/miekg/dns from 1.1.66 to 1.1.68
- Bump github.com/quic-go/quic-go from 0.53.0 to 0.55.0
- Bump k8s.io/client-go from 0.33.2 to 0.34.2
- Bump golang.org/x/crypto from 0.41.0 to 0.45.0
- Bump github.com/prometheus/client_golang
- Other dependency updates
Full Changelog: v1.6.0...v1.6.1
v1.6.0
What's Changed
Major Performance Improvements
- Improve cache performance with new segment-based implementation (#395)
  - 98.6% performance improvement (from 1359 ns/op to 18.66 ns/op)
  - Zero allocations for all operations
  - Better concurrency with 256 segments
- Fix SyncUInt64Map memory leak (#394)
  - Fixed critical memory leak where deleted nodes were never freed
  - Implemented backward shift deletion
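To show why splitting a cache into 256 segments helps concurrency, the sketch below shards a map by the low bits of a 64-bit key so that unrelated keys rarely contend on the same lock. It is a simplified illustration: the real cache's entry type, hashing, and eviction are not described in these notes, and `Cache`, `segment`, and the `[]byte` value type are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// segmentCount is a power of two so the segment index is a cheap bit mask.
const segmentCount = 256

type segment struct {
	mu sync.RWMutex
	m  map[uint64][]byte
}

// Cache spreads entries over independent segments, each with its own lock.
type Cache struct {
	segments [segmentCount]segment
}

func NewCache() *Cache {
	c := &Cache{}
	for i := range c.segments {
		c.segments[i].m = make(map[uint64][]byte)
	}
	return c
}

// segmentFor assumes key is already a well-mixed hash of the question.
func (c *Cache) segmentFor(key uint64) *segment {
	return &c.segments[key&(segmentCount-1)]
}

func (c *Cache) Get(key uint64) ([]byte, bool) {
	s := c.segmentFor(key)
	s.mu.RLock()
	v, ok := s.m[key]
	s.mu.RUnlock()
	return v, ok
}

func (c *Cache) Set(key uint64, value []byte) {
	s := c.segmentFor(key)
	s.mu.Lock()
	s.m[key] = value
	s.mu.Unlock()
}

func main() {
	c := NewCache()
	c.Set(42, []byte("answer"))
	v, ok := c.Get(42)
	fmt.Println(string(v), ok)
}
```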
New Features
- Add Kubernetes DNS middleware with killer mode (#385)
- Implement endpoint support for headless services in sharded registry (#386)
- Add automatic TLS certificate reloading and cache memory optimization (#392)
- Added shared rate limiter
Bug Fixes
- Fix quic api issues
- Fix cache issues
- Fix K8s issues
Full Changelog: v1.5.3...v1.6.0
v1.5.3
What's Changed
Features
- Add Extended DNS Error (EDE) support with proper caching
- Add wildcard domain support to blocklist (*.example.com)
- Modernize codebase for Go 1.23+ compatibility
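One common way to support `*.example.com` entries without scanning the whole list is to store the wildcard suffix in a set and walk the query name one label at a time. The sketch below takes that approach; it is not necessarily how sdns does it, the `Blocklist` type is hypothetical, and it assumes a wildcard covers subdomains only (not `example.com` itself).

```go
package main

import (
	"fmt"
	"strings"
)

// Blocklist holds exact names and wildcard suffixes in separate sets.
type Blocklist struct {
	exact    map[string]struct{}
	wildcard map[string]struct{} // "example.com." for "*.example.com"
}

func NewBlocklist() *Blocklist {
	return &Blocklist{exact: map[string]struct{}{}, wildcard: map[string]struct{}{}}
}

// normalize lowercases the name and makes it fully qualified.
func normalize(name string) string {
	return strings.ToLower(strings.TrimSuffix(name, ".")) + "."
}

// Add registers either an exact name or a "*." wildcard pattern.
func (b *Blocklist) Add(pattern string) {
	p := normalize(pattern)
	if rest, ok := strings.CutPrefix(p, "*."); ok {
		b.wildcard[rest] = struct{}{}
		return
	}
	b.exact[p] = struct{}{}
}

// Blocked does one exact lookup plus one wildcard lookup per parent domain,
// so the cost depends on the number of labels, not on the list size.
func (b *Blocklist) Blocked(name string) bool {
	n := normalize(name)
	if _, ok := b.exact[n]; ok {
		return true
	}
	for {
		_, rest, ok := strings.Cut(n, ".")
		if !ok || rest == "" {
			return false
		}
		if _, hit := b.wildcard[rest]; hit {
			return true
		}
		n = rest
	}
}

func main() {
	b := NewBlocklist()
	b.Add("*.example.com")
	fmt.Println(b.Blocked("ads.cdn.example.com."))
	fmt.Println(b.Blocked("example.org."))
}
```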
Bug Fixes
- Fix EDE information not being preserved in cached responses
- Fix goroutine leak in resolver fallback handling
- Fix context timeout errors returning blank EDE messages
- Fix DoQ test failures by removing KeepAlivePeriod
- Fix blocklist SOA record placement
Performance Improvements
- Optimize blocklist wildcard matching (360x faster for large lists)
- Add fast path for empty blocklists
- Use math/rand/v2 for better performance
- Optimize extractRRSet for single-type queries
Code Quality
- Replace interface{} with any throughout codebase
- Use slices package for slice operations
- Use clear() builtin for map/slice clearing
- Add comprehensive tests for EDE functionality
- Improve error handling with errors.Is()
Other Changes
- Rename zregister.go to registry.go for clarity
- Improve snap packaging with automatic release detection
- Add GitHub Actions workflow for snap releases
- Update documentation and examples
Full Changelog
v1.5.2
v1.5.1
What's Changed
Critical Bug Fixes
- Fixed NSEC validation for non-existent TLDs (e.g., random TLD queries)
- Fixed NXDOMAIN responses being incorrectly returned as NOERROR from cache
- Fixed goroutine leak in DNS resolver's singleflight operations
- Raised SERVFAIL cache TTL from 5 seconds to 30 seconds to prevent repeated queries to failed servers
DNSSEC Improvements
- Implemented RFC 8914 Extended DNS Errors (EDE) support for better diagnostics
- Fixed RRSIG expiration handling in cache TTL calculations
- Improved NSEC/NSEC3 validation for better RFC compliance
- Fixed AD flag handling for authenticated responses
Performance Improvements
- Zero-allocation cache key generation using sync.Pool (~25% faster)
- Zero-allocation logging with migration to zlog v1.2.3
- Optimized NSEC coverage checks for better performance
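The `sync.Pool` approach to cache key generation can be sketched as below: a pooled scratch buffer is reused across calls so building the key material does not allocate in the steady state. The key layout (qtype, qclass, lowercased name) and the inline FNV-1a hash are assumptions made for the example; the notes do not specify what sdns hashes or how.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool hands out reusable scratch buffers. Storing *[]byte avoids an
// extra allocation when the slice header is boxed into the pool.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 256)
		return &b
	},
}

// cacheKey hashes (qtype, qclass, lowercased name) into a 64-bit key.
// Hypothetical layout, chosen only to illustrate the pooling.
func cacheKey(name string, qtype, qclass uint16) uint64 {
	bp := bufPool.Get().(*[]byte)
	buf := (*bp)[:0]

	buf = append(buf, byte(qtype>>8), byte(qtype), byte(qclass>>8), byte(qclass))
	for i := 0; i < len(name); i++ {
		c := name[i]
		if 'A' <= c && c <= 'Z' {
			c += 'a' - 'A' // DNS names compare case-insensitively
		}
		buf = append(buf, c)
	}

	// FNV-1a, written inline so no hash object is allocated.
	const (
		offset64 = 14695981039346656037
		prime64  = 1099511628211
	)
	h := uint64(offset64)
	for _, c := range buf {
		h ^= uint64(c)
		h *= prime64
	}

	*bp = buf // keep any growth for the next caller
	bufPool.Put(bp)
	return h
}

func main() {
	a := cacheKey("Example.COM.", 1, 1)
	b := cacheKey("example.com.", 1, 1)
	fmt.Println(a == b)
}
```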
Other Changes
- Migrated from log to zlog v1.2.3 for better performance
- Improved error messages with typed errors
- Better handling of single-name zones and edge cases
- Updated various dependencies for security and performance
What's Fixed
This release primarily addresses critical bugs discovered after v1.5.0, with a focus on DNSSEC validation correctness and cache performance. The SERVFAIL caching fix significantly improves performance when dealing with unreachable or failing authoritative servers.
Docker
docker pull ghcr.io/semihalev/sdns:v1.5.1
Checksums
Checksums will be automatically added by the release workflow.
Full Changelog: v1.5.0...v1.5.1
v1.5.0
What's Changed
Overview
SDNS v1.5.0 is a major release featuring comprehensive modernization, significant performance improvements, and new enterprise-grade features. This release marks a substantial upgrade with enhanced monitoring capabilities, improved caching mechanisms, and better operational visibility.
New Features
1. Domain Metrics with Prometheus Integration
- Per-domain query metrics: Track query counts for individual domains
- Prometheus integration: Export metrics in Prometheus format for monitoring
- Smart filtering: Automatically filters out single-label domains (TLDs)
- Configurable limits: Control maximum domains to track with `domainmetricslimit`
- Auto-eviction: Low-count domains are automatically evicted to maintain performance
- Configuration: Enable with `domainmetrics = "on"` in config
2. Dnstap Binary Logging Support
- Full dnstap protocol implementation: Industry-standard binary DNS message logging
- Unix socket support: Connect to dnstap collectors via Unix sockets
- Auto-reconnection: Automatic reconnection on connection failures
- Configurable logging: Choose to log queries, responses, or both
- Buffered writes: Efficient buffering for high-throughput environments
- 100% test coverage: Thoroughly tested implementation
3. TCP Connection Pooling with Keepalive
- Persistent TCP connections: Reuse TCP connections to root and TLD servers
- EDNS-Keepalive support: Properly implements RFC 7828 for connection persistence
- Configurable timeouts: Separate timeouts for root (5s) and TLD servers (10s)
- Connection pool management: Automatic cleanup and connection limiting
- Significant latency reduction: Eliminates TCP handshake overhead for subsequent queries
- Configuration: Enable with `tcpkeepalive = true` in config
4. High-Performance Caching System
- New SyncUInt64Map: Generic high-performance map with atomic operations
- Parallel nameserver lookups: Improved resolver performance with concurrent queries
- Smart cache eviction: Random bucket sampling for efficient eviction
- Separate positive/negative caches: Better cache management and hit rates
- Prefetch queue improvements: More efficient DNS prefetching
Performance Improvements
Benchmark Results (June 2025)
- Query throughput: 709 QPS (24% improvement from v1.3.3's 571 QPS)
- Average latency: 137ms (significantly faster than previous versions)
- Lost queries: Only 2 out of 50,000 (99.996% success rate)
- Runtime: 70.5s for 50,000 queries (19% faster than v1.3.3's 87.5s)
- Cache hit rate: Significantly improved with new caching architecture
- Memory efficiency: Better memory utilization with optimized data structures
Key Optimizations
- TCP Connection Pooling: Eliminates repeated TCP handshakes to authoritative servers
- Parallel nameserver lookups for faster resolution
- Lock-free data structures with atomic operations
- Efficient random sampling for cache eviction (avoiding O(n) iterations)
- Migrated to `golang.org/x/sync/singleflight` for deduplication
Code Modernization
Go 1.23 Compatibility
- Updated to Go 1.23 with modern patterns
- Proper error wrapping with `%w` format
- Use of `strings.Cut` for string operations
- HTTP clients with proper timeouts
- Modern test patterns and benchmarks
Improved Code Quality
- Test coverage: Significantly improved across all packages
- Race condition fixes: Resolved multiple race conditions
- Linting: Updated to latest golangci-lint with comprehensive checks
- Documentation: Enhanced inline documentation and configuration comments
Configuration Enhancements
New Configuration Options
# TCP Connection Pooling
tcpkeepalive = true # Enable TCP connection pooling
roottcptimeout = "5s" # TCP idle timeout for root servers
tldtcptimeout = "10s" # TCP idle timeout for TLD servers
tcpmaxconnections = 100 # Maximum pooled TCP connections
# Domain metrics configuration
domainmetrics = "on" # Enable domain metrics
domainmetricslimit = 10000 # Maximum domains to track
# Dnstap configuration
dnstap = "on" # Enable dnstap logging
dnstappath = "/tmp/dnstap.sock" # Unix socket path
dnstapidentity = "sdns" # Server identity
dnstapversion = "1.5.0" # Server version
Configuration File Improvements
- Detailed inline documentation for all options
- Better organization and grouping
- Validation with helpful error messages
- Config test coverage improved from 77.8% to 90.0%
Bug Fixes
- Fixed race conditions in resolver's parallel lookup
- Fixed race condition in blocklist's timesSeen map
- Fixed root priming issues
- Improved error handling throughout the codebase
- Fixed Windows path handling in tests
Upgrading to v1.5.0
No Breaking Changes
All changes in v1.5.0 are backward compatible. Existing configurations will continue to work without modification.
Recommended Actions
- Review new configuration options (domain metrics, dnstap)
- Update monitoring to utilize new Prometheus metrics
- Consider enabling dnstap for comprehensive DNS logging
- Test in staging environment before production deployment
Performance Tuning
- Adjust `domainmetricslimit` based on your traffic patterns
- Configure cache sizes based on available memory
- Enable prefetching for frequently accessed domains
Dependencies Updated
- golang.org/x/net: v0.31.0 → v0.33.0
- golang.org/x/sync: Added for singleflight implementation
- golangci-lint: Updated to v1.62.2
- Various other dependency updates for security and performance
Contributors
Special thanks to @semihalev (Yasar Alev) for leading this major release with comprehensive improvements across the entire codebase.
What's Next
- Continued performance optimizations
- Additional monitoring capabilities
- Enhanced DNSSEC validation
- Improved documentation and examples
For detailed changes, see PR #375.
v1.4.0
What's Changed
- Added new root key
- Fix docker file.
- Updated various dependencies.
Full Changelog: v1.3.7...v1.4.0