Add TCP DNS transport with connection limits, idle timeout, and frame guard (To-Do)#1272

Open
Ronitsabhaya75 wants to merge 11 commits into apple:main from Ronitsabhaya75:vmnet-limitaton

@Ronitsabhaya75
Contributor

@Ronitsabhaya75 Ronitsabhaya75 commented Feb 26, 2026

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

Motivation and Context

Right now, our DNS server only handles UDP. Without EDNS0, a DNS message over UDP is limited to 512 bytes, so any larger response (for example, when resolving many container names at once) is silently cut off. The standard way to handle this is for clients to retry over TCP when they receive a response with the TC (truncated) bit set.
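
As a rough illustration of the fallback trigger described above, here is a minimal sketch of how a client could detect truncation in a raw DNS message. The function name `isTruncated` is hypothetical and not taken from this PR; the only facts it relies on are the 12-byte DNS header and the TC flag at mask 0x02 in the third header byte (RFC 1035, section 4.1.1).

```swift
/// Returns true when a raw DNS message has the TC (truncated) flag set.
/// Hypothetical helper for illustration; not part of this PR.
func isTruncated(_ message: [UInt8]) -> Bool {
    // A DNS header is always 12 bytes; anything shorter cannot be inspected.
    guard message.count >= 12 else { return false }
    // Byte 2 holds QR, Opcode, AA, TC, RD; TC is the 0x02 bit.
    return (message[2] & 0x02) != 0
}
```

A client that sees `true` here would normally repeat the same query over TCP on the same port.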

The Fix: I've added a TCP listener that runs alongside the UDP server on the same port, handling queries that need more space.

Key changes:

  • Refactored processRaw(data:) so both UDP and TCP share the exact same query logic.
  • Added proper TCP framing (2-byte length prefix as per RFC 1035).
  • Built-in safeguards: 128 max concurrent connections, 30s idle timeouts, and a 4096-byte frame limit.
  • Per-connection error handling so one bad request won't bring down the whole server.

The TCP accept loop follows the NIO 2.86 async pattern:

// Correct NIO 2.86 async pattern for TCP accept loop
let server = try await ServerBootstrap(group: NIOSingletons.posixEventLoopGroup)
    .serverChannelOption(.socketOption(.so_reuseaddr), value: 1)
    .bind(host: host, port: port) { channel in
        channel.eventLoop.makeCompletedFuture {
            try NIOAsyncChannel(
                wrappingChannelSynchronously: channel,
                configuration: .init(
                    inboundType: ByteBuffer.self,
                    outboundType: ByteBuffer.self
                )
            )
        }
    }

try await server.executeThenClose { inbound in
    try await withThrowingDiscardingTaskGroup { group in
        for try await child in inbound {
            guard connections.tryIncrement(limit: Self.maxConcurrentConnections) else { continue }
            group.addTask {
                defer { self.connections.decrement() }
                await self.handleTCP(channel: child)
            }
        }
    }
}
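
The accept loop above calls `connections.tryIncrement(limit:)` and `connections.decrement()` without showing how that counter works. A minimal sketch of one way such a counter could be written is below. The type name `ConnectionCounter`, the `current` property, and the use of `NSLock` are assumptions for illustration; the PR's actual implementation may differ.

```swift
import Foundation

/// Hypothetical thread-safe connection counter; an assumption, not the PR's code.
final class ConnectionCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var count = 0

    /// Reserves a slot and returns true, or returns false when `limit` is already reached.
    func tryIncrement(limit: Int) -> Bool {
        lock.lock()
        defer { lock.unlock() }
        guard count < limit else { return false }
        count += 1
        return true
    }

    /// Releases a slot previously reserved with `tryIncrement(limit:)`.
    func decrement() {
        lock.lock()
        defer { lock.unlock() }
        if count > 0 { count -= 1 }
    }

    /// Current number of reserved slots.
    var current: Int {
        lock.lock()
        defer { lock.unlock() }
        return count
    }
}
```

Checking and incrementing under a single lock acquisition is what keeps two simultaneous accepts from both taking the last slot. A connection rejected by the guard would presumably also need to be closed rather than only skipped; the snippet above does not show that step.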

Testing

  • Tested locally
  • Added/updated tests
  • Added/updated docs

@Ronitsabhaya75
Contributor Author

[TEST] Test 1: UDP Fallback & Basic Resolution
  $ dig @127.0.0.1 -p 2053 +notcp localhost A
[PASS] DNS UDP connection established properly
; <<>> DiG 9.10.6 <<>> @127.0.0.1 -p 2053 +notcp localhost A
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 39591
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0



[TEST] Test 2: Native TCP Resolution 
  $ dig @127.0.0.1 -p 2053 +tcp localhost A
[PASS] TCP connection successfully established
[PASS] DNS response over TCP received via DNSServer+TCPHandle
; <<>> DiG 9.10.6 <<>> @127.0.0.1 -p 2053 +tcp localhost A
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 40221
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0


[TEST] Test 3: Truncated Over-sized UDP (TC Bit Fallback)
  $ dig @127.0.0.1 -p 2053 +bufsize=40 localhost A
[PASS] Initial UDP request successfully truncated
[PASS] Client automatically retried with TCP fallback
; <<>> DiG 9.10.6 <<>> @127.0.0.1 -p 2053 +bufsize=40 localhost A
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 62412
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0



[TEST] Test 4: Container Swift Unit Tests

  $ swift test --filter DNSServerTests
[PASS] Test run with 18 tests in 6 suites passed after 0.322 seconds.
✔ Suite HostTableResolverTest passed (0.001s)
✔ Suite NxDomainResolverTest passed (0.001s)
✔ Suite CompositeResolverTest passed (0.001s)
✔ Suite StandardQueryValidatorTest passed (0.001s)
✔ Suite ProcessRawTest passed (0.006s)
✔ Test testTCPDropsOversizedFrame() passed (0.010s)
✔ Test testTCPRoundTrip() passed (0.010s)
✔ Test testTCPPipelinedQueries() passed (0.010s)
✔ Test testTCPIdleTimeoutDropsConnection() passed (0.321s)
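
For reference, the framing behaviour that the round-trip, pipelined-query, and oversized-frame tests exercise can be sketched independently of NIO. This is a simplified illustration and not the code in this PR: `FrameDecoder`, `FrameError`, and `encodeFrame` are hypothetical names, and it assumes the 4096-byte limit applies to the value of the 2-byte length prefix.

```swift
/// Error raised when a peer announces a frame larger than the configured limit.
enum FrameError: Error, Equatable {
    case oversized(Int)
}

/// Hypothetical incremental decoder for DNS-over-TCP frames (2-byte big-endian length prefix).
struct FrameDecoder {
    let maxFrameLength: Int
    private var buffer: [UInt8] = []

    init(maxFrameLength: Int = 4096) {
        self.maxFrameLength = maxFrameLength
    }

    /// Appends received bytes and returns every complete message now available.
    mutating func append(_ bytes: [UInt8]) throws -> [[UInt8]] {
        buffer.append(contentsOf: bytes)
        var messages: [[UInt8]] = []
        while buffer.count >= 2 {
            let length = (Int(buffer[0]) << 8) | Int(buffer[1])
            // Reject before buffering the body so a hostile peer cannot make us hold a huge frame.
            guard length <= maxFrameLength else { throw FrameError.oversized(length) }
            // Wait for more data if the body has not fully arrived yet.
            guard buffer.count >= 2 + length else { break }
            messages.append(Array(buffer[2..<(2 + length)]))
            buffer.removeFirst(2 + length)
        }
        return messages
    }
}

/// Prefixes a DNS message with its length; returns nil if it does not fit in 16 bits.
func encodeFrame(_ message: [UInt8]) -> [UInt8]? {
    guard message.count <= Int(UInt16.max) else { return nil }
    return [UInt8(message.count >> 8), UInt8(message.count & 0xFF)] + message
}
```

Because the decoder keeps leftover bytes between calls, two queries sent back to back on one connection come out as two separate messages, and a frame split across reads is only returned once its body is complete.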

@Ronitsabhaya75 Ronitsabhaya75 marked this pull request as ready for review February 27, 2026 03:16
@Ronitsabhaya75 Ronitsabhaya75 changed the title Add TCP DNS transport with connection limits, idle timeout, and frame… Add TCP DNS transport with connection limits, idle timeout, and frame guard Feb 27, 2026
Co-authored-by: renish avaiya <renishpatel2482001@gmail.com>
@Ronitsabhaya75 Ronitsabhaya75 marked this pull request as draft February 27, 2026 04:02
@Ronitsabhaya75 Ronitsabhaya75 marked this pull request as ready for review February 27, 2026 18:29
@Ronitsabhaya75 Ronitsabhaya75 changed the title Add TCP DNS transport with connection limits, idle timeout, and frame guard Add TCP DNS transport with connection limits, idle timeout, and frame guard (To-Do) Mar 3, 2026
@Renish-patel

Hey @jglogan, could you take a look at this TCP implementation when you get a chance? It's been on the to-do list for a long time, and I think it'll be a solid improvement for the network going forward.
