WSL2 DNS: NODATA AAAA responses not cached by systemd-resolved, causing ~50 to 200ms latency per lookup #14568
Description
Summary
When using WSL2 with dnsTunneling (the default in mirrored networking mode), every DNS lookup for an IPv4-only hostname incurs an unnecessary full-round-trip penalty (typically 50–200ms depending on environment) because Ubuntu's systemd-resolved does not cache NODATA responses for AAAA queries. This is due to Ubuntu's compile-time patch that changes the default Cache= setting from upstream's yes to no-negative, which explicitly skips caching negative/NODATA DNS responses.
Since the DNS tunnel routes queries through the Windows host's DNS infrastructure — adding significant per-query latency — and modern applications perform parallel A + AAAA lookups, every resolution of an IPv4-only hostname pays the full tunnel round-trip cost on every lookup because the NODATA AAAA result is never cached.
Environment
- Windows: 11 Pro, Build 26100
- WSL Version: 2.4.13.0 (Kernel: 5.15.167.4-1)
- Distro: Ubuntu 24.04 LTS (Noble)
- systemd-resolved: v255.4-1ubuntu8.14
- Networking mode: Mirrored (networkingMode=mirrored in .wslconfig)
- DNS Tunneling: Enabled (default for mirrored mode)
Root Cause Analysis
The DNS Tunnel Architecture
When dnsTunneling is enabled, WSL configures:
- A tunnel endpoint at 10.255.255.254/32 on the loopback interface
- /etc/resolv.conf → nameserver 127.0.0.53 (systemd-resolved stub)
- systemd-resolved forwards queries to 10.255.255.254 (the WSL DNS tunnel)
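The chain above can be checked from inside the distro. The following is a minimal sketch; the helper names `first_nameserver` and `has_tunnel_addr` are made up for illustration, and the second helper assumes the one-line output format of `ip -o -4 addr show`.

```shell
# Print the first "nameserver" address found in a resolv.conf-style file.
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "$1"
}

# Succeed if `ip -o -4 addr show dev lo` style text on stdin lists the tunnel address.
has_tunnel_addr() {
    grep -q 'inet 10\.255\.255\.254/32'
}

# Typical use inside WSL (not run automatically here):
#   first_nameserver /etc/resolv.conf        # expected: 127.0.0.53
#   ip -o -4 addr show dev lo | has_tunnel_addr && echo "tunnel endpoint present"
```

Both helpers only read text, so they can also be run against saved output from another machine.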
The tunnel routes DNS queries from Linux through the Windows host's DNS infrastructure. This adds significant latency per round-trip (measured at 90–160ms in our environment; varies by DNS server distance, VPN, and network topology).
The NODATA Problem
When an application resolves a hostname (e.g., via getaddrinfo()), the resolver typically sends both an A (IPv4) and AAAA (IPv6) query. For IPv4-only hostnames (which are still very common — especially on corporate/enterprise networks), the AAAA query returns a NODATA response: RCODE=NOERROR but with zero answer records.
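To make the NODATA definition concrete, here is a small sketch that classifies a reply from dig-style header text. `classify_reply` is a hypothetical helper, and it simply treats NOERROR with zero answer records as NODATA, as defined above.

```shell
# Classify a DNS reply from dig-style header text on stdin.
# Prints NODATA, NXDOMAIN, POSITIVE, or OTHER.
classify_reply() {
    awk '
        /->>HEADER<<-/ {
            # e.g. ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4242"
            for (i = 1; i <= NF; i++)
                if ($i == "status:") { status = $(i + 1); sub(/,$/, "", status) }
        }
        /ANSWER:/ {
            # e.g. ";; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1"
            for (i = 1; i <= NF; i++)
                if ($i == "ANSWER:") answers = $(i + 1) + 0
        }
        END {
            if (status == "NXDOMAIN") print "NXDOMAIN"
            else if (status == "NOERROR" && answers == 0) print "NODATA"
            else if (status == "NOERROR") print "POSITIVE"
            else print "OTHER"
        }'
}

# Typical use, if dig is installed (not run automatically here):
#   dig AAAA intranet-server.example.com | classify_reply
```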
Ubuntu's Cache=no-negative Default
Ubuntu applies the patch UBUNTU-resolved-default-no-negative-caching.patch to systemd, changing the compiled-in default for the Cache= setting from upstream's yes to no-negative. This was introduced to work around LP #1668771 where caching SERVFAIL responses caused issues in OpenStack environments.
The no-negative mode means:
- Positive responses (A records, AAAA records with data) → cached ✅
- Negative responses (NXDOMAIN, NODATA) → not cached ❌
This is normally a reasonable trade-off on systems with low-latency DNS. But in WSL's tunnel architecture, the per-query penalty makes the lack of NODATA caching extremely costly.
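The behavior of the three Cache= values can be summarized as a decision table. This is only a model of the semantics described above, not systemd code; `should_cache` is a hypothetical name.

```shell
# should_cache MODE KIND
# MODE is a Cache= value (yes, no-negative, no); KIND is "positive" or "negative".
# Prints "cache" or "skip".
should_cache() {
    case "$1:$2" in
        yes:*)                echo cache ;;
        no-negative:positive) echo cache ;;
        *)                    echo skip ;;
    esac
}

should_cache no-negative negative   # prints "skip"  (the NODATA case in this issue)
should_cache yes negative           # prints "cache"
```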
The Impact Chain
Application calls getaddrinfo("intranet-server.example.com")
→ glibc sends A query + AAAA query to 127.0.0.53
→ systemd-resolved forwards both to 10.255.255.254 (DNS tunnel)
→ A query: returns 10.x.x.x → CACHED ✅ (positive)
→ AAAA query: returns NODATA (no IPv6) → NOT CACHED ❌ (negative)
Next call to getaddrinfo("intranet-server.example.com"):
→ A query: served from cache in ~1ms ✅
→ AAAA query: full tunnel round-trip again ❌ ← THIS IS THE BUG
Every subsequent resolution of the same IPv4-only hostname still pays the full round-trip for the uncached AAAA NODATA response.
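A rough cost model makes the cumulative effect visible. The sketch below assumes the A and AAAA queries run in parallel (so one lookup is as slow as its slowest query) and ignores TTL expiry; `total_ms` is a hypothetical helper and the numbers are the ones measured in the A/B test in this report.

```shell
# total_ms MODE LOOKUPS TUNNEL_RTT_MS CACHE_HIT_MS
# Estimated total wait for repeated lookups of one IPv4-only hostname.
total_ms() {
    mode=$1; n=$2; rtt=$3; hit=$4
    if [ "$n" -le 0 ]; then
        echo 0
        return
    fi
    case "$mode" in
        # First lookup goes to the tunnel, the rest are served from cache.
        yes) echo $(( rtt + (n - 1) * hit )) ;;
        # The AAAA NODATA answer is never cached, so every lookup pays the round trip.
        *)   echo $(( n * rtt )) ;;
    esac
}

total_ms no-negative 100 117 11   # prints 11700
total_ms yes 100 117 11           # prints 1206
```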
Empirical Proof
A/B Test: Cache=no-negative vs Cache=yes
Test domain: www.bankofamerica.com (IPv4-only, AAAA returns NODATA via Akamai CNAME chain)
With Cache=no-negative (Ubuntu default, no Cache= in config):
| Query | Time | CacheStats (size, hits, misses) | Source |
|---|---|---|---|
| Q1: AAAA --cache=no | 162ms | (3, 0, 0) | network |
| Q2: AAAA (normal) | 117ms | (3, 1, 1) | network (NODATA not cached) |
| Q3: AAAA (normal) | 118ms | (3, 2, 2) | network (NODATA not cached) |
Cache misses increment on every query. CNAME chain entries are cached (positive), but the terminal NODATA is not. Each query takes ~117ms.
With Cache=yes (explicit override):
| Query | Time | CacheStats (size, hits, misses) | Source |
|---|---|---|---|
| Q1: AAAA --cache=no | 229ms | (4, 0, 0) | network |
| Q2: AAAA (normal) | 11ms | (4, 2, 0) | cache |
| Q3: AAAA (normal) | 11ms | (4, 4, 0) | cache |
NODATA is cached. Subsequent queries served in 11ms with zero cache misses. 10x faster.
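The CacheStats columns in both tables can be compared mechanically. The sketch below assumes busctl prints the CacheStatistics property as `(ttt) SIZE HITS MISSES`; `stats_delta` is a hypothetical helper that reports how hits and misses changed between two snapshots.

```shell
# stats_delta BEFORE AFTER
# Each argument is one line of busctl output, e.g. "(ttt) 3 1 1".
stats_delta() {
    # Intentional word splitting: after this, $1..$4 hold the first snapshot
    # (type, size, hits, misses) and $5..$8 hold the second.
    set -- $1 $2
    echo "hits=$(( $7 - $3 )) misses=$(( $8 - $4 ))"
}

# Q2 -> Q3 with Cache=no-negative: one more miss per query.
stats_delta "(ttt) 3 1 1" "(ttt) 3 2 2"   # prints hits=1 misses=1
# Q2 -> Q3 with Cache=yes: hits only.
stats_delta "(ttt) 4 2 0" "(ttt) 4 4 0"   # prints hits=2 misses=0
```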
Large-Scale Impact: Multi-Domain Resolution
With 6+ search domains configured (common in enterprise environments), a single getaddrinfo() call for an unqualified hostname triggers AAAA lookups across all search domain suffixes:
hostname.search1.example.com AAAA → NODATA (not cached)
hostname.search2.example.com AAAA → NODATA (not cached)
hostname.search3.example.com AAAA → NODATA (not cached)
hostname.search4.example.com AAAA → NODATA (not cached)
hostname.search5.example.com AAAA → NODATA (not cached)
hostname.search6.example.com AAAA → NODATA (not cached)
This adds 600ms+ of latency to every hostname lookup. With Cache=yes, subsequent lookups serve all NODATA responses from cache in ~1ms each.
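The arithmetic behind that figure, as a sketch: it assumes the suffixes are tried one after another and that each uncached AAAA query costs one tunnel round trip (100ms is an assumed value). `expand_search` and `search_cost_ms` are hypothetical helpers.

```shell
# expand_search HOST DOMAIN...
# Print one candidate FQDN per search suffix.
expand_search() {
    host=$1
    shift
    for domain in "$@"; do
        printf '%s.%s\n' "$host" "$domain"
    done
}

# search_cost_ms RTT_MS DOMAIN...
# Rough cost if every candidate's AAAA query goes to the tunnel in sequence.
search_cost_ms() {
    rtt_ms=$1
    shift
    echo $(( rtt_ms * $# ))
}

expand_search hostname search1.example.com search2.example.com
search_cost_ms 100 search1.example.com search2.example.com search3.example.com \
    search4.example.com search5.example.com search6.example.com   # prints 600
```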
Source Code Evidence
Upstream systemd v255 — resolved-manager.c
// In manager_new():
*m = (Manager) {
// ...
.enable_cache = DNS_CACHE_MODE_YES, // ← upstream default
// ...
};
Ubuntu Patch: UBUNTU-resolved-default-no-negative-caching.patch
Changes the compiled-in default to DNS_CACHE_MODE_NO_NEGATIVE. Confirmed by:
- Man page (man resolved.conf): "If 'no-negative' (the default), only positive answers are cached."
- Config template (/etc/systemd/resolved.conf): #Cache=no-negative
- Binary strings: Contains "Not caching negative entry for: %s, cache mode set to no-negative"
- Empirical test: With no Cache= configured anywhere, NODATA is not cached (behavior matches no-negative)
Cache implementation — resolved-dns-cache.c
// In dns_cache_put():
if (cache_mode == DNS_CACHE_MODE_NO_NEGATIVE) {
// ... skips caching for negative entries
return 0;
}
Why Windows DNS Cache (Dnscache) Cannot Compensate
A natural question is whether the Windows DNS Client cache could absorb repeat queries. It cannot. ETW tracing (Microsoft-Windows-DNS-Client/Operational) reveals that the DNS Client service internally adds DNS_QUERY_BYPASS_CACHE (0x8) to all DnsQueryRaw requests — even though WSL's DnsResolver.cpp only sets DNS_QUERY_NO_MULTICAST (0x800).
This is consistent with Microsoft's own DnsQueryRaw documentation, where both example code snippets explicitly set request.queryOptions = DNS_QUERY_BYPASS_CACHE. The WSL source code comment confirms: "N.B. All DNS requests will bypass the Windows DNS cache".
ETW evidence (Event IDs 3016/3018 from Microsoft-Windows-DNS-Client/Operational):
- archive.org primed in Dnscache via [System.Net.Dns]::GetHostAddresses, verified present → DnsQueryRaw cache lookup returns 9701 (no records), goes to wire
- Same domain queried from WSL a second time (DnsQueryRaw cached the first response) → still 9701, still goes to wire
- Control: WmiPrvSE NO_WIRE_QUERY probe for same domain → cache hit, no wire query
DnsQueryRaw writes responses into Dnscache (benefiting other Windows APIs) but never reads from it. The only caching layer available for WSL DNS queries is systemd-resolved inside the Linux guest.
Why CacheFromLocalhost Is NOT the Issue
An initial theory was that CacheFromLocalhost=no (the default) might be preventing caching, because the DNS tunnel address 10.255.255.254 is assigned to the loopback interface. However:
- CacheFromLocalhost checks the IP address, not the interface
- in4_addr_is_localhost() in systemd only matches 127.0.0.0/8
- The tunnel IP 10.255.255.254 is not in 127.0.0.0/8
- Therefore CacheFromLocalhost has no effect on tunnel DNS responses
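The address-versus-interface distinction can be illustrated with a prefix check. This is a sketch that mirrors the 127.0.0.0/8 test described above, not systemd's implementation; `is_127_slash_8` is a hypothetical name and it expects a dotted-quad IPv4 address.

```shell
# Succeed only for IPv4 addresses inside 127.0.0.0/8.
is_127_slash_8() {
    case "$1" in
        127.*.*.*) return 0 ;;
        *)         return 1 ;;
    esac
}

is_127_slash_8 127.0.0.53     && echo "127.0.0.53: localhost address"
is_127_slash_8 10.255.255.254 || echo "10.255.255.254: not a localhost address, even though it sits on lo"
```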
Workaround
Add Cache=yes to a systemd-resolved drop-in configuration file:
sudo mkdir -p /etc/systemd/resolved.conf.d/
sudo tee /etc/systemd/resolved.conf.d/cache-fix.conf << 'EOF'
[Resolve]
Cache=yes
EOF
sudo systemctl restart systemd-resolved
This restores upstream systemd's default behavior and enables caching of NODATA responses.
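To confirm the drop-in took effect, the merged configuration can be inspected. The sketch below reads resolved.conf-style text on stdin and prints the last Cache= assignment in a [Resolve] section, on the assumption that later assignments override earlier ones; `effective_cache` is a hypothetical helper.

```shell
# Print the last uncommented Cache= value found in [Resolve] sections on stdin.
effective_cache() {
    awk -F= '
        { gsub(/[ \t]+/, "") }                             # tolerate "Cache = yes"
        /^\[/ { in_resolve = ($0 == "[Resolve]"); next }   # track the current section
        in_resolve && $1 == "Cache" { value = $2 }         # commented lines start with "#"
        END { print (value == "" ? "unset (compiled-in default applies)" : value) }'
}

# Typical use inside WSL (not run automatically here):
#   systemd-analyze cat-config systemd/resolved.conf | effective_cache
```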
Proposed Fix
WSL should configure Cache=yes in its DNS configuration drop-in when dnsTunneling is enabled. The high latency of the DNS tunnel makes negative response caching essential for acceptable performance.
Alternatively, WSL could:
- Set Cache=yes in the auto-generated resolved configuration
- Document the Cache=yes workaround for users experiencing slow DNS
- Consider adding cache-read support to the DnsQueryRaw code path in the DNS Client service (this would require a Windows-side change)
Related Issues
- LP #1668771: Original Ubuntu bug requesting no-negative mode (SERVFAIL caching caused outages)
- LP #1895418: Fix config template to show no-negative as the default
- systemd PR #13047: Upstream implementation of the no-negative cache option
- WSL #9423: Slow DNS in WSL (may be related)
- WSL #13415: DNS resolution issues with mirrored networking
Diagnostic Commands
# Check current cache mode
grep -r Cache /etc/systemd/resolved.conf /etc/systemd/resolved.conf.d/ 2>/dev/null
# Monitor cache statistics (size, hits, misses)
busctl get-property org.freedesktop.resolve1 /org/freedesktop/resolve1 \
org.freedesktop.resolve1.Manager CacheStatistics
# Flush caches
sudo resolvectl flush-caches
# Query with timing and cache bypass
time resolvectl query example.com --type=AAAA --cache=no # force network
time resolvectl query example.com --type=AAAA # try cache
# Check DNS tunnel endpoint
ip addr show dev lo | grep 10.255.255.254
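For before/after comparisons, a millisecond timer is easier to read than the output of `time`. This is a sketch that assumes GNU date (for `%N`); `elapsed_ms` is a hypothetical helper.

```shell
# elapsed_ms COMMAND [ARGS...]
# Run a command, discard its output, and print the wall-clock time in milliseconds.
elapsed_ms() {
    start_ns=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end_ns=$(date +%s%N)
    echo $(( (end_ns - start_ns) / 1000000 ))
}

# Typical use inside WSL (not run automatically here):
#   elapsed_ms resolvectl query example.com --type=AAAA --cache=no   # force network
#   elapsed_ms resolvectl query example.com --type=AAAA              # try cache
```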