WSL2 DNS: NODATA AAAA responses not cached by systemd-resolved, causing ~50 to 200ms latency per lookup #14568

@syed-dawood

Summary

When using WSL2 with dnsTunneling (the default in mirrored networking mode), every DNS lookup for an IPv4-only hostname incurs an unnecessary full-round-trip penalty (typically 50–200ms depending on environment) because Ubuntu's systemd-resolved does not cache NODATA responses for AAAA queries. This is due to Ubuntu's compile-time patch that changes the default Cache= setting from upstream's yes to no-negative, which explicitly skips caching negative/NODATA DNS responses.

The DNS tunnel routes queries through the Windows host's DNS infrastructure, which adds significant per-query latency, and modern applications perform parallel A + AAAA lookups. Because the NODATA AAAA result is never cached, every resolution of an IPv4-only hostname pays the full tunnel round-trip cost.

Environment

  • Windows: 11 Pro, Build 26100
  • WSL Version: 2.4.13.0 (Kernel: 5.15.167.4-1)
  • Distro: Ubuntu 24.04 LTS (Noble)
  • systemd-resolved: v255.4-1ubuntu8.14
  • Networking mode: Mirrored (networkingMode=mirrored in .wslconfig)
  • DNS Tunneling: Enabled (default for mirrored mode)

Root Cause Analysis

The DNS Tunnel Architecture

When dnsTunneling is enabled, WSL configures:

  • A tunnel endpoint at 10.255.255.254/32 on the loopback interface
  • /etc/resolv.conf: nameserver 127.0.0.53 (systemd-resolved stub)
  • systemd-resolved forwards queries to 10.255.255.254 (the WSL DNS tunnel)

The tunnel routes DNS queries from Linux through the Windows host's DNS infrastructure. This adds significant latency per round-trip (measured at 90–160ms in our environment; varies by DNS server distance, VPN, and network topology).

The NODATA Problem

When an application resolves a hostname (e.g., via getaddrinfo()), the resolver typically sends both an A (IPv4) and AAAA (IPv6) query. For IPv4-only hostnames (which are still very common — especially on corporate/enterprise networks), the AAAA query returns a NODATA response: RCODE=NOERROR but with zero answer records.
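To make the NODATA definition concrete, here is a minimal Python sketch of how a response could be classified. The `DnsResponse` type and `classify` function are hypothetical names for illustration, not part of systemd or glibc. The sketch treats a NOERROR response as NODATA when no record of the queried type is present, which also covers the case where the answer section holds only CNAME records.

```python
from dataclasses import dataclass, field
from typing import List

# RCODE values as defined in RFC 1035.
RCODE_NOERROR = 0
RCODE_NXDOMAIN = 3


@dataclass
class DnsResponse:
    """Hypothetical, simplified view of a DNS response."""
    rcode: int
    qtype: str                                  # queried type, e.g. "A" or "AAAA"
    answer_types: List[str] = field(default_factory=list)  # types in the answer section


def classify(resp: DnsResponse) -> str:
    """Return "POSITIVE", "NODATA", "NXDOMAIN", or "ERROR" for a response."""
    if resp.rcode == RCODE_NXDOMAIN:
        return "NXDOMAIN"
    if resp.rcode != RCODE_NOERROR:
        return "ERROR"
    # NOERROR: positive only if a record of the queried type is present.
    if resp.qtype in resp.answer_types:
        return "POSITIVE"
    return "NODATA"
```

With this model, an AAAA query for an IPv4-only name classifies as NODATA whether the answer section is empty or contains only a CNAME chain.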

Ubuntu's Cache=no-negative Default

Ubuntu applies the patch UBUNTU-resolved-default-no-negative-caching.patch to systemd, changing the compiled-in default for the Cache= setting from upstream's yes to no-negative. This was introduced to work around LP #1668771 where caching SERVFAIL responses caused issues in OpenStack environments.

The no-negative mode means:

  • Positive responses (A records, AAAA records with data) → cached
  • Negative responses (NXDOMAIN, NODATA) → not cached

This is normally a reasonable trade-off on systems with low-latency DNS. But in WSL's tunnel architecture, the per-query penalty makes the lack of NODATA caching extremely costly.

The Impact Chain

Application calls getaddrinfo("intranet-server.example.com")
  → glibc sends A query + AAAA query to 127.0.0.53
    → systemd-resolved forwards both to 10.255.255.254 (DNS tunnel)
      → A query:    returns 10.x.x.x       → CACHED ✅ (positive)
      → AAAA query: returns NODATA (no IPv6) → NOT CACHED ❌ (negative)

Next call to getaddrinfo("intranet-server.example.com"):
  → A query:    served from cache in ~1ms  ✅
  → AAAA query: full tunnel round-trip again  ❌  ← THIS IS THE BUG

Every subsequent resolution of the same IPv4-only hostname still pays the full round-trip for the uncached AAAA NODATA response.
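The chain above can be reproduced with a toy model. The following Python sketch is not systemd-resolved's implementation: `ToyCache`, the fixed 100ms tunnel cost, and the 1ms cache cost are assumptions chosen for illustration (the tunnel figure sits inside the 90–160ms range measured above), and TTL expiry is ignored. Since the A and AAAA queries run in parallel, each call is charged the slower of the two.

```python
TUNNEL_MS = 100  # assumed cost of one round-trip through the DNS tunnel
CACHE_MS = 1     # assumed cost of a cache hit


class ToyCache:
    """Toy cache with two modes, mirroring Cache=yes and Cache=no-negative."""

    def __init__(self, mode: str) -> None:
        if mode not in ("yes", "no-negative"):
            raise ValueError(f"unsupported mode: {mode}")
        self.mode = mode
        self.entries = set()

    def query(self, name: str, qtype: str, has_records: bool) -> int:
        """Return the latency in ms for one query, updating the cache."""
        key = (name, qtype)
        if key in self.entries:
            return CACHE_MS
        # Positive answers are always stored; negative ones only in "yes" mode.
        if has_records or self.mode == "yes":
            self.entries.add(key)
        return TUNNEL_MS


def getaddrinfo_latency(cache: ToyCache, name: str) -> int:
    """Latency of one lookup of an IPv4-only name (A and AAAA in parallel)."""
    a_ms = cache.query(name, "A", has_records=True)
    aaaa_ms = cache.query(name, "AAAA", has_records=False)
    return max(a_ms, aaaa_ms)


def total_latency(mode: str, calls: int) -> int:
    """Total latency in ms of repeated lookups of the same name."""
    cache = ToyCache(mode)
    name = "intranet-server.example.com"
    return sum(getaddrinfo_latency(cache, name) for _ in range(calls))
```

Under these assumptions, five lookups of the same name cost 500ms in no-negative mode, because every call waits on the uncached AAAA query, and 104ms in yes mode.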

Empirical Proof

A/B Test: Cache=no-negative vs Cache=yes

Test domain: www.bankofamerica.com (IPv4-only, AAAA returns NODATA via Akamai CNAME chain)

With Cache=no-negative (Ubuntu default, no Cache= in config):

Query                 Time    CacheStats (size, hits, misses)   Source
Q1: AAAA --cache=no   162ms   (3, 0, 0)                         network
Q2: AAAA (normal)     117ms   (3, 1, 1)                         network (NODATA not cached)
Q3: AAAA (normal)     118ms   (3, 2, 2)                         network (NODATA not cached)

Cache misses increment on every query. CNAME chain entries are cached (positive), but the terminal NODATA is not. Each query takes ~117ms.

With Cache=yes (explicit override):

Query                 Time    CacheStats (size, hits, misses)   Source
Q1: AAAA --cache=no   229ms   (4, 0, 0)                         network
Q2: AAAA (normal)     11ms    (4, 2, 0)                         cache
Q3: AAAA (normal)     11ms    (4, 4, 0)                         cache

NODATA is cached. Subsequent queries are served in 11ms with zero cache misses, roughly 10x faster than the ~117ms uncached case.
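The CacheStats columns in both tables come from the CacheStatistics D-Bus property. As a hedged sketch, the Python helper below parses one snapshot and counts new misses between two snapshots. It assumes `busctl get-property` prints the property as a type signature followed by three integers, for example `(ttt) 4 2 0`; the names `CacheStats`, `parse_cache_statistics`, and `new_misses` are hypothetical.

```python
from typing import NamedTuple


class CacheStats(NamedTuple):
    size: int
    hits: int
    misses: int


def parse_cache_statistics(busctl_output: str) -> CacheStats:
    """Parse output assumed to look like "(ttt) 4 2 0" into CacheStats."""
    fields = busctl_output.split()
    if len(fields) != 4 or fields[0] != "(ttt)":
        raise ValueError(f"unexpected busctl output: {busctl_output!r}")
    size, hits, misses = (int(value) for value in fields[1:])
    return CacheStats(size, hits, misses)


def new_misses(before: CacheStats, after: CacheStats) -> int:
    """Number of cache misses recorded between two snapshots."""
    return after.misses - before.misses
```

Applied to the tables above, the step from Q1 to Q2 adds one miss under Cache=no-negative and none under Cache=yes.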

Large-Scale Impact: Multi-Domain Resolution

With 6+ search domains configured (common in enterprise environments), a single getaddrinfo() call for an unqualified hostname triggers AAAA lookups across all search domain suffixes:

hostname.search1.example.com  AAAA → NODATA (not cached)
hostname.search2.example.com  AAAA → NODATA (not cached)
hostname.search3.example.com  AAAA → NODATA (not cached)
hostname.search4.example.com  AAAA → NODATA (not cached)
hostname.search5.example.com  AAAA → NODATA (not cached)
hostname.search6.example.com  AAAA → NODATA (not cached)

At roughly 100ms per tunnel round-trip, six uncached AAAA queries add 600ms+ of latency to every hostname lookup. With Cache=yes, subsequent lookups serve all NODATA responses from cache in ~1ms each.
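The arithmetic behind that figure can be sketched as follows. This is a simplified estimate, not a model of glibc or systemd-resolved: it assumes every search suffix is tried, the AAAA queries do not overlap in time, and each costs the same fixed amount. The function names and the 100ms figure are illustrative assumptions.

```python
from typing import List


def search_candidates(hostname: str, search_domains: List[str]) -> List[str]:
    """Expand an unqualified hostname across the configured search domains."""
    return [f"{hostname}.{domain}" for domain in search_domains]


def uncached_aaaa_cost_ms(hostname: str, search_domains: List[str],
                          per_query_ms: int) -> int:
    """Estimated AAAA cost when no NODATA response is ever cached."""
    return len(search_candidates(hostname, search_domains)) * per_query_ms


# Six hypothetical search domains, as in the listing above.
domains = [f"search{i}.example.com" for i in range(1, 7)]
cost = uncached_aaaa_cost_ms("hostname", domains, per_query_ms=100)
```

With six suffixes at an assumed 100ms each, `cost` comes to 600, and the same cost recurs on every lookup as long as negative caching is off.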

Source Code Evidence

Upstream systemd v255 — resolved-manager.c

// In manager_new():
*m = (Manager) {
    // ...
    .enable_cache = DNS_CACHE_MODE_YES,   // ← upstream default
    // ...
};

Ubuntu Patch: UBUNTU-resolved-default-no-negative-caching.patch

Changes the compiled-in default to DNS_CACHE_MODE_NO_NEGATIVE. Confirmed by:

  1. Man page (man resolved.conf): "If 'no-negative' (the default), only positive answers are cached."
  2. Config template (/etc/systemd/resolved.conf): #Cache=no-negative
  3. Binary strings: Contains "Not caching negative entry for: %s, cache mode set to no-negative"
  4. Empirical test: With no Cache= configured anywhere, NODATA is not cached (behavior matches no-negative)

Cache implementation — resolved-dns-cache.c

// In dns_cache_put():
if (cache_mode == DNS_CACHE_MODE_NO_NEGATIVE) {
    // ... skips caching for negative entries
    return 0;
}

Why Windows DNS Cache (Dnscache) Cannot Compensate

A natural question is whether the Windows DNS Client cache could absorb repeat queries. It cannot. ETW tracing (Microsoft-Windows-DNS-Client/Operational) reveals that the DNS Client service internally adds DNS_QUERY_BYPASS_CACHE (0x8) to all DnsQueryRaw requests — even though WSL's DnsResolver.cpp only sets DNS_QUERY_NO_MULTICAST (0x800).

This is consistent with Microsoft's own DnsQueryRaw documentation, where both example code snippets explicitly set request.queryOptions = DNS_QUERY_BYPASS_CACHE. The WSL source code comment confirms: "N.B. All DNS requests will bypass the Windows DNS cache".

ETW evidence (Event IDs 3016/3018 from Microsoft-Windows-DNS-Client/Operational):

  • archive.org primed in Dnscache via [System.Net.Dns]::GetHostAddresses, verified present → DnsQueryRaw cache lookup returns 9701 (no records), goes to wire
  • Same domain queried from WSL a second time (DnsQueryRaw cached the first response) → still 9701, still goes to wire
  • Control: WmiPrvSE NO_WIRE_QUERY probe for same domain → cache hit, no wire query

DnsQueryRaw writes responses into Dnscache (benefiting other Windows APIs) but never reads from it. The only caching layer available for WSL DNS queries is systemd-resolved inside the Linux guest.

Why CacheFromLocalhost Is NOT the Issue

An initial theory was that CacheFromLocalhost=no (the default) might be preventing caching because the DNS tunnel is on 10.255.255.254 which is on the loopback interface. However:

  • CacheFromLocalhost checks the IP address, not the interface
  • in4_addr_is_localhost() in systemd only matches 127.0.0.0/8
  • The tunnel IP 10.255.255.254 is not in 127.0.0.0/8
  • Therefore CacheFromLocalhost has no effect on tunnel DNS responses
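The address check can be verified independently of systemd. The sketch below uses Python's standard `ipaddress` module to reproduce the 127.0.0.0/8 test described above; `is_localhost_v4` is a hypothetical stand-in for `in4_addr_is_localhost()`, not a binding to it.

```python
import ipaddress

# The only range treated as localhost, per the description above.
LOCALHOST_NET = ipaddress.ip_network("127.0.0.0/8")


def is_localhost_v4(addr: str) -> bool:
    """Return True if addr falls inside 127.0.0.0/8."""
    return ipaddress.ip_address(addr) in LOCALHOST_NET


tunnel_is_localhost = is_localhost_v4("10.255.255.254")  # WSL DNS tunnel endpoint
stub_is_localhost = is_localhost_v4("127.0.0.53")        # systemd-resolved stub
```

The tunnel address fails the check even though it is assigned to the lo interface, while the stub address passes it.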

Workaround

Add Cache=yes to a systemd-resolved drop-in configuration file:

sudo mkdir -p /etc/systemd/resolved.conf.d/
sudo tee /etc/systemd/resolved.conf.d/cache-fix.conf << 'EOF'
[Resolve]
Cache=yes
EOF
sudo systemctl restart systemd-resolved

This restores upstream systemd's default behavior and enables caching of NODATA responses.
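To illustrate why the drop-in takes effect, here is a simplified Python sketch of how the effective Cache= value could be derived from the main config file plus drop-ins. It is not systemd's parser: it assumes the files are passed in the order they are applied (main file first, later files overriding earlier ones) and that the compiled-in default on Ubuntu is no-negative, as described above. The function name is hypothetical.

```python
from typing import Iterable


def effective_cache_setting(config_texts: Iterable[str],
                            compiled_default: str = "no-negative") -> str:
    """Return the last Cache= value found in a [Resolve] section."""
    value = compiled_default
    for text in config_texts:
        section = None
        for raw_line in text.splitlines():
            line = raw_line.strip()
            # Skip blank lines and comments such as "#Cache=no-negative".
            if not line or line.startswith(("#", ";")):
                continue
            if line.startswith("[") and line.endswith("]"):
                section = line[1:-1]
                continue
            if section == "Resolve" and "=" in line:
                key, _, val = line.partition("=")
                if key.strip() == "Cache":
                    value = val.strip()  # later assignments win
    return value


main_conf = "[Resolve]\n#Cache=no-negative\n"
drop_in = "[Resolve]\nCache=yes\n"
```

Here `effective_cache_setting([main_conf])` falls back to the compiled-in default because the template line is commented out, and adding `drop_in` to the list changes the result to `yes`.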

Proposed Fix

WSL should configure Cache=yes in its DNS configuration drop-in when dnsTunneling is enabled. The high latency of the DNS tunnel makes negative response caching essential for acceptable performance.

Concretely, WSL could do one or more of the following:

  1. Set Cache=yes in the auto-generated resolved configuration
  2. Document the Cache=yes workaround for users experiencing slow DNS
  3. Consider adding cache-read support to the DnsQueryRaw code path in the DNS Client service (this would require a Windows-side change)

Related Issues

Diagnostic Commands

# Check current cache mode
grep -r Cache /etc/systemd/resolved.conf /etc/systemd/resolved.conf.d/ 2>/dev/null

# Monitor cache statistics (size, hits, misses)
busctl get-property org.freedesktop.resolve1 /org/freedesktop/resolve1 \
  org.freedesktop.resolve1.Manager CacheStatistics

# Flush caches
sudo resolvectl flush-caches

# Query with timing and cache bypass
time resolvectl query example.com --type=AAAA --cache=no  # force network
time resolvectl query example.com --type=AAAA              # try cache

# Check DNS tunnel endpoint
ip addr show dev lo | grep -F 10.255.255.254
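The resolvectl timings above measure systemd-resolved directly. To see the latency an application observes, repeated getaddrinfo() calls can be timed as well. The Python sketch below is one way to do that; `time_lookups` is a hypothetical helper, and the `resolve` parameter exists only so the resolver can be swapped out for testing.

```python
import socket
import time
from typing import Callable, List


def time_lookups(host: str, repeats: int = 3,
                 resolve: Callable = socket.getaddrinfo) -> List[float]:
    """Time repeated lookups of host and return each duration in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            resolve(host, None)  # port is not needed for a pure name lookup
        except socket.gaierror:
            pass  # a failed lookup still cost time, so it is recorded too
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings
```

Calling `time_lookups("example.com")` before and after applying the Cache=yes drop-in, using an IPv4-only hostname from your own environment in place of example.com, should show whether the second and third lookups still pay the tunnel round-trip.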
