eBPF Mystery: When is IPv4 not IPv4? When it's pretending to be IPv6!

This adventures starts with a simple eBPF program to transparently redirect DNS requests on port 53 for a single program (or docker container).

To do this I used BPF_CGROUP_INET4_CONNECT on a cgroup. That lets me inspect and redirect traffic when syscall.connect occurs from within the cgroup. Here is a simplified version 👇

int handle_connect_redirect(struct bpf_sock_addr *ctx, __be32 original_ip,
                            bool is_connect4, struct redirect_result *result) {
  __be32 new_ip = original_ip;
  __be16 new_port = ctx->user_port;

  if (ctx->user_port == bpf_htons(53)) {
    new_ip = const_mitm_proxy_address; // Our MITM DNS server we're using for intercept
    new_port = bpf_htons(const_dns_proxy_port);
  }

  result->is_redirected = did_redirect;
  result->ip = new_ip;
  result->port = new_port;
  return 1;
}

SEC("cgroup/connect4")
int connect4(struct bpf_sock_addr *ctx) {
  struct redirect_result r = {
      .ip = ctx->user_ip4,
      .port = ctx->user_port,
      .is_redirected = false,
  };
  handle_connect_redirect(ctx, ctx->user_ip4, true, &r);
  if (r.is_redirected) {
    // If we redirected the request then we need to update the socket
    // destination to the new IP and port
    ctx->user_ip4 = r.ip;
    ctx->user_port = r.port;
  }
  return 1;
}

The machines running the program don’t have IPv6 support, so my assumption was that I’d covered the bases.

I then used an BPF_PROG_TYPE_CGROUP_SKB eBPF program to ensure that the redirect couldn’t be circumvented by making direct IP calls. Roughly this looked like 👇

SEC("cgroup_skb/egress")
int cgroup_skb_egress(struct __sk_buff *skb) {
  // Block IPv6 traffic. Currently not supported.
  if (skb->family == AF_INET6) {
    struct event info = {
        ...
        .eventType = PACKET_IPV6_TYPE,
    };

    bpf_ringbuf_output(&events, &info, sizeof(info), 0);

    return EGRESS_DENY_PACKET;
  }

  // .... then we'd check if outbound ip was allowed and deny/allow ...

Note: I’m using bpf_ringbuf_output here to track events from the eBPF program and logging them from userspace. This was invaluable for tracking down this bug, without them, it would be very hard to reason about what was happening inside the eBPF portion of the progam.

Everything was going very well, until a user tried dotnet CLI.

When they ran dotnet add package x it would hang indefinitely and produce lots of PACKET_IPV6_TYPE blocked messages via the ringbuf_output.

Oh dotnet is using IPv6

The obvious conclusion was the machine was somehow making IPv6 requests, so I did some digging

❓ My connect4 eBPF program was not getting hit, I added similar bpf_ringbuf_output events so I could stream to log in userspace
❌ Hooking up wireshark, I confirmed that dotnet wasn’t making IPv6 calls off the box and the box couldn’t make IPv6 requests to the internet

Now I’m baffled.

Network traffic shows IPv4 calls going out
egress eBPF program shows IPv6
connect4 eBPF program doesn’t fire suggesting no IPv4 connect call is made

What! These contradict each other!

`IPv4-Compatible IPv6 Address` or “When IPv4 pretends to be IPv6 for a little bit”

This 👇 line of the dotnet DualMode socket docs felt like the key 🗝

This allows us to handle IPv4 traffic from an IPv6 socket

But how! Digging more into the source and kernel I found:

IPv4-mapped IPv6 addresses

This is a way to encode an IPv4 address inside an IPv6 address.

ipv4 mapped address image

When used you get an IPv6 address where the last 32bits are actually a IPv4 address.

I updated my connect6 eBPF to output the IPv6 address to me bpf_ringbuf_output event, so I could have a look at it.

Low and behold it was a IPv4 mapped address 🤯🎉

When using DualMode sockets dotnet requests an IPv6 Socket, even for non-IPv6 requests, and sets the user_ip6 address field to be a IPv4-Mapped address.

What does a IPv4 mapped address look like? These look like ::ffff:1.1.1.1 encoding the IPv4 address at the end of the IPv6 address.

I thought I must have this wrong, surely you can’t just smash an ipv4 address in ipv6 field and magic happens?! Nope, didn’t have it wrong, that’s what happens. Linux supports this, and will go on to route the request as IPv4.

My wireshark traces didn’t see the IPv6 traffic, as the kernel is translating it back to IPv4 when making the networking call, so this interim state is only visible to the eBPF programs/kernel.

Fixing the eBPF to handle `IPv4-mapped IPv6 addresses`

To make my original syscall.connect intercept work I have to now hook both IPv4 and IPv6 version of it. For this case I updated the connect6 from earlier to parse out the IPv4 address from the IPv6 address.

SEC("cgroup/connect6")
int connect6(struct bpf_sock_addr *ctx) {
  // Check if we have an IPv4-mapped IPv6 address (::ffff:x.x.x.x)
  // The first 10 bytes should be zeros, followed by 2 bytes of 0xffff
  // user_ip6[0] and user_ip6[1] should be 0 (first 64 bits)
  // user_ip6[2] should be 0x0000ffff (next 32 bits with pattern 0000...1111)
  if (ctx->user_ip6[0] != 0 || ctx->user_ip6[1] != 0 ||
      ctx->user_ip6[2] != bpf_htonl(0x0000ffff)) {
    return 1;
  }

  // See: https://en.m.wikipedia.org/wiki/IPv6#IPv4-mapped_IPv6_addresses As
  // bpf_sock_addr stores `user_ip6` as IPv6 is 4x32 bits to get the IPv4
  // address we ignore the first 96 bits and take the last 32 bits which is the
  // __u32 at index 3 of the user_ip6 array
  __be32 ipv4_address = ctx->user_ip6[3];

  struct event info = {
      .ip = bpf_ntohl(ipv4_address),
      .eventType = IPV4_VIA_IPV6_REDIRECT_TYPE,
  };
  bpf_ringbuf_output(&events, &info, sizeof(info), 0);

  struct redirect_result r = {
      .ip = ipv4_address,
      .port = ctx->user_port,
      .is_redirected = false,
  };
  handle_connect_redirect(ctx, ipv4_address, false, &r);
  if (r.is_redirected) {
    // If we redirected the request then we need to update the socket
    // destination to the new IP and port
    ctx->user_ip6[3] = r.ip;
    ctx->user_port = r.port;
  }
  return 1;
}

It pulls the original IPv4 address out and then calls the existing handle_connect_redirect method. Done right? Nope.

This wasn’t quite enough though as my egress eBPF program would still block these requests as IPv6.

I tracked this down to this check in the egress program:

if (skb->family == AF_INET6) {
    // end up here
    return 0
}

The mapped IPv4 socket was identifying its family as IPv6, how do I work around this? I want to block IPv6 but not IPv4 mapped via IPv6 sockets 🤔

After lots of playing I found this approach:

if (skb->protocol == bpf_htons(ETH_P_IPV6)) {
    // IPv6 hits this but IPv4 Mapped doesn't
    return 0
}

By looking at the protocol on skb I was able to distinguish between IPv6 and IPv4 mapped onto IPv6 (TODO: more testing needed on this last bit).

That’s it

When is IPv4 not IPv4?

When it’s using an IPv4-Compatible IPv6 Address to sending IPv4 over an IPv6 socket 🤯

Oh dotnet is using IPv6

Read more Kernel and dotnet source code

IPv4-Compatible IPv6 Address or “When IPv4 pretends to be IPv6 for a little bit”

Fixing the eBPF to handle IPv4-mapped IPv6 addresses

That’s it

Read more Kernel and `dotnet` source code

`IPv4-Compatible IPv6 Address` or “When IPv4 pretends to be IPv6 for a little bit”

Fixing the eBPF to handle `IPv4-mapped IPv6 addresses`