IPv6 Re-implementation

This is a follow-up to the activities described in IPv6 implementation, which was published on March 2nd and revised through March 19th as new challenges were addressed. Since March 19th a great deal of what I wrote has been revised, as I have learned a lot more.

The main issue was that there remained a number of problems with the implementation of IPv6 in my residence.

  • The biggest was the question of how to handle the delegated prefix, particularly in renumbering. Over the last several months Comcast has never changed my prefix, except early on, when I forced it to do so by changing my DUID. And I don’t think it likely that my prefix would change unless some great catastrophe keeps me down for a very extended period (like 30 days) or, more likely, there is some change in my service (a change in ISP, or perhaps fiber arriving in my area).
  • The first implementation required that I make patches to the code of my router. This meant that I would have to figure out how to carry those patches forward in the event of firmware updates from Ubiquiti, the maker of the Edgerouter-X that I am using.
  • The implementation was pretty fragile, with a lot of unrelated bits in different places. In particular there was a lot of hand-waving in trying to assign and maintain a separate network for the virtual machines on one of the interior boxes.

At the end of March and into April I was grappling with writing scripts and trying to manage these difficulties. In early April Mr. G, who is both smarter and less impetuous than I, found information about a feature called NPTv6 or NAT66: IPv6-to-IPv6 Network Prefix Translation, RFC 6296. This specification enables translating at the router between the globally routable delegated prefix and the ULA prefix described in the earlier post.

So, for example, Comcast currently assigns me 2601:xxx:yyyy:5660::/60, and my ULA prefix is FD30:1839:ded::/48. From the point of view of the internet, there are 16 IPv6 networks that route to my house: 2601:xxx:yyyy:5660::/64 – 2601:xxx:yyyy:566F::/64. I can map any or all of those externally known networks, each to exactly one of my ULA networks, so, for example:

  • 2601:xxx:yyyy:5661::/64 => FD30:1839:ded:111::/64
  • 2601:xxx:yyyy:5662::/64 => FD30:1839:ded:222::/64
  • 2601:xxx:yyyy:5663::/64 => FD30:1839:ded:333::/64
  • 2601:xxx:yyyy:5665::/64 => FD30:1839:ded:555::/64

Obviously, this is very similar to what goes on with NAT in IPv4: there are external addresses which are mapped at the router to non-routable internal addresses which I control. What is different from IPv4 NAT is that only the prefix changes; there is no need to change the host address, and there is no need to do connection tracking.
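To make the "only the prefix changes" idea concrete, here is a minimal Python sketch of a straight 1:1 prefix swap. It is an illustration only: the function name translate_prefix is made up, the documentation prefix 2001:db8::/32 stands in for the real delegated prefix, and it does not attempt the checksum-neutral adjustment that RFC 6296 describes.

```python
import ipaddress


def translate_prefix(address, from_net, to_net):
    """Swap the network prefix of `address`, keeping the host bits unchanged."""
    addr = ipaddress.IPv6Address(address)
    src = ipaddress.IPv6Network(from_net)
    dst = ipaddress.IPv6Network(to_net)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("a 1:1 mapping needs prefixes of equal length")
    if addr not in src:
        raise ValueError(f"{addr} is not inside {src}")
    # Keep only the host part, then graft it onto the other prefix.
    host_bits = int(addr) & int(src.hostmask)
    return ipaddress.IPv6Address(int(dst.network_address) | host_bits)


# 2001:db8:0:5660::/60 is a stand-in for the ISP-delegated /60.
delegated = ipaddress.IPv6Network("2001:db8:0:5660::/60")
external_nets = list(delegated.subnets(new_prefix=64))
print(len(external_nets))  # 16

# Outbound: the ULA source becomes a global source with the same host part.
print(translate_prefix("fd30:1839:ded:555::5",
                       "fd30:1839:ded:555::/64",
                       "2001:db8:0:5665::/64"))  # 2001:db8:0:5665::5
```

Because the host part is untouched, the same function run with the two networks swapped gives the inbound direction, and no per-connection state is needed.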

There were case studies in which people had done this successfully, and the whole idea was very appealing from a renumbering perspective. Mostly this is because the router implementations, so far, are pretty weak in their ability to make various services contingent on a changeable prefix. Just for example, one would like to set up a dhcpd server which gives out static address assignments in the host address part, while stipulating that the network part is “whatever I got from the ISP”, but there is no syntax for that. For that matter, the routers I am using do not even have the IPv6 configuration implemented in their fancy GUI (which is ok with me, but indicates that it is not “ready for prime time” in their opinion).

I didn’t know at the time, but it turns out the subject is a little controversial. There is some difference between NPT66 and NAT66, I think principally that the former will only translate the prefix, and will not do connection tracking, so it is considered less harmful; but both of these are still mostly despised by the cognoscenti. On the other hand, it is all very well to lecture about cleaner ways of doing things, so long as they exist. Unfortunately, with the current level of software in the routers available to me, it isn’t really possible without patching the router code (as described in the earlier post). The IPv6 cool kids may sneer, but we have to do what is workable and maintainable with the equipment we have.

So NPT66 seems like a more robust and trouble-free implementation given my layered network, what I can get from Comcast, and what my routers can do.

I set up a testbed of sorts, utilizing the one downstream interface on my outer router, chersonese, which is not part of my main network. (I often refer to this as the DMZ, but that is actually a misnomer: I don’t use it that way, and it is not open to the internet.) It is a network segment separated from the rest of the network, but with access to the internet, and upon which I could experiment. I put a wireless AP on that separate test segment, so that I could attach to it from laptops, and proceeded to configure the outer router with this improved configuration, viz:

  • This subnet was numbered 192.168.223.0/24 and FD30:1839:ded:555::/64.
  • I configured the external router with rules in the nat table to invoke the “NETMAP” target to translate the prefixes just as one does with NAT: DNAT (change the destination) on input in the PREROUTING chain, and SNAT (change the source) on output in the POSTROUTING chain. The NETMAP target apparently has existed for some time in both IPv4 and IPv6; it does 1:1 mapping of networks and is used in place of the SNAT and DNAT targets.

I expected a bit of struggle with this, but it worked almost immediately. After experimenting with it a little, I decided to implement the same change in the main network. I built a little plan of the bits I had to change, and implemented it, and it began to work with little fuss.

So the external router has a pair of NETMAP rules for each of 4 different subnets (as listed above), all in the following form, where the prefixes 1 to n are as shown above:

ip6tables -t nat -A PREROUTING  -i $EXTERNAL_INTERFACE \
   -d $EXTERNAL_PREFIXn -j NETMAP --to $INTERNAL_PREFIXn
ip6tables -t nat -A POSTROUTING -o $EXTERNAL_INTERFACE \
   -s $INTERNAL_PREFIXn -j NETMAP --to $EXTERNAL_PREFIXn
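Since every subnet gets the same pair of rules, the commands can be generated from a table of mappings. The following Python sketch only builds the command strings; the function name netmap_rules, the interface name eth0 and the 2001:db8 documentation prefixes are placeholders, not my actual configuration.

```python
import ipaddress


def netmap_rules(external_interface, mappings):
    """Return ip6tables commands for a list of (external, internal) prefix pairs."""
    rules = []
    for external, internal in mappings:
        ext = ipaddress.IPv6Network(external)
        inn = ipaddress.IPv6Network(internal)
        if ext.prefixlen != inn.prefixlen:
            raise ValueError(f"{ext} and {inn} differ in length")
        # Inbound: rewrite the destination prefix before routing.
        rules.append(
            f"ip6tables -t nat -A PREROUTING -i {external_interface} "
            f"-d {ext} -j NETMAP --to {inn}"
        )
        # Outbound: rewrite the source prefix after routing.
        rules.append(
            f"ip6tables -t nat -A POSTROUTING -o {external_interface} "
            f"-s {inn} -j NETMAP --to {ext}"
        )
    return rules


example = netmap_rules("eth0", [
    ("2001:db8:0:5661::/64", "fd30:1839:ded:111::/64"),
    ("2001:db8:0:5665::/64", "fd30:1839:ded:555::/64"),
])
for line in example:
    print(line)
```

Printing the commands rather than running them makes it easy to inspect the result before feeding it to a shell on the router.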

Additional Adjustments and Learning

I made the move on April 24, and for the most part it has worked well. However, there have been various hiccups and problems, and much has been learned, all of which will leak out of my brain very rapidly unless I write it down.

Prefix change will be a big deal: During the thinking between completion of the initial implementation and the move to some variant of network prefix translation, I came to the realization that prefix change is something that will likely occur only in a fairly momentous, or even catastrophic, situation. It will happen if I have a major power failure that keeps me down for a long time. Or it will happen when I change ISPs. It isn’t something that the router needs to cope with automatically with no help from me. It doesn’t have to be handled inline instantaneously, automatically, and transparently. If I have a prefix change, things are going to break.

When I realized this, I stopped worrying so much about adapting to a prefix change. If it happens it will be a major event and will require me to be involved.

Default Routes: Early on after the implementation I had various kinds of trouble with not having default routes in IPv6 on one box or another. It was always easy enough to fix manually, but over time I’ve realized that there were a couple of different issues, solved at different times, but reported here together.

One is whether a box accepts router advertisements. I learned that there is a sysctl variable called “accept_ra” for each interface (e.g. /proc/sys/net/ipv6/conf/enp7s0/accept_ra). This system parameter is well documented, but needs to be set to 2 to enable accepting router advertisements on any system which has forwarding turned on, the theory being: if you are forwarding, then you are a router; if you are a router you shouldn’t be accepting router advertisements.
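The interaction between accept_ra and forwarding can be written down as a small decision function. This Python sketch follows the meaning of the values as given in the kernel's ip-sysctl documentation (0 = never, 1 = only when not forwarding, 2 = always); the function names are my own, and the helper that reads /proc is only meaningful on a Linux box.

```python
from pathlib import Path


def accepts_router_advertisements(accept_ra, forwarding):
    """Decide whether an interface will act on router advertisements."""
    if accept_ra == 2:
        return True             # accept even though the box forwards
    if accept_ra == 1:
        return forwarding == 0  # accept only if the box is not a router
    return False                # 0: never accept


def read_interface_setting(interface, name):
    """Read one per-interface IPv6 setting from /proc (Linux only)."""
    path = Path("/proc/sys/net/ipv6/conf") / interface / name
    return int(path.read_text().strip())


# A forwarding box with the default accept_ra=1 ignores advertisements.
print(accepts_router_advertisements(accept_ra=1, forwarding=1))  # False
print(accepts_router_advertisements(accept_ra=2, forwarding=1))  # True
```

On a real system one might call read_interface_setting("enp7s0", "accept_ra") and read_interface_setting("enp7s0", "forwarding") and pass the results in.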

A second learning was that router advertisements must have a value for “default lifetime.” When I set up the router advertisement section on my router, I left various things blank, including the default lifetime. The RFC says you may not use a router as a default router if it doesn’t have a default lifetime. I also learned, unrelated to default routes, that there were various other parameters on the router advertisements that I eventually needed to set: the name servers, and the “managed” and “other configuration” flags, which tell clients, respectively, to use dhcp for addresses and that there is other information to be had from dhcp besides addresses.

Third, I also learned it was a good idea to set the minimum interval parameter on the router advertisements, lest it take 10 minutes or so for a default route to appear on a recently rebooted box.

I still have a problem getting one of my Arch Linux boxes to apply the sysctl setting for accept_ra after a reboot. I put the setting in /etc/sysctl.d/99-sysctl.conf, and yet it doesn’t seem to take. I will find the problem eventually, but for now I have to reset it manually when I boot that system.

DHCP Static Addresses:

Moving to ULA addresses internally allowed me to set static addresses for almost all the boxes, which is good for a couple of reasons. First, it is often a help when trying to debug something, especially with tcpdump. The 25 or so systems with static addresses are easier to identify if they consistently use the same two-digit host number in both IPv4 and IPv6; e.g. oregano is always host 5, in both. Second, there are still occasions when some bit of software wants an IP address. For many of the boxes I know their host number by heart.

However, it seemed to be much harder to set up IPv6 static addresses. For one thing, there is the DUID. This is a generated identifier on the client which has to match on the server. It comes in 3 forms (a 4th isn’t used in a home), and different OSes choose different forms. The more common two both involve (some) MAC address on the client, and may also involve a time stamp. It is a design goal of IPv6 that once a DUID is established between a client and a server, the allocated IPv6 address should not change. There is a catch-22 in this, in that once the box has obtained a dynamic address, getting the client and server to agree to give that up requires overcoming their adherence to the design goal.

Second, if one has many boxes, it isn’t always apparent what the dhcpv6 client will do, or even which piece of software actually IS the dhcpv6 client in use. Network Manager uses his own client, unless you load a different one. Systemd-networkd has his own. Dhclient6 is available, and I like it. All Raspberry Pis seem to use dhcpcd5. I had one horrible problem in trying to get to a static address. I was asking for a release and renew with dhclient6, who would do so and correctly get the desired static address using his DUID. But after a reboot Network Manager was asking for an address with his own dhcp client and his own DUID, which was in a different form that didn’t match, so the box would go back to a non-static address after every reboot.

Another issue, perhaps specific to my router, is that even if you have figured out that the DUID is an LLT (meaning it has the link level address and the timestamp), you still have to determine how the id will be stored on the router. It turns out my router removes redundant leading zeroes in the hex string, e.g. 0:3:0:1:ac:ed:5c:b7:50:5c and not 00:03:00:01….
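Converting between the zero-padded form and the form my router stores is mechanical, so here is a small Python sketch of it. The function names are invented for illustration, and the type table is taken from the DHCPv6 specification (RFC 8415). Going by that table, the example string above begins with type 3, which the RFC calls DUID-LL (link-layer address without a timestamp), so it is worth checking the first two octets rather than assuming the form.

```python
# DUID type codes as listed in RFC 8415.
DUID_TYPES = {1: "DUID-LLT", 2: "DUID-EN", 3: "DUID-LL", 4: "DUID-UUID"}


def parse_duid(text):
    """Turn a colon-separated hex DUID (padded or not) into bytes."""
    return bytes(int(part, 16) for part in text.split(":"))


def duid_type(duid):
    """Name the DUID form from its first two octets."""
    return DUID_TYPES.get(int.from_bytes(duid[:2], "big"), "unknown")


def router_style(duid):
    """Format with leading zeroes dropped from each octet."""
    return ":".join(format(octet, "x") for octet in duid)


def padded_style(duid):
    """Format with two hex digits per octet."""
    return ":".join(format(octet, "02x") for octet in duid)


duid = parse_duid("00:03:00:01:ac:ed:5c:b7:50:5c")
print(duid_type(duid))     # DUID-LL
print(router_style(duid))  # 0:3:0:1:ac:ed:5c:b7:50:5c
```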

Which leads to a fourth problem. To figure out the above you have to know what DUID was actually transferred by the client, and that is made harder by the form in which the dhcpv6 leases are stored, both on the client and server: a wacky, unreadable packed binary representation, which shows up as a character string consisting of printable characters interspersed with three-digit octal sequences for the non-printable characters. I don’t even know what to call it. It is coming from the Perl pack function (I think). I eventually wrote a script to unpack and display it when I grew tired of having to do it by hand. The Mac has a nice little feature in its ipconfig to do this automatically, but I only figured that out when I didn’t need it anymore, and I never found a way to decode it on Linux except doing it by hand or finally writing code. I am completely mystified why the ISC DHCP code chooses to go out of its way to turn a hex string into something unreadable. But you have to unpack it if you want to discover what the client actually sent.
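The unpacking itself is short. This is not my original script, just a minimal Python sketch of the idea, and it assumes the lease file writes non-printable bytes as a backslash plus three octal digits and writes a literal quote or backslash as a backslash plus that character. The function names are made up.

```python
def decode_lease_string(text):
    """Decode a lease-file string with \\ooo octal escapes into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            digits = text[i + 1:i + 4]
            if len(digits) == 3 and all(c in "01234567" for c in digits):
                out.append(int(digits, 8))  # three-digit octal escape
                i += 4
                continue
            out.append(ord(text[i + 1]))    # escaped quote or backslash
            i += 2
            continue
        out.append(ord(ch))                 # ordinary printable character
        i += 1
    return bytes(out)


def as_hex(data):
    """Show bytes as a readable colon-separated hex string."""
    return ":".join(format(octet, "02x") for octet in data)


# The DUID 00:03:00:01:ac:ed:5c:b7:50:5c as it might appear in a lease file.
packed = r"\000\003\000\001\254\355\\\267P\\"
print(as_hex(decode_lease_string(packed)))  # 00:03:00:01:ac:ed:5c:b7:50:5c
```

Note how 0x50 shows up as a plain "P" and 0x5c as an escaped backslash, which is exactly what makes the stored form so hard to read by eye.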

One more item, speaking of the Mac. Most of the boxes had a way to say I want to use dhcpv6 instead of SLAAC. Not the Mac; it offers “Automatic”. What does “automatic” mean? It is a secret. Silly me, I thought it meant SLAAC. What it really means, if you dig deep enough into the net, is that the Mac will decide between dhcpv6 and SLAAC based on the “Managed” bit in the router advertisement. When the Mac picks a router, it allows the router to decide whether to do SLAAC or dhcpv6. Initially I was annoyed at having to figure this out, but ultimately I decided the Mac is right to do it this way: it reduces complexity for the user, and shifts that complexity to the IT folk who maintain the router.

Services and IPv6

I encountered a few other things worth noting which went wrong and had to be tracked down – not so much related to the switch to ULA addresses, but simply things that arose from beginning to use IPv6.

For example, at one point I discovered my mail server was no longer adding DKIM signatures to outgoing mail. What I discovered was that postfix, now preferring IPv6, was failing in his attempt to open the socket to the opendkim daemon. I had to change the specification of the inet socket. (I use a network socket instead of a unix socket for a somewhat lazy reason: some of my boxes run postfix chrooted and some don’t, so unix sockets would require me to tailor the parameter file, and it is easier to just use a common postfix main.cf. Most of these boxes have very little mail.)

Postfix itself had a similar problem. I had long ago encountered some problem with postfix getting confused trying to use IPv6 and had set the inet interfaces to IPv4 only, and had to fix that. Dovecot, too, had an issue. He was set to listen on *, and it turned out he had to be told to listen on both: *,[::].

And I also had a problem with gmail starting to bounce mail; he complained I didn’t have a reverse DNS record for the mail server. Really? Well, I had always had that on my list to do, for both IPv4 and IPv6. I knew I needed to do it. So I went through the hassle of getting Amazon to set up reverse DNS records for me.

Scripting

Even with the migration to ULA addresses I still needed to react to a change in prefix, for two reasons. First, the firewall rules on chersonese which do the NPT66 conversion from global unicast addresses to my ULA addresses have to be updated with the correct prefix. Second, the DNS has to be updated for any boxes in the house which need to be reachable from outside. There are only two boxes reachable from outside, oregano and cinnamon. External connection to the former isn’t used much, even by me, except when I am traveling. The latter, however, contains a database used by a number of people from outside.

It turns out to be annoyingly difficult to get one’s hands on the prefix from within an interior box. I have a script getmyipv6 which gathers up and figures out various addresses. Some things it can figure out by analyzing its interfaces with the ip command, but for some information it needs to analyze the output of ip commands on the routers. To do that, of course, it has to figure out where the routers are, and it has to be able to authenticate to issue commands programmatically.
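As an illustration of the "analyze the output of ip commands" step, here is a Python sketch that picks the global, non-ULA addresses out of one-line-per-address output such as `ip -o -6 addr show` produces. It is not the getmyipv6 script itself; the regular expression, the function name and the sample text (with 2001:db8 documentation addresses) are assumptions for the example, and the output would in practice be captured over ssh from the router.

```python
import ipaddress
import re

# Matches lines like: "2: eth0    inet6 2001:db8::1/64 scope global ..."
ADDR_LINE = re.compile(
    r"^\d+:\s+(\S+)\s+inet6\s+([0-9a-fA-F:.]+)/(\d+)\s+scope\s+(\S+)",
    re.MULTILINE,
)
ULA = ipaddress.IPv6Network("fc00::/7")


def global_addresses(ip_output):
    """Return (interface, address/length) for global, non-ULA addresses."""
    found = []
    for iface, addr, length, scope in ADDR_LINE.findall(ip_output):
        address = ipaddress.IPv6Address(addr)
        if scope == "global" and address not in ULA:
            found.append((iface, f"{address}/{length}"))
    return found


SAMPLE = """\
2: eth0    inet6 2001:db8:0:5660::1/64 scope global dynamic
2: eth0    inet6 fe80::1/64 scope link
3: eth1    inet6 fd30:1839:ded:555::1/64 scope global
"""
print(global_addresses(SAMPLE))
```

The ULA check matters because the kernel also reports ULA addresses with scope global, so the scope field alone does not single out the ISP-assigned address.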

Authentication to the routers is done using ssh public key authentication, which these routers support. I have only to physically put a key on the router somewhere, in /tmp for example, and then use the router command loadkey <user> <key location> (in configure mode). This adds a public key to the authorized_keys file, just like any other box. For that matter, I could probably have just done this manually by editing the authorized_keys files. This is a well understood technique.

It took me a long time to get all the ipv6 scripts written and operating correctly, and as part of that I resorted to writing a sort of generalized IPv6 address handler in python (using the standard ipaddress module), to take in something that is vaguely an “ipv6 address or network or interface” and give back, based on a parameter, a string in the desired format. This is incorporated into my own bash script to give me the flexibility I need.
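A stripped-down version of such a handler might look like the following. This is a sketch of the idea rather than my actual script; the function name format_ipv6 and the style names are invented for the example, and it leans on Python's ipaddress module to do the parsing.

```python
import ipaddress


def format_ipv6(value, style="address"):
    """Accept an address, network or interface string; return one view of it."""
    # IPv6Interface parses "addr" as well as "addr/len" (bare means /128).
    iface = ipaddress.IPv6Interface(value)
    if style == "address":
        return str(iface.ip)
    if style == "network":
        return str(iface.network)
    if style == "interface":
        return str(iface)
    if style == "exploded":
        return iface.ip.exploded
    if style == "host":
        # Host part only, as hex, e.g. "5" for ...::5/64.
        return format(int(iface.ip) & int(iface.network.hostmask), "x")
    raise ValueError(f"unknown style: {style}")


print(format_ipv6("FD30:1839:ded:555::5/64", "network"))   # fd30:1839:ded:555::/64
print(format_ipv6("FD30:1839:ded:555::5/64", "exploded"))  # fd30:1839:0ded:0555:0000:0000:0000:0005
```

From bash this could be wrapped in a tiny command-line entry point that takes the value and the style as arguments and prints the result.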

My initial foray into this was to detect prefix change on oregano, my primary daily internal box, where a lot of housekeeping stuff is localized. But after some thought and experimentation, it turned out to be simpler for updating the firewall just to add it into the firewall script on chersonese, which runs whenever it reboots. This script now detects the allocated prefix, and builds the firewall rules needed. As a convenience, it also salts away the current prefix in /etc/prefix. Oregano still detects the change in prefix and updates DNS for the internal boxes.

One curious thing about this. It is my belief that there is no actual way for me to determine, on my Ubiquiti Edgerouter X, what prefix length was actually given to me by Comcast. I know that the chersonese Edgerouter is configured to ask for a /60, because I am convinced that is all that Comcast will allow a non-business customer. And I don’t think the length will change unless Comcast changes their policy AND I change the router configuration. So the script on chersonese simply assumes that what I got was a /60. The information on what I have configured the router to ASK FOR is in the /config tree on the router, but there is nowhere available, as far as I can determine, anything about what I actually got.
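Assuming a /60 comes down to masking whatever global address the router shows. A minimal Python sketch of that assumption follows; the function name and the constant are hypothetical, and 2001:db8 is again a stand-in for the real prefix.

```python
import ipaddress

ASSUMED_LENGTH = 60  # what the router is configured to ask for, not a measured value


def assumed_delegation(observed_address, length=ASSUMED_LENGTH):
    """Mask an observed global address down to the assumed delegated prefix."""
    bare = observed_address.split("/")[0]  # drop any interface prefix length
    return ipaddress.IPv6Network(f"{bare}/{length}", strict=False)


print(assumed_delegation("2001:db8:0:5663::1/64"))  # 2001:db8:0:5660::/60
```

If the ISP ever handed out a different length, this would silently compute the wrong prefix, which is the price of not being able to read the real value from the router.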

Source and Destination Selection

The last learning I want to write about has to do with what is called Source Selection or Destination Selection, covered in RFC 6724. Both Mr. G and I noted that when utilizing the IPv6 checkers, e.g. ipv6-test.com or test-ipv6.com, after our transition to ULA addresses, the checkers would complain that the browsers chose to “prefer” IPv4. This took a little while to figure out, but it turned out to involve the selection algorithms described in RFC 6724, and manifested in a table at /etc/gai.conf.

In a nutshell, the problem we created was that the default address selection algorithm considers a ULA address to be a lower priority and a different scope than an ordinary routable unicast IPv6 address. Even though we “knew” that the ULA address would get translated to a global routable unicast address by the outer router, the operating system didn’t know that. It concluded that it couldn’t use the ULA source address to communicate with the IPv6 destination, and so it chose the IPv4 address as the best option.

We were able to resolve this problem by altering the gai.conf to give a higher priority to ULA addresses, and to designate them having the same scope as global unicast addresses.
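The effect can be reproduced with a few lines of Python using the default policy table from RFC 6724 (section 2.1). This sketch only does the longest-prefix lookup of precedence and label, not the full selection algorithm, and the ADJUSTED table at the end uses illustrative values of my own choosing rather than the lines actually placed in gai.conf.

```python
import ipaddress

# (prefix, precedence, label) from the RFC 6724 default policy table.
DEFAULT_POLICY = [
    ("::1/128", 50, 0), ("::/0", 40, 1), ("::ffff:0:0/96", 35, 4),
    ("2002::/16", 30, 2), ("2001::/32", 5, 5), ("fc00::/7", 3, 13),
    ("::/96", 1, 3), ("fec0::/10", 1, 11), ("3ffe::/16", 1, 12),
]


def classify(address, table=DEFAULT_POLICY):
    """Return (precedence, label) of the longest matching policy prefix."""
    addr = ipaddress.IPv6Address(address)
    matches = [
        (ipaddress.IPv6Network(prefix), precedence, label)
        for prefix, precedence, label in table
        if addr in ipaddress.IPv6Network(prefix)
    ]
    _, precedence, label = max(matches, key=lambda m: m[0].prefixlen)
    return precedence, label


def labels_match(source, destination, table=DEFAULT_POLICY):
    """True when source and destination fall under the same label."""
    return classify(source, table)[1] == classify(destination, table)[1]


# ULA source vs. global destination: labels 13 and 1, so no match.
print(labels_match("fd30:1839:ded:555::5", "2001:db8::1"))  # False

# Hypothetical adjustment: treat ULA like ordinary global unicast.
ADJUSTED = [row for row in DEFAULT_POLICY if row[0] != "fc00::/7"]
ADJUSTED.append(("fc00::/7", 45, 1))
print(labels_match("fd30:1839:ded:555::5", "2001:db8::1", ADJUSTED))  # True
```

With the default table the IPv4 pair (both under the ::ffff:0:0/96 label) matches while the IPv6 pair does not, which is consistent with the checkers reporting that the browser preferred IPv4.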

I made the switch to ULA on April 24; now, on May 31, I believe that our 4-month journey is pretty much over, and it is working reliably.