802.11w causing random wireless problems.

11 September 2023

A couple of weeks ago I found that I had to replace the wireless router upstairs because its radios were spiking to extremely high temperatures a couple of times a day. 1 When anything in the universe spikes over ten standard deviations, generally speaking it's probably a very bad thing. So I did a little research and picked up a new wireless router, a Linksys EA8300 (affiliate link), which has very good OpenWRT support, 256 megs of RAM (which is a lot for a wireless router), and 256 megs of on-board flash storage. Most importantly, the EA8300 has three separate wifi radios, which means that it can cover much more area with less of a chance of neighboring wireless devices crosstalking and interfering with each other. Net result: Much better bandwidth, which speeds everything up overall.

Flashing OpenWRT onto the unit was fairly straightforward and unremarkable. Configuring the new router took a little time and involved locating a couple of settings which had changed position over the last couple of years. It was largely a matter of copying and pasting from the old router into the new one and then changing the IP networking at the very last minute. Same with setting up WDS for the downstairs router, like last time. Swapping in the new router was the work of a couple of minutes. But then I noticed something weird (isn't there always?).

A couple of systems I have at home dropped off the Net and wouldn't reconnect, even though everything else did within a minute or two. And there didn't seem to be any rhyme or reason behind which systems had vanished. One could reasonably guess that, say, every Raspberry Pi wouldn't come back online, or every weird-ass little sensor built on top of an ESP8266 was having trouble, but that wasn't the case. Leandra wouldn't come back online. My weather station outside promptly found one of the wireless networks but the printer and our Kodi server didn't. The affected machines weren't on the wireless network so I couldn't just shell in and take a look around, which meant I had to plug in a spare keyboard and display; given that my office is still full of shipping crates of family stuff from my mom's estate, it was anything but a straightforward exercise. The one thing they all had in common was that every isolated system was throwing "CTRL-EVENT-ASSOC-REJECT status_code=16" errors a couple of times per second. Did any of the machines have a different wireless password? Not that I could tell. Recent system update? Nothing within the last week or so. IPv6? Some of those boxen were already IPv6 enabled and some weren't, so that didn't change. Searching for that error was somewhat difficult given that a fair amount of my exocortex was offline, and the answers I was finding ran the entire gamut from reinstalling Windows 11 (which nothing in the house runs, and certainly doesn't apply to an ESP8266 sensor pod) to HDMI interference.
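If you're chasing the same symptom, it can help to confirm that every affected box is being rejected with the same status code before you start changing things. What follows is a minimal sketch, not something I ran at the time: it assumes you've captured wpa_supplicant's output to a list of lines, and the sample lines (interface name, BSSID) are made up for illustration.

```python
import re
from collections import Counter

# Matches wpa_supplicant association-rejection events and captures the
# numeric status code, e.g. "... CTRL-EVENT-ASSOC-REJECT ... status_code=16".
REJECT_RE = re.compile(r"CTRL-EVENT-ASSOC-REJECT\b.*?\bstatus_code=(\d+)")


def count_assoc_rejects(lines):
    """Return a Counter mapping status code -> number of rejections seen."""
    counts = Counter()
    for line in lines:
        match = REJECT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample lines; the interface name and BSSID are invented.
SAMPLE_LOG = [
    "wlan0: Trying to associate with aa:bb:cc:dd:ee:ff",
    "wlan0: CTRL-EVENT-ASSOC-REJECT bssid=aa:bb:cc:dd:ee:ff status_code=16",
    "wlan0: CTRL-EVENT-ASSOC-REJECT bssid=aa:bb:cc:dd:ee:ff status_code=16",
]

if __name__ == "__main__":
    for code, total in sorted(count_assoc_rejects(SAMPLE_LOG).items()):
        print(f"status_code={code}: {total} rejection(s)")
```

If the same code shows up on every machine that fell off the network, the problem is probably on the access point side rather than on any one client.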

I decided to start walking back my configuration changes on the new router, one at a time, to see which machines, if any, would suddenly pop back onto the wireless network. As luck would have it I stumbled onto the solution almost immediately. For every wifi radio in the router (in the OpenWRT control panel for v22.03.2 go to Network -> Wireless -> pick a radio -> Edit -> Interface Configuration -> Wireless Security) there is a checkbox for 802.11w Management Frame Protection 2 support. If 802.11w is turned on, turn it off for every radio, click the Save & Apply button, and then restart each radio by clicking Disable and then Enable 3 (or power cycle the router). If this was the problem, you should see the missing systems start popping back onto your wireless network by themselves.
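I made the change through the web interface, but the same setting lives in /etc/config/wireless as the ieee80211w option (0 is disabled, 1 is optional, 2 is required), so it can also be flipped from a shell on the router. Here's a rough sketch that just prints the uci commands to run. The section names default_radio0 through default_radio2 are an assumption based on what a stock OpenWRT config usually calls them; check yours with "uci show wireless" first.

```python
def disable_mfp_commands(iface_sections):
    """Build the uci commands that turn 802.11w off for each wifi-iface
    section, then commit the change and reload the wireless config."""
    commands = [
        f"uci set wireless.{name}.ieee80211w='0'" for name in iface_sections
    ]
    commands.append("uci commit wireless")
    commands.append("wifi reload")
    return commands


# Assumed section names for a three-radio router; yours may differ.
IFACES = ["default_radio0", "default_radio1", "default_radio2"]

if __name__ == "__main__":
    # Print the commands so they can be reviewed before being pasted
    # into an SSH session on the router.
    print("\n".join(disable_mfp_commands(IFACES)))
```

Printing the commands rather than running them is deliberate: reloading wifi will drop you off the network for a moment, so it's worth reading them over first.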

Why 802.11w caused problems, I don't know. I haven't dug deeply enough into it to tell. I definitely don't know why the set of at-home machines that were having trouble didn't have a discernible pattern. Logically, if Leandra (Intel wireless chipset, running Arch Linux) was having problems, Windbringer (laptop, same chipset, also running Arch Linux) should have also, but this wasn't the case.

  1. Monitoring and alerting for said wireless router is done with System Script, a rewrite of my System Bot as a pure shell script for embedded devices. Alerts are sent via XMPP using a copy of go-sendxmpp compiled for that system architecture.

  2. In a nutshell, 802.11w protects wifi management frames (which implement stuff under the hood like detecting new devices on the network, negotiating joining and leaving the wireless network, and sending out "Hey, I'm here!" beacons) from certain kinds of wireless attacks, like replay attacks and deauth attacks.

  3. This will, of course, punt you from your wireless network until the radio comes back up. If you have more than one wireless network configured on the computer you're working from it'll drop onto the next one. If you're plugged into a hardline you probably won't notice.