While reading the files in /usr/src/linux/Documentation/usb/ I got it in my head to see if anyone else had spent any time reverse engineering the OCZ NIA, or at least had figured out how to get output from it. I spent some time a couple of days ago playing with it on Windbringer (running Gentoo Linux), and all I was able to determine in the short time I worked on it was that it successfully registers itself with the Linux kernel's USB subsystem as a USB Human Interface Device (heh). After collecting some information I put the project down for a couple of days. A simple Google search on "OCZ NIA USB protocol" revealed a wealth of information, and not a little software that I should have gone searching for to begin with. If I had any common sense I would have gone right to the OCZ forums to look at the information that other hardware hackers had already collected. As it turns out a lot of work has already been done reversing the format of the data the OCZ NIA sends to the host computer across the USB bus.
This is essentially a roll-up post on what I've dug up, with links to sources of information and attribution where appropriate. It's more for me than for anyone else, and will probably be updated as I figure things out. Interspersed with notes and links will be some of the results of my own work deciphering the data the NIA sends back. I put this post together, not only because I wanted a way to make notes about the information I found but also to provide a better roll up and summary than the wiki page, which really isn't all that helpful.
After the cut there be dragons basking in the stream of consciousness. This post in the forums contains the first roll-up of links to other sources of information. I found it immensely helpful.
Packets of data coming from the NIA are 55 bytes long each, plus two bytes of command header.
All values are in hexadecimal.
The packet format is thought to make up 16 byte-triplets (one per pickup?), followed by five static bytes (38 bd fd ff 0c), and then a byte that contains the number of bytes containing valid data. As you'll read later, this isn't exactly the case but it was a good starting point. From my cursory analysis of the patents (and information encountered later) the data really isn't arranged by pickup.
A triplet that has no data consists of the sequence (00 12 7a). Triplets are in little endian format, so to figure out the actual value you have to reverse the bytes (00 12 7a becomes 7a 12 00, which is equivalent to 0x7a1200, or 8000000 in decimal).
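To make the byte reversal concrete, here is a minimal Python sketch that builds the value by hand with shifts. The function name is made up for illustration, and it assumes the first byte on the wire is the least significant one, as described above.

```python
def triplet_to_int(b0: int, b1: int, b2: int) -> int:
    """Combine three bytes in wire order (least significant first) into one value."""
    return b0 | (b1 << 8) | (b2 << 16)


# The "no data" triplet 00 12 7a from the packet captures.
value = triplet_to_int(0x00, 0x12, 0x7A)
print(hex(value), value)  # 0x7a1200 8000000
```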
My analysis shows that the byte-triplet hypothesis appears to be correct, but the remainder of the packet's breakdown does not hold up when you look at a larger volume of production data coming out of the NIA. A number of patterns emerge in the data.
I just realized, reading farther down, that someone else already figured that out, which appears to confirm my analysis.
The signals coming from the dermatrodes are said to be 24-bit single bit streams (confusing) that are delta-sigma encoded ADC (analog to digital converter) samples. Mental note: start researching how analog-to-digital converters work and the basics of signal processing. I am not expected to understand this.
More data on the format of the protocol can be found here.
In this thread it is stated that the NIA shows up as a regular HID (Human Interface Device) on the USB subsystem, which concurs with my findings. This simplifies the activity of pulling data out of the device immensely because the existing HID drivers will do the work of pulling data off the bus and putting it into kernel space. I need to figure out how to get those values out of kernel space - I'm hoping it'll be as simple as opening a device node in /dev as a character device to read from, queuing the bytes in an array, and using my copy of 777 parsing the values.
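If reading really is that simple, the queuing step might look something like the Python sketch below. It is only a sketch: read_packets is a made-up name, the 55-byte packet size is the figure quoted in this post and has not been checked against what a device node actually returns, and an in-memory stream stands in for the device so that the code runs without the hardware.

```python
import io

PACKET_SIZE = 55  # packet length quoted in this post; unverified against a real device node


def read_packets(stream, size=PACKET_SIZE):
    """Yield fixed-size packets from a binary stream until it runs dry."""
    while True:
        packet = stream.read(size)
        if len(packet) < size:
            return  # short read: end of stream, or the device went away
        yield packet


# Stand-in for a device node: two fake packets held in memory.
fake_device = io.BytesIO(bytes(range(55)) + bytes(55))
packets = list(read_packets(fake_device))
print(len(packets), len(packets[0]))
```

Against real hardware the stream would presumably come from something like open("/dev/hidraw0", "rb", buffering=0), but that path is an assumption, as is the idea that one read returns exactly one packet.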
From the copy of /usr/src/linux/Documentation/usb/hiddev.txt on Windbringer, the HID protocol is a standard that defines how to do two things: get input from the outside world, and provide a basic interface for things that are technically input devices but not for people, like the microprocessors that monitor UPSes (uninterruptible power supplies). It was meant to be generic so that you can plug whatever USB keyboard, mouse, or trackball you want into a computer and it 'just works' without drivers (unless you're running Windows, and even then you don't really need drivers for anything but really unusual keyboards and mice).
The NIA begins sending data to the USB subsystem as soon as it's plugged in.
All packets begin with the string 'RD 00', which appears to be a USB command sent from the NIA's microcontroller to the host computer to say "Here, I've got a packet of data for you - write this down."
The last thread I linked to is full of raw packet captures from the USB bus, which look like this:
RD 00 A2 FA 7F BA FA 7F 92 FA 7F C2 FA 7F 72 FA 7F DB FA 7F 73 FA 7F DB FA 7F 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 38 BD FA FF 51 54 08
RD 00 73 FA 7F DB FA 7F 73 FA 7F DC FA 7F 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 38 BD FA FF 55 54 04
RD 00 74 FA 7F C4 FA 7F 94 FA 7F BC FA 7F A5 FA 7F A5 FA 7F BD FA 7F 95 FA 7F 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 00 12 7A 38 BD FA FF 5D 54 08
Here's a packet that's been picked apart:
Read command: RD 00
Sixteen byte triplets:
74 FA 7F (this is data but only the first byte changes)
C4 FA 7F
94 FA 7F
BC FA 7F
A5 FA 7F
A5 FA 7F
BD FA 7F
95 FA 7F
00 12 7A (defined above as no data)
00 12 7A
00 12 7A
00 12 7A
00 12 7A
00 12 7A
00 12 7A
00 12 7A
Adjusting the values for endianness and converting them from hex to decimal gives this:
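The conversion itself can be sketched in a few lines of Python. This is an illustration rather than anyone's actual driver code: triplet_values is a made-up name, the packet is the one picked apart above with the "RD 00" prefix dropped, and the triplets are treated as unsigned for now.

```python
# The picked-apart packet from above, minus the "RD 00" prefix.
PACKET_HEX = (
    "74 FA 7F C4 FA 7F 94 FA 7F BC FA 7F A5 FA 7F A5 FA 7F BD FA 7F 95 FA 7F "
    + "00 12 7A " * 8
    + "38 BD FA FF 5D 54 08"
)


def triplet_values(packet: bytes, count: int = 16) -> list[int]:
    """Decode the first `count` little-endian byte triplets of a packet."""
    return [
        int.from_bytes(packet[i:i + 3], byteorder="little")
        for i in range(0, count * 3, 3)
    ]


packet = bytes.fromhex(PACKET_HEX)
for slot, value in enumerate(triplet_values(packet)):
    # Slot number, value in hex after the byte swap, value in decimal.
    print(f"{slot:2d}  0x{value:06x}  {value}")
```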
Four static values:
38 BD FA FF
The next byte (5D in this packet) increases by four and then eight, alternating with each packet. The progression for this value appears to go something like this:
51 -> 55 -> 5d -> 61 -> 69 -> 6C -> 74 -> 78 -> 80 -> 84 -> 8c...
The byte in the next-to-last position of each packet is 54 in all three captures above.
The final byte in each packet (08 in this packet) flips between 04 and 08 over and over again. The pattern for this marker's value:
08 -> 04 -> 08 -> 04 -> 08 -> 04 -> 08 -> 04
The pattern the byte after the static marker I quoted above makes:
51 + 04 == 55
55 + 08 == 5d
5d + 04 == 61
61 + 08 == 69
69 + 04 == 6d (the capture shows 6c, one short of that)
6c + 08 == 74
74 + 04 == 78
So, the final byte of each packet appears to be the increment for the timing byte.
This might wiggle a little bit because samples are not taken at whole-millisecond intervals but at 1024 samples every second (one roughly every 0.977 ms). Thus, the timing byte's value might be skewed a little to either side of what is predicted. This is why I disagree with Half-Dead on the forums, who stated that this is the number of data-bytes in the packet.
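The wiggle can be checked against the progression quoted earlier with a short Python sketch. The helper name is made up, and treating the timing byte as an 8-bit counter that wraps at 256 is my assumption; nothing above confirms it.

```python
# Timing-byte progression quoted earlier in this post (hex).
progression = [0x51, 0x55, 0x5D, 0x61, 0x69, 0x6C, 0x74, 0x78, 0x80, 0x84, 0x8C]


def deltas(counter_values, modulus=256):
    """Step between consecutive counter bytes, allowing for 8-bit wraparound."""
    return [
        (later - earlier) % modulus
        for earlier, later in zip(counter_values, counter_values[1:])
    ]


steps = deltas(progression)
print(steps)
print("steps other than 4 or 8:", [s for s in steps if s not in (4, 8)])
```

Every step in that progression is 4 or 8 except one, which comes out as 3 (the 69 -> 6C step).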
A 60 Hz background signal (because the US uses 60 Hz current) will have to be filtered out of the data to get a clean signal. Elsewhere in the world, a 50 Hz signal will have to be filtered out.
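One common way to remove mains hum is a notch filter. The Python sketch below uses the standard biquad notch from the Audio EQ Cookbook in pure Python; it is not taken from any of the NIA software, the Q value of 10 is an arbitrary choice, and the 1024 Hz sample rate is the figure hypothesized on the forums. The test signals are synthetic sine waves, not NIA data.

```python
import math


def notch_coefficients(f0, fs, q=10.0):
    """Biquad notch coefficients (Audio EQ Cookbook form), normalised so a0 == 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def biquad(samples, b, a):
    """Run a direct-form I biquad over a sequence of samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


FS = 1024  # samples per second, per the figure quoted in this post
b, a = notch_coefficients(60.0, FS)
hum = [math.sin(2 * math.pi * 60 * n / FS) for n in range(2 * FS)]
slow = [math.sin(2 * math.pi * 10 * n / FS) for n in range(2 * FS)]

# Skip the first second so the filter has time to settle.
hum_peak = max(abs(v) for v in biquad(hum, b, a)[FS:])
slow_peak = max(abs(v) for v in biquad(slow, b, a)[FS:])
print(hum_peak)   # close to zero: the 60 Hz tone is removed
print(slow_peak)  # close to one: a 10 Hz tone passes through
```

For 50 Hz mains, the first argument to notch_coefficients would change to 50.0.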
The data coming out of the interface/microcontroller appears to be buffered because the number of data-triplets in each packet is variable. This gives the client-side software a chance to look at each data packet, figure out how much real data is present, extract the data, and move on.
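Here is a hedged Python sketch of that "figure out how much real data is present" step, run against the three captures quoted earlier. It counts the triplets that are not the 00 12 7A filler and, as one possible reading, also tries using the final byte as a count of valid triplets. The function name is made up, and whether the final byte can be trusted this way beyond these three packets is not established.

```python
FILLER = bytes.fromhex("00 12 7A")  # the triplet this post identifies as "no data"

# The three captures quoted earlier, without the "RD 00" prefix.
CAPTURES = [
    "A2 FA 7F BA FA 7F 92 FA 7F C2 FA 7F 72 FA 7F DB FA 7F 73 FA 7F DB FA 7F "
    + "00 12 7A " * 8 + "38 BD FA FF 51 54 08",
    "73 FA 7F DB FA 7F 73 FA 7F DC FA 7F "
    + "00 12 7A " * 12 + "38 BD FA FF 55 54 04",
    "74 FA 7F C4 FA 7F 94 FA 7F BC FA 7F A5 FA 7F A5 FA 7F BD FA 7F 95 FA 7F "
    + "00 12 7A " * 8 + "38 BD FA FF 5D 54 08",
]


def valid_samples(packet: bytes) -> list[int]:
    """Return the leading triplets, treating the final byte as a triplet count."""
    count = packet[-1]
    if count > 16:
        raise ValueError("count byte exceeds the 16 triplet slots")
    return [
        int.from_bytes(packet[i:i + 3], "little")
        for i in range(0, count * 3, 3)
    ]


for text in CAPTURES:
    packet = bytes.fromhex(text)
    slots = [packet[i:i + 3] for i in range(0, 48, 3)]
    non_filler = sum(1 for slot in slots if slot != FILLER)
    # Final byte, number of non-filler triplets, number of samples extracted.
    print(packet[-1], non_filler, len(valid_samples(packet)))
```

In these three packets the final byte and the number of non-filler triplets agree (8, 4, and 8).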
The data-triplets appear to be signed values, judging by their magnitudes and by how they look when graphed.
One person researched the properties of the electrical activity of the human brain and inferred that signals above 30 Hz can be filtered out of the data stream, which simplifies data processing by removing signals which don't actually have anything to do with the brain's electrical activity.
A thread specifically about developing Linux software for the OCZ NIA.
It has been hypothesized that 1024 samples per second are pushed down the USB bus by the PIC microcontroller. Each sample is comprised of 24 data channels of one bit each. Using one of the samples from earlier, the bit patterns would look like this:
(I love pcalc.)
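pcalc aside, the same bit patterns can be produced with a short Python sketch. The function name is made up, and printing the bits most significant first after swapping the byte order is my assumption about how the 24 one-bit channels would line up.

```python
def bit_pattern(triplet_hex: str) -> str:
    """Render a little-endian byte triplet as 24 bits, most significant first."""
    value = int.from_bytes(bytes.fromhex(triplet_hex), "little")
    bits = format(value, "024b")
    # Group into nibbles for readability.
    return " ".join(bits[i:i + 4] for i in range(0, 24, 4))


for triplet in ("74 FA 7F", "C4 FA 7F", "00 12 7A"):
    print(triplet, "->", bit_pattern(triplet))
```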
Interrogating the USB subsystem showed pretty much the same results I got when I first started messing around with this unit.
Someone on the forums with a background in the hardware end of neurology modified the headband to use real medical dermatrodes rather than the diamond-shaped plastic pickups. If you hunt long enough (and you're quick) you can sometimes find strips of disposable dermatrodes on eBay.
There is already a project called NIA4Linux that has written some driver and application code for the NIA, though you have to check the code out of the Subversion repository. Also, development seems to have tapered off. However, buried in the directory structure (in nia4linux/trunk/docs/usb_traces) are a bunch of data dumps that will be helpful.
In one of the threads, Half-Dead posted his (or her) data samples in the form of Excel spreadsheets. Opening them in OpenOffice and looking at the graphs included (nice going, Half-Dead) shows the data organized by sample number and value, and a couple of familiar looking graphs.
Some useful information on electroencephalography, in particular, the frequencies of the various sorts of electrical activity in the brain can be found at Wikipedia. Here, I start making notes on electroencephalography in general.
A timing signal is required for sampling. This timing signal then has to be filtered out.
The signals picked up by an EEG are actually differences between the voltages picked up by pairs of electrodes placed on the scalp or (rarely) inside the cranium.
A number of signal correlation methods may be used during an EEG; each characterizes different features of the volume of data collected. Only one of these happens to be relevant to the OCZ NIA because it's implemented in the PIC microcontroller.
Table of wave type/frequencies:
- delta waves - 0-3 Hz; slow wave sleep, infants
- theta waves - 4-7 Hz; drowsiness, arousal, idling, small children
- alpha waves - 8-12 Hz; relaxation, meditation, introspection, lack of visual input due to closing the eyes (interesting - open eye meditation?)
- beta waves - 12-30 Hz; active, busy, conscious, thinking, concentrating
- gamma waves - 30-100 Hz; unspecified cognitive and motor functions
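For later use in code, the table above could be kept as a small lookup like the Python sketch below. The names are made up. The table leaves gaps (between 3 and 4 Hz, for example) and lists 12 Hz and 30 Hz under two bands each, so treating each band as running from its lower bound up to, but not including, the next band's lower bound is my own choice.

```python
# (name, lower bound in Hz, inclusive; upper bound in Hz, exclusive)
BANDS = [
    ("delta", 0.0, 4.0),
    ("theta", 4.0, 8.0),
    ("alpha", 8.0, 12.0),
    ("beta", 12.0, 30.0),
    ("gamma", 30.0, 100.0),
]


def band_for(frequency_hz: float):
    """Return the name of the band containing the frequency, or None if out of range."""
    for name, low, high in BANDS:
        if low <= frequency_hz < high:
            return name
    return None


for freq in (2, 6, 10, 20, 60, 150):
    print(freq, "Hz ->", band_for(freq))
```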
The uses of this device for biofeedback are obvious. Looks like I really have a reason to learn Python, now.
I'm curious about what the electrical activity of the brain looks like during certain ritual activities.
Clinical EEGs use an internationally-recognized naming system for the positions of the electrodes. There are 19 recording dermatrodes, one ground (mine was stuck to my chin during the sleep studies), and a system reference (which I don't recall).
A high pass filter is used in clinical EEGs to scrub slow signals below a cutoff that is typically set between 0.5 and 1.0 Hz. Conversely, a low pass filter is used to scrub signals above a cutoff that is typically set between 35 and 70 Hz (a range which includes interference from the power supply's output); the signals it removes come mostly from electromyographic activity in the muscles of the scalp and face. However, the OCZ NIA is designed to pick up these signals, so from the other information I've dug up there is no low pass filter in the unit, or if there is, it lets more through than the clinical 35 to 70 Hz setting. This is further suggested by a poster who wrote that a 50 Hz signal (the poster lives in the European Union) would have to be filtered out of the data. This won't be an issue because it happens inside the black box (literally, in this case).
Artifacts, or glitches in the data from other sources (such as galvanoelectric activity from the eye muscles (which the NIA is, again, designed to pick up), fluorescent lights, and the cardiac electrical activity) will appear in the data.
Among other items of data, a list of the electronic components found in the interface can be found here.
In this thread it is stated that the center electrode is the reference electrode, which strongly suggests that the referential montage method is used internally by the OCZ NIA's microcontroller. This means less data processing code has to be written. Yay.
The right and left pickups/channels are combined into a single data stream, due to the fact that the values of the samples are signed (values above a certain number can be considered positive and values below a certain number can be considered negative because the most significant bit is used as the sign (+ or -)). Positive values might denote one channel (left?) and negative values might denote the other (right?).
It was noted in this thread that the microcontroller inside the interface probably isn't doing any signal processing, it's just taking samples and passing them over USB to the application software. A few people have commented on the load incurred by the CPU while running the application software, which strongly suggests that the signal processing is, in fact, done outside of the NIA itself.
Svein Skogen on the forums has hypothesized that there are six bytes of data per sample taken due to the fact that the analog-digital converter of the unit can process up to 24 kHz signals and outputs 24 bits of data per sample.
Half-Dead noted that it is more likely that there is a single data stream that needs to be broken into waveforms rather than multiple streams. Half-Dead also suggests that the patent applications OCZ filed should be read for more information because they explain how everything works in those documents. I skimmed them yesterday, and will cache them on Windbringer soon for further analysis.
D3adg0d on the OCZ forums states that he got the NIA working with Neuroserver, software developed by the OpenEEG project. You need to use the NIA reader software he developed to pull signals from the NIA.
The outputs from the NIA should be within the range -8388608 to 8388607 inclusive, which means that figuring out the data format will involve reworking the endianness of the data, converting the hex values into binary, and figuring out what binary format is used (probably 2's complement). It would also be interesting to flip the patterns around entirely (reversing the hex digits turns 0x7ffa94 into 0x49aff7, and the binary interpretations of the two look very different). Whether or not this will actually be meaningful is unknown at this time.
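A minimal Python sketch of those conversions, assuming the samples really are 24-bit 2's complement (which is only a guess at this point). All three function names are made up. It also shows the difference between reversing the hex digits and reversing the individual bits, which give different results.

```python
def to_signed24(value: int) -> int:
    """Interpret an unsigned 24-bit integer as 2's complement."""
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError("value does not fit in 24 bits")
    return value - 0x1000000 if value & 0x800000 else value


def reverse_hex_digits(value: int) -> int:
    """Reverse the order of the six hex digits of a 24-bit value."""
    return int(format(value, "06x")[::-1], 16)


def reverse_bits(value: int) -> int:
    """Reverse the order of all 24 bits of a 24-bit value."""
    return int(format(value, "024b")[::-1], 2)


sample = 0x7FFA94
print(to_signed24(sample))                          # 8387220
print(hex(reverse_hex_digits(sample)))              # 0x49aff7
print(hex(reverse_bits(sample)))                    # 0x295ffe
print(to_signed24(0x800000), to_signed24(0x7FFFFF)) # -8388608 8388607
```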
16 channels of data come from the OCZ NIA; the first five are the most active, the next five show very little activity, the rest don't appear to be used at all. I don't yet understand where this comes from, but looking at the binary representations of the value (scroll up a few times) seems to support this.
Next up: reading the patents to see how things are supposed to work, and possibly researching signal processing to figure out what actually to do to the data coming out of the OCZ NIA.
To everyone on the OCZ forums (which I should have read first), thank you. Your work has been incredibly helpful, and I should have spoken to you folks first to keep from re-inventing the wheel.
My plan of attack involves going over the Brainfingers patents to see how things are supposed to work, followed by figuring out how the HID drivers for Linux work. There has to be a relatively simple means of creating a device node under /dev that I can then open with a short script and pull data out of. Once I get a device node, I can start writing code that will actually do something with the data stream.