Turtles All the Way Down: How to Bootstrap a More Trustworthy Computing Platform From Scratch
- Integrated Circuits
- Fabbing Circuit Boards
- SoCs and Storage
- Firmware and Bootloaders
- Bootstrapping an Operating System
- So, does anyone actually operate this way?
- Copyright and License
Let's lay one thing out first: At some point you're going to have to start trusting your toolchain because it simply won't be possible to accomplish some of the necessary tasks yourself. The lowest possible level seems as good a place as any to start. I mean silicon wafers, the basic component of integrated circuitry. Let's face it, nobody's in a position to turn ordinary sand and handfuls of trace elements into silicon wafers themselves. This is a very complex operation that you can't do in your basement these days. There are lots of very specialized companies that do just that, and anyone in such a position is by definition a central point of failure, because they can do pretty much whatever they want to the wafers and we can't tell.
Chip design these days is more like programming than electrical engineering. When chips were simpler a couple of decades ago (for certain values of 'simple') they were more like analogues of circuits based upon discrete components, just much smaller. Up until the Intel 80386 or thereabouts (or so some of my more knowledgeable colleagues tell me) they were still designed manually, and had blueprints that could cover the floor of a high school gymnasium or larger. Hardware description languages like Verilog, ABEL, and VHDL made it possible to define the myriad functions of microprocessors in much the same way as the software that runs on them. There are even simulators which allow chips specified in an HDL (hardware description language) to be compiled and executed in virtual environments for testing so that chips don't have to be fabbed to be debugged. The languages' compilers take the instructions and convert them into circuit diagrams and chip layouts which are used to fabricate the wafers.
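To make the "chips as software" idea a bit more concrete, here's a toy sketch in Python. It is not a real HDL and not how Verilog or VHDL tools work internally; it just models logic gates as functions, wires them into an adder, and "simulates" the result against ordinary arithmetic before anything is fabbed. All of the function names are made up for illustration.

```python
# Toy gate-level model: each "gate" is a plain function on 0/1 values.
def xor_gate(a, b):
    return a ^ b

def and_gate(a, b):
    return a & b

def or_gate(a, b):
    return a | b

def full_adder(a, b, carry_in):
    """One-bit full adder built only from the gates above."""
    partial = xor_gate(a, b)
    total = xor_gate(partial, carry_in)
    carry_out = or_gate(and_gate(a, b), and_gate(partial, carry_in))
    return total, carry_out

def ripple_carry_add(x, y, width=8):
    """Chain `width` full adders together, least significant bit first."""
    carry = 0
    result = 0
    for bit in range(width):
        a = (x >> bit) & 1
        b = (y >> bit) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << bit
    return result, carry

def self_test(width=4):
    """Exhaustively compare the gate model against ordinary arithmetic."""
    for x in range(1 << width):
        for y in range(1 << width):
            result, carry = ripple_carry_add(x, y, width)
            assert result == (x + y) % (1 << width)
            assert carry == (x + y) >> width
    return True
```

Calling self_test() plays the role of a (very small) simulation run: if the wiring in full_adder() were wrong, the comparison would fail long before any hardware existed.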
It's certainly possible for hackers to use HDLs to develop their own chips. There are open source HDL tools like Verilator and Icarus Verilog (both Verilog simulators) which are fully usable for this purpose. There are websites like opencores.org which are directories of gratis and/or libre IC specifications that you are free to download, read (and audit!), tinker around with, fabricate, and use yourself. There are even fully operational CPUs on those sites - 8-bit, 16-bit, 32-bit, and 64-bit. Some of them are re-implementations of classic microprocessors like the 6502 and M68k, and there are open designs like OpenSPARC, OpenRISC, and OpenFire, just to name a few. There are also open source CPUs like the LatticeMico32 which are not only fully operational microprocessors but also have their own software development toolchains. To put it another way, with a little patience you could download the designs for probably all of the chips you need, verify them (if you're of a mind to), modify them if you feel a need, and implement them somehow so you can use them. My advice is to pick something that you can get real work done on every day. Go 64-bit if you can, but use a 32-bit core if you have to.
A sound question to ask is, how can you trust that your copy of the HDL source code for a core hasn't been tampered with? The answer is that you can't unless you wrote the whole thing yourself. There aren't any security measures approaching perfection unless you have absolute control of the entire development process starting from step 0. You'll have to accept some risks and mitigate other risks as best you can, because the alternative is to give up. One way to mitigate the risk is to check out copies of the code from multiple repositories around the Net (yay, distributed version control systems like Git and Mercurial) and compare them to one another to see if they've been tampered with. This can be done with common tools like find, cmp, and diff. It's also possible to use the revision control tools to compare multiple checkouts of the same repository to make sure that they have the same states and the same commit histories. If n copies of the code from far and wide all match, chances are they haven't been tampered with. If they don't, then their commit histories will diverge after a certain point, which narrows down where you need to check for shenanigans. Verifying the digital signatures on downloads helps here, too. You do check the signatures, right?
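As a rough sketch of the "compare n copies" idea, here is a small Python script that does more or less what a find/cmp/diff pipeline would: it hashes every file in several checkouts and reports any path whose contents differ (or that is missing somewhere). The function names and the choice to skip .git and .hg directories are my own assumptions, not part of any particular tool, and a matching result only tells you the copies agree with each other, not that the code is benign.

```python
import hashlib
from pathlib import Path

def tree_digests(root, skip_dirs=(".git", ".hg")):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Version control metadata differs between clones, so leave it out.
        if any(part in skip_dirs for part in rel.parts):
            continue
        if path.is_file():
            digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests

def compare_checkouts(roots):
    """Return {relative_path: {root: digest_or_None}} for files that differ."""
    per_root = {str(r): tree_digests(r) for r in roots}
    all_paths = set()
    for digests in per_root.values():
        all_paths.update(digests)
    mismatches = {}
    for rel in sorted(all_paths):
        seen = {root: digests.get(rel) for root, digests in per_root.items()}
        # More than one distinct value means a differing or missing file.
        if len(set(seen.values())) > 1:
            mismatches[rel] = seen
    return mismatches
```

Something like compare_checkouts(["mirror-a/core", "mirror-b/core", "mirror-c/core"]) (hypothetical paths) returns an empty dictionary when every copy matches, and otherwise tells you exactly which files to go stare at.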
It's painful to admit (and a lot of sufficiently paranoid people are going to cry foul at this point), but at some point you're going to have to trust someone. In our case we need to have some amount of trust that other hackers who are working on bits and pieces of this project are not up to any skullduggery. I know, this runs totally counter to life in the grimdark cyberpunk future we live in, but we're all in this together. The alternatives are either to teach yourself how to design microprocessors pretty much from the ground up, or to abort the plan. The former is entirely possible - proof above by freely available practical implementation - but how much time do you have on your hands to do so?
So, time to sit down, figure out your use cases, and plan your project appropriately. What do you want to do with your trusted open system? Start picking out open cores that match those functions.
The next phase of the trusted open computer project is actually manufacturing usable integrated circuits that you can plug into a circuit board, apply power to, and use to do whatever it is that you do. In other words, processing information.
I hate to be a killjoy, but this is really hard. A vital question that we have to ask at this point is whether or not this is the point at which the project is pwnable by a determined third party. Fabbing integrated circuitry on silicon wafers is, to be gentle, a nontrivial process. Here are a couple of links on how the process can go:
Even Jeri Ellsworth, who is somebody for whom I have the utmost admiration and respect, spent two years building a semiconductor foundry in her home and as far as I know (and I'd love to hear different) she's only up to fabbing her own discrete transistors. I think that says something for how far we are from fabbing our own ICs at home. In case you're curious here's her photo album for her home semiconductor fab. It's a breakthrough, to be sure, and I have no doubt that she will one day fabricate her own CMOS and NMOS chips. However I think that day is a considerable period of time in the future and, for the purpose of practically implementing a computer system from the ground up, doing so is not an option yet. But what can we do that will get us results?
A more commonly used method of implementing custom chips is compiling the design into an FPGA (field programmable gate array), which is a chip that you can load code into to turn it into whatever you want, like a CPU, a math coprocessor, a communications controller, a graphics controller, or something more unusual (like a signal processor). The nature of FPGAs is that they are slower than fabbed chips (but they're getting better, and some high-end Xilinx FPGAs run at speeds practically indistinguishable from ASICs) but if you don't have your own chip foundry they're probably the only realistic option open unless you feel up to the task of building functional versions of the ICs out of discrete components. Which you can do, as the following examples depict:
- http://www.evilmadscientist.com/2013/555-kit/ (fixed!)
- MT15 transistorized CPU
- A gallery of various homemade CPUs
By their very nature they'll be orders of magnitude larger than the originals and will probably have unusual quirks due to the constraints of macroscale electrical engineering. So, if you're the sort of person who is not willing to sacrifice some speed just to have a working system, you're out of luck at this point.
FPGAs also notably differ in the number of logic gates they are constructed out of. Some have enough hardware on board - microprocessors, clocks, and other stuff - that they can be used to fabricate full Systems-On-A-Chip, which I'll talk about later. Some designs may require FPGAs of a minimum number of gates, speed, or other onboard functionality so a hacker would have to be mindful of the specifications of the FPGAs acquired and match them to their intended purpose. Additionally, the un-flashable onboard subsystems of some FPGAs may be subvertable somewhere along the supply line, certainly at the factory but possibly closer to home. I don't feel comfortable speculating any further than that due to my comparative lack of hands-on knowledge of FPGA technology. Then there is the argument of open source versus closed source FPGA synthesis and programming software. The few times I've mentioned using FPGAs to construct computers while doing research for this article the discussion rapidly turned into a debate over the ethics of using closed source software to build open source integrated circuitry. I'm not going to restart that debate; plenty of people already do so every day. This is one of those pain points where you have to decide how far you're willing to go to achieve your goals. This might put you out of the game, it might change your direction, or it might not.
After doing some research I found that there are several HDLs out there, but there are not many open source software packages that take the synthesized binaries and load them onto the FPGA chips. Such applications are often manufacturer (and sometimes FPGA model) specific, often expensive, and sometimes have EULAs that have dissuaded many people from reverse engineering them to write their own versions. However, I found a couple of non-proprietary applications which carry out the same tasks. xc3sprog is quite handy, though you really need to study the documentation (and play around with it a little) before using it for anything moderately complex. xilprg has been around for a while but it hasn't been updated since 2006, which is a bad sign for an open source project. jwrt is an FPGA programmer of more recent vintage. Qflow is a very active software project for turning HDL code into an FPGA bitstream or ASIC fabrication circuit layout (interestingly). The Qflow project mentions another open source FPGA synthesis package called Yosys, which would also be of interest in the context of this project. An outfit called Qi Hardware, which specializes in free and libre hardware, maintains some public domain software called fpgatools which can be used to flash bitstreams into a couple of models of FPGA.
The tools listed above more or less dictate the hardware one could use in a ground-up computing stack effort. Not all of these applications work with all FPGAs, which seems as if it might limit the applications certain FPGAs could be put to. Additionally, when most FPGAs are powered off they go back to their default blank state and must be reprogrammed when next powered on. The firmware needs to be reloaded from some other storage medium, like an EEPROM, an SD card, or another computer via some interface (such as serial or point-to-point Ethernet). When an FPGA is powered up it reaches over a particular bus or into a particular region of memory (make and model dependent, of course), pulls its bitstream over, and then goes into execution mode. Some FPGAs seem to have internal flash-like memory for storing bitstreams on a semipermanent basis, which is implied by the existence of open source software which is capable of flashing bitstreams into FPGAs over a programming interface but not into attendant storage media, if I'm reading the docs correctly. Some FPGAs seem to be permanently flashable - upload a bitstream and it permanently becomes that kind of chip.
Incidentally, a couple of years ago HacDC offered a course on FPGA design - here's the wiki page with lots of links to resources, and if video is available, I'll link to it.
FPGAs tend to be on the expensive side, sometimes costing a significant percentage of the cost of a new laptop. A list of higher-end FPGA development boards can be found here. Such development kits can, of course, be acquired through other means depending upon one's basic proclivities. I leave this as an exercise to the reader. I would be remiss in not pointing out that there are also some remarkably inexpensive FPGA experimenters' boards available on the open market, on the order of US$70 apiece. The Papilio seems reasonably popular among hardware hackers and can be purchased from Sparkfun at very reasonable prices. They also sell another, similar board called the Mojo. Ztex FPGA boards are also designed for people to hack around on and with, and their toolchain is open source (GPLv3), which is a plus. Unfortunately, this means that you now have to trust the manufacturer of the FPGA chips (and/or the reference boards) to not have backdoored them, or the board, or any firmware of other chips on the board...
There are only a couple of strategies for hopefully getting not-backdoored FPGAs because we do not yet live in a world in which there are black market chip foundries run by criminal cartels, the products of which are less likely to have nasty surprises embedded in them. Probably the best thing that a determined hacker can do is find functionally equivalent FPGAs from multiple manufacturers that are suitably complex to carry out the tasks at hand. Try to source them at different times from different vendors in as many jurisdictions as possible to ensure that it is difficult to predict which batches are going where and when, and hope that you get a shipment that hasn't been intercepted and modified. If possible, source them by driving to the stores personally, and buy as randomly as possible from the lots available to try to evade targeted attacks with compromised chips. Then pick the FPGAs you want to use randomly from all of the ones that you acquired. Tip of the fedora to my cypherpunk cow-orkers for helping me figure that acquisition strategy out. It is by no means the only possible strategy.
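The last step of that strategy, picking chips at random from everything you acquired, is easy to get wrong if you "pick randomly" by eye. A minimal sketch, assuming you keep a list of labels for the parts on hand (the "vendor/lot/serial" label format below is purely hypothetical), is to let the operating system's entropy source do the choosing:

```python
import secrets

def pick_parts(inventory, count):
    """Pick `count` parts uniformly at random, without replacement.

    `inventory` is a list of labels you assign yourself; nothing here
    talks to real hardware or to any vendor.
    """
    if count > len(inventory):
        raise ValueError("asked for more parts than were acquired")
    rng = secrets.SystemRandom()  # draws from the OS entropy source
    return rng.sample(list(inventory), count)

# Hypothetical inventory built up from several purchases.
acquired = [
    "vendor-a/lot-17/0003",
    "vendor-a/lot-17/0011",
    "vendor-b/lot-02/0140",
    "vendor-c/lot-88/0007",
    "vendor-c/lot-91/0032",
]
chosen = pick_parts(acquired, 2)
```

This only removes your own habits from the selection. It does nothing about a compromise that affects every part in every lot, which is why the sourcing advice above still matters.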
Fabbing Circuit Boards
This brings us right along to designing and fabricating the circuit boards that our bright, shiny new open source chips will plug into. This level of complexity is probably one of the best understood parts of the development process. Arguably electrical engineering has been around since the discovery of electricity, because a circuit of some kind is required to guide an electrical current to do useful work. You could make the case that the wet string that Benjamin Franklin's kite was tied to was one of the first electrical conductors (because the Baghdad battery hypothesis has too many holes in it for my liking (I know, I know, Mythbusters disproved Ben Franklin in the thunderstorm, I need an allusion of some kind to make this work)). The advent of EDA software has accelerated the design process by at least several orders of magnitude compared to the days of drafting circuitry on sheets of paper.
In recent memory I've experimented with two open source packages, KiCad and gEDA. These applications can handle even the design of multi-layer circuit boards, but I will warn you that some messing around with them is required to get used to their user interfaces and toolsets. Hypothetically speaking, if you start by loading up the proof of concept circuit schematics in your EDA package of choice you can start the process of manufacturing a mainboard for our hypothetical trusted and open source computer. Many open hardware projects make KiCad or gEDA files available along with their source code, so it is certainly worth messing around with them to get a feel for things. More experienced engineers no doubt have some ideas for what they want to accomplish, know what they are doing by definition, and are probably inclined to get right to work. Rather than attempt to go into detail that I don't understand well enough, I'll leave it as an exercise to the reader (don't you hate it when people say that?) and push on to the topic of circuitry manufacture.
Having a set of schematics for a mainboard is all well and good but they have to be fabricated if they're going to be useful. That means transferring, etching and drilling at least one printed circuit board for our trusted and open computing platform. Hobbyists and hackers have been etching their own circuit boards for decades employing a variety of techniques. For example, here's a fairly simple technique that requires chemicals no more exotic than vinegar and hydrogen peroxide, both of which can be acquired from the corner grocery store. There is a more complex technique which involves ferric chloride, isopropyl alcohol, and baking soda, the first reagent of which can be purchased at a well stocked hardware store or Radio Shack.
The layout of the circuitry must be transferred onto the copper clad board somehow, and then the copper that isn't part of the circuitry must be removed. There are many ways to accomplish both tasks:
- silk screening
- use a CNC router to cut the extraneous copper from the board
I'll be honest with you, I don't have any first hand experience with working with them, aside from tearing stuff apart that used them once in a while. I've hung out at HacDC when folks were doing it and I've read about it.
It also seems feasible that flexible circuits could be used in this project instead of rigid plastic, metal, or fibreglass boards. They're certainly more lightweight than copper-clad boards. There is a highly rated tutorial on how to make your own flexible circuit boards that may or may not be of interest in the context of this discussion. There are flexible circuit construction kits available on the open market (much to my surprise). It is also possible to cut and lay out your own flexible circuit boards if you have access to a vinyl cutter. If your friendly neighborhood hackerspace or workplace has a laser cutter a slightly different method can be used to make flexible circuit boards. Or if you have the cash (and a spare printer lying around) you can acquire inkjet printer tanks that have conductive ink in them and print out circuit boards at home. Unfortunately the boards then need to be cured, and so far as I know nobody's got a way to do that at home.
The circuitry in much of the computing hardware we use today is composed of multiple layers, because there are so many traces connecting so many components that putting them only on the top and bottom of the board isn't enough. Even if the traces are creatively packed there's only so much surface area available on a given board. The solution is to make each layer of the circuit board as thin as possible, make them interconnect by any means necessary, stack the layers in the correct order, and then laminate them together. Circuit traces on different layers of the board are connected with vias - contact pads that bridge layers together. Rik te Winkle wrote a tutorial on how to use EAGLE to draft multi-layer boards, which should be of use: the exact methods won't apply to our aforementioned F/OSS CAD software, but the design techniques will.
Some hobbyists have painstakingly etched and cut each layer and soldered them together, which certainly gets the job done. If you've got some spare cash you might want to consider acquiring or building a 3D printer capable of making circuitry, similar to the EX1 from Cartesian. The EX1 is also open source, so at some point hackers are going to start building their own... alternatively, you could print your boards on a more conventional 3D printer and use a conductive material to form the circuit traces.
The reason I mentioned flexible circuitry first is because multiple double sided flexible circuit boards could be stacked and laminated together to create multi-layer circuit boards. Flexible circuitry is very thin and seems amenable to the process. It should be possible to do the same thing with home-etched PCBs but the resulting boards would tend to be thick, heavy, and you might have heat dissipation problems. Nothing that couldn't be mitigated, but those are factors to keep in mind. Again, I have no first hand experience with this, so if anybody knows better please leave a comment. I'm aiming for as practical a solution as possible. After the boards are etched and stuck together, holes for legs of discrete components will need to be drilled, and then the components - from the most basic resistors, diodes, and capacitors to sockets for the integrated circuits - must be placed on the board in the correct locations. Depending on the size of the components used (traditional discrete components through commercially available SMT), the amount of difficulty involved in positioning them on the board is variable. Personally, I would say the process ranges in laboriousness from 'tricky' to 'fuck it'.
Then everything needs to be soldered down, and the circuits need to be tested and probably debugged. This is not unusual for experienced hackers or electrical engineers, it's all a part of the process. There are probably other dynamics that I'm unaware of - my experience with electronics isn't quite rudimentary, but neither am I an expert. Let's just say that I know enough to know that some pretty weird stuff can happen on the bench, but not enough to know what it might be or how to fix it. Good luck.
SoCs and Storage
This brings us along to designs that are rather common even though we don't normally think of them as either common or systems. By this, I refer to SoC's - Systems On A Chip. As the name implies, they are full (or nearly so) computers implemented as single mother-huge silicon chips (relatively speaking). On the die you'll find a CPU or microcontroller, supporting electronics for same, an MMU, and enough interfaces to do whatever you want, be it plug in a USB keyboard and mouse, an Ethernet adapter, or a simple USB-to-serial converter circuit. An excellent example of a SoC is the Broadcom BCM2835, which forms the nucleus of the ultra-cheap RaspberryPi, which has become one of the most popular hardware platforms for hackers since the Arduino due to its low cost and the fact that it runs Debian GNU/Linux right out of the box. You can do almost everything with a RasPi that you can with a full-sized laptop or desktop machine. To be sure it has less RAM (512 megs on the high-end rev.B) than we're used to these days, but it's quite sufficient for such tasks as word processing, browsing the web, building multimedia entertainment centers, emulating other computers, home and vehicular automation, near Earth orbit launches...
The RasPi's firmware, which we don't have the source code for, is a sticking point of this platform. As has been mentioned previously, firmware blobs can potentially be doing anything and we wouldn't know unless the time was taken to reverse engineer them. The distributions of Linux and BSD that run on the RasPi seem to have a "whatever it takes" attitude toward getting the job done, and have incorporated code that loads the vendor-supplied firmware. This has caused several projects to flat out refuse to port their OSes to the RasPi.
Another example of a commercial SoC is the Vortex86DX2, which is a 32-bit x86-compatible CPU that also incorporates USB, SATA, PCIe, Ethernet, JTAG, and several other peripherals and interfaces on a single chip. It runs at 800MHz and consumes only a hair over 2 watts of power. The downside is that it's a closed source commercial product so you can't audit what might be going on inside the chip. Once again, if you bounce over to OpenCores' System On Chip page (note: that link goes to the projects page because there's no way to link to a specific category) you'll find a collection of open source SoC's that you can flash onto appropriate FPGAs and run with. I would also like to mention that a few hackers have re-implemented entire computers on single FPGAs, from the CPUs all the way to the graphics controllers and audio chipsets. Case in point: the Suska, a stem-to-stern reimplementation of the Atari ST on a single chip. In 2012 the Open Source Hardware User Group of London held a seminar on how to implement an OpenRISC SoC on a £50 FPGA board, which demonstrates that doing so is entirely feasible for an eminently reasonable cost. There is also the ORPSoC (OpenRISC Reference Platform System on Chip), which combines an OpenRISC OR1200 CPU with a set of peripherals. Another option is the J-Core Open Processor, a BSD-licensed processor core and system-on-a-chip design that you can compile into a bitstream and flash into a relatively cheap FPGA; it's also been through at least one security audit. A potential risk to the project is incurred in the FPGAs themselves; some of the more powerful units have lots of processing power on board that could potentially have been subverted somewhere along the line. Caveat hacker.
The trusted and open computer we are hypothesizing is going to need some peripherals so we can interact with it. At a minimum, we need a display, a keyboard, probably a mouse, and storage (volatile and not). If you are able to build your own high-res display comparable to those commercially available these days, please post your instructions so I can mirror them. So far as I know at this moment in time, constructing a display from scratch is probably not feasible. Using a different kind of display, like an older television with an RF modulator, is possible but you might not get very high resolution graphics out of it. Even the RasPi's composite video output tops out at 720x576 (on a PAL television), though its HDMI output is considerably better. People like graphics. Going the text-only via serial terminal route is possible but, let's face it, nobody's going to do it. Depending on how hardcore one is, repurposing displays that are sufficiently simple that the circuitry can be eyeballed to determine that it was probably not tampered with is a possibility. Salvaged laptop display panels, for example. However, just to get this show on the road we're going to use a commodity video display, USB keyboard and mouse. We're going to assume that you've located suitable physical RAM for the SoC or CPU that you've selected for the core of the trusted open computing stack. We'll need a graphics chipset of some kind to make this happen; our options range from high-res LCD controller to VGA display adapter to sourcing a discrete graphics chipset and integrating it into our board.
Another option would be to run our open and trusted system headless, SSH into it, and access applications running on our platform either inside the context of an Xvfb pseudo-display or by exporting the X windows over SSH. We could configure everything as we install the OS to the open and trusted computing platform before booting it if we had to. We would have to produce a USB v2.0 chip to be able to make use of our interfaces. It might be worth our while to investigate an open source I/O controller, which could interface the storage subsystem with the CPU, USB chipset, and whatever other I/O controllers we added. Then again this is also a strongly x86 PC conceit and there are other potential hardware configurations, some of which might be simpler and use fewer FPGAs (which would reduce both complexity and cost). I'll definitely have to give this some more thought.
What about storage? Hard drives or flash media. Can we make them ourselves right now? Probably not. Hard drives we certainly can't manufacture because we don't have clean rooms or the gear to manufacture hard drive platters, so we'll have to source drives from someplace. Except that we know some of their supporting chips are potentially backdoorable before they get into our custody. We might be able to get away with using flash media like SD cards (I do on my netbook), but flash media has embedded microcontrollers that are hackable, too, and they can potentially be compromised before they get into our custody as well. At this point I think it's safe to say that there are always going to be potential points of external compromise, and short of a post-scarcity microfacture facility there isn't a whole lot that we can realistically do except try to estimate how much risk we're accepting by sourcing certain complex components on the open market. Were I in this position I'd use the same strategy outlined in the FPGA post and try to present as small a moving target evidencing as little of a predictable pattern as possible. Good luck.
We'll need an interface controller for whichever storage we go with. There are implementations of IDE and SCSI floating around. I've heard of one or two open source SATA chipset designs floating around but haven't actually tried any of them. It seems reasonable to me that reading up on software emulations of storage controllers, such as those in VirtualBox's source code would be somewhat enlightening at least. Same with CompactFlash or SD card interface controllers. We'll need a USB chipset as well, and there are implementations of that which can be had (unless we opt for getting our hands on a small batch of them on the open market, or if we salvage a few from old mainboards which we can acquire documentation and pinouts for). We might want to consider getting hold of some after-market USB boards and integrating them (but then we're running a significant risk of compromise there - there's no way of knowing what's really in those chips, is there?)
Are there other means of mass data storage that we can use? Certainly. They're probably too slow to be practical these days, though. Magnetic tapes and floppy disks are still around. Kind of. You'll have to really hunt for them, and probably buy and clear used ones from the surplus market to get hold of a supply. Tapes are extremely slow and floppy disks are even slower. It seems reasonable to state that trading off some processor speed for peace of mind might be okay, but 1980's data transfer speeds are unacceptable. Optical disks can't be used for read/write storage in the same way that a hard drive or solid state media can. So, hard drives or flash media with an appropriate file system (like YAFFS, BTRFS, or EXT4) should be sufficient. Of course, you'll want to encrypt your storage media to protect the data in case somebody tries to copy it while you're out and about...
Another question which seems reasonable to ask is, can we go to the trouble of building our own mice and keyboards? The answer is "probably." In the recent past people have done so for special purposes. It's possible if you have the equipment to fabricate the necessary physical parts (like a 3D printer), or access to such equipment. It might not be worth doing so when compared to how much trouble is involved because we're discussing building a mainboard and chipset from as far down the stack as possible. We do want to finish this project in a somewhat reasonable period of time given the overall difficulty of the effort. Then again it would be an interesting exercise (if only because Kinesis keyboards are stupidly expensive). The amount of effort that would go into developing the microcontroller for a custom keyboard or mouse seems as if it would be minuscule when compared to the effort that would go into constructing ICs and a mainboard.
Firmware and Bootloaders
After rethinking this post a little, I feel a need to caveat things: In a previous post in this series I mentioned the possibility of using an open source System On A Chip because it would simplify the construction process somewhat. I've been doing some more research and I'm not certain that all SoC's (if that is the direction a project like this would go in) require system firmware of the sort we're about to discuss. The Broadcom BCM2835 mentioned earlier, for example, has firmware on board that is sufficient to initialize the hardware and then try to load the OS from an SD card, and that's about it. It doesn't seem to use anything even vaguely PC-like insofar as the early stages of the boot process are concerned. So, let's cover this ground because it may be necessary, and if it isn't it would be handy to have some knowledge of a few open source projects that don't get enough attention. If it's not necessary, then we'll figure out what is when the time comes.
Moving up the stack, now we need core code - a set of instructions that the computer loads into memory and executes to bootstrap the rest of the system, starting with initializing a basic set of peripherals and activating the boot loader. PCs have a BIOS, which you've probably encountered when playing around with your machine's configuration. Apple's Intel-based computers use something different that does much the same thing called EFI (Extensible Firmware Interface, which later grew into UEFI, the Unified Extensible Firmware Interface). Sun servers used OpenFirmware for this purpose, which is incidentally open source and bears a BSD license. With those examples in mind, we need something that does the same job but whose inner workings we can inspect.
We may as well check out Coreboot, a GPL'd open source firmware that aims to boot rapidly and handle errors in an intelligent manner. Coreboot tries to be portable to every common platform we're likely to run into, including x86, x86-64, and ARM architectures. While doing some digging I was surprised to discover that, except for 12 instructions in platform-native assembly language the entire codebase is written in ANSI C, which is about as standard as code gets (and is deliberate because the project wants to facilitate code auditing as one of its stated goals). It is also interesting to note that a number of mainboard vendors have helped work on the codebase. Make code run on your product, people buy your product more. Who knew? Coreboot can load and initiate any ELF executable at boot time and can provide PC-standard BIOS services if necessary, both of which may be useful features insofar as this project is concerned.
OpenBIOS is another such project which is also designed to be portable across as many common platforms as possible. It consists of several modules which are indigenous to one or more platforms:
- OpenFirmware for x86, PowerPC, and ARM which does not require a boot loader and can also directly execute ELF binaries.
- SmartFirmware, which is written entirely in C and seems designed to make it easier to write drivers for PCI expansion cards.
- OpenBOOT, which was designed for the sun4v architecture.
- OpenBIOS, which implements a boot ROM.
- SLim Open Firmware, which provides only what is necessary to kickstart a Linux kernel or a hypervisor.
Most of the OpenBIOS modules rely on lower level code to initialize the hardware, like Coreboot, so if we're careful we can assemble a pretty slim stack of code to bring the trusted open computer to life. Depending upon the architecture the project goes with, a different selection of OpenBIOS modules will be necessary. For the purpose of our discussion this will probably be either OpenFirmware or SmartFirmware given the open source CPUs or SoC's available at this time.
The firmware will have to be cross-compiled on another system, which means that we'll need to start figuring out a trusted software development toolchain. I plan on addressing this in more detail in a later chapter, but for now I will only get the obligatory post-Snowden infosec drinking game mention of Ken Thompson's Trusting Trust paper out of the way (Cheers!). For bonus points, I will link to Dr. David A. Wheeler's dissertation Fully Countering Trusting Trust Through Diverse Double Compiling, which documents a practical technique for detecting Trusting Trust-backdoored software development toolchains (Cheers!). Please read it. Until the post in this series where I talk about it, assume that we've carried out the process and feel reasonably certain that we have a toolchain which cross-compiles for our trusted and open computing platform that will not generate binaries with hidden backdoors in them (including copies of itself). The bootstrap process for the cross-compilation toolchain will take a considerable period of time but once it's done, it's done. Make backup copies of the binaries to read-only media and lock them away for later.
Then the project will probably need a boot loader, which will pull the operating system kernel into RAM and start executing it. If you've built a Linux box or two you're probably familiar with GRUB, which many distributions use as their default bootloader. GRUB2 is the only version of the codebase which is still under active development. Its config files are also baroque and most reliably edited with a third-party graphical app. I don't know how you feel about this, but needing to install a desktop environment and an extra app just to edit a freaking boot loader's configuration files really gets under my skin. However, if you're comfortable editing the GRUB2 config files, more power to you. You're a better lifeform than I.
Moving along to other options, LILO is one of the oldest open source bootloaders out there that is still under development. It has a fairly easy to understand and modify configuration file format so it's pretty non-expert friendly. While this seems like a burned-out system admin crabbing for the sake of doing so (and I won't claim that I'm not a burned-out sysadmin), there are advantages to things having simple configuration files. For the threat model we're speculating under, (relatively) simpler is better because it means less confusion and a lower chance of setting things up in an insecure manner. Misconfigurations and bugs often hide in situations of extreme complexity and the more complexity we can eliminate the better for us in the long run. I don't know what platforms LILO is known to work on other than x86 and x86-64; they're the only two places I've used it. There are slightly different versions of LILO out there, like SILO for Sparc, but I have no first-hand experience with them and will defer to those who have to comment.
Barebox is a fairly new bootloader which is aimed at embedded systems, not too unlike the system we're discussing. It is specifically available for platforms like ARM, MIPS, and Nios (which was designed for FPGA use) and has a list of boards that it already works on.
Syslinux is the name of a collection of bootloaders which all do pretty much the same thing, just under different circumstances:
- SYSLINUX is used for booting from NTFS and FAT file systems.
- ISOLINUX is used on optical disks for booting from ISO-9660 file systems.
- PXELINUX is used to boot from a server on a local network. This is nice but probably not useful inside of our threat model because it means trusting another machine on the network to serve our trusted machine boot code, and an attacker serving malicious PXEboot code could compromise the project.
- EXTLINUX does the same thing as SYSLINUX but from EXT2, 3, 4 or BTRFS file systems on non-volatile storage.
- MEMDISK is used to boot older operating systems. We're trying to get away from older OSes (like DOS), so we needn't concern ourselves with this one, either.
It would be worth keeping in mind that Syslinux also includes an environment for writing custom modules. This may or may not be useful in the future for, say, verifying cryptographic signatures on certain aspects of the boot process to determine if they've been tampered with or not.
Das U-Boot is yet another open source bootloader, only this one was designed with embedded systems in mind and so has been ported to multiple architectures, from PowerPC to ARM. The codebase of Das U-Boot was designed so that it's easy to add new boards if the implementation warrants it. The README file lists everything that U-Boot will boot, and it's by no means a short list. Das U-Boot can also boot from lots of storage devices that most systems can't, such as PCMCIA cards, CompactFlash cards, network boot servers, and probably other stuff. It can also be flashed into a boot ROM, which means that it might be of use to this project if a SoC implementation is used.
Now the more-speculative-than-usual bit of this post. With one exception (the distro of Linux which runs on the LatticeMico32 that uses Das U-Boot), I don't know which, if any, of these boot loaders have been ported to any of our potential open source hardware platforms. Doing so might be a fairly easy task to accomplish, or it could be a long and drawn out process that few would be willing to undertake. I don't know, having not tried yet. However, if building one's own computers from the ground up becomes a popular thing to do, this might change in a hurry. At the very least, one of the more portable bootloaders (like Das U-Boot) might find itself with many more functioning platforms. Time will tell.
Bootstrapping an Operating System
Now we need an operating system for the trusted, open source computer. As previously mentioned, Windows and MacOSX are out because we can't audit the code, and it is known that weaponized 0-days are stockpiled by some agencies for the purpose of exploitation and remote manipulation of systems, and are also sold on the black and grey markets for varying amounts of money (hundreds to multiple thousands of dollars). It has been observed by experts many a time that software being open source is not a panacea for security. It does, however, mean that the code can be audited for bugs (vulnerabilities and otherwise) via any number of methods, from manual inspection to dynamic analysis. It also means that the OS and its associated applications can be ported to platforms it's not already available on. And, let's face it, the chances that Microsoft will port Windows to our trusted open source platform are Slim and None, and Slim's in his hotel room in Berlin recovering from CCC.
So, what are our options?
Linux is always a good place to start. It's been ported to some pretty strange hardware in the past and there is even at least one full distribution for the LatticeMico32 that we could make use of if we had to. If, previously, the project went with an ARM architecture then we could use a distribution like Arch Linux for ARM, Debian, or Slackware for ARM. If it followed the MIPS path then it seems probable that one of the distros from this extensive list could be used. It is also conceivable that we could repurpose one of the more popular embedded distros of Linux, such as OpenWRT, as the software platform for this project if we had to (fun fact: OpenWRT has been ported to the x86 platform even though it's actually meant for building embedded devices). We might even be able to cross compile an existing distro using a toolchain like landley/mkroot with a little fiddling, assuming that we can set up an appropriate compilation toolchain. From these links, it seems reasonable to state that if we pick a sufficiently developed and popular hardware platform there is a good chance that a distro will already be available for it.
Moving a little farther afield of the penguin, there are other options available to this project. FreeBSD has already been ported to a couple of embedded(-ish) systems like the Raspberry Pi and the Beaglebone ARM development board. That's a good sign, though it also implies that the software is dictating the hardware again. OpenBSD is already something of a wildcard; love it though I do, it's probably not a good fit for our project unless we show up with a complete set of diffs for the OpenBSD source tree that adds full support for our trusted computing platform. Even then, someone that did so might be turned away. We need to make the best use of our time on an ongoing basis, so OpenBSD's probably off the table. The least-well publicized of the BSDs, NetBSD, has the motto "Of course it runs NetBSD," because it was designed from the bottom up to be portable to as many platforms as possible. Consequently it runs on some pretty exotic hardware like the Alpha, Super-H, ARC, NeXT 68k, and the Sharp Zaurus. It's also readily portable to embedded systems, which arguably our trusted and open platform is if we're talking CPUs emulated on FPGAs and Systems-On-A-Chip.
NetBSD also makes a point of having a proactive security community and model without needing hype, which may be one of the reasons that it's not quite as well known, so we should not discount it without further research. As a platform it is still under heavy development so it's not going away anytime soon. If NetBSD isn't already ported to our trusted reference platform then chances are it should be pretty easy to do with a little work. I'll not make any suggestions here; I would personally go with whatever works the best. If there is a distribution of Linux already available for our platform then let's use it. If there isn't then we have a few options to explore: We can try to find a distro that runs on a similar platform and try to make it work on our hardware. If it doesn't work we can try to port it over, which might be as simple as recompiling the source tree with our trusted crossdev toolchain. Or not. Or we can try something else suitably creative, which I leave as an exercise to the reader because I'm still working on my first cup of coffee of the day. Alternatively we can try a BSD on our trusted and open hardware platform, probably NetBSD due to its stated goals of extreme portability. It does not stand to reason that somebody would go to all the trouble of designing and debugging a CPU or full system for an FPGA without having anything to run on it, even as a proof of concept to build on top of later.
What? You're not a fan of Linux or BSD? Too bad. Today's threat model explicitly states that you can't trust any closed-source software because many companies are strongly motivated to backdoor their code. Besides, there is too much money to be made in selling 0-day vulnerabilities for many hackers to bother disclosing what they find. Doubly so in today's job market, where it's not unusual for highly skilled people to be out of work for several years at a time. A single 0-day can easily bring in a month's wages. That said, Windows and MacOSX are right out. If our software is open source, then not only do we have a better chance of finding and fixing vulnerabilities before they become a problem but we also have additional security options that many closed-source OSes don't, such as setting hard and soft limits on resource usage, Address Space Layout Randomization, jails, and others. It's a very long list and I won't go into all of them. There are lots of options open to us.
Now for the obligatory discussion of a famous paper, Ken Thompson's Reflections on Trusting Trust (Cheers!), which describes an attack in which a compiler is boobytrapped to do two things. First, it detects when certain software (like OpenSSH's sshd or the Linux kernel's TCP/IP stack) is being compiled and inserts a backdoor of some kind into it. Second, it detects whenever it's compiling a copy of itself, determines if the new compiler has the backdoor injector or not, and puts the injector back if it doesn't. The theory is that if the backdoor injector is discovered in the compiler, then the injector can be removed and the compiler suite can be recompiled with the backdoored toolchain... only the backdoor would be silently re-added to the now-considered-clean compiler suite, causing a chicken-and-egg problem. It is a problem. It is also a problem with a solution.
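To make the two tricks concrete, here is a deliberately silly toy model in Python. Nothing in it is a real compiler: "source code" and "binaries" are just strings, and every name (LOGIN_SOURCE, trojaned_compile, and so on) is something I made up for illustration. All it is meant to show is why removing the injector from the source and recompiling doesn't help if the binary doing the recompiling is already spiked.

```python
# Toy model only: "source" and "binaries" are plain strings, and every
# name here is hypothetical, invented purely for illustration.

LOGIN_SOURCE = "program login: accept the correct password"
COMPILER_SOURCE = "program compiler: translate source into a binary"  # clean, auditable

BACKDOOR = " + also accept the attacker's password"
INJECTOR = " + re-insert the injector when compiling a compiler"


def honest_compile(source: str) -> str:
    """What the compiler's source code says it does: a faithful translation."""
    return "binary[" + source + "]"


def trojaned_compile(source: str) -> str:
    """What a boobytrapped compiler binary actually does."""
    if source.startswith("program login"):
        # Trick 1: recognize the target program and slip a backdoor in.
        return "binary[" + source + BACKDOOR + "]"
    if source.startswith("program compiler"):
        # Trick 2: recognize the compiler's own source and put the
        # injector back, even though the source itself is clean.
        return "binary[" + source + INJECTOR + "]"
    return honest_compile(source)


def run_compiler_binary(compiler_binary: str, source: str) -> str:
    """'Execute' a compiler binary: it misbehaves only if it carries the injector."""
    if INJECTOR in compiler_binary:
        return trojaned_compile(source)
    return honest_compile(source)


# Rebuild the compiler from its clean source using the trojaned binary...
rebuilt_compiler = trojaned_compile(COMPILER_SOURCE)
# ...then build login with the freshly rebuilt, supposedly clean compiler.
login_binary = run_compiler_binary(rebuilt_compiler, LOGIN_SOURCE)

print(INJECTOR in rebuilt_compiler)  # the injector survived the rebuild
print(BACKDOOR in login_binary)      # and login is still backdoored
```

Neither LOGIN_SOURCE nor COMPILER_SOURCE contains anything suspicious, which is the whole point: reading the source tells you nothing about what the binary you're compiling it with will do.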
Dr. David A. Wheeler wrote as his Ph.D dissertation a paper called Fully Countering Trusting Trust Through Diverse Double Compiling (local mirror). The paper provides a formal proof and a technique by which a potentially backdoored compilation suite can be detected by compiling it from source code multiple times with multiple untrusted (and independent) compilers. The binaries you start out with are not expected to match one another, and neither are the first-stage builds, because different compilers generate different code. The final comparison is another matter: it only works if the compiler under test produces the same output every time it is given the same input and options. Most compiler suites do not behave that way by default because, at a minimum, compilers tend to put unique data into binaries to identify them, like the date and time of compilation, configuration variables from the compilation environment, varying amounts of debugging information... it's kind of weird when you think about it because it's non-obvious, but there you have it. Those sources of variation have to be pinned down or stripped out before comparing. When carrying out the process, two executables regenerated from the same source code (which we can read and understand) are compared with each other, and differences (like boobytraps) are noted.
First, our assumptions:
- We have the source code to Compiler A.
- We have the source code to Compiler B.
- We have a working build of Compiler A, which we don't necessarily trust.
- We have a working build of Compiler B, which we don't necessarily trust.
- Compilers A and B are totally different. They're unrelated to one another but do the same thing, i.e. compile the same language into executables appropriate to our environment.
Bruce Schneier did an excellent job of summarizing the technique, which I'll reiterate briefly here using his text as a base (because his overview is what helped me make sense of the paper). Here is, roughly, how the diverse compilation process would work:
- Build the source code to Compiler A with Compiler A. This results in Executable X, which is a new copy of Compiler A.
- Build the source code to Compiler A with Compiler B. This results in Executable Y, which is another new copy of Compiler A.
- RESULT: Two copies of compiler A from two different sources which are functionally equivalent to one another, but not bit-for-bit identical.
- Build the source code to Compiler A again with Executable X. This results in Executable V, which is another new copy of Compiler A.
- Build the source code to Compiler A yet again with Executable Y. This results in Executable W, which is another new copy of Compiler A.
- Compare Executables V and W.
- If V and W are bit-for-bit identical (which they should be, because both were built from the same source with the same options by two compilers that are supposed to behave identically), then there is a high probability that your toolchain is not compromised.
- If V and W are not bit-for-bit identical, then one or more of your compilers are up to something sketchy.
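The steps above can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions, not Dr. Wheeler's script: diverse_double_compile only orchestrates the four builds and the final comparison, and it relies on a build callable that you supply. The toy_build function and the byte strings below are hypothetical stand-ins I invented so the example runs without a real toolchain; a real build function would invoke the actual compilers with fixed options.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def diverse_double_compile(build, compiler_a, compiler_b, source_a):
    """Run the two-stage comparison from the list above.

    build(compiler_binary, source) -> bytes is supplied by the caller.
    Returns (match, digest_of_v, digest_of_w).
    """
    x = build(compiler_a, source_a)  # stage 1: A's source built by A
    y = build(compiler_b, source_a)  # stage 1: A's source built by B
    v = build(x, source_a)           # stage 2: A's source built by X
    w = build(y, source_a)           # stage 2: A's source built by Y
    digest_v, digest_w = sha256_hex(v), sha256_hex(w)
    return digest_v == digest_w, digest_v, digest_w


def toy_build(compiler: bytes, source: bytes) -> bytes:
    """Hypothetical stand-in for a real build, for demonstration only.

    A toy binary looks like b"<code style>|<program>|<payload>". The
    compiler's own program decides the style of its output, the source
    decides what the output does, and any payload copies itself forward.
    """
    _style, program, payload = compiler.split(b"|")
    return b"|".join([program, source, payload])


SOURCE_A = b"compilerA"
CLEAN_A = b"vendor|compilerA|"
SPIKED_A = b"vendor|compilerA|injector"
CLEAN_B = b"vendor|compilerB|"

clean_match, _, _ = diverse_double_compile(toy_build, CLEAN_A, CLEAN_B, SOURCE_A)
spiked_match, _, _ = diverse_double_compile(toy_build, SPIKED_A, CLEAN_B, SOURCE_A)
print(clean_match)   # True: V and W agree
print(spiked_match)  # False: the payload rode along into V but not W
```

In the toy model the first-stage executables X and Y differ even when both compilers are clean, while V and W agree, which mirrors the RESULT note in the list. If you feed it two compilers spiked with the same payload, V and W agree again, so the check only means something when at least one of the compilers is independent of the attacker.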
It's a huge paper (199 pages) but one well worth sitting down and reading. Summaries just aren't enough. Dr. Wheeler published his proof of concept shell script for executing these tests on his website. Glancing at it, it seems fairly well written, and would make an excellent base for a script that meets our purposes nicely.
As with many things these days, it's a trust problem. We can mitigate the risk of attackers boobytrapping toolchains by picking compilers from all through the set of all development environments like a frog jumping around on a hotplate. An attacker can't possibly get to all of them before we do (statistically, not even to all versions of any one tool), and if it comes down to it we can use a couple of older toolchains on the hypothesis that newer ones are more likely to be compromised than older ones. All you need is one un-spiked compiler out of the set of all compilers to be able to detect if the generated executables are functionally identical or if one or more are potentially backdoored.
At this point, a good test of our probably-not-backdoored toolchain is to try to compile the OS that we intend to run on our trusted and open computing platform. I don't know what OS you would want to run, so the best advice I can give is to start with the OS' developers' documentation. Figure out how they do a cross-compilation build and get to it. If you're concerned that the environment you're cross-compiling on is going to misuse your trusted crossdev toolkit to backdoor the code anyway, you can and should repeat the Wheeler process on our platform before it's ever hooked up to a network. If you need to (and, let's face it, crossdev kits aren't a cakewalk) consider using an open and auditable tool like Scratchbox to set up a trusted cross-compilation chain. I haven't really looked at it but I wish I'd known about it in a previous life...
Time to install our recompiled-for-paranoia's-sake OS. There are two ways that I can see to go about this, which are to build new installation media (like an .ISO image or a file system image that can be pressed onto a USB key), or you can plug one or more storage devices for the trusted and open computing platform into the machine you're bootstrapping from and do a manual install using a process along the lines of Linux From Scratch. I'm not going to go over this process because they literally wrote the book on it. If you opted to rebuild the installation media, now's the time to boot it on the open and trusted computing platform and install it. If you're not already familiar with alternative operating systems, start by reading their documentation, at the very least reading over the installation process twice. By rebuilding the installation media for a distro you've basically done a remaster on it, and if you did it right their installation should work normally. Chances are you'll want to set up encrypted file systems as well. This is when you would do so using the instructions for whatever OS you're installing (Linux, BSD, et cetera). Install as minimal a set of packages as feasible; this is so that the new system has as small a vulnerability footprint as can be managed. Fewer features mean fewer potential vulnerabilities, which is important at this stage of the game. If all goes according to plan (and for the purposes of this series of articles, no unexpected difficulties will have been encountered that could not be overcome) our trusted open workstation should boot to a login prompt.
Log in as the root user. Now harden the OS you just installed. I shouldn't have to say this. There are lots of hardening guidelines out there for open source operating systems, from Ubuntu Linux to OpenBSD. Google them, save a local copy, read through the document at least once before you try anything, and then follow the process. If you're feeling sufficiently motivated (or hate doing the same thing over and over again), write a shell script as you go that implements as much of the hardening process as is feasible. You may wish to consider setting up a Git repository to put the configuration files that changed during hardening under version control and copy them into appropriate places in the repo. Some people suggest turning your whole /etc directory into a Git repository with an app like etckeeper or metastore. I personally think this is a good idea because it becomes easy to determine if any of the files have been altered, and if so when and how they were altered. Then use software like AIDE or OSSEC to take a snapshot of the state of your file system so you can detect if anything has been changed without your knowing it. Configure a local firewall to deny all traffic that isn't a response to an outbound request, like Ubuntu does. Run as few services listening on the local network as you can get away with. If this sounds like InfoSec 101 stuff, that's because it is. You'd be horrified at the number of entities (people and corporations alike) that don't bother.
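If you've never looked at what a file system snapshot tool does under the hood, the core idea fits in a short Python sketch. To be clear, this is not how AIDE or OSSEC are implemented (they track far more than content hashes, such as ownership and permissions); it's a minimal illustration of the concept, and the function names are my own.

```python
import hashlib
import json
import os


def snapshot(root: str) -> dict:
    """Map each regular file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks, sockets, device nodes
            h = hashlib.sha256()
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests


def compare(baseline: dict, current: dict) -> dict:
    """Report files added, removed, or modified since the baseline was taken."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            p for p in baseline.keys() & current.keys() if baseline[p] != current[p]
        ),
    }


def save_baseline(digests: dict, path: str) -> None:
    """Write the baseline as JSON so it can be stored somewhere safer."""
    with open(path, "w") as fh:
        json.dump(digests, fh, indent=2, sort_keys=True)


def load_baseline(path: str) -> dict:
    with open(path) as fh:
        return json.load(fh)
```

Usage would look something like save_baseline(snapshot("/etc"), "etc-baseline.json") right after hardening, then compare(load_baseline("etc-baseline.json"), snapshot("/etc")) later on. A baseline that lives on the same disk as the files it describes can be rewritten by whoever tampered with them, so keep a copy on read-only media.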
All locked down? Good. Now install all extant patches for your OS (unless you downloaded and compiled the latest and greatest of everything). This will probably mean plugging it into your network at this point, but if you're hardcore enough to want to keep your new system airgapped, you'll have to develop processes by which you can sneakernet data onto and out of your isolated system using removable media in such a way that you run less of a risk of getting gigged. This includes diffs, updated packages, and source code. Now start patching. The OS you selected undoubtedly has a toolkit for managing packages, which includes system updates. The construction process for this trusted and open computing platform involved recompiling the installation binaries from scratch with a more-trusted compiler, so we're talking tarballs full of source code. Unpack them or install diffs as appropriate, then start compiling. Alternatively, it should be possible to cross-compile the updates on our bootstrap workstation using the more-trusted crossdev toolchain that we built earlier. It's likely an incredible pain to do this but we went so far as to build our own computers from the ground up, so if integrity is to be maintained that's the route we're going to have to take. This is why a BSD style system updating process is a better fit for what we're trying to accomplish: they've got it down to a science and it takes only a few commands to get everything updated. If you opted to use a BSD for our hypothetical trusted and open computing platform, then there isn't much you have to worry about here so long as the compilation toolchain we built that we're pretty sure hasn't been backdoored is the one used.
If you didn't rebuild your own distro-native packages (say, you used LFS to compile everything from scratch) then read what they have to say about package management and decide what steps you're going to take. The gotcha, and it's a big one, is that we compiled everything with a toolchain that we're pretty sure hasn't been backdoored to boobytrap anything we compile with it, so if we want to maintain that sense of trust then we have to keep using it for everything that gets installed, from security updates to new applications. This is where I think I have to advise against airgapping the systems built - to keep the system updated (as well as install new applications) we have to get the necessary code onto the system, and downloading it (via SSL/TLS and checking the cryptographic signatures wherever possible) is the most workable way. If installing updates isn't reasonably easy most people (even folks like us) won't bother, which means vulnerabilities tend to pile up and increase risk of compromise over time. That seems like a pretty big waste of effort to me, going to the lengths discussed in this series and then not maintaining it. The Linux From Scratch FAQ has some suggestions, one of which boils down to "If you did it right, it's safe to compile and install the new version over the old version," which is true but you'll want to keep an eye on things to make sure that cruft doesn't build up.
After pondering it a little bit, this is what I came up with: When you compile the stuff on your root partition - the most basic of systemware - that is one package. root.tar.bz2, if you like. All of that stuff goes under / as one would expect. Everything else, from a native development environment to scripting languages to your SSH client, is compiled and turned into installable packages using one of the methods described, be it Arch Linux style .tar.xz files, one of the package formats supported by Checkinstall, installing every package into a separate subdirectory and then symlinking as appropriate (which is a huge pain in the ass and easy to get wrong, I've done it), or some other way that I'm forgetting. At this point I have to leave it up to you, because everyone has a different style of package management and different needs. Speaking only for myself, if I didn't already have a package management system in place I'd use Checkinstall to make Slackware-style .tgz files (which I've done in the past) or if I had enough processing power the newer .txz files. I've used it to build Debian style .deb packages, too.
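Since every update is going to arrive as a source tarball, it's worth having a tiny checker on hand. The sketch below assumes the project publishes a checksum list in the common "digest, whitespace, filename" layout that tools like sha256sum produce; the function names are mine, not part of any existing tool. It also only proves that the tarball matches the list. The list itself still has to be authenticated some other way, for example by verifying the project's signature on it with something like GnuPG.

```python
import hashlib
import hmac


def file_sha256(path: str) -> str:
    """Hash a file in chunks so large tarballs don't have to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def parse_checksum_list(text: str) -> dict:
    """Parse lines of the form '<hex digest>  <filename>' into {filename: digest}."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            digest, name = parts
            # A leading '*' marks binary mode in sha256sum-style output.
            expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def verify_download(path: str, name: str, expected: dict) -> bool:
    """True only if name appears in the list and the file's digest matches it."""
    want = expected.get(name)
    if want is None:
        return False  # unlisted files are treated as failures, not passes
    return hmac.compare_digest(file_sha256(path), want)
```

The design choice worth noting is that anything not on the list fails. It's tempting to treat "no checksum available" as "probably fine," and that is exactly the habit this whole series is trying to break.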
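For the curious, the "separate subdirectory plus symlinks" approach looks roughly like this in Python. It's a bare-bones sketch of the idea, with function names and example paths I made up; it is not a replacement for a real tool, and it makes no attempt to roll back if it fails halfway through, which is one of the ways this method gets people into trouble.

```python
import os


def link_package(pkg_dir: str, target_root: str) -> list:
    """Symlink every file under pkg_dir into the same relative spot under target_root.

    Refuses to overwrite anything that already exists. Returns the links created.
    """
    created = []
    for dirpath, _dirnames, filenames in os.walk(pkg_dir):
        rel_dir = os.path.relpath(dirpath, pkg_dir)
        dest_dir = os.path.normpath(os.path.join(target_root, rel_dir))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dest = os.path.join(dest_dir, name)
            if os.path.lexists(dest):
                raise FileExistsError(dest)  # another package owns this path
            os.symlink(os.path.abspath(src), dest)
            created.append(dest)
    return created


def unlink_package(pkg_dir: str, target_root: str) -> list:
    """Remove only those symlinks under target_root that point into pkg_dir."""
    removed = []
    prefix = os.path.abspath(pkg_dir) + os.sep
    for dirpath, _dirnames, filenames in os.walk(target_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) and os.readlink(path).startswith(prefix):
                os.unlink(path)
                removed.append(path)
    return removed
```

You would install a package with its prefix set to a directory of its own (a hypothetical /usr/local/pkg/foo-1.0, say) and then call link_package("/usr/local/pkg/foo-1.0", "/usr/local"). Upgrading means unlinking the old version and linking the new one, and the old tree stays around in case you need to go back.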
Now our hypothetical trusted and open computing platform needs applications so you can get real work done. Text editors, scripting languages, officeware, and probably a desktop of some kind. To stick with our security practice of keeping systems as spare as possible, I recommend only installing applications and their dependencies as you need them. Earlier I suggested picking a package management system of some kind if one isn't already a core component of the OS that we recompiled and installed. If you get in the habit of building and using packages now you'll save yourself a lot of heartache later. Trust me on this.
So, we need a user interface of some kind. Just about every operating system comes with a text-mode shell of some kind, be it Windows' cmd.exe, tcsh on MacOSX and FreeBSD, or bash on Linuxes of all kinds. It's the bare minimum to accomplish anything, and chances are when our recompiled OS was installed a basic shell came with it, probably bash. However, seeing as how we're living in the second decade of the twenty-first century chances are you're going to want a GUI of some kind - a desktop, in other words. I am now going to pull out of my ear the assumption that the trusted open computer we've been discussing has a framebuffer capable of supporting reasonably high resolution graphics. This implies that Xorg has a good chance of running on it. Thus, it should be possible to install X, a basic graphical toolset, and a desktop environment onto the trusted computer. My advice to you is to pick something that is relatively lightweight, doesn't require 3D acceleration, and doesn't have much in the way of dependencies. GNOME and KDE are great - they certainly get the job done and they look pretty. On the other hand one has to consider just how much RAM is available on their system of choice and plan accordingly. These two desktops may be too much for the trusted and open computing platform we've been talking about.
In the past couple of years I've fallen in love with LXDE, which is not only easy to learn but lightweight and doesn't have much in the way of dependencies. It works quite nicely on the RasPi and it's more than sufficient for use on some LiveCDs. I haven't used Cinnamon before but it bears a strong resemblance to GNOME v2's user interface. I don't know what its dependencies look like so I can't speak to them. I've also used Razor-QT a fair amount (it's the default desktop in Byzantium Linux these days) and while I don't use it normally neither am I opposed to it. It's certainly lightweight enough to serve as the default desktop of a LiveCD. Whether or not one considers libQT lightweight is not something I plan to address. I don't have a whole lot of experience with XFCE, either, but it's designed to be fairly lightweight so I think it's worth investigating if nothing else.
When picking a desktop environment, choosing application software comes along with the deal. In the end, it's all software that you have to interact with. By application software I mean native development toolkits, scripting languages like Perl and Python, office software, web browsers... you know, pretty much everything you need on a day to day basis. So long as there is sufficient RAM and our trusted computer has sufficient processing power (and/or enough CPUs - just because I don't know of any open multiprocessor cores doesn't mean that there aren't any) this hardware stack should be able to operate day in and day out as a replacement workstation, capable of most anything that a store-bought machine, or a computer built out of untrusted commercially available modules, is. You will probably have to compile everything yourself, if not on the system you bootstrapped from with the trusted compiler then on the computing platform we've been talking about. My advice to you is to keep things as lean as possible so you won't be compiling until the heat death of the universe. You may wish to consider compiling very large packages (like LibreOffice) on larger systems with the trusted crossdev toolchain but smaller packages natively.
I'll say it again: If you're not using package management of some kind start before you compile anything other than the core. Seriously.
Some distributions offer the opportunity to compile software from scratch if you're not installing from pre-built binary packages. For example, Portage, which is the package management system of Gentoo Linux, allows for the creation of binary packages after compilation that can be copied to like systems and installed, so you only have to compile once. The BSDs have the ports collection, which makes installing third party software (and dependencies) incredibly easy even though you're compiling from source. Along the same lines is pkgsrc, which is very similar to the BSD ports collection but is intended to be portable across multiple OSes and platforms, including Linux. So, you could install pkgsrc to your newly installed (and very basic) trusted and open computer and compile stuff from it with the more-trusted compilation toolchain, and in addition to getting lots of very useful software you'll get package management included for free. I've worked with it a little bit in the past and there's a learning curve if you're used to installing packages from your distro's repository, but if you take the time to read the documentation you should be okay.
It takes time to compile any reasonably complex package, but the nice thing is that you'll only have to do it once: you can usually have OS or distro-native packages built at the same time, then copy them to other machines of the same platform running the same OS and install them there. It's a much quicker prospect, to be sure. Or if you're trying to compile something as large as X or as elaborate as Metasploit, it is possible to install pkgsrc on a reference platform and cross-compile to our trusted platform with the trusted crossdev toolchain, which means that you could throw your eight-core 64-bit everyday machine at the task and have a set of installation packages in much less time.
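To put rough numbers on the "compile once, install many" idea, here is a minimal Python sketch. Every figure in it is made up for illustration (five machines, a six hour build, five minutes to copy and install a binary package), and the function names are my own; plug in your own measurements.

```python
# Rough comparison of "compile on every machine" versus "compile once,
# then copy binary packages around". All numbers are hypothetical.

def compile_everywhere_minutes(machines: int, build_minutes: int) -> int:
    """Total time if every machine builds the package from source."""
    return machines * build_minutes


def build_once_minutes(machines: int, build_minutes: int, install_minutes: int) -> int:
    """Total time if one machine builds and every machine installs the binary."""
    return build_minutes + machines * install_minutes


if __name__ == "__main__":
    machines = 5          # assumed number of like systems
    build_minutes = 360   # assumed six hour build of a large package
    install_minutes = 5   # assumed copy-and-install time per machine

    print(compile_everywhere_minutes(machines, build_minutes))
    print(build_once_minutes(machines, build_minutes, install_minutes))
```

The bigger the package and the more machines you have, the wider the gap gets, which is the whole argument for building native packages while you compile.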
Now, something that I've been wondering since I started writing this series. How long would it take a relatively small group of hackers to actually make this happen?
I'll be honest with you, I have no idea.
I'm going to highball it and say that the first couple of iterations of the open trusted computing platform would probably take experienced hackers weeks or months of work to get off the ground. Designing the hardware (or finding decent designs), sanity checking them, and building the hardware would probably take weeks. I'm not very good with electronics; my experience is barely what one would expect of a ham radio operator, so that's undoubtedly influencing my estimates. Finding the right components at a good price just takes legwork, though it might be necessary to set up a group buy to get good prices on some of the ICs. Good designs for hardware tend to propagate rapidly through communities, and as they're built and improved upon, writeups tend to help less skilled hackers build their own. Certain aspects of the software compilation process could be scripted to speed things up, which would help with the software side of things. Moreover, this seems like the sort of thing that not a few hackerspaces would be likely to attempt, even if only as part of an ongoing computer architecture class.
Well, that's about all I've got. I've undoubtedly made some serious mistakes along the way and also probably outright said some dumb stuff. So it goes. This series of posts has been a learning experience for me almost from step zero. I don't think it's a fully fleshed out plan but I do think it makes most of a good skeleton for such a project. I would love to hear from people who know more about any aspect of this discussion; please leave any helpful insights or suggestions in the comments. I'll answer any questions as best I can (unless someone else gets to them first). At the very least I hope that this text inspires someone who wouldn't otherwise have done so to learn more about the lower level workings of computers and maybe try their hand at programming or electronics.
So, does anyone actually operate this way?
So, after everything's said and done, you're probably asking yourself "Why would somebody go through all this trouble to build a computer from the ground up? It's never going to be as fast as one that you can buy, so what's the point?"
Ultimately, it comes down to what you're trying to accomplish. If you want the fastest possible CPU, tens of gigabytes of RAM, and four monitors so you can go raiding more efficiently, chances are you have a threat model that doesn't approach the level of concern, paranoia, or security requirements that we assumed through the other articles in this series. If you're hoping to precisely replicate a customized laptop to run half a dozen virtual machines, an office suite, and a visual IDE you will probably be disappointed. I have no idea if it will be possible to run a virtualization stack like QEMU or even DOSBox on our free and open laptop. If you're looking to learn how computers operate from the ground up this is a good project; it might even make a good class project that covers a semester or two. If you're willing to trade off raw processor speed for visibility into the CPU's inner workings, a gig or two of RAM to be able to modify and upgrade every last aspect of the motherboard and interfaces, and no (or little) 3D acceleration for a platform that you're much more certain hasn't been compromised in subtle ways, this might be a project of interest (especially if your threat model fits the last 'if'...)
Let's face facts. When working with computers we are forced to make many trade-offs. Code in a language which incorporates many features that let you precisely specify certain functionality and you might be trading off readability. Program in a language which is relatively simple and orthogonal and you'll be trading off by having to write more code later to carry out certain tasks. Ease of use and increased security measures are a trade-off that we all make, one which frustrates many end users and causes people to pick legendarily bad passwords. Use a graphical user interface and you often trade away visibility of inner workings and configurability (though good arguments have been made for the former, even I have to admit). Unfortunately we can't have everything both ways, as much as we might tell ourselves that we can. So, let's consider a couple of real-life case studies of some platforms and use cases that seem inconveniently limited or inordinately difficult to use.
Richard M. Stallman, founder of the Free Software Foundation and the GNU Project, is famous for his hard-line stance against closed and non-free software in all shapes and forms. Love him, hate him, or respect him, he's a man who can clearly elucidate his beliefs, how he's come to hold them, and what he does every day to personally uphold and demonstrate them. I mention RMS not to criticise or canonize him but to discuss how he goes about his everyday work. He wrote an essay called How I Do My Computing (which he updates periodically, judging by the train of copyright dates at the bottom of the page) that describes in great detail the trade-offs he's willing to make to ensure that he uses as much free and libre software as possible. He tried for nearly a decade to find a laptop which not only ran a free operating system but had a free BIOS on the motherboard. For his use cases he says that he prefers text mode applications rather than a desktop environment for getting work done; not everyone is in a position to do this but it works for him. He is best known for refusing to use any closed source, non-free software (there is a difference) on purely ethical grounds. Yet, RMS is able to do everything he needs to do, from browsing the web to writing to checking his e-mail, without any of the bells, whistles, and flashy effects that seem to come with tech in the twenty-first century.
It is a common sentiment in the western world today that by admitting that you have made certain choices or taken certain trade-offs you are, in effect, trying to martyr yourself or trying to win people over by making needless sacrifices. This is not the case at all. If one were so inclined, one could make the case that it approaches the pinnacle of arrogance by stating in effect "I'm too lazy to even consider other possibilities, and I'm better than you because I can't be bothered to think." But that's a rant I'd rather not go on.
The Raspberry Pi, the latest darling of hackers and makers the world over, was created in response to a specific need: Computers have become so expensive and are perceived as being so fragile that kids are being forbidden to play around with them lest they damage something accidentally. It was created with the intention of making basic computer science more accessible in schools and at home. The 8-bits many of us grew up hacking around with - the Commodores, the Ataris, and the BBC Micros - had their operating systems and programming languages built into chip-ROM, so if something went screwy you could just pull the power, count to five, plug it back in and everything was fine again. Maybe you'd have to key your program in again if you didn't save it to tape or disk, or maybe you'd try something else that day. Today's computers rarely come with what we'd think of as a programming language installed; there are many out there but whether or not a youngster is allowed to install one on the family computer is a different matter entirely. Whether or not kids are more inclined to learn to program if they have a programming language available to them is something I can't speak to, but I'd be quite surprised if no one has studied this yet. The RasPi was designed to be inexpensive ($35us), easy to use (plug it into the TV, add a cheap keyboard and mouse), and powerful (it runs any number of free and open operating systems, with the thousands of packages available that we're accustomed to). The trade-offs are that its clock speed is slower than we're used to thinking of as useful (700 MHz), it has less RAM (256MB for the model A, 512MB for the model B), and it has practically no other peripherals on board, instead relying upon USB ports for expandability (though relatively inexpensive USB hubs help mitigate this).
The Raspberry Pi also includes a respectable suite of software for kids and students to use right out of the box. Right on the desktop when it starts is Scratch, a visual programming environment which lets kids teach themselves how to write software that does interesting stuff almost immediately, like moving images around on the screen. It lets you create games, interactive stories, and animations by dragging and dropping components and customizing them by typing. If the user is interested in rolling up their sleeves and digging into a more complex programming language, two versions of IDLE, a graphical environment for programming in Python, are prominently displayed on the desktop. A full version of Mathematica from Wolfram Research, ordinarily a very expensive package, has been included with the Pi for free since November of 2013. It boots into a simple desktop environment by default, with all of the important stuff available immediately (like configuring wireless). There is even an app store that makes it easy to install games, applications, and tutorials with a couple of mouse clicks. As for how children take to the RasPi, my only experience with that is how kids interacted with the ones HacDC made available at the Mini-Maker Faire last year.
As an example of what a sufficiently motivated person might do "just because," Dieter Mueller's work comes immediately to mind. Mueller designed and built something that he calls the MT15, which is a CPU that is composed almost entirely of discrete transistors; a small number of relatively simple integrated circuits are used to assist in instruction decoding, implement the CPU clock, and act as buffers on the data and address buses. As CPUs go it's not terribly fast - 0.5MHz - and it can address up to 128Kb of RAM only. Is it practical for the sort of thing we had in mind for this series of articles? Probably not. I don't have a complete picture of everything it's capable of, so I can't speculate as to what one could actually do with an MT15 if you built one. I think it's safe to say that it's not complex enough to run BSD or Linux on. Maybe you could port FreeDOS to it if you built a suitable cross-compilation toolchain but I think that's about it. As a demonstration of what you can do if you put your mind to it I think it stands as a fine example. If a single person can build a CPU pretty much from the most basic components available and run arbitrary code on it, it is certainly possible to build more complex systems.
As another example of the technical limitations people are willing to accept to accomplish their goals, I'd like to talk about the 8-bit demoscene briefly. As you may or may not be aware, the 8-bit microcomputers were very limited in what they could do. They ran at processor speeds of just a few MHz (a small fraction of clock speeds today) and had very limited RAM (64KB or 128KB, and sometimes much less). The maximum screen resolutions and color palettes of those systems are less than those of some digital wrist watches today. Humble systems indeed. And yet, hackers do things with them that the machines' creators never conceived of. Sometimes bugs in the architectures themselves were used as features, such as the volume register bug in the SID chip being exploited as a playback mode. To see some of what the Commodore 64 (my beloved 8-bit platform, in point of full disclosure) is capable of you may wish to watch some of the following demos on YouTube:
So... why should you care about a bunch of low res video hacks running on an obsolete machine? The answer is, many of the effects in those videos were never conceived of by the engineers who created the C-64, and if you asked them if their brainchild was capable of any of them they might have shaken their heads and said, "No, it can't and it won't." When you consider those demos in light of the fact that they're running on a machine with 64KB of RAM and a clock speed of 1MHz that was designed to display only up to 16 colors at a maximum screen resolution of 320x200 pixels, they're pretty impressive. A lot of clever hacking went into writing those demos to surpass the limitations of the hardware. Many of them were written in 6510 machine language. Sometimes the opcodes used were selected because they had useful side effects elsewhere in the system or because they ran in two clock cycles apiece (the fastest the 6510 can manage) instead of four. You can do a lot despite the limitations of your equipment if you're willing to try. Also, you can't always discount something based upon its specs if you're willing to be realistic about what you're trying to accomplish. You don't need a $2kus machine if all you do is mess around on Facebook all day, and you may not need a 3GHz machine with 16GB of RAM to edit documents and write code (though some document formats are incredibly memory inefficient... PowerPoint, I'm looking at you).
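To get a feel for why demo coders counted clock cycles so carefully, here is a back-of-the-envelope sketch in Python using the round figures quoted above. The 50 frames per second refresh rate, the one bit per pixel bitmap, and the 2 and 4 cycle instruction costs are my own illustrative assumptions, and a real C-64 loses some of these cycles to the video chip, so treat the results as an upper bound rather than a measurement.

```python
# Rough frame budget for a C-64 demo, using the round numbers from the
# text: a 1MHz clock, 64KB of RAM, and a 320x200 display.
# ASSUMPTIONS (not from the text): 50 frames per second, 1 bit per pixel,
# and instruction costs of 2 or 4 cycles chosen purely for illustration.

CLOCK_HZ = 1_000_000
FRAMES_PER_SECOND = 50
RAM_BYTES = 64 * 1024
WIDTH, HEIGHT = 320, 200


def cycles_per_frame(clock_hz: int = CLOCK_HZ, fps: int = FRAMES_PER_SECOND) -> int:
    """CPU cycles available between one screen refresh and the next."""
    return clock_hz // fps


def instructions_per_frame(cycles_each: int) -> int:
    """How many instructions of a given cost fit into a single frame."""
    return cycles_per_frame() // cycles_each


def bitmap_bytes(width: int = WIDTH, height: int = HEIGHT, bits_per_pixel: int = 1) -> int:
    """Memory needed to hold one full screen bitmap."""
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    print("cycles per frame:", cycles_per_frame())
    for cost in (2, 4):
        print(f"{cost}-cycle instructions per frame:", instructions_per_frame(cost))
    share = bitmap_bytes() / RAM_BYTES
    print(f"one bitmap: {bitmap_bytes()} bytes ({share:.1%} of RAM)")
```

Under these assumptions, picking the cheaper opcode doubles the amount of work you can squeeze into a frame, and a single full screen bitmap already eats a noticeable slice of memory. That is the kind of arithmetic that drives the tricks in those demos.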
Last, and certainly not least, I present an example of a free and libre computing platform called the Novena. The motherboard is sized so that it'll fit in an average laptop chassis and it packs a fair amount of horsepower on board. It is based around a 32-bit quad-core Cortex-A9 CPU running at 1.2 GHz with a GC2000 3D accelerator. It tops out at 4GB of RAM but for the kind of use cases we've been talking about in this series of articles I think that's more than useful. It can boot from a microSD card but it also has a couple of SATA jacks on board so hard drives (or SD cards plugged into adapter sleds) can be used as mass storage. The Novena motherboard has a full complement of slots, ports, and peripheral interfaces that we've come to take for granted today. Most interestingly, the Novena is completely NDA-free. You won't have to sign away any of your freedoms or rights to hack on it; there are no restrictions on what you can do with or say about it, the development docs are anyone's for the downloading, and it's as hackable as such a thing can be made. All of the files necessary to start fabbing it are available for download.
Why did I mention the Novena? To show that it is, in fact, entirely possible to build exactly what we've been talking about in this series of articles: A full featured, fully operational, as close to open source as is possible general purpose laptop computer that we can use as an everyday workstation. It may not be as cheap as a Beaglebone or a RasPi, but it's something that you can throw into your backpack, use at work, use at home, have fun with, work on serious business with, and take with you on the road as your everyday machine. If Bunnie Huang and Sean Cross can do it, we can do it too.
Copyright and License
This work by Bryce Alexander Lynch is published under a Creative Commons By Attribution / Noncommercial / Share Alike v3.0 License.