Notes toward the Network 25 unhosted social network application.
Quite a few years (and a couple of re-orgs) ago on the Zero State mailing list we were kicking around the idea of building an unhosted social network to keep in touch, which is to say, a socnet implemented as a single file, with all of the JavaScript and CSS embedded at the end. Some of the ideas included using a distributed hash table so each instance could find the others, as many crazy but feasible ways as possible to bootstrap a new member of the network into the DHT, and using the browser's built-in local storage database to hold all of the information. A lot of this stuff already exists, from the local storage functionality (which has been there, albeit silently, in every modern browser for years) to DHTs in JavaScript, so I think that a fair amount of it would consist of tinker-toying it together. However, I must confess, the front-end stuff is well beyond me. Not from lack of trying, mind you: The HTML5 and JavaScript classes I've taken over the years were largely toward the goal of making this happen. However... I suck. Web apps are not my thing, unfortunately.
Additionally, this was before I'd ever done any serious information architecture and communications stuff, so you will undoubtedly cringe upon reading some of my assumptions and JSON sketches. It was also before I discovered PouchDB (which is basically CouchDB in the browser), so a few of my ideas really wouldn't wash today. So, please consider these notes somewhat naive toward the goal of building the application. Please don't facepalm too hard, you'll give yourself a concussion. Maybe somebody will find them useful in their own work.
Everybody saves a copy of the HTML page (with everything it needs to operate included in the file) and accesses it with a web browser. Updates can come from downloading a new version from the Zero State website or GitHub. Kind of like TiddlyWiki.
CouchDB is a distributed NoSQL database which stores JSON documents rather than information in rows and columns in tables.
Documents are managed with MVCC (Multi-Version Concurrency Control), and the network is designed for eventual consistency.
Automatic conflict resolution.
CouchDB nodes can connect to one another to replicate data very easily.
Designed to operate offline - do what you want with your local copy of the data, and when you reconnect it'll automatically resynch with the rest of the CouchDB network you set up.
Designed for ad-hoc connections - whatever connection it has, when it has it, it can use it to connect to the network.
Lightweight - the proof of concept Twitter-like app (Toast) supported thousands of transactions per second while running on an older MacBook Air.
Accessed in the fashion of a hash table, i.e. with key/value references.
Queries are done with JavaScript.
The API is bog-standard HTTP(S).
Can be interacted with using wget, curl, or a web browser.
Runs on everything from Windows to Android.
Because it includes its own HTTP(S) server, it is capable of hosting apps within the database itself, so you don't need an external framework (like PHP, Rails, or Django).
Apps are written in HTML5 and JavaScript, stored as documents in the database.
Apps are trivial to deploy.
The book is online, free to download and read, and can also be found on GitHub: http://guide.couchdb.org/
We can conceivably use CouchDB to implement Network25 (and whatever other information storage solutions we will eventually want to use).
Installation can be a single installer for every platform - download this, double click on it, when it's ready it'll tell you.
Every document can have its own schema, which is simple to determine because it works just like a hash table (key/value).
Public profile document (gets replicated):
{
_id: "Bryce A. Lynch",
_interests: ["long walks on the beach", "moonlit nights", "distributed systems"],
_friends: ["friend", "friend", ...],
_publickey: "<public key>",
...
}
Private profile document (doesn't get replicated, stored on machine, optionally encrypted and must be unlocked with a passphrase when the Network25 app is started):
{
_id: "Bryce A. Lynch",
_publickey: "<public key>",
_privatekey: "<you no can haz>",
...
}
Dump the list of keys and sort through their associated values one at a time.
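A minimal sketch of that key-dumping idea in Python, assuming the document has already been fetched and parsed from JSON. The helper name describe_document and the sample profile are made up for illustration.

```python
import json

def describe_document(doc):
    """Return (key, type name) pairs for every top-level field, sorted by key."""
    return [(key, type(doc[key]).__name__) for key in sorted(doc)]

# Hypothetical public profile, as it might look after being fetched as JSON.
raw = '{"_id": "Bryce A. Lynch", "interests": ["distributed systems"], "publickey": "<public key>"}'
profile = json.loads(raw)

# Walk the keys one at a time and show what kind of value each one holds.
for key, kind in describe_document(profile):
    print(f"{key}: {kind} = {profile[key]!r}")
```

Because every document can carry its own schema, this kind of walk is about all a client can assume before it knows what type of document it is looking at.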
Because there is no central server - every client is also a server - this also means that we could, in theory, protect arbitrary volumes of data in a given Network25 user's account with public key crypto.
For every user of Network25, there is a document containing the public keys of that user's friends.
Documents can be encrypted to the public keys of only a subset of that user's friends, such that only their private keys can (automatically) decrypt them:
{
_id: "<SHA256 hash here>",
_post: "<cyphertext here>",
_encrypted: "yes",
_authorized_accounts: ["<somebody>", "<somebody>", "<somebody>", "<somebody>", ...],
}
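Here is a rough Python sketch of how such a document could be assembled. It makes several assumptions that are not in the notes above: the ID is the SHA-256 hash of the cyphertext, the cyphertext is hex-encoded so it survives JSON, and the encryption itself happens somewhere else (the bytes are treated as opaque). CouchDB reserves top-level field names that begin with an underscore for its own use (_id, _rev, and so on), so the sketch drops the underscore from the other fields. The function names are hypothetical.

```python
import hashlib
import json

def build_encrypted_post(cyphertext, authorized_accounts):
    """Wrap already-encrypted bytes in a document keyed by their SHA-256 hash."""
    return {
        "_id": hashlib.sha256(cyphertext).hexdigest(),
        "post": cyphertext.hex(),   # hex-encode so the bytes fit in JSON
        "encrypted": True,
        "authorized_accounts": sorted(set(authorized_accounts)),
    }

def may_attempt_decrypt(doc, account):
    """Cheap pre-check: is this account on the list the post was encrypted to?"""
    return account in doc["authorized_accounts"]

# Placeholder bytes standing in for real cyphertext.
doc = build_encrypted_post(b"opaque cyphertext bytes", ["amon", "bryce", "amon"])
print(json.dumps(doc, indent=2))
```

The account list is only a hint for clients. What actually keeps other people out is that they don't hold a private key the post was encrypted to.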
Problem: addressing
Practically everybody is behind at least one NATting firewall these days.
IP addresses are dynamic, so you can't count on a buddy being reachable at the same IP for very long.
Having the Network25 app post its current IP address somewhere (a field on a blog, to a mailing list, Tweet) or get a dynamic DNS hostname every time isn't really workable. In fact, it outs the user in obvious ways, and not everyone is okay with that.
Another problem: Port forwarding. There are some solutions to this, but not all of them work well, work at all, or are suitable. Multiple layers of NAT make this solution suck.
Solution used by TorChat, which would probably work for us: Tor hidden service addresses
TorChat creates a unique hidden service address for you when you set it up. When you add people to your buddy list, it stores <public key>.onion in the list, and it's up to you to set an alias ("qwertyuiopasdfgh.onion" == "Bryce A. Lynch") on it.
Network25 should be able to do the same thing.
The Zero State is talking about using Tor for general communications in the future anyway, so this would be a perfect time to start.
When the Network25 socnet software starts up, it looks to see if Tor is running, if it has any hidden services configured, and if any of those services correspond to a unique port that Network25 uses.
Shell/batch scripts FTW.
If not found, it tells the Tor daemon to create a hidden service descriptor, copies the public key/.onion hostname into the user's Network25 profile, and announces it to that person's friends so they know where to find them and can start synching databases.
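The "is a hidden service already configured for our port?" check could be sketched like this in Python. It assumes the Tor configuration is available as text and that Network25 uses port 5984 (CouchDB's default). The helper name and sample paths are hypothetical, and the parser is deliberately naive: it does not handle quoted paths containing spaces.

```python
def find_hidden_service(torrc_text, port):
    """Return the HiddenServiceDir whose HiddenServicePort exposes `port`, or None."""
    current_dir = None
    for line in torrc_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        option = fields[0].lower()
        if option == "hiddenservicedir" and len(fields) >= 2:
            # HiddenServicePort lines apply to the most recent HiddenServiceDir.
            current_dir = fields[1]
        elif option == "hiddenserviceport" and len(fields) >= 2:
            if current_dir is not None and fields[1] == str(port):
                return current_dir
    return None

# Hypothetical torrc contents with two hidden services.
SAMPLE_TORRC = """
SocksPort 9050
HiddenServiceDir /var/lib/tor/other_service/
HiddenServicePort 80 127.0.0.1:8080
HiddenServiceDir /var/lib/tor/network25/   # our service
HiddenServicePort 5984 127.0.0.1:5984
"""

print(find_hidden_service(SAMPLE_TORRC, 5984))
```

Tor writes the service's .onion name to a file called hostname inside the hidden service directory, so once the directory is known the app can read the address from there and copy it into the profile.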
The name of the hidden service is then added to a field in your profile document, so when people friend you on the network they know how to reach you:
{
_id: "Bryce A. Lynch",
_interests: ["long walks on the beach", "moonlit nights", "massively distributed systems",
"tor", "writing stuff about CouchApps in Tomboy"],
_friends: ["friend", "friend", ...],
_publickey: "<public key>",
_toraddress: "qwertyuiopasdfgh.onion",
...
}
This means that CouchDB (configured to use Tor rather than IP address/ports combos) knows how to reach your copy of the socnet software and sync its copies of users' databases (profile, timeline, forums/communities/mailing lists/distribution lists/news feeds).
This also helps authenticate users, in the same way that hidden services are authenticated (there is a corresponding private key which Tor never shares). If the public key (.onion) and private key (on your box) don't match, then the service isn't trusted.
Because database creation in CouchDB is cheap, there is no reason why there can't be multiple databases in every user's profile:
- user profile
- shared public forum (analogous to the Doctrine Zero mailing list)
- specific forums (public or not) (analogous to zs-p2p, zs-arg mailing lists)
- personal blog
- blogs specific to the projects the user is working on (which themselves can have multiple people posting to them, because they're distributed)
- private blogs/chat forums for specific people
- blog/news feed/private messages from everyone the user has friended in Network25
- database: amon_zero_public_feed
- database: amon_zero_private_messages
- database: amon_zero_philosophical_pontification
- database: bryce_a_lynch_public_feed
- database: bryce_a_lynch_project_byzantium
- database: bryce_a_lynch_3d_printing
- database: bryce_a_lynch_private_messages
- database: zs_med_discussion
- database: zs_game_plot
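A small Python sketch of how names like these could be derived from a user's name and the purpose of the database. CouchDB only accepts database names that start with a lowercase letter and contain lowercase letters, digits, and the characters _ $ ( ) + - /, so free-form names have to be squashed down first. The function name and the squashing rule are my own assumptions.

```python
import re

# CouchDB's naming rule: a lowercase letter first, then lowercase letters,
# digits, or any of _ $ ( ) + - /
VALID_DB_NAME = re.compile(r"^[a-z][a-z0-9_$()+/-]*$")

def database_name(owner, purpose):
    """Build a name like 'bryce_a_lynch_public_feed' from free-form text."""
    raw = f"{owner} {purpose}".lower()
    # Collapse every run of other characters into a single underscore.
    name = re.sub(r"[^a-z0-9]+", "_", raw).strip("_")
    if not VALID_DB_NAME.match(name):
        raise ValueError(f"cannot derive a valid database name from {raw!r}")
    return name

print(database_name("Bryce A. Lynch", "public feed"))
print(database_name("Amon Zero", "private messages"))
```

Two different display names can squash down to the same database name, so a real implementation would probably want something unique (like the .onion address) mixed in.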
Restricted databases are only replicated by members that are part of that project or group.
A list of authorized users and their corresponding public keys is part of the database for every forum.
A majority of people in a private forum have to vote to include a new person?
All messages are encrypted to the public keys of everyone authorized to participate in that forum/replicate that database.
Private databases are only replicated by people they're shared with, i.e., a personal chat feed for one other person is only in two places in the Network25 socnet, your machine and theirs.
Consider making private databases purgeable, i.e., either or both people can have their copy of the socnet software dump the database so that there is no record of the discussion on either side.
This is where PKE or OTR would come into play - even if the database were recovered somehow, it should be difficult for the attacker to figure out what the cyphertext is.
I don't know how easy or how safe implementing crypto at the level of a CouchApp is.
All of us are going to have running copies of the Tor Browser Bundle, and all of us are going to have copies of the CouchDB stack and Network25 app, so it would be possible to use a crypto.cat-like plugin for the TBB which implements the encryption/decryption/acquisition of a buddy's public key/addition of key to the user's profile database.
How much disk space will this take up? I don't know yet.
Encryption/decryption of data before it enters/leaves the CouchApp? Good question. I don't have enough experience yet with CouchApps to say, but would love to talk to someone who does.
CouchDB listens on 127.0.0.1:5984 by default. Configure Tor to expose port 5984/tcp on <foo>.onion. Voila.
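For illustration, here is a tiny Python helper that generates the two Tor configuration lines that would do this. The directory path is a made-up example, and the helper only builds the text. Appending it to the Tor configuration and reloading Tor is left out.

```python
def hidden_service_config(service_dir, port=5984, host="127.0.0.1"):
    """Return torrc lines that expose a local port as a hidden service."""
    return (
        f"HiddenServiceDir {service_dir}\n"
        f"HiddenServicePort {port} {host}:{port}\n"
    )

# Hypothetical directory for Network25's hidden service keys.
print(hidden_service_config("/var/lib/tor/network25/"), end="")
```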
Creating new databases (even remotely) is as simple as making an HTTP request to localhost:5984 consisting of
PUT http://localhost:5984/<database>
Then documents can start being added to it.
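As a sketch, the same two steps using nothing but Python's standard library. The requests are built separately from being sent so they can be inspected without a CouchDB instance running. The function names are hypothetical, and recent CouchDB releases will also want admin credentials for database creation, which are left out here.

```python
import json
import urllib.request

COUCH_URL = "http://localhost:5984"

def create_database_request(name, base_url=COUCH_URL):
    """PUT /<database> asks CouchDB to create a new, empty database."""
    return urllib.request.Request(f"{base_url}/{name}", method="PUT")

def add_document_request(name, doc, base_url=COUCH_URL):
    """POST /<database> stores a JSON document in that database."""
    return urllib.request.Request(
        f"{base_url}/{name}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request):
    """Send a prepared request and decode the JSON reply (needs a live server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)

create = create_database_request("bryce_a_lynch_public_feed")
first_post = add_document_request(
    "bryce_a_lynch_public_feed",
    {"author": "me", "content": "Contents of my blog post", "type": "post"},
)
print(create.get_method(), create.full_url)
print(first_post.get_method(), first_post.full_url)
# With CouchDB running locally: send(create), then send(first_post).
```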
The array "acl" has a couple of default values. An entry of "all" means that anyone can read and replicate it. "private" means that only the author is allowed to access it; an empty entry means the same thing.
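Those rules could be checked with something like the following Python sketch. It assumes each document also carries an "author" field, that the author can always read their own documents, and that any other entry in "acl" is a username. The function name is hypothetical.

```python
def can_read(doc, username):
    """Apply the acl rules: "all", "private"/empty, or an explicit user list."""
    # Ignore empty strings so that [""] and [] are treated the same way.
    acl = [entry for entry in doc.get("acl", []) if entry]
    if "all" in acl:
        return True
    if not acl or "private" in acl:
        return username == doc.get("author")
    # Otherwise the list names the users who may read (and replicate) it.
    return username in acl or username == doc.get("author")

public_post = {"author": "me", "acl": ["all"]}
private_post = {"author": "me", "acl": ["private"]}
shared_post = {"author": "me", "acl": ["amon"]}

print(can_read(public_post, "anybody"))
print(can_read(private_post, "amon"))
print(can_read(shared_post, "amon"))
```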
From http://www.cmlenz.net/archives/2007/10/couchdb-joins:
Every post and comment is stored as a separate document with the same schema. The key "acl" is an array of usernames that are allowed to see the post. By extension, this is also the list of nodes that are allowed to replicate this document. The field "post" is an integer that increments with every blog post one makes. This field is used to tie together a post and all comments that are tied to it. The key "type" is the type of document, either "post" (a blog post) or "comment" (a comment on a blog post).
blog schema:
{
_id: "autogenerated",
_rev: "autogenerated, too",
acl: ["all", ],
author: "me",
content: "Contents of my blog post",
post: 0,
title: "Frist psot!",
type: "post",
}
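To show how the "post" and "type" fields tie things together, here is a minimal Python sketch that groups a pile of documents into threads. It works on plain dicts that have already been fetched. In a real CouchApp this grouping would more likely be done by a view inside the database. The function name is made up.

```python
from collections import defaultdict

def thread_documents(docs):
    """Group posts and comments by their shared 'post' number."""
    threads = defaultdict(lambda: {"post": None, "comments": []})
    for doc in docs:
        thread = threads[doc["post"]]
        if doc["type"] == "post":
            thread["post"] = doc
        elif doc["type"] == "comment":
            thread["comments"].append(doc)
    return dict(threads)

docs = [
    {"type": "post", "post": 0, "author": "me", "title": "Frist psot!"},
    {"type": "comment", "post": 0, "author": "amon", "content": "Welcome!"},
    {"type": "post", "post": 1, "author": "me", "title": "Second post"},
    {"type": "comment", "post": 0, "author": "me", "content": "Thanks."},
]

for number, thread in sorted(thread_documents(docs).items()):
    print(number, thread["post"]["title"], len(thread["comments"]), "comment(s)")
```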
Every bookmark one stores is kept as a separate document in a database. All have the same schema. The "acl" key works as defined earlier. Eventually, I'd like to make a JavaScript bookmarklet that makes it much easier to store a bookmark in this database. I'm not sure if bookmarks are going to be private only (i.e., personal) or what.
bookmarks schema:
{
title: "",
url: "",
description: "",
tags: ["", ],
categories: ["", ],
acl: ["all", ],
date_added: ["YYYY/MM/DD", "HH:MM", "TZ"],
date_modified: ["YYYY/MM/DD", "HH:MM", "TZ"],
}
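A bookmarklet (or anything else) would have to fill that schema in somehow. Here is a hedged Python sketch of a builder, assuming the three-part date format shown above and UTC as the default time zone. The function names and defaults are my own.

```python
from datetime import datetime, timezone

def timestamp(moment):
    """Format an aware datetime as ["YYYY/MM/DD", "HH:MM", "TZ"]."""
    return [moment.strftime("%Y/%m/%d"), moment.strftime("%H:%M"), moment.tzname()]

def new_bookmark(title, url, description="", tags=(), categories=(),
                 acl=("all",), now=None):
    """Build one bookmark document; date_added and date_modified start equal."""
    stamp = timestamp(now or datetime.now(timezone.utc))
    return {
        "title": title,
        "url": url,
        "description": description,
        "tags": list(tags),
        "categories": list(categories),
        "acl": list(acl),
        "date_added": stamp,
        "date_modified": list(stamp),   # separate copy, updated on later edits
    }

bookmark = new_bookmark(
    "CouchDB: The Definitive Guide",
    "http://guide.couchdb.org/",
    tags=["couchdb", "books"],
    now=datetime(2012, 12, 4, 0, 0, tzinfo=timezone.utc),
)
print(bookmark["date_added"])
```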
https://wiki.apache.org/couchdb/PerDocumentAuthorization
One possible Network25 document schema:
date_generated: "2012/12/04 00:00:00 ZST",
user_profile: {
chosen_name: "bryce lynch",
email_addresses: ["me@example.com", "my.official.addr@zero.state"],
websites: ["https://drwho.virtadpt.net/", "https://about.me/drwho"],
instant_messenger: [{
network: "Gchat",
protocol: "XMPP",
handle: "another-me@example.com"
},
{
network: "Network25",
protocol: "Toast",
handle: "Bryce A. Lynch"
},
{
network: "zero.state",
protocol: "XMPP",
handle: "bryce.lynch@zero.state"
}
],
public_keys: ["PGP public key here", "another PGP public key here"],
tor_addresses: ["something.onion", "something_else.onion"],
aliases: ["Bryce", "The Doctor"],
gender: "Androgynous",
identification: "Organic sapient with semiautonomous software augmentations",
location: "I am everywhere",
interests: ["information security", "memetics", "programming", "python",
"mesh networks", "tabletop RPGs", "Eclipse Phase"],
skillset: ["system administration", "network design", "system architecture",
"penetration testing", "security research",
],
projects: [{
name: "Network 25",
position: "hacker"
},
{
name: "Zero State Hollywood Movie",
position: "Consultant"
},
{
name: "ZS-Media",
position: "Manager"
},
{
name: "ZS-p2p",
position: "Consultant"
}
],
affiliations: ["HacDC", "Project Byzantium", "The Zero State", "Telecomix",
"The Bavarian Illuminati"],
friends: [{
name: "Amon Zero",
local_name: "Amon",
tor_addresses: ["something.onion", ],
cached_profile_info: {
local copy of user profile
},
cached_profile_last_updated: "datestamp here",
}, ],
}
PouchDB is an extremely tiny implementation of much of the CouchDB API written in JavaScript and is designed to be embedded in web apps. When you access CouchDB apps that use PouchDB, PouchDB transparently proxies requests and caches output from the Couch database. Here's the nifty bit: If you go offline (your cellular reception tanks or you have to disconnect from the local wireless net), you can continue to interact with the web app in question as long as it's in your browser's cache, because it has a snapshot of all of the relevant data inside of it. When you go back online, PouchDB synchs up with the CouchDB instance elsewhere...
Try a federated model - hosts replicate to hosts which replicate to yet other hosts.
A number of hosts can be lost without damaging the entire network overmuch. Every node that somebody posts from keeps a copy of its posts as well as any content that it's accessed so when new nodes appear they can get them up to speed in a fairly short time.
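A toy in-memory simulation of that idea in Python, just to make the claim concrete. It ignores revisions, conflicts, and networking entirely. Each node is a dict of documents, and replication copies over whatever the other side hasn't seen. The class and names are invented for the example.

```python
class Node:
    """A stand-in for one host: a name plus the documents it holds."""

    def __init__(self, name):
        self.name = name
        self.docs = {}   # document id -> document

    def post(self, doc_id, content):
        self.docs[doc_id] = {"author": self.name, "content": content}

    def replicate_from(self, other):
        """Pull every document this node hasn't seen yet; return how many."""
        missing = {k: v for k, v in other.docs.items() if k not in self.docs}
        self.docs.update(missing)
        return len(missing)

alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.post("alice-1", "hello, network")

bob.replicate_from(alice)            # bob picks up alice's post
# Suppose alice's host now drops off the network for good.
copied = carol.replicate_from(bob)   # carol, a new node, syncs from bob only

print(copied, sorted(carol.docs))
```

Carol never talks to alice directly, yet still ends up with alice's post, which is the whole point of every node keeping copies of what it has seen.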
What's the upper limit on browser local storage? We're going to max it out at some point.
Any network connection can be used - any port in a storm, you know?
Livejournal-like profile pages, with skillsets and interests and whatnot?
Personal blogs in addition to discussion forums and private messaging?
Status and reputation network? Quantified prestige?
Direct messaging (person to person) encrypted with public key encryption for privacy.
Elliptic curve cryptography for both identification and encryption, because it's much faster and easier to generate keys on devices with limited processing power (like mobile devices)?
All network traffic is opportunistically encrypted by default.
Everything is encrypted to the user's public key before it hits persistent storage.
PGP-like web of trust which maintains not only buddy lists in the app but also defines which nodes to contact first for network updates?
Is as persistent as possible when advertising its location to make hooking into the network easier? Every instance posts its contact info (network addressing information) to many different places to make it easy to find and dodge censorship. Posts to Twitter, Tumblr, Pastebin, pads, vanishing message sites, SMS to friends, blog posts, network hubs like pump.io, XMPP chat rooms and account statuses on public servers, the global BitTorrent DHT, personal e-mails, whatever else to get its current network address out there for as many others in the network to find? As difficult to cut off from the rest of the network as is possible. Friends in the buddy list have personal lists of Internet resources they prefer to use to find their friends in the network.
Flood the local network looking for other instances? Bonjour? Link-local XMPP chat?
Make use of the .well-known directories on websites somehow?
Write a plugin that uses XMPP servers to circulate content?
Synchronize with local servers to back themselves up? Say I have a server at home running a document database. When I get home and access my local wireless, the Network25 app contacts the database server and dumps its contents to back itself up. Conversely, new Network25 nodes can contact a database server (with permission) and bootstrap themselves by downloading all of the latest stuff to get themselves up to speed.
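If the backing store is CouchDB on both ends, both directions could be expressed as requests to CouchDB's _replicate endpoint. This is a sketch in Python: the home server's hostname is made up, the database names are borrowed from the examples earlier in these notes, authentication is left out, and the requests are only built, not sent.

```python
import json
import urllib.request

LOCAL = "http://localhost:5984"
HOME_SERVER = "http://homeserver.example:5984"   # hypothetical backup host

def replication_request(source, target, couch_url=LOCAL):
    """Build a POST to /_replicate asking CouchDB to copy source into target."""
    body = {"source": source, "target": target, "create_target": True}
    return urllib.request.Request(
        f"{couch_url}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Backup: push a local database to the server at home.
backup = replication_request(
    f"{LOCAL}/bryce_a_lynch_public_feed",
    f"{HOME_SERVER}/bryce_a_lynch_public_feed",
)

# Bootstrap: a fresh node pulls a shared database down from that server.
bootstrap = replication_request(
    f"{HOME_SERVER}/zs_med_discussion",
    f"{LOCAL}/zs_med_discussion",
)

print(backup.full_url, json.loads(backup.data.decode("utf-8")))
```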