Nexus P2P trust network
This is a preliminary draft description of the Nexus P2P trust network.
After a long search for an easily deployable existing solution turned up nothing, I began developing one myself. So far, I am the only developer working on it, and I'm about halfway through.
- 1 Design goals
- 2 Status
- 3 Questions
- 3.1 Intro
- 3.2 Where do you want to store member profiles?
- 3.3 Will there be redundant copies on different machines, or will each profile live on one machine?
- 3.4 What about people switching off their home computers - will that disrupt anything?
- 3.5 How can I know that the network node I'm currently connected to delivers authentic information?
- 3.6 How does a search work? Where do you want to store the search index?
- 3.7 References
- 3.8 Conclusion
Design goals
Operates on a swarm of networked personal computers, like existing file-sharing networks. Minimal or no corporate server involvement. The heart of the system is a Kademlia-like network. There is an unavoidable trade-off of speed vs. privacy and immunity from control. I opted for the latter, but will try to optimize speed.
Free and open software. I'm not familiar with licenses, but am leaning toward GPL2. Could use some advice.
Should be as universal as possible. I considered a Firefox browser add-on, but went with Java because UDP networking is available and for its many other advantages.
Functionality limited to searching for and evaluating the reputation of individuals or communities. Usable from a GUI or XML-RPC interface.
Status
The heart of Nexus, a Kademlia-like distributed P2P network, is working and awaiting large-scale testing. Data can be stored in it and retrieved from it. One or more layers will be built on top of it to complete the functionality.
I would like to set this up for community development once a license is chosen and if anyone else wants to help.
Matrixpoint 12:42, 20 October 2009 (UTC)
Yeah, things are moving!
Some questions I have:
Where do you want to store member profiles?
A fundamental design specification is to keep Nexus as lightweight and as fast as possible. An identity record on Nexus would be not much more than a handle, a set of descriptive tags, a link to a profile on a social website, an expiration date, a public encryption key, and a digital signature.
If the signature verifies against the record (less the signature) under the public key, then it is confirmed that the record was created by the owner of the private key corresponding to that public key. Nexus performs this check automatically.
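The check described above can be sketched in Java with the standard java.security API. This is only a minimal illustration, not Nexus code: the class name, the field order in recordBody, and the choice of RSA with SHA256withRSA are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

// Illustrative only: class, method, and field names are assumptions, not Nexus code.
class IdentitySignatureSketch {
    private static final String ALGORITHM = "SHA256withRSA"; // assumed algorithm

    // Everything in the record except the signature, in a fixed field order.
    static byte[] recordBody(String handle, String tags, String profileUrl,
                             String expires, PublicKey key) {
        String encodedKey = Base64.getEncoder().encodeToString(key.getEncoded());
        return String.join("\n", handle, tags, profileUrl, expires, encodedKey)
                .getBytes(StandardCharsets.UTF_8);
    }

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Done once by the owner when the record is created or refreshed.
    static byte[] sign(byte[] body, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(privateKey);
            signer.update(body);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Done by any node that retrieves the record.
    static boolean verify(byte[] body, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(publicKey);
            verifier.update(body);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // a malformed signature counts as a failed check
        }
    }

    public static void main(String[] args) {
        KeyPair owner = newKeyPair();
        byte[] body = recordBody("alice", "host,berlin", "https://example.org/alice",
                "2010-10-20", owner.getPublic());
        byte[] signature = sign(body, owner.getPrivate());
        System.out.println("genuine record verifies: "
                + verify(body, signature, owner.getPublic()));
        body[0] ^= 1; // tamper with one byte of the record
        System.out.println("tampered record verifies: "
                + verify(body, signature, owner.getPublic()));
    }
}
```

Including the public key in the signed body is a design choice of this sketch, so that nobody can swap the key without invalidating the signature.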
The website profile would include a means of contact (such as email). If an email containing some random data (or just a unique message) is encrypted with the public key and sent to the owner of the profile, and that person responds with an email containing the random data, then the original sender now has a reliable means of contacting the true owner of the Nexus record and has the option of having a private (encrypted) communication with that owner. That owner can confirm that the profile used to contact him is legitimate.
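The email round trip described above is a challenge-response exchange. A minimal Java sketch of the cryptographic part follows; the email transport is left out, and the class name, the 32-byte challenge size, and the RSA-OAEP transformation are assumptions for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;

// Illustrative only: names and algorithm choices are assumptions, not Nexus code.
class ContactChallengeSketch {
    private static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sender: pick the random data that goes into the email.
    static byte[] newChallenge() {
        byte[] challenge = new byte[32];
        new SecureRandom().nextBytes(challenge);
        return challenge;
    }

    // Sender: encrypt the challenge with the public key from the Nexus record.
    static byte[] encryptFor(PublicKey recipient, byte[] challenge) {
        return run(Cipher.ENCRYPT_MODE, recipient, challenge);
    }

    // Owner: only the holder of the private key can recover the challenge.
    static byte[] decryptWith(PrivateKey own, byte[] ciphertext) {
        return run(Cipher.DECRYPT_MODE, own, ciphertext);
    }

    // Sender: compare the reply with what was originally sent.
    static boolean replyMatches(byte[] sent, byte[] returned) {
        return MessageDigest.isEqual(sent, returned);
    }

    private static byte[] run(int mode, java.security.Key key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(RSA_OAEP);
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair owner = newKeyPair();
        byte[] challenge = newChallenge();
        byte[] emailed = encryptFor(owner.getPublic(), challenge);
        byte[] reply = decryptWith(owner.getPrivate(), emailed);
        System.out.println("owner confirmed: " + replyMatches(challenge, reply));
    }
}
```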
Typically, the owner of a Nexus record would have many profiles on different social websites. That owner would pick one of them to represent his identity to Nexus. He could actually create several independent Nexus identities if he chooses.
The essence of a virtual identity on Nexus is the public key. It can be used to reliably connect to the real person behind the identity as long as that person keeps his private key secret. A virtual identity starts out with no reputation and must acquire one over time. This is done through a second kind of record, a reference record.
A Nexus reference record contains at least a rating code but perhaps also a text reference. It also contains the handles and public keys of the owner of the reference and the object of the reference, an expiration date, and a signature made from the owner's private key.
Each record stored in Nexus has a search key (probably invisible to the end-user). A search key can have multiple records associated with it. Identity records can be retrieved with keys like identity:handle, reference records with keys like reference-for:handle or reference-by:handle. Several different identities might use the same handle, so a handle is only a convenient approximation to an identity. The retrieved references are matched to the correct identities using the public keys.
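To make the key scheme concrete, here is a small Java sketch in which an in-memory map stands in for the DHT. The class and its Reference type are hypothetical; only the key format reference-for:handle comes from the description above. It shows how two identities sharing a handle are told apart by public key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: an in-memory stand-in for the DHT; all names are assumptions.
class SearchKeySketch {
    // A reference record reduced to the fields needed for matching.
    static final class Reference {
        final String authorKey;   // public key of the reference's owner
        final String subjectKey;  // public key of the identity being rated
        final int rating;

        Reference(String authorKey, String subjectKey, int rating) {
            this.authorKey = authorKey;
            this.subjectKey = subjectKey;
            this.rating = rating;
        }
    }

    // One search key can have several records associated with it.
    private final Map<String, List<Reference>> store = new HashMap<>();

    void put(String searchKey, Reference reference) {
        store.computeIfAbsent(searchKey, k -> new ArrayList<>()).add(reference);
    }

    List<Reference> get(String searchKey) {
        return store.getOrDefault(searchKey, Collections.emptyList());
    }

    // The handle narrows the search; the public key decides which identity is meant.
    List<Reference> referencesFor(String handle, String subjectKey) {
        List<Reference> matches = new ArrayList<>();
        for (Reference reference : get("reference-for:" + handle)) {
            if (reference.subjectKey.equals(subjectKey)) {
                matches.add(reference);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        SearchKeySketch dht = new SearchKeySketch();
        // Two different people both use the handle "alice".
        dht.put("reference-for:alice", new Reference("KEY-BOB", "KEY-ALICE-1", 5));
        dht.put("reference-for:alice", new Reference("KEY-CAROL", "KEY-ALICE-2", 1));
        System.out.println(dht.get("reference-for:alice").size() + " records under the key");
        System.out.println(dht.referencesFor("alice", "KEY-ALICE-1").size()
                + " of them belong to the first alice");
    }
}
```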
There are also tag records in Nexus, used for searching for identities using descriptive tags. The design of this could go several ways, but the basic idea is that these records only contain handles.
So, all of the data that is moved and stored on Nexus is relatively small in size. No web pages, no pictures or other media. This is important for speed especially since there is much redundancy.
Please note that I have not actually implemented the layer that manages structured records yet. What is currently implemented can only store (with redundancy) and retrieve unstructured data records using a search key. But the lower-level mechanisms for bootstrapping and maintaining a distributed network with many nodes frequently entering and leaving are now implemented.
Will there be redundant copies on different machines, or will each profile live on one machine?
There will be redundant copies of all Nexus records, probably about 20 live (online) copies. No profile information is stored. If the owner of a Nexus identity record wants to change the link to his profile, he must create a new identity record. All records on Nexus have an expiration date, and two competing records referring to the same identity are resolved in favor of the most recent update. It is up to the owner of an identity to periodically refresh or update his identity and the references he creates. He can specify the lifetime of his records, but there would be a system-wide maximum time limit.
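The expiry and "most recent update wins" rules could look roughly like this in Java. Everything here is an assumption made for illustration: the updated timestamp field, the class name, and in particular the 90-day value for the system-wide maximum, which has not been decided.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative only: field names and the 90-day limit are assumptions, not Nexus code.
class RecordFreshnessSketch {
    static final Duration MAX_LIFETIME = Duration.ofDays(90); // hypothetical system-wide cap

    static final class Versioned {
        final String profileUrl;
        final Instant updated;
        final Instant expires;

        Versioned(String profileUrl, Instant updated, Instant expires) {
            this.profileUrl = profileUrl;
            this.updated = updated;
            this.expires = expires;
        }
    }

    // The owner asks for a lifetime; the network never grants more than the cap.
    static Instant expiry(Instant updated, Duration requested) {
        Duration granted = requested.compareTo(MAX_LIFETIME) > 0 ? MAX_LIFETIME : requested;
        return updated.plus(granted);
    }

    // Among the copies retrieved for one public key, drop expired records
    // and keep the one with the latest update time.
    static Optional<Versioned> current(List<Versioned> copies, Instant now) {
        return copies.stream()
                .filter(record -> record.expires.isAfter(now))
                .max(Comparator.comparing((Versioned record) -> record.updated));
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2009-10-20T00:00:00Z");
        Versioned oldLink = new Versioned("https://example.org/old", start,
                expiry(start, Duration.ofDays(30)));
        Instant later = start.plus(Duration.ofDays(1));
        Versioned newLink = new Versioned("https://example.org/new", later,
                expiry(later, Duration.ofDays(30)));
        Instant now = start.plus(Duration.ofDays(10));
        current(Arrays.asList(oldLink, newLink), now)
                .ifPresent(record -> System.out.println("current link: " + record.profileUrl));
    }
}
```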
What about people switching off their home computers - will that disrupt anything?
The Nexus network frequently refreshes itself. If one of the twenty copies of a record goes offline, it would soon be automatically replaced. The current version does not store any records offline. It is very dynamic, almost like a living thing. In the event of a total simultaneous internet failure, or the unlikely event that all twenty copies would disappear before the next refresh cycle, the worst consequence is that the searchable data is gone until the owner goes online and refreshes it (automatically). For this application, I don't think that's very serious. It would be easy enough to provide for local storage of records on multiple PCs, but it would be interesting to try to avoid this.
Right now, the only things locally stored between sessions are:
- a list of IP addresses to enable quick reconnection to the network. This list is updated every session with known active nodes.
- the identity record and reference records created by the local node owner.
- the public/private key pair of the owner.
How can I know that the network node I'm currently connected to delivers authentic information?
The Nexus network is a swarm of interconnected computers. Anyone who can run a Java program on their computer can become a node in the network if they have even one IP address of another live node. They could get this in a number of ways:
- from the saved list of nodes from the last session
- someone gives them a live node address via email, chat, web page, etc.
- automatically from a range of dyndns.com domain names with a predetermined pattern like nexus-001.dyndns.com, nexus-002.dyndns.com, etc. that the Nexus software can automatically scan. Some members of the network would have to set up one of the domains for the benefit of the whole network. But this measure would only be needed for first contact or after a long absence.
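The three sources listed above could be combined into one ordered list of bootstrap candidates, as in the following Java sketch. The class and method names are hypothetical, and the sketch only builds the list; it does not resolve or contact any host.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: names are assumptions, not Nexus code. No network access is made.
class BootstrapSketch {
    // Host names following the predetermined pattern nexus-001.dyndns.com, nexus-002...
    static List<String> patternHosts(int count) {
        List<String> hosts = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            hosts.add(String.format(Locale.ROOT, "nexus-%03d.dyndns.com", i));
        }
        return hosts;
    }

    // Try nodes saved from the last session first, then addresses supplied by hand,
    // and fall back to the well-known domain names last. Duplicates are dropped.
    static List<String> candidates(List<String> saved, List<String> manual, int patternCount) {
        Set<String> ordered = new LinkedHashSet<>();
        ordered.addAll(saved);
        ordered.addAll(manual);
        ordered.addAll(patternHosts(patternCount));
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        List<String> saved = Arrays.asList("203.0.113.7", "198.51.100.23");
        List<String> manual = Arrays.asList("198.51.100.23");
        for (String candidate : candidates(saved, manual, 3)) {
            System.out.println(candidate);
        }
    }
}
```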
One of the unique features of Nexus is that the hash codes of its DHT (Distributed Hash Table) are computed from the IP addresses of the nodes. This is a protection against manipulation of the network, such as introducing a split in the node grid.
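A rough Java sketch of what deriving a node's DHT hash code from its IP address could look like follows. The use of SHA-1 and 160-bit IDs is an assumption borrowed from standard Kademlia, as is the XOR distance; the actual Nexus scheme may differ. The point of the sketch is that any peer can recompute the ID from the address it sees, so a node cannot freely choose its position in the network.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: SHA-1 and the class and method names are assumptions, not Nexus code.
class NodeIdSketch {
    // Node ID = hash of the raw IP address bytes, read as an unsigned integer.
    static BigInteger nodeId(byte[] ipAddress) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(ipAddress);
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Kademlia measures closeness between two IDs as their bitwise XOR.
    static BigInteger distance(BigInteger a, BigInteger b) {
        return a.xor(b);
    }

    // A peer recomputes the ID from the observed address and rejects a mismatch.
    static boolean idMatchesAddress(BigInteger claimedId, byte[] observedAddress) {
        return nodeId(observedAddress).equals(claimedId);
    }

    public static void main(String[] args) {
        byte[] nodeA = {(byte) 203, 0, 113, 7};
        byte[] nodeB = {(byte) 198, 51, 100, 23};
        BigInteger idA = nodeId(nodeA);
        BigInteger idB = nodeId(nodeB);
        System.out.println("id A: " + idA.toString(16));
        System.out.println("id B: " + idB.toString(16));
        System.out.println("distance: " + distance(idA, idB).toString(16));
        System.out.println("B claiming A's id accepted: " + idMatchesAddress(idA, nodeB));
    }
}
```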
But the data itself can always be verified by making use of the public encryption keys and digital signatures that are part of each identity or reference record.
The network could allow for access through an XML-RPC port on nodes with known domain names (such as from dyndns.com), but they would not be peers in the network, and could probably be tricked by an impostor. But in the end the data acquired would not stand up to validity checks.
I'm not sure it's a good idea to allow such access to the network. An individual node might be overwhelmed with requests (unless the requests were automatically handed off to other nodes). I would rather have all users of Nexus also be peer nodes. Since the Nexus platform is Java, the big hospex websites could participate in Nexus by embedding a Java applet in one of their web pages, which would make each of their users a node of Nexus (another reason to keep Nexus very lightweight). Alternatively, they could create a PHP version of Nexus to run on their server, or at least make Nexus available for download.
How does a search work? Where do you want to store the search index?
As explained above, there would be no centralized index residing on a server somewhere. The index would be distributed across the Nexus network in the form of tag records.
This is probably the most challenging and interesting part of the project. It is really still an open question in P2P network technology. I haven't worked out the details yet but I have several ideas.
There are some fairly good open-source implementations in use such as eMule that we could imitate. P2P range searches are the most challenging, and they are an active area of academic research already producing usable results. A few recent papers are available on the internet.
In keeping with the minimalist philosophy of this project, I suggest that searching should be highly structured according to a standardized format suitable for hospex, which would make implementation much easier. But innovative ideas are welcome.
Dedicated Search engines?
If I understand right, there is nothing that would stop a bigger player (such as Google) from crawling all the records and storing them on a central server, with a search index, and associating them with the profile information on the social networks, if this profile information is publicly available. Maybe that is not what you intended, but this would be a possible solution for search: have one or more services that offer search features.
I hadn't thought of it, but you are right.
Although the 3M's talked a lot about privacy and dark nets, this particular project is for people who want to be found (up to the level of detail provided in their profile), and don't mind being the subject of references. So there's no need to protect the network even from Google. Third-party server-based searching would enhance the network, providing speed and caching, and reducing the network load. But I would want to design the network so that it could function completely independently of any third-party server.
As for persons who are sensitive about privacy, they can hide within a community. The community can create a profile which gains a reputation on the trust network based on how well they internally vouch for their own members.
If I leave a negative reference to someone, how can I be sure that it will be displayed on this person's profile? And, how can you prevent anonymous people from leaving fake negative references?
A universal problem with P2P networks is that there is no guarantee that all records corresponding to a key will be found on any given attempt, especially right after connecting, when the DHT is only partially filled. (P2P file-sharing networks take a few minutes to get up to speed.) Redundancy does a lot to offset this weakness. The reliability is quite high, but not perfect. The positive side is that no censoring is possible.
I think the magic needs to happen at the moment that the profile is rendered: The system will have some rules to find those references that are relevant to the visitor, and then decorate the social network profile with this additional information.
Yes, there needs to be an algorithm that gives more weight to references by those you highly trust, somewhat less weight to references by friends of those you highly trust and so on. A reference by someone with whom there is no prior trust relationship would carry little or no weight. There are trust metrics already well-established that take into account the various paths of trust between two entities.
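As a toy illustration of that weighting idea, the following Java sketch gives full weight to references from identities you trust directly, a reduced weight for each further hop, and zero weight to strangers. It is not one of the established trust metrics mentioned above; it uses only the shortest trust path, and the class name, the decay factor, and the hop limit are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Illustrative only: a simple hop-decay weighting, not an established trust metric.
class TrustWeightSketch {
    // trusts.get(a) is the set of identities that a trusts directly.
    // Breadth-first search from the viewer: 1 hop gets weight 1.0, 2 hops decay, and so on.
    static Map<String, Double> weights(Map<String, Set<String>> trusts, String viewer,
                                       double decay, int maxHops) {
        Map<String, Double> weight = new HashMap<>();
        Map<String, Integer> hops = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        hops.put(viewer, 0);
        queue.add(viewer);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            int distance = hops.get(current);
            if (distance == maxHops) {
                continue; // do not follow trust paths beyond the hop limit
            }
            for (String next : trusts.getOrDefault(current, Collections.<String>emptySet())) {
                if (hops.containsKey(next)) {
                    continue; // already reached by a path at least as short
                }
                hops.put(next, distance + 1);
                weight.put(next, Math.pow(decay, distance));
                queue.add(next);
            }
        }
        return weight;
    }

    // Weighted average of ratings; authors without a trust path count for nothing.
    static double weightedRating(Map<String, Integer> ratingsByAuthor,
                                 Map<String, Double> weight) {
        double sum = 0.0;
        double total = 0.0;
        for (Map.Entry<String, Integer> entry : ratingsByAuthor.entrySet()) {
            double w = weight.getOrDefault(entry.getKey(), 0.0);
            sum += w * entry.getValue();
            total += w;
        }
        return total == 0.0 ? Double.NaN : sum / total;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> trusts = new HashMap<>();
        trusts.put("me", Collections.singleton("bob"));
        trusts.put("bob", Collections.singleton("carol"));
        Map<String, Double> weight = weights(trusts, "me", 0.5, 3);

        Map<String, Integer> ratings = new HashMap<>();
        ratings.put("bob", 4);      // trusted directly
        ratings.put("carol", 1);    // friend of a friend
        ratings.put("stranger", 5); // no trust path, so ignored
        System.out.println("weighted rating: " + weightedRating(ratings, weight));
    }
}
```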
I imagined that the result of a search would be a list of candidates (along with perhaps a thumbnail picture that could be part of the identity record or obtained from the profile), a link to the profile, a trust ranking, and maybe a display of trust paths to this person. Clicking on a link would render a profile in a browser and, as you say, some magic might be needed to embed the trust info into the rendering.
Typically, the results of a P2P search are displayed as they come in over some seconds or minutes. Visually on the display, that would correspond to a growing list of entities, with possibly changing values of the trust measures as the results come in. This is not like what people are used to from a central database, and is one of the reasons why P2P networks have limited popularity. Anything we can do to improve on this would help popularity.
- I imagine that aggregation services will spring up for that purpose, similar to the search engines mentioned above. They will be powered by central servers and thus not be truly P2P, but at least they don't own your data.
This means: Either the social network where you created your account needs to be aware of Nexus, or you need to view the profile information through another service that adds the Nexus information - which can be a Firefox extension, or a website.
It might be possible to handle the rendering within the Nexus platform. (Java is known for its vast range of libraries. There must be a rendering engine. Java can do almost anything.) Or, Nexus can fetch the HTML page, modify it, and then pipe it to the browser in some way. This would be a nice little piece of magic.
- Please keep in mind that this should work in internet cafés!
There could be "reference realms", their purpose being to decide which references are relevant and which are not. In addition, these entities could offer services similar to couchsurfing's MDST. The difference being that you can switch to a different reference realm if you feel they do a bad job.
Possibly, a whole digital ecosystem could spring up around this. That's fine with me. I like your concept of competing reference realms. I used to study genetic algorithms, and it's amazing what self-organized complexity can spontaneously occur if you create the right kind of framework for that to happen. Incidentally, from the beginning we've kept the possibility in mind that this trust network could grow beyond hospex into bartering, etc.; anything requiring trust.
Great thoughts. Thanks!
Basically, I would like to see answered all the questions I posted on Decentralized networks.
Each time I thought about decentralized networks myself, it was these questions that stopped me.
Thanks! -- Lemon-head 22:17, 20 October 2009 (UTC)
Thanks for your interest! Matrixpoint 12:01, 21 October 2009 (UTC)