IPv8: Peer-to-Peer overlay network

In short: a library for networking in distributed applications based on a P2P-overlay which handles IP changes, strong identities, trust levels, and neighbourhood graphs.

Overview

Problems with the very fabric of The Internet, IPv4, are mounting. The approach of IPv6, Mobile IP, and IPSec is hampered by fundamental architectural problems. A superior solution is moving the intelligence up to a higher layer in the protocol stack and towards the end points.

We have the expertise to design and build innovative P2P overlay software. Our overlay will offer a secure network connection to either a known person or a specific computer. The connection is robust against eavesdropping, man-in-the-middle attacks, peer failure, network failure, packet loss, change of IP numbers, network mobility, and blocking by NAT/firewalls. Our solution exposes trust and reputation levels to the networking layer to lower the risk of DDoS attacks.

Functionality

IPv8 is a P2P overlay network which unlocks more advanced functionality. Over the coming 5 years we aim to evolve this technology and offer the following functionality:

  • Direct, safe, and robust communication between you and any other node
  • Determine the friendship paths between you and any other node by integrating existing web-based social networks
  • Estimate the trust level between you and any other node
  • Exchange of multimedia information of any size or popularity
  • Transfer of virtual currency (credits) or real money to any other node

ToDo: Also manage internal network addresses, discover the external network address, and connect to peers within the same subnet using their internal IP address. Expand with NAT/firewall puncturing, UDP/HTTP encapsulation, user space TCP rate control, and relaying through proxies.

Performance and awareness

IPv8 also enables a new interface for performance and network awareness. Currently every application has to guess the available bandwidth, latency, etc., while all this information is available in the hidden TCP state. Especially for network-dependent applications this can boost effectiveness and efficiency. (As nicely described years ago by MIT people in the Daytona paper.)

TCP manages each stream/connection separately; when working with multiple concurrent streams, TCP has issues. As P2P routinely employs numerous connections, these issues surface. E.g. BitTorrent has 4 upload connection slots; otherwise, Cohen claims, TCP performance is suboptimal.

So, managing all streams by a single control loop may bring some benefits.

Related work

Real world measurements

Security specific

STUN Specific

ICE - Interactive Connectivity Establishment

Uses coordinating servers to enable two NATted peers to talk. Automatically switches to relay techniques when port prediction of symmetric NAT fails.

TCP connection establishment

The aim is to copy the TCP handshake algorithm with its SYN and SYN-ACK packets.

SCTP

Planning

  • NAT check: every peer runs the STUN protocol in order to find out the kind of NAT (if any) they are behind, as well as their public address (IP and port)
  • NAT timeout: every peer has to find out the timeout of their own NAT for UDP connections
  • UDP hole punching: combining the information above in order to implement UDP hole punching
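
The NAT timeout item above could be approached with a binary search over idle periods. The sketch below is only an illustration: `mapping_survives` is a hypothetical callback (not part of any existing code) that stays silent for the given number of seconds and then reports whether a packet sent to the old public address still arrives. The search bounds and precision are assumptions as well.

```python
from typing import Callable


def estimate_nat_timeout(mapping_survives: Callable[[float], bool],
                         low: float = 1.0,
                         high: float = 600.0,
                         precision: float = 1.0) -> float:
    """Estimate how long (seconds) a UDP mapping stays alive without traffic.

    Returns 0.0 if the mapping does not even survive `low` seconds and
    `high` if it survives the longest idle period we are willing to test.
    """
    if not mapping_survives(low):
        return 0.0
    if mapping_survives(high):
        return high
    # Invariant: the mapping survives `low` idle seconds but not `high`.
    while high - low > precision:
        mid = (low + high) / 2.0
        if mapping_survives(mid):
            low = mid
        else:
            high = mid
    return low
```

Each probe costs as much wall-clock time as the idle period it tests, so a real implementation would probably run this in the background and send keep-alives somewhat more often than the estimate.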

For the NAT check we are using the STUN algorithm (the figure STUN_Algorithm.png is missing from this page).
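
Since the figure is not available, here is a minimal sketch of the decision tree as described in the classic STUN specification (RFC 3489). It assumes the three STUN tests have already been run and only classifies their outcomes; the function name, argument names, and the "unknown" fallback are illustrative and not taken from our code.

```python
from typing import Optional, Tuple

Address = Tuple[str, int]  # (ip, port)


def classify_nat(local: Address,
                 mapped_test1: Optional[Address],
                 test2_answered: bool,
                 mapped_test1_alt: Optional[Address],
                 test3_answered: bool) -> str:
    """Classify the NAT from the outcomes of the RFC 3489 tests.

    mapped_test1      mapped address from Test I, None if no reply
    test2_answered    reply to Test II (server answers from other IP and port)
    mapped_test1_alt  mapped address from Test I sent to the alternate server
    test3_answered    reply to Test III (server answers from other port only)
    """
    if mapped_test1 is None:
        return "udp blocked"
    if mapped_test1 == local:
        # No address translation; only filtering may be in place.
        return "open internet" if test2_answered else "symmetric udp firewall"
    if test2_answered:
        return "full cone nat"
    if mapped_test1_alt is None:
        return "unknown"  # assumption: the RFC does not cover a lost reply here
    if mapped_test1_alt != mapped_test1:
        return "symmetric nat"
    return "restricted cone nat" if test3_answered else "port restricted cone nat"
```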

P2TP: rate-controlled UDP

32 bits per line
+-UDP-----------+---------------+
| source port   | destination p.|
| length        | checksum      |
+-P2TP----------+---------------+
|fl    pckt seq |fl  timestamp  |
+---------------+---------------+ 

Where
  fl: 2+2=4 bits of flags

    SOP stream open
	set after receiving a correct returned timestamp
    LSS packet loss detected
	set when loss is detected (gap in packet sequence numbers
	which was not closed for some time)
	unset when CLR flag is received
    CLR clear packet loss flag
	set when LSS flag received
	unset when LSS flag is cleared
    YTS returned timestamp
	set when the second field is occupied by a returned timestamp
	unset when the second field is occupied by a forward timestamp

	if SOP is unset, the second field always contains a returned
	timestamp or 0 if no timestamps were received yet; the first
	field contains forward timestamp as no packet sequence numbers
	are meaningful before the stream is open

  pckt seq, 30bits: sequential number of the packet (or forward
        timestamp if SOP=0)

  timestamp, 30 bits: either forward or returned timestamp, used for
        RTT calculations as well as a lightweight security mechanism
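
The header layout above could be packed and parsed roughly as follows. The text only says that the four flags are split 2+2 over the two 32-bit words; which flag sits in which word, and their bit positions, are assumptions made for this sketch (SOP and LSS in front of the first field, CLR and YTS in front of the second).

```python
import struct
from typing import NamedTuple

MASK30 = (1 << 30) - 1  # both fields are 30 bits wide


class P2TPHeader(NamedTuple):
    sop: bool        # stream open
    lss: bool        # packet loss detected
    clr: bool        # clear packet loss flag
    yts: bool        # second field holds a returned timestamp
    seq: int         # packet sequence number (forward timestamp if SOP=0)
    timestamp: int   # forward or returned timestamp


def pack_header(h: P2TPHeader) -> bytes:
    """Encode the 8-byte P2TP header in network byte order."""
    word1 = (h.sop << 31) | (h.lss << 30) | (h.seq & MASK30)
    word2 = (h.clr << 31) | (h.yts << 30) | (h.timestamp & MASK30)
    return struct.pack("!II", word1, word2)


def unpack_header(data: bytes) -> P2TPHeader:
    """Decode the first 8 bytes of a datagram payload."""
    word1, word2 = struct.unpack("!II", data[:8])
    return P2TPHeader(sop=bool(word1 >> 31),
                      lss=bool((word1 >> 30) & 1),
                      clr=bool(word2 >> 31),
                      yts=bool((word2 >> 30) & 1),
                      seq=word1 & MASK30,
                      timestamp=word2 & MASK30)
```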

Stream initiation is supposed to work as follows: either one or both peers send out an initial packet having SOP=0, a forward timestamp in the first field, and 0 in the second field as no peer timestamps were received yet. The forward timestamp is set to local time and "encrypted" using peer-ip-and-port as a key (a variation of SYN cookies). I plan to use timestamps, not sequence numbers, for lightweight security because... I don't know why. RTT varies by fewer orders of magnitude than transmission rate, so probably that is a better choice. On receiving a SOP=0 datagram, the peer's actions depend on whether the returned timestamp looks good. If it does, the peer "opens" the stream and sends out a datagram {SOP=1, LSS=0, CLR=1, YTS=1, pckt_seq=i++, timestamp=peer timestamp}; this datagram may already contain some payload as we know that the other end is really responding and is not a DDoS victim. If no returned timestamp is present, the peer sends out {SOP=0, LSS=0, CLR=1, YTS=1, pckt_seq=own timestamp, timestamp=peer timestamp} without any payload.

Basically, the algorithm mimics TCP's 3-way handshake, except that both parties may be initiators simultaneously. Returning the timestamps ensures that the other side is really talking the protocol. Ideally, a peer may send out some data one RTT after the stream is initiated.
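
One way the "encrypted" forward timestamp could work is sketched below. The text does not fix a cipher, so this is an assumption: the 30-bit timestamp is XORed with a mask derived by HMAC-SHA256 from a local secret and the peer's IP and port. The function names, the millisecond clock, and the acceptance window `max_rtt_ms` are all hypothetical. A second field of 0 ("no timestamp received yet") has to be handled by the caller before calling the check.

```python
import hashlib
import hmac
from typing import Optional

MASK30 = (1 << 30) - 1  # timestamps occupy 30 bits in the header


def _mask(secret: bytes, peer_ip: str, peer_port: int) -> int:
    """Derive a 30-bit mask that is specific to one peer address."""
    digest = hmac.new(secret, f"{peer_ip}:{peer_port}".encode(),
                      hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") & MASK30


def make_forward_timestamp(now_ms: int, secret: bytes,
                           peer_ip: str, peer_port: int) -> int:
    """Local time, masked so that only this peer address can echo it back."""
    return (now_ms & MASK30) ^ _mask(secret, peer_ip, peer_port)


def check_returned_timestamp(returned: int, now_ms: int, secret: bytes,
                             peer_ip: str, peer_port: int,
                             max_rtt_ms: int = 60_000) -> Optional[int]:
    """Return the RTT in ms if the echoed timestamp looks good, else None."""
    sent = (returned & MASK30) ^ _mask(secret, peer_ip, peer_port)
    rtt = ((now_ms & MASK30) - sent) & MASK30  # modular, survives clock wrap
    return rtt if rtt <= max_rtt_ms else None
```

A timestamp echoed from a different address unmasks to an essentially random value, so it only passes the check with a probability of roughly max_rtt_ms / 2^30. That is lightweight protection, as the text says, not strong authentication.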

Packet sequence numbers let the receiver detect gaps resulting from losses. Once loss is detected, the LSS bit is set. On receiving a packet with the LSS bit set, the sender adjusts its sending rate and starts sending datagrams with the CLR bit set until the LSS bit is cleared by the other side. (Need to remember the sequence number of the last state change to ignore packet reordering.) During normal data exchange, peers send datagrams with a forward timestamp (YTS=0) or a returned timestamp (YTS=1), depending on conditions.
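
The receiver side of this loss signalling might look like the sketch below. It is only an illustration under several assumptions: the grace period of 0.2 s stands in for "not closed for some time", sequence number wraparound at 30 bits is ignored, and the class and method names are made up.

```python
class LossDetector:
    """Tracks sequence gaps and decides the LSS bit for outgoing datagrams."""

    def __init__(self, grace: float = 0.2):
        self.grace = grace        # seconds a gap may stay open (reordering)
        self.next_expected = 0    # lowest sequence number not yet seen
        self.missing = {}         # missing seq -> time the gap was noticed
        self.lss = False

    def on_packet(self, seq: int, now: float, clr: bool = False) -> bool:
        """Process one received datagram; return the LSS bit to send back."""
        if seq >= self.next_expected:
            # Everything skipped between the last packet and this one is a gap.
            for s in range(self.next_expected, seq):
                self.missing[s] = now
            self.next_expected = seq + 1
        else:
            # A late (reordered) packet closes its gap.
            self.missing.pop(seq, None)

        if clr and self.lss:
            # Sender acknowledged the loss: forget the gaps already reported.
            self.missing = {s: t for s, t in self.missing.items()
                            if now - t < self.grace}
            self.lss = False
        elif any(now - t >= self.grace for t in self.missing.values()):
            self.lss = True
        return self.lss
```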

The planned rate control algorithm is a variation of TCP CUBIC.
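
For reference, the window growth function of standard TCP CUBIC (as specified in RFC 8312) can be written down in a few lines. This shows plain CUBIC with the RFC's default constants, not the planned P2TP variation, whose details are not decided here.

```python
C = 0.4      # CUBIC scaling constant (RFC 8312 default)
BETA = 0.7   # multiplicative decrease factor (RFC 8312 default)


def cubic_k(w_max: float) -> float:
    """Time (seconds) the curve needs to climb back to w_max after a loss."""
    return (w_max * (1.0 - BETA) / C) ** (1.0 / 3.0)


def cubic_window(t: float, w_max: float) -> float:
    """Congestion window t seconds after the last loss event.

    w_max is the window size just before that loss.
    """
    return C * (t - cubic_k(w_max)) ** 3 + w_max
```

Right after a loss (t = 0) the window is BETA * w_max; it flattens out around w_max at t = K and then probes beyond it.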

Seemingly, the protocol allows no extensions as every bit is consumed. Still, no data is supposed to reside in SOP=0 datagrams, so we may append some stuff there later to negotiate options/extensions.

Towards a first prototype

Simplify and make concrete decisions.

Problem description V2

The research challenge is to combine cloud technology with database techniques. Can we abstract away IPv4 addresses from applications? Can we bring the superset of the various peer relations into a single flexible storage principle? Can we store peer relations separately and in a policy-neutral way?

Design V2

Focus on abstracting away IPv4 addresses into a simple bindings table which is consulted when an actual connection is made.

Peers_Table

ID    Peer PermID
01    98f87AA

Bindings_Table

ID IPv4 port timestamp certificate

Design for keeping superset of various relations
Relations_Type_Table

REL_ID    Peer Relation Type
0         Peer Traffic Exchange
1         Similar Taste relation
2         Voted as a moderator
3         Facebook Friend
4         new

Relations_Table

ID_1 ID_2 REL_ID timestamp Relationship_Strength_Factor certificate
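
A possible starting point for the Python code for these tables, using the standard sqlite3 module. The column types, the SQL table names, and the `latest_binding` helper are assumptions for illustration; only the column lists come from the design above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE peers (
    id INTEGER PRIMARY KEY,
    permid TEXT UNIQUE NOT NULL
);
CREATE TABLE bindings (
    id INTEGER NOT NULL REFERENCES peers(id),
    ipv4 TEXT NOT NULL,
    port INTEGER NOT NULL,
    timestamp INTEGER NOT NULL,
    certificate BLOB
);
CREATE TABLE relation_types (
    rel_id INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE relations (
    id_1 INTEGER NOT NULL REFERENCES peers(id),
    id_2 INTEGER NOT NULL REFERENCES peers(id),
    rel_id INTEGER NOT NULL REFERENCES relation_types(rel_id),
    timestamp INTEGER NOT NULL,
    relationship_strength_factor REAL NOT NULL,
    certificate BLOB
);
"""

RELATION_TYPES = [
    (0, "Peer Traffic Exchange"),
    (1, "Similar Taste relation"),
    (2, "Voted as a moderator"),
    (3, "Facebook Friend"),
]


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the four tables and fill in the known relation types."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO relation_types VALUES (?, ?)", RELATION_TYPES)
    db.commit()
    return db


def latest_binding(db: sqlite3.Connection, permid: str):
    """Return (ipv4, port) of the newest binding for a PermID, or None."""
    return db.execute(
        "SELECT b.ipv4, b.port FROM bindings b JOIN peers p ON p.id = b.id "
        "WHERE p.permid = ? ORDER BY b.timestamp DESC LIMIT 1",
        (permid,)).fetchone()
```

Applications would then address a peer by PermID only and call something like `latest_binding` at the moment a connection is actually opened.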

Database synchronisation algorithms

Describe the State of the art for Gossip and DB sync.

See IPv8Datasync

A key aspect of IPv8 is keeping track of 50k or 1 million peers. The research challenge is which database synchronisation algorithm to use.

Existing algorithms and approaches:

In much of the related work in this area, a lot of thought and complexity goes into conflict resolution. For IPv8 we can explicitly exclude conflict resolution or simplify it to using the entry with the latest timestamp. Thus we can fortunately simplify matters and use the data synchronisation class of algorithms.
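
The "latest timestamp wins" rule is small enough to sketch directly. The record layout here is an assumption: records are keyed by (ID_1, ID_2, REL_ID) as in the Relations_Table, and each record is a dict with at least a "timestamp" entry.

```python
from typing import Dict, Tuple

Key = Tuple[int, int, int]  # (ID_1, ID_2, REL_ID)


def merge_latest(local: Dict[Key, dict], remote: Dict[Key, dict]) -> Dict[Key, dict]:
    """Merge two record sets; on a clash the newer timestamp wins."""
    merged = dict(local)
    for key, record in remote.items():
        current = merged.get(key)
        if current is None or record["timestamp"] > current["timestamp"]:
            merged[key] = record
    return merged
```

On equal timestamps this sketch keeps the local copy; any deterministic tie-break would do as long as both peers apply the same one.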

Our problem seems different to the prior work. Our key principle is that we operate in a completely untrusted self-organising ecosystem. Thus for each peer we meet we only want records which are signed by peers which have a sufficient reputation. In the Relations_Table we use the Relationship_Strength_Factor to store the weight of this link in the graph. Using betweenness centrality or any other reputation function we can then calculate the reputation score of each peer. As bandwidth is very scarce in P2P we only want to upload records to a peer when the receiver does not yet have them. This leads us to the following:

Research Question: Which architecture supports the exchange of records between
         two peers which encounter each other in a self-organising system where:
         the exchanged records both contain previously unknown information and 
         those records are signed by peers which have a sufficient
         score in the reputation system.
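
To make the two conditions in the research question concrete, the selection step on the sending side could look like this. It deliberately leaves the hard part open: how the sender learns which records the receiver already has (here simply a set of record ids), and how the reputation scores are computed. All names and the threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Record:
    record_id: str   # unique id of the record, e.g. a hash
    signer: str      # PermID of the peer that signed the record
    payload: bytes


def records_to_send(my_records: Iterable[Record],
                    receiver_has: Set[str],
                    reputation: Dict[str, float],
                    threshold: float) -> List[Record]:
    """Select records that are new to the receiver and sufficiently trusted.

    Signers without a known reputation score are treated as score 0.0.
    """
    return [r for r in my_records
            if r.record_id not in receiver_has
            and reputation.get(r.signer, 0.0) >= threshold]
```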

Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions, Adomavicius et al.

Planning

  • Expand problem description, design and sync algorithms to readable form
  • Create the Python code for this table
  • Discuss with Tribler people
  • first prototype where 2 peers sync their data
  • Implement, test & compare algorithms for sync