Distributed Tracker

Status: algorithm design done, re-PEX code merged in mainbranch
svn.tribler.org/abc/branches/mainbranch/Tribler/Core/DecentralizedTracking/repex.py

Motivation: Traditional television is moving online according to numerous studies. This means every point of failure needs to be removed from the architecture, resulting in a fully self-organising system with unbounded scalability called 4th Generation P2P.

Tribler is striving for unbounded scalability. Tribler already addresses the problem of decentralized content discovery using BuddyCast, but the current client still lacks a solution for decentralized tracking of swarms. Previous work on decentralized tracking has been done by Jelle Roozenburg, but further research is required. Current research is being conducted by MSc student Raynor Vliegendhart.

Master thesis work

Implement and deploy a fully distributed tracker in Tribler. Used algorithm is the 2-hop torrentsmell algorithm, described in the research assignment document below. ToDo:

(done) basic cmdline tool for Re-pex handshake
(done) expand tool into swarm measurement crawler
Measure for a few swarms the average response time per client type
- (done) 1min max listen time
- determine liars about PEX capability
(done) Parallel crawling & measurement
- Uniform event format .crawl file
- Timestamp, SHA1, Peer-IP, peer-Port, EVENT, DATA
  - CONNECTION_ESTABLISHED, PeerID
  - CONNECTION_TIMEOUT
  - CONNECTION_CLOSED, bytes-downloaded, bytes-uploaded
  - CONNECTION_LOST (in case of network error), bytes-downloaded, bytes-uploaded
  - EXTEND_SEND
  - EXTEND_RECEIVED, ut_pex=1 (msg#), Transmission 1.34 (client version)
  - PEX_RECEIVED, peer-IP, peer-port, ..., DROPPED:, peer-IP, peer-port, ...
  - (first time) ONLINE_CHECK_SEND
  - (after 1hour,2h,...) ONLINE_CHECK_SEND
  - ONLINE_CHECK_SUCCESS
  - ONLINE_CHECK_TIMEOUT (connection never accepted: offline or busy)
  - ONLINE_CHECK_ERR (network error)
- Also included, but currently only relevant for human readers:
  - TRACKER_PEERS_INJECTED, announce-url, #reported-seeds, #reported-downloaders, peer-IP, peer-port, ...
Experiment around churn rate and decay of PEX info
- Crawl several swarms for 1? hour, collect 10000 peers
- Store all collected IPv4 addresses
- Some peers were heard about multiple times
- check online status and connectability of discovered peers for 1 week (drop if 10x connection unsuccessful)
  - Still online
  - Still seeding
- Determine relation with swarm age (1hour old vs. 3 months old)
Measure bytes used for PEX traffic
Are seeders-only swarms breaking our algorithms? Seeders have no established connections without leechers

Thesis Chapter Layout

Intro
Problem Description
speed, scalability, security, etc
Design of 2-hop TorrentSmell
- remote search, expand with PEX results
- Re-PEX for keeping swarm contact
Implementation and experiments
measuring and understanding PEX (or prev. section?) Graphs of 1 command line experiment
cost per re-pex, interval settings, etc.
Algorithm setting adjustments
Reduced tracker fallback rate, new bandwidth/storage requirements
Deployment?
Deployment crawl results, etc
Conclusions

Swarm crawling measurement

Popular torrent, reportedly 2897 seeders and 826 downloaders, on 2009-09-04, from 10:36 to 11:38.
The graphs below show the bandwidth usage during the crawl, the number of the peers actually PEXed, statistics about the first PEX message, and the likelihood of reconnecting to a previously known peer in the future.

3 Linux ISO swarms
Below is a graph showing the number of peers still connectable after we've last seen them. Note that the big drop at the end can be explained through the lack of data points: only 203 of the 475 initial peers have been checked for their online status.

If we crawl the swarm while trying to stay connected to PEX-capable peers until the connection is dropped by the other party, the graph looks quite different (see below). Peers seem to be less likely connectable after they have closed the connection. Also, while less peers get sent an online check, there is an increasing trend of connectability. This is in line with the results of the "Understanding Churn in Peer-to-Peer Networks" paper, where the authors found the uptime of a peer to be a good indicator of the peer's remaining uptime.

Research Assignment

Ongoing work by Raynor. Large measurements show that The Pirate Bay is the most dominant central tracker. For each tracker we computed the number of total peers that it tracks. This was done by summing up the swarm sizes of all torrents in which a tracker was mentioned in the announce fields. The .torrent files of these swarms were obtained by running the Tribler TorrentCollecting software for roughly a year. Swarm sizes were constantly updated during this time. The top 20 trackers are depicted in the bar chart below:

Dataset: 283,031 swarms, total of 52,634,778 peers.

In this chart, peers that are tracked solely by a tracker are marked as "Single Tracker". Hence, these peers come from a swarm which are only tracked by a single tracker. The remaining peers come from swarms that are tracked by multiple trackers. "Primary Tracker" indicates the peer came from a multi-tracker swarm which is primarily tracked by this tracker, i.e. the tracker is mentioned in the 'announce' field of the torrent file. If the tracker was not mentioned in the 'announce' field (and hence, it must have been mentioned in the 'announce-list' field), we label the peers as "Foreign Tracker".

The swarms presented in this chart where carefully filtered for spam, fakes, and misreportings.

DHT Papers

An Analysis of BitTorrent's Two Kademlia-Based DHTs (PDF, CiteSeer)
Describes the DHT implementations of Mainline and Azureus and their shortcomings.
The Index Poisoning Attack in P2P File Sharing Systems (PDF, CiteSeer)
Describes index poisoning in Overnet, which is a Kademlia-based DHT.
DDoS Attacks by Subverting Membership Management in P2P Systems (PDF, BibTeX)
Discusses the vulnerabilities of DHTs and gossip algorithms and argue that pull-based designs are preferred over push-based designs.
Handling Churn in a DHT (PDF, CiteSeer, TR)
Contains a table on observed session times in various P2P systems.
The Impact of DHT Routing Geometry on Resilience and Proximity (PDF, BibTeX)
Comparison of different DHT routing geometries, including the XOR geometry of Kademlia.
POND: The Power of Zone Overlapping in DHT Networks (PDF, BibTeX)
Discusses the problem of load balancing in DHTs.
Profiling a Million User DHT (PDF, BibTeX)
Studies the Azureus DHT.
Secure routing for structured peer-to-peer overlay networks (PDF, BibTeX)
Discusses secure nodeIds, secure maintenance of routing tables and secure message forwarding in DHTs.

PEX, Gossips, Random/Ant Walks

Understanding the Properties of the BitTorrent Overlay (PDF, BibTeX)
Describes the effect of PEX on the overlay network.
Evolution and Enhancement of BitTorrent Network Topologies (PDF, BibTeX)
Slightly related to the previous paper, but does not cover PEX. Paper proposes modified central tracking algorithm for Small World effect.
Really Truly Trackerless BitTorrent (PDF, BibTeX)
Proposes random walks to discover peers.
Random Walks in Peer-to-Peer Networks (PDF, BibTeX)
Argues random walks are superior to flooding in case of clustered overlay topologies. Discusses a weakly decentralized construction algorithm using random walks.

Gossip-based peer sampling (PDF, BibTeX)
Explores the design space of gossip protocols for peer sampling.
A pheromone-based coordination mechanism applied in P2P (PDF, CiteSeer)
Discusses real and synthetic pheromones and studies how the latter can be used to build a P2P system.
Exploring the Interdisciplinary Connections of Gossip-based Systems (PDF, BibTeX)
Gives a definition of gossip protocols and shows several other algorithms share similarities.

Also of interest: PEXCrawl

BitTorrent Trackers

Availability in BitTorrent Systems (PDF, BibTeX, TR)
Discusses central single and multi-tracker swarms and the availability of the trackers.

Unsorted Papers

A Survey and Comparison of Peer-to-Peer Overlay Network Schemes (BibTeX)
Survey and comparison of various structured and unstructured P2P networks.
Peer-to-Peer Overlay Networks: A Survey (PDF, BibTeX)
A less extensive survey. (Can be removed since the previous paper is more extensive?)
Debunking some myths about structured and unstructured overlays (PDF, BibTeX)
Debunks the myths that structured overlays are expensive to maintain, their topology constraints make it harder to exploit heterogeneity, etc.
Understanding Churn in Peer-to-Peer Networks (PDF, BibTeX)
Contains extensive study of churn.

Previous Work: Little Bird

This section should be merged after research has been completed

Removing the Bittorrent tracker

JelleRoozenburg

In the BitTorrent p2p system all users that download a certain file profit from each other by bartering file pieces. Using this principle, files can be downloaded faster. A group of users that download the same file on the same moment are called a download swarm. The function of the BitTorrent tracker is to find out which users are in the download swarm of a certain file. A secondary task of the tracker is to administrate statistical information about the swarm, for instance which part of the file is downloaded by which users.

Tracker protocol

These tasks are currently implemented in a centralized fashion. A BitTorrent tracker is a server to which users can send a getPeers() request. This is request will be answered by the tracker with a response containing a list of IP addresses and port numbers of peers that are currently in this swarm. Communication with the tracker is done using the HTTP protocol, just like a webserver.

Disadvantages of a centralized tracker

The decentralized design of p2p systems is one of their major advantages. It makes the BitTorrent system flexible, scalable and reliable. The BitTorrent tracker is one of the parts of the system that still has a centralized design. This makes its simple implementation possible, but is formost a disadvantage for BitTorrent.

A centralized tracker is a single point of failure in the BitTorrent system. This means that its failure will cause an interruption of the BitTorrent service, namely: to enable people to join and use download swarms. While the system gives a reliable download environment through thousands of users going online and offline at each moment, a disfunctional tracker immediately stops the download possibility of all new users.

A BitTorrent tracker is also not scalable because of its centralized design. On this moment it is already noticable that trackers have long response delays and peers often have to do more tracker requests due to time-outs. When the number of BitTorrent users will grow in the future this problem will even grow in such a way that building a reliable centralized tracker will be undoable or at least very expensive.

A final disadvantage of the current centralized trackers is that they are an easy target for an attack at the BitTorrent system. A simple distributed denial of service (DOS) attack can stop a tracker and with it thousands of users. With the current use of BitTorrent for the exchange of copyrighted materials, it is obvious that certain parties have motives for such attacks.

Distributed tracker

For the reasons given in the previous section, it shows that the centralized BitTorrent tracker should be replaced by a distributed follow-up. In this section we will explain how this is done and which issues have to be considered when distributing the tracker's functionality over all peers in the swarm. Find initial tracker peers

The most important functionality of a tracker is to give any peer a list of addresses through which other peers in the swarm can be contacted. With a centralized tracker, the tracker location is saved in the .torrent-file so that each peer that starts a download can immediately contact it. When such a central point is not present, then we first have to find the tracker. In the distributed case, the tracker consists of all peers (or more commonly: all peers in the swarm in question). So the problem is: find initial peers that are in the swarm and are part of the distributed tracker.

This problem will be solved using peer gossiping. Using this protocol, each peer that starts to seed a new file over the network or joins an existing swarm, pushes a message to the peers in its neighborhood. This messages contains his permanent identifier, a timestamp and a datastructure (probably a Bloom filter) with all the info hashes of the files it is currently seeding. These messages are forwarded by other peers according to the probabilistic gossiping protocol. In this way, most peers in the network receive the messages.

When a swarm of reasonable size has formed, most peers should have a collection of messages with information about the current members of this swarm. If a peer wants to download the file (so join the swarm), it can simply connect to the peers in its list, check if they are still active swarm members and join in. Get more swarm peers

In the current BitTorrent system, all peers in a swarm periodically reconnect to the tracker to add more swarm members to their peer lists. By doing so, they have knowledge of more peers and the probability of finding a peer with the pieces of the file you need grows.

In the distributed tracker case, when a peer has found some peers using the method of section 'Find initial tracker peers', it will use these peers to enlarge its list of known peers in the swarm. This can be done by asking these peers for their swarm list (all peers in the swarm). Although this sounds very simple, there are different approaches to do this.

Track just seeders?

Would it make sense to gossip only about seeders? A popular file will have fewer seeders than leechers, and the set of seeders will be more stable. In principle, you only need to find one peer, since you can random walk from there to find others. Bittorrent is normally supposed to work even when the swarm includes only leechers, but would it make sense to give up on that and instead try to get more seeders? Actually, the distinction between seeders and leechers is blurry to me. In practice, a peer could advertise itself when 1) its policy has been configured to seed files for a sufficiently long period of time after download completion and 2) it has detected that the user leaves it running all the time and not just when he is downloading a file. --rimey@hiit.fi

DHT and alternatives

Numerous DHT proposals exist to decentralise peer discovery. An advanced proposal uses Skip Graphs to discover content and can be adapted to discover peers. However, security is not fully taken into account and a trivial pollution attack can bring the system down.

Attachments

2-Hop TorrentSmell - DevMeeting 10-07-2009 v2.ppt (1.0MB)
Azureus_DHT_failure.png (41.0KB)
crawl1-bandwidth-pex-zoom.png (75.1KB)
crawl1-online-percentage.png (40.0KB)
crawl1-online.png (52.4KB)
crawl1-pexers.png (25.7KB)
crawl1-pexresponse-combined.png (91.0KB)
crawl1-pexsize-zoom.png (70.4KB)
data_s2=10_i=600_j=60_scan=30.pdf (336.3KB)
graph-singletpbswarm.pdf (12.0MB)
graph-ubuntu-swarm.eps (1.5MB)
linux-keepalive-online-percentage.png (44.0KB)
linux-keepalive-online.png (53.6KB)
linux-online-percentage.png (40.9KB)
linux-online.png (48.3KB)
MSc_Thesic_final_jroozenburg.pdf (1.0MB)
pexquality.eps (33.9KB)
research-20090327.pdf (1.3MB)
research-20090506.pdf (1.3MB)
stats.bz2 (14.9MB)
stats2.bz2 (36.1MB)
thesis-draft2-pexchapter.pdf (1.8MB)
thesis-draft3-implchapter.pdf (6.1MB)
thesis-draft5.pdf (6.6MB)
thesis-final-vliegendhart.pdf (5.8MB)
Top20Hosts-12G.png (18.1KB)
Top20Hosts-166M.png (18.2KB)
Top20Trackers-Peers-SingleMulti.PNG (18.8KB)
Top20Trackers-Peers.PNG (18.4KB)
Top20Trackers-TPB.PNG (35.1KB)
tpb-9swarms-connectivitymatrix.pdf (7.9MB)