Removing the BitTorrent tracker
In the BitTorrent p2p system, all users that download a certain file profit from each other by bartering file pieces. This lets files be downloaded faster. A group of users that download the same file at the same time is called a download swarm. The function of the BitTorrent tracker is to find out which users are in the download swarm of a certain file. A secondary task of the tracker is to keep statistical information about the swarm, for instance which part of the file has been downloaded by which users.
Tracker protocol
These tasks are currently implemented in a centralized fashion. A BitTorrent tracker is a server to which users can send a getPeers() request. The tracker answers this request with a response containing a list of IP addresses and port numbers of peers that are currently in the swarm. Communication with the tracker uses HTTP, just as with a web server.
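The request and response described above can be sketched as follows. This is a minimal illustration, not the tracker implementation itself: it assumes the standard HTTP announce query parameters and the widely used compact peer encoding (4 bytes of IPv4 address followed by a 2-byte big-endian port per peer). The function names are hypothetical, and decoding of the bencoded response envelope is left out.

```python
import socket
import struct
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port, left):
    """Build the HTTP GET URL for a tracker announce (the getPeers() request)."""
    query = urlencode({
        "info_hash": info_hash,  # 20-byte SHA-1 that identifies the torrent
        "peer_id": peer_id,      # 20-byte identifier chosen by the client
        "port": port,            # port on which this peer accepts connections
        "uploaded": 0,
        "downloaded": 0,
        "left": left,            # bytes this peer still has to download
        "compact": 1,            # ask for the compact peer encoding (assumption)
    })
    return tracker_url + "?" + query


def decode_compact_peers(blob):
    """Turn a compact peer string into a list of (ip, port) tuples."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list must be a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(blob), 6):
        ip_bytes, port = struct.unpack_from("!4sH", blob, offset)
        peers.append((socket.inet_ntoa(ip_bytes), port))
    return peers
```

A client would fetch the URL over HTTP, extract the peer string from the response, and pass it to decode_compact_peers to obtain the addresses it can connect to.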
Disadvantages of a centralized tracker
The decentralized design of p2p systems is one of their major advantages. It makes the BitTorrent system flexible, scalable and reliable. The BitTorrent tracker is one of the parts of the system that still has a centralized design. This keeps its implementation simple, but it is first and foremost a disadvantage for BitTorrent.
A centralized tracker is a single point of failure in the BitTorrent system. Its failure interrupts the core BitTorrent service: enabling people to join and use download swarms. While the system as a whole provides a reliable download environment even though thousands of users go online and offline at any moment, a dysfunctional tracker immediately prevents all new users from downloading.
A BitTorrent tracker is also not scalable because of its centralized design. It is already noticeable that trackers have long response delays and that peers often have to repeat tracker requests because of time-outs. As the number of BitTorrent users grows, this problem will worsen to the point where building a reliable centralized tracker becomes infeasible, or at least very expensive.
A final disadvantage of the current centralized trackers is that they are an easy target for an attack on the BitTorrent system. A simple distributed denial of service (DDoS) attack can stop a tracker and, with it, thousands of users. Given the current use of BitTorrent for the exchange of copyrighted material, it is obvious that certain parties have motives for such attacks.
Distributed tracker
For the reasons given in the previous section, the centralized BitTorrent tracker should be replaced by a distributed successor. In this section we explain how this can be done and which issues have to be considered when distributing the tracker's functionality over all peers in the swarm.
Find initial tracker peers
The most important function of a tracker is to give any peer a list of addresses through which other peers in the swarm can be contacted. With a centralized tracker, the tracker location is saved in the .torrent file, so that each peer that starts a download can contact it immediately. When no such central point is present, we first have to find the tracker. In the distributed case, the tracker consists of all peers (or more precisely: all peers in the swarm in question). So the problem is to find initial peers that are in the swarm and are part of the distributed tracker.
This problem will be solved using peer gossiping. With this protocol, each peer that starts to seed a new file over the network, or joins an existing swarm, pushes a message to the peers in its neighborhood. This message contains the peer's permanent identifier, a timestamp and a data structure (probably a Bloom filter) with the info hashes of all files it is currently seeding. These messages are forwarded by other peers according to the probabilistic gossiping protocol. In this way, most peers in the network receive the messages.
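The gossip message described above could look roughly like the sketch below. Everything here is an assumption made for illustration: the class and function names, the Bloom filter parameters (1024 bits, 4 hash functions derived from SHA-1) and the rule of forwarding to each neighbour independently with a fixed probability. The text does not fix any of these details.

```python
import hashlib
import random
import time
from dataclasses import dataclass


class BloomFilter:
    """Small Bloom filter over byte strings (parameters are illustrative)."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-1 with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


@dataclass
class SeedingMessage:
    peer_id: bytes         # permanent identifier of the sender
    timestamp: float       # when the message was created
    seeding: BloomFilter   # info hashes the sender is currently seeding


def make_message(peer_id, info_hashes, now=None):
    """Create the message a peer pushes to its neighbourhood."""
    bloom = BloomFilter()
    for info_hash in info_hashes:
        bloom.add(info_hash)
    return SeedingMessage(peer_id, time.time() if now is None else now, bloom)


def gossip_targets(neighbours, forward_probability, rng=random):
    """Choose the neighbours to forward to, each with the given probability."""
    return [n for n in neighbours if rng.random() < forward_probability]


def candidate_peers(messages, info_hash, max_age, now):
    """Return senders of recent messages whose filter matches info_hash."""
    return [
        m.peer_id
        for m in messages
        if now - m.timestamp <= max_age and info_hash in m.seeding
    ]
```

Because a Bloom filter can produce false positives, a peer returned by candidate_peers is only a candidate: it still has to be contacted to confirm that it is an active member of the swarm.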
When a swarm of reasonable size has formed, most peers should have a collection of messages with information about the current members of this swarm. If a peer wants to download the file (and so join the swarm), it can simply connect to the peers in its list, check whether they are still active swarm members and join in.
Get more swarm peers
In the current BitTorrent system, all peers in a swarm periodically reconnect to the tracker to add more swarm members to their peer lists. By doing so, they learn about more peers, which increases the probability of finding a peer that has the pieces of the file they still need.
In the distributed tracker case, when a peer has found some peers using the method of section 'Find initial tracker peers', it will use these peers to enlarge its list of known peers in the swarm. This can be done by asking these peers for their swarm list (all peers in the swarm). Although this sounds very simple, there are different ways to do it.
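One of the simplest ways to do this is sketched below: start from the initial peers, ask each known peer in turn for its swarm list, and merge the answers until enough peers are known. This is only an illustration under stated assumptions. The ask_for_peers callable stands in for a network request that is not specified here, and the max_peers cap and the choice to skip unreachable peers are also assumptions.

```python
def expand_peer_list(initial_peers, ask_for_peers, max_peers=50):
    """Enlarge a peer list by asking known peers for their swarm lists.

    ask_for_peers(peer) is a hypothetical callable that returns the peers
    reported by `peer`, or raises OSError if `peer` cannot be reached.
    """
    known = list(dict.fromkeys(initial_peers))  # de-duplicate, keep order
    seen = set(known)
    index = 0
    while index < len(known) and len(known) < max_peers:
        peer = known[index]
        index += 1
        try:
            reported = ask_for_peers(peer)
        except OSError:
            continue  # unreachable peer: move on to the next one
        for other in reported:
            if other not in seen and len(known) < max_peers:
                seen.add(other)
                known.append(other)
    return known
```

With a toy swarm in which peer "a" knows "b" and "c", and "b" knows "d", calling expand_peer_list(["a"], swarm.__getitem__) on a dictionary of these lists walks outward from "a" until no new peers are reported or the cap is reached.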
Track just seeders?
Would it make sense to gossip only about seeders? A popular file will have fewer seeders than leechers, and the set of seeders will be more stable. In principle, you only need to find one peer, since you can do a random walk from there to find others. BitTorrent is normally supposed to work even when the swarm includes only leechers, but would it make sense to give up on that and instead try to get more seeders? Actually, the distinction between seeders and leechers is blurry to me. In practice, a peer could advertise itself when 1) its policy has been configured to seed files for a sufficiently long period of time after download completion and 2) it has detected that the user leaves it running all the time and not just while downloading a file. --rimey@hiit.fi
DHT and alternatives
Numerous DHT proposals exist to decentralise peer discovery. An advanced proposal uses Skip Graphs to discover content and can be adapted to discover peers. However, security is not fully taken into account and a trivial pollution attack can bring the system down.
Attachments
- MSc_Thesic_final_jroozenburg.pdf (1.0 MB): Master thesis of Jelle Roozenburg titled "Secure Decentralized Swarm Discovery in Tribler", added by jelle on 01/03/07 10:35:46.
