swift to Tribler Integration

The goal of this work was to build a first implementation of the integration of the novel swift protocol into the current Tribler BitTorrent client. Thanks to this new development, Tribler should be able to use the new transport protocol and obtain data from the swift swarms.

To understand the concepts behind this work, the reader should be familiar with the regular BitTorrent operation (especially Piece Picking techniques used in Tribler) and know in detail the concepts behind the swift protocol (such as Binmaps, LEDBAT, Merkle Hash Trees and the messages of the protocol).

This page is intended to be an extended abstract of the whole report. For further and detailed information, please refer to the attached PDF report.

The code of the implemented modules (/tribler), the modifications done in Libswift (/swift) and the documentation (/doc) can be found on http://svn.tribler.org/abc/branches/guillem/.

Problem Description

Several issues need to be faced when building the protocol integration into Tribler. Once the implementation is working and testing several scenarios, some questions could be answered.

Bandwidth usage

Thanks to the use of the new congestion control algorithm (LEDBAT) Tribler should be able to use all the upload bandwidth without disturbing the user. Due to its design, LEDBAT always yields to other connections, so it is not possible to prioritise its transfers in front of other TCP transfers (which controversially would go against its main design principle).

Discover how LEDBAT behaves when sharing the link with legacy BitTorrent TCP transfers. UDP transfers should back off in front of others, but it would be interesting to discover if LEDBAT efficiently uses the remaining bandwidth. In this case, swift would work as a transfer boost, increasing the overall available bandwidth.

Piece translation

Difference in piece sizes between BitTorrent transfers and swift bin system is also an issue to solve when joining both protocols. swift data bins are 1 KByte long, while BitTorrent uses variable size pieces, split in 16 KBytes chunks.

A piece retrieval coordination mechanism must be designed in order to correctly coordinate both protocols, so the same client can obtain and share all parts of the file from and to both swarms at the same time.

Multi-protocol piece retrieval techniques

Considering BitTorrent and swift were independently designed, each protocol has its own features and its operation mode. The idea for this first integration is to keep BitTorrent as the main transfer protocol, so swift needs to adapt to the BitTorrent operation.

Taking into account the constraints of libswift about bin requesting from external tools, several piece retrieval mechanisms could be used when running Tribler and libswift:

  • Opportunistic piece retrieval: request only one piece from the swift swarm at a time, as if the whole swarm was only one BitTorrent peer. This is the simpler approach but probably the obtained performance would be very low.
  • Bulk data retrieval: request several pieces to the swift swarm, so libswift can work on a longer set of bins to acquire. This approach requires higher control over the peding pieces, but would lead to much better performance. The operation of this mode could be definetely limited by the libswift external requesting mechanism.

Design and Implementation

The current Tribler client is a BitTorrent-based P2P client, with the added feature of using web servers as data source. The goal of this work is to add swift as another supported protocol in Tribler, so data could be obtained from and seeded to swift swarms.

Instead of writing a new full implementation of the swift protocol in Python and use it in Tribler, integrating the current libswift implementation seemed to be a much more feasible choice: reuse a solid and already-working software with better performance because of the C++ compiled code.

C++ Python wrapping

The first open issue to face when integrating libswift into the Tribler client was how to join its Python implementation with C++ libswift's implementation. The most suitable procedure here was to wrap the C++ swift implementation and make it accessible for the Python Tribler's code, as if it was another simple Python module.

Building Python bindings directly using the CPython API is a complex and tedious process, since small changes in the implementation could mean rewriting the whole binding. Moreover, debugging errors is not a trivial job in this case. However, several initiatives exist to make this work lighter: they provide tools to generate the needed bindings and modules from a new small piece of code. The considered tools and their little descriptions are listed in one of the appendixes in the report.

SWIG and Boost.Python seemed to be the most convenient options for libswift. Though its promised better C++ interfacing, its focus on providing access to all other Boost libraries dismissed Boost.Python as the chosen tool for this case. SWIG became the natural choice, thanks to its simplicity but its good enough C++ integration and performance.

Data type management in SWIG is quite simple as long as basic types are used. Since swift implementation declared some non-standard data types in the API, an extra wrapping over the original API was needed. This extra layer declares the same methods the original libswift's API, but using standard or well-known C data types, and adapts the information to perfectly fit into the original methods.

The basic data type methods are then exposed to SWIG, using ANSI C/C++ declarations and special SWIG directives to describe the target methods and the desired result module. From this file, SWIG generates a C/C++ file containing all the wrapper code and Python file describing the new module itself. The C/C++ wrapper file must then be compiled with all other needed code files and linked as a shared library for Unix operating systems or a DLL for Windows environment.

SWIG script

%module swift

%{

#include"pswift.h"

%}



%apply int { int8_t }

%apply unsigned int { uint8_t }

%apply int { int16_t }

%apply unsigned int { uint16_t }

%apply int { int32_t }

%apply unsigned int { uint32_t }

%apply long long { int64_t }

%apply unsigned long long { uint64_t }



%include"pswift.h"



%{

#include "swift.h"



PyObject* python_callback = 0;



void api_callback(int transfer, bin64_t bin) {

        PyObject *arglist;

        PyObject *result;

        PyGILState_STATE gstate;



        if (!PyCallable_Check(python_callback)) {

                PyErr_SetString(PyExc_TypeError, "Python callback must be callable");

                return;

        }

        arglist = Py_BuildValue("(il)", transfer, bin.v);



        gstate = PyGILState_Ensure();



        result = PyEval_CallObject(python_callback, arglist);



        Py_XDECREF(arglist);

        Py_XDECREF(result);



        PyGILState_Release(gstate);

}



void AddProgressCallback(int transfer, uint8_t agg, PyObject *pyfunc) {

        python_callback = pyfunc;

        swift::AddProgressCallback(transfer,api_callback, agg);

        Py_INCREF(pyfunc);

}



void RemoveProgressCallback(int transfer,PyObject *pyfunc) {

        python_callback = pyfunc;

        swift::RemoveProgressCallback(transfer,api_callback);

        Py_INCREF(pyfunc);

}

%}



void AddProgressCallback(int transfer,uint8_t agg, PyObject*);

void RemoveProgressCallback (int transfer,PyObject*);

C++ Python Wrapper API

#include<stdint.h>

/*

 * Binds a socket to the specified ip:port

 * @param address as "ip:port" string

 * @return socket descriptor

 */

int Listen(const char* address);



/*

 * Run send/receive loop for the specified amount of time

 * @param running time

 */

void Loop(int64_t time);



/* 

 * Closes the specified socket. If no socket

 * specified, all open sockets are closed

 * @param sockdesc socket descriptor

 */

void Shutdown(int sockdesc=-1);



/*

 * Opens the file with the given name

 * @param filename as "file.extension" string

 * @param hash in hex string

 * @return file descriptor

 */

int Open(const char* filename, const char* hash="");



/*me

 * Calculate the Root Hash of the specified file 

 * (identified by its file descriptor)

 * @param fd file descriptor

 */

const char* RootMerkleHash(int fd);



/*

 * Closes the file identified by its file

 * descriptor

 * @param fd file descriptor

 */

void Close(int fd);



/*

 * Add a possible peer which participates in a given

 * transmission

 * @param filename as "file.extension" string

 * @param hash in hex string

 */

void AddPeer(const char* address, const char* hash="");



/*

 * Sets the contact information for a tracker peer

 * @param address as "ip:port" string

 */

void SetTracker(const char* address);



/*

 * Size of the file in bytes, 0 if unknown

 * @param fd file descriptor

 * @return bytes

 */

uint64_t Size(int fd);



/*

 * Amount of retrieved and verified data

 * @return bytes

 */

uint64_t Complete(int fd);



/*

 * Returns if a file transfer is completed or not

 * @param fd file descriptor

 * @return boolean

 */

bool IsComplete(int fd);



/*

 * Checks the number of bytes that are complete sequentially,

 * starting from the beginning of the file until the first 

 * not-yet-retrieved packet

 * @param fd file descriptor identifier

 * @return bytes

 */

uint64_t SeqComplete(int fd);



/*

 * Initialize library parameters

 * @param logging activates the debug logging

 */

void Init();



/*

 * Notify about externally acquired data ranges

 * @param transfer file descriptor identifier

 * @param layer of the binmap tree

 * @param offset in the layer of the tree

 */

void ExternallyRetrieved (int transfer, uint8_t layer, uint64_t offset);



/*

 * Limit the scope of the piece picker

 * @param transfer file descriptor identifier

 * @param layer of the binmap tree

 * @param offset in the layer of the tree

 */

void LimitPiecePickerRange(int transfer, uint8_t layer, uint64_t offset);



/*

 * Check if a certain bin is filled or not

 * @param transfer file descriptor identifier

 * @param layer of the binmap tree

 * @param offset in the layer of the tree

 * @return boolean

 */

bool isBinFilled(int transfer, uint8_t layer, uint64_t offset);



/*

 * Check if a certain bin is within another one

 * @param received higher level binmap

 * @param target_layer of the binmap to check if its within

 * @param target_offset in the layer of the tree of the binmap to check if its within

 * @return boolean

 */

bool isBinWithin(uint64_t received, uint8_t target_layer, uint64_t target_offset);

Libswift integration into Tribler

In order to able Tribler to use the new multiparty transport protocol, some integration with the present BitTorrent Download Manager module was needed. The complete solution here would be to design and implement a new swift download manager which was aware of all swift internal matters and worked in coordination with the current BitTorrent Download Manager. However, the immaturity of libswift implementation and specially the idea of abstracting the network as a single data cloud (hiding peer management, piece retrieval and transfer management to the application layer) discouraged this choice.

As a first approach of swift integration in Tribler, the HTTP Downloader could be taken as a model. In this case, a new swift Downloader must be designed, making the swift swarm to be considered a single BitTorrent peer acting as a leecher for the Tribler PiecePicker. The novel module should interact with the regular BitTorrent's piece picker operation and translate the BitTorrent piece requests to swift operation.

High-level module schema, presenting the intended interaction between all of the involved one. Red color indicates the new components designed to integrate swift as a transport protocol for Tribler.

swift metadata in .torrent files

Add the root hash and the default tracker for a swarm into the .torrent file: add a swift dictionary, containing the root hash and a list of trackers and known peers addresses.

The swift Downloader module

The swift Downloader module would use the libswift library through the Python wrapping module described earlier. However, in this case libswift should not acquire all data bins linearly but only acquire the bins corresponding to the BitTorrent pieces ordered by the Tribler PiecePicker.

Piece management

First, two considerations:

  • BitTorrent piece length will be always regarded as a power of 2, so they can be easily converted to swift bins, as one of the pieces corresponds to a branch of the tree identified by a bin from layer N.
  • Only .torrents describing a single file would be managed. As a first implementation, this would keep things simpler and would make easier all the piece management.

BitTorrent pieces to swift bins. Assuming 4 KBytes BitTorrent pieces, each layer 2 bin would stand for a whole BitTorrent piece: in the diagram, bin 3 would stand for BitTorrent piece 0 and bin 11 for piece 1.

The novel module should then determine which bins should be acquired from the swift swarm to fulfill the PiecePicker requests. The easiest procedure here seems to determine which layer of the tree stands for full BitTorrent pieces and then use the piece index as the offset using the layer-and-offset bin notation.

Piece storage

All the obtained pieces should be written to and read from the file described in the .torrent and stored on the disk. While Tribler uses an StorageWrapper to perform all disk-related tasks, libswift writes and read all the obtained bins directly to or from the file.

When using the Python wrapper, the library runs inside the same process as the Python interpreter. Thus, libswift and Tribler can access to the same file as long as they work in separate areas of the file.

To make the StorageWrapper notice about the parts of the file already written by libswift, a new method was written. This method only sets up the swift written parts of the file to be real data, hash check and other BitTorrent specific operations are omitted.

Piece notification

The module should coordinate the operation of both protocols by notifying each other about the progresses done on the other swarm.

It notifies the Tribler PiecePicker which pieces are available to be obtained from the swift swarm. With this information, the PiecePicker can better determine which pieces to retrieve first. This information is maintained by libswift in a binary tree for each remote peer contacted, so some method was needed to obtain it in a way the PiecePicker understood it.

Binmap combination and collapse. Assuming binary trees representing the status of remote peers to be the ones shown in (a), (b) and (c), the result of applying an OR operation to all trees would be the one in (d). If BitTorrent pieces were 4 KBytes, reading the tree to layer 3 would result in the bit string 1001.

The module also notifies libswift about which pieces were already acquired by external means. Therefore, libswift should notify remote peers the corresponding bins were already acquired and can transfer them. Again, the layer-and-offset notation can be used.

A little modification was done on the Tribler StorageWrapper module to execute this action every time a piece is acquired. In both cases, hash checking and data corruption should not be an issue because each piece of software perform independent data integrity checks (even though different techniques are used).

Threading integration

Some constraints limited the operation of the swift integration module:

  • Python interpreter single thread execution (the GIL limits the execution)
  • Tribler's Network Thread running (almost) all the time to manage BitTorrent operation
  • No Python pseudo-multi-threading operation when executing C++ code
  • New thread required to run libswift (calling swift::Loop constantly)

To assure all libswift related tasks were run in the Libswift Thread, a task scheduler mechanism was implemented. All other threads can add a task to execute by this thread and they are stored on a queue. The queued tasks are executed every time the thread is run. Together with a similar feature of the Network Thread, this becomes a communication mechanism between both threads.

Piece retrieval techniques

Tribler includes several piece picker algorithms for BitTorrent, intended to be used in several cases, which work basically following the rarest-first principle, introducing some weight to certain pieces when running on delay or order sensitive cases.

The first approach was to make the swift Downloader work as a BitTorrent Downloader, requesting the PiecePicker for a single piece index and setting the corresponding bin range in the libswift limiter as the only active one. However, the constraints of the time-based libswift execution created an issue: if the limited range corresponing to a full BitTorrent piece is small and can be acquired in little time, neither libswift nor BitTorrent will download more data until the Network Thread runs again and so the BitTorrent dowloaders.

Experiments and Results

The goal of these tests was to evaluate the operation of the new solution and the performance obtained when combining swift and BitTorrent under different conditions.

Testing tools and procedures

Description of the tools and procedures used to run the experiments.

Simple BitTorrent tracker and seeder

Starting a BitTorrent tracker and seeder with Tribler is quite simple using the already developed tool dirtrackerseeder.

This script reads a directory looking for .torrent files, which then are offered in a BitTorrent tracker and seeded as a regular BitTorrent seeder. Data files must be also present in the same directory.

Logging features

The measured variable in the experiments will be the instant download rate reached by each protocol. The easiest way to measure this is to save a timestamp and the number of incoming bytes each time a full BitTorrent is acquired.

Pyhton logging tools were used to add to the swift Downloader module the feature of saving a file with the timestamp and the piece size for both protocols. From that information, it is easy to calculate the instant bandwidth and the accumulate mean bandwidth, the information shown in the following experiment results.

Bandwidth estimation

The bandwidth used by each protocol can be easily estimated from the logged timestamps and the BitTorrent piece size in bytes. The instant bandwidth is obtained dividing the amount of obtained data since the last timestamp by the difference between the current and the previous timestamp. The mean is obtained by dividing all the obtained data by the time since the beggining of the module operation.

Operation assessment

Assess the correct operation of the implemented integration, making sure the piece retrieval and the protocol coordination worked as expected.

The scenario used to run this test was composed by one server acting as a tracker and seeder for BitTorrent and another server acting as a tracker/seeder for swift. The expected result was BitTorrent downloading data at a fixed rate and libswift obtaining also some data.

Assessment results. As expected, BitTorrent only downloaded at the limited rate of 8 Mbps, but swift bandwidth (around 2.5 Mbps) indicates some pieces were obtained from the swift swarm.

With the obtained results and after checking the integrity of the obtained file, it can be stated than the implemented integration of libswift into the Tribler code works in terms of piece retrieval and acquiring coordination.

Performance measurement

After making sure the protocol coordination worked fine, the performance of the implementation needed to be assessed. To do so, the same scenario described earlier was used.

The procedure here was to test the solution under different rate conditions. Therefore, the same test was run in 3 more situations: running Tribler with the BitTorrent rate unlimited; the rate limited to 40 Mbps and to 2 Mbps.

Bandwidth evolution when BitTorrent not limited. As shown, BitTorrent works close the the link limit and swift uses only around 2.5 Mbps.

Bandwidth evolution when BitTorrent limited to 40 Mbps. BitTorrent works only at around 40 Mbps but swift keeps using only around 2.5 Mbps.

Bandwidth evolution when BitTorrent limited to 2 Mbps. Again, while BitTorrent uses only 2 Mbps (so the link is almost empty), swift again only uses around 2.5 Mbps.

After running all the experiments and considering also the results obtained in the operation assessment, it can be clearly stated that the maximum download rate obtained from the swift swarm when using the built software was around 2.5 Mbps, independently of the traffic present in the network link.

Although the obtained results were not promising, this work pretended to be a first approach to protocol integration. The reasons most likely to be behind of this low performance are the time based operation used when integrating the newly created thread with the Tribler Network Thread and the single-piece requesting mechanism used to interact with libswift.