MSc thesis assignments in Tribler

Please contact Tribler coordinator Dr. J.A. Pouwelse on the 7th floor of EWI or via email.

Disclaimer: this is just a list to show what we are about. Please drop by for a brainstorm session to identify an assignment that you really like.

We are a group conducting experimental P2P research. At some point, every member of the Tribler team has to show that their algorithms work, how well they perform, and how they can be improved. Every possible task expands the existing work or ongoing research with operational Python code.

New topics for 2014: anonymity and cybercurrency

Tribler now includes experimental code for anonymous streaming. The TOR specification is partially implemented and we are working towards an anonymous streaming test. There is room for several master's students to expand upon this work; see the existing Python code.

  • Create a Youtube clone which is anonymous and uses P2P. Create a hidden service for Bittorrent-like seeding and streaming which offers a Youtube-like interface and speed. The specific focus is on reducing startup latency: all videos should start playing within 1-2 seconds. This extreme level of performance can be obtained with aggressive pre-fetching and opportunistic caching. For example, after a user issues a keyword search, all matching swarms are displayed and the P2P client immediately starts contacting these swarms in parallel. When the P2P client is idle, it actively seeks out popular content that matches the user's interests and downloads roughly the first minute of that content, further accelerating the user experience. Users can optionally set their acceleration cache to 10 to 250 GByte.
  • Integrate the elliptic curve Ed25519 into our anonymity protocol. This curve is not yet used in TOR, but the TOR developers also want to adopt it to move beyond their old 1024-bit RSA security. So within your thesis assignment you can overtake TOR in this respect.

Five years ago the Tribler team was already working on cybercurrency together with Harvard. Then Bitcoin came along, and our early analysis of it shows its downsides.

  • Create a credit mining application. Tribler now includes code for credit mining. Your task is to create an easy-to-use, zero-configuration application which earns credits as quickly as possible. Your code scans all available Bittorrent swarms and evaluates their investment potential. The default is to use all available hard-disk space minus, say, 5 GB of free space.
  • Design and implement micropayments. You can earn credits by running seedboxes and transfer earned credits to friends and other people. A wallet and money transfer app needs to be created. Throw-away keys and anonymous credit mixing services will be offered.
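
The credit mining task above (scan swarms, evaluate their investment potential, stay within the disk budget) could be sketched roughly as follows. This is only an illustration under stated assumptions: the Swarm fields, the scoring heuristic (leechers per seeder per GB) and the greedy selection are hypothetical and are not taken from the existing Tribler credit mining code.

```python
from dataclasses import dataclass


@dataclass
class Swarm:
    """Hypothetical summary of one Bittorrent swarm (not a Tribler class)."""
    name: str
    size_gb: float
    seeders: int
    leechers: int


def investment_score(swarm):
    # Assumed heuristic: high demand (leechers), low supply (seeders)
    # and a small disk footprint make a swarm a better investment.
    return swarm.leechers / ((swarm.seeders + 1) * swarm.size_gb)


def select_swarms(swarms, free_gb, reserve_gb=5.0):
    """Greedily pick swarms until free disk space minus the reserve is used."""
    budget = max(free_gb - reserve_gb, 0.0)
    candidates = [s for s in swarms if s.size_gb > 0]
    chosen = []
    for swarm in sorted(candidates, key=investment_score, reverse=True):
        if swarm.size_gb <= budget:
            chosen.append(swarm)
            budget -= swarm.size_gb
    return chosen


if __name__ == "__main__":
    demo = [
        Swarm("a", 2.0, 1, 10),
        Swarm("b", 4.0, 100, 5),
        Swarm("c", 3.0, 0, 6),
    ]
    for picked in select_swarms(demo, free_gb=10.0):
        print(picked.name, round(investment_score(picked), 3))
```

A real miner would need measured upload rates rather than a static score; the greedy loop merely shows how the 5 GB reserve from the text can bound the selection.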

Security and performance are critical for anonymity. We have the following ideas to work on.

  • Create a secure master swarm discovery channel community. Channels in Tribler are used to discover a collection of Bittorrent swarms, to prevent fake content and reduce spam. Boost the discovery of swarm investment opportunities and metadata channels.
  • Improve the channel scalability and robustness. Tens of thousands of swarms should be downloaded in mere seconds.
  • Create a DHT discovery community to discover and join a swarm within 1-2 seconds. An essential feature is to make this NAT-compatible by using the neighbor-invite primitive.

Possible topics

  • SwarmMusic: P2P music app on the iPhone+iPad+Android. Our Libswift UDP-based P2P engine is now running on embedded devices. You are responsible for creating an easy-to-use, portable and free music app. The goal is to get your app listed in the Apple store and the Google market. Key functionality is sharing your music collection using libswift and downloading a few new songs automagically every hour/minute. This song cache can be either just a few minutes in size or contain enough music for a day of playback. The existing SimilarityFunction algorithms can be used to offer a type of personalised radio.
  • For computer engineering students only: Reconfigurable processor for Internet acceleration. Within this project you will use ρ-VEX to improve Internet performance. ρ-VEX is an open source reconfigurable and extensible Very-Long Instruction Word (VLIW) processor. From a purely engineering point of view, your assignment is to take C++ code, compile it for FPGA usage and optimize the code. Internet performance can be improved with a reconfigurable and extensible processor by turning it into a "caching router". Seedboxes and CDN content hosting boxes are essential for providing a good Youtube or Bittorrent experience. Using ISP-level caches has proven to improve the cost-efficiency of the Internet in general. Content-centric networking is emerging as a field which can make all (multimedia) Internet traffic suitable for caching (see the background paper). Delft has developed an upcoming IETF Internet standard around P2P video streaming which is fully based on the content-centric networking paradigm. This Open Source P2P engine has been tested and is running in the cloud. Your challenge is to try to improve the cost-efficiency of a general purpose implementation using a mix of FPGA or softcore based hardware. What makes this project unique is that in parallel to this task we are creating a cybercurrency, project "bandwidth as a currency". The billion dollar CDN market revolves around bandwidth and fast downloads. Our medium-term goal is to transform the closed CDN market with a cybercurrency and our open IETF standard. Within this context your ρ-VEX processor can be compared to FPGA-based Bitcoin mining hardware.
  • TV of the future using a mediaplayer. Our Libswift UDP-based P2P engine is now running on embedded devices. Create the television of the future by porting Libswift to the NetworkMediaTank Media Player line of boxes. Key for an embedded box are the resource constraints and quick responsiveness (think DirectFB). By building upon existing work you need to move P2P from searching in filenames towards navigating tag clouds and thumbnail previews, using a remote control interface. Several hardware boxes will be available for software development. It is shown here how easy software development is.
  • New programming language for P2P. Design and implement language extensions for Python which enable the programmer to use P2P primitives more easily. For instance, reputations, signed certificates, peer discovery, metadata, content downloads and message exchange should be investigated. Where possible and desirable, support is added in the language, similar to network sockets and file access. The goal is cleaner code which is less prone to bugs.
  • Multimedia communities in Tribler. Private Bittorrent communities are very popular due to their superior download performance and lack of spam. These communities are kept clean by moderators, but still rely on a single central server. Private Bittorrent communities lack the ease-of-use and do not offer the rich interactivity of multimedia communities in Facebook and Youtube. Your job is to move private Bittorrent communities beyond mere sharing of files. Be the first to implement certain technology in P2P such as collaborative tagging, retro-active subtitling, thumbnail indexes, friend sharing, friend notification and message boards.
  • Improve P2P search results by removing duplicate swarms and improving precision/recall. Improve P2P search algorithms to find more relevant content, improve the search result ranking and filter duplicates. A problem in Bittorrent is that numerous swarms exist with the same video or audio content. Such duplicates degrade system performance because they fragment the audience for one item across several smaller swarms. For some popular items, such as Ubuntu, 100+ duplicate swarms exist. Your mission is to solve this problem by identifying duplicates using filenames and file sizes.
  • Real-time cell phone democracy: use live video streaming from cell phones to obtain insight into what is happening. Several volunteers continuously record an event with their cell phone camera and stream it using P2P to potentially millions of viewers. Cell phone democracy can give real-time, on-the-ground insight during protests, disasters or other matters. Every viewer would be able to see several live video feeds of an event in a 2x2 or 4x4 matrix.
  • Create a real-time text/video/audio editing and sharing tool scalable to millions of users by using P2P technology to create a blend of SubEthaEdit, Google Wave, and Twitter (hardcore researchers only).
  • Software engineering excellence and stability: system testing, test code coverage, structural bug hunting. Ensure our P2P multimedia engine is stable for years and suitable for standardisation in HTML5 and DVB and Television inclusion. Move beyond unit testing. Cooperation of P2P team with software engineering department, Dr. Martin Pinzger.
  • Extend HTML5 with deep integration of our Tribler P2P engine in Firefox. The new HTML5 video element supported in Firefox would then be able to download using P2P by expanding our BrowserPlugin. This means a web page can contain both server-based pictures and video and swarm-based multimedia content.
  • The task is to build a proof-of-principle blogging tool which merges the best of Bittorrent with the web, called BittBlogg. Key is that BittBlogg uses a zero-server P2P architecture to download HTML and publish content. BittBlogg can read both existing blogs on central web servers and Bittorrent-based blogs using simple screen scraping techniques and P2P downloads. BittBlogg consists of only three screens: the main keyword search page (like the Google frontpage), the search results, and the actual blog posts. A first prototype would use the simple WxHTML function to display the blog posts. A more advanced prototype would use the Webkit lib to render HTML, load pictures, and execute JavaScript.
  • Create a stunning real-time visualisation of Bittorrent swarm dynamics, such as the Digg example and other examples.
  • P2P Widgets: The Internet is your Operating System and provides you the code to execute
  • Live playlists which download Bittorrent content on-demand
  • Craft a P2P Radio by generalizing the Video-on-Demand features and expanding with Last.fm functionality
  • Scientific document search and publications, e.g. use Tribler to discover .pdf files and support keyword search with a simple interface such as http://scholar.google.com/, http://citeseer.ist.psu.edu, and the richer http://www.citeUlike.org.
  • Privacy-by-Design: Make Tribler blatantly unstoppable. Run Tribler from a USB stick with crypto and modify Tribler to transfer Bittorrent messages only inside HTTP GET image.jpg requests plus steganography.
  • Numerous attempts have been made to use P2P to enhance web browsing. The task is to generalize our running code for acceleration of Web 2.0 multimedia content towards any URL. This is done by linking browser caches with P2P and clustering various URLs into a single Bittorrent swarm. For this the Tribler software must be linked inside a browser plug-in.
  • Chat integration using Jabber and XMPP to strengthen the social network features. The challenge is to craft a zero-server implementation by making the central chat server redundant. Enable the Tribler client to talk to other clients that support chat. Existing XMPP Python code: here1, here2, and here3.
  • Effective and scalable real-time recommendation. The overlay network of Tribler is founded on the concept of user similarity. The currently implemented function is too computationally expensive and does not perform well because of sparse data. Can you develop a fast, scalable and functional method to derive the similarity between Tribler peers and the available content? Similarity Project
  • Expanding the P2PSimulator with ideas such as empirical datasets, novel algorithms, tagging, etc.
  • P2P GeoWidget: create a generic module for P2P map reading by P2P caching of Google Maps. Uses P2PWidgets as the application framework. Useful for offline map reading on mobile nomadic devices (iPhone, G1, ...) and as a building block for other widgets using maps.
  • Automatic Debugging of Tribler software
    Software pervades every aspect of our society. In sharp contrast to the potential blessings of this development stands the looming software complexity crisis. At present, complex systems already require millions of lines of code. It is becoming apparent that under the current traditional IT paradigms this huge complexity makes it impossible to deliver software that provides system capability, correctness, and availability at any reasonable cost. Moreover, there will not be enough skilled IT professionals to design, test, debug, install, configure, operate, diagnose, and maintain such complex systems, assuming such tasks will stay within human ability to begin with. A good example is the debugging of complex software, where a huge amount of resources is spent to uncover defects. A novel approach to software debugging is to use software fault localization (SFL) tools that aid the developer in finding the root causes of failures. Although SFL is still in its infancy, recent results have shown its great promise as a technique to automatically track down software bugs, providing the developer with a list of the software locations that are suspect. The aim of the MSc project is to assess the value of SFL for the Tribler software stack. Tasks involve automatically instrumenting the software stack, running the stack with known faults to obtain profile data, analyzing the profile data using SFL algorithms to see if the faults can be localized, and analyzing the reasons behind the measured debugging performance.
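
To make the SFL analysis step more concrete, here is a minimal sketch of spectrum-based fault localization. The text does not say which SFL algorithm would be used; the Ochiai similarity coefficient is assumed here as one well-known choice, and the function name and input format (one set of executed code locations per run, plus a pass/fail flag) are hypothetical.

```python
import math


def ochiai_ranking(spectra, failed):
    """Rank code locations by suspiciousness using the Ochiai coefficient.

    spectra: list of sets, the code locations executed in each run.
    failed:  list of bools, True if the corresponding run failed.
    Returns (location, score) pairs, most suspicious first.
    """
    total_failed = sum(failed)
    locations = set().union(*spectra)
    scores = {}
    for loc in locations:
        # Runs that executed this location, split by outcome.
        n_ef = sum(1 for cov, f in zip(spectra, failed) if loc in cov and f)
        n_ep = sum(1 for cov, f in zip(spectra, failed) if loc in cov and not f)
        denom = math.sqrt(total_failed * (n_ef + n_ep))
        scores[loc] = n_ef / denom if denom else 0.0
    # Sort by descending score; ties are broken by location name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    runs = [{"a", "b"}, {"a", "c"}, {"b", "c"}]
    outcomes = [True, False, True]
    for location, score in ochiai_ranking(runs, outcomes):
        print(location, round(score, 2))
```

In this toy input, location "b" is executed only in failing runs and therefore ends up at the top of the list, which is the kind of ranked suspect list the project would evaluate on the Tribler stack.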

Planning and additional information

More information: TUDelft info page

As Tribler is an experimental research group, an MSc thesis project usually follows this pattern: detailing a new idea or problem description, detailed design of a novel solution/algorithm, implementation, and performance evaluation. Generic planning for a 9-month timeline:

  Month 1  One-page problem description; read all relevant literature (separate literature study?)

           This text will grow and be included in the final thesis

  Month 2  Development of possible solutions

  Month 3  Details of possible solution or perhaps proof-of-principle prototype

  Month 4  Several pages with final architecture and design

           This text should be in a suitable format as a thesis chapter

  Month 5  Development of a prototype and determine which performance graphs can be generated

  Month 6  Development of a prototype and initial writing of implementation thesis chapter

  Month 7  Performance measurements

  Month 8  Expand thesis + detailed measurements

  Month 9  Final thesis writing



Literature research project planning

Possible research task planning

Wk_0 Kick-off meeting, graduation orientation:
 Student hands in a grade list of completed courses;
 Lecturer proposes 3 different assignments.

Wk_1 Focus meeting:
 Student gives a motivated choice for 1 assignment and starts the literature study;
 Lecturer provides more subject-specific information.

Wk_3 Topic scoping:
 Student presents a list of the literature found;
 Lecturer determines whether the topic is too broad or too narrow.

Wk_5 Draft literature list:
 Student has found and read an extensive list of literature;
 Lecturer gives feedback on taxonomy and priorities.

Wk_7 Draft literature classification:
 Student has classified all the literature and knows what the unsolved problems are;
 Lecturer gives feedback and a mapping to chapters.

Wk_9 Draft of research report:
 Student presents a complete literature report and "research challenges" section;
 Lecturer gives feedback and points for improvement.

Wk_10 Hand in research report:
 Student hands in the report and indicates which of the problems found is interesting for the thesis;
 Lecturer evaluates the past 10 weeks.
