One fresh EZTV swarm with a total peer count of around 4000 was crawled in a straightforward manner: the bot bootstrapped with some peers, established BitTorrent connections and waited for PEX messages. When a PEX message arrives, the bot attempts to connect to every peer mentioned in it. The process is recursive. In every experiment, the total number of known peers was close to the tracker's estimate of the swarm size.
The main goal of the measurement was to determine the rate of obsolescence of PEXed data. The rate was high.
Three crawls were made in 2008: on 14 Oct at 16:16 and 17:16, and on 15 Oct at 10:16.
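The recursive crawl described above can be sketched roughly as a breadth-first traversal. This is a minimal illustration, not the bot's actual code: `crawl`, `fetch_pex`, and the toy swarm are hypothetical names, and `fetch_pex` stands in for "connect to the peer, do the BitTorrent handshake and collect whatever peers it advertises via PEX".

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set, Tuple

Peer = Tuple[str, int]  # (ip, port)


def crawl(bootstrap: Iterable[Peer],
          fetch_pex: Callable[[Peer], Iterable[Peer]]) -> Tuple[Set[Peer], Set[Peer]]:
    """Return (known, connected) peer sets after a recursive PEX crawl.

    fetch_pex is a hypothetical callable: it should connect to a peer and
    return the peers it advertises, or raise ConnectionError on failure.
    """
    known: Set[Peer] = set(bootstrap)
    connected: Set[Peer] = set()
    queue = deque(known)
    while queue:
        peer = queue.popleft()
        try:
            advertised = fetch_pex(peer)
        except ConnectionError:
            continue  # stale or unconnectable entry; it stays in `known` only
        connected.add(peer)
        for candidate in advertised:
            if candidate not in known:
                known.add(candidate)
                queue.append(candidate)
    return known, connected


# Toy in-memory swarm (made-up addresses) used instead of real network I/O.
TOY_SWARM: Dict[Peer, List[Peer]] = {
    ("10.0.0.1", 6881): [("10.0.0.2", 6881), ("10.0.0.3", 6881)],
    ("10.0.0.2", 6881): [("10.0.0.1", 6881), ("10.0.0.4", 6881)],
    ("10.0.0.3", 6881): [],
    # 10.0.0.4 is advertised by a peer but is not reachable.
}


def toy_fetch_pex(peer: Peer) -> List[Peer]:
    if peer not in TOY_SWARM:
        raise ConnectionError(f"cannot connect to {peer}")
    return TOY_SWARM[peer]


known, connected = crawl([("10.0.0.1", 6881)], toy_fetch_pex)
```

Keeping `known` and `connected` as separate sets makes the gap between advertised peers and actually reachable peers directly measurable.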
On the rate of obsolescence.
So, old PEX data is mostly garbage. Even with the freshest data, we can shortcut a triangle (i.e. connect to a peer who is connected to somebody we are connected to) in about 15% of the cases. The lists of known peers of the 1st and 2nd runs intersect by about a half. Between the 1st and 3rd runs, 334 known peers are common; between the 1st and 4th runs, 168 known peers are common. So, the peer rotation rate is high. Everything is very fluid and volatile here.
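The run-to-run comparison above boils down to set intersection over the lists of known peers. A minimal sketch, assuming each run is stored as a collection of peer identifiers; the function name `overlap` and the toy data are illustrative and are not the measured lists:

```python
def overlap(run_a, run_b):
    """Return (number of common peers, fraction of run_a still seen in run_b)."""
    a, b = set(run_a), set(run_b)
    common = a & b
    fraction = len(common) / len(a) if a else 0.0
    return len(common), fraction


# Toy data only: four peers in the earlier run, two of them seen again later.
earlier_run = {"peer-a", "peer-b", "peer-c", "peer-d"}
later_run = {"peer-c", "peer-d", "peer-e"}
common_count, surviving_fraction = overlap(earlier_run, later_run)
```

Normalising by the size of the earlier run gives the share of peers that survived until the later crawl, which is the quantity the rotation argument relies on.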
Obvious reasons for peer unconnectability:
UPDATE 16 Oct: On the next day, 16 Oct, the connection success rate was 25% using the same version of the software. It really depends on the wind! (Swarm lifecycle stage?)
Private tracker experiments
Is it possible?
10% connected => we know the arrivals and departures of virtually any peer in the swarm, but not the connection topology. Full tomography => connect to all non-NATed peers and listen for NATed ones => we will know the connection topology to a large degree.
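One way the topology could be pieced together from observations is to keep, for each peer we are connected to, the set of neighbours it has reported, and update that set as "added" and "dropped" lists arrive. The sketch below is an assumption about how such bookkeeping might look: `apply_pex` and the peer labels are hypothetical, and the message shape is only loosely modelled on the added/dropped lists carried by PEX.

```python
from collections import defaultdict


def apply_pex(topology, reporter, added=(), dropped=()):
    """Update the reporter's neighbour set from one PEX-style delta."""
    neighbours = topology[reporter]
    neighbours.update(added)
    neighbours.difference_update(dropped)


# Map from reporting peer to the neighbours it currently advertises.
topology = defaultdict(set)

# Toy observations from a single reporter "A".
apply_pex(topology, "A", added=["B", "C"])
apply_pex(topology, "A", added=["D"], dropped=["B"])

# Undirected edges inferred so far; each edge is a frozenset of two peers.
edges = {frozenset((reporter, n)) for reporter, ns in topology.items() for n in ns}
```

Only peers we are connected to contribute edges, so coverage of the topology grows with the share of the swarm we can reach.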