Discussion:
Tracker scrape extension proposal
Nick Johnson
2005-01-12 05:11:03 UTC
Permalink
A project I'm working on at the moment requires fetching scrape data
from trackers. Often, I have multiple torrents on the same tracker that
I want scrape data for, so fetching scrape data for each file
individually is quite inefficient. On the other hand, the tracker's
full scrape might be huge (multiple megabytes!), depending on how many
torrents I don't care about are on the tracker.

With this in mind, I have a couple of proposals for how to improve this:
-Add a GET parameter 'hashes': a string whose length is a multiple of
20 bytes, each 20-byte group being one info_hash you want the tracker
to return. If the tracker recognises the extension, it returns only
the requested keys; otherwise it returns all keys, since ignoring
unknown GET parameters should be the default behaviour of existing
trackers anyway (hopefully!)
Due to limitations on GET string length, this is probably limited to
about 100 torrents before problems start to crop up. URLs like this are
quite likely to result in messes in log files, too. :/
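As a rough sketch of this first proposal (the 'hashes' parameter name comes from the text above; the tracker URL is a placeholder), the client would concatenate the raw 20-byte hashes and percent-encode the result. The length arithmetic also shows where the ~100-torrent limit comes from:

```python
import urllib.parse

def build_hashes_url(scrape_base: str, info_hashes: list[bytes]) -> str:
    # Concatenate the raw 20-byte info_hashes and percent-encode the blob.
    assert all(len(h) == 20 for h in info_hashes)
    blob = b"".join(info_hashes)
    return scrape_base + "?hashes=" + urllib.parse.quote_from_bytes(blob)

# Worst case every byte encodes as %XX: 60 characters per hash, so 100
# hashes already make a ~6 KB query string, longer than many servers
# and log pipelines are happy with.
url = build_hashes_url("http://tracker.example.com/scrape",
                       [b"\xaa" * 20] * 100)
print(len(url))
```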

-Somewhat more complicated: The client supplies a Bloom Filter
(http://en.wikipedia.org/wiki/Bloom_filter) to the tracker (using a
defined set of hash functions), with the appropriate entry in the filter
set for each torrent it wants. The nature of the Bloom Filter guarantees
all the requested entries will be returned, but extra entries may be
returned as well. This way, an unlimited number of hashes can be
requested with a static request size, just with an increasing
false-positive rate.
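For illustration, here is a toy version of the second proposal (the filter size and per-index hashing are arbitrary choices, not a defined standard; a real extension would have to fix both so client and tracker compute the same bit positions):

```python
import hashlib

class ScrapeBloom:
    """Tiny Bloom filter over 20-byte info_hashes (illustrative only)."""
    def __init__(self, m_bits: int = 4096, k: int = 4):
        self.m, self.k, self.bits = m_bits, k, bytearray(m_bits // 8)

    def _positions(self, info_hash: bytes):
        # Derive k bit positions by hashing the info_hash with a counter.
        for i in range(self.k):
            d = hashlib.sha1(bytes([i]) + info_hash).digest()
            yield int.from_bytes(d[:4], "big") % self.m

    def add(self, info_hash: bytes):
        for p in self._positions(info_hash):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, info_hash: bytes) -> bool:
        # No false negatives: every added hash always tests true.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(info_hash))

# Client side: set the bits for each wanted torrent and send self.bits
# (512 bytes, fixed size) to the tracker; tracker side: return every
# torrent whose info_hash tests positive, extras included.
f = ScrapeBloom()
wanted = [hashlib.sha1(str(n).encode()).digest() for n in range(100)]
for h in wanted:
    f.add(h)
print(all(f.might_contain(h) for h in wanted))
```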

Other suggestions are welcome. It just seems to me that, at the moment,
fetching scrape data for 100 torrents from a tracker that's tracking
1000 is rather inefficient either way.

-Nick Johnson



Yahoo! Groups Links

<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/BitTorrent/

<*> To unsubscribe from this group, send an email to:
BitTorrent-***@yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/
Harold Feit - Depthstrike.com Administrator
2005-01-12 05:39:07 UTC
Permalink

The current scrape system allows multiple info_hash= values to be
passed to a tracker. It is up to individual tracker package
developers to support this.

Your proposal does allow for more torrents to be scraped in a given
pass, but it's probably not 100% necessary.
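The multi-value convention described here just repeats the info_hash key in the query string. A minimal sketch (the scrape URL is a placeholder; whether a tracker honours more than the first or last value is implementation-dependent):

```python
import urllib.parse

def multiscrape_url(scrape_base: str, info_hashes: list[bytes]) -> str:
    # One info_hash= pair per torrent, each raw 20-byte hash percent-encoded.
    qs = "&".join("info_hash=" + urllib.parse.quote_from_bytes(h)
                  for h in info_hashes)
    return scrape_base + "?" + qs

url = multiscrape_url("http://tracker.example.com/scrape",
                      [b"\x12" * 20, b"\x34" * 20])
print(url.count("info_hash="))
```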







Nick Johnson
2005-01-12 21:48:46 UTC
Permalink
Is this documented? I didn't see it mentioned on the unofficial wiki
that more than one value could be provided.
The fallback behaviour here seems wrong, too. Many systems, such as PHP,
don't support multiple GET values with the same key (in PHP's case,
unless the key name ends with []), so many trackers probably won't
support it either. A tracker that doesn't support it will return only
one key, which is entirely the wrong behaviour if you're trying to
retrieve information on 100 different torrents.
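Nick's PHP point generalises: any framework that flattens the query string into a one-value-per-key map silently drops the extra hashes. A quick illustration with Python's stdlib parser (which itself keeps all values):

```python
from urllib.parse import parse_qs

query = "info_hash=AAA&info_hash=BBB&info_hash=CCC"

# parse_qs preserves every repeated value...
print(parse_qs(query)["info_hash"])

# ...but a framework that flattens to one value per key (PHP's $_GET
# without a [] suffix, many simple CGI helpers) keeps only one of them:
flattened = {k: v[-1] for k, v in parse_qs(query).items()}
print(flattened["info_hash"])
```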

-Nick Johnson
Post by Harold Feit - Depthstrike.com Administrator
The current scrape system allows multiple info_hash= values to be
passed to a tracker. It is up to individual tracker package
developers to support this.
Your proposal does allow for more torrents to be scraped in a given
pass, but it's probably not 100% necessary.
Harold Feit - Depthstrike.com Administrator
2005-01-13 00:07:28 UTC
Permalink

I'm not sure how documented it is, but there are trackers that DO
support multiple info_hash values being passed to them in a single
scrape.

Trackers that do not support multiple info_hash values in a scrape
call return stats for either the first or the last info_hash value
passed to them: typically the last (as with PHP), but sometimes the
first, depending on how the tracker is programmed.

Quite simply, if a tracker's native platform supports handling
multiple info_hash values, all of the info_hash values passed in the
query string are looked up in the tracker's database and returned to
the client in a single reply, formatted as if a full scrape had been
requested of a tracker whose only torrents are the ones named by the
info_hash keys (example tracker:
http://tracker.scarywater.net:443/scrape ).
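The reply format Harold describes is an ordinary bencoded scrape body whose 'files' dictionary simply has fewer keys. A minimal stdlib-only decoder is enough to read one (a real client would use a proper bencode library; the sample reply below is hand-built):

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next offset).
    Handles ints, byte strings, lists and dicts, the full scrape grammar."""
    c = data[i:i+1]
    if c == b"i":                       # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i+1:end]), end + 1
    if c in (b"l", b"d"):               # list or dict
        items, i = [], i + 1
        while data[i:i+1] != b"e":
            v, i = bdecode(data, i)
            items.append(v)
        if c == b"l":
            return items, i + 1
        return dict(zip(items[::2], items[1::2])), i + 1
    colon = data.index(b":", i)         # byte string: <length>:<bytes>
    n = int(data[i:colon])
    return data[colon+1:colon+1+n], colon + 1 + n

# A single-torrent reply, as if the tracker only knew the requested key:
reply = (b"d5:filesd20:" + b"\x12" * 20 +
         b"d8:completei5e10:downloadedi50e10:incompletei10eeee")
stats = bdecode(reply)[0][b"files"][b"\x12" * 20]
print(stats[b"complete"], stats[b"incomplete"])
```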

If you're scraping for 100 torrents, you may be better off just
getting the full scrape anyway. As wrong as the behavior appears for
multiple info_hash values, the way it is handled is fairly consistent
across different archetypes of trackers (though it depends more on the
archetypes of programming styles). Additional trackers will be
supporting this in the near future.








Nick Johnson
2005-01-13 01:04:52 UTC
Permalink
On 13/01/2005, at 1:07 PM, Harold Feit - Depthstrike.com Administrator
wrote:
Post by Harold Feit - Depthstrike.com Administrator
If you're scraping for 100 torrents, you may be better off just
getting the full scrape anyway. As wrong as the behavior appears for
multiple info_hash values, the way that it is handled is fairly
consistent across different archtypes of trackers (although is more
dependent on the archtypes of programming styles). Additional
trackers will be supporting this in the near future.
This isn't such a good idea when the tracker has literally thousands of
torrents, though. I'd really like to see an extension that has at least
semi-official support, and for which the fallback behaviour when
unsupported is all torrents rather than a single one. If you request
multiple hashes, it makes no sense for a tracker that doesn't support
the feature to return only a single one of them. :/

-Nick Johnson




Harold Feit - Depthstrike.com Administrator
2005-01-13 01:35:38 UTC
Permalink

Looking at the way GET values are handled: multiple parameters with
the same key typically overwrite one another in default
configurations, so only the last value is available. Other
configurations read only the first value, discarding those after it.

Clients that don't support asking for multiple scrapes in the same
request will only request 1 at a time, consistent with the way they
have done in the past.

For tracker-side development, I feel your suggestion of returning all
torrents when multiple info_hash values aren't supported is quite
damaging to the tracker (pointing back to your "thousands of
torrents" tracker example). If you requested the scrape data for 5
torrents and got 5000, you would do more damage to the tracker than 5
separate requests would.








Nick Johnson
2005-01-13 03:23:17 UTC
Permalink
Post by Harold Feit - Depthstrike.com Administrator
Taking a look at the way get values are generated in cases of
multiple parameters of the same key typically overwrites the value in
default configurations, preventing other values from being available.
Other configurations only read the first value, not allowing access
to those after it.
Clients that don't support asking for multiple scrapes in the same
request will only request 1 at a time, consistent with the way they
have done in the past.
For tracker-side development, I feel your suggestion of returning all
torrents when processing multiple info_hash values isn't supported is
quite damaging to the tracker (pointing back to your "thousands of
torrents" tracker example). If you requested the scrape data for 5
torrents and got 5000, you would be doing more damage to the tracker
than getting 5 separate requests.
Then perhaps two mechanisms would be better. At the moment, I'm
requesting the full scrape from each tracker, because individual
requests are usually inefficient. If I could specify a list of desired
keys and have the tracker give me either those or everything, that
would be an improvement in efficiency. A client that only wants 5 (but
really, how often does that happen in practice?) can use the current
mechanism.
As it stands, if I try to use the current system to request more than
one hash, I'm quite likely to get a response with only one, wasting a
request.

-Nick Johnson



Harold Feit - Depthstrike.com Administrator
2005-01-13 05:16:26 UTC
Permalink

The client that scrapes and wants five is more common than you think.
Azureus' scrape code allows it to scan the torrents (both active and
not) in the list and scrape all the common-tracker torrents in one
request (when the tracker supports it). It's not uncommon to have a
tracker with 4000 torrents listed and have one client request scrapes
for batches of 5-10 at a time.

How is requesting the scrape information for several dozen (of
several thousand) torrents and getting the scrape information for
several thousand more efficient? It sounds quite wasteful if you ask
me, especially considering the wide support for connection
keep-alives in both clients and trackers.

Because of limitations in the HTTP protocol, there's a limit to the
number of torrents you can request scrape information for before the
request becomes too long to process, and average users won't reach
that limit very often.

As for your "wasted request" theory: clients that support gathering
multiple scrapes in one connection KNOW when they request multiple
scrapes, and KNOW that if they only get one back, the tracker doesn't
support multiscrape. After finding out one way or the other whether
multiscrape is supported, they either wait for the next cycle (if it
is supported) or request the remaining scrapes in separate, individual
requests.

Two mechanisms just means more code to break trackers AND clients
right now. I've started work on adding support for the current
multiscrape handling in the trackers I help program.








Olaf van der Spek
2005-01-13 17:25:55 UTC
Permalink
Post by Nick Johnson
On 13/01/2005, at 1:07 PM, Harold Feit - Depthstrike.com Administrator
Post by Harold Feit - Depthstrike.com Administrator
If you're scraping for 100 torrents, you may be better off just
getting the full scrape anyway. As wrong as the behavior appears for
multiple info_hash values, the way that it is handled is fairly
consistent across different archtypes of trackers (although is more
dependent on the archtypes of programming styles). Additional
trackers will be supporting this in the near future.
This isn't such a good idea when the tracker has literally thousands of
torrents, though. I'd really like to see an extension that has at least
1000 torrents would result in 32 kb of 'real' info. That's assuming
bencoding overhead is compressed away.
100 single-scrape requests/responses already use 50 kb (IP/TCP/HTTP
overhead).
So a full-scrape might be the best option.
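Spelled out, with Olaf's estimates as the inputs (roughly 32 bytes of real data per torrent after compression, and roughly 500 bytes of IP/TCP/HTTP overhead per request/response pair, both his figures):

```python
TORRENTS_ON_TRACKER = 1000
TORRENTS_WANTED = 100
BYTES_PER_ENTRY = 32        # hash + three counters, bencoding compressed away
OVERHEAD_PER_REQUEST = 500  # IP + TCP + HTTP headers for one request/response

full_scrape = TORRENTS_ON_TRACKER * BYTES_PER_ENTRY      # 32 kB total
single_scrapes = TORRENTS_WANTED * OVERHEAD_PER_REQUEST  # 50 kB pure overhead
print(full_scrape, single_scrapes)
```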
Post by Nick Johnson
semi-official support, and for which the fallback behaviour when not
supported is all torrents, rather than a single one. If you request
multiple hashes, it makes no sense for a client not supporting the
feature to return only a single one of them. :/
-Nick Johnson
Nick Johnson
2005-01-16 22:38:34 UTC
Permalink
Post by Olaf van der Spek
Post by Nick Johnson
On 13/01/2005, at 1:07 PM, Harold Feit - Depthstrike.com Administrator
Post by Harold Feit - Depthstrike.com Administrator
If you're scraping for 100 torrents, you may be better off just
getting the full scrape anyway. As wrong as the behavior appears for
multiple info_hash values, the way that it is handled is fairly
consistent across different archtypes of trackers (although is more
dependent on the archtypes of programming styles). Additional
trackers will be supporting this in the near future.
This isn't such a good idea when the tracker has literally thousands of
torrents, though. I'd really like to see an extension that has at least
1000 torrents would result in 32 kb of 'real' info. That's assuming
bencoding overhead is compressed away.
100 single-scrape requests/responses already use 50 kb (IP/TCP/HTTP
overhead).
So a full-scrape might be the best option.
I've seen tracker scrape pages that are 3-4 MB. When I want 100 hashes
from the tracker, individual requests aren't particularly efficient,
but neither is fetching everything. I'd still rather fetch everything
than hit the tracker with 100 HTTP requests, though. This is why I
think it'd be useful to have a mechanism that:
1) Falls back to retrieving everything
2) Can be easily supported by any CGI system, rather than relying on
specifying multiple values for a single variable, something PHP does not
easily support.

-Nick Johnson



Olaf van der Spek
2005-01-16 23:19:42 UTC
Permalink
Post by Nick Johnson
Post by Olaf van der Spek
Post by Nick Johnson
This isn't such a good idea when the tracker has literally thousands of
torrents, though. I'd really like to see an extension that has at least
1000 torrents would result in 32 kb of 'real' info. That's assuming
bencoding overhead is compressed away.
100 single-scrape requests/responses already use 50 kb (IP/TCP/HTTP
overhead).
So a full-scrape might be the best option.
I've seen tracker scrape pages that are 3-4MB. When I want 100 hashes
With gzip compression? Or without?
3 MB is a lot.
Post by Nick Johnson
from the tracker, individual requests aren't particularaly efficient,
but nor is fetching everything. I'd still rather fetch everything than
hit the tracker with 100 HTTP requests, though. This is why I think it'd
Why?
100 single-torrent requests consume far less than 3 mb.
Post by Nick Johnson
1) Falls back to retrieving everything
2) Can be easily supported by any CGI system, rather than relying on
specifying multiple values for a single variable, something PHP does not
easily support.
-Nick Johnson
Roger Pate
2005-01-17 00:17:02 UTC
Permalink
Post by Olaf van der Spek
Post by Nick Johnson
from the tracker, individual requests aren't particularaly efficient,
but nor is fetching everything. I'd still rather fetch everything than
hit the tracker with 100 HTTP requests, though. This is why I think it'd
Why?
100 single-torrent requests consume far less than 3 mb.
Each HTTP request consumes server resources, even if the data transferred is small. One hundred requests is significant.

Do trackers support keep-alive? That would help, but not eliminate, this problem.



Olaf van der Spek
2005-01-17 09:51:38 UTC
Permalink
Post by Roger Pate
Post by Olaf van der Spek
Post by Nick Johnson
from the tracker, individual requests aren't particularaly efficient,
but nor is fetching everything. I'd still rather fetch everything than
hit the tracker with 100 HTTP requests, though. This is why I think it'd
Why?
100 single-torrent requests consume far less than 3 mb.
Each HTTP request consumes server resources, even if the data transferred is small. One hundred requests is significant.
You mean CPU? RAM?
True, but IMO the most costly/valuable resource is network bandwidth.
Post by Roger Pate
How trackers support keep-alive? That would help, but not eliminate, this problem.
They don't.



Harold Feit - Depthstrike.com Administrator
2005-01-17 11:39:39 UTC
Permalink

PHP trackers have inherited HTTP keep-alive support from whatever
server they're hosted on.
BNBT supports HTTP keep-alives as well.

It's up to client/parser developers to support it on their side or
else it's useless.








Olaf van der Spek
2005-01-17 12:50:27 UTC
Permalink
Post by Harold Feit - Depthstrike.com Administrator
PHP trackers have inherited HTTP keep-alive support from whatever
server they're hosted on.
BNBT supports HTTP keep-alives as well.
It's up to client/parser developers to support it on their side or
else it's useless.
Isn't one of the first performance tips to disable it?



Harold Feit - Depthstrike.com Administrator
2005-01-17 16:32:57 UTC
Permalink

Cite your source. According to Apache, leaving it on is better.

- ----- Keep-alive section of Apache's httpd.conf -----
#
# KeepAlive: Whether or not to allow persistent connections (more
# than one request per connection). Set to "Off" to deactivate.
#
KeepAlive On

#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited
# amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100

- -----Original Message-----
From: Olaf van der Spek [mailto:***@LIACS.NL]
Sent: Monday, January 17, 2005 8:50 AM
To: ***@yahoogroups.com
Subject: Re: [BitTorrent] Tracker scrape extension proposal
Post by Harold Feit - Depthstrike.com Administrator
PHP trackers have inherited HTTP keep-alive support from whatever
server they're hosted on.
BNBT supports HTTP keep-alives as well.
It's up to client/parser developers to support it on their side or
else it's useless.
Isn't one of the first performance tips to disable it?

Olaf van der Spek
2005-01-17 16:36:01 UTC
Permalink
Post by Harold Feit - Depthstrike.com Administrator
Cite your source. According to Apache, leaving it on is better.
Can't find it right now, but I was talking about when running a tracker,
not when running Apache in general.
Post by Harold Feit - Depthstrike.com Administrator
- ----- Keep-alive section of Apache's httpd.conf -----
#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On
#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100
- -----Original Message-----
Sent: Monday, January 17, 2005 8:50 AM
Subject: Re: [BitTorrent] Tracker scrape extension proposal
Post by Harold Feit - Depthstrike.com Administrator
PHP trackers have inherited HTTP keep-alive support from whatever
server they're hosted on.
BNBT supports HTTP keep-alives as well.
It's up to client/parser developers to support it on their side or
else it's useless.
Isn't one of the first performance tips to disable it?
--
Olaf van der Spek
http://xccu.sf.net/



Roger Pate
2005-01-19 18:57:07 UTC
Permalink
Post by Olaf van der Spek
Post by Roger Pate
Post by Olaf van der Spek
Post by Nick Johnson
from the tracker, individual requests aren't particularly efficient,
but nor is fetching everything. I'd still rather fetch everything than
hit the tracker with 100 HTTP requests, though. This is why I think it'd
Why?
100 single-torrent requests consume far less than 3 mb.
Each HTTP request consumes server resources, even if the data transferred is small. One hundred requests is significant.
You mean CPU? RAM?
True, but IMO the most costly/valuable resource is network bandwidth.
Each TCP/IP connection takes roughly 4k of memory, on average, in the TCP stack. If the connection is not closed cleanly, it may have to time out before being reclaimed, and even then it may not be reclaimed immediately. That means 100 single-torrent requests cost about 400k of RAM per client in connection resources alone. It would be similar to a SYN DoS attack, where the target isn't bandwidth but the TCP stack itself; that used to bring down servers until they learned to ignore the SYN packets. But in this case, they'd be legitimate connections and couldn't be ignored.

The most valuable resource is usually bandwidth. But if you're running low on other resources, they can become much more important.

And that's without even mentioning that doing 100 database lookups instead of 1 has similar consequences.

Some mechanism to pass a list of torrent hashes should be used if retrieving all of them is unacceptable.



Olaf van der Spek
2005-01-19 21:06:43 UTC
Permalink
Post by Roger Pate
Post by Olaf van der Spek
Post by Roger Pate
Post by Olaf van der Spek
Post by Nick Johnson
from the tracker, individual requests aren't particularly efficient,
but nor is fetching everything. I'd still rather fetch everything than
hit the tracker with 100 HTTP requests, though. This is why I think it'd
Why?
100 single-torrent requests consume far less than 3 mb.
Each HTTP request consumes server resources, even if the data transferred is small. One hundred requests is significant.
You mean CPU? RAM?
True, but IMO the most costly/valuable resource is network bandwidth.
Each TCP/IP connection takes roughly 4k of memory, on average, in the TCP stack. If the connection is not closed cleanly, it may have to time out before being reclaimed, and even then it may not be reclaimed immediately. That means 100 single-torrent requests cost about 400k of RAM per client in connection resources alone. It would be similar to a SYN DoS attack, where the target isn't bandwidth but the TCP stack itself; that used to bring down servers until they learned to ignore the SYN packets. But in this case, they'd be legitimate connections and couldn't be ignored.
But which clients are likely to need 100 scrape responses?
Besides, the requests can't and won't all be done in parallel.
Post by Roger Pate
The most valuable resource is usually bandwidth. But if you're running low on other resources, they can become much more important.
And that's without even mentioning that doing 100 database lookups instead of 1 has similar consequences.
Database?
High performance trackers don't use databases as primary storage.
Post by Roger Pate
Some mechanism to pass a list of torrent hashes should be used if retrieving all of them is unacceptable.
True.
One (IMO) nice way would be to add something like info_hashes= with a
single string of all info_hashes concatenated in some encoding.
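As a rough illustration of that idea, the request could be built like this. Note that the 'hashes' GET parameter is only the extension proposed in this thread (not part of the BitTorrent spec), and the tracker URL is invented:

```python
# Sketch of the proposed multi-hash scrape request. The 'hashes'
# parameter is hypothetical; an old tracker would simply ignore it and
# return a full scrape, which is the backwards-compatible fallback
# described at the start of the thread.
from urllib.parse import quote

def build_scrape_url(scrape_url, info_hashes):
    """Concatenate several 20-byte info_hashes into one scrape request."""
    for h in info_hashes:
        if len(h) != 20:
            raise ValueError("each info_hash must be exactly 20 bytes")
    # Percent-encode each raw hash and join them into a single string,
    # a multiple of 20 bytes once decoded by the tracker.
    encoded = "".join(quote(h, safe="") for h in info_hashes)
    return "%s?hashes=%s" % (scrape_url, encoded)
```

A tracker that recognises the parameter would return only the requested keys; one that doesn't would fall back to returning all of them.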



Jesus Cea
2005-01-20 16:19:47 UTC
Permalink
Post by Olaf van der Spek
One (IMO) nice way would be to add something like info_hashes= with a
single string of all info_hashes concatenated in some encoding.
What about keeping state in the tracker? That is, provide the hashes to
request in one connection and later request only the state for that
"session". If the tracker has no idea about the session (reboot, purge),
simply create a new session.

Tracker capabilities could be advertised in the HTTP tracker response,
using a HTTP header or a new bencoded field.
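A minimal sketch of the tracker-side bookkeeping that idea would need. The session mechanism is purely hypothetical (nothing like it exists in the protocol), and all names below are invented:

```python
# Hypothetical stateful-scrape session store: the client registers its
# info_hashes once, then polls with just an opaque session id.
import os

class ScrapeSessions:
    def __init__(self):
        self.sessions = {}  # session id -> list of 20-byte info_hashes

    def create(self, info_hashes):
        # Register the hashes and hand back an opaque session id.
        sid = os.urandom(8).hex()
        self.sessions[sid] = list(info_hashes)
        return sid

    def lookup(self, sid):
        # Returns None for an unknown session (tracker reboot or purge);
        # per the suggestion above, the client then recreates the session.
        return self.sessions.get(sid)
```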
--
Jesus Cea Avion _/_/ _/_/_/ _/_/_/
***@argo.es http://www.argo.es/~jcea/ _/_/ _/_/ _/_/ _/_/ _/_/
_/_/ _/_/ _/_/_/_/_/
PGP Key Available at KeyServ _/_/ _/_/ _/_/ _/_/ _/_/
"Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/
"My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz



Nick Johnson
2005-01-20 22:47:16 UTC
Permalink
Post by Olaf van der Spek
Database?
High performance trackers don't use databases as primary storage.
Oh? What do they use that's higher performance?
Post by Olaf van der Spek
True.
One (IMO) nice way would be to add something like info_hashes= with a
single string of all info_hashes concatenated in some encoding.
Exactly what I proposed at the start. :)

-Nick Johnson



Olaf van der Spek
2005-01-20 22:53:10 UTC
Permalink
Post by Nick Johnson
Post by Olaf van der Spek
Database?
High performance trackers don't use databases as primary storage.
Oh? What do they use that's higher performance?
Local memory in the tracker process itself (C++, not an option with PHP).
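Purely as an illustration of that idea (sketched in Python for readability, even though the point is that high-performance trackers do this in C++; all names here are invented), "local memory as primary storage" amounts to keeping the per-torrent counters in an ordinary hash map inside the tracker process:

```python
# Sketch of in-process scrape state: one dict keyed by info_hash,
# no database round trip. Class and field names are illustrative only.

class InMemoryTracker:
    def __init__(self):
        # info_hash -> scrape-style counters
        self.torrents = {}

    def add_peer(self, info_hash, seeding):
        stats = self.torrents.setdefault(
            info_hash, {'complete': 0, 'incomplete': 0, 'downloaded': 0})
        if seeding:
            stats['complete'] += 1
        else:
            stats['incomplete'] += 1

    def scrape(self, info_hashes):
        # One O(1) lookup per requested torrent, all in local memory.
        return {h: self.torrents[h] for h in info_hashes if h in self.torrents}
```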



Marcel Popescu
2005-01-21 15:55:13 UTC
Permalink
Post by Olaf van der Spek
Post by Nick Johnson
Post by Olaf van der Spek
High performance trackers don't use databases as primary storage.
Oh? What do they use that's higher performance?
Local memory in the tracker process itself (C++, not an option with PHP).
Not to be overly pedantic, but any collection of organized data is, by
definition, a database <g>. The way you store and access it is rather
irrelevant. Not to mention that any DBMS worth anything (Firebird, anyone?)
will cache frequently accessed data like a charm.

Marcel





dropnessence
2005-01-17 16:12:10 UTC
Permalink
Keep-alive is a bad system to use with a BitTorrent tracker. By
default, clients are set to only hit the server every 5 minutes. If you
keep a thread/socket alive for 5 minutes just for that client, and you
have a swarm of clients, then you will be keeping thousands of
threads/sockets open for keep-alive. Keep-alive is useful for web pages
where somebody is clicking on something every 3-60 seconds; BT announces
happen every 300 seconds. If the BT tracker (announce) is implemented by
a web server (e.g., PHP), then keep-alive should be disabled.
Post by Roger Pate
Post by Olaf van der Spek
Post by Nick Johnson
from the tracker, individual requests aren't particularly efficient,
but nor is fetching everything. I'd still rather fetch everything than
hit the tracker with 100 HTTP requests, though. This is why I think it'd
Why?
100 single-torrent requests consume far less than 3 mb.
Each HTTP request consumes server resources, even if the data
transferred is small. One hundred requests is significant.
Post by Roger Pate
Do trackers support keep-alive? That would help, but not eliminate, this problem.
Harold Feit - Depthstrike.com Administrator
2005-01-18 13:10:19 UTC
Permalink

BitTorrent doesn't have a default announce cycle of 5 minutes. It has
a default announce cycle of 30.

Clients SHOULDN'T report keep-alive capability for announces anyway,
since the connections are typically so short and so far spaced. This
would permit a webserver to run BitTorrent trackers without
Keep-alives.

- -----Original Message-----
From: dropnessence [mailto:***@networkessence.net]
Sent: Monday, January 17, 2005 12:12 PM
To: ***@yahoogroups.com
Subject: [BitTorrent] Re: Tracker scrape extension proposal




Keep-alive is a bad system to use with a BitTorrent tracker. By
default, clients are set to only hit the server every 5 minutes. If you
keep a thread/socket alive for 5 minutes just for that client, and you
have a swarm of clients, then you will be keeping thousands of
threads/sockets open for keep-alive. Keep-alive is useful for web pages
where somebody is clicking on something every 3-60 seconds; BT announces
happen every 300 seconds. If the BT tracker (announce) is implemented by
a web server (e.g., PHP), then keep-alive should be disabled.
On Mon, 17 Jan 2005 00:19:42 +0100, Olaf van der Spek
Post by Olaf van der Spek
Post by Nick Johnson
from the tracker, individual requests aren't particularly efficient, but nor is fetching everything. I'd still rather fetch everything than hit the tracker with 100 HTTP requests, though. This is why I think it'd
Why?
100 single-torrent requests consume far less than 3 mb.
Each HTTP request consumes server resources, even if the data transferred is small. One hundred requests is significant.
Do trackers support keep-alive? That would help, but not eliminate, this problem.






Alex
2005-01-18 17:18:34 UTC
Permalink
There are two intervals: the announce, which informs the tracker of the
status of things, and the rerequest, which is based on whether the peer
has a connection and/or has its peer list filled. Currently, the
tracker doesn't set a min rerequest interval (though it could); the
tracker only sets a default announce interval. Thus, peers by default
request peers (thus announcing) every 5 mins until they fill their
peer list and then fall back to announcing every 30 minutes. This is only
how the source client works; I'm not sure how the others behave, but I'm
sure they are similar.

By default, the announce request is 5 mins or 30, depending on how many
peers the client has and if it's been connected to. The majority of
swarms won't maintain a full peer list and will average closer to the 5
min interval than the 30 min interval (excepting torrents with peer
counts disproportional to the content's size).

download.py:

    ('rerequest_interval', 5 * 60,
     'time to wait between requesting more peers'),

    rerequest = Rerequester(response['announce'],
                            config['rerequest_interval'], ...)

Rerequester.py (announce/rerequest interval code, summarized):

class Rerequester:
    def __init__(self, url, interval, sched, howmany, minpeers,
                 connect, externalsched, amount_left, up, down,
                 port, ip, myid, infohash, timeout, errorfunc, maxpeers,
                 doneflag, upratefunc, downratefunc, ever_got_incoming):
        self.interval = interval
        self.announce_interval = 30 * 60

    def c(self):
        self.sched(self.c, self.interval)
        if self.ever_got_incoming():
            getmore = self.howmany() <= self.minpeers / 3
        else:
            getmore = self.howmany() < self.minpeers
        if getmore or time() - self.last_time > self.announce_interval:
            self.announce()

    def announce(self, event = None):
        Thread(target = self.rerequest, args = [s, set]).start()

    def rerequest(self, url, set):
        self.postrequest(r)

    def postrequest(self, data):
        r = bdecode(data)
        self.interval = r.get('min interval', self.interval)
        self.trackerid = r.get('tracker id', self.trackerid)



Alex

On Jan 18, 2005, at 7:10 AM, Harold Feit - Depthstrike.com
Post by Harold Feit - Depthstrike.com Administrator
BitTorrent doesn't have a default announce cycle of 5 minutes. It has
a default announce cycle of 30.
Clients SHOULDN'T report keep-alive capability for announces anyway,
since the connections are typically so short and so far spaced. This
would permit a webserver to run BitTorrent trackers without
Keep-alives.
- -----Original Message-----
Sent: Monday, January 17, 2005 12:12 PM
Subject: [BitTorrent] Re: Tracker scrape extension proposal
Keep-alive is a bad system to use with a BitTorrent tracker. By
default, clients are set to only hit the server every 5 minutes. If you
keep a thread/socket alive for 5 minutes just for that client, and you
have a swarm of clients, then you will be keeping thousands of
threads/sockets open for keep-alive. Keep-alive is useful for web pages
where somebody is clicking on something every 3-60 seconds; BT announces
happen every 300 seconds. If the BT tracker (announce) is implemented by
a web server (e.g., PHP), then keep-alive should be disabled.
On Mon, 17 Jan 2005 00:19:42 +0100, Olaf van der Spek
Post by Olaf van der Spek
Post by Nick Johnson
from the tracker, individual requests aren't particularly efficient, but nor is fetching everything. I'd still rather fetch everything than hit the tracker with 100 HTTP requests, though. This is why I think it'd
Post by Olaf van der Spek
Why?
100 single-torrent requests consume far less than 3 mb.
Each HTTP request consumes server resources, even if the data transferred is small. One hundred requests is significant.
Do trackers support keep-alive? That would help, but not eliminate, this problem.
Olaf van der Spek
2005-01-18 23:23:14 UTC
Permalink
Post by Alex
There are two intervals: the announce, which informs the tracker of the
status of things, and the rerequest, which is based on whether the peer
has a connection and/or has its peer list filled. Currently, the
tracker doesn't set a min rerequest interval (though it could); the
How can it do that?
And which clients respect it?
Post by Alex
tracker only sets a default announce interval. Thus, peers by default
request peers (thus announcing) every 5 mins until they fill their
peer list and then fall back to announcing every 30 minutes. This is only
how the source client works; I'm not sure how the others behave, but I'm
sure they are similar.
By default, the announce request is 5 mins or 30, depending on how many
peers the client has and if it's been connected to. The majority of
swarms won't maintain a full peer list and will average closer to the 5
min interval than the 30 min interval (excepting torrents with peer
counts disproportional to the content's size).
Alex
2005-01-19 04:10:50 UTC
Permalink
Before this subject came up, I didn't know Python but knew BT
really well. Kudos for bringing it up, because I learned more about BT,
and it seems Python wouldn't be as difficult as I expected.

announce_interval is overwritten by a valid tracker response, as are min
interval, tracker id, etc. 'tracker id' is not used by the Python
tracker either; I would imagine it would be just as easy to implement.

Rerequester.py (reference code where 'min interval' is overwritten):

    def postrequest(self, data):
        try:
            r = bdecode(data)
            check_peers(r)
            if r.has_key('failure reason'):
                self.errorfunc('rejected by tracker - ' + r['failure reason'])
            else:
                if r.has_key('warning message'):
                    self.errorfunc('warning from tracker - ' + r['warning message'])
                self.announce_interval = r.get('interval', self.announce_interval)
                self.interval = r.get('min interval', self.interval)
                self.trackerid = r.get('tracker id', self.trackerid)
                self.last = r.get('last')



Here is how to implement a tracker-controlled min_interval (this would
complement the current tracker's support for controlling the
announce_interval):

Alex-G5:~/bt/BitTorrent-3.4.2/BitTorrent nessence$ diff track.py track_mininterval.py
30a31
>     ('min_interval', 5 * 60, 'seconds downloaders should wait between rerequests'),
120a122
>         self.min_interval = config['min_interval']
399c401
<         data = {'interval': self.reannounce_interval}
---
>         data = {'interval': self.reannounce_interval, 'min interval': self.min_interval}



"...though it could..." == it was not coded into the tracker, but
supported by the client.

I tested this code and it works with the client from bittorrent.com.
The downloading peer only contacted the tracker every 10 minutes
instead of 5. This isn't applicable to the 2nd tracker connection; the
second tracker connection is scheduled before the min_interval is
overwritten by the initial event=started connection.

0) BT client makes initial tracker connection (event=started)
1) BT client connects to the tracker client->min_interval seconds after 0)
n) BT client connects to the tracker tracker->min_interval seconds after the previous connection

I'm not sure which other clients do or do not support this, but if you
have a home-made tracker you could easily configure it to ignore
clients that ignore the min_interval.

Just an FYI: the default min_peers value is 20, and the scheduling is
dependent on the min_peers value. So any peer which has fewer than
20/3 peers is going to continue using the min_interval. That
basically means that any 'default' client with fewer than 7 peers on
a torrent, whether a seeder or not, will announce at min_interval and
not announce_interval (refer to the sched() call in the previous email).
From my experience, most environments have peers all using their
min_interval and hitting the tracker once every 5 mins. This equates to
0.2 hits per min per peer: one hit per min for every 5 peers, or one
hit per second for every 300 peers.

You might see why a tracker with 30k peers has problems [with Python]
handling 100 hits per second, while a compiled tracker like BNBT barely
notices the load. I honestly think it has to do with all the hash
lookups: 100 hash lookups and peer-list builds per second are way faster
in C++ than in Python (or the Apache+PHP trackers out there). AFAIK,
the Python tracker uses an array and a flat dstate file, whereas C++ can
use structs, linked lists, maps, etc.
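The load arithmetic above can be checked directly (the 5-minute min_interval is the figure quoted in this thread, and 30k/375k are the swarm sizes mentioned):

```python
# Announce load implied by a 5-minute min_interval.
min_interval = 5 * 60                      # seconds between announces
hits_per_peer_per_min = 60.0 / min_interval

print(hits_per_peer_per_min)               # 0.2 hits/min per peer
print(1 / hits_per_peer_per_min)           # 5 peers -> 1 hit/min
print(min_interval)                        # 300 peers -> 1 hit/sec

for peers in (30000, 375000):
    # Tracker hits per second at this swarm size: 100 and 1250.
    print(peers, peers / min_interval)
```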

Hope I answered your questions.

-Alex
(nessence)
Post by Alex
There are two intervals: the announce, which informs the tracker of the
status of things, and the rerequest, which is based on whether the peer
has a connection and/or has its peer list filled. Currently, the
tracker doesn't set a min rerequest interval (though it could); the
How can it do that?
And which clients respect it?
Post by Alex
tracker only sets a default announce interval. Thus, peers by default
request peers (thus announcing) every 5 mins until they fill their
peer list and then fall back to announcing every 30 minutes. This is only
how the source client works; I'm not sure how the others behave, but I'm
sure they are similar.
By default, the announce request is 5 mins or 30, depending on how many
peers the client has and if it's been connected to. The majority of
swarms won't maintain a full peer list and will average closer to the 5
min interval than the 30 min interval (excepting torrents with peer
counts disproportional to the content's size).
Olaf van der Spek
2005-01-19 21:26:17 UTC
Permalink
Post by Alex
You might see why a tracker with 30k peers has problems [with python]
handling 100 hits per second. A compiled tracker like BNBT barely
At 375k peers you do notice it. And it just eats 6x as much bandwidth as the
'optimal' condition.
Post by Alex
notices the load. I honestly think it has to do with all the hash
lookups. 100 hash lookups and peer lists per second is way faster in
C++ than in python (or the apache+php trackers out there). Afaik,
python uses an array and a flat dstate file whereas C++ can use
structs, linked lists, maps, etc.
Olaf van der Spek
2005-01-21 16:26:14 UTC
Permalink
Post by Marcel Popescu
Post by Olaf van der Spek
Post by Nick Johnson
Post by Olaf van der Spek
High performance trackers don't use databases as primary storage.
Oh? What do they use that's higher performance?
Local memory in the tracker process itself (C++, not an option with PHP).
Not to be overly pedantic, but any collection of organized data is, by
definition, a database <g>. The way you store and access it is rather
irrelevant. Not to mention that any DBMS worth anything (Firebird, anyone?)
will cache frequently accessed data like a charm.
The access latency/penalty is still way higher than accessing a local
variable.


