Discussion:
XML Torrents?
Kenneth Porter
2005-03-10 16:14:25 UTC
Permalink
At the risk of getting stoned for heresy ;) I'm wondering why .torrent
files aren't implemented in XML? The recent thread confusing the filename
encoding scheme suggests that a human-readable, extensible,
self-documenting format, one with many off-the-shelf tools available, would
be better than a new binary format.

What constraints make the binary format preferable?

BTW, the archives don't show any previous discussion of XML:

<http://groups.yahoo.com/group/BitTorrent/messagesearch?query=xml>



Yahoo! Groups Links

<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/BitTorrent/

<*> To unsubscribe from this group, send an email to:
BitTorrent-***@yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/
Harold Feit - Depthstrike.com Administrator
2005-03-10 20:26:50 UTC
Permalink
Actually, the last time XML torrents were proposed, the first thing that
was brought up was "how do you propose to store binary data?"

Using a binary format permits EXTREMELY efficient storage of the
SHA1 hash values for pieces (20 bytes per piece instead of 40) and
leaves more room for other binary data (including binary
storage of future hash types).

Translations of individual fields within a torrent (such as using
UTF-8 in filenames) are controlled by the individual client
implementation, not by the format itself.
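
A rough sketch of the 20-versus-40-byte comparison above, in Python. The piece data is made up for illustration; the Base64 line is an extra comparison point that is not part of the original argument.

```python
import base64
import hashlib

# Hypothetical piece data; real pieces are much larger.
piece = b"example piece data"

digest = hashlib.sha1(piece).digest()                 # raw binary SHA-1
hex_text = digest.hex()                               # hexadecimal text form
b64_text = base64.b64encode(digest).decode("ascii")   # Base64 text form

# Sizes per piece: raw bytes, hex characters, Base64 characters
print(len(digest), len(hex_text), len(b64_text))      # 20 40 28
```

Any text-based container has to pick one of the two text forms (or something similar), so the per-piece cost grows by 40% to 100% before any markup is counted.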

Konstantin 'Kosta' Welke
2005-03-10 22:46:47 UTC
Permalink
Post by Kenneth Porter
At the risk of getting stoned for heresy ;) I'm wondering why .torrent
files aren't implemented in XML? The recent thread confusing the filename
encoding scheme suggests that a human-readable, extensible,
self-documenting format, one with many off-the-shelf tools available, would
be better than a new binary format.
What constraints make the binary format preferable?
XML is bloated. XML has lots of stuff we don't need, and it takes up
much, much more space. Bencoding is a nice, easily machine-readable,
compact format.

Bottom line: there is no reason to switch to XML. And we won't escape
the different character encoding difficulties that way.

Kosta



Justin Cormack
2005-03-11 10:18:16 UTC
Permalink
Post by Konstantin 'Kosta' Welke
Post by Kenneth Porter
At the risk of getting stoned for heresy ;) I'm wondering why .torrent
files aren't implemented in XML? The recent thread confusing the filename
encoding scheme suggests that a human-readable, extensible,
self-documenting format, one with many off-the-shelf tools available, would
be better than a new binary format.
What constraints make the binary format preferable?
XML is bloated. XML has lots of stuff we don't need, and it takes up
much, much more space. Bencoding is a nice, easily machine-readable,
compact format.
Bottom line: there is no reason to switch to XML. And we won't escape
the different character encoding difficulties that way.
Actually, the character encoding and space reasons are relatively minor.
For many purposes it would be very useful to have a text format (e.g. to
send a torrent by email). However, there must be a unique
representation of the info portion of the torrent file so that the info
hash can be generated (hence the rules about ordering of fields in bencoding).
XML doesn't give you a fixed representation, so you still need some canonical
form.

You could make an XML form where the info part in bencoded form could be
uniquely regenerated to generate/verify the hash.
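
To illustrate why the field ordering matters, here is a minimal bencoder sketch in Python. It is not the reference implementation; the function name and the sample info dictionary (including the placeholder `pieces` value) are made up for this example.

```python
import hashlib

def bencode(value) -> bytes:
    """Encode ints, byte strings, lists and dicts in bencoded form."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:" % len(value) + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        if not all(isinstance(k, bytes) for k in value):
            raise TypeError("dictionary keys must be byte strings")
        # Keys are always written in sorted order, so one dictionary
        # has exactly one byte representation.
        items = sorted(value.items(), key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))

# Hypothetical info dictionary, built twice with different key order.
info_a = {b"name": b"example.iso", b"length": 1048576,
          b"piece length": 262144, b"pieces": b"\x00" * 80}
info_b = {b"pieces": b"\x00" * 80, b"piece length": 262144,
          b"length": 1048576, b"name": b"example.iso"}

hash_a = hashlib.sha1(bencode(info_a)).digest()
hash_b = hashlib.sha1(bencode(info_b)).digest()
print(hash_a == hash_b)  # True: same content, same bytes, same info hash
```

An XML form would need an equivalent rule (or a round trip back to these bytes) before the same hash could be computed from it.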

Justin



Konstantin 'Kosta' Welke
2005-03-11 12:35:57 UTC
Permalink
Post by Justin Cormack
For many purposes it would be very useful to have a text format (eg to
send a torrent by email). However it is required that there is a unique
representation of the info portion of the torrent file so that the info
hash can be generated (hence the rules about ordering of fields in bencoding).
XML doesn't give you a fixed representation, so you still need some canonical
form.
XML, or more specifically a DTD for XML, should give you a canonical
representation. There are some fields with names and values with some sort
of structure; that's what both XML and bencoding are all about.
Post by Justin Cormack
You could make an XML form where the info part in bencoded form could be
uniquely regenerated to generate/verify the hash.
We'd have to encode it in Base64, that's all.
The nice thing about XML is that we can make it as bloated as we want.
The worst I can think of is
<pieces>
<piece>MDEyMzQ1Njc4OTAxMjM0NTY3ODk=</piece>
<piece>YWtsc2pkb2l3amhlbmE4OTIxM2Y=</piece>
[...]
</pieces>
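
A small Python sketch of what reading such a layout back could look like. The element names and the two Base64 values are taken from the example above (the `[...]` line is left out); everything else is an assumption for illustration.

```python
import base64
import xml.etree.ElementTree as ET

# The two entries from the example above, without the elided middle part.
xml_text = """<pieces>
<piece>MDEyMzQ1Njc4OTAxMjM0NTY3ODk=</piece>
<piece>YWtsc2pkb2l3amhlbmE4OTIxM2Y=</piece>
</pieces>"""

root = ET.fromstring(xml_text)
hashes = [base64.b64decode(p.text) for p in root.findall("piece")]

# Joining the decoded values gives the flat byte string that the
# bencoded "pieces" field stores directly.
pieces = b"".join(hashes)
print([len(h) for h in hashes], len(pieces))  # [20, 20] 40
```

Each entry costs 28 Base64 characters plus 15 characters of tags, against 20 raw bytes in the bencoded form.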

HTH :)
Kosta



Kenneth Porter
2005-03-11 18:00:12 UTC
Permalink
--On Friday, March 11, 2005 10:18 AM +0000 Justin Cormack
Post by Justin Cormack
Actually the character encoding and space reasons are relatively minor.
Yeah, I don't see the point of saving 20 bytes when we're dealing with a
protocol designed to transfer hundreds of megabytes.
Penny-wise/pound-foolish. For larger XML files, the "bloat" can easily be
dealt with using commonplace compression like deflate. For example,
OpenOffice documents are stored in zipped XML collections. Much of the
bloat is caused by the large in-line descriptive tags, and those are
natural candidates for leveraging dictionary-based compression.
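
A quick way to try the compression argument in Python, using `zlib` (deflate). The XML layout, the piece count and the counter-derived digests are all made up for this sketch; real torrents will give different numbers.

```python
import base64
import hashlib
import zlib

# Hypothetical torrent with 1000 pieces; digests are derived from a
# counter so that the run is repeatable.
digests = [hashlib.sha1(str(i).encode("ascii")).digest() for i in range(1000)]

lines = ["<piece>%s</piece>\n" % base64.b64encode(d).decode("ascii")
         for d in digests]
xml_bytes = ("<pieces>\n" + "".join(lines) + "</pieces>\n").encode("utf-8")

compressed = zlib.compress(xml_bytes, 9)
raw_size = 20 * len(digests)   # size of the same hashes as raw binary

print(raw_size, len(xml_bytes), len(compressed))
```

The repeated tags compress very well, as suggested. The hash data itself is close to random, so the compressed size should not be expected to drop below the raw binary size.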

The encoding issue is somewhat orthogonal. The XML advantage there is that
it has a well-defined mechanism for specifying encodings, but it's not the
"selling point".
Post by Justin Cormack
For many purposes it would be very useful to have a text format (eg to
send a torrent by email). However it is required that there is a unique
representation of the info portion of the torrent file so that the info
hash can be generated (hence the rules about ordering of fields in
bencoding). XML doesn't give you a fixed representation, so you still need
some canonical form.
I see what you mean. It looks like the hash is used as a short-hand way of
representing a given metadata file, to be used as a key into the tracker
and peer state machines to identify a given swarm. Does the hash serve a
security purpose, or just an identification function? If the latter, the
authoring application could hash the initial contents of the XML
representation that it's written and append it as a final element. This
would provide the required uniqueness without requiring that a particular
representation be maintained by those communicating the content.

To me, the advantages of XML are extensibility, self-documentation, and
easy validation. Plus the wide availability of tools to manipulate the
format. This makes it easy to implement the tracker part of new clients in
any language that has an XML parser, and makes it easy to extend the
protocol. Since the tracker acts as a web server, and XML has web origins,
it also makes it natural to adapt existing web machinery to do the job.

Is bencoding so special? If the original BT protocol had been written in
Perl, would everyone be raving about Data::Dumper format? (And I recall
that PHP has its own data structure dump format.)

(BTW, note that I'm not proposing XML for the peer protocol, which is much
more performance-sensitive. For now I'm just talking about the metadata
file (ie. the .torrent).)



Elliott Mitchell
2005-03-11 20:26:50 UTC
Permalink
Post by Kenneth Porter
--On Friday, March 11, 2005 10:18 AM +0000 Justin Cormack
Post by Justin Cormack
Actually the character encoding and space reasons are relatively minor.
Yeah, I don't see the point of saving 20 bytes when we're dealing with a
protocol designed to transfer hundreds of megabytes.
Key issue here: the hundreds of megabytes are transferred between peers
with major bandwidth. The torrent file may come from a heavily
bandwidth-constrained system in Antarctica.
Post by Kenneth Porter
Penny-wise/pound-foolish. For larger XML files, the "bloat" can easily be
dealt with using commonplace compression like deflate. For example,
OpenOffice documents are stored in zipped XML collections. Much of the
bloat is caused by the large in-line descriptive tags, and those are
natural candidates for leveraging dictionary-based compression.
However, once they're that large, they're effectively no longer human
readable.
Post by Kenneth Porter
Post by Justin Cormack
For many purposes it would be very useful to have a text format (eg to
send a torrent by email). However it is required that there is a unique
representation of the info portion of the torrent file so that the info
hash can be generated (hence the rules about ordering of fields in
bencoding). XML doesn't give you a fixed representation, so you still need
some canonical form.
I see what you mean. It looks like the hash is used as a short-hand way of
representing a given metadata file, to be used as a key into the tracker
and peer state machines to identify a given swarm. Does the hash serve a
security purpose, or just an identification function? If the latter, the
authoring application could hash the initial contents of the XML
representation that it's written and append it as a final element. This
would provide the required uniqueness without requiring that a particular
representation be maintained by those communicating the content.
Nope, they provide proof that the data blocks have been transferred
successfully. This includes showing that the peer isn't sending you
garbage, hoping to fool you into giving credit for pieces you haven't
transferred.
Post by Kenneth Porter
To me, the advantages of XML are extensibility, self-documentation, and
easy validation. Plus the wide availability of tools to manipulate the
format. This makes it easy to implement the tracker part of new clients in
any language that has an XML parser, and makes it easy to extend the
protocol. Since the tracker acts as a web server, and XML has web origins,
it also makes it natural to adapt existing web machinery to do the job.
Easy validation? This is why the tools to do so are large? This is why
a lot of folks have gotten it wrong, leading to security holes?

The tracker is assumed to have very limited bandwidth, therefore using
XML with its bloat is a bad thing there.
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\ ( | ***@gremlin.m5p.com PGP 8881EF59 | ) /
\_ \ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
\___\_|_/82 04 A1 3C C7 B1 37 2A*E3 6E 84 DA 97 4C 40 E6\_|_/___/





Kenneth Porter
2005-03-12 00:26:11 UTC
Permalink
[info_hash's] provide proof that the data blocks have been transferred
successfully. This includes showing that the peer isn't sending you
garbage, hoping to fool you into giving credit for pieces you haven't
transferred.
Looking at the Tracker GET request in the protocol spec at
<http://www.bittorrent.com/protocol.html>, I don't see how the info_hash
provides any integrity. I understand how the fragment hashes do that, but
those are computed from the payload file, not the metadata file, and aren't
sensitive to the representation of the metadata.
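
For reference, a Python sketch of how the info_hash travels in the tracker GET request. The parameter names follow the protocol spec; the tracker URL, the tiny bencoded info dictionary and the peer ID are placeholders invented for this example.

```python
import hashlib
from urllib.parse import quote, unquote_to_bytes, urlencode, urlsplit

def announce_url(tracker, info_hash, peer_id, port,
                 uploaded=0, downloaded=0, left=0):
    """Build a tracker GET URL; binary values are percent-encoded."""
    params = {
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
    }
    return tracker + "?" + urlencode(params, quote_via=quote)

# Placeholder values, for illustration only.
info = b"d6:lengthi1e4:name7:examplee"
info_hash = hashlib.sha1(info).digest()
peer_id = b"-XX0001-" + b"0" * 12
url = announce_url("http://tracker.example.com/announce",
                   info_hash, peer_id, 6881)

# What a tracker would recover from the query string.
fields = dict(f.split("=", 1) for f in urlsplit(url).query.split("&"))
recovered = unquote_to_bytes(fields["info_hash"])
print(recovered == info_hash)
```

In this request the hash only names the torrent. The uploaded/downloaded/left figures are plain numbers supplied by the client, with nothing attached to prove them.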

I also don't see any mechanism for preventing spoofing of one's statistics
to the tracker.



Elliott Mitchell
2005-03-12 07:24:46 UTC
Permalink
Post by Kenneth Porter
--On Friday, March 11, 2005 8:29 PM +0000 Konstantin 'Kosta' Welke
Post by Konstantin 'Kosta' Welke
Also, the torrents are already too big; they should be a URL, like http
or ed2k do it. That's why there are efforts to eliminate them in bt2.
Looking at the spec <http://www.bittorrent.com/protocol.html>, I see a
filename (or list of filename/length pairs), tracker URL, and a set of
fragment checksums. Most of the size would be the checksums. How do you
propose shrinking this to fit in a URL? I suppose you could designate
trusted peers (eg. initial seeds) to host the checksums.
There is talk of the BT2 protocol. Something similar to that would likely
be done.
Post by Kenneth Porter
Post by Konstantin 'Kosta' Welke
Post by Kenneth Porter
Is bencoding so special? If the original BT protocol had been written in
Perl, would everyone be raving about Data::Dumper format? (And I recall
that PHP has its own data structure dump format.)
bencoding is not some "binary dump" format. You can think of it as "XML
without the bloat", i.e. Easy to parse, space-efficient, easy to extend,
but harder to read for the human eye.
My point was just that bencoding is a Python-centric structure dump.
Data::Dumper is the Perl equivalent, and PHP also has a structure dump.
Python-centric? Doesn't look that way to me. The bencode format might have
been created with Python in mind, but it in no way imposes anything that
can be considered Python-centric. Should take less than 5 minutes of work
to produce a parser for your language of choice.
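
As a rough check on that estimate, here is a minimal decoder sketch in Python. It is an illustration rather than a production parser: it does not reject unsorted dictionary keys or integers with leading zeros, and the function names are made up.

```python
def bdecode(data: bytes):
    """Decode one bencoded value that spans the whole input."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value

def _decode(data: bytes, i: int):
    """Return (value, index just past the value) starting at index i."""
    c = data[i:i + 1]
    if c == b"i":                          # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                          # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = _decode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                          # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = _decode(data, i)
            result[key], i = _decode(data, i)
        return result, i + 1
    if c.isdigit():                        # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        start = colon + 1
        end = start + int(data[i:colon])
        if end > len(data):
            raise ValueError("byte string runs past end of input")
        return data[start:end], end
    raise ValueError("unexpected byte at offset %d" % i)

print(bdecode(b"d3:bar4:spam3:fooi42ee"))  # {b'bar': b'spam', b'foo': 42}
```

The whole format has four cases, which is why ports to other languages tend to be short.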
Post by Kenneth Porter
Post by Konstantin 'Kosta' Welke
[info_hash's] provide proof that the data blocks have been transferred
successfully. This includes showing that the peer isn't sending you
garbage, hoping to fool you into giving credit for pieces you haven't
transferred.
Looking at the Tracker GET request in the protocol spec at
<http://www.bittorrent.com/protocol.html>, I don't see how the info_hash
provides any integrity. I understand how the fragment hashes do that, but
those are computed from the payload file, not the metadata file, and aren't
sensitive to the representation of the metadata.
I may have messed up contexts.

Hashing the payload is the way to go for the info_hash. Since the payload
is what you care about, that is the natural thing to identify the torrent
to the tracker. You don't want to hash the metadata file, as that is
merely a means to an end, not the end in and of itself.
Post by Kenneth Porter
I also don't see any mechanism for preventing spoofing of one's statistics
to the tracker.
There isn't. Bram's thought is not to use the statistics for anything but
information. The reasoning is that you can't prevent the spoofing
thereof, so why bother? As long as you use them *strictly* for
informational purposes, there is no incentive to spoof them.
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\ ( | ***@gremlin.m5p.com PGP 8881EF59 | ) /
\_ \ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
\___\_|_/82 04 A1 3C C7 B1 37 2A*E3 6E 84 DA 97 4C 40 E6\_|_/___/





Konstantin 'Kosta' Welke
2005-03-11 20:29:45 UTC
Permalink
Post by Kenneth Porter
Yeah, I don't see the point of saving 20 bytes when we're dealing with a
protocol designed to transfer hundreds of megabytes.
Penny-wise/pound-foolish. For larger XML files, the "bloat" can easily be
dealt with using commonplace compression like deflate. For example,
OpenOffice documents are stored in zipped XML collections. Much of the
bloat is caused by the large in-line descriptive tags, and those are
natural candidates for leveraging dictionary-based compression.
Yeah, bencoding is hard for humans to read. But gzip'ed XML is better?
Okay, you can unzip it to look at it, but that doesn't really seem elegant
to me... Writing an efficient, complete XML parser isn't done in 5 minutes. Okay,
many languages already have one, and an XML torrent will be a really simple
kind of document, so you can just code one in a few lines.

From a coder's point of view, it adds complexity without real advantages
(you need an XML parser, a Base64 decoder and gzip). If you can use
gzip to look at some XML, you can also use some existing graphical
torrent file viewer or use some tool that converts it into a simple
text file that will be much, much easier to read than stupid XML.

Also, the torrents are already too big; they should be a URL, like http or ed2k
do it. That's why there are efforts to eliminate them in bt2.
Post by Kenneth Porter
To me, the advantages of XML are extensibility,
There is nothing we need that can't be done with bencoding.
Post by Kenneth Porter
self-documentation,
Hmm, from a program's point of view, it's the same with bencoding.
Post by Kenneth Porter
and easy validation.
Bencoding is easier to validate.
Post by Kenneth Porter
Plus the wide availability of tools to manipulate the
format.
What do you want to manipulate that can't be done with BitTorrent
tools?
Post by Kenneth Porter
This makes it easy to implement the tracker part of new clients in
any language that has an XML parser, and makes it easy to extend the
protocol.
The protocol is already very easy to extend, as you can see by the
numerous extensions floating around :)
Post by Kenneth Porter
Since the tracker acts as a web server, and XML has web origins,
it also makes it natural to adapt existing web machinery to do the job.
Sorry, this is not a rational point.
Post by Kenneth Porter
Is bencoding so special? If the original BT protocol had been written in
Perl, would everyone be raving about Data::Dumper format? (And I recall
that PHP has its own data structure dump format.)
bencoding is not some "binary dump" format. You can think of it as "XML without
the bloat", i.e. Easy to parse, space-efficient, easy to extend, but harder to
read for the human eye.
Post by Kenneth Porter
(BTW, note that I'm not proposing XML for the peer protocol, which is much
more performance-sensitive. For now I'm just talking about the metadata
file (ie. the .torrent).)
That's good because it would just be completely braindead. But XML for
.torrent files can be discussed. I just don't think it would make sense.

Kosta



Kenneth Porter
2005-03-12 00:16:30 UTC
Permalink
--On Friday, March 11, 2005 8:29 PM +0000 Konstantin 'Kosta' Welke
Post by Konstantin 'Kosta' Welke
Also, the torrents are already too big; they should be a URL, like http
or ed2k do it. That's why there are efforts to eliminate them in bt2.
Looking at the spec <http://www.bittorrent.com/protocol.html>, I see a
filename (or list of filename/length pairs), tracker URL, and a set of
fragment checksums. Most of the size would be the checksums. How do you
propose shrinking this to fit in a URL? I suppose you could designate
trusted peers (eg. initial seeds) to host the checksums.
Post by Konstantin 'Kosta' Welke
Post by Kenneth Porter
Is bencoding so special? If the original BT protocol had been written in
Perl, would everyone be raving about Data::Dumper format? (And I recall
that PHP has its own data structure dump format.)
bencoding is not some "binary dump" format. You can think of it as "XML
without the bloat", i.e. Easy to parse, space-efficient, easy to extend,
but harder to read for the human eye.
My point was just that bencoding is a Python-centric structure dump.
Data::Dumper is the Perl equivalent, and PHP also has a structure dump.





Olaf van der Spek
2005-03-12 10:47:11 UTC
Permalink
Post by Kenneth Porter
--On Friday, March 11, 2005 8:29 PM +0000 Konstantin 'Kosta' Welke
Post by Konstantin 'Kosta' Welke
Also, the torrents are already too big; they should be a URL, like http
or ed2k do it. That's why there are efforts to eliminate them in bt2.
Looking at the spec <http://www.bittorrent.com/protocol.html>, I see a
filename (or list of filename/length pairs), tracker URL, and a set of
fragment checksums. Most of the size would be the checksums. How do you
propose shrinking this to fit in a URL? I suppose you could designate
trusted peers (eg. initial seeds) to host the checksums.
You don't need to trust them.
The info_hash 'guards' the info key, so to start a torrent, you only
need info_hash and tracker. And that fits nicely in a URL.
The info key can then be requested from other peers.
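
A sketch of the check this relies on, in Python. How the info bytes would actually be fetched from a peer is not specified here and is left out; the function name and the sample data are hypothetical.

```python
import hashlib
import hmac

def info_matches(raw_info: bytes, info_hash: bytes) -> bool:
    """True if the received bencoded info dictionary hashes to info_hash."""
    return hmac.compare_digest(hashlib.sha1(raw_info).digest(), info_hash)

# info_hash as it would arrive in the URL; placeholder info dictionary.
genuine = b"d6:lengthi1e4:name7:examplee"
info_hash = hashlib.sha1(genuine).digest()

# A peer sending altered metadata is caught without being trusted.
tampered = b"d6:lengthi2e4:name7:examplee"
print(info_matches(genuine, info_hash), info_matches(tampered, info_hash))
# True False
```

The check has to run on the exact bytes received, which is one more reason the info portion needs a single canonical encoding.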



Konstantin 'Kosta' Welke
2005-03-12 14:18:54 UTC
Permalink
Post by Kenneth Porter
--On Friday, March 11, 2005 8:29 PM +0000 Konstantin 'Kosta' Welke
Looking at the spec <http://www.bittorrent.com/protocol.html>, I see a
filename (or list of filename/length pairs), tracker URL, and a set of
fragment checksums. Most of the size would be the checksums. How do you
propose shrinking this to fit in a URL? I suppose you could designate
trusted peers (eg. initial seeds) to host the checksums.
Olaf made a promising approach in that direction. You can take a look
at http://62.216.18.38/bt_merkle/
This is also about what we expect from bt2. There's lots of discussion
going on here, centered on how best to implement this idea.
Post by Kenneth Porter
Post by Konstantin 'Kosta' Welke
bencoding is not some "binary dump" format. You can think of it as "XML
without the bloat", i.e. Easy to parse, space-efficient, easy to extend,
but harder to read for the human eye.
My point was just that bencoding is a Python-centric structure dump.
Data::Dumper is the Perl equivalent, and PHP also has a structure dump.
Your point is wrong.
Bencoding was either created especially for BitTorrent or has been
adopted by it. It is not Python's structure dump. Ask your favorite
search engine about "python serialization" and you will find lots
of pages about Pickler, but none about bencoding. Extending your
search to bencoding, you will find a nice page that implements
bencoding in 5 languages (4 scripting and 1 functional).

Also note that bencoding is not sufficient to store a generic
Python object.

Where did you get the idea that bencoding was in any way
Python-specific or -centric?

HTH,
Kosta



Kenneth Porter
2005-03-12 17:57:10 UTC
Permalink
--On Saturday, March 12, 2005 2:18 PM +0000 Konstantin 'Kosta' Welke
Post by Konstantin 'Kosta' Welke
Where did you get the idea that bencoding was in any way
Python-specific or -centric?
I stand corrected. Probably got the idea from reading too many mailing
lists. (I'm on several dozen for all kinds of packages.) I probably ran yum
and BT together in my head.






Joseph Ashwood
2005-03-12 10:55:23 UTC
Permalink
----- Original Message -----
From: "Kenneth Porter" <***@sewingwitch.com>
Subject: Re: [BitTorrent] XML Torrents?
Post by Kenneth Porter
Looking at the spec <http://www.bittorrent.com/protocol.html>, I see a
filename (or list of filename/length pairs), tracker URL, and a set of
fragment checksums. Most of the size would be the checksums. How do you
propose shrinking this to fit in a URL? I suppose you could designate
trusted peers (eg. initial seeds) to host the checksums.
I believe the concept there is to (as we have been discussing in great
detail with very mixed results) use Merkle trees; this eliminates the long
list of hashes at the end. The proposal also includes the presumption that
multi-file torrents have to be redesigned substantially (again many concepts
and mixed results). By doing this the necessary elements of the torrent
become tracker, filename, root hash. There would have to be some changes in
the formatting of the URI, basically just encoding, but the end result could
look something like:

BitTorrent://tracker.shutmedownbecausemyfilesareillegal.com:1234/announce?filename=pirated+movie.ogm&roothash=1234567890123456789012345678901234567890123456789012345678900123456789001234567890123456&hash_function=SHA-512
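
For readers unfamiliar with the idea, here is one way a root hash for such a URI could be computed, sketched in Python. The tree layout is an assumption made for this example (binary tree, last node duplicated on odd levels); the list has not agreed on a construction, and the function name and sample pieces are made up.

```python
import hashlib

def merkle_root(pieces, hash_name="sha1"):
    """Root of a binary hash tree over the pieces (assumed layout)."""
    if not pieces:
        raise ValueError("need at least one piece")

    def h(data):
        return hashlib.new(hash_name, data).digest()

    level = [h(p) for p in pieces]          # leaf hashes
    while len(level) > 1:
        if len(level) % 2:                  # odd count: duplicate last node
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# Hypothetical three-piece file.
pieces = [b"piece 0", b"piece 1", b"piece 2"]
root = merkle_root(pieces, "sha512")
print(len(root), root.hex()[:16])
```

Only the root needs to be published; the inner hashes can be fetched from untrusted peers and checked against it.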

I assumed the inclusion of the recently (briefly) discussed hash_function
argument, and obviously SHA-512, although I am not sure "-" is valid in
URIs. Any authorities? Even then, we have the option (assuming the
tracker or peer protocol is altered to support this; I would vote peer) of
reducing the URL so that it does not include the filename either.
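
On the "-" question: RFC 2396 lists "-" among its unreserved "mark" characters, so it needs no escaping. A short Python sketch of taking such a URI apart follows; the host name, file name and root hash value are shortened placeholders, not the ones from the example above.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Placeholder URI in the proposed shape.
uri = ("BitTorrent://tracker.example.com:1234/announce"
       "?filename=example+file.ogm"
       "&roothash=0123456789abcdef"
       "&hash_function=SHA-512")

parts = urlsplit(uri)
params = parse_qs(parts.query)

print(parts.hostname, parts.port, parts.path)
print(params["filename"][0])        # "+" decodes to a space
print(params["hash_function"][0])

# "-" passes through percent-encoding unchanged.
print(quote("SHA-512") == "SHA-512")
```

Dropping the filename, as suggested, would only mean omitting one query parameter.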

Even without the further optimizations at 218 bytes this forms a usably
sized, if hard to remember, URI. The same information could be stored in any
other convenient format as well, with bencoding of course being the default,
an XML definition (even though so many people don't like it), etc. It fits
easily into a TCP/IP, Ethernet, or Wi-Fi frame, but does not fit
into an ATM frame; I'm not sure about modem frames, as I haven't looked at that
information for quite some time. The frame fitting is not critical but it
can relieve overhead in a system, and was one of the primary reasons for the
size limits on URI in the beginning (IIRC 4096 bytes to match the maximum
ethernet frame, and yes I am old school enough to have actually done heavy
optimization based on this and achieved throughputs higher than the
theoretical maximum because of it).

I actually think that this poses a very interesting new area for BitTorrent,
or a BitTorrent-like protocol, and at the very least something that can help
in optimizing the system.
Joe




Harold Feit - Depthstrike.com Administrator
2005-03-13 01:14:11 UTC
Permalink
-----Original Message-----
From: Joseph Ashwood [mailto:***@msn.com]
Sent: Saturday, March 12, 2005 6:55 AM
To: ***@yahoogroups.com
Subject: Re: [BitTorrent] XML Torrents?


I assumed the inclusion of the recently (briefly) discussed hash_function
argument, and obviously SHA-512, although I am not sure "-" is valid in
URIs. Any authorities? Even at that we have the option of (assuming the
tracker or peer protocol is altered to support this, I would vote peer) the
URL can be reduced to not include the filename either.
-----End Original Message-----
RFC 2396 defines the valid characters and encoding for URIs if that's
of any help for you in the construction of an example URI.

-----Original Message-----
From: Konstantin 'Kosta' Welke [mailto:***@fillibach.de]
Sent: Saturday, March 12, 2005 10:19 AM
To: ***@yahoogroups.com
Subject: Re: [BitTorrent] XML Torrents?

Your point is wrong.
Bencoding was either created especially for BitTorrent or has been
adopted by it. It is not Python's structure dump. Ask your favorite
search engine about "python serialization" and you will find lots
of pages about Pickler, but none about bencoding. Extending your
search to bencoding, you will find a nice page that implements
bencoding in 5 languages (4 scripting and 1 functional).

Also note that bencoding is not sufficient to store a generic
Python object.

Where did you get the idea that bencoding was in any way
Python-specific or -centric?
-----End Original Message-----

I agree that bencoding is NOT Python-specific. I have libraries
sitting on my program development system for bencode in PHP (has some
problems with 10-digit numbers), Visual Basic.NET/C#, C++ and
probably a couple of other languages I don't even realise. (I actively
use the PHP and VB.NET/C# ones in new programs/scripts.)

David Smith
2005-03-11 02:40:28 UTC
Permalink
Post by Konstantin 'Kosta' Welke
XML is bloated. XML has lots of stuff we don't need, and it takes up
much, much more space. Bencoding is a nice, easily machine-readable,
compact format.
Bottom line: there is no reason to switch to XML. And we won't escape
the different character encoding difficulties that way.
I agree. We don't NEED XML - yet.

However: Bencoding is not fun, and it's not easily HUMAN-readable.

I hate reading through long lines of Bencoded/binary/hard text trying to
discern meaning. That's it.

XML has much more support for a more robust, human-recognizable structure.
When the time comes, I'll be more than willing to implement the XML
structures.

Also, I don't want to have to re-invent tools to manipulate Bencoded
structures. I don't enjoy re-inventing the wheel.

David Smith
Michigan State University
***@msu.edu
248.770.5524

