Discussion:
Request for protocol extension: get_info/info messages
Olaf van der Spek
2003-11-10 14:47:25 UTC
Permalink
Hi,

As follow-up to my .torrent-less download thread, I'd like two messages to
be added to the BT protocol: get_info and info.
get_info has no payload
info has a payload containing the info dictionary (bencoded, from the .torrent)
get_info could be sent after the handshake and before other messages
info should be sent after receiving a handshake and get_info, but before
receiving other messages
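
A rough sketch of how the two messages might be framed, assuming they follow the existing peer wire convention of a 4-byte big-endian length prefix followed by a 1-byte message ID. The ID values 0x40 and 0x41 and the helper names are placeholders picked for illustration; the proposal does not assign any.

```python
import struct

# Hypothetical message IDs; the proposal above does not assign numbers.
MSG_GET_INFO = 0x40
MSG_INFO = 0x41


def encode_message(msg_id: int, payload: bytes = b"") -> bytes:
    """Frame a message as <4-byte big-endian length><1-byte id><payload>."""
    return struct.pack(">IB", len(payload) + 1, msg_id) + payload


def decode_message(data: bytes):
    """Return (msg_id, payload) for exactly one complete framed message."""
    if len(data) < 5:
        raise ValueError("message too short")
    (length,) = struct.unpack(">I", data[:4])
    if length < 1 or len(data) != 4 + length:
        raise ValueError("truncated or malformed message")
    return data[4], data[5:]


# get_info carries no payload; info carries the bencoded info dictionary.
get_info = encode_message(MSG_GET_INFO)
info = encode_message(MSG_INFO, b"d6:lengthi1024e4:name4:teste")
```

A receiver would still have to check the SHA1 of the info payload against the info_hash from the handshake before trusting it.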

I'd also like to propose a URL format for downloads instead of a .torrent
file:
btp:protocol:host:port/info_hash/peers/
btp:udp:192.168.1.1:2710/%01%23%45%67%89%ab%cd%ef%01%23%45%67%89%ab%cd%ef%01%23%45%67/192.168.1.2:6881 (example)

btp: Bit Torrent Protocol
protocol: http, but could be udp in the future
host: IP address or hostname of tracker
port: TCP or UDP port of tracker
info_hash: URL-encoded SHA1 hash of the info key
peers: optional list of IP address/port combinations of listening peers
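
A minimal sketch of a parser for this URL format. The function name is made up, and two details are assumptions because the proposal leaves them open: multiple peers are taken to be comma-separated, and hosts are assumed to be hostnames or IPv4 addresses (no bracketed IPv6).

```python
from urllib.parse import unquote_to_bytes


def parse_btp_url(url: str):
    """Split btp:protocol:host:port/info_hash/peers/ into its parts."""
    scheme, protocol, rest = url.split(":", 2)
    if scheme != "btp":
        raise ValueError("not a btp URL")
    parts = rest.rstrip("/").split("/")
    if len(parts) < 2:
        raise ValueError("missing info_hash")
    host, port = parts[0].rsplit(":", 1)
    info_hash = unquote_to_bytes(parts[1])
    if len(info_hash) != 20:
        raise ValueError("info_hash must be a 20-byte SHA1 digest")
    peers = []
    # Assumption: several peers are separated by commas.
    if len(parts) > 2 and parts[2]:
        for entry in parts[2].split(","):
            peer_host, peer_port = entry.rsplit(":", 1)
            peers.append((peer_host, int(peer_port)))
    return protocol, host, int(port), info_hash, peers


example = (
    "btp:udp:192.168.1.1:2710/"
    "%01%23%45%67%89%ab%cd%ef%01%23%45%67%89%ab%cd%ef%01%23%45%67/"
    "192.168.1.2:6881"
)
protocol, host, port, info_hash, peers = parse_btp_url(example)
```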

The info key of a .torrent should be made optional. When it's not present, an
info_hash key should be present instead, containing the SHA1 hash of the info
key.
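
To illustrate, here is a small sketch that replaces the info key of a decoded .torrent with an info_hash key, and checks an info blob received later against that hash. The bencoder is deliberately minimal and the function names are invented for this example. A real client would hash the exact bytes found in the .torrent rather than re-encoding them.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, byte strings, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        items = sorted(value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def strip_info(torrent: dict) -> dict:
    """Return a copy with 'info' replaced by its SHA1 as 'info_hash'."""
    slim = {key: val for key, val in torrent.items() if key != b"info"}
    slim[b"info_hash"] = hashlib.sha1(bencode(torrent[b"info"])).digest()
    return slim


def info_matches(info_blob: bytes, info_hash: bytes) -> bool:
    """Check a received bencoded info blob against the expected hash."""
    return hashlib.sha1(info_blob).digest() == info_hash


torrent = {
    b"announce": b"http://tracker.example/announce",
    b"info": {b"length": 1024, b"name": b"test"},
}
slim = strip_info(torrent)
```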

Olaf van der Spek
Almere, Holland
http://xccu.sourceforge.net/



To unsubscribe from this group, send an email to:
BitTorrent-***@yahoogroups.com



Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
iain_wade
2005-01-12 05:09:28 UTC
Permalink
Post by Olaf van der Spek
As follow-up to my .torrent-less download thread, I'd like two messages to
be added to the BT protocol: get_info and info.
get_info has no payload
info has a payload containing the info dictionary (bencoded, from the .torrent)
get_info could be sent after the handshake and before other messages
info should be sent after receiving a handshake and get_info, but before
receiving other messages
Hello Bram, and list members,

I would also like to see a get_info/info extension added.

I work for an ISP and we are eager to cache BitTorrent content to
lighten the load on our network links as much as possible, as well as
accelerate the performance for our users (and yours). Win/win.

The intention of our caching peer is to have it passively listening
for connections solely from our own customer base, not to participate
in the general torrent distribution.

To be able to do this, we need two things. The first is to get the
clients to connect to us. We need that to happen by default as relying
on user configuration would limit this feature to a small fraction of
the user base. (I have written an HTTP proxy for tracker communications
which adds itself to the list of peers before returning the response,
but getting a few hundred thousand people to change their settings is
going to be a problem for us.)

I would like the following patch to be integrated into the official
BitTorrent client (as well as any other clients) for this reason. It
does a single dns lookup each time the program starts for
"btcache.p2p" and adds the ip addresses returned to the peer list:

https://habitue.net/projects/bt/btcache.patch
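
For readers without access to the patch, the idea can be sketched roughly as follows. This is not the contents of the linked patch: the function names, the injectable resolver argument and the port 6881 are all assumptions made for illustration.

```python
import socket

CACHE_HOSTNAME = "btcache.p2p"
CACHE_PORT = 6881  # assumed port; the post does not say which port is used


def discover_cache_peers(resolve=socket.gethostbyname_ex, port=CACHE_PORT):
    """Do one lookup for the cache name; return [] if it does not resolve."""
    try:
        _name, _aliases, addresses = resolve(CACHE_HOSTNAME)
    except OSError:  # socket.gaierror is a subclass of OSError
        return []
    return [(address, port) for address in addresses]


def add_cache_peers(peers, resolve=socket.gethostbyname_ex):
    """Append any cache addresses that are not already in the peer list."""
    known = set(peers)
    for peer in discover_cache_peers(resolve):
        if peer not in known:
            peers.append(peer)
            known.add(peer)
    return peers
```

Calling add_cache_peers once at startup matches the "single DNS lookup each time the program starts" described above; on networks without a cache the lookup simply fails and the peer list is left untouched.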

The second feature would be the get_info/info extension so we can
obtain the "pieces" and "piece length" fields needed for sensible
torrent participation.

https://habitue.net/projects/bt/btgetinfo.patch

If people are against including the getinfo patch for any reason, I
would still be eager to see the first patch included as I have some
code written which will "probe" the block size (powers of 2) and skip
data checking when it doesn't have the piece hashes.

I've got the .torrent-less caching server code written and freely
available under a GPL license.

https://habitue.net/projects/bt/

If anyone has any feedback I'd be glad to hear it.

Regards,
--Iain

Note: I understand JoltId (http://www.joltid.com) have been
contacting bittorrent client authors and paying for them to implement
a caching proxy solution locked in to their proprietary PeerCache
protocol. I would hope any agreements with them do not preclude the
fruits of our own work (open source and freely available) being
accepted.






Yahoo! Groups Links

<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/BitTorrent/

<*> To unsubscribe from this group, send an email to:
BitTorrent-***@yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/
Elliott Mitchell
2005-01-12 07:07:00 UTC
Permalink
Post by iain_wade
The intention of our caching peer is to have it passively listening
for connections solely from our own customer base, not to participate
in the general torrent distribution.
To be able to do this, we need two things. The first is to get the
clients to connect to us. We need that to happen by default as relying
on user configuration would limit this feature to a small fraction of
the user base. (I have a http proxy (for tracker communications)
written which will add itself to the list of peers before returning,
but getting a few hundred thousand people to change their settings is
going to be a problem for us).
Why not replace the returned list with *just* your cache?

The simplest approach might be to return one IP:port pair for each peer
the tracker returns. By then mirroring the client's actions, the cache can
imitate the client's behavior reasonably well and apply tit for tat
correctly to each peer. Add one extra record that is purely the cache, to
provide cached blocks (this prevents a peer from getting over-credited).
Post by iain_wade
I would like the following patch to be integrated into the official
BitTorrent client (as well as any other clients) for this reason. It
does a single dns lookup each time the program starts for
https://habitue.net/projects/bt/btcache.patch
Problem is this pollutes things for folks without a cache in front of
them. Why is this needed? If you can proxy the tracker connection, why is
this better than modifying the response?
Post by iain_wade
The second feature would be the get_info/info extension so we can
obtain the "pieces" and "piece length" fields needed for sensible
torrent participation.
https://habitue.net/projects/bt/btgetinfo.patch
Why do you need to ask the client this information for sensible
participation? You can proxy with the piece# plus offset being the keys
and everything will be fine. Work on 16/32K blocks and it works fine (either
you have to join 16K blocks together for clients that use 32K blocks, or
split 32K blocks for clients with 16K blocks). If you see the client ask
for the piece again, you can guess that you've got bogus data in your
cache.
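
One way to picture the piece# plus offset keying is the sketch below. It assumes the cache stores everything in 16K units, so a 32K block is split on the way in and a 32K request is joined on the way out. The class and method names are made up for this example, and the re-request heuristic is reduced to a simple invalidate call.

```python
BLOCK = 16 * 1024  # cache granularity: 16K units


class BlockCache:
    """Blocks keyed by (info_hash, piece, offset); no piece hashes needed."""

    def __init__(self):
        self._blocks = {}

    def store(self, info_hash, piece, offset, data):
        """Split an incoming 16K or 32K block into 16K units."""
        if offset % BLOCK:
            raise ValueError("offset must be 16K-aligned")
        for start in range(0, len(data), BLOCK):
            key = (info_hash, piece, offset + start)
            self._blocks[key] = data[start:start + BLOCK]

    def fetch(self, info_hash, piece, offset, length):
        """Join cached units for a request; return None on any miss."""
        if offset % BLOCK:
            return None
        chunks = []
        for pos in range(offset, offset + length, BLOCK):
            chunk = self._blocks.get((info_hash, piece, pos))
            if chunk is None:
                return None
            chunks.append(chunk)
        data = b"".join(chunks)
        return data[:length] if len(data) >= length else None

    def invalidate_piece(self, info_hash, piece):
        """Drop a piece, e.g. after a client asks for it a second time."""
        for key in [k for k in self._blocks if k[:2] == (info_hash, piece)]:
            del self._blocks[key]
```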

I've suggested the "by_hash" mode to solve a similar situation. The nice
part about by_hash is you also add deniability for the cache. You're
caching and cannot be forced to reveal or police what your clients are
downloading (because you cannot know).
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\ ( | ***@gremlin.m5p.com PGP 8881EF59 | ) /
\_ \ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
\___\_|_/82 04 A1 3C C7 B1 37 2A*E3 6E 84 DA 97 4C 40 E6\_|_/___/





iain_wade
2005-01-12 07:57:49 UTC
Permalink
Post by Elliott Mitchell
Post by iain_wade
The intention of our caching peer is to have it passively
listening for connections solely from our own customer base,
not to participate in the general torrent distribution.
To be able to do this, we need two things. The first is to get
the clients to connect to us. We need that to happen by default
as relying on user configuration would limit this feature to
a small fraction of the user base. (I have a http proxy (for
tracker communications) written which will add itself to the
list of peers before returning, but getting a few hundred
thousand people to change their settings is going to be a
problem for us).
Why not replace the returned list with *just* your cache?
The simplest approach might be to return one IP:port pair for
each peer the tracker returns. By then imitating the client
actions, you can reasonably effectively imitate how the client
is acting and correctly tit for tat each peer. Add in one extra
record that is purely the cache to provide cached blocks
(prevents a peer from getting over-credited).
Because that would require proxying all communications to the outside
world, which causes a number of problems (and helps out with a few
others).

It also significantly complicates the code required.

Think performance and scalability.

We have significantly more broadband customers than the 64k available
ports, let alone multiplying that by the number of peers they are
talking to.

As implemented now though, there is an option on the proxy to just
return itself in which case you will only receive blocks that other
users have already downloaded and are available in the cache.

As for implementing a tit-for-tat algorithm for "correctness". This
cache just slams the data out as fast as it can, like any good cache
should.
Post by Elliott Mitchell
Post by iain_wade
I would like the following patch to be integrated into the
official BitTorrent client (as well as any other clients) for
this reason. It does a single dns lookup each time the program
starts for "btcache.p2p" and adds the ip addresses returned to
https://habitue.net/projects/bt/btcache.patch
Problem is this pollutes things for folks without a cache in front
of them. Why is this needed? If you can proxy the tracker
connection, why is this better than modifying the response?
I don't consider one extra DNS lookup per program start "pollution".

We have a plugin available for Azureus which caches the results for
three days, but I didn't want to complicate this patch and there was
no configuration/storage facility available in the official client.

I can only proxy tracker connections when the user specifically adds
a proxy setting (i.e. set http_proxy environment variable with the
official client, or Azureus has a GUI setting).

With almost a million customers and a quarter of a million broadband
ones, communicating that change becomes too big a deal. For Windows
XP users running the official client it means going into an obscure
control panel and adjusting their environment variables which could
possibly affect other software.

Caching is most effective when more people are using it.
The best case from our perspective is to get it enabled by default.

The source to the cache is available for other folks to run as well.
If it is as effective as expected then others may choose to run it.
Post by Elliott Mitchell
Post by iain_wade
The second feature would be the get_info/info extension so we can
obtain the "pieces" and "piece length" fields needed for sensible
torrent participation.
https://habitue.net/projects/bt/btgetinfo.patch
Why do you need to ask the client this information for sensible
participation? You can proxy with the piece# plus offset being the
keys and everything will be fine. Work on 16/32K blocks and it works
fine (either you have to join 16K blocks together for clients that
use 32K blocks, or split 32K blocks for clients with 16K blocks).
If you see the client ask for the piece again, you can guess that
you've got bogus data in your cache.
I've suggested the "by_hash" mode to solve a similar situation.
The nice part about by_hash is you also add deniability for the
cache. You're caching and cannot be forced to reveal or police
what your clients are downloading (because you cannot know).
A man-in-the-middle proxy for all communications is not a scalable
option for peer to peer networks. One option would be to implement the
proxy cache only for actual data transfers, but that would require a much
larger and more invasive patch, and I believe the current approach will be
almost as effective.

I'm not interested in deniability, I want to save bandwidth. I don't
save logs, so associating a blob of data on our disk with a user is
not possible.

God I hate this YahooGroups interface :-/

Regards,
--Iain






Olaf van der Spek
2005-01-12 19:37:00 UTC
Permalink
Post by iain_wade
Post by Elliott Mitchell
Post by iain_wade
The intention of our caching peer is to have it passively
listening for connections solely from our own customer base,
not to participate in the general torrent distribution.
To be able to do this, we need two things. The first is to get
the clients to connect to us. We need that to happen by default
as relying on user configuration would limit this feature to
a small fraction of the user base. (I have a http proxy (for
tracker communications) written which will add itself to the
list of peers before returning, but getting a few hundred
thousand people to change their settings is going to be a
problem for us).
Why not replace the returned list with *just* your cache?
The simplest approach might be to return one IP:port pair for
each peer the tracker returns. By then imitating the client
actions, you can reasonably effectively imitate how the client
is acting and correctly tit for tat each peer. Add in one extra
record that is purely the cache to provide cached blocks
(prevents a peer from getting over-credited).
Because that would require proxying all communications to the outside
world, which causes a number of problems (and helps out with a few
others).
It also significantly complicates the code required.
Think performance and scalability.
We have significantly more broadband customers than the 64k available
ports, let alone multiplying that by the number of peers they are
talking to.
You only need a single port. Only the tuple (source IP address, source port,
destination IP address, destination port) has to be unique.




Elliott Mitchell
2005-01-13 00:14:17 UTC
Permalink
Post by iain_wade
Post by Elliott Mitchell
Post by iain_wade
The intention of our caching peer is to have it passively
listening for connections solely from our own customer base,
not to participate in the general torrent distribution.
To be able to do this, we need two things. The first is to get
the clients to connect to us. We need that to happen by default
as relying on user configuration would limit this feature to
a small fraction of the user base. (I have a http proxy (for
tracker communications) written which will add itself to the
list of peers before returning, but getting a few hundred
thousand people to change their settings is going to be a
problem for us).
Why not replace the returned list with *just* your cache?
The simplest approach might be to return one IP:port pair for
each peer the tracker returns. By then imitating the client
actions, you can reasonably effectively imitate how the client
is acting and correctly tit for tat each peer. Add in one extra
record that is purely the cache to provide cached blocks
(prevents a peer from getting over-credited).
Because that would require proxying all communications to the outside
world, which causes a number of problems (and helps out with a few
others).
True.
Post by iain_wade
It also significantly complicates the code required.
Makes it so that care must be taken when writing it, but I must disagree.
Should be roughly the same amount of code.
Post by iain_wade
Think performance and scalability.
Assign different pieces to a series of hosts, assign different ranges of
IP addresses to be proxyed through different servers.
Post by iain_wade
We have significantly more broadband customers than the 64k available
ports, let alone multiplying that by the number of peers they are
talking to.
So?

A connection is defined by the tuple of source address, source port,
destination address and destination port. You won't even come close to
exhausting that space. You will need multiple connections per port, which
is not the most commonly used setup, but is entirely within the
capabilities of any OS.
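
A small loopback demonstration of the point, written as a sketch rather than anything from the thread: a single listening port accepts several connections, and each accepted socket shares the same local address and port while differing in the remote half of the tuple. The function name is invented for this example.

```python
import socket


def same_port_connections(count=2):
    """Accept `count` connections on one port; return (local, remote) pairs."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clients, accepted = [], []
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(count)
        address = listener.getsockname()
        for _ in range(count):
            clients.append(socket.create_connection(address))
            conn, _peer = listener.accept()
            accepted.append(conn)
        return [(conn.getsockname(), conn.getpeername()) for conn in accepted]
    finally:
        for sock in clients + accepted + [listener]:
            sock.close()


pairs = same_port_connections(3)
local_ends = {local for local, _remote in pairs}
remote_ends = {remote for _local, remote in pairs}
```

Here local_ends holds a single address and port, while remote_ends holds one entry per client, since each client connects from its own ephemeral port.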
Post by iain_wade
As implemented now though, there is an option on the proxy to just
return itself in which case you will only receive blocks that other
users have already downloaded and are available in the cache.
Problem is the client will attempt to get more peers if you only give it
one.
Post by iain_wade
As for implementing a tit-for-tat algorithm for "correctness". This
cache just slams the data out as fast as it can, like any good cache
should.
In other words you're caching for the whole world? Great, what is your
proxy's IP address and port? I'd love such an unlimited cache!

My point was that in the scenario I described you don't need to
implement tit for tat. You only need to proxy the client's actions. If
a client receives a block and disconnects you know the peer is goofy.
Post by iain_wade
Post by Elliott Mitchell
Post by iain_wade
I would like the following patch to be integrated into the
official BitTorrent client (as well as any other clients) for
this reason. It does a single dns lookup each time the program
starts for "btcache.p2p" and adds the ip addresses returned to
https://habitue.net/projects/bt/btcache.patch
Problem is this pollutes things for folks without a cache in front
of them. Why is this needed? If you can proxy the tracker
connection, why is this better than modifying the response?
I don't consider one extra DNS lookup per program start "pollution".
One lookup for an invalid domain. Slowing most systems. This also
strongly advertises your use of BitTorrent to the ISP, who shouldn't be
informed for privacy reasons (they can analyze the traffic to find BT,
but your suggestion provides a clear indicator).
Post by iain_wade
With almost a million customers and a quarter of a million broadband
ones, communicating that change becomes too big a deal. For Windows
XP users running the official client it means going into an obscure
control panel and adjusting their environment variables which could
possibly affect other software.
Your million customers represent less than 1% of the Internet. Making
a cache easy to utilize is acceptable, making intrusive changes IMO is
not.
Post by iain_wade
Caching is most effective when more people are using it.
The best case from our perspective is to get it enabled by default.
The source to the cache is available for other folks to run as well.
If it is as effective as expected then others may choose to run it.
Thing is there will be plenty who either explicitly do not wish to use
it, or don't have one handy. For them it is a disadvantage.
Post by iain_wade
Post by Elliott Mitchell
Post by iain_wade
The second feature would be the get_info/info extension so we can
obtain the "pieces" and "piece length" fields needed for sensible
torrent participation.
https://habitue.net/projects/bt/btgetinfo.patch
Why do you need to ask the client this information for sensible
participation? You can proxy with the piece# plus offset being the
keys and everything will be fine. Work on 16/32K blocks and it works
fine (either you have to join 16K blocks together for clients that
use 32K blocks, or split 32K blocks for clients with 16K blocks).
If you see the client ask for the piece again, you can guess that
you've got bogus data in your cache.
I've suggested the "by_hash" mode to solve a similar situation.
The nice part about by_hash is you also add deniability for the
cache. You're caching and cannot be forced to reveal or police
what your clients are downloading (because you cannot know).
A man-in-the-middle proxy for all communications is not a scalable
option for peer to peer networks. One option would be to implement the
proxy cache only for actual data transfers but that would require a much
larger and more invasive patch and I believe this will be almost as
effective.
And a non-MitM cache is likely to scale? Seems like they'll both run into
a wall at about the same time.
Post by iain_wade
I'm not interested in deniability, I want to save bandwidth. I don't
save logs, so associating a blob of data on our disk with a user is
not possible.
Good goal. Admirable position. My concern is that less scrupulous folks
may choose to log, or could be forced to log via court order. I'd like it
to be that keeping logs isn't useful because they cannot be made to yield
any information.
Post by iain_wade
God I hate this YahooGroups interface :-/
Join the club, though as a mailing list it does mostly work.
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\ ( | ***@gremlin.m5p.com PGP 8881EF59 | ) /
\_ \ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
\___\_|_/82 04 A1 3C C7 B1 37 2A*E3 6E 84 DA 97 4C 40 E6\_|_/___/





Iain Wade
2005-01-13 01:31:31 UTC
Permalink
Post by Elliott Mitchell
Post by iain_wade
Post by Elliott Mitchell
Problem is this pollutes things for folks without a cache in front
of them. Why is this needed? If you can proxy the tracker
connection, why is this better than modifying the response?
I don't consider one extra DNS lookup per program start "pollution".
One lookup for an invalid domain. Slowing most systems. This also
strongly advertises your use of BitTorrent to the ISP, who shouldn't be
informed for privacy reasons (they can analyze the traffic to find BT,
but your suggestion provides a clear indicator).
Using BitTorrent is not a crime.

Using p2p is not a crime.

Copyright infringement is a crime and looking up btcache.p2p does not
notify anyone as to whether you are doing so or not.

This could be used by an ISP to determine if an individual is using a
BitTorrent client, but they would still need to snoop all your traffic
to see what you are downloading.

This could be used by an ISP to determine what percentage of their
users are using BitTorrent clients, but in aggregate form this
information does not breach your privacy.
Post by Elliott Mitchell
Your million customers represent less than 1% of the Internet. Making
a cache easy to utilize is acceptable, making intrusive changes IMO is
not.
I agree.

These are the smallest number of changes possible IMO.

As I said, I would have liked to cache the result of the query but the
official client in particular does not have a config saving facility
currently and I didn't feel the need to add one.

Other clients will most likely cache the results for a number of days,
limiting the impact of this change even further.
Post by Elliott Mitchell
Thing is there will be plenty who either explicitly do not wish to use
it, or don't have one handy. For them it is a disadvantage.
I think you are misjudging the impact of a single DNS lookup.

Every time you type an address into your web-browser your machine
would perform at least 4 lookups.

an "AAAA" lookup for bittorrent.com.my.search.domain.
an "AAAA" lookup for bittorrent.com.
an "A" lookup for bittorrent.com.my.search.domain.
an "A" lookup for bittorrent.com.

Some clients would perform more if they have a couple of search suffixes.

In contrast, this change performs a single extra "A" record lookup at startup.
Post by Elliott Mitchell
And a non-MitM cache is likely to scale? Seems like they'll both run into
a wall at about the same time.
The difference is that when a MitM cache hits the wall performance
goes to shit for all the users passing through that system.

The non-MitM cache only augments client participation, it doesn't control it.
Post by Elliott Mitchell
Post by iain_wade
I'm not interested in deniability, I want to save bandwidth. I don't
save logs, so associating a blob of data on our disk with a user is
not possible.
Good goal. Admirable position. My concern is that less scrupulous folks
may choose to log, or could be forced to log via court order. I'd like it
to be that keeping logs isn't useful because they cannot be made to yield
any information.
Again I disagree. ISPs are probably less likely to run tcpdump on
their nameservers looking to see which users are trying to resolve
btcache.p2p than they are to perform a court-ordered legal intercept
on all of a specific user's traffic.

Trust me, we've gotten really used to handling legal intercepts since 9/11.
Post by Elliott Mitchell
Post by iain_wade
God I hate this YahooGroups interface :-/
Join the club, though as a mailing list it does mostly work.
I'm getting significant lag here :(

--Iain



Elliott Mitchell
2005-01-15 08:00:05 UTC
Permalink
Post by Iain Wade
The problem here is that it is still susceptible to corruption. You
can't trust the other parties in a p2p download without being able to
verify what they've said. The info blob can be verified by comparing
the hash of it to the info_hash on the connection. This ensures all
clients connected with that info_hash are talking about the same
content.
Also, this extension would be more effective in the short term before
its adoption is widespread because with the get_info/info extension
only one client needs support for the feature and the client only
needs to be available for a short time to upload the info blob, but
with yours at least one client supporting the feature would need to be
available at all times to support checking.
Thing is "You can't trust other parties" applies to your scheme too. I
can poison your cache by asking for a torrent, then giving you bogus info
hash data. I can also generate thousands of garbage header sets, compute
appropriate info hashes for these and overwhelm you with garbage
torrents. I can even generate a header set so huge you don't have enough
disk space for just the header set. Either you're going to run out of
disk space, or you're going to start discarding valid torrents.

The nice part about the incremental schemes is you don't store any data
other than the pieces. Store the piece in a file named for the hash and
nothing else.

Though it involves the most change to the protocol/code, designating
pieces by their hash means the least number of attack avenues. I can't
give you bogus hashes (no one will send anything) nor can I cause you to
repeatedly download anything (the hash will check, no redownload).
Anything I tell you to download *must* be valid. The worst I can do is
waste your bandwidth equally with my own.
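
A minimal sketch of the "file named for the hash" idea, assuming SHA1 piece hashes as in the rest of the protocol. The class name and the small demo at the end are invented for illustration. Because the file name is the expected hash, data that does not match is rejected before it ever reaches the disk.

```python
import hashlib
import os
import tempfile


class PieceStore:
    """Stores each piece in a file named after its SHA1 hash, nothing else."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, piece_hash: bytes) -> str:
        return os.path.join(self.directory, piece_hash.hex())

    def put(self, piece_hash: bytes, data: bytes) -> bool:
        """Write the piece only if it hashes to the name it was requested by."""
        if hashlib.sha1(data).digest() != piece_hash:
            return False
        with open(self._path(piece_hash), "wb") as handle:
            handle.write(data)
        return True

    def get(self, piece_hash: bytes):
        """Return the piece bytes, or None if the piece is not cached."""
        try:
            with open(self._path(piece_hash), "rb") as handle:
                return handle.read()
        except FileNotFoundError:
            return None


with tempfile.TemporaryDirectory() as root:
    store = PieceStore(root)
    piece = b"example piece data"
    digest = hashlib.sha1(piece).digest()
    stored = store.put(digest, piece)          # accepted: hash matches
    rejected = store.put(digest, b"tampered")  # refused: hash mismatch
    fetched = store.get(digest)
    missing = store.get(hashlib.sha1(b"other").digest())
```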
Post by Iain Wade
Post by Bill Cox
Couldn't this feature be used to keep track of all user downloads? I
know this is already possible for ISPs with basic traffic snooping, but
I think users might worry about an automatic feature that tells his ISP
that he's about to download a file set. I'd like to think that my ISP
tries hard not to look too closely at my traffic. It'd feel like a
small invasion of privacy otherwise.
It couldn't track downloads. It could alert the ISP of the use of a
BitTorrent client, but really I think if the ISP cared they can
already determine this and users should be aware that FastTrack
clients already look up "cache.p2p" and eMule already looks up
"edcache.p2p".
Just because they've made a poor choice is no reason to repeat it.
Post by Iain Wade
Post by Bill Cox
Thing is there will be plenty who either explicitly do not wish to use
it, or don't have one handy. For them it is a disadvantage.
I think you are misjudging the impact of a single DNS lookup.
Every time you type an address into your web-browser your machine
would perform at least 4 lookups.
an "AAAA" lookup for bittorrent.com.my.search.domain.
an "AAAA" lookup for bittorrent.com.
an "A" lookup for bittorrent.com.my.search.domain.
an "A" lookup for bittorrent.com.
Some clients would perform more if they have a couple of search suffixes.
In contrast, this change performs a single extra "A" record lookup at startup.
I reject your analysis here.

The bittorrent.com.my.search.domain. requests only happen if a local
search domain has been defined (this is not a certainty). If a local
search domain has been defined, this will likely be over ethernet or
other bandwidth endowed connection, in which case these can be ignored.

A client will be forced to look for AAAA, A6, and A records for your
bogus domain before giving up; the exact same number as for a valid
lookup. Worse, the nameserver will have to go all the way to the root
nameservers for this query, most likely none of the records will be in a
local cache. So, in the general case you've at least doubled the lookup
time.
Post by Iain Wade
Post by Bill Cox
And a non-MitM cache is likely to scale? Seems like they'll both run into
a wall at about the same time.
The difference is that when a MitM cache hits the wall performance
goes to shit for all the users passing through that system.
Oh, that issue. You're already having to MitM the tracker query so this
is already an issue. You can go back to a combo strategy, return the
tracker's query but add spoof records for each of the returned ones.

We may be decaying into client implementation concerns here.
Post by Iain Wade
Post by Bill Cox
Good goal. Admirable position. My concern is that less scrupulous folks
may choose to log, or could be forced to log via court order. I'd like it
to be that keeping logs isn't useful because they cannot be made to yield
any information.
Again I disagree. ISPs are probably less likely to run tcpdump on
their nameservers looking to see which users are trying to resolve
btcache.p2p than they are to perform a court-ordered legal intercept
on all of a specific user's traffic.
Trust me, we've gotten really used to handling legal intercepts since 9/11.
If there is no cache then the query will go out to the Internet
announcing that there is a client present. If there is a cache then an
attacker will at some point become aware of it, and be attacking it. If
there is no cache then you're announcing your existence where before
someone would have had a lot of work to even figure out that you existed.
Post by Iain Wade
Post by Bill Cox
Post by iain_wade
God, I hate this YahooGroups interface :-/
Join the club, though as a mailing list it does mostly work.
I'm getting significant lag here :(
This is a moderated list. Wait a while and you should get on the white
list.
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\ ( | ***@gremlin.m5p.com PGP 8881EF59 | ) /
\_ \ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
\___\_|_/82 04 A1 3C C7 B1 37 2A*E3 6E 84 DA 97 4C 40 E6\_|_/___/





Olaf van der Spek
2005-01-15 09:27:40 UTC
Permalink
Post by Elliott Mitchell
Thing is "You can't trust other parties" applies to your scheme too. I
can poison your cache by asking for a torrent, then giving you bogus info
hash data. I can also generate thousands of garbage header sets, compute
appropriate info hashes for these and overwhelm you with garbage
torrents. I can even generate a header set so huge you don't have enough
disk space for just the header set. Either you're going to run out of
disk space, or you're going to start discarding valid torrents.
Doesn't that depend on your replace policy?
Post by Elliott Mitchell
A client will be forced to look for both AAAA, A6, and A records for your
What's A6?



Elliott Mitchell
2005-01-17 23:44:02 UTC
Permalink
Post by Olaf van der Spek
Post by Elliott Mitchell
Thing is "You can't trust other parties" applies to your scheme too. I
can poison your cache by asking for a torrent, then giving you bogus info
hash data. I can also generate thousands of garbage header sets, compute
appropriate info hashes for these and overwhelm you with garbage
torrents. I can even generate a header set so huge you don't have enough
disk space for just the header set. Either you're going to run out of
disk space, or you're going to start discarding valid torrents.
Doesn't that depend on your replace policy?
Yes, but it means another thing of significant size to worry about. You
haven't said anything about the more serious problem, cache poisoning. If
my evil client can ask for a particular torrent before anyone else and
give bogus hashes, you're dead.
Post by Olaf van der Spek
Post by Elliott Mitchell
A client will be forced to look for both AAAA, A6, and A records for your
What's A6?
The other IPv6 record. Looks to be disappearing, but a client might still
try to get it.
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\ ( | ***@gremlin.m5p.com PGP 8881EF59 | ) /
\_ \ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
\___\_|_/82 04 A1 3C C7 B1 37 2A*E3 6E 84 DA 97 4C 40 E6\_|_/___/





Iain Wade
2005-01-18 00:11:34 UTC
Permalink
Post by Elliott Mitchell
Yes, but it means another thing of significant size to worry about. You
haven't said anything about the more serious problem, cache poisoning. If
my evil client can ask for a particular torrent before anyone else and
give bogus hashes, you're dead.
This can't happen .. if you connect using an info_hash, and the cache
asks you for the info blob because you are the first client connecting
with that hash, you can't reply with bogus data because the (sha1)hash
of the info blob won't match the info_hash you connected with. If you
connect with an info_hash designed around your bogus blob then no
other users will be using that info_hash and you can't affect them.

That is the whole point of sending this field. If there was another
way to get trust-worthy info out of the peer I would be very happy.
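
As a rough illustration of the check described above (a sketch only, not code
from the patch): the cache hashes whatever info blob a peer sends and compares
it to the info_hash from that peer's handshake. The function name and the
sample blob are made up for the example.

```python
import hashlib


def info_blob_matches(info_blob: bytes, info_hash: bytes) -> bool:
    """Return True if the bencoded info blob hashes to the handshake's info_hash."""
    return hashlib.sha1(info_blob).digest() == info_hash


# Hypothetical bencoded info dictionary, used only as sample input.
genuine = (b"d6:lengthi4e4:name1:a12:piece lengthi4e6:pieces20:"
           + b"\x00" * 20 + b"e")
handshake_hash = hashlib.sha1(genuine).digest()

# The genuine blob is accepted; any substituted blob is rejected.
accepted = info_blob_matches(genuine, handshake_hash)
rejected = info_blob_matches(b"d4:name5:boguse", handshake_hash)
```

A peer that wants its bogus blob accepted has to connect with the hash of
that blob, which is a different info_hash from the one everyone else uses.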

--Iain



Iain Wade
2005-01-17 05:05:44 UTC
Permalink
Hello again,

I hope everyone had a good weekend.
Post by Elliott Mitchell
Thing is "You can't trust other parties" applies to your scheme too. I
can poison your cache by asking for a torrent, then giving you bogus info
hash data. I can also generate thousands of garbage header sets, compute
appropriate info hashes for these and overwhelm you with garbage
torrents. I can even generate a header set so huge you don't have enough
disk space for just the header set. Either you're going to run out of
disk space, or you're going to start discarding valid torrents.
This is all solvable in the cache/peer software.
The cache can choose not to ask for info blobs until a few clients
are concurrently requesting it.
The cache has a maximum message size, it would disconnect on messages
over a certain size.
A hybrid LRU/hit-rate cache cleaning selection algorithm will keep
disk utilisation in check.
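
The three measures above could be sketched roughly as follows. This is only
an illustration under assumptions: the thresholds, the class name InfoCache
and the scoring formula (hits divided by idle time) are invented for the
example and are not taken from the cache code.

```python
import time

MAX_INFO_BYTES = 1 << 20   # assumed cap on an info message
MIN_REQUESTERS = 3         # assumed number of concurrent clients before asking


class InfoCache:
    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = {}  # info_hash -> [blob, hits, last_used]

    def should_request(self, concurrent_clients):
        # Only ask a peer for the info blob once enough clients want it.
        return concurrent_clients >= MIN_REQUESTERS

    def store(self, info_hash, blob, now=None):
        if len(blob) > MAX_INFO_BYTES:
            return False  # the caller would disconnect this peer
        now = time.time() if now is None else now
        if info_hash not in self.entries and len(self.entries) >= self.max_entries:
            self._evict(now)
        self.entries[info_hash] = [blob, 0, now]
        return True

    def get(self, info_hash, now=None):
        entry = self.entries.get(info_hash)
        if entry is None:
            return None
        entry[1] += 1
        entry[2] = time.time() if now is None else now
        return entry[0]

    def _evict(self, now):
        # Hybrid LRU/hit-rate score: few hits and long idle time score lowest.
        def score(item):
            _, hits, last_used = item[1]
            return (hits + 1) / (now - last_used + 1.0)

        victim = min(self.entries.items(), key=score)[0]
        del self.entries[victim]
```

With max_entries=2, an entry that was read three times recently survives
when a third torrent arrives, and the untouched one is dropped.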
Post by Elliott Mitchell
The nice part about the incremental schemes is you don't store any data
other than the pieces. Store the piece in a file named for the hash and
nothing else.
Though it involves the most change to the protocol/code, designating
pieces by their hash means the least number of attack avenues. I can't
give you bogus hashes (no one will send anything) nor can I cause you to
repeatedly download anything (the hash will check, no redownload).
Anything I tell you to download *must* be valid. The worst I can do is
waste your bandwidth equally with my own.
I think I misunderstand your proposal. Do you have a reference page?

I agree that my proposal is not optimal. I have taken this path because
it requires the fewest code changes to clients, and I expected/hoped
that would mean quicker acceptance by the involved parties.

How wrong I was :-)
Post by Elliott Mitchell
Post by Iain Wade
Post by Bill Cox
Couldn't this feature be used to keep track of all user downloads? I
know this is already possible for ISPs with basic traffic snooping, but
I think users might worry about an automatic feature that tells his ISP
that he's about to download a file set. I'd like to think that my ISP
tries hard not to look too closely at my traffic. It'd feel like a
small invasion of privacy otherwise.
It couldn't track downloads. It could alert the ISP of the use of a
BitTorrent client, but really I think if the ISP cared they can
already determine this and users should be aware that FastTrack
clients already lookup "cache.p2p" and emule already look up
"edcache.p2p".
Just because they've made a poor choice is no reason to repeat it.
Heh. My approach to this problem has been shaped by my exposure to
kazaa/emule, but I am having trouble seeing a superior alternative.
Post by Elliott Mitchell
Post by Iain Wade
Post by Bill Cox
Thing is there will be plenty who either explicitly do not wish to use
it, or don't have one handy. For them it is a disadvantage.
I think you are misjudging the impact of a single DNS lookup.
Every time you type an address into your web-browser your machine
would perform at least 4 lookups.
an "AAAA" lookup for bittorrent.com.my.search.domain.
an "AAAA" lookup for bittorrent.com.
an "A" lookup for bittorrent.com.my.search.domain.
an "A" lookup for bittorrent.com.
Some clients would perform more if they have a couple of search suffixes.
In contrast, this change performs a single extra "A" record lookup at startup.
I reject your analysis here.
The bittorrent.com.my.search.domain. requests only happen if a local
search domain has been defined (this is not a certainty). If a local
search domain has been defined, this will likely be over ethernet or
other bandwidth endowed connection, in which case these can be ignored.
A client will be forced to look for both AAAA, A6, and A records for your
bogus domain before giving up; the exact same number as for a valid
lookup. Worse, the nameserver will have to go all the way to the root
nameservers for this query, most likely none of the records will be in a
local cache. So, in the general case you've at least doubled the lookup
time.
Apparently not :-)

In practice the python gethostbyname lookup doesn't do AAAA lookups by
default, at least on my redhat/fedora systems.

As for hitting root nameservers I expect this would not be an issue as
bind implements a negative lookup cache by default, so a large volume
of these lookups would be handled relatively efficiently at the ISP.

I thought about limiting the scope of that lookup to the local domain
suffix but I am not sure of how to do so in a cross platform way (and
especially not in python).
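
For reference, the startup lookup being argued about amounts to something
like the sketch below. It is not the contents of btcache.patch; the function
name and the injectable resolve parameter are assumptions added so the
behaviour can be shown without touching the network. Python documents
socket.gethostbyname as IPv4-only, which fits the observation that no AAAA
query is sent.

```python
import socket

CACHE_HOSTNAME = "btcache.p2p"  # the name discussed in this thread


def find_cache(resolve=socket.gethostbyname):
    """Return the cache's IPv4 address, or None when no cache is advertised."""
    try:
        return resolve(CACHE_HOSTNAME)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when the name
        # does not resolve, which is the normal case outside a caching ISP.
        return None


def _no_such_host(name):
    raise socket.gaierror("name not found")


# Simulated outcomes: an ISP that answers, and one that does not.
with_cache = find_cache(lambda name: "10.0.0.5")
without_cache = find_cache(_no_such_host)
```

A client would call find_cache() once at startup and, if it returns an
address, add that address to its peer list.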
Post by Elliott Mitchell
Post by Iain Wade
Post by Bill Cox
And a non-MitM cache is likely to scale? Seems like they'll both run into
a wall at about the same time.
The difference is that when a MitM cache hits the wall performance
goes to shit for all the users passing through that system.
Oh, that issue. You're already having to MitM the tracker query so this
is already an issue. You can go back to a combo strategy, return the
tracker's query but add spoof records for each of the returned ones.
The goal is to not run the mitm tracker proxy .. it sucks quite honestly.

It's open to abuse because it cannot differentiate a tracker request
from any other type of http request (it's currently just not passing
non-bencoded responses back to end users, with a few update-check
exceptions for azureus).

The dns lookup negates the need for this component altogether.
Post by Elliott Mitchell
We may be decaying into client implementation concerns here.
I agree.

I still hope these patches will be included as I don't believe there
is a fundamental problem with them and they opened up other
possibilities as outlined in Olaf's original post (i.e. light weight
bittorrent url's which contain the info_hash and a few bootstrap
addresses, letting the clients download the torrent body from the peer
and start traversing that bootstrap hosts' peer list).

We've talked an awful lot about a 5-line DNS lookup patch.
Post by Elliott Mitchell
Post by Iain Wade
Post by Bill Cox
Good goal. Admirable position. My concern is that less scrupulous folks
may choose to log, or could be forced to log via court order. I'd like it
to be that keeping logs isn't useful because they cannot be made to yield
any information.
Again I disagree. ISPs are probably less likely to run tcpdump on
their nameservers looking to see which users are trying to resolve
btcache.p2p than they are to perform a court-ordered legal intercept
on all of a specific user's traffic.
Trust me, we've gotten really used to handling legal intercepts since 9/11.
If there is no cache then the query will go out to the Internet
announcing that there is a client present. If there is a cache then an
attacker will at some point become aware of it, and be attacking it. If
there is no cache then you're announcing your existence where before
someone would have had a lot of work to even figure out that you existed.
if I made it configurable for the paranoid, would you be happy?

can you see any alternative way of achieving the goal: connect to the
cache of my p2p friendly ISP to accelerate my downloads?

--Iain



Elliott Mitchell
2005-02-06 08:36:48 UTC
Permalink
Post by Iain Wade
I think I misunderstand your proposal. Do you have a reference page?
My main concerns were elsewhere. On further reflection, my proposal
doesn't really help or hurt yours. I concede, and my apologies for
cluttering your thread with non-core issues.


I maintain my stand on one point though.

I believe your lookup of btcache.p2p shouldn't be enabled by default.
This in my mind advertises your use of BT to folks who have no business
knowing. Certainly prominently advertising the existence of such an
option is a good thing though.


Against Bill, rather than you: the hashes *must* be uploaded to the
cache. Otherwise the cache might download a piece, claim to HAVE it,
and only afterwards discover it is bad. A client could also request the
download of a piece and then claim the piece was incorrect.
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\ ( | ***@gremlin.m5p.com PGP 8881EF59 | ) /
\_ \ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
\___\_|_/82 04 A1 3C C7 B1 37 2A*E3 6E 84 DA 97 4C 40 E6\_|_/___/





iain_wade
2005-01-14 00:12:26 UTC
Permalink
Post by Elliott Mitchell
Post by iain_wade
Post by Elliott Mitchell
Problem is this pollutes things for folks without a cache in
front of them. Why is this needed? If you can proxy the tracker
connection, why is this better than modifying the response?
I don't consider one extra DNS lookup per program start
"pollution".
One lookup for an invalid domain. Slowing most systems. This also
strongly advertises your use of BitTorrent to the ISP, who
shouldn't be informed for privacy reasons (they can analyze the
traffic to find BT, but your suggestion provides a clear indicator).
Using BitTorrent is not a crime.

Using p2p is not a crime.

Copyright infringement is a crime and looking up btcache.p2p does not
notify anyone as to whether you are doing so or not.

This could be used by an ISP to determine if an individual is using a
BitTorrent client, but they would still need to snoop all your traffic
to see what you are downloading.

This could be used by an ISP to determine what percentage of their
users are using BitTorrent clients, but in aggregate form this
information does not breach your privacy.
Post by Elliott Mitchell
Your million customers represent less than 1% of the Internet.
Making a cache easy to utilize is acceptable, making intrusive
changes IMO is not.
I agree.

These are the smallest number of changes possible IMO.

As I said, I would have liked to cache the result of the query but the
official client in particular does not have a config saving facility
currently and I didn't feel the need to add one.

Other clients will most likely cache the results for a number of days,
limiting the impact of this change even further.
Post by Elliott Mitchell
Thing is there will be plenty who either explicitly do not wish to
use it, or don't have one handy. For them it is a disadvantage.
I think you are misjudging the impact of a single DNS lookup.

Every time you type an address into your web-browser your machine
would probably perform at least 4 dns lookups.

an "AAAA" lookup for bittorrent.com.my.search.domain.
an "AAAA" lookup for bittorrent.com.
an "A" lookup for bittorrent.com.my.search.domain.
an "A" lookup for bittorrent.com.

You would perform more if you have a couple of search suffixes, or if
there are inline images.

In contrast, this change performs a single extra "A" record lookup at
startup.
Post by Elliott Mitchell
And a non-MitM cache is likely to scale? Seems like they'll both
run into a wall at about the same time.
The difference is that when a MitM cache hits the wall performance
goes to shit for all the users passing through that system.

The non-MitM cache only augments client participation, it doesn't
control it.
Post by Elliott Mitchell
Post by iain_wade
I'm not interested in deniability, I want to save bandwidth. I
don't save logs, so associating a blob of data on our disk with
a user is not possible.
Good goal. Admirable position. My concern is that less scrupulous
folks may choose to log, or could be forced to log via court order.
I'd like it to be that keeping logs isn't useful because they
cannot be made to yield any information.
Again I disagree. ISPs are probably less likely to run tcpdump on
their nameservers looking to see which users are trying to resolve
btcache.p2p than they are to perform a court-ordered legal intercept
on all of a specific user's traffic.

Trust me, we've gotten really used to handling legal intercepts since
9/11.
Post by Elliott Mitchell
Post by iain_wade
God, I hate this YahooGroups interface :-/
Join the club, though as a mailing list it does mostly work.
I'm getting significant lag here :(

And it didn't accept my last two posts when sent from gmail. maybe the
from address didn't match my list subscription or something :(

--Iain






Bill Cox
2005-01-12 11:56:25 UTC
Permalink
Post by Olaf van der Spek
Post by Olaf van der Spek
As follow-up to my .torrent-less download thread, I'd like two messages to
be added to the BT protocol: get_info and info.
get_info has no payload
info has a payload containing info (bencoded, from .torrent)
get_info could be sent after the handshake and before other messages
info should be sent after receiving a handshake and get_info, but before
receiving other messages
Hello Bram, and list members,
I would also like to see a get_info/info extension added.
Very interesting work. This is also a very interesting problem.

I couldn't find the original message double-quoted above, so I'm taking
what's listed above out of context...

To use get_info/info for .torrent-less downloads, users would already
need to know where to find a tracker and the info_hash value.
Otherwise, they wouldn't know what peers to contact for the info, and
wouldn't be able to send a valid handshake. Is there a scheme for
finding these? In my proposed BT friends protocol extension
(btslave.sf.net/btfriends.html), I send a request_help message before
the handshake (I know, it sounds weird, but it simply works better
before the handshake, not after). Its load provides just enough
information to build a basic torrent object and find the tracker. Then,
the handshake can proceed as usual.

Another problem with the get_info/info extension is that the load of the
info message can be hundreds of KBytes. That's a big message. Even if
you download the message correctly, a user might kill the session early,
wasting the effort of sending the torrent's info in the first place.

In my BT friends extension, I do it differently. After receiving a
piece, I compute its SHA1, and then send a 'piece_info' message to a
peer that has the piece and supports the extension. Its load is the
piece index and my computed SHA1 value for the piece. The peer replies
with a 'piece_correct' or 'piece_incorrect' message. Once I've received
a 'piece_correct' message, I send my HAVE messages.

This way, only small messages are sent, and the torrent info is only
sent as you need it.
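
To make the exchange concrete, here is a minimal sketch of the two sides.
The byte layout (a 4-byte big-endian piece index followed by the 20-byte
SHA1) and both function names are assumptions for illustration; the BT
friends page is the place to look for the real wire format.

```python
import hashlib
import struct


def build_piece_info(index: int, piece: bytes) -> bytes:
    """Downloader side: piece index plus the SHA1 computed over the piece."""
    return struct.pack(">I", index) + hashlib.sha1(piece).digest()


def check_piece_info(payload: bytes, piece_hashes: list) -> str:
    """Peer side: compare the claimed SHA1 with the hash from its own .torrent."""
    (index,) = struct.unpack(">I", payload[:4])
    claimed = payload[4:24]
    if index < len(piece_hashes) and claimed == piece_hashes[index]:
        return "piece_correct"
    return "piece_incorrect"


# Hashes the answering peer already holds from its .torrent (sample data).
known_hashes = [hashlib.sha1(b"piece zero").digest(),
                hashlib.sha1(b"piece one").digest()]

good_reply = check_piece_info(build_piece_info(1, b"piece one"), known_hashes)
bad_reply = check_piece_info(build_piece_info(1, b"corrupted"), known_hashes)
```

Each message is 24 bytes under this layout, regardless of how many pieces
the torrent has.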
Post by Olaf van der Spek
I work for an ISP and we are eager to cache bittorrent content to
lighten the load on our network links as much as possible as well as
accelerate the performance for our (and yours) users. win/win.
The intention of our caching peer is to have it passively listening
for connections solely from our own customer base, not to participate
in the general torrent distribution.
Reducing the external network traffic for an ISP will obviously save
them money, but you have to be careful not to upset either your
customers or the movie and music industries.

I'd suggest thinking of your program as a repeater, rather than a cache.
I assume that your cache acts just like a normal client: download as
fast as you can, and once you've got the whole thing start acting as a
seeder. This has some real problems. You'd download all kinds of
torrents, and probably wind up with lots of terabytes of data. The
small problem is that disk cache is expensive. The big problem is that
the music/movie industry will probably have a nice chat with your
employer about seeding their works. Another problem is that you're not
helping the torrent while you're downloading.

A repeater, on the other hand, downloads pieces only when it sees a
chance to help its peers. If no connected peers are interested in the
pieces you currently have, go get another piece. Otherwise, just seed
the pieces you have. Once you've got a bunch of uninteresting pieces,
exit the torrent, delete the pieces, and re-enter the torrent. I've
prototyped this scheme, and it seems to work quite well for torrents
that have at least a few downloading peers. It helps with all of the
problems listed above. The down-side is it can't help out in torrents
with only one downloading peer. It would be interesting to know what
percentage of download traffic comes from very active torrents vs fairly
inactive torrents.
Post by Olaf van der Spek
To be able to do this, we need two things. The first is to get the
clients to connect to us. We need that to happen by default as relying
on user configuration would limit this feature to a small fraction of
the user base. (I have a http proxy (for tracker communications)
written which will add itself to the list of peers before returning,
but getting a few hundred thousand people to change their settings is
going to be a problem for us).
I would like the following patch to be integrated into the official
BitTorrent client (as well as any other clients) for this reason. It
does a single dns lookup each time the program starts for
https://habitue.net/projects/bt/btcache.patch
Is the idea that this would allow ISPs to spoof the btcache.p2p
domain, and return their own BitTorrent server's address? Then anyone
outside an ISP supporting this would get the lookup error, and thus no
additional peers?

Couldn't this feature be used to keep track of all user downloads? I
know this is already possible for ISPs with basic traffic snooping, but
I think users might worry about an automatic feature that tells his ISP
that he's about to download a file set. I'd like to think that my ISP
tries hard not to look too closely at my traffic. It'd feel like a
small invasion of privacy otherwise.

Here's a potential alternative: with the BT friends extension, your
client could go to the torrents where peers meet to make friends, and
only make friends with your customers. When they need help downloading
from a torrent, they'll contact you with a request_help message, and you
can go from there. They don't need to know that you're working for
their ISP, and hopefully you won't keep a log of the communication.
Post by Olaf van der Spek
The second feature would be the get_info/info extension so we can
obtain the "pieces" and "piece length" fields needed for sensible
torrent participation.
https://habitue.net/projects/bt/btgetinfo.patch
If people are against including the getinfo patch for any reason, I
would still be eager to see the first patch included as I have some
code written which will "probe" the block size (powers of 2) and not
do any data checking if I don't have the pieces hashes.
I've got the .torrent-less caching server code written and freely
available under a GPL license.
https://habitue.net/projects/bt/
If anyone has any feedback I'd be glad to hear it.
Regards,
--Iain
Note// I understand JoltId (http://www.joltid.com) have been
contacting bittorrent client authors and paying for them to implement
a caching proxy solution locked in to their proprietary PeerCache
protocol. I would hope any agreements with them do not preclude the
fruits of our own work (open source and freely available) being
accepted.
Scary... I'm not against a guy making a few bucks, or ISPs saving
money, but these guys are seriously into spy-ware.

IMO, this is a good reason for the open-source community to help solve
the caching problem for free. So, here's my two cents on how to make it
actually happen...

To get support from users, BT client authors, and Bram, I think you need
a good reason. Saving the ISP money sounds good, but let's face it: no
one cares (unless they're being bribed to care).

Here's a good reason for average Joe to care: you help him download
faster. If you do that, I think the path to acceptance will be much
easier.

My proposed BT friends protocol can help average Joe download faster
even if he's one of two guys in the world using the protocol (or one guy
who runs it at home and at work). I'm pretty confident that it will
catch on quickly when it's done. People like downloading faster for
free.

Is there some way to cooperate?

Bill





Olaf van der Spek
2005-01-12 19:46:37 UTC
Permalink
Post by Bill Cox
Post by Olaf van der Spek
Post by Olaf van der Spek
As follow-up to my .torrent-less download thread, I'd like two messages to
be added to the BT protocol: get_info and info.
get_info has no payload
info has a payload containing info (bencoded, from .torrent)
get_info could be sent after the handshake and before other messages
info should be sent after receiving a handshake and get_info, but before
receiving other messages
Hello Bram, and list members,
I would also like to see a get_info/info extension added.
Very interesting work. This is also a very interesting problem.
I couldn't find the original message double-quoted above, so I'm taking
It's from a long, long time ago.
I skimmed over my own quote and wondered why it looked so familiar. ;->
Post by Bill Cox
what's listed above out of context...
To use get_info/info for .torrent-less downloads, users would already
need to know where to find a tracker and the info_hash value.
Otherwise, they wouldn't know what peers to contact for the info, and
wouldn't be able to send a valid handshake. Is there a scheme for
finding these? In my proposed BT friends protocol extension
The idea was to put those two things in a URL.
Post by Bill Cox
Another problem with the get_info/info extension is that the load of the
info message can be hundreds of KBytes. That's a big message. Even if
Isn't it between 40 and 100 kbyte most of the time?
That's still large. This works best with merkle trees, where the size of the
.torrent is reduced to 1 kb.





Iain Wade
2005-01-13 00:59:58 UTC
Permalink
Post by Bill Cox
To use get_info/info for .torrent-less downloads, users would already
need to know where to find a tracker and the info_hash value.
I think Olaf was originally proposing that a new URL be defined which
includes the bare minimum of information (info_hash and a few peers)
needed, the client would then connect to the peers and download the
info body and ask for some more peers and it would grow from there.
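
For anyone who missed the original post, the proposed URL looked like
btp:protocol:host:port/info_hash/peers/. A rough parsing sketch follows. It
assumes the info_hash is percent-encoded as in Olaf's example and, since the
proposal does not say how several peers are separated, it assumes commas.
The function name is invented for the example.

```python
from urllib.parse import unquote_to_bytes


def parse_btp_url(url: str) -> dict:
    scheme, protocol, rest = url.split(":", 2)
    if scheme != "btp":
        raise ValueError("not a btp URL")
    tracker, info_hash_part, *peer_parts = rest.split("/")
    host, _, port = tracker.rpartition(":")
    info_hash = unquote_to_bytes(info_hash_part)
    if len(info_hash) != 20:
        raise ValueError("info_hash must be 20 bytes")
    peers = []
    for segment in peer_parts:
        # Assumption: multiple peers are comma-separated.
        for item in segment.split(","):
            if item:
                peer_host, _, peer_port = item.rpartition(":")
                peers.append((peer_host, int(peer_port)))
    return {"protocol": protocol, "host": host, "port": int(port),
            "info_hash": info_hash, "peers": peers}


# The example URL from the original proposal, joined onto one line.
example = ("btp:udp:192.168.1.1:2710/"
           "%01%23%45%67%89%ab%cd%ef%01%23%45%67%89%ab%cd%ef%01%23%45%67"
           "/192.168.1.2:6881")
parsed = parse_btp_url(example)
```

The client would then handshake with the listed peers using parsed["info_hash"]
and fetch the info body with get_info.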

I don't intend to do any of that, I just want the clients on my
network to connect to me. I'll then start asking them for pieces and
share them around with other users who are downloading the same thing.

It's not the most efficient method, but it is the smallest change
possible and therefore the most likely to be accepted (I hope).
Post by Bill Cox
Another problem with the get_info/info extension is that the load of the
info message can be hundreds of KBytes. That's a big message. Even if
you download the message correctly, a user might kill the session early,
wasting the effort of sending the torrent's info in the first place.
The cache only needs to ask one client, once. The performance benefit
to the user gained from using this cache would far exceed the overhead
incurred from a once off upload of a hundred KB. This burden would
probably be distributed amongst the users over time because the first
person connected to torrentA would likely be a different person to the
first on torrentB.
Post by Bill Cox
In my BT friends extension, I do it differently. After receiving a
piece, I compute its SHA1, and then send a 'piece_info' message to a
peer that has the piece and supports the extension. Its load is the
piece index and my computed SHA1 value for the piece. The peer replies
with a 'piece_correct' or 'piece_incorrect' message. Once I've received
a 'piece_correct' message, I send my HAVE messages.
This way, only small messages are sent, and the torrent info is only
sent as you need it.
The problem here is that it is still susceptible to corruption. You
can't trust the other parties in a p2p download without being able to
verify what they've said. The info blob can be verified by comparing
the hash of it to the info_hash on the connection. This ensures all
clients connected with that info_hash are talking about the same
content.

Also, this extension would be more effective in the short term before
its adoption is widespread because with the get_info/info extension
only one client needs support for the feature and the client only
needs to be available for a short time to upload the info blob, but
with yours at least one client supporting the feature would need to be
available at all times to support checking.
Post by Bill Cox
Reducing the external network traffic for an ISP will obviously save
them money, but you have to be careful not to upset either your
customers or the movie and music industries.
This is being done for the customers. I'm sure they will have nothing
but praise once their downloads start powering ahead at a few hundred
KB/s as soon as they connect.

I'm not a lawyer and don't care one bit about the sensitivities of the
recording and movie industries. If they choose to sue my company we
would either stop doing what we're doing, or if the legal eagles think
it's winnable we might fight them. Not my problem.

We are running a cache for kazaa and edonkey traffic right now and
other copyright infringing content is cached every second of every day
on our http proxy and email servers. In an ISP, caching this content
is a necessity for the smooth running of the network and unavoidable.

There are already commercial peer-to-peer caching solutions available,
however they are expensive and often need to be added inline to the
core of your network, creating an unacceptable single point of
failure.

I don't really want to discuss the legal aspect, and I think the
proposed changes are small enough and generic enough that it's not
important to.
Post by Bill Cox
I'd suggest thinking of your program as a repeater, rather than a cache.
I assume that your cache acts just like a normal client: download as
fast as you can, and once you've got the whole thing start acting as a
seeder. This has some real problems. You'd download all kinds of
torrents, and probably wind up with lots of terabytes of data. The
small problem is that disk cache is expensive. The big problem is that
This is solved by removing old and/or unpopular content. I assume a
large number of our users download the same content from the same
sources.
Post by Bill Cox
the music/movie industry will probably have a nice chat with your
employer about seeding their works. Another problem is that you're not
helping the torrent while you're downloading.
The cache is effectively Just Another BitTorrent Client(tm) firewalled
to only accept connections from our users and to not need .torrent
files in advance.

So, once it has one piece that a connected user doesn't have, it would
be of benefit.
Post by Bill Cox
A repeater, on the other hand, downloads pieces only when it sees a
chance to help its peers. If no connected peers are interested in the
pieces you currently have, go get another piece. Otherwise, just seed
the pieces you have. Once you've got a bunch of uninteresting pieces,
exit the torrent, delete the pieces, and re-enter the torrent. I've
prototyped this scheme, and it seems to work quite well for torrents
that have at least a few downloading peers. It helps with all of the
problems listed above. The down-side is it can't help out in torrents
with only one downloading peer. It would be interesting to know what
percentage of download traffic comes from very active torrents vs fairly
inactive torrents.
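
A rough Python sketch of the repeater's decision rule as described in the quoted text. The function name repeater_action, the max_stale threshold, and the order in which the three cases are checked are all assumptions made for illustration; the original prototype is not shown in this thread.

```python
def repeater_action(have, peer_haves, max_stale):
    """Decide what a repeater should do next.

    have       -- set of piece indices the repeater currently holds
    peer_haves -- list of sets, one per connected peer, of pieces they hold
    max_stale  -- assumed threshold of uninteresting pieces before a reset
    Returns "seed", "reset", or "fetch".
    """
    # A piece is interesting if at least one connected peer lacks it.
    interesting = {p for p in have if any(p not in ph for ph in peer_haves)}
    stale = have - interesting
    if interesting:
        return "seed"    # somebody can use what we hold
    if len(stale) >= max_stale:
        return "reset"   # exit the torrent, delete pieces, re-enter
    return "fetch"       # nobody wants our pieces: go get another one


print(repeater_action({1, 2}, [{1}], max_stale=3))
print(repeater_action({1}, [{1}], max_stale=3))
print(repeater_action({1, 2, 3}, [{1, 2, 3}], max_stale=3))
```

Checking for interested peers first means the repeater never throws pieces away while someone can still use them; a real implementation might weigh this differently.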
This is certainly an option that I've considered, but it's irrelevant
to the inclusion of the patches sent.

Anyone can write a repeater and use the same mechanism I'm proposing
to have clients connect to it.

I've prototyped a different algorithm for the cache software, but
others can write their own implementations if they see benefit.
Post by Bill Cox
Is the idea that this would allow ISPs to spoof the btchache.p2p
domain, and return their own BitTorrent server's address? Then anyone
outside an ISP supporting this would get the lookup error, and thus no
additional peers?
Yes exactly.
Post by Bill Cox
Couldn't this feature be used to keep track of all user downloads? I
know this is already possible for ISPs with basic traffic snooping, but
I think users might worry about an automatic feature that tells his ISP
that he's about to download a file set. I'd like to think that my ISP
tries hard not to look too closely at my traffic. It'd feel like a
small invasion of privacy otherwise.
It couldn't track downloads. It could alert the ISP to the use of a
BitTorrent client, but really I think if the ISP cared they could
already determine this. Users should also be aware that FastTrack
clients already look up "cache.p2p" and eMule already looks up
"edcache.p2p".
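
The lookup being discussed could look roughly like the Python sketch below. The function name find_isp_cache and the injectable resolve parameter are inventions for this example, and the host name is passed in by the caller because the exact name is whatever the extension finally settles on.

```python
import socket
from typing import Callable, Optional


def find_isp_cache(hostname: str,
                   resolve: Callable[[str], str] = socket.gethostbyname
                   ) -> Optional[str]:
    """Return the cache's IP address if the ISP answers for `hostname`,
    or None when the name does not resolve (no cache on this network)."""
    try:
        return resolve(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def no_cache_here(name: str) -> str:
    """Stand-in resolver for a network whose ISP runs no cache."""
    raise socket.gaierror("name not known")


# Stand-in resolvers are used so the example needs no network access.
print(find_isp_cache("cache.example", lambda name: "10.0.0.5"))
print(find_isp_cache("cache.example", no_cache_here))
```

A client outside a participating ISP gets None and carries on as usual, which matches the "lookup error, and thus no additional peers" behaviour described in the question.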
Post by Bill Cox
Here's a potential alternative: with the BT friends extension, your
client could go to the torrents where peers meet to make friends, and
only make friends with your customers. When they need help downloading
from a torrent, they'll contact you with a request_help message, and you
can go from there. They don't need to know that you're working for
their ISP, and hopefully you won't keep a log of the communication.
I would much prefer the clients to connect directly to me without
having to do the run-around :-)
Post by Bill Cox
Post by iain_wade
Note// I understand JoltId (http://www.joltid.com) have been
contacting bittorrent client authors and paying for them to implement
a caching proxy solution locked in to their proprietary PeerCache
protocol. I would hope any agreements with them do not preclude the
fruits of our own work (open source and freely available) being
accepted.
Scary... I'm not against a guy making a few bucks, or ISPs saving
money, but these guys are seriously into spy-ware.
I don't know where you got that impression?

From my experience with them they are good guys with a solid product
but I don't believe that we need to pay their (rather steep) prices for
a BitTorrent cache which we could implement ourselves in an open-source
manner which benefits everyone.
Post by Bill Cox
IMO, this is a good reason for the open-source community to help solve
the caching problem for free. So, here's my two cents on how to make it
actually happen...
To get support from users, BT client authors, and Bram, I think you need
a good reason. Saving the ISP money sounds good, but let's face it: no
one cares (unless they're being bribed to care).
Here's a good reason for average Joe to care: you help him download
faster. If you do that, I think the path to acceptance will be much
easier.
That is an important goal of course, and one I already listed above.

Saving bandwidth for an ISP is not something the authors of BitTorrent
clients and the protocol should disregard, however. Having an
ISP-friendly protocol (by adding some support for caching) will help
the users of your software, because ISPs would be less likely to limit
its impact through other means like blocking or traffic shaping.

--Iain



Yahoo! Groups Links

<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/BitTorrent/

<*> To unsubscribe from this group, send an email to:
BitTorrent-***@yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/
Bill Cox
2005-01-15 12:08:06 UTC
Permalink
Hi, Iain.

I think it's worth adding simple extensions to help keep the ISPs happy.
The cache look-up is a little scary, but it could be an option that
users could turn off. I know I'd keep that option on.

The get_info/info is a fairly simple handshake, though I don't like its
having such a large load. I'd prefer an incremental mechanism like the
piece_info/piece_correct handshake. Anyone benefiting from the cache-
lookup extension would probably also support the related extension in
the peer protocol, so someone would be there to help authenticate pieces
as long as there was anyone interested in downloading from your cache.

Would this be a good point to introduce Merkle hash trees, or should we
wait for BT2? The idea is that the get_info/info handshake would return
a Merkle hash root for the torrent, rather than the whole info blob. It
would be a tiny message. Then, we'd have a get_authentication_path /
authentication_path handshake to authenticate the piece data. Only path
data up to a requested level in the tree would be returned.

Bill





Iain Wade
2005-01-17 05:23:01 UTC
Permalink
Hi Bill.
Post by Bill Cox
I think it's worth adding simple extensions to help keep the ISPs happy.
The cache look-up is a little scary, but it could be an option that
users could turn off. I know I'd keep that option on.
Fantastic :-)
Post by Bill Cox
The get_info/info is a fairly simple handshake, though I don't like its
having such a large load. I'd prefer an incremental mechanism like the
piece_info/piece_correct handshake. Anyone benefiting from the cache-
lookup extension would probably also support the related extension in
the peer protocol, so someone would be there to help authenticate pieces
as long as there was anyone interested in downloading from your cache.
OK.

Some other pieces of information needed are total size and piece size.
If the cache needs this information from a client, how can it verify
the accuracy of that information? In the get_info solution they are a
validated portion of the info blob.

Let me ask, if I were to implement something along these lines (with a
significantly larger patch) would people be more accepting? Bram?
Post by Bill Cox
Would this be a good point to introduce Merkle hash trees, or should we
wait for BT2? The idea is that the get_info/info handshake would return
a Merkle hash root for the torrent, rather than the whole info blob. It
would be a tiny message. Then, we'd have a get_authentication_path /
authentication_path handshake to authenticate the piece data. Only path
data up to a requested level in the tree would be returned.
I'm interested, but have no exposure to these algorithms. I guess for
them to be useful I would still like to see them in a validated info
blob payload, and that would require all clients to be updated. BT2,
perhaps.

Thanks for your views.
--Iain



Bill Cox
2005-01-18 12:40:18 UTC
Permalink
Hi, Iain.

Ok, the get_info/info mechanism is trivial. Any incremental scheme is
more work. That's good enough reason to support get_info/info for now.
Not that my views matter here, but I'll support both of your patches in
btslave.

I've commented more on the Merkle hash trees below, but let's assume I'm
not asking for this as a BT1 extension.

Bill
Post by Iain Wade
Post by Bill Cox
Would this be a good point to introduce Merkle hash trees, or should we
wait for BT2? The idea is that the get_info/info handshake would return
a Merkle hash root for the torrent, rather than the whole info blob. It
would be a tiny message. Then, we'd have a get_authentication_path /
authentication_path handshake to authenticate the piece data. Only path
data up to a requested level in the tree would be returned.
I'm interested, but have no exposure to these algorithms. I guess for
them to be useful I would still like to see them in a validated info
blob payload, and that would require all clients to be updated. BT2,
perhaps.
Merkle hash trees are simple. Every piece has a SHA1 hash value, just
as they do now. These become the leaves of the Merkle tree. Interior
nodes of the tree have hash values which are the SHA1 of the
concatenation of the hash values of both of its children.

In a BT protocol, the piece hashes get shared by the peers, and are no
longer in the info-blob. The info blob only has the SHA1 hash value of
the root of the tree. This makes torrent info blobs tiny. AFAIK, this
is the main value of using Merkle hash trees.
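
As a small illustration of the tree construction just described, here is a Python sketch that computes the root from the per-piece SHA1 values. The function names are made up for this example, and duplicating the last node on odd-sized levels is an assumption; the thread does not say how an odd number of pieces would be handled.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def merkle_root(leaf_hashes):
    """Compute the root of a binary Merkle tree whose leaves are the
    per-piece SHA1 values. Each interior node is the SHA1 of the
    concatenation of its two children's hash values."""
    if not leaf_hashes:
        raise ValueError("need at least one leaf")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: pad odd levels by duplication
        level = [sha1(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


pieces = [b"piece-0", b"piece-1", b"piece-2", b"piece-3"]
leaves = [sha1(p) for p in pieces]
root = merkle_root(leaves)
print(len(root))  # 20 bytes, regardless of how many pieces there are
```

The info blob would then only need to carry this 20-byte root rather than one hash per piece.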

To validate data, peers send not only piece data to each other, but
authentication data as well. For compatibility with BT1, it's
probably best to make these separate messages: the old piece message,
and a get_authentication/authentication pair. The authentication data
is a set of SHA1 hash keys, starting with the piece's sibling leaf in
the tree, and going up the tree, providing sibling hash keys. With the
piece, and its sibling's hash value, you can compute the hash value one
level up in the tree. With that node's sibling's hash value, you can go
further up the tree. When you get to the root, you should get the same
value as was advertised in the info-blob. If there is any false data,
the root value and your value won't match.
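
The walk up the tree can be sketched in Python as follows. All names here (build_levels, auth_path, verify_piece) are hypothetical, and padding odd-sized levels by duplicating the last node is an assumption, since the thread does not specify it.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def build_levels(leaves):
    """Return every level of the tree, leaves first, root level last."""
    if not leaves:
        raise ValueError("need at least one leaf")
    levels, level = [], list(leaves)
    while True:
        if len(level) > 1 and len(level) % 2:
            level.append(level[-1])  # assumption: pad odd levels
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [sha1(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]


def auth_path(levels, index):
    """Sibling hashes for piece `index`, from leaf level up to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # the sibling sits next to us
        index //= 2
    return path


def verify_piece(piece, index, path, root):
    """Recompute the root from the piece data and its sibling hashes."""
    node = sha1(piece)
    for sibling in path:
        # Even index: we are the left child; odd index: the right child.
        node = sha1(node + sibling) if index % 2 == 0 else sha1(sibling + node)
        index //= 2
    return node == root


pieces = [bytes([i]) * 4 for i in range(5)]
levels = build_levels([sha1(p) for p in pieces])
root = levels[-1][0]
print(verify_piece(pieces[4], 4, auth_path(levels, 4), root))
print(verify_piece(b"bogus", 4, auth_path(levels, 4), root))
```

The path length grows with the logarithm of the piece count, so each authentication message stays small even for large torrents.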

The scheme as described by Bram still leaves file names and an info-hash
value per file in the torrent file. This is good for compatibility with
BT1, but I'd prefer for info-blobs in BT2 to be very small. Twenty
bytes would be a good size (just the SHA1 of the root). Everything else
could be embedded in the piece data (file length, file names, an index
to the files in case you just want part of the data). There could be an
agreement that piece hashes are computed for every 1K bytes, so piece
size is gone. You wouldn't waste bandwidth sending SHA1 data down to 1K
byte leaves, because data is sent in bigger gulps than that.

The two things the torrent file needs are a tracker url (8 bytes) and an
info blob (20 bytes). Then, we could share torrents trivially.

Bill





Justin Cormack
2005-01-19 17:39:18 UTC
Permalink
Post by Bill Cox
The scheme as described by Bram still leaves file names and an info-hash
value per file in the torrent file. This is good for compatibility with
BT1, but I'd prefer for info-blobs in BT2 to be very small. Twenty
bytes would be a good size (just the SHA1 of the root). Everything else
could be embedded in the piece data (file length, file names, an index
to the files in case you just want part of the data). There could be an
agreement that piece hashes are computed for every 1K bytes, so piece
size is gone. You wouldn't waste bandwidth sending SHA1 data down to 1K
byte leaves, because data is sent in bigger gulps than that.
The two things the torrent file needs are a tracker url (8 bytes) and an
info blob (20 bytes). Then, we could share torrents trivially.
I would be inclined to get rid of the multiple file thing completely, it
adds complexity without adding much power. You can either download multiple
torrents to get separate files, or use a filesystem image if you want real
files, or a zip/tar or whatever. These allow you to store real file attributes
and so on.

Justin



Olaf van der Spek
2005-01-19 21:03:21 UTC
Permalink
Post by Justin Cormack
Post by Bill Cox
The scheme as described by Bram still leaves file names and an info-hash
value per file in the torrent file. This is good for compatibility with
BT1, but I'd prefer for info-blobs in BT2 to be very small. Twenty
bytes would be a good size (just the SHA1 of the root). Everything else
could be embedded in the piece data (file length, file names, an index
to the files in case you just want part of the data). There could be an
agreement that piece hashes are computed for every 1K bytes, so piece
size is gone. You wouldn't waste bandwidth sending SHA1 data down to 1K
byte leaves, because data is sent in bigger gulps than that.
The two things the torrent file needs are a tracker url (8 bytes) and an
info blob (20 bytes). Then, we could share torrents trivially.
I would be inclined to get rid of the multiple file thing completely, it
adds complexity without adding much power. You can either download multiple
torrents to get separate files, or use a filesystem image if you want real
files, or a zip/tar or whatever. These allow you to store real file attributes
and so on.
The disadvantage is that you have to unpack them. ;-)
Users aren't likely to keep both the unpacked and the packed form and
are likely to delete the packed form.

The disadvantage of multiple torrents is tracker load.
What exactly is the complication of multiple files?



Justin Cormack
2005-01-19 22:01:02 UTC
Permalink
Post by Olaf van der Spek
Post by Justin Cormack
Post by Bill Cox
The scheme as described by Bram still leaves file names and an info-hash
value per file in the torrent file. This is good for compatibility with
BT1, but I'd prefer for info-blobs in BT2 to be very small. Twenty
bytes would be a good size (just the SHA1 of the root). Everything else
could be embedded in the piece data (file length, file names, an index
to the files in case you just want part of the data). There could be an
agreement that piece hashes are computed for every 1K bytes, so piece
size is gone. You wouldn't waste bandwidth sending SHA1 data down to 1K
byte leaves, because data is sent in bigger gulps than that.
The two things the torrent file needs are a tracker url (8 bytes) and an
info blob (20 bytes). Then, we could share torrents trivially.
I would be inclined to get rid of the multiple file thing completely, it
adds complexity without adding much power. You can either download multiple
torrents to get separate files, or use a filesystem image if you want real
files, or a zip/tar or whatever. These allow you to store real file attributes
and so on.
The disadvantage is that you have to unpack them. ;-)
Users aren't likely to keep both the unpacked and the packed form and
are likely to delete the packed form.
The disadvantage of multiple torrents is tracker load.
What exactly is the complication of multiple files?
Well, the complication is that they are overlaid rather simply on what is
basically a block rather than a file transfer format (e.g. file starts are
not aligned with requests, which is really unpleasant, as a piece write may
have to write to large numbers of files). And as stated above, it means more
information to transfer in the torrent file.
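
To make the alignment problem concrete, here is a small Python sketch that maps one piece onto the files it touches when files are simply concatenated. The function name piece_file_spans and the sample file lengths are made up for illustration.

```python
def piece_file_spans(file_lengths, piece_length, piece_index):
    """Return (file_index, offset_in_file, byte_count) for every file
    that the given piece overlaps, assuming the files are laid end to
    end in one byte stream that is then cut into fixed-size pieces."""
    start = piece_index * piece_length
    end = start + piece_length
    spans = []
    file_start = 0
    for i, length in enumerate(file_lengths):
        file_end = file_start + length
        lo, hi = max(start, file_start), min(end, file_end)
        if lo < hi:  # the piece and this file overlap
            spans.append((i, lo - file_start, hi - lo))
        file_start = file_end
    return spans


# Three hypothetical files of 10, 5 and 20 bytes, with 16-byte pieces.
print(piece_file_spans([10, 5, 20], 16, 0))
print(piece_file_spans([10, 5, 20], 16, 1))
```

With these sample lengths, writing the first piece touches all three files, which is the unpleasantness being described.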

The fact that people might not keep the packed form is something I hadn't
thought about. This is not the case if you use loopback file system images
(so long as they don't need to be compressed too) - I quite like the way
MacOS X uses ISO images and other loop fs images all over the place. Of
course I have no idea if Windows supports them well, and Linux doesn't
make it convenient for non-root users (though this should be fixed really).

Not sure how much tracker overhead is, really not sure how many files
the average torrent is, I think the number is generally small.

Justin



Olaf van der Spek
2005-01-19 23:43:43 UTC
Permalink
Post by Justin Cormack
Post by Olaf van der Spek
The disadvantage is that you have to unpack them. ;-)
Users aren't likely to keep both the unpacked and the packed form and
are likely to delete the packed form.
The disadvantage of multiple torrents is tracker load.
What exactly is the complication of multiple files?
Well, the complication is that they are overlaid rather simply on what is
basically a block rather than a file transfer format (e.g. file starts are
not aligned with requests, which is really unpleasant, as a piece write may
have to write to large numbers of files). And as stated above, it means more
information to transfer in the torrent file.
True, true, but Merkle trees should solve both.
Post by Justin Cormack
The fact that people might not keep the packed form is something I hadn't
thought about. This is not the case if you use loopback file system images
(so long as they don't need to be compressed too) - I quite like the way
MacOS X uses ISO images and other loop fs images all over the place. Of
True, but ISOs are 'meant' to be burned, which I think frequently
happens. And then you can't (easily) regenerate the ISO for seeding,
while you can easily reseed the files on the CD/DVD itself.
Post by Justin Cormack
course I have no idea if Windows supports them well, and Linux doesn't
make it convenient for non-root users (though this should be fixed really).
Not sure how much tracker overhead is, really not sure how many files
the average torrent is, I think the number is generally small.
Justin
Olaf van der Spek
2005-01-19 23:44:18 UTC
Permalink
Post by Justin Cormack
Not sure how much tracker overhead is, really not sure how many files
the average torrent is, I think the number is generally small.
There are 2000+ image torrents and 20+ audio torrents. I'd not want to
have more torrents due to that.



Olaf van der Spek
2005-01-19 23:59:07 UTC
Permalink
Post by Justin Cormack
Not sure how much tracker overhead is, really not sure how many files
the average torrent is, I think the number is generally small.
There are 2000+ image torrents and 20+ audio torrents. I'd not want to
have more torrents due to that.



Justin Cormack
2005-01-20 00:52:33 UTC
Permalink
Post by Olaf van der Spek
Post by Justin Cormack
Not sure how much tracker overhead is, really not sure how many files
the average torrent is, I think the number is generally small.
There are 2000+ image torrents and 20+ audio torrents. I'd not want to
have more torrents due to that.
It's not that there is any fundamental reason why there shouldn't be a
file-based protocol, it's just not well designed as one as is. It's a block
protocol with files overlaid on it by concatenation. Even having files
starting only on piece boundaries (i.e. some pieces other than the last
being smaller) would fix most of the nastiness that comes in the code to
write out and read from files.

(I can't remember if there was some change to this in Bram's BT2 draft; I
vaguely remember there was.)

Justin



Bill Cox
2005-01-20 11:55:18 UTC
Permalink
Post by Justin Cormack
The fact that people might not keep the packed form is something I hadn't
thought about. This is not the case if you use loopback file system images
(so long as they don't need to be compressed too) - I quite like the way
MacOS X uses ISO images and other loop fs images all over the place. Of
course I have no idea if Windows supports them well, and Linux doesn't
make it convenient for non-root users (though this should be fixed really).
Not sure how much tracker overhead is, really not sure how many files
the average torrent is, I think the number is generally small.
Justin
Hi, Justin.

There is a linux module called fuse: http://fuse.sourceforge.net/

It makes user-space file systems a snap.

I was thinking of using it to create a virtual file system for torrent
files. After mounting one, you could ls them, and cd to them, but if
you read the data in a file, it'd go get the data from the torrent. If
a torrent contained other torrent files, these would be sub-dirs.

It sounds cool, and it's easy to do. I'm just having trouble coming up
with a reason to do it. One thought is that a user on a Linux system
could mount a directory structure this way that is similar to /usr
or /usr/local. Rather than having to install software packages, the
first time he tried to use one, it'd automatically download it from a
torrent (possibly after asking if that's ok).

It gets more complicated if the user needs to write to the file system,
which of course he would.

Bill





Jesus Cea
2005-01-20 19:32:53 UTC
Permalink
Post by Bill Cox
It gets more complicated if the user needs to write to the file system,
which of course he would.
Read-Only filesystem with an overlay layer. Nice :-)
--
Jesus Cea Avion _/_/ _/_/_/ _/_/_/
***@argo.es http://www.argo.es/~jcea/ _/_/ _/_/ _/_/ _/_/ _/_/
_/_/ _/_/ _/_/_/_/_/
PGP Key Available at KeyServ _/_/ _/_/ _/_/ _/_/ _/_/
"Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/
"My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz



Justin Cormack
2005-01-20 19:56:03 UTC
Permalink
Post by Bill Cox
Hi, Justin.
There is a linux module called fuse: http://fuse.sourceforge.net/
It makes user-space file systems a snap.
I was thinking of using it to create a virtual file system for torrent
files. After mounting one, you could ls them, and cd to them, but if
you read the data in a file, it'd go get the data from the torrent. If
a torrent contained other torrent files, these would be sub-dirs.
It sounds cool, and it's easy to do. I'm just having trouble coming up
with a reason to do it. One thought is that a user on a Linux system
could mount a directory structure this way that is similar to /usr
or /usr/local. Rather than having to install software packages, the
first time he tried to use one, it'd automatically download it from a
torrent (possibly after asking if that's ok).
It gets more complicated if the user needs to write to the file system,
which of course he would.
Not if you just want to install RPMs off it, say. Nor do you expect to be
able to write to most large archive/mirror sites, but mounting it means you
can use lots of file system tools.

There are other good reasons to do this, and I will implement something
similar shortly, but there are a few problems, and the way multifile is
implemented is one of them.

Justin



Konstantin 'Kosta' Welke
2005-01-21 11:54:21 UTC
Permalink
Post by Bill Cox
I was thinking of using it to create a virtual file system for torrent
files. After mounting one, you could ls them, and cd to them, but if
you read the data in a file, it'd go get the data from the torrent. If
a torrent contained other torrent files, these would be sub-dirs.
Just so I get you right: you want to make one big torrent file for
the whole filesystem (using multi-file) and add some browse option?
This is a great idea, and I thought about something similar to replace
whole FTP servers with BitTorrent, but the problem is that either you
have to put the whole directory structure in the torrent file (which
can be HUGE), or put each subdir in its own torrent, which would
add some latency when browsing directories.

Does this get fixed/easier with bt2's hash trees?
Also, updates on the filesystem are an issue.
Post by Bill Cox
It gets more complicated if the user needs to write to the file system,
which of course he would.
Not necessarily. Most of the file systems on a unix machine are read-only
for the typical user anyway. The whole idea is that there is one master
filesystem that gets, in this case, seeded. Of course, it would be modified
in one central location.

On the other hand, if you want a real distributed read/write BitTorrent,
that's really a whole new project. Save it for bt3, maybe ;)

PS: any hints where i can find the bt wiki?

Kosta



Justin Cormack
2005-01-24 13:18:41 UTC
Permalink
Post by Konstantin 'Kosta' Welke
PS: any hints where i can find the bt wiki?
http://wiki.theory.org/BitTorrentFAQ



Konstantin 'Kosta' Welke
2005-01-20 12:18:55 UTC
Permalink
On Wed, 19 Jan 2005 17:39:18 +0000 (GMT), Justin Cormack
Post by Justin Cormack
I would be inclined to get rid of the multiple file thing completely, it
adds complexity without adding much power. You can either download multiple
torrents to get separate files, or use a filesystem image if you want real
files, or a zip/tar or whatever. These allow you to store real file attributes
and so on.
Oh, it adds so much power. If we could have a real directory tree, as is
planned in bt2 (IIRC), things would be drastically improved. You could just
replace a whole FTP server with a BitTorrent server.

btw: On the mailing list, a wiki page is often mentioned, but I can't find
its URL. Can anyone help?

Kosta



Bill Cox
2005-01-20 20:56:52 UTC
Permalink
Post by Konstantin 'Kosta' Welke
Oh, it adds so much power. If we could have a real directory tree, as is
planned in bt2 (IIRC), things would be drastically improved. You could just
replace a whole FTP server with a BitTorrent server.
btw: On the mailing list, a wiki page is often mentioned, but I can't find
its URL. Can anyone help?
Kosta
If there is an on-line discussion of BT2 progress, I'd like to know
where to find it (even if all I can do is read, not post).

Thanks,
Bill





iain_wade
2005-01-14 00:06:09 UTC
Permalink
Post by Bill Cox
To use get_info/info for .torrent-less downloads, users would
already need to know where to find a tracker and the info_hash
value.
I think Olaf was originally proposing that a new URL be defined which
includes the bare minimum of information (info_hash and a few peers)
needed, the client would then connect to the peers and download the
info body and ask for some more peers and it would grow from there.

I don't intend to do any of that, I just want the clients on my
network to connect to me. I'll then start asking them for pieces and
share them around with other users who are downloading the same thing.

It's not the most efficient method, but it is the smallest change
possible and therefore the most likely to be accepted (I hope).
Post by Bill Cox
Another problem with the get_info/info extension is that the load
of the info message can be hundreds of KBytes. That's a big
message. Even if you download the message correctly, a user might
kill the session early, wasting the effort of sending the
torrent's info in the first place.
The cache only needs to ask one client, once. The performance benefit
to the user gained from using this cache would far exceed the overhead
incurred from a one-off upload of a hundred KB. This burden would
probably be distributed amongst the users over time, because the first
person connected to torrentA would likely be a different person to the
first on torrentB.
Post by Bill Cox
In my BT friends extension, I do it differently. After receiving
a piece, I compute its SHA1, and then send a 'piece_info' message
to a peer that has the piece and supports the extension. Its
load is the piece index and my computed SHA1 value for the piece.
The peer replies with a 'piece_correct' or 'piece_incorrect'
message. Once I've received a 'piece_correct' message, I send
my HAVE messages.
This way, only small messages are sent, and the torrent info is
only sent as you need it.
The problem here is that it is still susceptible to corruption. You
can't trust the other parties in a p2p download without being able to
verify what they've said. In my way, the info blob can be verified by
comparing the hash of it to the info_hash on the connection. This
ensures all clients connected with that info_hash are talking about
the same content.
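
A minimal sketch of that check in Python. The function name and the sample
info dictionary are made up for illustration; the only thing taken from the
proposal is that the SHA1 of the bencoded info blob must equal the info_hash
used on the connection.

```python
import hashlib


def info_matches(info_blob: bytes, info_hash: bytes) -> bool:
    """Return True if the SHA1 of the received bencoded info blob
    equals the 20-byte info_hash agreed on in the handshake."""
    return hashlib.sha1(info_blob).digest() == info_hash


# Hypothetical (incomplete) bencoded info dictionary, for illustration only.
blob = b"d6:lengthi1024e4:name8:file.bin12:piece lengthi262144ee"
expected = hashlib.sha1(blob).digest()

genuine_ok = info_matches(blob, expected)          # blob as sent
tampered_ok = info_matches(blob + b"x", expected)  # blob altered in transit
print(genuine_ok, tampered_ok)
```

A cache that gets a mismatch can simply drop the blob and ask another peer.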

Also, this extension would be more effective in the short term, before
its adoption is widespread. With the get_info/info extension only one
client needs to support the feature, and that client only needs to be
available for a short time to upload the info blob. With yours, at
least one client supporting the feature would need to be available at
all times to support checking.
Post by Bill Cox
Reducing the external network traffic for an ISP will obviously
save them money, but you have to be careful not to upset either
your customers or the movie and music industries.
This is being done for the customers. I'm sure they will have nothing
but praise once their downloads start powering ahead at a few hundred
KB/s as soon as they connect.

I'm not a lawyer and don't care one bit about the sensitivities of the
recording and movie industries. If they choose to sue my company we
would either stop doing what we're doing, or if the legal eagles think
it's winnable we might fight them. Not my problem.

We are running a cache for kazaa and edonkey traffic right now and
other copyright infringing content is cached every second of every day
on our http proxy and email servers. In an ISP, caching this content
is a necessity for the smooth running of the network and unavoidable.

There are already commercial peer-to-peer caching solutions available,
however they are expensive and often need to be added inline to the
core of your network, creating an unacceptable single point of
failure.

I don't want to really discuss the legal aspect, and I think these
changes proposed are small enough and generic enough that it's not
important to.
Post by Bill Cox
I'd suggest thinking of your program as a repeater, rather than a
download as fast as you can, and once you've got the whole thing
start acting as a seeder. This has some real problems. You'd
download all kinds of torrents, and probably wind up with lots of
terabytes of data. The small problem is that disk cache is
expensive. The big problem is that
This is easily solved by just clearing out old and/or unpopular
content. I assume a large number of our users download the same
content. I know this is the case for other peer to peer networks.
Post by Bill Cox
the music/movie industry will probably have a nice chat with your
employer about seeding their works. Another problem is that
you're not helping the torrent while you're downloading.
The cache is effectively Just Another BitTorrent Client(tm) firewalled
to only accept connections from our users and to not need .torrent
files in advance.

So, once it has one piece that a connected user doesn't have, it would
be of benefit.
Post by Bill Cox
A repeater, on the other hand, downloads pieces only when it sees
a chance to help its peers. If no connected peers are interested
in the pieces you currently have, go get another piece. Otherwise,
just seed the pieces you have. Once you've got a bunch of
uninteresting pieces, exit the torrent, delete the pieces, and
re-enter the torrent. I've prototyped this scheme, and it seems
to work quite well for torrents that have at least a few
downloading peers. It helps with all of the problems listed above.
The down-side is it can't help out in torrents with only one
downloading peer. It would be interesting to know what percentage
of download traffic comes from very active torrents vs fairly
inactive torrents.
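
The repeater loop described above could be sketched roughly as follows. This
is only an illustration of the decision rule as stated in the post, not the
prototype itself: the function name, the set-based bitfields, and the
max_stale threshold (how many unwanted pieces count as "a bunch") are all
assumptions.

```python
def repeater_action(held, peer_bitfields, max_stale=8):
    """Decide what a repeater does next.

    held           -- set of piece indices the repeater currently holds
    peer_bitfields -- list of sets, one per connected peer, of pieces it has
    max_stale      -- assumed limit on held-but-unwanted pieces
    """
    # A held piece is interesting if at least one connected peer lacks it.
    wanted = {p for p in held if any(p not in have for have in peer_bitfields)}
    if wanted:
        return "seed"    # some peer can still use what we hold
    if len(held) >= max_stale:
        return "rejoin"  # exit the torrent, delete the pieces, re-enter
    return "fetch"       # nobody wants our pieces: go get another one


print(repeater_action({1, 2}, [{1}, {1, 2}]))
print(repeater_action({1}, [{1, 2}]))
print(repeater_action(set(range(8)), [set(range(10))]))
```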
This is certainly an option that I've considered, but it's irrelevant
to the inclusion of the patches sent.

Anyone can write a repeater and use the same mechanism I'm proposing
to have clients connect to it.

I've prototyped a different algorithm for the cache software, but
others can write their own implementations if they see benefit.
Post by Bill Cox
Is the idea that this would allow ISPs to spoof the btcache.p2p
domain, and return their own bittorrent server's address? Then
anyone outside an ISP supporting this would get the lookup error,
and thus no additional peers?
Yes exactly.
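
On the client side that lookup is tiny. A sketch in Python, where the
function name and the injectable resolver are made up for illustration and
"btcache.p2p" is the assumed spelling of the name discussed above:

```python
import socket


def find_isp_cache(name, resolve=socket.gethostbyname):
    """Return the cache's IP address, or None if the name does not resolve.

    Inside an ISP that answers for the name, this yields the cache address.
    Everywhere else the lookup fails and the client carries on as usual.
    """
    try:
        return resolve(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


# Stand-in resolvers so the example runs without touching the network.
def isp_resolver(name):
    return "10.0.0.1"  # hypothetical cache address


def outside_resolver(name):
    raise socket.gaierror("name not known")


print(find_isp_cache("btcache.p2p", isp_resolver))
print(find_isp_cache("btcache.p2p", outside_resolver))
```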
Post by Bill Cox
Couldn't this feature be used to keep track of all user downloads?
I know this is already possible for ISPs with basic traffic
snooping, but I think users might worry about an automatic feature
that tells his ISP that he's about to download a file set. I'd
like to think that my ISP tries hard not to look too closely at
my traffic. It'd feel like a small invasion of privacy otherwise.
It couldn't track downloads. It could alert the ISP to the use of a
BitTorrent client, but really I think if the ISP cared they could
already determine this. Users should also be aware that FastTrack
clients already look up "cache.p2p" and eMule already looks up
"edcache.p2p".
Post by Bill Cox
Here's a potential alternative: with the BT friends extension,
your client could go to the torrents where peers meet to make
friends, and only make friends with your customers. When they
need help downloading from a torrent, they'll contact you with
a request_help message, and you can go from there. They don't
need to know that you're working for their ISP, and hopefully
you won't keep a log of the communication.
I would much prefer the clients to connect directly to me without
having to do the run-around :-)
Post by Bill Cox
Post by iain_wade
Note// I understand JoltId (http://www.joltid.com) have been
contacting bittorrent client authors and paying for them to
implement a caching proxy solution locked in to their proprietary
PeerCache protocol. I would hope any agreements with them do not
preclude the fruits of our own work (open source and freely
available) being accepted.
Scary... I'm not against a guy making a few bucks, or ISPs saving
money, but these guys are seriously into spy-ware.
I don't know where you got that impression?

From my experience with them they are good guys with a solid product
but I don't believe that we need to pay their (rather steep) prices
for a BitTorrent cache which we could implement ourselves in an
open-source manner which benefits everyone.
Post by Bill Cox
IMO, this is a good reason for the open-source community to help
solve the caching problem for free. So, here's my two cents on how
to make it actually happen...
To get support from users, BT client authors, and Bram, I think you
need a good reason. Saving the ISP money sounds good, but let's
face it: no one cares (unless they're being bribed to care).
Here's a good reason for average Joe to care: you help him download
faster. If you do that, I think the path to acceptance will be
much easier.
That is an important goal of course, and one I already listed above.

Saving bandwidth for an ISP is not something the authors of BitTorrent
clients and the protocol should disregard however. Having an
ISP-friendly protocol (by adding some support for caching) will help
the users of your software because ISPs would be less likely to limit
its impact through other means like blocking or traffic shaping.

--Iain






Johan Sundström
2003-11-20 17:20:23 UTC
Permalink
Post by Olaf van der Spek
I'd also like to propose a URL format for downloads instead of a .torrent
btp:protocol:host:port/info_hash/peers/
btp:udp:192.168.1.1:2710/%01%23%45%67%89%ab%cd%ef
%01%23%45%67%89%ab%cd%ef%01%23%45%67/192.168.1.2:6881 (example)
I'd suggest adhering to the more common URI syntax of using just one
marker for the protocol part -- that way, you will avoid many headaches
with libraries for decomposing URLs. In other words, something along the
lines of

btp-http://host:port/info_hash/peers

and, as you move on with introducing other network transport layer methods,

btp-udp://host:port/info_hash/peers

I promise that later implementors will be silently grateful. :-) Depending
on what you intend "btp-http" to be, you might be better off not pretending
it is a protocol of its own at all, and just stick to defining a convention
of interpreting the URL as http://host:port/info_hash/peers, much as things
already are today.
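
To illustrate the point about URL libraries, here is a small sketch using
Python's standard urllib.parse. The URLs are just the two proposed shapes
with placeholder path segments; nothing here is part of any BT client.

```python
from urllib.parse import urlsplit

# One scheme marker followed by "//": the library finds host and port itself.
single = urlsplit("btp-udp://192.168.1.1:2710/info_hash/peers")
print(single.scheme, single.hostname, single.port, single.path)

# Two markers and no "//": only "btp" is recognised as the scheme, and the
# transport, host and port are left inside the path for hand parsing.
double = urlsplit("btp:udp:192.168.1.1:2710/info_hash/peers")
print(double.scheme, repr(double.netloc), double.path)
```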

Another aspect of the strong points of URLs that may and may not be worth
aiming for, is brevity. Considering you know for sure that the hash is a
gob of binary gibberish, you might as well go for a format of the info_hash
portion without the % escapes, i e having the ASCII hex representation from
the very start, rendering your example into the (mildly) less frightening

btp-udp://192.168.1.1:2710/0123456789abcdef0123456789abcdef
01234567/192.168.1.2:6881
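
For anyone who wants to see the two encodings side by side, a short Python
sketch (variable names are arbitrary; the hash is the example value from the
proposal):

```python
from urllib.parse import unquote_to_bytes

# The %-escaped info_hash from the example URL, joined onto one line.
escaped = "%01%23%45%67%89%ab%cd%ef%01%23%45%67%89%ab%cd%ef%01%23%45%67"

raw = unquote_to_bytes(escaped)  # the 20 raw bytes of the SHA1 hash
hex_form = raw.hex()             # plain ASCII hex, no escapes

print(hex_form)
print(len(escaped), len(hex_form))

# The hex form converts back to the same bytes, so nothing is lost.
assert bytes.fromhex(hex_form) == raw
```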
--
/ Johan Sundström

Gary Fung
2003-11-20 23:06:37 UTC
Permalink
I agree. Backward compatibility and a non-frightening URI are important =b

The goal of this URI convention is a good one, too. Torrents can now be
sent p2p, and all the website frontend needs to do is post the btp URIs.
That is both convenient for the user and bandwidth-saving for the site.


On Thu, 20 Nov 2003 18:20:23 +0100 (MET), Johan Sundström
Post by Johan Sundström
Post by Olaf van der Spek
I'd also like to propose a URL format for downloads instead of a .torrent
btp:protocol:host:port/info_hash/peers/
btp:udp:192.168.1.1:2710/%01%23%45%67%89%ab%cd%ef
%01%23%45%67%89%ab%cd%ef%01%23%45%67/192.168.1.2:6881 (example)
I'd suggest adhering to the more common URI syntax of using just one
marker for the protocol part -- that way, you will avoid many headaches
with libraries for decomposing URLs. In other words, something along the
lines of
btp-http://host:port/info_hash/peers
and, as you move on with introducing other network transport layer methods,
btp-udp://host:port/info_hash/peers
I promise that later implementors will be silently grateful. :-) Depending
on what you intend "btp-http" to be, you might be better off not pretending
it is a protocol of its own at all, and just stick to defining a convention
of interpreting the URL as http://host:port/info_hash/peers, much as things
already are today.
Another aspect of the strong points of URLs that may and may not be worth
aiming for, is brevity. Considering you know for sure that the hash is a
gob of binary gibberish, you might as well go for a format of the info_hash
portion without the % escapes, i e having the ASCII hex representation from
the very start, rendering your example into the (mildly) less frightening
btp-udp://192.168.1.1:2710/0123456789abcdef0123456789abcdef
01234567/192.168.1.2:6881
--
Cheers,
Gary

Olaf van der Spek
2003-11-24 14:25:14 UTC
Permalink
----- Original Message -----
From: "Johan Sundström" <***@lysator.liu.se>
To: <***@yahoogroups.com>
Sent: Thursday, November 20, 2003 6:20 PM
Subject: Re: [BitTorrent] Request for protocol extension: get_info/info
messages
Post by Johan Sundström
Post by Olaf van der Spek
I'd also like to propose a URL format for downloads instead of a .torrent
btp:protocol:host:port/info_hash/peers/
btp:udp:192.168.1.1:2710/%01%23%45%67%89%ab%cd%ef
%01%23%45%67%89%ab%cd%ef%01%23%45%67/192.168.1.2:6881 (example)
I'd suggest adhering to the more common URI syntax of using just one
marker for the protocol part -- that way, you will avoid many headaches
with libraries for decomposing URLs. In other words, something along the
lines of
btp-http://host:port/info_hash/peers
and, as you move on with introducing other network transport layer methods,
btp-udp://host:port/info_hash/peers
I promise that later implementors will be silently grateful. :-) Depending
on what you intend "btp-http" to be, you might be better off not pretending
it is a protocol of its own at all, and just stick to defining a convention
of interpreting the URL as http://host:port/info_hash/peers, much as things
already are today.
But how would the web browser know to start a BT client if a user clicked on
such a link starting with http: ?
mailto: URIs don't have // in them either, so would two protocol markers
really be an issue?
Post by Johan Sundström
Another aspect of the strong points of URLs that may and may not be worth
aiming for, is brevity. Considering you know for sure that the hash is a
gob of binary gibberish, you might as well go for a format of the info_hash
portion without the % escapes, i e having the ASCII hex representation from
the very start, rendering your example into the (mildly) less frightening
btp-udp://192.168.1.1:2710/0123456789abcdef0123456789abcdef
01234567/192.168.1.2:6881
I'm not sure if later implementors will be grateful for that ;->
But it's worth considering indeed.
Post by Johan Sundström
--
/ Johan Sundström
g***@shaw.ca
2003-11-24 18:36:26 UTC
Permalink
Post by Olaf van der Spek
But how would the webbrowser know to start a BT client if a user clicked on
such a link starting with http: ?
mailto: URIs don't have // in them either, so would two protocol really be
an issue?
Example URI for joining an IRC channel:

irc://earth.us.finalirc.net/VCD-XDCCZ

The browser would "know" from the irc protocol part which application to
start to join the channel. When you install mIRC, for example, it adds a
registry entry so IE knows to launch mIRC for irc:// links. Ideally, future
BT clients should do something similar, adding config settings so browsers
know to launch BT for btp-blah:// links.
Post by Olaf van der Spek
Post by Johan Sundström
Another aspect of the strong points of URLs that may and may not be worth
aiming for, is brevity. Considering you know for sure that the hash is a
gob of binary gibberish, you might as well go for a format of the info_hash
portion without the % escapes, i e having the ASCII hex representation from
the very start, rendering your example into the (mildly) less frightening
btp-udp://192.168.1.1:2710/0123456789abcdef0123456789abcdef
01234567/192.168.1.2:6881
I'm not sure if later implementors will be grateful for that ;->
But it's worth considering indeed.
Post by Johan Sundström
--
/ Johan Sundström