----- Original Message -----
From: "Konstantin 'Kosta' Welke" <***@fillibach.de>
Subject: Re: [BitTorrent] Have maps (was Merkle, URLs, etc)
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
[Optimal case for binary trees?]
Post by Konstantin 'Kosta' Welke
"I need to verify this one piece to be able to share it".
the optimum case for that is having the verification in the node,
verification = depth
IBTD. Branching is as important as tree depth. In order to verify a
piece, one needs its hash. To verify that hash, you need all its siblings
and its parent. So the total number of nodes needed to verify a hash
is about branching*depth. Of course, this number decreases with every
piece we download, as we already have internal nodes.
All you need is the head of the siblings. To refresh everyone's memory, I
also proposed a format. It created a 6-tuple for each node (what I will
from here on call the head): {nodeNumber, nodeSize, numChildren, nodeID,
parent nodeID, hash(head of children)}. Each node also included data that
was the explicit head for its children (leaf nodes replaced the hash of
child heads with the hash of the data, and the data was in the data
portion). This makes it possible to provisionally verify a node independent
of anything else downloaded.
Using this it is only necessary to have the node to verify and all its
parents; all other needed heads are included. The binary trees proposed do
not have this ability and do fall into the problem above.
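For concreteness, a minimal sketch of what such a head might look like as a
C++ struct; the field names follow the 6-tuple above, and the exact field
widths are only illustrative, not a fixed wire format:

#include <array>
#include <cstdint>

// One node head, following the 6-tuple above; widths are illustrative.
struct MerkleHead {
    std::uint64_t nodeNumber;                  // position of the node in the tree
    std::uint64_t nodeSize;                    // bytes of data this node covers
    std::uint64_t numChildren;                 // branching actually used at this node
    std::uint64_t nodeID;                      // identifier of this node
    std::uint64_t parentNodeID;                // identifier of the parent node
    std::array<std::uint8_t, 32> childHash;    // hash(heads of children); for a
                                               // leaf, hash(data) instead
};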
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
Searching is important when a piece is requested: there is a search overhead
to determine whether or not that piece is available. Using a perfectly flat
tree this is a search of the minimum possible area; using a binary tree it
is a search of the maximum possible area. These represent the extremes
available, and the binary tree is the most costly.
I thought of an implementation where this search is not necessary. If a piece is
available (i.e. downloaded and checked), some bit is switched to 1 in a
bitfield of size p. Checking that is O(1), as we know which piece is requested.
I also thought that the tree nodes are numbered consecutively. (This is very easy
for a binary tree.) This makes it possible to just use simple arrays to store
the tree, too. I think someone posted some basic math to do that on this list
about a month ago. In other words: all node-retrieving and storing operations
are done in O(1) time. This also means that calculating which nodes are necessary
for a piece is done in O(h*N) time.
That can be made to work, and the same technique can be used for n-ary
trees, and still be an improvement (fewer hashes, fewer nodes, lower O(...)
result).
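For illustration, a minimal sketch of that constant-time index arithmetic for
a level-order-numbered k-ary tree (my own code; k = 2 gives the binary layout
Kosta describes):

#include <cstddef>

// Nodes numbered 0..n-1 level by level from the root; k is the branching
// factor of the tree.
struct KaryIndex {
    std::size_t k;

    std::size_t parent(std::size_t i) const { return (i - 1) / k; }
    std::size_t firstChild(std::size_t i) const { return k * i + 1; }

    // Ancestors that must be walked to reach the root from node i: O(depth).
    std::size_t depthOf(std::size_t i) const {
        std::size_t d = 0;
        while (i > 0) { i = parent(i); ++d; }
        return d;
    }
};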
[n-ary trees offer better prediction, prediction makes retrieval faster]
Post by Konstantin 'Kosta' Welke
I'll think about it, but can't we just evade the problem by letting the peer
that needs the nodes just request them?
You can, but all that does is increase the probability that the data is
not in the cache, so a quick response is not possible. That is exactly
the problem the prediction is designed to avoid. Without the ability to
predict accurately, the best cache replacement algorithm will quickly become
random distribution; with prediction it becomes weighted random distribution,
which will save disk time and in turn speed up responses.
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
I didn't post about it. The search overhead problem is succinctly "find me the
piece with hash X". Finding it in a Merkle tree is costly; binary is the most
costly, flat the least, but finding it in a 256-ary tree in a shared
MerklePool eliminates the advantage/disadvantage in this case.
Why not just number the pieces? No search required.
And finding a node in an unordered tree is linear in the number of nodes, so there
should not be a very big difference between a binary tree and a flat tree, as the
binary tree only has about twice the nodes of the flat one.
Search is still required. Consider the case where a file several times the
available memory is being downloaded. The in-memory pool will constantly
change, becoming effectively a cache, and now the big-array method no longer
works without building a big overhead tree (compare the cache dynamics
used by modern CPUs).
Post by Konstantin 'Kosta' Welke
[tree construction]
Post by Joseph Ashwood
The better way is to compute all the nodes at a single level (this is where
I began the use of the MerklePool), but the cost is primarily in the depth
of the tree. As the depth increases it becomes necessary to maintain indexing
across multiple levels; by flattening the tree this exponential
overhead is again reduced.
Sorry, I don't understand :)
Not a problem. In something resembling C++:

// Build the tree bottom-up, one level at a time; only two levels (the one
// being built and the one below it) are ever held in memory.
int numNodes = maximum count of nodes at a level;
MerkleNode **currentLevel;
int numAtCurrentLevel;
MerkleNode **previousLevel;
int numAtPrevLevel;

currentLevel = new MerkleNode *[numNodes];
// process all data into nodes at currentLevel (the leaf nodes)
// trim unused entries off currentLevel
numAtCurrentLevel = number of nodes used;

previousLevel = currentLevel;
numAtPrevLevel = numAtCurrentLevel;

while (numAtPrevLevel > 1)
{
    // write out previousLevel to disk
    currentLevel = new MerkleNode *[numNodes];
    // process nodes in batches from previousLevel into currentLevel,
    // one parent node per batch of children
    // trim unused entries off currentLevel
    numAtCurrentLevel = number of nodes used;

    delete [] previousLevel;
    previousLevel = currentLevel;
    numAtPrevLevel = numAtCurrentLevel;
}
// write out previousLevel as the root node

This will create a tree of balanced depth, and in the worst case will waste
treeDepth-2 nodes.
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
Modern hashes have substantial overhead in the finalization operations; by
having the smallest nodes possible, the finalization code is executed the
maximum number of times. As the size of the nodes shrinks linearly, the
number of internal nodes increases super-linearly.
Not necessarily.
How do you intend to avoid having to compute one hash for each node? As the
branching of the nodes decreases the number of nodes increases, and as the number
of nodes increases the number of hash finalizations increases.
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
As the number of nodes increases the number of times it is necessary to run
the finalization code increases.
As the number of siblings on one level increases, this increases.
Actually it won't. The number of finalizations is strictly dependent on the
number of nodes; in fact they are exactly equal (+/- 1 depending on how the
root is handled into the torrent).
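As a back-of-the-envelope check (my own sketch, not from the earlier posts),
counting one finalization per node of a full k-ary tree shows how the total
grows as the branching factor shrinks:

#include <cstddef>
#include <cstdio>

// One hash finalization per node: the leaves plus every internal node of a
// full k-ary tree built over `pieces` leaf blocks.
std::size_t finalizations(std::size_t pieces, std::size_t k) {
    std::size_t total = pieces;          // one finalization per leaf
    std::size_t level = pieces;
    while (level > 1) {
        level = (level + k - 1) / k;     // parents at the next level up
        total += level;                  // one finalization per parent
    }
    return total;                        // roughly pieces * k/(k-1)
}

int main() {
    // 125000 leaf blocks is the Level 4 figure from the 50-ary table below
    std::printf("binary: %zu  50-ary: %zu\n",
                finalizations(125000, 2), finalizations(125000, 50));
    return 0;
}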
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
Post by Konstantin 'Kosta' Welke
Note that I do not think that binary trees are the best choice either. They
are the worst case for tree size but the optimal case for quick verification of a
single piece.
Here is where we substantially differ. In verifying a single piece in a
properly formatted n-ary tree (like my proposal) the cost is the tree depth.
I think that the cost is depth*N. How do you check a node without knowing
its siblings?
By having the sibling heads embedded in the parent. That makes it so that the
download of the siblings is not necessary for verification. The algorithm is
instead:
Verify parent hash integrity
Verify child hash integrity
Verify that the child head matches the parent's child head
The first two require hash finalizations; the last is a binary compare.
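A rough sketch of that check in C++, with hypothetical helper types (the hash
function and the serialized heads are stand-ins, not a defined wire format):

#include <array>
#include <cstdint>
#include <vector>
#include <functional>

using Hash32 = std::array<std::uint8_t, 32>;
using Bytes  = std::vector<std::uint8_t>;
using HashFn = std::function<Hash32(const Bytes&)>;

// One verification step. Assumes the expected hash of the parent's head is
// already known from *its* parent (or from the torrent root).
// `parentChildHash` is the hash(head of children) field stored in the parent;
// `childHeadBytes` is the serialized head of the child being verified.
bool verifyStep(const HashFn& hash,
                const Bytes& parentHeadBytes, const Hash32& expectedParentHash,
                const Bytes& childHeadBytes,  const Hash32& parentChildHash) {
    if (hash(parentHeadBytes) != expectedParentHash)   // hash finalization #1
        return false;
    Hash32 childHash = hash(childHeadBytes);           // hash finalization #2
    return childHash == parentChildHash;               // plain binary compare
}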
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
This is the same optimal cost as for binary trees. The n-ary tree will be
flatter and so offers faster verification of the piece than the binary
version. For reference, my implementation can verify a single piece of a
478MB file in 4 hashes, assuming 4KB blocks; the same performance for a
binary tree would only cover a 65KB file, again assuming 4KB blocks. Verifying
the same 478MB file would take a binary tree 29 hashes (assuming I counted
correctly), approximately 7 times as long. 7 times the time is not a small
performance penalty.
Could you write out what exactly is calculated here?
Actually I'm having trouble exactly duplicating my numbers as well. Here is
the full expansion of what I did today.
In its current form my implementation occupies 80 bytes per head: 2 64-bit
integers for the nodeID, 2 64-bit integers for the parent nodeID, 1 64-bit integer
for the number of children, and 32 bytes for the hash = 80 bytes per head.
Each node has a head which accounts for 80 bytes, leaving 4016 bytes in a
4KB block. That leaves room for 50 child heads and 16 slack bytes. This
means a 50-ary tree, and so:
Level 1 = 1 block
Level 2 = 50 Blocks
Level 3 = 2500 Blocks
Level 4 = 125000 Blocks
.....
I also calculated that the tree can only house 4016 bytes per leaf because
of the overhead; multiplying the number of blocks at the level by the
number of bytes in the leaf gives the maximum size for the level.
With binary trees I assumed that the nodes are overhead-free, so they can
store the full 4096 bytes in the leaf. Since it is binary, the block counts
are
Level 1 = 1
Level 2 = 2
Level 3 = 4
Level 4 = 8
Level 5 = 16
......
Same as before, multiply the number of blocks at the leaf level by the leaf
node size to get the maximum size.
50-ary tree:
Level 1 = 4016 bytes
Level 2 = 200800 bytes
Level 3 = 9.5MB
Level 4 = 478.74MB
Level 5 = ~24GB
Binary:
Level 1 = 4096 bytes
Level 2 = 8192 bytes
Level 3 = 16KB
Level 4 = 32KB // guess I messed up; I accidentally gave the binary tree an
extra level in my calculations
Level 5 = 64KB
Level 6 = 128KB
Level 7 = 256KB
Level 8 = 512KB
Level 9 = 1MB
Level 10 = 2MB
Level 11 = 4MB
Level 12 = 8MB
Level 13 = 16MB
Level 14 = 32MB
Level 15 = 64MB
Level 16 = 128MB
Level 17 = 256MB
Level 18 = 512MB
Level 19 = 1GB
Only 4.5 times the overhead for a 400 MB file, not 7.
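For anyone who wants to reproduce the table, a small sketch of the arithmetic
(my own code, using the 4096-byte block, 80-byte head, and the 50-way/2-way
branching above):

#include <cstdio>

int main() {
    const double block  = 4096.0;        // bytes per block
    const double head   = 80.0;          // head overhead per 50-ary node
    const double leaf50 = block - head;  // 4016 payload bytes per 50-ary leaf
    const double leaf2  = block;         // binary leaves assumed overhead-free

    double blocks50 = 1.0, blocks2 = 1.0;  // leaf blocks if the tree stops here
    for (int level = 1; level <= 19; ++level) {
        std::printf("level %2d: 50-ary %16.0f bytes, binary %14.0f bytes\n",
                    level, blocks50 * leaf50, blocks2 * leaf2);
        blocks50 *= 50.0;                  // each 50-ary node has 50 children
        blocks2  *= 2.0;                   // each binary node has 2 children
    }
    return 0;
}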
Post by Konstantin 'Kosta' Welke
Are you assuming that you only need all ancestors of a leaf to verify it and
its contents?
Yes I am. Since there is one hash finalization per level, that was the
determining factor, and the only thing I counted.
Joe