----- Original Message -----
From: "Konstantin 'Kosta' Welke" <***@fillibach.de>
Subject: Re: [BitTorrent] Have maps (was Merkle, URLs, etc)
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
[Optimal case for binary trees?]
Post by Konstantin 'Kosta' Welke
"I need to verify this one piece to be able to share it".
the optimum case for that is having the verification in the node,
verification = depth
IBTD. Branching is as important as tree depth. In order to verify a
piece, one needs its hash. To verify that hash, you need all its siblings
and its parent. So the total number of nodes needed to verify a hash
is about branching*depth. Of course, this number decreases with every
piece we download, as we already have internal nodes.
All you need is the head of the siblings. To refresh everyone's memory, I
also proposed a format. It created a 6-tuple for each node (what I will
from here on call the head): {nodeNumber, nodeSize, numChildren, nodeID,
parent nodeID, hash(head of children)}. Each node also included data that
was the explicit head for its children (leaf nodes replaced the hash of
child heads with the hash of the data, and the data was in the data
portion). This makes it possible to provisionally verify a node independent
of anything else downloaded.
Using this it is only necessary to have the node to verify and all its
parents; all other needed heads are included. The binary trees proposed do
not have this ability and do fall into the problem above.
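For concreteness, a minimal sketch of what such a head might look like as a
C++ struct; the field names follow the 6-tuple above, and the exact field
widths are only illustrative, not a fixed wire format:

#include <array>
#include <cstdint>

// One node head, following the 6-tuple above; widths are illustrative.
struct MerkleHead {
    std::uint64_t nodeNumber;                  // position of the node in the tree
    std::uint64_t nodeSize;                    // bytes of data this node covers
    std::uint64_t numChildren;                 // branching actually used at this node
    std::uint64_t nodeID;                      // identifier of this node
    std::uint64_t parentNodeID;                // identifier of the parent node
    std::array<std::uint8_t, 32> childHash;    // hash(heads of children); for a
                                               // leaf, hash(data) instead
};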
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
Searching is important when a piece is requested: there is a search overhead
to determine whether or not that piece is available. Using a perfectly flat
tree this is a search of the minimum possible area; using a binary tree it
is a search of the maximum possible area. These represent the extremes
available, and the binary tree is the most costly.
I thought of an implementation where this search is not necessary. If a piece is
available (i.e. downloaded and checked), some bit is switched to 1 in a
bitfield of size p. Checking that is O(1), as we know which piece is requested.
I also thought that the tree nodes are numbered consecutively. (This is very easy
for a binary tree.) This makes it possible to just use simple arrays to store
the tree, too. I think someone posted some basic math to do that on this list
about a month ago. In other words: all node-retrieving and storing operations
are done in O(1) time. This also means that calculating which nodes are necessary
for a piece is done in O(h*N) time.
That can be made to work, and the same technique can be used for n-ary
trees, and still be an improvement (fewer hashes, fewer nodes, lower O(...)
result).
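For illustration, a minimal sketch of that constant-time index arithmetic for
a level-order-numbered k-ary tree (my own code; k = 2 gives the binary layout
Kosta describes):

#include <cstddef>

// Nodes numbered 0..n-1 level by level from the root; k is the branching
// factor of the tree.
struct KaryIndex {
    std::size_t k;

    std::size_t parent(std::size_t i) const { return (i - 1) / k; }
    std::size_t firstChild(std::size_t i) const { return k * i + 1; }

    // Ancestors that must be walked to reach the root from node i: O(depth).
    std::size_t depthOf(std::size_t i) const {
        std::size_t d = 0;
        while (i > 0) { i = parent(i); ++d; }
        return d;
    }
};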
[n-ary trees offer better prediction, prediction makes retrieval faster]
Post by Konstantin 'Kosta' Welke
I'll think about it, but can't we just evade the problem by letting the peer
that needs the nodes just request them?
You can, but all that does is increase the probability that the data is
not in the cache, so a quick response is not possible. That is exactly
the problem the prediction is designed to avoid. Without the ability to
predict accurately, the best cache replacement algorithm will quickly become
random distribution; with prediction it becomes weighted random distribution,
which will save disk time and in turn speed up responses.
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
I didn't post about it. The search overhead problem is succinctly "find me the
piece with hash X". Finding it in a Merkle tree is costly; binary is the most
costly, flat the least, but finding it in a 256-ary tree in a shared
MerklePool eliminates the advantage/disadvantage in this case.
Why not just number the pieces? No search required.
And finding a node in an unordered tree is linear in the number of nodes, so there
should not be a very big difference between a binary tree and a flat tree, as the
binary tree only has about twice the nodes of the flat one.
Search is still required. Consider the case where a file several times the
available memory is being downloaded. The in-memory pool will constantly
change, becoming effectively a cache, and now the big-array method no longer
works without building a big overhead tree (compare the cache dynamics
used by modern CPUs).
Post by Konstantin 'Kosta' Welke
[tree construction]
Post by Joseph Ashwood
The better way is to compute all the nodes at a single level (this is where
I began the use of the MerklePool), but the cost is primarily in the depth
of the tree. As the depth increases it becomes necessary to maintain indexing
across multiple levels; by flattening the tree this exponential
overhead is again reduced.
Sorry, I don't understand :)
Not a problem. In something resembling C++:

// Build the tree bottom-up, one level at a time; only two levels (the one
// being built and the one below it) are ever held in memory.
int numNodes = maximum count of nodes at a level;
MerkleNode **currentLevel;
int numAtCurrentLevel;
MerkleNode **previousLevel;
int numAtPrevLevel;

currentLevel = new MerkleNode *[numNodes];
// process all data into nodes at currentLevel (the leaf nodes)
// trim unused entries off currentLevel
numAtCurrentLevel = number of nodes used;

previousLevel = currentLevel;
numAtPrevLevel = numAtCurrentLevel;

while (numAtPrevLevel > 1)
{
    // write out previousLevel to disk
    currentLevel = new MerkleNode *[numNodes];
    // process nodes in batches from previousLevel into currentLevel,
    // one parent node per batch of children
    // trim unused entries off currentLevel
    numAtCurrentLevel = number of nodes used;

    delete [] previousLevel;
    previousLevel = currentLevel;
    numAtPrevLevel = numAtCurrentLevel;
}
// write out previousLevel as the root node

This will create a tree of balanced depth, and in the worst case will waste
treeDepth-2 nodes.
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
Modern hashes have substantial overhead in the finalization operations; by
having the smallest nodes possible, the finalization code is executed the
maximum number of times. As the size of the nodes shrinks linearly, the
number of internal nodes increases super-linearly.
Not necessarily.
How do you intend to avoid having to compute one hash for each node? As the
branching of the nodes decreases the number of nodes increases, and as the number
of nodes increases the number of hash finalizations increases.
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
As the number of nodes increases the number of times it is necessary to run
the finalization code increases.
As the number of siblings on one level increases, this increases.
Actually it won't. The number of finalizations is strictly dependent on the
number of nodes; in fact they are exactly equal (+/- 1 depending on how the
root is handled into the torrent).
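As a back-of-the-envelope check (my own sketch, not from the earlier posts),
counting one finalization per node of a full k-ary tree shows how the total
grows as the branching factor shrinks:

#include <cstddef>
#include <cstdio>

// One hash finalization per node: the leaves plus every internal node of a
// full k-ary tree built over `pieces` leaf blocks.
std::size_t finalizations(std::size_t pieces, std::size_t k) {
    std::size_t total = pieces;          // one finalization per leaf
    std::size_t level = pieces;
    while (level > 1) {
        level = (level + k - 1) / k;     // parents at the next level up
        total += level;                  // one finalization per parent
    }
    return total;                        // roughly pieces * k/(k-1)
}

int main() {
    // 125000 leaf blocks is the Level 4 figure from the 50-ary table below
    std::printf("binary: %zu  50-ary: %zu\n",
                finalizations(125000, 2), finalizations(125000, 50));
    return 0;
}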
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
Post by Konstantin 'Kosta' Welke
Note that I do not think that binary trees are the best choice either. They
are the worst case for tree size but the optimal case for quick verification of a
single piece.
Here is where we substantially differ. In verifying a single piece in a
properly formatted n-ary tree (like my proposal) the cost is the tree depth.
I think that the cost is depth*N. How do you check a node without knowing
its siblings?
By having the sibling heads embedded in the parent. That makes it so that the
download of the siblings is not necessary for verification. The algorithm is
instead:
Verify parent hash integrity
Verify child hash integrity
Verify that the child head matches the parent's child head
The first two require hash finalizations; the last is a binary compare.
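A rough sketch of that check in C++, with hypothetical helper types (the hash
function and the serialized heads are stand-ins, not a defined wire format):

#include <array>
#include <cstdint>
#include <vector>
#include <functional>

using Hash32 = std::array<std::uint8_t, 32>;
using Bytes  = std::vector<std::uint8_t>;
using HashFn = std::function<Hash32(const Bytes&)>;

// One verification step. Assumes the expected hash of the parent's head is
// already known from *its* parent (or from the torrent root).
// `parentChildHash` is the hash(head of children) field stored in the parent;
// `childHeadBytes` is the serialized head of the child being verified.
bool verifyStep(const HashFn& hash,
                const Bytes& parentHeadBytes, const Hash32& expectedParentHash,
                const Bytes& childHeadBytes,  const Hash32& parentChildHash) {
    if (hash(parentHeadBytes) != expectedParentHash)   // hash finalization #1
        return false;
    Hash32 childHash = hash(childHeadBytes);           // hash finalization #2
    return childHash == parentChildHash;               // plain binary compare
}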
Post by Konstantin 'Kosta' Welke
Post by Joseph Ashwood
This is the same optimal cost as for binary trees. The n-ary tree will be
flatter and so offers faster verification of the piece than the binary
version. For reference, my implementation can verify a single piece of a
478MB file in 4 hashes, assuming 4KB blocks; the same performance for a
binary tree would only cover a 65KB file, again assuming 4KB blocks. Verifying
the same 478MB file would take a binary tree 29 hashes (assuming I counted
correctly), approximately 7 times as long. 7 times the time is not a small
performance penalty.
Could you write out what exactly is calculated here?
Actually I'm having trouble exactly duplicating my numbers as well. Here is
the full expansion of what I did today.
In its current form my implementation occupies 80 bytes per head: 2 64-bit
integers for the nodeID, 2 64-bit integers for the parent nodeID, 1 64-bit integer
for the number of children, and 32 bytes for the hash = 80 bytes per head.
Each node has a head which accounts for 80 bytes, leaving 4016 bytes in a
4KB block. That leaves room for 50 child heads and 16 slack bytes. This
means a 50-ary tree, and so:
Level 1 = 1 block
Level 2 = 50 Blocks
Level 3 = 2500 Blocks
Level 4 = 125000 Blocks
.....
I also calculated that the tree can only house 4016 bytes per leaf because
of the overhead; multiplying the number of blocks at the level by the
number of bytes in the leaf gives the maximum size for the level.
With binary trees I assumed that the nodes are overhead-free, so they can
store the full 4096 bytes in the leaf. Since it is binary, the block counts
are
Level 1 = 1
Level 2 = 2
Level 3 = 4
Level 4 = 8
Level 5 = 16
......
Same as before, multiply the number of blocks at the leaf level by the leaf
node size to get the maximum size.
50-ary tree:
Level 1 = 4016 bytes
Level 2 = 200800 bytes
Level 3 = 9.5MB
Level 4 = 478.74MB
Level 5 = ~24GB
Binary:
Level 1 = 4096 bytes
Level 2 = 8192 bytes
Level 3 = 16KB
Level 4 = 32KB // guess I messed up; I accidentally gave the binary tree an
extra level in my calculations
Level 5 = 64KB
Level 6 = 128KB
Level 7 = 256KB
Level 8 = 512KB
Level 9 = 1MB
Level 10 = 2MB
Level 11 = 4MB
Level 12 = 8MB
Level 13 = 16MB
Level 14 = 32MB
Level 15 = 64MB
Level 16 = 128MB
Level 17 = 256MB
Level 18 = 512MB
Level 19 = 1GB
Only 4.5 times the overhead for a 400 MB file, not 7.
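For anyone who wants to reproduce the table, a small sketch of the arithmetic
(my own code, using the 4096-byte block, 80-byte head, and the 50-way/2-way
branching above):

#include <cstdio>

int main() {
    const double block  = 4096.0;        // bytes per block
    const double head   = 80.0;          // head overhead per 50-ary node
    const double leaf50 = block - head;  // 4016 payload bytes per 50-ary leaf
    const double leaf2  = block;         // binary leaves assumed overhead-free

    double blocks50 = 1.0, blocks2 = 1.0;  // leaf blocks if the tree stops here
    for (int level = 1; level <= 19; ++level) {
        std::printf("level %2d: 50-ary %16.0f bytes, binary %14.0f bytes\n",
                    level, blocks50 * leaf50, blocks2 * leaf2);
        blocks50 *= 50.0;                  // each 50-ary node has 50 children
        blocks2  *= 2.0;                   // each binary node has 2 children
    }
    return 0;
}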
Post by Konstantin 'Kosta' Welke
Are you assuming that you only need all ancestors of a leaf to verify it and
its contents?
Yes I am. Since there is one hash finalization per level, that was the
determining factor, and the only thing I counted.
Joe