Draft: base protocol with merkle trees and new hash algorithms #59
Related discussion: #58 |
beps/bep_0003.rst
Outdated
A list of strings. Each string consists of concatenated hashes
of an intermediate merkle tree layer for each file. The layer is chosen so that
one hash represents one piece. For example if a piece size of 128KiB is used
then 3rd layer up from the leaf hashes is used.
this paragraph implies that the leaf size is 16 kiB, perhaps that should be stated explicitly.
It is stated further down in the `pieces root` definition.
beps/bep_0003.rst
Outdated
@@ -87,70 +92,115 @@ announce

info
This maps to a dictionary, with keys described below.

``piece layer``
I interpret this to be a list where each entry represents a file, and the content of each layer is the full tree (except truncated at the piece size). I would expect the key to be called something like "merkle trees" or at least plural "piece layers"
Yes, each list entry represents one file.
No, it does not represent full trees. I'll have to improve that part. It is meant to be only one level of the tree where the data size covered by the hash is equal to the piece length. so if piece length = 16KiB you get leaf hashes, if it's 32KiB then you get 1 level up, if it's 64KiB you get two levels up etc.
I would prefer always having the 16KiB leaves in there but some users today scale their piece length to keep the .torrent file very small, so I assume they would dislike the potentially massive increase in torrent sizes.
(except truncated at the piece size).
Oh yeah, right. Exactly. I can rename it.
Just to clarify once more: It's just one level of the tree for each file. Not the levels above - they can be derived if needed - or the levels below - they would take up too much space.
ah, right, of course
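The relationship between piece length and the chosen tree layer described in this thread can be sketched as follows. This is only an illustration: the function name `piece_layer_height` is hypothetical, and it assumes the piece length is a power of two that is at least 16KiB, which the thread implies but does not state outright.

```python
BLOCK_SIZE = 16 * 1024  # leaf block size discussed in this thread


def piece_layer_height(piece_length: int) -> int:
    """Return how many layers above the leaf hashes the piece layer sits.

    Assumption: piece_length is a power of two and >= 16 KiB.
    """
    if piece_length < BLOCK_SIZE or piece_length & (piece_length - 1):
        raise ValueError("piece length must be a power of two >= 16 KiB")
    # log2 of the number of 16 KiB blocks covered by one piece
    return (piece_length // BLOCK_SIZE).bit_length() - 1


for kib in (16, 32, 64, 128):
    print(kib, "KiB ->", piece_layer_height(kib * 1024), "layer(s) up")
```

With a 16KiB piece length the leaf hashes themselves are used, and each doubling of the piece length moves the layer one level up.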
beps/bep_0003.rst
Outdated
appear in the files list. The files list is the value
``files`` maps to, and is a list of dictionaries containing
the following keys:
``length``
the distinction between multi-file and single-file torrents is also a frequent source of issues (as the typical client behavior is not necessarily always obvious). But it may not be worth trying to unify these in this proposal
Unification would be possible with a few awkward compromises wrt. legacy compatibility. single-file torrents would have to be represented as multifile ones with a single entry and we'd have to make some "how to interpret the directory layout" rules.
yeah. on the other hand, one could argue that those rules of how to interpret the directory structure for multi-file torrents are already there (just not documented and kind of de-facto). Sometimes people create single-file torrents using the multi-file structure, and have empty torrent "name" fields. To handle that, I put the hex encoded info-hash as the name, and then users get confused.
I hate 'digest func' being required. There should be an assumed default value with the key being a marker so clients can cleanly report that they don't support a new algorithm. I don't want to get into a discussion of what that default should be in this thread, but it should be only one thing.
beps/bep_0003.rst
Outdated
``length``
Length of the file in bytes.

``pieces root``
I wonder if the word "pieces" may be a bit misleading here, since the tree doesn't (necessarily) terminate at the piece level
I'm not married to any of the names. "block tree root", "block root", "tree root" all work. I would avoid "root hash" since it's already used in BEP 30.
beps/bep_0003.rst
Outdated
``pieces root``
The root hash of a merkle tree with a branching factor of 2,
constructed from 16KiB blocks of the file.
The last block of the file may be smaller than 16KiB.
It may be worth being explicit here. The end piece of a file, should it be (1) padded with zeroes for the purposes of calculating the hash, or (2) should the hash be calculated on the truncated byte range?
I imagine the benefit of (1) is that it creates more coherent pieces whereas (2) may be slightly more efficient
I have no strong opinion in either direction. As written it currently is intended to be (2).
regarding easing upgrades, zero-padding pieces at the end of a file may provide some benefits.
I have a feeling that (1) would be simpler to implement, as it would preserve the "dense", fixed size pieces for purposes of hashing. It would mean a backwards compatible torrent would need to pad the last file as well. Without any implementation experience, I can't back this up other than with a gut feeling
Padding with zeros avoids a weird layer violation. It isn't particularly expensive because the hash value of 16KiB of zeros can be cached. The extra cost is that file sizes are rounded up to a multiple of 16KiB. I could go either way on this one.
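The caching argument can be illustrated with a short sketch. It follows the zero-padding variant discussed in this thread and assumes SHA-256 as the hash function (the one used by the example torrent creator); the choice of function is still open at this point in the discussion, and `block_hash` is a hypothetical helper name.

```python
import hashlib

BLOCK_SIZE = 16 * 1024

# Assumption: SHA-256. The hash of an all-zero block is computed once and reused.
ZERO_BLOCK_HASH = hashlib.sha256(bytes(BLOCK_SIZE)).digest()


def block_hash(block: bytes) -> bytes:
    """Hash one block, zero-padded to BLOCK_SIZE.

    Blocks that are empty or consist only of zero bytes reuse the cached hash.
    """
    if len(block) > BLOCK_SIZE:
        raise ValueError("block larger than 16 KiB")
    if not any(block):
        return ZERO_BLOCK_HASH
    return hashlib.sha256(block.ljust(BLOCK_SIZE, b"\0")).digest()
```

Only the final, partially filled block of a file needs actual padding work; blocks that lie entirely in the padded region cost a single lookup.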
The padding already is in the current revision. GH should be showing this conversation as outdated for that reason.
Although it's only padded to the nearest multiple of 16KiB.
It isn't particularly expensive because the hash value of 16KiB of zeros can be cached.
You mean that the remaining leaves of the merkle tree are to be derived from 16KiB of zeroes too, instead of just initializing the leaf hashes to zero?
Sorry I misunderstood thinking that padding was to the end of the piece. There's some benefit to end-of-piece padding, in that if multiple peers are all downloading just one file they won't have to download extra stuff to complete pieces to send those to each other.
Well, files are piece-aligned now. But the holes created by that alignment can be larger than 16KiB if the piece size is larger than 16KiB. The hashing only rounds up to the nearest 16KiB boundary so implementations can use fixed-sized buffers when hashing.
In other words piece-padding != hash-padding
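The difference between the two kinds of padding comes down to two round-up operations. The sketch below only does the arithmetic; `round_up` and `padding_sizes` are hypothetical helper names, not part of the draft.

```python
BLOCK_SIZE = 16 * 1024


def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


def padding_sizes(file_length: int, piece_length: int) -> tuple:
    """Return (hash_padding, piece_padding) in bytes for one file.

    hash_padding:  bytes needed to reach the next 16 KiB boundary for hashing
    piece_padding: size of the hole left by aligning the next file to a piece
    """
    hash_padded = round_up(file_length, BLOCK_SIZE)
    piece_aligned = round_up(file_length, piece_length)
    return hash_padded - file_length, piece_aligned - file_length


# A 100000-byte file with 64 KiB pieces
print(padding_sizes(100_000, 64 * 1024))  # (14688, 31072)
```

In this example the alignment hole (31072 bytes) is larger than 16KiB, while the hash padding never exceeds one block.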
beps/bep_0003.rst
Outdated

Each dictionary contains

``path``
I'm a little bit worried about how flexible this structure is. I take it the file `a/b/c/foo.txt` could be encoded in several different ways:
{ "path": ["a", "b", "c", "foo.txt"], "files": {"length": ..., "pieces root": "..." } }
as well as:
{ "path": ["a"], "files": { "path": ["b"], "files": { "path": ["c"], "files": {"path": ["foo.txt"], "length": ..., "pieces root": "..." } } } }
I appreciate the value of being able to represent paths in more compact ways than to require a separate dictionary at each level though. A strawman alternative could be:
{ "paths": [["a"], ["b"], ["c"], ...], "files": [ { "name": "foo.txt", path: 0, ...}] }
i.e. store the paths in a separate list and have files specify the index to which directory they are in.
The first case is illegal since `path` is mandatory, including in leaves.
The second case would be the norm if those entries had siblings, otherwise it should be collapsed to path: ["a", "b", "c", "foo.txt"]
The problem with the last case is that it could not be made into a hybrid torrent that is backwards-compatible with BEP3. And it would be less efficient since it would still not encode shared prefixes as efficiently as a trie does.
I see. It was not obvious to me that this field was mandatory, and specifically must be present for each leaf (which I think should be clarified). My concern remains though, of having many different ways of encoding the same thing (I consider it to be in the same class of issues as overlong utf-8 encodings).
Your point about this structure supporting coexisting with the current "files" structure is a good point, and I think it should be documented as well (perhaps I just missed it).
As for reusing path prefixes, yes. However, there's a fair amount of overhead in the dictionary keys in the directory tree in this proposal too, you would need a fair amount of path string reuse to "break even". I wonder if there's a more compact (and perhaps simpler) way to represent directory trees.
I wonder if there's a more compact (and perhaps simpler) way to represent directory trees.
Yes, using dictionary keys as path elements. It would also have the neat property of enforcing uniqueness.
We can do that, but then we'll have to encode it twice for backwards-compatible torrents. I'm not sure what is better, compromising in the new format or compromising how to achieve backwards compatibility.
(perhaps I just missed it).
No, it's part of the TODO
I don't think there's much gain from compromising the new format to make it look more like the old format because it's breaking backwards compatibility anyways. We might as well define two different formats: A backwards compatible format which is the same as BEP3 except with the `pieces root` key added, and a new trie based format. Only one of them would be present in a torrent.
I think that would be a bad idea. Since then an implementation that only intends to handle the new format would still have to be able to parse the old one because there could be old-format-but-with-root-hash torrents.
I.e. such a hybrid format would not actually be forward-compatible. We would be creating yet another legacy form.
So I only see two possible approaches for each change:
a) Have any new field mimic the old form in some way so that at least a subset of the allowed values can be used in a hybrid format. This is what I've done with the `files`, `length`, `piece length` and `name` fields.
b) Let the new format use a different key and encode the information twice. This is what I would do with `pieces` vs. `pieces root` + `pieces layers`, which essentially is redundant information that can exist side by side.
So if we want a completely new files format, hybrid torrents will also have to encode it twice.
Now that I think about it, it's probably not so bad since for many - even if not all - torrents the pieces make up the bulk of the size anyway, so we might as well duplicate that too.
A hybrid format would then look like `files`, `pieces` existing alongside `pieces layer`, `pieces root` and `files tree` or something like that.
beps/bep_0003.rst
Outdated
multi-file
----------

``files``
Back in the day, discussions about how torrents could be made more deterministic (i.e. more likely to generate the same info-hash when the same files are turned into torrents independently) included a few things that may be worth considering with a compatibility break:
- requiring files to be listed in a deterministic order
- for the purposes of the info-hash, only refer to files by their hash, and not their name (in this case it could be the root hash)
I suppose perhaps the approach people would prefer is to identify duplicate files across torrents by their hash separately
The per-file-tree with fixed block sizes should forever solve the dedup problem.
Putting the file information outside the info dict calculation would be possible, but that would make bep9 more complex since magnets also have to convey file naming information, not just piece payload.
beps/bep_0003.rst
Outdated
* pieces field
* double announce behavior
* safe hashing. avoid downgrade attacks
* changes to BEP 9. magnets. send merkle layers.
interesting. I imagine the main value from merkle trees is to extend the PIECE message to include uncle-hashes (like the tribler protocol does). But I suppose extending BEP 9 may provide a simpler upgrade path.
Yes, as I said in the initial comment, this is intentional to decouple hash transfer from pieces transfer. The tribler approach is great for one thing, terrible for everything else.
One use case that per-file hash trees makes worse than current bittorrent is having lots of small files. In the current protocol a lot of files may fit in a single piece, meaning they only add one hash to the .torrent file. With per-file merkle trees, each file will add a hash, which could make the info dictionary significantly larger. Since the filenames are likely to also use a fair amount of space in the info dictionary, this is a problem currently as well. No obvious solutions to this come to mind though (short of having built-in support for tar files) |
I think the cost of additional hashes is paid for by making the directory layout more efficient, which usually is several layers deep for torrents with >10k files. |
New version with the dictionary based path trie. Since no operating system permits empty path elements I've used that to specify path properties. It could even be used to specify per-directory properties by other BEPs. |
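A dictionary-based path trie of the kind described here might be built as in the sketch below. It is an illustration under assumptions, not the draft's exact layout: `build_file_tree` is a hypothetical helper, and the property keys `length` and `pieces root` are taken from the discussion above. The empty-string key holds the per-file properties, which works because no real path element can be empty.

```python
def build_file_tree(files):
    """Build a nested dict keyed by path elements.

    files: iterable of (path_components, length, pieces_root) tuples.
    File properties are stored under the empty-string key of the leaf node.
    """
    tree = {}
    for components, length, pieces_root in files:
        node = tree
        for name in components:
            if not name:
                raise ValueError("empty path elements are reserved for properties")
            # Shared directory prefixes are stored only once.
            node = node.setdefault(name, {})
        node[""] = {"length": length, "pieces root": pieces_root}
    return tree


tree = build_file_tree([
    (["a", "b", "foo.txt"], 3, b"\x00" * 32),
    (["a", "b", "bar.txt"], 5, b"\x01" * 32),
])
print(sorted(tree["a"]["b"]))  # ['bar.txt', 'foo.txt']
```

Because dictionary keys are unique, two entries can never claim the same path, and each file has exactly one encoding.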
I have replaced the hash agility with a version field. Which actual function will be used is still open since discussion in #58 is ongoing. I also added the reject message and associated state machine from the fast extension. Also added is emphasis that the |
beps/bep_0003.rst
Outdated
For example if the piece size is 16KiB then the leaf hashes are used.
If a piece size of 128KiB is used then 3rd layer up from the leaf hashes is used.
Files smaller or equal to the piece size are represented by an empty string since
the root hash is sufficient to cover the piece.
We should specify how nodes which only contain leaves beyond the end of the file are handled here. The obvious choice would be to exclude them from the list.
good point, added.
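Excluding nodes that lie entirely beyond the end of the file means the piece layer of a file holds one hash per piece that actually contains file data. A minimal sketch of that count, with the hypothetical name `piece_layer_hash_count`:

```python
def piece_layer_hash_count(file_length: int, piece_length: int) -> int:
    """Number of hashes in a file's piece layer.

    Files no larger than one piece contribute no hashes, since their root
    hash already covers the single piece. Nodes whose leaves all lie beyond
    the end of the file are not counted.
    """
    if file_length <= piece_length:
        return 0
    return -(-file_length // piece_length)  # ceiling division


print(piece_layer_hash_count(300 * 1024, 128 * 1024))  # 3
```

A 300KiB file with 128KiB pieces therefore lists three hashes, even though a fully balanced tree would have four nodes at that layer.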
beps/bep_0003.rst
Outdated
@@ -87,70 +92,153 @@ announce

info
This maps to a dictionary, with keys described below.

``piece layers``
A high level comment on this overall structure: There's going to be an overlap in time where there are torrent files which support both new and old style data, and new style clients can connect to either type, then after a transition the old-style data should be dropped. A bit of redundancy between the old and new data isn't terribly problematic. What I'd like to see is that the new-style data all be included in a bencoded string under a single key in the info dict, so that its value is committed to by the old style data, and after a transition all other keys in the info dict are dropped. When new peers connect to each other they should specify the new algorithm hash of just the new data and communicate based on that, and they'll need to use a slightly extended peer protocol to be able to communicate paths to the root of individual chunks.
I mostly agree with the new keys as presented here, but think they should be moved into the new dict.
What is the underlying motivation for that? To be able to transform a hybrid torrent to the new format exclusively without changing the new hash? I guess that's possible, but it would prevent peers that obtained the torrent the new way to interoperate with old peers even if the torrent itself was created in a hybrid format.
But forever lugging around the legacy data on hybrid torrents created during the transition phase is not ideal either.
Hrm, I'll have to think about it.
Handshaking can be done by using one of the handshake bits to specify support for the new format. If both sides handle the new format then the receiving side can send the hash of just the new-style data, even though the first hash sent by the initiating side is still an old-style sha1.
Handshaking is unrelated. Clients have to do 2 announces / dht lookups anyway since the hashes change, so they know which peer supports which format. The issue is the chain-of-trust from the hash to the hybrid data. If the new hash is only derived from the new data but not from the hybrid format then metadata sourced from the new hash can't be used to talk to old clients since the old data cannot be obtained due to a lack of trust.
The client can get disjoint sets of peers from each announce though, so it may receive a peer only under the legacy infohash when it does in fact support both formats. The client would only discover the peer's support for the new format during the handshake, so we need a flag to signal that.
Hrm, yeah, that wasn't what I was aiming for, but it's a good point. If both have hybrid torrents and know both infohashes then that can be used to upgrade the connection.
It's okay to trust the old-style metadata for now. Sha1 isn't that badly broken. The goal should be to finish this transition before such concerns become serious. There is a problem with not having paths to the root for pieces downloaded from the old peers. It may be that putting a new style piece hash list outside the info dict (which I think is what you're proposing) is the best thing to do for that, but it kind of sucks having torrent files be so bloated in the interim. |
It looks like we have not reached common understanding yet. If I understand this comment correctly you propose that the new format be structured as follows:
That is indeed what I am proposing, but that is mostly unrelated to the transition. It is supposed to be a permanent feature of the new format. I have moved them out of the info dictionary so that use cases that need fast startup can use the (now modified) metadata exchange to just obtain the file list with root hashes without the lower layers, while other use cases that require more fine-grained hash information (deduplication, resuming of partial downloads, interoperation with other protocols such as webseeds) can get the full set. In short: .torrent files contain fine-grained hashes, metadata initially only transfers the lightweight info-dict with per-file roots, fine-grained hashes can be obtained separately. See also: the8472@e25fa15
Oh I see what you're saying now. That is indeed what I proposed, but I don't have a strong objection to making the new-style hash be over the whole info dict, because once the old-style data is dropped canonical representation isn't a real problem any more because the dict only has one thing in it. Having a new-style piece list outside the info dict is perfectly fine but it should be optional. A torrent which has only support for new style peers doesn't need it, and getting rid of that bloat matters in some use cases. |
I have designed the .torrent to be the slow path that contains all the data. This ensures that the data is retained for those who need it. Making it optional would mean it could get lost in a game of telephone. Clients that need the fast path should just use the hash or magnet + metadata exchange. |
In fact the data is essential for use with the core protocol. There is no other way to obtain these hashes short of rehashing a complete file. Metadata exchange allows them to be transferred incrementally, but it is an extension. A barebones client not supporting it will need the fine-grained hashes in the torrent file.
Imo it is ready for merging now. Any issues discovered during the first implementations can be addressed in separate PRs. |
beps/bep_0052_torrent_creator.py
Outdated

def info_hash_v2(self):
    return binascii.hexlify(sha256(encode(self.info)).digest()).decode('ascii')
there's no need to use binascii here, just call `hexdigest()` instead of `digest()`
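The two spellings produce the same string, as this small check shows. The `payload` value is a hypothetical stand-in for the bencoded info dictionary.

```python
import binascii
from hashlib import sha256

# Hypothetical stand-in for encode(self.info)
payload = b"d4:name4:teste"

via_binascii = binascii.hexlify(sha256(payload).digest()).decode("ascii")
via_hexdigest = sha256(payload).hexdigest()

# Both are the lowercase hex form of the same 32-byte digest.
assert via_binascii == via_hexdigest
```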
beps/bep_0052_torrent_creator.py
Outdated
@@ -243,3 +253,7 @@ def create(self, tracker, hybrid=True):
args = parser.parse_args()
t = Torrent(args.path, args.piece_length)
open(t.name + '.torrent', 'wb').write(encode(t.create(args.tracker, args.v2_only)))
if args.v2_only:
There needs to be a `not` here.
Wait no it's correct as it is. Clearly the naming here is not the best.
I know it's a bit late, but is there a chance we can get BEP48 merged into this specification, too? |
Scrapes are not essential for getting bittorrent to work. BEP3/52 basically describe minimum implementations. |
Hi guys! I'm about to begin implementing v2 changes and am currently reading the spec. I'm having a hard time with some of the wording though and would be really grateful, if someone helps with clarification.
What about small files (smaller than or of the same size as the piece size)? Do they still have a key-value pair in
I understand that hashing is not applicable to empty files, so they don't have the
Being more explicit about the interconnection of new concepts would also be beneficial, i.e. stating that
If small files are omitted from the
Also looks to me like the |
Omitted completely. There is nothing to include if the merkle tree only has a single node.
In principle the empty-keyed descriptor dictionary is also allowed for directories, which neither have
This too is formulated in a positive manner, listing a single case, instead of enumerating all the possible cases to exclude: For each file in the file tree that is larger than the piece size it contains one string value. You could also put it this way: What makes it non-trivial?
That's what magnets are for. A .torrent is only in a fully valid state once the piece layers have been included. Those are necessary for partial resumes and stateless torrent clients. So I want to make sure that people don't just go around omitting them because they don't see the use-cases. That said, I know spec wording can be a bit obtuse. But having written it I am blind on that front, so I need strong arguments why something is confusing or can be misunderstood to be convinced. If other people also have issues with it, please speak up! |
Correction. Since the piece layer can be several layers down in the merkle tree it's not about the merkle tree only having a root node, it's about the merkle tree having fewer layers than where the piece layer would be located. merkle tree of a small file, in heap order: P being the piece layer, 0 being beyond-end-of-file leafs. |
Thanks, I agree that the matter of wording is very subjective, and it might as well be me lacking the required mental capacity to connect all the dots at once, so I'd prefer to see more hints and cross-references :)
This is an interesting point by the way.
I assume this means that aside from the piece hashes (i.e.
Is my assumption correct? If yes, then I can't see why a stateless client can't calculate piece hashes in addition to leaf hashes. If my math is correct, it increases the complexity of required hashing by a constant factor in the range of [1.5; 2). |
Another question about
Shouldn't it be a multiple of length*2 ? Otherwise the client will be allowed to request hashes that are not siblings (from different subtrees), and uncle hashes will not be sufficient for verification. |
Kind of, but you're missing two details:
Length is already a power of two. So if your length is 4 you can only use 0, 4, 8, ... as index. That's already aligned to a single subtree, no? Can you construct a case where more than one uncle hash per layer is needed? |
You can also cross-check with the example implementation. |
Is streaming considered real-time? If yes, I can't see how real-time can be considered "rare". As for the hashes.. what exactly stops me from requesting 2 hashes, starting with index 1? The receiver ought to make a sanity check |
Streaming clients can buffer in the general case. For startup they might want to use sub-piece hash checking to reduce the initial latency to playback. But again, that's just a transient use.
Requesting 2 hashes means length = 2 in the request. Which means you can only request index 0, 2, 4, ...
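The alignment rule described in these replies (length is a power of two, index is a multiple of length) is the kind of sanity check a receiver could apply. The sketch below covers only that rule; `is_valid_hash_request` is a hypothetical name and any other constraints the spec places on hash requests are not modelled.

```python
def is_valid_hash_request(index: int, length: int) -> bool:
    """Check that a hash request covers exactly one aligned subtree.

    length must be a power of two and index a multiple of length, so the
    requested hashes are all the nodes under a single ancestor.
    """
    is_power_of_two = length > 0 and length & (length - 1) == 0
    return is_power_of_two and index >= 0 and index % length == 0


print(is_valid_hash_request(0, 2))  # True
print(is_valid_hash_request(1, 2))  # False: starts in the middle of a subtree
print(is_valid_hash_request(8, 4))  # True
```

Because every valid request maps to one subtree, a single uncle hash per layer is enough to verify it against the root.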
Oh, now I see it, my bad...)) I'm probably losing my sight
Just checked, might be worth noting that when run under Python 2 for a single-file torrent, it formats |
The shebang is As for the rest, I guess there's some room for a few simple visualizations wrt. merkle tree layers. Maybe in the style of this comment |
I just pushed a commit to add a |
Oh, heh. |
I've put up two test torrents, one v2-only and one hybrid, here: https://libtorrent.org/bittorrent-v2-test.torrent |
For discussion. For now it's as a diff to BEP3. Later I can rebase and move it to a new BEP.
Changes so far:
Note that things that need super-fast torrent startup do not need to concern themselves with the bulky hash list in the torrent root dictionary. Those things can be handled by extending BEP9 to send those hashes piecemeal after obtaining the more lightweight info dictionary.
Also note I'm not using `tr_hashpiece` logic since I think the transfer of hashes should be decoupled from the transfer of payload. This is important for some use-cases as discussed in issue #29