Bittorrent v2 #2197

ssiloti · 2017-07-31T22:21:41Z

I’m opening this issue as a place to discuss and coordinate the implementation of bittorrent v2. A draft spec of the bittorrent v2 protocol has been published as BEP 52.

The first thing I plan to work on is support for creating and loading v2 metadata. This will mainly involve extending create_torrent, torrent_info, and file_storage to support both v1 and v2 metadata.

I predict the most complex and invasive changes will be to support hybrid torrents in torrent. I think we’ll want to split out part of torrent into a swarm class. Then torrent can hold both a v1 and v2 swarm with shared data like piece_picker remaining in torrent.

Should we keep support for BEP 30 merkle tree torrents? BEP 52 effectively renders BEP 30 obsolete, so unless there are some existing users of BEP 30 we can probably drop it and simplify the code.

Obviously this isn’t going into the 1.2 release, so we’ll want to keep bittorrent v2 work on a feature branch at least until the RC_1_2 branch is created.


arvidn · 2017-08-01T07:18:37Z

I agree that the BEP30 merkle tree does not need to be supported with bittorrent v2.

It sounds like a reasonable approach. I also have a new-disk-io branch which revamps the disk I/O (it re-implements essentially everything behind the disk_interface) to use memory-mapped files (on 64-bit systems that support mmap).

ssiloti · 2017-08-07T18:20:57Z

Another keep-or-discard question: the create_torrent::optimize_alignment option is incompatible with v2 metadata, so I'd like to deprecate it. This also raises the larger question of whether we should continue to support generating v1-only metadata at all. For the vast majority of people there's no reason not to generate hybrid metadata. There may be someone who really cares about the size of the torrent files, but on the other hand we really want to push people to generate v2 metadata, as it's key to enabling the transition to the v2 protocol.

arvidn · 2017-08-07T18:57:29Z

Right, the reason optimize_alignment is incompatible is that in v2 all files are aligned and "tail-padded" (i.e. no two files ever share a piece, regardless of how small a file is).

v2 basically requires the pad_file_limit == 0, alignment == , tail_padding = true.

The optimize_alignment flag is still required to inject the pad files at all right now, so v2 semantics are similar to that, except the "pad files" are implied.
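To make the "no files ever share a piece" rule concrete, here is a minimal sketch of a piece-aligned layout. It is not libtorrent code; the function name `aligned_layout` and the tuple layout are made up for illustration, and the padding reported after the final file is shown only for uniformity.

```python
def aligned_layout(file_sizes, piece_length):
    """Return (offset, size, implied_padding) for each file.

    Hypothetical helper: every file starts on a piece boundary, so the
    gap after each file plays the role of an implied pad file.
    """
    layout = []
    offset = 0
    for size in file_sizes:
        # Bytes needed to round this file up to the next piece boundary.
        padding = (-size) % piece_length
        layout.append((offset, size, padding))
        offset += size + padding
    return layout


layout = aligned_layout([10, 16384, 20000], piece_length=16384)
for offset, size, padding in layout:
    print(f"offset={offset} size={size} implied padding={padding}")
```

Even the 10-byte file occupies a whole piece on its own, which is why an option that tries to be selective about alignment has nothing left to decide.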

ssiloti · 2017-08-07T20:13:12Z

I'm only talking about dropping support for generating v1 only metadata. By default libtorrent will generate hybrid metadata which will work with both v1 and v2 clients.

ssiloti · 2017-08-07T20:21:18Z

BEP 52 defines the metadata in a way which allows a torrent to have both v1 and v2 metadata in the same info-dict. So you can generate a torrent file which has only v1 keys, only v2 keys, or both. Including both v1 and v2 keys will be the default for the foreseeable future, with v2 only as an option for users who don't care about backwards compatibility.
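As a rough sketch of what "both v1 and v2 keys in the same info-dict" looks like, the snippet below builds a toy hybrid info-dict, bencodes it, and derives both infohashes from the same bytes. The key names (`meta version`, `file tree`, `pieces root`) follow my reading of BEP 52 and should be checked against the spec; the hash values are placeholders rather than real piece hashes, and the tiny `bencode` helper is written here only for the example.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are emitted in sorted raw-byte order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


common = {"name": "a.txt", "piece length": 16384}
# Placeholder digests: a real torrent carries real SHA-1 / SHA-256 hashes.
v1_keys = {"length": 10, "pieces": bytes(20)}
v2_keys = {
    "meta version": 2,
    "file tree": {"a.txt": {"": {"length": 10, "pieces root": bytes(32)}}},
}

hybrid_info = {**common, **v1_keys, **v2_keys}
encoded = bencode(hybrid_info)

# Both hashes are taken over the same encoded info-dict.
v1_infohash = hashlib.sha1(encoded).digest()
v2_infohash = hashlib.sha256(encoded).digest()
print(v1_infohash.hex(), v2_infohash.hex())
```

Dropping `v1_keys` from the merge would give a v2-only torrent, and dropping `v2_keys` a v1-only one.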

ssiloti · 2017-08-07T21:16:56Z

I'm not sure what you're referring to taking 10 years. BEP 52 was published a few months ago. The main impetus behind BEP 52 is improved security by changing the hash function from SHA1 to SHA256. This change wasn't seen as urgent until Google published the first SHA1 collision earlier this year.

BEP 52 requires a bit more than BEP 47 pad files. The files also need to be sorted by path.

the8472 · 2017-08-07T21:20:28Z

@Col-blimp you can read the background in bittorrent/bittorrent.org#58 and bittorrent/bittorrent.org#59

ssiloti · 2017-08-08T16:34:43Z

BEP 52 is a modification of BEP 3 so it inherited BEP 3's creation date.

ssiloti · 2017-08-29T23:16:46Z

For those who want to follow along, I've put up a work-in-progress branch. Currently it just has support for generating hybrid torrent files. It's based on arvid's new-disk-io branch because I anticipate that branch will be merged to master before v2 and I'd rather not implement the disk I/O code twice.

ssiloti · 2017-09-25T18:55:31Z

I'm planning on dropping support for generating and parsing torrent files which have a file with the same name as a directory. This is kind of a pain to support in the v2 parsing code and I don't see a good reason to continue support for it. AFAIK most (all?) filesystems forbid such a conflict.

the8472 · 2017-09-25T19:20:00Z

> This is kind of a pain to support in the v2 parsing code

The v2 spec forbids it anyway.

ssiloti · 2017-09-25T19:47:41Z

Does it? I don't see any language which explicitly forbids it. The file tree structure can certainly encode such a conflict by placing an empty dict key among a directory's subordinate path elements.

the8472 · 2017-09-25T20:00:13Z

> length
> Length of the file in bytes. Presence of this field indicates that the dictionary describes a file, not a directory. Which means it must not have any sibling entries.
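For reference, the conflict being discussed can be detected with a short walk over the file tree. This is only a sketch under the assumption that a file's properties live under an empty-string key, as described above; `find_conflicts` is a hypothetical name, not a libtorrent function.

```python
def find_conflicts(tree, path=()):
    """Yield paths of nodes that are both a file and a directory.

    A node is treated as a file when it has the "" key; any other key
    in the same node is a sibling entry, which is the forbidden case.
    """
    for name, node in tree.items():
        if name == "":
            continue  # file properties, not a path element
        here = path + (name,)
        if "" in node and len(node) > 1:
            yield "/".join(here)
        yield from find_conflicts(node, here)


ok_tree = {"docs": {"readme.txt": {"": {"length": 12}}}}
# "docs" is encoded as a 5-byte file and as a directory at the same time.
bad_tree = {"docs": {"": {"length": 5}, "readme.txt": {"": {"length": 12}}}}

print(list(find_conflicts(ok_tree)))
print(list(find_conflicts(bad_tree)))
```

A parser that rejects any node reported by such a check never has to represent a name that is both a file and a directory.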

ssiloti · 2017-09-27T20:52:26Z

I'm planning on restricting the torrent_info::remap_files feature so that the new files must have a size that's a multiple of the piece size, or equal to the remaining size of the torrent. In other words, piece alignment is required and pad files are forbidden. Otherwise this feature would negate much of the simplification we get from requiring piece aligned files in v2.
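The proposed restriction can be stated as a small check. The sketch below is an interpretation, not the remap_files implementation: `remap_sizes_allowed` is a made-up name, and it additionally assumes the new files must cover the torrent exactly.

```python
def remap_sizes_allowed(new_sizes, piece_length, total_size):
    """Check that each new file is a multiple of the piece size,
    or exactly the remaining size of the torrent."""
    remaining = total_size
    for size in new_sizes:
        if size > remaining:
            return False
        if size % piece_length != 0 and size != remaining:
            return False
        remaining -= size
    # Assumption: the remapped files account for every byte.
    return remaining == 0


piece = 16384
total = 3 * piece + 100
print(remap_sizes_allowed([2 * piece, piece, 100], piece, total))
print(remap_sizes_allowed([100, 3 * piece], piece, total))
```

Only the last non-empty file can have an unaligned size under this rule, so every file still starts on a piece boundary without pad files.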

arvidn · 2017-09-27T23:45:14Z

I'm not sure remap_files() offers a ton of utility in general, and it's probably not worth spending a lot of effort on supporting it for v2 torrents. Would it be simpler to just make it work on v1 files, and not at all on v2?

oleiba · 2018-08-13T19:15:54Z

What's the state of the draft? Communication seemed to pause since August 2017.

In particular I'm interested in the migration to a collision-resistant hash and in Merkle roots in .torrent files (the replacement for BEP 30).

ssiloti · 2018-08-14T01:07:35Z

The draft BEP is unchanged. There's an alpha quality implementation at https://github.com/ssiloti/libtorrent/tree/v2

X-Coder264 · 2018-08-14T19:35:29Z

@ssiloti I have a question regarding your implementation of the v2 spec.

So since hybrid torrents will have two info hashes (a SHA1 for v1 and SHA2-256 for v2) that means there will be two announces to the tracker (one for each info hash). How will the downloaded and uploaded data be announced to the tracker? For example let's say a peer uploaded 100 MB in the v1 swarm and 200 MB in the v2 swarm. Will the announces to the tracker be like /announce?info_hash=<v1_hash>&uploaded=100 MB and /announce?info_hash=<v2_hash>&uploaded=200 MB or are both announces gonna have uploaded=300 MB? Hopefully it's the former (as the latter doesn't make sense and the former is the only one possible if the torrent is v1 only or v2 only).

I'm asking because of this:

> Implementations supporting both formats can join both swarms by calculating the new and old infohashes and downloading them to the same storage.

I don't know anything about libtorrent internals nor am I a C++ dev (I just took a quick look at your v2 branch commits), but it seems to me that if the client can join both swarms the tracking of what traffic goes to which swarm/peer gets considerably more complex. Hopefully libtorrent will still be able to track that and send the announce requests properly.

ssiloti · 2018-08-15T02:56:27Z

Right now both announces will report 300 MB because v1 and v2 peers share the same torrent and thus the same stat object.

As you say, keeping separate stats would add significant complexity. Keeping separate counts of corrupt bytes would be particularly troublesome. Ideally trackers which care about these numbers would gain awareness that the v1 and v2 hashes refer to the same torrent, but I suspect that's not going to happen so we're probably going to have to take on the extra complexity.

X-Coder264 · 2018-08-15T11:09:17Z

I was afraid that'd be the answer I'd receive. Actually, I see now that you've already written this in the first post (which I read a long time ago and forgot about it).

> I predict the most complex and invasive changes will be to support hybrid torrents in torrent. I think we’ll want to split out part of torrent into a swarm class. Then torrent can hold both a v1 and v2 swarm with shared data like piece_picker remaining in torrent.

It isn't a problem for a tracker to gain awareness that the v1 and v2 hashes refer to the same torrent, but two things:

a) That would make one of the two announces redundant (e.g. on the first announce the tracker is like "OK, cool, your stats have been updated in the database", then the second announce comes and the tracker is again "OK, cool", but the stats have already been updated by the previous announce, so we are basically doing nothing here). There's also the problem that trackers would have to handle both of those announce requests reaching the server at the same time, which of course also adds complexity to the code.

b) What happens when some client (library) implements the keeping and announcing of separate stats? There's no way for the tracker to behave in two completely different ways (for some clients to just basically "ignore" the second announce while for others having to take into account both announces).

The spec's attention has gone entirely to the client side and to the changes to the .torrent file, while none was given to the tracker's side of the story. Stuff like this (how the client should announce when joining both swarms of a hybrid torrent) is something that IMO must be defined in the spec itself. Having unwritten rules that become the de facto standard somewhere along the road is just bad. It wastes development time (when trackers need to rewrite stuff to be in line with the client's behavior, or vice versa) and also delays adoption of the standard. I don't know where the discussion about the BEP is taking place now (since bittorrent/bittorrent.org#59 was merged), but since both you and @arvidn worked on and discussed it, hopefully you can bring this up so that things like this are clearly specified before the BEP status changes from draft to final and accepted.

ssiloti · 2018-08-16T03:08:00Z

I don't follow what the complexity is that you refer to in paragraph a. Merging announces on multiple infohashes is the same as merging multiple announces on the same infohash. The upload/download stats are cumulative so the tracker takes the maximum of the values it has seen. The tracker doesn't even need to keep track of which infohashes refer to the same torrent, it can key off of the peer_id which will be the same for both v1 and v2 announces.
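A tracker-side merge along these lines could look like the sketch below. It is purely illustrative: the `merge_announce` function and the in-memory `stats` dict are assumptions for the example, not any real tracker's API.

```python
def merge_announce(stats, peer_id, uploaded, downloaded):
    """Fold one announce into per-peer totals.

    The counters are cumulative, so the tracker keeps the maximum it
    has seen for a peer_id, whichever infohash the announce was for.
    """
    prev_up, prev_down = stats.get(peer_id, (0, 0))
    stats[peer_id] = (max(prev_up, uploaded), max(prev_down, downloaded))
    return stats[peer_id]


MB = 1024 * 1024
stats = {}
# The v1 and v2 announces of a hybrid torrent report the same shared totals.
merge_announce(stats, b"peer-1", uploaded=300 * MB, downloaded=0)
merge_announce(stats, b"peer-1", uploaded=300 * MB, downloaded=0)
print(stats[b"peer-1"][0] // MB)
```

Because taking the maximum is idempotent and order-independent, the second announce changes nothing and the arrival order of the two requests does not matter.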

X-Coder264 · 2018-08-16T12:28:55Z

Actually you are right, forget what I said about the complexity in paragraph a, it's like I completely forgot that stats are cumulative while I was writing that 😛

btw, I've opened bittorrent/bittorrent.org#87 so that this can be further addressed there.

ssiloti · 2019-05-28T03:24:16Z

I've updated the v2 branch in my repo with a heavily squashed and cleaned up patch set and rebased on the current master.

ssiloti · 2019-08-10T17:02:01Z

Protocol v2 support has landed in master! See #3873

atomashpolskiy mentioned this issue Aug 8, 2017

Adopt BitTorrent v2 specification atomashpolskiy/bt#28

Open

ewhal mentioned this issue Aug 9, 2017

Support BitTorrent v2 spec: BEP 52 anacrolix/torrent#175

Open

Chocobo1 mentioned this issue Jan 5, 2018

raise auto piece size selection limit to 16 MB in create_torrent() #2669

Merged

X-Coder264 mentioned this issue Aug 16, 2018

BEP 52 - Define client announce behavior when joining both swarms of a hybrid torrent bittorrent/bittorrent.org#87

Open

LordNyriox mentioned this issue Oct 15, 2018

[WIP] Adapt sources to the libtorrent master branch qbittorrent/qBittorrent#9704

Closed


ssiloti closed this as completed Aug 10, 2019
