Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encrypted torrents #20

Closed
wants to merge 8 commits into from
Closed

Conversation

the8472
Copy link
Contributor

@the8472 the8472 commented Oct 10, 2015

Based on discussion in #18

The BEP still lacks test vectors, but I would prefer some review/feedback before finalizing it.

- remove section on keys in torrents
- specify separate torrent key files instead
- better single file / multi file handling
- don't include infohashes in key file

``shadow``
bencoded-then-encrypted dictionary of key-value pairs that shadow entries in the info dictionary.
If it is absent only the payload is encrypted and all info dictionary entries are non-shadowed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would roll better off the tongue to say: "...encrypted and no info dictionary is shadowed."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it, does it make sense to define the shadow dictionary as optional value? I.e. is the complexity-feature tradeoff worth it? It allows the client to be certain that the file layout is as it is without decrypting the metadata if the dict is absent. Idk if that's useful enough to justify that.

@the8472
Copy link
Contributor Author

the8472 commented Aug 8, 2016

Simplifications I can think of but have no strong opinion on:

  • always require the shadow dict
  • always make the outer representation single-file (i.e. just length + name)

That would cut down on the possible transformation scenarios between encrypted and plaintext by eliminating by option of leaving the metadata unencrypted and canonical.


``mac``
message authentication code covering the info dictionary

``name``
the name field is a mandatory part of [BEP 3]_, therefore a placeholder MUST be provided if a shadow name is used. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name.
the name field is a mandatory part of [BEP 3]_. If a shadow name is used then a placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, this is what you meant. Let's take an example. uTorrent put some audio and video metadata in torrents. I believe these fields would be quite revealing and should be encrypted. what is the purpose of providing placeholders for them? and what are the expectations on those placeholders, does the entire bencoded sub-tree need to be there? (presumably not, because dictionary keys could contain revealing information too). I think it might make sense to motivate this requirement and describe it in some more details (or drop it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The particular section you're commenting on is only about the name field. The audio/video things are not mandatory fields, thus could go into the shadow dictionary without placeholders.

Shadowing can work differently for each field. Some require special logic, some can just be moved down into the shadow dictionary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read this to mean any field in the shadow dict must also exist in the main info dict: "If a shadow name is used then a placeholder MUST be provided".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems clean enough to me that it is only talking about the name field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I get it.. I see, a place holder name refers to the actual "name" field.. not a a name of a field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could be made a bit clear that "name" refers to the actual field in the dictionary, and not just any name of a field

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you looked at the formatted output or only the raw .rst?

https://github.com/the8472/bittorrent.org/blob/2935d4cd050954a51369c69ce556ec88f1ae0240/html/beps/bep_encrypted_data.rst#metadata-format

I think it's clear if you take the context into account, but if you did misunderstand it despite that, I can improve it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I see. I just looked at the the .rst. it's good as is

@the8472
Copy link
Contributor Author

the8472 commented Aug 15, 2016

Still looking for opinions whether it should be simplified or not.

@the8472
Copy link
Contributor Author

the8472 commented Aug 27, 2016

I realized that there are some some interactions between BEP47 padding files and the public vs. shadow layout. Padding files in the public layout lead to zero-bytes in the ciphertext, which corresponds to random plaintext.

On the other hand if one wants to hide precise file sizes then one has to add padding files in the shadow layout which inserts zeros into the plaintext which results in non-zeroes in the ciphertext.

That probably adds way too much complexity for implementations.

I'm inclined to mandate always single-file layout for the public part and make the shadow dict mandatory. The in-place decryption doesn't seem to be worth it.

@arvidn
Copy link
Contributor

arvidn commented Aug 27, 2016

I agree that it makes sense to require the public view of the content always be a single-file torrent

- mandate single-file layout for public data
- mandatory shadow dictionary
@the8472
Copy link
Contributor Author

the8472 commented Dec 31, 2016

There's another complication due to padding. Since it requires multi-file layout it currently is mutually exclusive with single-file.

So either padding needs to be changed to also be usable with single-file torrents or we need to define a way to have single-file semantics with a files dictionary.

The former would be easy in the scope of encrypted torrents by allowing the ciphertext to be up to a piece size longer than the plaintext and just considering that as implicit padding. But that does not generalize to non-encrypted torrents.

The latter could be defined so that a multi-file layout with only 1 regular (non-padding) file and no or empty path is effectively equivalent to single-file-layout. That would not be backwards-compatible in general, but useful for encrypted torrents now and maybe more generally in the future.

@the8472 the8472 force-pushed the encrypted_torrents branch from d647ea7 to 773bc66 Compare January 1, 2017 21:08
@the8472 the8472 force-pushed the encrypted_torrents branch from 773bc66 to 9f2e3ae Compare January 1, 2017 21:09
@the8472
Copy link
Contributor Author

the8472 commented Jan 1, 2017

I have implemented a prototype, added test vectors and made a few minor changes.

It should be ready for merging now (modulo review).

@the8472
Copy link
Contributor Author

the8472 commented Jan 1, 2017

github-rendered version

@the8472
Copy link
Contributor Author

the8472 commented Jan 3, 2017

Looking at the keys the hex encoding does seem unwieldy. For root keys shorter ones could be used, but for the "only hand out key for 1 torrent" case that does not work.
I'll change magnets to url-safe base64 encoding. And maybe another parameter for human-readable root keys instead of encoded ones.

@the8472
Copy link
Contributor Author

the8472 commented Feb 10, 2017

Discussion in #29 made me realize that the mac scheme would be in the way of translating encrypted torrents to merkle torrents and is also quite complicated. I'll revise it to calculate the mac to pieces-transformed-to-merkle root (that way it should be compatible) and limit it to a limited set of fields.

@the8472
Copy link
Contributor Author

the8472 commented Mar 8, 2017

Closing this for now until we have settled on a new torrent format (#58)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants