bep_encrypted_data.rst

BEP: ??
Title: Encrypted Torrent Payload
Version: $Revision$
Last-Modified: $Date$
Author: The 8472 <[email protected]>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 04-Oct-2015
Post-History:

Abstract

This BEP specifies a way to apply symmetric encryption to torrent payload at the storage layer and additionally encrypt some metadata with the following goals:

  • confidentiality
  • limited privacy

and non-goals:

  • forward-secrecy
  • anonymity
  • signature-based authentication, already covered by [BEP 35]_
  • authentication of peer connections

Rationale

In general BitTorrent swarms are an open system well-suited for mass-distribution of data to the public.

Some use-cases require that the data is only distributed to a closed, trusted group of peers. In other cases the content may be meant for open distribution within a community without intent of announcing the content to the whole world. This is analogous to web content that is open to human visitors but requests via robots.txt[robots]_ that it should not be announced to the world by web crawlers.

While the private flag [BEP-27]_ may be sufficient in a controlled environment to keep information about the torrent (e.g. its infohash) from escaping, and thus to prevent others from connecting to the swarm, this is a very brittle form of security. It also prevents the use of public infrastructure such as open trackers, PEX or the DHT. Similarly, Message Stream Encryption provides limited protection from passive eavesdroppers on the network layer but does not prevent the infohash from escaping.

Instead of attempting to restrict access to the swarm or metadata this BEP proposes to make all data opaque to 3rd parties by encrypting it with a shared secret that is not available through any torrent-related protocol, i.e. must be obtained separately by the user.

In principle the same properties can be provided by simply storing the data in an encrypted archive and using a nondescript filename. But that requires users either to store the data twice or to use additional filesystem layers to access the data transparently, which is even more cumbersome when encryption is involved. It also prevents BitTorrent clients from reusing already-downloaded files in a multi-file torrent.

Metadata format

{
  info: {
    bepXX: {
      mac: <32 bytes of hmac output (string)>,
      salt: <32 bytes of random binary data (string)>,
      shadow: <optional, bencoded-then-encrypted dictionary (string)>,
      v: <version (integer)>
    },
    length or files: <unchanged>,
    name: <public name (string)>,
    piece length: <unchanged>,
    pieces: <N*20 bytes, piece hashes of the payload ciphertext (string)>
  },
}
salt
The random data must be generated by a cryptographically secure RNG to avoid IV reuse.
v
The protocol version used to encrypt the torrent, currently 1. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate incompatible changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it.
shadow
Bencoded-then-encrypted dictionary whose key-value pairs shadow entries in the info dictionary. If it is absent, only the payload is encrypted and no info dictionary entries are shadowed. Implementations should only shadow a whitelist of keys for which they have a shadowing strategy and ignore other keys. Shadowable keys suggested by this BEP: length, files, name, comment.
mac
Message authentication code covering the info dictionary.
name
The name field is a mandatory part of [BEP 3]_. If a shadow name is used then a placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name.
files, length
The shadow dictionary MAY override the single/multifile nature indicated by the public info dictionary. If it does not shadow either key then the public information is canonical. If the files or length are shadowed then the overall payload length MUST be consistent with the public version. If a shadow dictionary is present the public information should be treated as decorative / advisory until it can be determined whether it has been shadowed, i.e. until the shadow data can be decrypted.

To protect privacy, an implementation should use shadowing for any additional keys that reveal information about the payload.
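The shadowing rules above can be sketched as a small merge step. This is only an illustration under stated assumptions: the names `apply_shadow` and `total_length` are hypothetical, plain Python strings stand in for the byte-string keys of a decoded bencoded dictionary, and the whitelist is the one suggested by this BEP.

```python
# Keys this BEP suggests as shadowable; other keys in the shadow dict are ignored.
SHADOWABLE_KEYS = {"length", "files", "name", "comment"}


def total_length(info):
    """Overall payload length of a single-file or multi-file info dict."""
    if "length" in info:
        return info["length"]
    return sum(entry["length"] for entry in info["files"])


def apply_shadow(public_info, shadow):
    """Return the effective info dict after applying a decrypted shadow dict."""
    accepted = {k: v for k, v in shadow.items() if k in SHADOWABLE_KEYS}
    effective = dict(public_info)
    if "length" in accepted or "files" in accepted:
        # The shadow may switch between single-file and multi-file layout,
        # so drop both public layout keys before applying the shadowed one.
        effective.pop("length", None)
        effective.pop("files", None)
    effective.update(accepted)
    if total_length(effective) != total_length(public_info):
        raise ValueError("shadowed payload length differs from public length")
    return effective
```

Until the shadow dictionary has been decrypted, a client following this sketch would keep displaying the public values and swap in the result of `apply_shadow` afterwards.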

Encryption

Building blocks used in version 1: SHA2-256[rfc6234]_, ChaCha20[rfc7539]_, HMAC[rfc2104]_, PBKDF2[rfc2898]_

|| is the concat operator

Key.root = random key, recommended strength: 256bits

Key.payload = PBKDF2(HMAC-SHA256, Key.root, salt || "payload", 4096, 256)

Key.shadow = sha256(Key.payload || "shadow")

mac = HMAC-SHA256(info-dict with mac placeholder, Key.shadow)

IV.payload = truncate_64(sha256(salt || "payload"))

IV.shadow = truncate_64(sha256(salt || "shadow"))

PBKDF2 key derivation is used in case root keys with less entropy than recommended are used, e.g. for password-based schemes. For general use, however, this BEP assumes that the root key consists of random binary data and hence mandates hexadecimal encoding when keys need to be displayed in a human-readable format.
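The derivations above can be sketched with the Python standard library. This is a minimal sketch, not a reference implementation. It assumes that the labels "payload" and "shadow" are ASCII byte strings, that the final PBKDF2 argument of 256 is an output length in bits, and that truncate_64 keeps the first 8 bytes of the digest; none of these details are spelled out in this BEP. The function name `derive_keys` is hypothetical.

```python
import hashlib


def derive_keys(root_key: bytes, salt: bytes) -> dict:
    """Derive the version 1 keys and IVs from a root key and the torrent's salt."""
    # Key.payload = PBKDF2(HMAC-SHA256, Key.root, salt || "payload", 4096, 256)
    payload_key = hashlib.pbkdf2_hmac(
        "sha256", root_key, salt + b"payload", 4096, dklen=32
    )
    # Key.shadow = sha256(Key.payload || "shadow")
    shadow_key = hashlib.sha256(payload_key + b"shadow").digest()
    # Assumption: truncate_64 keeps the leading 64 bits (8 bytes).
    iv_payload = hashlib.sha256(salt + b"payload").digest()[:8]
    iv_shadow = hashlib.sha256(salt + b"shadow").digest()[:8]
    return {
        "payload": payload_key,
        "shadow": shadow_key,
        "iv_payload": iv_payload,
        "iv_shadow": iv_shadow,
    }
```

A torrent creator would call this with a fresh 32-byte root key and a fresh 32-byte salt, for example both taken from `os.urandom(32)`, and display keys to the user via `bytes.hex()`.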

ChaCha20 is used to encrypt both the shadow dictionary and the torrent payload.

The optional shadow dictionary is encrypted after bencoding with Key.shadow and IV.shadow.

The mac is calculated over the bencoded info-dictionary with 32 zero bytes as placeholder for the mac value itself. If other extensions perform similar hashing over intermediate representations of the metadata, the order in which they are applied needs to be specified.
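The placeholder scheme can be illustrated as follows. This is a sketch under assumptions: `bencode` is a small helper written for this example rather than a library call, `compute_mac` and `verify_mac` are hypothetical names, and `bepXX` is the placeholder dictionary key used in the metadata format above.

```python
import hashlib
import hmac


def bencode(value) -> bytes:
    """Minimal bencoder for ints, byte strings, text strings, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # bencoded dict keys are sorted as raw bytes
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def compute_mac(info: dict, shadow_key: bytes) -> bytes:
    """HMAC-SHA256 over the info dict with the mac field replaced by 32 zero bytes."""
    ext = dict(info["bepXX"])
    ext["mac"] = bytes(32)  # placeholder for the mac value itself
    unsigned = dict(info)
    unsigned["bepXX"] = ext
    return hmac.new(shadow_key, bencode(unsigned), hashlib.sha256).digest()


def verify_mac(info: dict, shadow_key: bytes) -> bool:
    """Constant-time comparison of the stored mac against a recomputed one."""
    return hmac.compare_digest(info["bepXX"]["mac"], compute_mac(info, shadow_key))
```

Because the placeholder has the same length as the real mac, inserting the computed value afterwards does not change the length prefixes of the bencoded dictionary.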

The encryption is applied when file data is loaded into the piece address space, using Key.payload and IV.payload. This means the piece hashes are calculated over the encrypted data. The key stream of the cipher is applied according to the position of the data in the piece space, i.e. any padding, holes or alignment of piece data also affects which part of the key stream is used. This BEP only covers pieces representing file entries. Should future extensions put other data into the piece address space, the interaction with this BEP will need to be defined.
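Since ChaCha20 generates its key stream in 64-byte blocks, mapping a position in the piece address space to a key stream position is simple arithmetic. The sketch below assumes that the block counter is 0 at piece-space offset 0 and that files are laid out back to back without padding; neither is stated explicitly in this BEP, and padding or holes would shift the offsets accordingly. All function names are hypothetical.

```python
CHACHA20_BLOCK_SIZE = 64  # bytes of key stream produced per block counter value


def keystream_location(piece_space_offset):
    """Return (block_counter, offset_within_block) for a piece-space byte offset."""
    return divmod(piece_space_offset, CHACHA20_BLOCK_SIZE)


def piece_offset(piece_index, piece_length, offset_in_piece=0):
    """Byte offset in the piece address space of a position inside a piece."""
    return piece_index * piece_length + offset_in_piece


def file_start_offsets(file_lengths):
    """Piece-space start offset of each file, assuming contiguous layout."""
    offsets, position = [], 0
    for length in file_lengths:
        offsets.append(position)
        position += length
    return offsets
```

With this mapping a client can decrypt a single file or a single block without processing the key stream for everything that precedes it.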

An implementation unaware of this BEP will simply store the ciphertext to the disk in a length-sized file with the public name.

This scheme only provides integrity verification for the ciphertext through the piece hashes, i.e. correct decryption is not verified. An incorrect key could result in garbage plaintext, but this does not introduce a new problem since BitTorrent never guaranteed that the files contain what the metadata claims.

Key reuse and hierarchy

The usage of a salt to derive the payload key from the root key allows the root key to be reused across several torrents while still generating distinct payload keys for each. But UI design SHOULD encourage random key generation for each new torrent and require explicit user action for key reuse.

An implementation may provide the option to attempt to decrypt a torrent with the same key as another torrent in case a key is only communicated once and individual torrents are later distributed without explicitly providing keys.

In some circumstances it may make sense to reveal a particular key lower in the hierarchy without revealing an upper key. For example a user may upload a torrent to an indexing site and provide the shadow key so it can extract keywords for fulltext search.

Or a user may want to share a particular torrent without revealing the root key used to protect multiple other torrents, in that case revealing the payload key for that torrent will be sufficient.

The mac can also be used to determine to which level of the hierarchy a key belongs. An implementation first assumes the key is the shadow key and attempts to verify the info-dictionary against it, then assumes it is the payload key, derives the shadow key and attempts to verify again, and finally does the same assuming it is the root key.
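This trial order can be sketched as below. It is an illustration only: `classify_key` is a hypothetical name, `unsigned_info` is assumed to be the bencoded info-dictionary with the 32 zero byte mac placeholder already in place, and the derivation details (ASCII labels, 32-byte PBKDF2 output) are the same assumptions as in the key derivation formulas.

```python
import hashlib
import hmac


def classify_key(candidate, salt, unsigned_info, mac):
    """Return "shadow", "payload", "root" or None for an imported key."""

    def matches(shadow_key):
        expected = hmac.new(shadow_key, unsigned_info, hashlib.sha256).digest()
        return hmac.compare_digest(expected, mac)

    # 1. Assume the candidate is the shadow key itself.
    if matches(candidate):
        return "shadow"
    # 2. Assume it is the payload key and derive the shadow key from it.
    if matches(hashlib.sha256(candidate + b"shadow").digest()):
        return "payload"
    # 3. Assume it is the root key and walk the whole hierarchy.
    payload_key = hashlib.pbkdf2_hmac(
        "sha256", candidate, salt + b"payload", 4096, dklen=32
    )
    if matches(hashlib.sha256(payload_key + b"shadow").digest()):
        return "root"
    return None
```

The cheap checks run first, so the 4096 PBKDF2 iterations are only spent when the key is neither a shadow key nor a payload key.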

Key sharing

Implementations SHOULD provide a way to view and input the different keys for a torrent so users can share them in unstructured ways. The hex-encoded form should be used for this purpose.

Encouraging users to share keys without bundling them with torrents or magnets in a structured way allows them to exchange them over separate channels and also makes it slightly more difficult to crawl the internet for unintentionally disclosed keys.

Web services that request that users reveal keys for a specific use-case (e.g. metadata extraction) can ask for the key in a separate input field in their forms / APIs. They SHOULD NOT store or in turn reveal the keys to visitors if that is not essential for their use-case.

Keys MUST NOT be included in .torrent files in any form. Too much infrastructure for crawling and automatic mass-distribution of .torrent files exists and to a user it would not be obvious whether a torrent contains keys or not, thus making accidental disclosure likely.

Magnets

Clients should only include a key if the user explicitly requests it or if the secret part has been sufficiently highlighted to make the user aware of what type of secret is being shared.

To include a key in magnet links the parameter &key=<key> can be added where the key is in hex-encoded form.

The importing client can determine which type of key it is based on the mac in the metadata.
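Adding and extracting the key parameter might look like the sketch below. The function names are hypothetical, and the example assumes the magnet link already carries at least one other parameter, so that appending with `&` is valid.

```python
from urllib.parse import parse_qs


def add_key_to_magnet(magnet, key):
    """Append the hex-encoded key as a &key= parameter."""
    return magnet + "&key=" + key.hex()


def key_from_magnet(magnet):
    """Return the raw key bytes from a magnet link, or None if no key is present."""
    query = magnet.partition("?")[2]
    values = parse_qs(query).get("key")
    if not values:
        return None
    return bytes.fromhex(values[0])
```

The returned bytes carry no type information; the importing client still has to test them against the mac to learn whether they are a root, payload or shadow key.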

Key files

To export keys to a file, e.g. for archival purposes or for bulk torrent migration between clients, the following bencoded format can be used:

{
  torrent-keys: {
    <torrent identifier, 32 bytes>: {
      root: <optional (string)>,
      payload: <optional, 32 bytes (string)>,
      shadow: <optional, 32 bytes (string)>
    },
    ...
  },
}
torrent identifier
A unique, use-specific identifier calculated from the torrent's mac via SHA256(mac || ".torrent-keys"). This allows a torrent client to locate keys for a metadata file while preventing reverse lookups for those who do not have access to the metadata.

.torrent-keys should be used as file extension. By default filesystem permissions should be set appropriately to restrict access to key files to the current user.

A key file can contain keys for multiple torrents. Only one key needs to be included per torrent, as the lower keys can be derived. Keys must be included in their raw, unencoded form.
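The identifier calculation and lookup can be sketched as follows. The helper names are hypothetical, and the key file is represented as an already decoded dictionary with text-string keys for readability; a real file would be bencoded and use byte strings throughout.

```python
import hashlib


def torrent_identifier(mac):
    """SHA256(mac || ".torrent-keys"), the per-torrent key in a key file."""
    return hashlib.sha256(mac + b".torrent-keys").digest()


def make_key_file_entry(mac, root=None, payload=None, shadow=None):
    """Build one (identifier, keys) entry holding the raw, unencoded keys."""
    candidates = (("root", root), ("payload", payload), ("shadow", shadow))
    keys = {name: key for name, key in candidates if key is not None}
    if not keys:
        raise ValueError("at least one key is required per torrent")
    return torrent_identifier(mac), keys


def find_keys(key_file, mac):
    """Look up the keys for a torrent whose mac is known from its metadata."""
    return key_file.get("torrent-keys", {}).get(torrent_identifier(mac))
```

Someone who only holds the key file sees opaque identifiers and cannot map them back to a mac or an infohash, while a client that has the metadata can compute the identifier directly.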

Storage layer

This BEP does not mandate how an implementation should store encrypted or decrypted data on disk.

However, if a client wants to be more flexible than either ignoring this BEP (thus storing ciphertext on disk) or always requiring the keys before starting a torrent it will have to consider the following:

  • clients can be in 3 states regarding key knowledge: no keys, shadow key only, keys that decrypt plaintext; 2 encryption states: encrypted, decrypted; 3 file layout states: encrypted, multi-file, single-file
  • a user may start downloading a torrent before keys are available. This requires a way to input keys later and to convert between encrypted and decrypted storage
  • for performance or security reasons a seeder may want to import plaintext data, encrypt it and then discard the keys to directly seed the encrypted data from disk.

Since encrypted torrents may contain confidential / private data implementations may also want to set more restrictive file permissions when decrypting data to reduce exposure in multi-user environments.

Security Properties

The goal is to provide security equivalent to publicly distributing an encrypted archive where the file index is encrypted with a separate key that can be revealed without revealing the payload key.

In particular that means:

  • swarms remain open, anyone can participate in a swarm, with or without access to the secrets
  • an observer without access to the secrets does not know what data is being shared
  • correctness of the metadata cannot be confirmed without access to both secrets
  • observing that someone participated in a swarm and uploaded data is no longer equivalent to knowing that they had access to the plaintext or knowledge of the metadata
  • the ciphertext is accessible to the public. this may be desirable to provide upload bandwidth without knowledge of the content, e.g. to allow untrusted servers to distribute confidential data to trusted clients or to enable hosting without the need to proactively moderate user content.

Limitations:

  • there is no forward secrecy. Should the secrets become available to an unauthorized party at some future point, they will be able to decrypt ciphertext they have downloaded in the past and retroactively associate content with observed users
  • deniability is fairly weak. If someone learns the shared secrets or knows how they are distributed, they may also draw conclusions about whether a particular participant in a swarm could have had access to them.

UI concerns

This section is advisory.

Shared secrets are handled by many parties, therefore the system is as weak as the weakest human. Thus making intentional, correct handling of secrets simple and convenient while making unintentional disclosure hard is an important aspect of keeping the system secure.

Information that a client may want to make visible:

  • encrypted/decrypted status of a torrent
  • which keys it knows (+ option to discard if storage is encrypted)

Torrent creation

  1. user selects whether to use encryption at all
  2. if yes then offer to
    • generate a random key. user may instead opt to reuse a key from another torrent
    • provide a meaningful public name distinct from the shadow name
    • only encrypt the payload and not shadow any metadata

Key input

  • input choices: manual, magnet link, .torrent-keys file, reusing key from another torrent
  • immediate feedback whether keys match the mac and what kind of key was imported (root, payload, shadow)
  • option to decrypt data or leave it encrypted
    • offer directory layout choices that would normally be offered when a torrent is imported

Magnet/Key export

Provide option to

  • not include key [default]
  • include shadow key only, if there is any shadowed metadata
  • include payload key
  • include root key. if the client knows that the key has been reused for other torrents it should indicate this to the user

Test Vectors

## TODO

References

## TODO

Copyright

This document has been placed in the public domain.