Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_ANSIX923PaddingContext isn't thread-safe #12553

Open
ngoldbaum opened this issue Mar 4, 2025 · 1 comment
Open

_ANSIX923PaddingContext isn't thread-safe #12553

ngoldbaum opened this issue Mar 4, 2025 · 1 comment

Comments

@ngoldbaum
Copy link
Contributor

ngoldbaum commented Mar 4, 2025

While writing tests in support of #12489, I ran across a case where the GIL is implicitly locking some data stored in Python objects, but in a manner that can be worked around, even with the GIL.

Consider the following script, which updates a padder in multiple threads in a random order using a different chunk size for each thread. The data are chosen such that the final state of the padder should be the same no matter which order the threads updated it.

import sys
import threading

from cryptography.hazmat.primitives.padding import ANSIX923

# force threads to switch more quickly to make the race more likely
sys.setswitchinterval(.0000001)

num_threads = 4
padder = ANSIX923(num_threads * 256).padder()
validate_padder = ANSIX923(num_threads * 256).padder()
chunk = b"abcd1234"
data = chunk * 16
validate_padder.update(data * num_threads)
expected_pad = validate_padder.finalize()

b = threading.Barrier(num_threads)

def pad_in_chunks(chunk_size):
    index = 0
    b.wait()
    while index < len(data):
        padder.update(data[index : index + chunk_size])
        index += chunk_size

threads = []
for threadnum in range(num_threads):
    chunk_size = len(data) // (2**threadnum)
    thread = threading.Thread(target=pad_in_chunks, args=(chunk_size,))
    threads.append(thread)

for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

calculated_pad = padder.finalize()

assert expected_pad == calculated_pad, (expected_pad, calculated_pad)

On My M3 Macbook Pro with the standard build of CPython 3.13.2, this test sometimes runs to completion and sometimes the final assert fails with errors like:

goldbaum at Nathans-MBP in ~/Documents/test
$  python test.py
Traceback (most recent call last):
  File "/Users/goldbaum/Documents/test/test.py", line 39, in <module>
    assert expected_pad == calculated_pad, (expected_pad, calculated_pad)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: (b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80', b'abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10')

This code is adapted from the test_threaded_hashing test in CPython's hashlib - I've already written a similar test for the hash digest context and that worked fine.

This is happening because _ANSIX923PaddingContext stores its state in a bytestring and since updates to the bytestring aren't atomic, if the thread switch interval is short enough (or someone gets really unlucky in the default configuration) then there is a race to update the _buffer.

class _ANSIX923PaddingContext(PaddingContext):
_buffer: bytes | None
def __init__(self, block_size: int):
self.block_size = block_size
# TODO: more copies than necessary, we should use zero-buffer (#193)
self._buffer = b""
def update(self, data: bytes) -> bytes:
self._buffer, result = _byte_padding_update(
self._buffer, data, self.block_size
)
return result
def _padding(self, size: int) -> bytes:
return bytes([0]) * (size - 1) + bytes([size])
def finalize(self) -> bytes:
result = _byte_padding_pad(
self._buffer, self.block_size, self._padding
)
self._buffer = None
return result

One solution is to add a threading.lock in the python implementation. Another is to move the code into Rust, where runtime borrow checking would turn this silently incorrect result into a runtime borrow-check error.

@alex
Copy link
Member

alex commented Mar 4, 2025

Ah. Good catch, I think this probably applies to all of the unpadders.

The right move is probably to move this stuff to Rust. There's no coherent behavior for attempting to concurrently calling update() from multiple-threads with no synchronization (except in the degenerate case of the updates having the same values, I guess), so raising is totally reasonable.

Some of the KDFs may have a race condition on checking for whether they were finalized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants