While writing tests in support of #12489, I ran across a case where the GIL implicitly protects some data stored in Python objects, but where that protection can be defeated even with the GIL enabled.
Consider the following script, which updates a padder from multiple threads in a random order, using a different chunk size for each thread. The data are chosen such that the final state of the padder should be the same no matter in which order the threads update it.
import sys
import threading

from cryptography.hazmat.primitives.padding import ANSIX923

# force threads to switch more quickly to make the race more likely
sys.setswitchinterval(.0000001)

num_threads = 4

padder = ANSIX923(num_threads * 256).padder()
validate_padder = ANSIX923(num_threads * 256).padder()

chunk = b"abcd1234"
data = chunk * 16

validate_padder.update(data * num_threads)
expected_pad = validate_padder.finalize()

b = threading.Barrier(num_threads)


def pad_in_chunks(chunk_size):
    index = 0
    b.wait()
    while index < len(data):
        padder.update(data[index : index + chunk_size])
        index += chunk_size


threads = []
for threadnum in range(num_threads):
    chunk_size = len(data) // (2**threadnum)
    thread = threading.Thread(target=pad_in_chunks, args=(chunk_size,))
    threads.append(thread)

for thread in threads:
    thread.start()

for thread in threads:
    thread.join()

calculated_pad = padder.finalize()
assert expected_pad == calculated_pad, (expected_pad, calculated_pad)
On my M3 MacBook Pro with the standard build of CPython 3.13.2, this test sometimes runs to completion and sometimes fails at the final assert with errors like:
goldbaum at Nathans-MBP in ~/Documents/test
$ python test.py
Traceback (most recent call last):
  File "/Users/goldbaum/Documents/test/test.py", line 39, in <module>
    assert expected_pad == calculated_pad, (expected_pad, calculated_pad)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: (b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80', b'abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10')
This code is adapted from the test_threaded_hashing test in CPython's hashlib - I've already written a similar test for the hash digest context and that worked fine.
This is happening because _ANSIX923PaddingContext stores its state in a bytestring, and updates to that bytestring aren't atomic. If the thread switch interval is short enough (or someone gets really unlucky in the default configuration), the threads race to update _buffer:
        # TODO: more copies than necessary, we should use zero-buffer (#193)
        self._buffer = b""

    def update(self, data: bytes) -> bytes:
        self._buffer, result = _byte_padding_update(
            self._buffer, data, self.block_size
        )
        return result

    def _padding(self, size: int) -> bytes:
        return bytes([0]) * (size - 1) + bytes([size])

    def finalize(self) -> bytes:
        result = _byte_padding_pad(
            self._buffer, self.block_size, self._padding
        )
        self._buffer = None
        return result
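To make the failure mode concrete, here is a minimal sketch that replays the lost update deterministically in a single thread. `toy_update` and `ToyContext` are hypothetical stand-ins written for this illustration, not the library's code; they only mimic the "read buffer, compute new buffer, store buffer" shape of update().

```python
def toy_update(buffer: bytes, data: bytes, block_size_bytes: int):
    """Hypothetical stand-in for the update helper.

    Returns (leftover buffer, completed blocks).
    """
    buffer += data
    finished = len(buffer) // block_size_bytes
    cut = finished * block_size_bytes
    return buffer[cut:], buffer[:cut]


class ToyContext:
    """Hypothetical context holding only the shared buffer."""

    def __init__(self):
        self._buffer = b""


ctx = ToyContext()

# "Thread A" and "thread B" both read the same (empty) buffer ...
seen_by_a = ctx._buffer
seen_by_b = ctx._buffer

# ... each computes a new buffer from its stale snapshot ...
new_a, _ = toy_update(seen_by_a, b"abcd1234", 128)
new_b, _ = toy_update(seen_by_b, b"abcd1234", 128)

# ... and each stores its result. B's store overwrites A's.
ctx._buffer = new_a
ctx._buffer = new_b

lost_update_buffer = ctx._buffer
print(len(lost_update_buffer))  # 8 bytes, not the 16 that were fed in
```

In the real script the same interleaving can occur whenever a thread switch lands between the read of self._buffer and the assignment back to it.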
One solution is to add a threading.Lock in the Python implementation. Another is to move the code into Rust, where runtime borrow checking would turn this silently incorrect result into a runtime borrow-check error.
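As a rough sketch of the lock-based option: the wrapper below serializes update() and finalize() with a threading.Lock. `LockedContext` and `ToyBuffer` are hypothetical names invented for this example, and `ToyBuffer` is only a stand-in with the same unsynchronized read-modify-write pattern, so this is an illustration under those assumptions rather than a proposed patch.

```python
import threading


class LockedContext:
    """Hypothetical wrapper: serializes update()/finalize() on any
    object that exposes those two methods."""

    def __init__(self, inner):
        self._inner = inner
        self._lock = threading.Lock()

    def update(self, data: bytes) -> bytes:
        with self._lock:
            return self._inner.update(data)

    def finalize(self) -> bytes:
        with self._lock:
            return self._inner.finalize()


class ToyBuffer:
    """Stand-in with an unsynchronized read-modify-write on _buffer."""

    def __init__(self):
        self._buffer = b""

    def update(self, data: bytes) -> bytes:
        self._buffer = self._buffer + data
        return b""

    def finalize(self) -> bytes:
        result, self._buffer = self._buffer, None
        return result


def feed(ctx, count):
    # Each worker appends the same 8-byte chunk `count` times.
    for _ in range(count):
        ctx.update(b"abcd1234")


ctx = LockedContext(ToyBuffer())
workers = [threading.Thread(target=feed, args=(ctx, 100)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

total = ctx.finalize()
print(len(total))  # 4 threads * 100 updates * 8 bytes = 3200
```

The lock makes each call atomic, but it does not impose an order on calls from different threads. The result is only deterministic here because, as in the script above, every thread feeds the same repeating data.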
Ah. Good catch, I think this probably applies to all of the unpadders.
The right move is probably to move this stuff to Rust. There's no coherent behavior for concurrently calling update() from multiple threads with no synchronization (except in the degenerate case of the updates having the same values, I guess), so raising is totally reasonable.
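For illustration, the "raise instead of silently racing" behavior can be approximated in pure Python with a non-blocking lock acquire. `ExclusiveContext` is a hypothetical class made up for this sketch, and the exception type and message are assumptions; it does not reproduce what a Rust implementation would actually raise.

```python
import threading


class ExclusiveContext:
    """Hypothetical sketch: fail fast on overlapping calls."""

    def __init__(self):
        self._buffer = b""
        self._guard = threading.Lock()

    def update(self, data: bytes) -> None:
        # Non-blocking acquire: if another call holds the guard,
        # raise rather than wait or corrupt the buffer.
        if not self._guard.acquire(blocking=False):
            raise RuntimeError("context is already in use")
        try:
            self._buffer += data
        finally:
            self._guard.release()


ctx = ExclusiveContext()
ctx.update(b"abcd1234")  # uncontended call succeeds

# Simulate an in-flight update() by holding the guard ourselves.
ctx._guard.acquire()
try:
    ctx.update(b"abcd1234")
    raised = False
except RuntimeError:
    raised = True
finally:
    ctx._guard.release()

print(raised)  # True
```

Callers that legitimately share a context across threads would then need their own synchronization, which matches the point that unsynchronized concurrent update() calls have no coherent meaning anyway.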
Some of the KDFs may have a race condition when checking whether they were finalized.
The snippet quoted above is from cryptography/src/cryptography/hazmat/primitives/padding.py, lines 137 to 159 in 0ce9f53.