Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sorting a Slice Raises an Illegal Instruction Signal #136103

Closed
jonathan-gruber-jg opened this issue Jan 26, 2025 · 3 comments · Fixed by #136163
Closed

Sorting a Slice Raises an Illegal Instruction Signal #136103

jonathan-gruber-jg opened this issue Jan 26, 2025 · 3 comments · Fixed by #136163
Assignees
Labels
C-bug Category: This is a bug. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@jonathan-gruber-jg
Copy link

jonathan-gruber-jg commented Jan 26, 2025

As the issue title indicates, I have discovered that, in some cases, sorting a slice via slice::sort erroneously raises SIGILL (illegal instruction). Below is a minimal test case for the observed behavior.

use std::fmt;

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct S(bool, [u8; 999_999]);

impl fmt::Debug for S {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Debug::fmt(&self.0, f)
    }
}

fn main() {
    let n = 101;
    let mut objs = Vec::with_capacity(n);

    let mut on = false;

    for _ in 0..n {
        objs.push(S(on, [0; 999_999]));
        on = !on;
    }

    println!("Done adding elements.");
    println!("Objects: {:?}", objs);

    objs.sort();

    println!("Objects: {:?}", objs);
}

If you run this code in a debugger, you will see that the SIGILL comes from the call to core::intrinsics::abort here. That is, something is causing scratch.len() to be less than len when it shouldn't be.

I think I have traced the cause of the problem to how min_good_run_len is calculated here. If len <= MIN_SQRT_RUN_LEN * MIN_SQRT_RUN_LEN holds true and len - len / 2 (which is really just ceil(len / 2)) happens to be <= MIN_SQRT_RUN_LEN, then min_good_run_len is given the value len - len / 2 (which, again, is just ceil(len / 2)). When the function create_run in the same file cannot find an ascending or descending run of at least min_good_run_len elements, it extracts an unsorted run that extends for min_good_run_len elements or until the end of the slice/array (whichever is earlier). Later, the function logical_merge in the same file will sort an unsorted run if it and the run with which it is to be merged do not together fit in scratch.

scratch.len() is at least floor(len / 2), and floor(len / 2) is less than ceil(len / 2) when len is an odd integer, so a problem arises when scratch.len() is equal to floor(len / 2), min_good_run_len is equal to ceil(len / 2), and len is odd, because, then, logical_merge might try to sort an unsorted run that cannot fit in scratch. If my math is correct, this is exactly the situation occurring in my minimal test case.

Replacing len - len / 2 with len / 2 in the calculation of min_good_run_len would therefore fix the issue, if I am not mistaken. I do not know why the author(s) of the driftsort chose len - len / 2 instead of len / 2. Neither the commit (in the original driftsort GitHub repository) where it first appeared (Voultapher/driftsort@dbe1d24) nor the associated pull request (Voultapher/driftsort#39) appears to give an explanation.

Meta

rustc --version --verbose:

rustc 1.86.0-nightly (f7cc13af8 2025-01-25)
binary: rustc
commit-hash: f7cc13af822fe68c64fec0b05aa9dd1412451f7c
commit-date: 2025-01-25
host: x86_64-unknown-linux-gnu
release: 1.86.0-nightly
LLVM version: 19.1.7
@jonathan-gruber-jg jonathan-gruber-jg added the C-bug Category: This is a bug. label Jan 26, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 26, 2025
@orlp
Copy link
Contributor

orlp commented Jan 26, 2025

I believe the @jonathan-gruber-jg's analysis to be accurate as to the cause of the bug. Because of the (abort-based) assert it hits I don't believe there's a security issue, only a potential denial-of-service issue.

The reason this only triggers with a large type is because up until 8MB we always allocate a scratch space for n elements rather than the n / 2 we do for very large arrays. However we don't have a hard switch at 8MB from n to n/2, the formula is more complicated (paraphrased):

let alloc_len = cmp::max(len / 2, cmp::min(len, 8_000_000 / size_of::<T>());

The only way to trigger this bug is if the len / 2 term dominates, so we must have len / 2 >= 8_000_000 / size_of::<T>() and thus we see that size_of::<T>() >= 8_000_000 * 2 / len (modulo rounding off-by-ones). However we must also have that len - len / 2 <= MIN_SQRT_RUN_LEN == 64, thus we conclude len == 127 is the largest array possible that can trigger this bug, putting a minimum bound on size_of::<T>() of 125000 bytes. Empirically I see the threshold is at 125000 bytes exactly (with 125001 being the smallest type that fails). This is thus the minimal reproducer:

fn main() {
    let n = 127;
    let mut objs: Vec<_> =
        (0..n).map(|i| [(i % 2) as u8; 125001]).collect();
    objs.sort();
}

If you try to make n any bigger in an attempt to reduce size_of::<T>() < 125001 you will no longer hit the bug. Thus while I agree this is an issue that should be fixed I don't think there is any realistic denial-of-service attack surface here as anyone sorting objects which are 125KB+ large (as in size_of::<T>()) is already denying their own service (and/or already getting dangerously close to stack overflows).


Neither the commit (in the original driftsort GitHub repository) where it first appeared (Voultapher/driftsort@dbe1d24) nor the associated pull request (Voultapher/driftsort#39) appears to give an explanation.

The explanation was done in private communication at the time. I proposed to switch to len - len / 2:

it [min_good_run_len] needs to be len - len / 2 (AKA ceil_div) otherwise you end up with 3 runs for odd-length inputs.

I still believe this justification to be sound so I don't believe the min_good_run_len computation should be changed, rather, we should ensure the scratch always has length at least len - len / 2 instead of only len / 2.

@saethlin saethlin added T-libs Relevant to the library team, which will review and decide on the PR/issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Jan 26, 2025
@cuviper
Copy link
Member

cuviper commented Jan 27, 2025

For the record, this did go through a [email protected] report, and we agreed that the conditions for denial-of-service were unlikely enough that we can treat this as an open issue, rather than a privately-coordinated security issue.

@uellenberg
Copy link
Contributor

@rustbot claim

uellenberg added a commit to uellenberg/rust that referenced this issue Jan 28, 2025
uellenberg added a commit to uellenberg/rust that referenced this issue Jan 28, 2025
uellenberg added a commit to uellenberg/rust that referenced this issue Jan 29, 2025
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Jan 31, 2025
…=Mark-Simulacrum

Fix off-by-one error causing slice::sort to abort the program

Fixes rust-lang#136103.
Based on the analysis by `@jonathan-gruber-jg` and `@orlp.`
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 1, 2025
…=Mark-Simulacrum

Fix off-by-one error causing slice::sort to abort the program

Fixes rust-lang#136103.
Based on the analysis by ``@jonathan-gruber-jg`` and ``@orlp.``
@bors bors closed this as completed in 1565254 Feb 1, 2025
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 1, 2025
Rollup merge of rust-lang#136163 - uellenberg:driftsort-off-by-one, r=Mark-Simulacrum

Fix off-by-one error causing slice::sort to abort the program

Fixes rust-lang#136103.
Based on the analysis by ``@jonathan-gruber-jg`` and ``@orlp.``
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Feb 20, 2025
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Feb 20, 2025
github-actions bot pushed a commit to carolynzech/rust that referenced this issue Feb 20, 2025
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this issue Feb 20, 2025
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Feb 21, 2025
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Feb 21, 2025
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this issue Feb 22, 2025
github-actions bot pushed a commit to carolynzech/rust that referenced this issue Feb 22, 2025
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Feb 22, 2025
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Feb 22, 2025
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Mar 3, 2025
github-actions bot pushed a commit to carolynzech/rust that referenced this issue Mar 4, 2025
github-actions bot pushed a commit to carolynzech/rust that referenced this issue Mar 4, 2025
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Mar 4, 2025
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 6, 2025
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Mar 6, 2025
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 11, 2025
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 11, 2025
…=Mark-Simulacrum

Fix off-by-one error causing slice::sort to abort the program

Fixes rust-lang#136103.
Based on the analysis by ``@jonathan-gruber-jg`` and ``@orlp.``
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants