Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TypeId documentation should mention gotcha re covariant subtypes #89150

Closed
uazu opened this issue Sep 21, 2021 · 18 comments · Fixed by #136246
Closed

TypeId documentation should mention gotcha re covariant subtypes #89150

uazu opened this issue Sep 21, 2021 · 18 comments · Fixed by #136246
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@uazu
Copy link

uazu commented Sep 21, 2021

A covariant subtype gets a different type-ID, but can be converted freely from one to the other. This means that taking a type-ID of a generic type argument in a type's new method is not guaranteed to give the same type-ID that the type will eventually be called with. The solution is to force the generic type argument to be invariant using PhantomData. See this bug in qcell crate.

It would be helpful if this were documented to avoid similar bugs occurring in other crates.

use std::any::TypeId;
fn main() {
    println!(
        "{:?}, {:?}",
        TypeId::of::<fn(&'_ ())>(),
        TypeId::of::<fn(&'static ())>(),
    );
}
@clubby789
Copy link
Contributor

@rustbot label +A-docs

@rustbot rustbot added the A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools label Mar 30, 2023
@jieyouxu jieyouxu added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jun 18, 2024
@hkBst
Copy link
Member

hkBst commented Jan 24, 2025

@uazu, could you translate this: "This means that taking a type-ID of a generic type argument in a type's new method is not guaranteed to give the same type-ID that the type will eventually be called with." to code, so I have a better chance of understanding this issue?

Alternatively, since this seems quite technical and no-one has turned up to fix this, maybe propose your own fix?

@uazu
Copy link
Author

uazu commented Jan 25, 2025

I am not a type theorist! My understanding of the problem is probably incomplete.

However the way I understand it is as follows: Normally Rust won't let you silently convert types to other types (I'm talking about types themselves, not values). So if you take the type-id of a type, then later take the type-id of the same type specified separately, it will be the same. BUT that doesn't apply to lifetimes. Rust will happily let you change the lifetime on a type, and that will give a different type-id! So you can't count on the type-id staying the same if you have lifetimes involved. This is called covariance, as I understand it.

This caused an actual bug in the qcell crate, and it allowed a crate user to cause undefined behaviour (see the bug report mentioned above). The fix was to have a struct: struct Invariant<T>(fn(T) -> T);, which gets rid of the compiler's freedom to change the lifetimes within T. This would be used for example in type Id<'id> = PhantomData<Invariant<&'id ()>>; to create a type Id whose lifetime can't be changed.

The best code to demonstrate was provided by @steffahn in the bug report, which demonstrates bypassing TCell's type-id based check by converting the lifetime on a generic type using as.

So if I have a struct that records a type-id of its generic type argument in one call, that type-id might not actually correspond to the generic type argument with which the exact same instance of that struct is called later on. Rust allows the caller to change the generic type argument through covariance to another type with a different lifetime and different type-id, without the instance's knowledge (converting TCellOwner<T1> to TCellOwner<T2> in steffahn's example -- the instance still thinks it's a TCellOwner<T1>). It's like casting a Vec<bool> to a Vec<u8> without Vec knowing that anything has happened. Rust doesn't allow that case, but it does allow a similar case where only a lifetime changes. If Vec relied on the type-id to know what is going on, it would be in trouble. Vec doesn't use the type-id, but my TCellOwner does.

The change I would like to see to Rust is simply to document this, so that someone like me coming along in the future won't be tripped up by this. But I do not know how best to phrase it. Essentially the issue is that you cannot assume that the type-id of a generic type argument is always the same from one call to another if lifetimes are involved. Perhaps that would be enough of a warning, and maybe mention the Invariant work-around. But ideally someone who fully understands this should write the text.

I hope this makes things clearer.

@hkBst
Copy link
Member

hkBst commented Jan 25, 2025

https://doc.rust-lang.org/nomicon/subtyping.html has an explanation about subtyping and variance, but in short: Something with a longer lifetime can always be used somewhere that only a shorter lifetime is necessary, like objects with a 'static lifetime can be used anywhere. Then if a generic type T using the 'static lifetime (T::<'static>) can be used anywhere that that same type with a shorter lifetime (T::<'a>) can be used, then T is covariant in that generic lifetime parameter.

Then I look at https://doc.rust-lang.org/nightly/std/any/struct.TypeId.html and your example code and it strikes me that the TypeId's produced are different and that they indeed compare (==) as different. And I see "A TypeId represents a globally unique identifier for a type." and "A TypeId is currently only available for types which ascribe to 'static, but this limitation may be removed in the future." in the docs. Certainly that "ascribe to 'static" is quite poetic, though it seems to refer to the bound on T in TypeId::of::<T: 'static + ?Sized>.

If I write "fn x<'a> (: &'a i32) -> bool {
TypeId::of::<&'static i32>() == TypeId::of::<&'
i32>()
}"
it seems to always return true... sorry still not quite there, but perhaps closer?

@steffahn
Copy link
Member

The relevant detail here is that subtyping also supports coercions from HRTB-like types (i.e. trait objects with higher-ranked bounds like dyn for<'a> Trait<'a>, or function pointer types like for<'a> fn(Type<'a>)) down to more concrete versions of the same type (e.g. dyn Trait<'b> or fn(Type<'b>) for concrete lifetimes 'b). This creates a subtyping relationship between multiple types T that each fulfill T: 'static. Finally, the fn pointer types used here contain some implicit higher-ranked-ness, because fn(&()) or fn(&'_ ()) are effectively syntactic sugar for for<'a> fn(&'a ()).

For a more illustrative and explanation, check out (the top part of) this comment by @danielhenrymantilla in a different issue:

(it comes with a nice diagram 😻)


As @hkBst correctly observes, merely differentiating &'static i32 from &'_ i32 doesn’t make a difference in run-time type identity; though the better way towards this conclusion is to argue about lifetimes being erased during compilation, and the code example provided, with TypeId::of::<&'_ i32>, doesn’t demonstrate anything, as the T: 'static bound of TypeId::of makes the compiler infer the &'_ i32 here as &'static i32, anyway.

As a TL;DR, the choice of some lifetime 'a vs 'b doesn’t really matter (and for creating a TypeId, you’d usually need to use 'static everywhere anyway, as far as concrete lifetimes are concerned), but the choice between different versions of higher-ranked lifetimes, (the thing you can get with for<'a> … syntax) can make a difference in TypeId.


As a fun additional fact: in the context of higher-ranked lifetimes, you can even have different types that are mutual subtypes of each other. E.g. these two types:

type T1 = for<'a, 'b> fn(&'a (), &'b ());
type T2 = for<'c> fn(&'c (), &'c ());

(playground, showing the subtyping and TypeIds)

@hkBst
Copy link
Member

hkBst commented Jan 25, 2025

So if I understand correctly, using two types where S is a subtype of T and so can be used in its place, if you have a covariant parameter in some generic type G, then TypeId::of::<G<S>> would be different from TypeId::of::<G<T>>, but in other code you could freely take a G<S> and treat it as a G<T>. This may then lead to method confusion, where changing the type would lead to method calls being resolved to different impl blocks, and code from two different impl blocks being mixed. I can see how that is a problem of you rely on other methods in your impl block to behave in a certain way to uphold correctness or safety invariants.

@uazu
Copy link
Author

uazu commented Jan 27, 2025

For my case, it is the same impl block, but I'm relying (for safety invariants) on the type-id staying the same through all calls to the struct methods. That was obviously an error on my part, since the caller can freely change the type (within the subtype restrictions), and hence the type-id. This was quite unexpected, and lead to the bug. So some form of warning in the docs would be very helpful.

@hkBst
Copy link
Member

hkBst commented Jan 27, 2025

Different instantiations of the same generic impl block?

I think we should add an example like this, to showcase the danger.

@uazu
Copy link
Author

uazu commented Jan 27, 2025

Yes, it must be different instantiations of the same generic impl block. I tried it in the playground to verify that that was occurring, and it is (in playground):

use std::any::TypeId;
use std::marker::PhantomData;

// T1 subtype of T2, both 'static
type T1 = fn(&());
type T2 = fn(&'static ());

struct Test<T> {
    phantomdata: PhantomData<T>,
}

impl<T: 'static> Test<T> {
    fn new() -> Self {
        println!("Created with type-id {:?}", TypeId::of::<T>());
        Self { phantomdata: PhantomData }
    }

    fn test(&self) {
        println!("Type-id is now {:?}", TypeId::of::<T>());
    }
}

fn main() {
    let test = Test::<T1>::new() as Test<T2>;
    test.test();
}

I think an example along these lines would definitely help explain the danger.

(In my case the type-id check occurs only once, and it's a lifetime comparison done by the compiler on another method that causes the issue, but if I had repeated the type-id check on that method, I would have been able to see the divergence between my first check and the second.)

@hkBst
Copy link
Member

hkBst commented Jan 28, 2025

@uazu I think this is an excellent example showing that you cannot rely on the TypeId for safety. I think I have enough information now to attempt some doc fix, if you still prefer that, probably using this example mostly verbatim. But since you have now done a lot of the preparatory work already, perhaps you feel you might as well take the final step? I'm happy to go either way!

@uazu
Copy link
Author

uazu commented Jan 28, 2025

Please go ahead with making the doc changes with this example or whatever you feel is appropriate. I think it's better done by someone with more familiarity with the internal Rust organization processes. I'd consider it a "win" if it leads to people avoiding this class of bugs in the future.

@hkBst
Copy link
Member

hkBst commented Jan 29, 2025

@uazu Alright. I'm looking with fresh eyes at the bug report and code in qcell and it is making a lot more sense suddenly :D. I'm considering upgrading your example to something that is perhaps a bit closer to one of the cell types in qcell. Which cell type do you think would be most suited for that, in that the result would be the simplest to understand?

@hkBst
Copy link
Member

hkBst commented Jan 29, 2025

Hmm, seems QCell is the simplest, but it uses an allocation for uniqueness, so it does not seem useful here, so maybe TCell.

@hkBst
Copy link
Member

hkBst commented Jan 29, 2025

Just a 100 lines, but probably too much to include:

use std::any::TypeId;
use std::cell::UnsafeCell;
use std::collections::HashSet;
use std::marker::PhantomData;
use std::sync::{LazyLock, Mutex};

static ID_SET: LazyLock<Mutex<HashSet<TypeId>>> = LazyLock::new(|| Mutex::new(HashSet::new()));

mod owner {
    use super::*;

    // Due to its private data member, outside this module,
    // this struct can only be created using `new`.
    pub struct Owner<O: 'static>(PhantomData<O>);

    impl<O: 'static> Owner<O> {
        pub fn id() -> TypeId {
            TypeId::of::<O>()
        }

        pub fn new() -> Option<Self> {
            let mut set = ID_SET.lock().unwrap();
            set.insert(Self::id()).then(|| Self(PhantomData))
        }

        // Safety: requires mutable reference to a possession's unique owner,
        // so the created mutable reference to a possession's precious is also unique
        pub fn handle<T>(&mut self, p: &Possession<O, T>) -> &mut T {
            unsafe { &mut *p.precious.get() }
        }
    }

    impl<O: 'static> Drop for Owner<O> {
        fn drop(&mut self) {
            let mut set = ID_SET.lock().unwrap();
            (!set.remove(&Self::id())).then(|| panic!("intruder detected"));
        }
    }
}

use owner::*;

struct Possession<OwnerT, T> {
    owner: PhantomData<OwnerT>,
    precious: UnsafeCell<T>,
}

impl<O: 'static, T: 'static> Possession<O, T> {
    fn new(precious: T) -> Self {
        eprintln!("Created with id: {:?}", Owner::<O>::id());
        Self {
            owner: PhantomData,
            precious: UnsafeCell::new(precious),
        }
    }

    fn owner_id(&self) -> TypeId {
        eprintln!("Current id: {:?}", Owner::<O>::id());
        Owner::<O>::id()
    }
}

#[derive(Debug, PartialEq)]
struct TheOneRing;

macro_rules! the_hobbit {
    ($sneak:ty, $hobbit:ty) => {{
        let mut gollum = Owner::<$sneak>::new().unwrap();
        let mut bilbo = Owner::<$hobbit>::new().unwrap();

        let precious = { Possession::<$sneak, _>::new(Some(TheOneRing)) };
        // gollum plays with the one ring
        gollum.handle(&precious);

        // bilbo cannot play with the one ring
        // bilbo.handle(&precious); // arguments to this method are incorrect

        // bilbo pretends to be gollum and steals the one ring
        let mut sneaky_bilbo = bilbo as Owner<$sneak>;
        let ring = sneaky_bilbo.handle(&precious).take();
        std::mem::forget(sneaky_bilbo); // or bilbo will be detected
        ring
    }};
}

// compile_fail: non-primitive cast: `owner::Owner<expected::Hobbit>` as `owner::Owner<expected::Sneak>
#[test]
#[should_panic]
fn expected() {
    struct Sneak;
    struct Hobbit;
    assert_eq!(None, the_hobbit!(Sneak, Hobbit));
}

#[test]
fn an_unexpected_journey() {
    type Sneak = fn(&'static ());
    type Hobbit = fn(&());
    assert_eq!(Some(TheOneRing), the_hobbit!(Sneak, Hobbit));
}

@danielhenrymantilla
Copy link
Contributor

danielhenrymantilla commented Jan 29, 2025

To clarify, the purpose of the documentation w.r.t. these snippets is to insist on the need for such type abstractions (such as Owner<O: 'static> here) to be invariant rather than having any kind of variance, right?

  • as an aside, I wish PhantomData-only-wrapper types had an (even just opt-in) warning about its implicit variance, and required an allow for that variance to, in a way, force the author to explicitly commit to it:

    #[allow(covariant_phantomdata_usage)]
    struct Owner<O: 'static>(PhantomData<O>);

    which, alongside the rule of thumb of "with unsafe code, default to invariance, and only introduce variance if needed, and after taking implications into careful consideration" could help avoid these issues

@hkBst
Copy link
Member

hkBst commented Jan 29, 2025

Simpler example with ~50 lines:

use std::any::TypeId;
use std::collections::HashSet;
use std::marker::PhantomData;
use std::sync::{LazyLock, Mutex};

use unique::Unique;

static ID_SET: LazyLock<Mutex<HashSet<TypeId>>> = LazyLock::new(|| Mutex::new(HashSet::new()));

mod unique {
    use super::*;

    // Due to its private data member, outside this module,
    // this struct can only be created using `new`.
    #[derive(Debug, PartialEq)]
    pub struct Unique<TypeAsId: 'static>(PhantomData<TypeAsId>);

    impl<TypeAsId: 'static> Unique<TypeAsId> {
        pub fn new() -> Option<Self> {
            let mut set = ID_SET.lock().unwrap();
            set.insert(TypeId::of::<TypeAsId>())
                .then(|| Self(PhantomData))
        }
    }

    impl<TypeAsId: 'static> Drop for Unique<TypeAsId> {
        fn drop(&mut self) {
            let mut set = ID_SET.lock().unwrap();
            (!set.remove(&TypeId::of::<TypeAsId>())).then(|| panic!("duplicity detected"));
        }
    }
}

// A FnRef can be used wherever a FnStaticRef can be used,
// so FnRef is a subtype of FnStaticRef.
// Both are 'static, and thus have a TypeId.
type FnRef = fn(&());
type FnStaticRef = fn(&'static ());

fn main() {
    type TheOneRing = FnStaticRef;

    let the_one_ring = Unique::<TheOneRing>::new().unwrap();
    assert_eq!(Unique::<TheOneRing>::new(), None);

    type OtherRing = FnRef;

    let other_ring = Unique::<OtherRing>::new().unwrap();
    let fake_one_ring = other_ring as Unique<TheOneRing>;
    assert_eq!(fake_one_ring, the_one_ring);
    std::mem::forget(fake_one_ring);
}

@hkBst
Copy link
Member

hkBst commented Jan 29, 2025

To clarify, the purpose of the documentation w.r.t. these snippets is to insist on the need for such type abstractions (such as Owner<O: 'static> here) to be invariant rather than having any kind of variance, right?

I would say, the purpose is to warn about the consequences of unintended variance, but certainly, automatically inferred variance can be a bit of a nuisance.

hkBst added a commit to hkBst/rust that referenced this issue Jan 29, 2025
@hkBst
Copy link
Member

hkBst commented Jan 29, 2025

PR is live, please comment there if you have ideas on it!

hkBst added a commit to hkBst/rust that referenced this issue Jan 29, 2025
hkBst added a commit to hkBst/rust that referenced this issue Jan 29, 2025
hkBst added a commit to hkBst/rust that referenced this issue Jan 29, 2025
hkBst added a commit to hkBst/rust that referenced this issue Jan 31, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
hkBst added a commit to hkBst/rust that referenced this issue Jan 31, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
hkBst added a commit to hkBst/rust that referenced this issue Jan 31, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
hkBst added a commit to hkBst/rust that referenced this issue Feb 11, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 11, 2025
@bors bors closed this as completed in e279c4e Feb 12, 2025
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 12, 2025
Rollup merge of rust-lang#136246 - hkBst:patch-29, r=ibraheemdev

include note on variance and example

Fixes rust-lang#89150
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this issue Feb 22, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to carolynzech/rust that referenced this issue Feb 22, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Feb 22, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Feb 22, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Mar 3, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Mar 4, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 6, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Mar 6, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 11, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 11, 2025
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 11, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 11, 2025
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this issue Mar 11, 2025
Fixes rust-lang#89150

Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants