-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TypeId
documentation should mention gotcha re covariant subtypes
#89150
Comments
@rustbot label +A-docs |
@uazu, could you translate this: "This means that taking a type-ID of a generic type argument in a type's new method is not guaranteed to give the same type-ID that the type will eventually be called with." to code, so I have a better chance of understanding this issue? Alternatively, since this seems quite technical and no-one has turned up to fix this, maybe propose your own fix? |
I am not a type theorist! My understanding of the problem is probably incomplete. However the way I understand it is as follows: Normally Rust won't let you silently convert types to other types (I'm talking about types themselves, not values). So if you take the type-id of a type, then later take the type-id of the same type specified separately, it will be the same. BUT that doesn't apply to lifetimes. Rust will happily let you change the lifetime on a type, and that will give a different type-id! So you can't count on the type-id staying the same if you have lifetimes involved. This is called covariance, as I understand it. This caused an actual bug in the The best code to demonstrate was provided by @steffahn in the bug report, which demonstrates bypassing TCell's type-id based check by converting the lifetime on a generic type using So if I have a struct that records a type-id of its generic type argument in one call, that type-id might not actually correspond to the generic type argument with which the exact same instance of that struct is called later on. Rust allows the caller to change the generic type argument through covariance to another type with a different lifetime and different type-id, without the instance's knowledge (converting The change I would like to see to Rust is simply to document this, so that someone like me coming along in the future won't be tripped up by this. But I do not know how best to phrase it. Essentially the issue is that you cannot assume that the type-id of a generic type argument is always the same from one call to another if lifetimes are involved. Perhaps that would be enough of a warning, and maybe mention the I hope this makes things clearer. |
https://doc.rust-lang.org/nomicon/subtyping.html has an explanation about subtyping and variance, but in short: Something with a longer lifetime can always be used somewhere that only a shorter lifetime is necessary, like objects with a 'static lifetime can be used anywhere. Then if a generic type T using the 'static lifetime (T::<'static>) can be used anywhere that that same type with a shorter lifetime (T::<'a>) can be used, then T is covariant in that generic lifetime parameter. Then I look at https://doc.rust-lang.org/nightly/std/any/struct.TypeId.html and your example code and it strikes me that the TypeId's produced are different and that they indeed compare (==) as different. And I see "A TypeId represents a globally unique identifier for a type." and "A TypeId is currently only available for types which ascribe to 'static, but this limitation may be removed in the future." in the docs. Certainly that "ascribe to 'static" is quite poetic, though it seems to refer to the bound on T in TypeId::of::<T: 'static + ?Sized>. If I write "fn x<'a> (: &'a i32) -> bool { |
The relevant detail here is that subtyping also supports coercions from HRTB-like types (i.e. trait objects with higher-ranked bounds like For a more illustrative and explanation, check out (the top part of) this comment by @danielhenrymantilla in a different issue: (it comes with a nice diagram 😻) As @hkBst correctly observes, merely differentiating As a TL;DR, the choice of some lifetime As a fun additional fact: in the context of higher-ranked lifetimes, you can even have different types that are mutual subtypes of each other. E.g. these two types: type T1 = for<'a, 'b> fn(&'a (), &'b ());
type T2 = for<'c> fn(&'c (), &'c ()); (playground, showing the subtyping and |
So if I understand correctly, using two types where S is a subtype of T and so can be used in its place, if you have a covariant parameter in some generic type G, then |
For my case, it is the same impl block, but I'm relying (for safety invariants) on the type-id staying the same through all calls to the struct methods. That was obviously an error on my part, since the caller can freely change the type (within the subtype restrictions), and hence the type-id. This was quite unexpected, and lead to the bug. So some form of warning in the docs would be very helpful. |
Different instantiations of the same generic impl block? I think we should add an example like this, to showcase the danger. |
Yes, it must be different instantiations of the same generic impl block. I tried it in the playground to verify that that was occurring, and it is (in playground):
I think an example along these lines would definitely help explain the danger. (In my case the type-id check occurs only once, and it's a lifetime comparison done by the compiler on another method that causes the issue, but if I had repeated the type-id check on that method, I would have been able to see the divergence between my first check and the second.) |
@uazu I think this is an excellent example showing that you cannot rely on the TypeId for safety. I think I have enough information now to attempt some doc fix, if you still prefer that, probably using this example mostly verbatim. But since you have now done a lot of the preparatory work already, perhaps you feel you might as well take the final step? I'm happy to go either way! |
Please go ahead with making the doc changes with this example or whatever you feel is appropriate. I think it's better done by someone with more familiarity with the internal Rust organization processes. I'd consider it a "win" if it leads to people avoiding this class of bugs in the future. |
@uazu Alright. I'm looking with fresh eyes at the bug report and code in qcell and it is making a lot more sense suddenly :D. I'm considering upgrading your example to something that is perhaps a bit closer to one of the cell types in qcell. Which cell type do you think would be most suited for that, in that the result would be the simplest to understand? |
Hmm, seems QCell is the simplest, but it uses an allocation for uniqueness, so it does not seem useful here, so maybe TCell. |
Just a 100 lines, but probably too much to include: use std::any::TypeId;
use std::cell::UnsafeCell;
use std::collections::HashSet;
use std::marker::PhantomData;
use std::sync::{LazyLock, Mutex};
static ID_SET: LazyLock<Mutex<HashSet<TypeId>>> = LazyLock::new(|| Mutex::new(HashSet::new()));
mod owner {
use super::*;
// Due to its private data member, outside this module,
// this struct can only be created using `new`.
pub struct Owner<O: 'static>(PhantomData<O>);
impl<O: 'static> Owner<O> {
pub fn id() -> TypeId {
TypeId::of::<O>()
}
pub fn new() -> Option<Self> {
let mut set = ID_SET.lock().unwrap();
set.insert(Self::id()).then(|| Self(PhantomData))
}
// Safety: requires mutable reference to a possession's unique owner,
// so the created mutable reference to a possession's precious is also unique
pub fn handle<T>(&mut self, p: &Possession<O, T>) -> &mut T {
unsafe { &mut *p.precious.get() }
}
}
impl<O: 'static> Drop for Owner<O> {
fn drop(&mut self) {
let mut set = ID_SET.lock().unwrap();
(!set.remove(&Self::id())).then(|| panic!("intruder detected"));
}
}
}
use owner::*;
struct Possession<OwnerT, T> {
owner: PhantomData<OwnerT>,
precious: UnsafeCell<T>,
}
impl<O: 'static, T: 'static> Possession<O, T> {
fn new(precious: T) -> Self {
eprintln!("Created with id: {:?}", Owner::<O>::id());
Self {
owner: PhantomData,
precious: UnsafeCell::new(precious),
}
}
fn owner_id(&self) -> TypeId {
eprintln!("Current id: {:?}", Owner::<O>::id());
Owner::<O>::id()
}
}
#[derive(Debug, PartialEq)]
struct TheOneRing;
macro_rules! the_hobbit {
($sneak:ty, $hobbit:ty) => {{
let mut gollum = Owner::<$sneak>::new().unwrap();
let mut bilbo = Owner::<$hobbit>::new().unwrap();
let precious = { Possession::<$sneak, _>::new(Some(TheOneRing)) };
// gollum plays with the one ring
gollum.handle(&precious);
// bilbo cannot play with the one ring
// bilbo.handle(&precious); // arguments to this method are incorrect
// bilbo pretends to be gollum and steals the one ring
let mut sneaky_bilbo = bilbo as Owner<$sneak>;
let ring = sneaky_bilbo.handle(&precious).take();
std::mem::forget(sneaky_bilbo); // or bilbo will be detected
ring
}};
}
// compile_fail: non-primitive cast: `owner::Owner<expected::Hobbit>` as `owner::Owner<expected::Sneak>
#[test]
#[should_panic]
fn expected() {
struct Sneak;
struct Hobbit;
assert_eq!(None, the_hobbit!(Sneak, Hobbit));
}
#[test]
fn an_unexpected_journey() {
type Sneak = fn(&'static ());
type Hobbit = fn(&());
assert_eq!(Some(TheOneRing), the_hobbit!(Sneak, Hobbit));
} |
To clarify, the purpose of the documentation w.r.t. these snippets is to insist on the need for such type abstractions (such as
|
Simpler example with ~50 lines: use std::any::TypeId;
use std::collections::HashSet;
use std::marker::PhantomData;
use std::sync::{LazyLock, Mutex};
use unique::Unique;
static ID_SET: LazyLock<Mutex<HashSet<TypeId>>> = LazyLock::new(|| Mutex::new(HashSet::new()));
mod unique {
use super::*;
// Due to its private data member, outside this module,
// this struct can only be created using `new`.
#[derive(Debug, PartialEq)]
pub struct Unique<TypeAsId: 'static>(PhantomData<TypeAsId>);
impl<TypeAsId: 'static> Unique<TypeAsId> {
pub fn new() -> Option<Self> {
let mut set = ID_SET.lock().unwrap();
set.insert(TypeId::of::<TypeAsId>())
.then(|| Self(PhantomData))
}
}
impl<TypeAsId: 'static> Drop for Unique<TypeAsId> {
fn drop(&mut self) {
let mut set = ID_SET.lock().unwrap();
(!set.remove(&TypeId::of::<TypeAsId>())).then(|| panic!("duplicity detected"));
}
}
}
// A FnRef can be used wherever a FnStaticRef can be used,
// so FnRef is a subtype of FnStaticRef.
// Both are 'static, and thus have a TypeId.
type FnRef = fn(&());
type FnStaticRef = fn(&'static ());
fn main() {
type TheOneRing = FnStaticRef;
let the_one_ring = Unique::<TheOneRing>::new().unwrap();
assert_eq!(Unique::<TheOneRing>::new(), None);
type OtherRing = FnRef;
let other_ring = Unique::<OtherRing>::new().unwrap();
let fake_one_ring = other_ring as Unique<TheOneRing>;
assert_eq!(fake_one_ring, the_one_ring);
std::mem::forget(fake_one_ring);
} |
I would say, the purpose is to warn about the consequences of unintended variance, but certainly, automatically inferred variance can be a bit of a nuisance. |
PR is live, please comment there if you have ideas on it! |
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
include note on variance and example Fixes rust-lang#89150
Rollup merge of rust-lang#136246 - hkBst:patch-29, r=ibraheemdev include note on variance and example Fixes rust-lang#89150
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
include note on variance and example Fixes rust-lang#89150
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
include note on variance and example Fixes rust-lang#89150
Fixes rust-lang#89150 Co-authored-by: Daniel Henry-Mantilla <[email protected]>
A covariant subtype gets a different type-ID, but can be converted freely from one to the other. This means that taking a type-ID of a generic type argument in a type's
new
method is not guaranteed to give the same type-ID that the type will eventually be called with. The solution is to force the generic type argument to be invariant using PhantomData. See this bug inqcell
crate.It would be helpful if this were documented to avoid similar bugs occurring in other crates.
The text was updated successfully, but these errors were encountered: