RFC: Linked list cursors #2570
This RFC seems quite vague on the specifics. For example:
This seems like the kind of thing that might be better implemented as a crate first - given the amount of new API surface being added, I don't see how it could ever be accepted without at least a sample implementation. |
@Diggsey good points. I'll work on a reference implementation so this can be accepted. |
Currently working on the implementation. All is done except for splitting lists. See here: https://github.com/4e554c4c/list_cursors |
Perhaps this could be an eRFC to allow greater experimentation? I would really like to play around with your implementation on nightly before deciding if we want to keep it as is, modify it, or go a different route. |
@mark-i-m the API would land unstable regardless of the e-ness of the RFC. |
@sfackler Yes, but the e-ness allows us to proceed without as clear a vision of what API we will eventually adopt. In contrast, IIUC, if we accept a normal RFC for this, we are saying that we are reasonably confident in this approach. I think the approach has merit and is worth trying out, but I can't personally support strongly accepting the RFC because I simply don't have any experience with such an API. |
I'm still learning the Rust RFC process. How would I submit an eRFC?
|
@4e554c4c put an "e" before "RFC" ^.^ There's not much difference other than a change in intent. |
I have edited the title to show that this is an eRFC. The API is not finalized and possibly incomplete and I am taking suggestions. |
The cursor API that you propose seems to be almost identical to the one that I implemented a while ago for One thing to note in particular as to how cursors work in my crate:
Another difference is that I have Regarding |
@Amanieu thank you for commenting! |
If I am not mistaken, the proposed API is not safe, i.e.
let mut c = list.cursor_mut();
c.move_next();
// here we create a mutable reference to an element
let u = c.current();
c.move_prev();
// drop the element
drop(c.pop());
// unwrap the Option to get at the reference to the element that was just dropped
let mut dangle = u.unwrap();
// dereferencing a dangling reference to the now no longer existing element
**dangle = &777;
// use after free confirmed by valgrind...
(I am still new to Rust, so take the following with a grain of salt...) The reason is that in the code above Now we have basically created two mutable references to the same data: I think one way to prevent this is to separate the operation of editing the list and editing its elements, i.e. another If I remember correctly, this is the reason we have no similar interface in std... N.B. This is a nice reference highlighting some interesting problems. |
Interesting! I thought that the reference would borrow the cursor (like how IndexMut borrows slices). |
@4e554c4c having given it a little more thought, I think the prototype for
pub fn current(&mut self) -> Option<&mut T> {...}
// which, with the elided lifetime written out, is
pub fn current<'b>(&'b mut self) -> Option<&'b mut T> {...}
not
pub fn current(&mut self) -> Option<&'a mut T> {...}
The first gives the reference a lifetime that is smaller than |
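To make the lifetime argument concrete, here is a minimal sketch, assuming a hypothetical `CursorMut` backed by a `Vec` rather than the RFC's linked list. With the elided signature, the reference returned by `current` keeps the cursor mutably borrowed, so it cannot be held across a call that removes the element.

```rust
// Hypothetical stand-in for the RFC's CursorMut, backed by a Vec for brevity.
struct CursorMut<'a, T> {
    items: &'a mut Vec<T>,
    index: usize,
}

impl<'a, T> CursorMut<'a, T> {
    fn new(items: &'a mut Vec<T>) -> Self {
        CursorMut { items, index: 0 }
    }

    fn move_next(&mut self) {
        if self.index < self.items.len() {
            self.index += 1;
        }
    }

    // Elided form of `fn current<'b>(&'b mut self) -> Option<&'b mut T>`:
    // the returned reference borrows the cursor itself, not the list for 'a.
    fn current(&mut self) -> Option<&mut T> {
        self.items.get_mut(self.index)
    }

    // Removes and returns the element under the cursor, if any.
    fn remove_current(&mut self) -> Option<T> {
        if self.index < self.items.len() {
            Some(self.items.remove(self.index))
        } else {
            None
        }
    }
}

fn borrow_demo() -> Vec<i32> {
    let mut v = vec![1, 2, 3];
    let mut c = CursorMut::new(&mut v);
    c.move_next();
    if let Some(x) = c.current() {
        *x = 20; // the borrow of `c` ends once `x` is no longer used
    }
    // Keeping `x` alive past this call would be rejected by the borrow
    // checker, because both `current` and `remove_current` need `&mut c`.
    let removed = c.remove_current();
    assert_eq!(removed, Some(20));
    v
}
```

Returning `Option<&'a mut T>` instead would let the reference outlive the cursor borrow, which is what allowed the dangling reference shown earlier in the thread.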
Yep! That seems to fix the problem. The lifetime annotation I added seems to have made this worse. I'm going to look at the lifetimes more and update the RFC.
fn main() {
let mut list = LinkedList::new();
let mut c = list.cursor_mut();
c.insert(3);
c.insert(2);
c.insert(1);
let u = {
let mut c = c.as_cursor();
c.move_next();
c.current()
};
drop(c.pop());
// use after free!
println!("element: {:?}", u);
} |
Ok, lifetimes should be added to the RFC and fixed in the reference implementation. Tell me what y'all think. |
@4e554c4c: I think the example is actually OK. (I think you meant with the modified signatures
impl<'a, T> Cursor<'a, T> {
...
pub fn current(&self) -> Option<&T> {...}
...
}
...
impl<'a, T> CursorMut<'a, T> {
...
pub fn current(&mut self) -> Option<&mut T> {...}
...
}
the borrow checker (on nightly) complains about |
If my reading of the Nomicon is correct (
pub fn as_cursor<'cm>(&'cm self) -> Cursor<'cm, T>;
By elision rules this should be the same as
pub fn as_cursor(&self) -> Cursor<T>;
(clippy nags about it...) |
To be precise, the methods on |
Actually, this would probably work:
impl<'a, T> Cursor<'a, T> {
...
pub fn current(&self) -> Option<&'a T> {...}
...
}
...
impl<'a, T> CursorMut<'a, T> {
...
pub fn current_mut(&mut self) -> Option<&mut T> {...}
pub fn current(&self) -> Option<&T> {...}
// Same with peek/peek_mut, etc
...
}
The advantage of having |
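One consequence of returning `Option<&'a T>` from the shared cursor can be sketched as follows. This is a hypothetical slice-backed `Cursor`, not the reference implementation: because the returned reference is tied to the underlying collection rather than to the borrow of the cursor, it stays usable while the cursor moves on and even after the cursor is gone.

```rust
// Hypothetical read-only cursor over a slice, standing in for the RFC's Cursor.
struct Cursor<'a, T> {
    items: &'a [T],
    index: usize,
}

impl<'a, T> Cursor<'a, T> {
    fn new(items: &'a [T]) -> Self {
        Cursor { items, index: 0 }
    }

    fn move_next(&mut self) {
        if self.index < self.items.len() {
            self.index += 1;
        }
    }

    // The result lives for 'a (the collection), not for the borrow of `self`.
    fn current(&self) -> Option<&'a T> {
        let items: &'a [T] = self.items;
        items.get(self.index)
    }
}

fn first_two(items: &[i32]) -> Vec<&i32> {
    let mut c = Cursor::new(items);
    let mut out = Vec::new();
    if let Some(a) = c.current() {
        out.push(a);
    }
    c.move_next(); // allowed while `a` is still held: `a` does not borrow `c`
    if let Some(b) = c.current() {
        out.push(b);
    }
    out // the references outlive the cursor
}
```

The same signature would be unsound on a mutable cursor, which is why the mutable variant above keeps the elided (cursor-bound) lifetime.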
Good point, I'll leave that one how it was. |
The reference implementation should be pretty much complete thanks to @xaberus |
Having tinkered with the reference implementation for a while, here are my impressions and comments so far.
I hope to continue this list when I have time to work on it. What are your comments so far? |
I think this can work. The main reason that If the cursor doesn't start at the empty element, could we just do away with the empty element altogether? We could have the invariant that the The only problem I see with this is that we would need to add some extra method to determine whether the cursor is at the beginning/end of a list, e.g. |
This is what I wrote in one of my first attempts, but it is kind of surprising in the case of an empty list:
v.s.
If I request a cursor at the |
@xaberus Note that a wrapping cursor (where BeforeFirst is the same as AfterLast) avoids this problem. This is why I went with that approach in |
@Amanieu I see. In a way, this is not a real problem: The invariants (sans the last one) enforce a wrapping behavior for inserts which effectively collapses the cursor states for the empty list to
// non-wrapping
let mut c = list.cursor(BeforeFirst);
for _ in 0..n {
c.move_next();
}
c.current()
v.s.
// wrapping
let mut c = list.cursor(BeforeFirst);
c.move_next();
for _ in 0..n {
if c.current().is_none() {
return None;
} else {
c.move_next();
}
}
c.current() |
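The difference between the two loops can be tried out with a small model. The sketch below assumes a hypothetical slice-backed `WrappingCursor` in which `None` is the single ghost position; it is not the API of any crate mentioned in this thread. Stepping blindly wraps around to the front of the list, while checking for the ghost on every step stops at the end.

```rust
// Hypothetical wrapping cursor: `None` is the single ghost position that sits
// both before the first and after the last element.
struct WrappingCursor<'a, T> {
    items: &'a [T],
    pos: Option<usize>,
}

impl<'a, T> WrappingCursor<'a, T> {
    fn at_ghost(items: &'a [T]) -> Self {
        WrappingCursor { items, pos: None }
    }

    fn current(&self) -> Option<&'a T> {
        let items: &'a [T] = self.items;
        self.pos.and_then(|i| items.get(i))
    }

    // Ghost -> first element, last element -> ghost.
    fn move_next(&mut self) {
        self.pos = match self.pos {
            None if self.items.is_empty() => None,
            None => Some(0),
            Some(i) if i + 1 < self.items.len() => Some(i + 1),
            Some(_) => None,
        };
    }
}

// Blind stepping: walks past the ghost and wraps around to the front.
fn nth_unchecked<T>(items: &[T], n: usize) -> Option<&T> {
    let mut c = WrappingCursor::at_ghost(items);
    for _ in 0..=n {
        c.move_next();
    }
    c.current()
}

// Checks for the ghost on every step, like the wrapping loop above.
fn nth_checked<T>(items: &[T], n: usize) -> Option<&T> {
    let mut c = WrappingCursor::at_ghost(items);
    c.move_next();
    for _ in 0..n {
        c.current()?; // stop as soon as the ghost is reached
        c.move_next();
    }
    c.current()
}
```

With three elements, `nth_checked(items, 4)` yields `None`, whereas `nth_unchecked(items, 4)` lands back on the first element.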
🔔 This is now entering its final comment period, as per the review above. 🔔 |
The |
I would prefer if some changes were made to the method names. In particular, I think that every method name should explicitly specify the direction in which the operation is performed (forwards or backwards).
|
I'm just driving by, but @Amanieu has got it right. This is not a clunky interface at all as long as the meaning of those API methods is really clear and unambiguous about which elements are being affected. This RFC actually strongly reminds me of the more general Zipper concept that Haskell has explored at length (and linked lists are just degenerate trees). I suspect that's not really pertinent to the matter at hand, but I think it's interesting. |
@Amanieu Aiming for symmetry with explicitly specified directions sounds good; however, doesn't this bit present a smidgen of discord in the plan?
Why to the next element rather than the previous? Why aren't there then two versions of it, with explicitly specified directions, like the others? Of course as a matter of pragmatics, you more often want to move forwards rather than backwards, but if we're choosing not to privilege the forwards direction in the rest of the API... |
@glaebhoerl: I second that. Removing an item usually can be decided by a predicate
let mut c = list.cursor_mut(BeforeHead);
while c.peek_next().is_some() {
if let Some(true) = c.peek_next().map(f) {
c.pop_next();
} else {
c.move_next();
}
}
sounds like a reasonable interface that is symmetric by replacing |
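The predicate loop above can be exercised against a toy model. The following is a sketch assuming a hypothetical `GapCursor` that sits between elements and is backed by a `Vec`; the names `peek_next`, `pop_next`, and `move_next` mirror the comment, but the type itself is invented for illustration.

```rust
// Hypothetical "between elements" cursor, backed by a Vec for brevity.
// `gap` counts how many elements lie before the cursor.
struct GapCursor<'a, T> {
    items: &'a mut Vec<T>,
    gap: usize,
}

impl<'a, T> GapCursor<'a, T> {
    fn before_head(items: &'a mut Vec<T>) -> Self {
        GapCursor { items, gap: 0 }
    }

    // Looks at the element just after the cursor without moving.
    fn peek_next(&self) -> Option<&T> {
        self.items.get(self.gap)
    }

    // Removes the element just after the cursor; the cursor stays in place.
    // (Vec::remove is O(n); a linked list would do this in O(1).)
    fn pop_next(&mut self) -> Option<T> {
        if self.gap < self.items.len() {
            Some(self.items.remove(self.gap))
        } else {
            None
        }
    }

    // Steps over the element just after the cursor.
    fn move_next(&mut self) {
        if self.gap < self.items.len() {
            self.gap += 1;
        }
    }
}

// Removes every element for which `f` returns true.
fn remove_if<T>(items: &mut Vec<T>, f: impl Fn(&T) -> bool) {
    let mut c = GapCursor::before_head(items);
    while let Some(x) = c.peek_next() {
        if f(x) {
            c.pop_next();
        } else {
            c.move_next();
        }
    }
}
```

A backwards variant would presumably swap in `peek_prev`, `pop_prev`, and `move_prev` and start from the tail.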
I think that it might be possible to allow for multiple mutable cursors, safely, with a method that splits out two bounded cursors. These bounded cursors will then recognize the position at which the cursor was split, and if the linked list element pointed to is that node, following that pointer through the bounded cursor API instead goes to that bounded cursor's sentinel, and looping back around from the far end will go to the node adjacent to the split-at node on the bounded cursor's side instead of to the other end. Splitting at the sentinel doesn't make sense, so if a split is attempted at the sentinel, then
pub fn split_at_mut<'cm>(&'cm mut self) -> Option<(BoundedCursorMut<'cm, T>, &'cm mut T, BoundedCursorMut<'cm, T>)>;
(The node in the middle where the list is split is a no man's land for both bounded cursors, so that if the resulting bounded cursors are used to manipulate the nodes bordering the split, they don't have to change the pointers on the other side's nodes to retain linked list consistency, instead only having to change the middle node's pointer pointing towards their side.) |
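The shape of the proposed return type (left part, middle element, right part, all mutable at once) already has an analogue on slices. The sketch below uses the existing `split_at_mut` and `split_first_mut` slice methods; `split_around` is a hypothetical helper name, and slices of course lack the sentinel and wrap-around behavior described above.

```rust
// Slice analogue of the proposed split: (left part, middle element, right part).
// Returns None when there is no element at `mid`, loosely mirroring a split
// attempted at the sentinel.
fn split_around<T>(items: &mut [T], mid: usize) -> Option<(&mut [T], &mut T, &mut [T])> {
    if mid >= items.len() {
        return None;
    }
    let (left, rest) = items.split_at_mut(mid);
    let (middle, right) = rest.split_first_mut()?;
    Some((left, middle, right))
}

fn split_demo() -> [i32; 5] {
    let mut data = [1, 2, 3, 4, 5];
    if let Some((left, middle, right)) = split_around(&mut data, 2) {
        // All three parts are disjoint, so they can be mutated independently.
        left[0] += 10;
        *middle *= 100;
        right[1] += 10;
    }
    data
}
```

The borrow checker accepts this because the three returned borrows never overlap, which is the same property the bounded cursors would need to uphold.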
The final comment period, with a disposition to merge, as per the review above, is now complete. |
Anyone going to make a tracking issue? |
@clarcharr I usually made them but I missed this one for some reason... @sfackler given the recent discussion, can someone from T-libs create the tracking issue and merge the RFC to make sure that y'all have taken in and are OK with the recent discussion? |
Just my grain of salt, but I have used a similar API (https://contain-rs.github.io/linked-list/linked_list/struct.Cursor.html) for a circular linked_list, and the ghost element made it hard to move around since I wasn't interested in where the beginning/end of the list was. The solution I had was to implement seek_forward/seek_backward methods that ensured that the next element was never the ghost, but it was clunky. Would it be somehow possible to have a way to say "I don't want a ghost"? i.e. the ghost would only be here for the empty list, but as soon as you insert an element, it becomes a loop to itself. We could potentially have it as a generic parameter? (not sure about it) |
Ping? |
I am currently on holiday, I will merge this RFC when I get back next week. |
Huzzah! This RFC is hereby merged! Tracking issue: rust-lang/rust#58533 |
The
Instead, the code in the example will visit the same element infinitely until the predicate function returns false. Also, it never checks the first element. Here is a rework:
fn remove_replace<T, P, F>(list: &mut LinkedList<T>, p: P, f: F)
where P: Fn(&T) -> bool, F: Fn(T) -> T
{
let mut cursor = list.cursor_front_mut();
loop {
let should_replace = match cursor.current() {
Some(element) => p(element),
None => break,
};
if should_replace {
let old_element = cursor.remove_current().unwrap();
cursor.insert_before(f(old_element));
} else {
cursor.move_next();
}
}
} |
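For checking the intended result without nightly cursors, here is a stable-Rust sketch that should produce the same final list as the rework above, assuming `remove_current` leaves the cursor on the following element (which the rework's loop relies on). The name `remove_replace_stable` is hypothetical, and the function rebuilds the list instead of editing it in place.

```rust
use std::collections::LinkedList;

// Stable stand-in without cursors: every element matched by `p` is replaced
// by `f(element)`, all others are kept, and the order is preserved.
fn remove_replace_stable<T, P, F>(list: &mut LinkedList<T>, p: P, f: F)
where
    P: Fn(&T) -> bool,
    F: Fn(T) -> T,
{
    // Take ownership of the old contents, leaving an empty list behind.
    let old = std::mem::take(list);
    for element in old {
        let element = if p(&element) { f(element) } else { element };
        list.push_back(element);
    }
}
```

Each element is visited exactly once, so the predicate is never re-applied to a replacement value.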
🖼️ Rendered
⏭ Tracking issue
This is my first RFC, so feel free to critique :)
📝 Summary
Many of the benefits of linked lists rely on the fact that most operations (insert, remove, split, splice etc.) can be performed in constant time once one reaches the desired element. To take advantage of this, a Cursor interface can be created to efficiently edit linked lists. Furthermore, unstable extensions like the IterMut changes will be removed.
The reference implementation is here, feel free to make implementation suggestions. :)
❌ TODO
split, split_before: split lists (this seems to be ambiguous in the specification)