-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Addition/update of various dictionary, set variants #6
Conversation
Thanks for the efforts to migrate the ordered containers here. Just out of curiosity, what approach does you use to implement the ordered containers? (In C++ STL, a balanced binary search tree of some sort is typically used, but it doesn't seem to be the case here). Also note that there are other efforts to implement red-black trees (see #5). |
Thanks Dahua. The I was just about to comment on the other issue. |
Very nice! |
@kmsquire Thanks for the clarification. I am generally happy with the PR. Just that we probably need some tests. |
Thanks Dahua. I agree, and will try to get to it this weekend. |
Okay, this should be in good shape. I did make some minor updates to the structure, in anticipation of a forthcoming
One choice I made is that I wanted each of the above dictionary types to be actual concrete classes (with their own constructors, etc.), which required that I implement them as wrappers. This makes the implementation slightly more verbose than using, e.g., type aliases, but I felt the ease of use made the trade-off worth it. Also, I should add explicitly that I used a Feedback and suggestions are very welcome. If all is good, I can squash and merge sometime tomorrow (Tuesday) (or someone else can). |
I took a brief skim. Curious about |
If you got any use out of |
Thanks for reviewing, Dahua. So that the record was more visible, I updated the pull request abstract with a long explanation about the organization of |
This PR looks good to me. @kmsquire: please go ahead to merge it if you think it is ready. (Let me know if you don't have the commit access). |
* OrderedDict (new) * OrderedSet (new) * DefaultDict (updated) * DefaultOrderedDict (new) The PR for this commit contains a detailed explanation of the evolution of OrderedDicts
RFC: Addition/update of various dictionary, set variants
Squashed and merged! |
Thanks for taking the time to go through with this! |
My pleasure! |
OrderedDict
(new)OrderedSet
(new)DefaultDict
(updated)DefaultOrderedDict
(new)The basic structure is there, but not yet any tests or documentation. Bugs are likely still lurking...
cc: @toivoh
Edit Here's an account of the evolution of
OrderedDicts
:I've tried various approaches to implement ordered dicts, and this was pretty much the only one with reasonable performance (about 10% slower than
Dicts
).Background: In
Dicts
, for efficiency, keys and values are stored in separate arrays, as:An easy way to implement
OrderedDicts
is to use a type or tuple forvals
which contains ordering information (generally, a doubly linked list). Unfortunately, this kills performance, becausevals
is (generally) no longer contiguous. (Note that I originally did this long before @loladiro's tuple updates).So, the most performant method is to include ordering information as part of the
Dict
type.HashDict
implements this. It is parametrized asOrdered
orUnordered
, which are aliases forInt
andNothing
, respectively.Nothing
arrays take up little or no space, so other than two additional variables in theDict
type, there is no memory difference and no performance difference betweenHashDict{K,V,Unordered}
andBase.Dict{K,V}
. (The idea forNothing
came from Base.Set.) Ideally, I'd like to seeBase.Dict
implemented based onHashDict
.In JuliaLang/julia#4038, I did this:
Unfortunately, it had its own issues
a. Constructing plain
Dicts
using constructor notation always required theUnordered
type parameter.b.
(Type1=>Type2)[]
producedDict{Type1,Type2,Unordered}
, which required a bit of mucking around in the parser to get things working. It was kinda fun, but it takes a bit of work.c.
OrderedDict
could be a typealias or constructor, but not both (see JuliaLang/julia#3427)Given all of this, I chose to make the
HashDict
type, and implementOrderedDicts
a thin wrapper around it (for aesthetics), and hope that at some point,HashDict
will become the base for bothDict
andOrderedDict
.One other consideration for the
OrderedDict
/HashDict
implementation (for the future)HashDict
uses two additional arrays to maintain ordering information:order
is an ordering of indices in thekeys
/vals
arrayidxs
is the same size askeys
andvals
, and points back into theorder
array, for easy updatingThis could allow direct access to the
nth
key-value pair (which was the original inspiration).Instead, these could be implemented as
prev
andnext
in a doubly linked list, which might actually allow some simplification. (Should probably try this.)