Titbits #3
thorwhalen
started this conversation in
General
Grouping by key and finding duplicates

The following function groups values by key, then returns a dict of the accumulated value lists, keeping only the entries whose list satisfies a value condition:

```python
from collections import defaultdict
import operator
import functools

def groupby_key(pairs, *, val_filt=lambda x: True):
    d = defaultdict(list)
    for key, value in pairs:
        d[key].append(value)
    return {k: v for k, v in d.items() if val_filt(v)}
```

With this, we can make a `values_of_duplicate_keys` function:

```python
values_of_duplicate_keys = functools.partial(groupby_key, val_filt=lambda x: len(x) > 1)
kv_pairs = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (3, 6)]
assert values_of_duplicate_keys(kv_pairs) == {1: [2, 3], 3: [4, 5, 6]}
```

One annoying problem is that because we use a filter that is a lambda function, our `values_of_duplicate_keys` function can't be pickled:

```python
import pickle
deserialized_func = pickle.loads(pickle.dumps(values_of_duplicate_keys))
# PicklingError: Can't pickle <function <lambda> at 0x12b371630>: attribute lookup <lambda> on __main__ failed
```

To solve this (if necessary!) we can use a trick mentioned in my [pickle(pickle, pickle)](https://medium.com/@thorwhalen1/partial-partial-partial-f90396901362) Medium article:
```python
from dol import Pipe  # just a function that composes functions (easy to make your own)

# operator.lt(1, x) is 1 < x, so this partial tests x > 1
greater_than_1 = functools.partial(operator.lt, 1)
# verify the function works
assert greater_than_1(2)
assert not greater_than_1(1)

length_greater_than_1 = Pipe(len, greater_than_1)
# the keyword must match groupby_key's parameter name: val_filt
values_of_duplicate_keys = functools.partial(groupby_key, val_filt=length_greater_than_1)
values_of_duplicate_keys(kv_pairs)

# we can serialize this:
import pickle
deserialized_func = pickle.loads(pickle.dumps(values_of_duplicate_keys))
# and it works
assert deserialized_func(kv_pairs) == {1: [2, 3], 3: [4, 5, 6]}
```
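The comment on the `Pipe` import says a composing function is easy to make yourself. As a rough sketch of what that could look like with only the standard library, here is a small callable class; the name `Compose` is hypothetical and this is not how `dol.Pipe` is implemented. The assumption is that it lives at module level, so pickle can find the class by name, and that every function passed to it is itself picklable.

```python
import functools
import operator
import pickle
from collections import defaultdict


class Compose:
    """Hypothetical stand-in for a composition helper: applies funcs left to right."""

    def __init__(self, *funcs):
        self.funcs = funcs

    def __call__(self, x):
        for func in self.funcs:
            x = func(x)
        return x


def groupby_key(pairs, *, val_filt=lambda x: True):
    # Same function as in the post, repeated so this sketch runs on its own
    d = defaultdict(list)
    for key, value in pairs:
        d[key].append(value)
    return {k: v for k, v in d.items() if val_filt(v)}


# operator.lt(1, x) is 1 < x
greater_than_1 = functools.partial(operator.lt, 1)
length_greater_than_1 = Compose(len, greater_than_1)
values_of_duplicate_keys = functools.partial(groupby_key, val_filt=length_greater_than_1)

kv_pairs = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (3, 6)]

# Round-trip through pickle and check the restored function still works
restored = pickle.loads(pickle.dumps(values_of_duplicate_keys))
assert restored(kv_pairs) == {1: [2, 3], 3: [4, 5, 6]}
```

This works because the pickle contains only things that can be looked up by name (`groupby_key`, `Compose`, `len`, `operator.lt`) plus plain data. A lambda has no importable name, which is why the first version failed.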
Here's a place to accumulate some ideas on titbits before they make it into the actual package.