Improve k-mer filter #11
Comments
nthash may simplify this. It might even be worth trying it instead of rollinghash in the main function, but that choice needs to be finalized now, as it will change sketch results.
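For readers unfamiliar with the idea, here is a minimal sketch of what a rolling hash over k-mers does. It is a generic polynomial rolling hash, not the algorithm used by nthash or rollinghash; `BASE`, `MOD`, `CODE`, `direct_hash` and `rolling_hashes` are hypothetical names chosen for illustration. The point is only that each new k-mer hash is derived from the previous one in constant time, and that a different hash function yields different values, which is why the choice affects sketch results.

```python
# Generic polynomial rolling hash over DNA k-mers (illustrative only;
# this is NOT the nthash or rollinghash algorithm).
BASE = 4
MOD = (1 << 61) - 1
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def direct_hash(kmer):
    """Hash a k-mer from scratch in O(k)."""
    h = 0
    for base in kmer:
        h = (h * BASE + CODE[base]) % MOD
    return h


def rolling_hashes(seq, k):
    """Hash every k-mer in seq, updating in O(1) per position."""
    if len(seq) < k:
        return []
    top = pow(BASE, k - 1, MOD)  # weight of the base leaving the window
    h = direct_hash(seq[:k])
    out = [h]
    for i in range(k, len(seq)):
        h = (h - CODE[seq[i - k]] * top) % MOD  # drop the oldest base
        h = (h * BASE + CODE[seq[i]]) % MOD     # append the newest base
        out.append(h)
    return out
```

The rolled values match hashing each window from scratch, so the only saving is time per k-mer.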
nthash (6a4a4a2) looks to be slightly quicker on both tests, and has similar memory use. This might be due to my poor seqio.hpp move code for the next and out, or just that they integrated all of this with C strings anyway. Whatever the case, I will switch to this, especially as:
Closed in 7029c0c.
This improves on #4. It is left as future work for now.
The current general hash table approach is a reasonable first trade-off between accuracy (it is exact), speed (it is quite slow, but could be worse) and memory use (~1 GB, so it will fit on a Raspberry Pi). This filtering is the main resource user for read sketching, but changing it will not change sketch results, so it is a potential area for future improvement.
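As a rough illustration of the exact hash table approach, the sketch below counts every k-mer in a set of reads and keeps those seen at least `min_count` times. It rests on assumptions not stated in this issue: that the filter is a simple minimum-count threshold, and that k-mers are stored as plain strings without canonicalisation. The names `count_kmers`, `filter_kmers` and `min_count` are hypothetical and not taken from the project's code.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer across all reads in an exact hash table."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def filter_kmers(reads, k, min_count):
    """Return the k-mers observed at least min_count times (assumed rule)."""
    counts = count_kmers(reads, k)
    return {kmer for kmer, n in counts.items() if n >= min_count}


reads = ["ACGTAC", "ACGTAA", "TTGTAC"]
solid = filter_kmers(reads, k=4, min_count=2)
```

Because every distinct k-mer gets its own table entry, the counts are exact, but memory grows with the number of distinct k-mers in the reads, which is the cost described above.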
Possible improvements include:
Other references: