filter: Allow --group-by week, refactoring #1067

Merged: 4 commits, Oct 25, 2022


@victorlin victorlin commented Oct 20, 2022

Description of proposed changes

Some group-by cleanup and implementation of --group-by week.

Related issue(s)

Testing

  • Checks pass
  • Added new Cram test

Checklist

  • Add a message in CHANGES.md summarizing the changes in this PR. Keep headers and formatting consistent with the rest of the file.

Group by day is not practical because:

1. The number of groups would overly bias temporal diversity.
2. In many datasets, daily sequencing volume varies between days of the week.

Put the generated columns in a set so that new columns can be added more easily.
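
For illustration, grouping by week rather than by day might look like the sketch below. This is not the code from this PR: it assumes that "week" means the ISO 8601 week, which the description above does not state, and `week_key` and `group_by_week` are hypothetical helper names. Each record is keyed by its (ISO year, ISO week) pair, so all seven weekdays fall into one group and day-of-week variation in sequencing volume no longer splits the groups.

```python
from collections import defaultdict
from datetime import date


def week_key(d):
    """Return (ISO year, ISO week number) for a date.

    Assumption: weeks follow ISO 8601 (Monday start). Near January 1 the
    ISO year can differ from the calendar year.
    """
    iso = d.isocalendar()
    return (iso[0], iso[1])


def group_by_week(records):
    """Group (name, 'YYYY-MM-DD') pairs by their ISO year and week."""
    groups = defaultdict(list)
    for name, iso_date in records:
        groups[week_key(date.fromisoformat(iso_date))].append(name)
    return dict(groups)


# Hypothetical records, used only to exercise the helpers.
records = [
    ("seq1", "2022-10-17"),  # Monday
    ("seq2", "2022-10-23"),  # Sunday of the same ISO week
    ("seq3", "2022-10-24"),  # Monday of the following week
    ("seq4", "2021-01-03"),  # belongs to the last ISO week of 2020
]

for key, names in sorted(group_by_week(records).items()):
    print(key, names)
```

Keeping the ISO year alongside the week number matters because a bare week number would merge the same week from different years, and dates in early January can belong to the previous ISO year.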
@victorlin victorlin requested a review from a team October 20, 2022 18:50
@victorlin victorlin self-assigned this Oct 20, 2022

codecov bot commented Oct 20, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 61.77%. Comparing base (f148f32) to head (3043718).
Report is 1189 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1067      +/-   ##
==========================================
+ Coverage   61.68%   61.77%   +0.09%     
==========================================
  Files          52       52              
  Lines        6300     6316      +16     
  Branches     1585     1550      -35     
==========================================
+ Hits         3886     3902      +16     
  Misses       2141     2141              
  Partials      273      273              


@victorlin victorlin force-pushed the victorlin/update-filter-group-by branch from 89df17d to c4df460 Compare October 20, 2022 22:21
@victorlin victorlin force-pushed the victorlin/update-filter-group-by branch 2 times, most recently from 6c21ae2 to a6ec84a Compare October 20, 2022 23:12
@victorlin victorlin force-pushed the victorlin/update-filter-group-by branch from a6ec84a to c0abbda Compare October 21, 2022 01:23
@victorlin victorlin requested a review from a team October 21, 2022 18:28
@victorlin victorlin force-pushed the victorlin/update-filter-group-by branch from c0abbda to f2a6d71 Compare October 24, 2022 18:17
@victorlin victorlin requested a review from joverlee521 October 24, 2022 18:17

victorlin commented Oct 24, 2022

This PR now has some downstream PRs. Ideally, I'd like to merge in this order to avoid merge conflicts:

  1. This PR: filter: Allow --group-by week, refactoring #1067
  2. filter: Use intermediate columns for grouping #1070
  3. Use stacked ambiguous date checking #1072

but I can rebase if needed.


@joverlee521 joverlee521 left a comment

Just one comment, but looks good to merge 👍
I'll move on to the other PRs 😄

@victorlin victorlin force-pushed the victorlin/update-filter-group-by branch from f2a6d71 to 3043718 Compare October 25, 2022 17:33
@victorlin victorlin merged commit 0a5bfc1 into master Oct 25, 2022
@victorlin victorlin deleted the victorlin/update-filter-group-by branch October 25, 2022 17:47
Successfully merging this pull request may close these issues.

filter: Reduce over-sampling in partial months with --group-by month