$ cat data/ncbi/metadata.tsv | csvtk grep -t -f strain -P auspice.strains.tsv | csvtk cut -t -f strain,genoflu | grep -v B3.13 | csvtk pretty -t
strain                                  genoflu
-------------------------------------   ----------------------------------------------------------------------------------
A/cattle/Texas/24-009499-002/2024       Not assigned: No Matching Genotypes
A/cattle/Texas/24-009308-003/2024       Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/cattle/NewMexico/24-010195-004/2024   Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/cattle/Colorado/Broad_MD_041/2024     Not assigned: Only 6 segments >98.0% match found of total 8 segments in input file
A/cattle/Colorado/Broad_ME_003/2024     Not assigned: Only 6 segments >98.0% match found of total 8 segments in input file
A/cattle/Idaho/Broad_ME_018/2024        Not assigned: Only 6 segments >98.0% match found of total 8 segments in input file
A/cattle/Idaho/Broad_ME_020/2024        Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/cattle/Colorado/Broad_MF_011/2024     Not assigned: Only 5 segments >98.0% match found of total 8 segments in input file
A/cattle/Missouri/Broad_MD_031/2024     Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/cattle/Texas/Broad_MD_027/2024        Not assigned: Only 5 segments >98.0% match found of total 8 segments in input file
A/cattle/Colorado/Broad_MF_016/2024     Not assigned: Only 6 segments >98.0% match found of total 8 segments in input file
A/cattle/Michigan/Broad_ME_010/2024     Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/Cattle/USA/24-031346-001/2024         Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/Cattle/USA/24-032636-001/2024         Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/Cattle/USA/24-034010-002/2024         Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/Cattle/USA/24-034010-001/2024         Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/Cattle/USA/24-033997-001/2024         Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/PETFOOD/USA/24-037325-013/2024        Not assigned: Only 7 segments >98.0% match found of total 8 segments in input file
A/PETFOOD/USA/24-037325-012/2024        Not assigned: Only 5 segments >98.0% match found of total 8 segments in input file
Including full genome strains which don't match B3.13
We may wish to relax the 98% cutoff. Looking at some of the examples above, the number of Ns may be behind their exclusion:
A/cattle/Texas/24-009499-002/2024 has 4.5 kb of Ns on the branch leading to it but few mutations, indicating that it is likely part of the outbreak
A/cattle/Texas/24-009308-003/2024 similarly has 4.5 kb of Ns
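To make the point about Ns concrete, here is a minimal Python sketch of how ambiguous bases could pull an otherwise identical segment below a percent-identity cutoff. It assumes that Ns are effectively counted against the match percentage, which is not confirmed here for GenoFLU; the function names `n_fraction` and `identity` are hypothetical helpers, not part of GenoFLU or this repository.

```python
def n_fraction(seq: str) -> float:
    """Return the fraction of positions in seq that are N (case-insensitive)."""
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


def identity(query: str, ref: str, ignore_n: bool = False) -> float:
    """Percent identity between two aligned, equal-length sequences.

    With ignore_n=True, positions where the query is N are skipped instead
    of being counted as mismatches (an assumption for illustration only).
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for q, r in zip(query.upper(), ref.upper()):
        if ignore_n and q == "N":
            continue  # skip ambiguous query positions entirely
        compared += 1
        if q == r:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy example: every called base matches, but 4 of 10 positions are N.
query = "ACGTNNNNAC"
ref = "ACGTACGTAC"
print(n_fraction(query))                     # 0.4
print(identity(query, ref))                  # 60.0
print(identity(query, ref, ignore_n=True))   # 100.0
```

If the real comparison behaves like the default case, a segment with many Ns and few true mutations could miss a 98% threshold, which would be one argument for relaxing the cutoff or masking Ns before matching.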
Including strains with fewer than 8 segments sequenced
If we modify GenoFLU to report segment-level calls for strains with <8 segments, then we can match on (e.g.) "7 segments sequenced and all agree with B3.13 constellation". This improvement to GenoFLU was also mentioned here as being desirable more generally.
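The matching rule described above could look roughly like the following Python sketch. It assumes GenoFLU were extended to emit one call per segment (it does not do so for incomplete genomes today, per this issue). The function name `matches_constellation`, the `min_segments` parameter, and the placeholder lineage labels are all hypothetical; the real B3.13 constellation would need to be taken from GenoFLU itself.

```python
from typing import Mapping, Optional


def matches_constellation(
    calls: Mapping[str, Optional[str]],
    constellation: Mapping[str, str],
    min_segments: int = 7,
) -> bool:
    """Return True if enough segments are sequenced and all of them agree.

    calls maps segment name -> per-segment call, with None for a segment
    that was not sequenced. constellation maps segment name -> the call
    expected for the genotype of interest.
    """
    sequenced = {seg: call for seg, call in calls.items() if call is not None}
    if len(sequenced) < min_segments:
        return False
    return all(constellation.get(seg) == call for seg, call in sequenced.items())


# Placeholder constellation: these labels are NOT the real B3.13 lineages.
segments = ("PB2", "PB1", "PA", "HA", "NP", "NA", "MP", "NS")
placeholder_b313 = {seg: f"{seg}-x" for seg in segments}

# Seven segments sequenced, all agreeing, NS missing.
seven_agree = dict(placeholder_b313, NS=None)
print(matches_constellation(seven_agree, placeholder_b313))     # True

# Seven segments sequenced, but HA disagrees.
seven_disagree = dict(seven_agree, HA="other")
print(matches_constellation(seven_disagree, placeholder_b313))  # False
```

Keeping `min_segments` as a parameter leaves the inclusion threshold as an explicit choice rather than hard-coding "7 of 8".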
With the recent move to B3.13 filtering defining the cattle-outbreak genome build, we are not able to include strains with fewer than 8 sequenced segments (and thus the implementation in #111 is outdated). Furthermore, we're going to drop some strains because their GenoFLU calls aren't B3.13. Compared with the last successful cattle-flu dataset, we are going to drop the strains listed in the command output above because they are not called as B3.13.