-[Adapting for Another Jurisdiction](#adapting-for-another-jurisdiction)
21
+
-[Customization for Local Adaptation](#customization-for-local-adaptation)
22
22
-[Contributing](#contributing)
23
23
-[License](#license)
24
24
-[Acknowledgements](#acknowledgements)
@@ -109,9 +109,6 @@ _Mencattelli, G., Ndione, M.H.D., Silverj, A. et al. Spatial and temporal dynami
109
109
The Washington-focused WNV build uses the sequence [AF481864](https://www.ncbi.nlm.nih.gov/nuccore/AF481864) because it is the sequence most closely related to those isolated from New York in 1999.
110
110
_Hadfield J, Brito AF, Swetnam DM, Vogels CBF, Tokarz RE, Andersen KG, Smith RC, Bedford T, Grubaugh ND. Twenty years of West Nile virus spread and evolution in the Americas visualized by Nextstrain. PLoS Pathog. 2019 Oct 31;15(10):e1008042. doi: 10.1371/journal.ppat.1008042. PMID: 31671157; PMCID: PMC6822705._
111
111
112
-
### Subsampling
113
-
The Washington focused WNV build pulls all the WNV sequences available in NCBI and filters the data in the phylogenetic workflow based on criteria defined in the config.yml file that is located inside the build-configs/washington-state folder. The subsampling criteria focuses on geographic location selecting all sequences from Washington, neighboring states, and region but up to a maximum of 5,000 sequences; and up to 300 sequences selected randomly from the rest of the states. All sequences have to meet a minimum genome length that is also specified as part of the subsampling criteria. There is more information about how to subsample data in Nextstrain here [Filter and Subsampling](https://docs.nextstrain.org/en/latest/guides/bioinformatics/filtering-and-subsampling.html)
114
-
115
112
### Lineage Designation
116
113
For global lineage designations, we query [pathoplexus](https://pathoplexus.org/)
117
114
@@ -121,17 +118,16 @@ We further refined the information in the NCBI Host column by categorizing it in
121
118
### Determination of Minimum Genome Length
122
119
The average genome length of WNV is 10,948 bp. Nextstrain's phylogenetic workflow defaults to excluding sequences with less than 90% genome coverage, as the alignment of short sequences can be unreliable. However, due to the limited number of WNV sequences available in NCBI, we evaluated minimum genome length thresholds of approximately 90% (9,800 bp), 80% (8,700 bp), 75% (8,200 bp), and 70% (7,700 bp). For each threshold, we ran the Washington-focused build and compared: (1) the number of sequences included, (2) the locations of data gaps in the alignment files, inspected with an alignment viewer, and (3) the topology and lineage assignments in the phylogenetic tree outputs. We concluded that a minimum genome length of 75% (8,200 bp) retained more sequences than the default while keeping alignment quality acceptable. Lastly, we validated this threshold using the global build.
123
120
* To change the minimum nucleotide sequence length in the WNV global build, set the desired threshold in the `--min-length <MIN_LENGTH>` parameter listed in the [defaults/config.yaml](https://github.com/nextstrain/WNV/blob/main/phylogenetic/defaults/config.yaml) file.
124
-
* To modify the minimum length of nucleotide sequence in the WNV Washington focused build enter the desired threshold in the --min-length <MIN_LENGTH> parameter that is listed in the [washington-state/config.yaml](https://github.com/nextstrain/WNV/blob/main/phylogenetic/build-configs/washington-state/config.yaml) file
121
+
* To change the minimum nucleotide sequence length in the WNV Washington-focused build, set the desired threshold in the `--min-length <MIN_LENGTH>` parameter listed in the [washington-state/config.yaml](https://github.com/nextstrain/WNV/blob/main/phylogenetic/build-configs/washington-state/config.yaml) file.
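
Before changing `--min-length`, it can help to see how many sequences each candidate threshold would keep. The following is a minimal sketch, not part of the workflow: the function names are hypothetical, it uses the exact products of the coverage fraction and the 10,948 bp average rather than the rounded values quoted above, and it assumes that only A, C, G, and T bases count toward sequence length.

```python
import io

WNV_GENOME_LENGTH = 10_948  # average WNV genome length in bp, as stated above
COVERAGE_FRACTIONS = (0.90, 0.80, 0.75, 0.70)  # coverage levels evaluated above


def acgt_lengths(fasta_handle):
    """Return {record name: number of A/C/G/T bases} for an open FASTA handle.

    Assumption: ambiguous bases (e.g. N) and gaps do not count toward length.
    """
    lengths = {}
    name = None
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += sum(line.upper().count(base) for base in "ACGT")
    return lengths


def count_passing(lengths, fractions=COVERAGE_FRACTIONS,
                  genome_length=WNV_GENOME_LENGTH):
    """Map each coverage fraction to (exact bp threshold, records passing)."""
    summary = {}
    for fraction in fractions:
        threshold = fraction * genome_length
        passing = sum(1 for n in lengths.values() if n >= threshold)
        summary[fraction] = (threshold, passing)
    return summary


if __name__ == "__main__":
    # Toy input: one 10,000 bp record and one with 8,000 unambiguous bases.
    toy = io.StringIO(">a\n" + "ACGT" * 2500 + "\n>b\n" + "ACGT" * 2000 + "NNNN\n")
    for fraction, (threshold, passing) in count_passing(acgt_lengths(toy)).items():
        print(f"{fraction:.0%}: >= {threshold:.0f} bp keeps {passing} sequence(s)")
```

To try this on real data, replace the toy handle with an opened FASTA file, for example `open("sequences.fasta")` (a placeholder path). The count is only the first of the three comparisons described above; alignment gaps and tree topology still need to be checked by eye.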
125
122
126
-
### Map Specific Locations
127
-
We have added the option to map specific locations using coordinates. The sample data for this feature is available in the file `ingest/defaults/annotations.tsv`. this file is in long data format and contains information for six randomly selected places unrelated to WNV data.
128
-
This feature is useful for states or agencies that need to map the locations of mosquito traps, for example. If the data is sensitive, we recommend modifying the annotations.tsv file locally when running the build.
129
-
To visualize the locations in Auspice:
130
-
1. Navigate to the **Map** options in the left panel.
131
-
2. In the **Geographic resolution** dropdown menu, select the level of data you entered in the `annotations.tsv` file. For example, the sample data maps to location.
123
+
## Customization for Local Adaptation
124
+
This build can be customized for use by other jurisdictions, such as states, cities, counties, or countries.
125
+
126
+
### Subsampling
127
+
The Washington-focused WNV build retrieves all available WNV sequences from NCBI and filters the data within the phylogenetic workflow based on criteria defined in the `config.yaml` file, located in the [build-configs/washington-state](https://github.com/nextstrain/WNV/blob/main/phylogenetic/build-configs/washington-state/config.yaml) folder. For details on the current subsampling configuration and instructions on modifying the criteria, refer to the [phylogenetic/build-configs/washington-state README.md](https://github.com/nextstrain/WNV/blob/main/phylogenetic/build-configs/washington-state/README.md).
132
128
133
-
##Adapting for Another Jurisdiction
134
-
*[Brief overview on how to adapt this build for another jurisdiction, such as a state, city, county, or country. Including links to Readmes in other sections that contain detailed instructions on what and how to modify the files]*
129
+
### Incorporating Additional Metadata
130
+
We have added the option to integrate additional metadata, which can include either public or sensitive information. This feature is especially useful for jurisdictions that need to annotate the phylogenetic trees or map visualizations in Auspice. For example, in the Washington-focused WNV build, we mapped the centroids of zip codes where mosquito traps are located. This information is stored in the `metadata.tsv` file used by the phylogenetic workflow, located in the [phylogenetic/data-private](https://github.com/nextstrain/WNV/tree/main/phylogenetic/data-private) folder. For more details on the current metadata configuration and instructions on modifying it, refer to the [phylogenetic/data-private README.md](https://github.com/nextstrain/WNV/tree/main/phylogenetic/data-private).
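
As a rough illustration of what integrating additional metadata involves, the sketch below left-joins an extra tab-separated table onto a base metadata table. It is not the workflow's actual implementation: the function names, the `strain` join key, and the `trap_latitude`/`trap_longitude` columns are assumptions made for this example, and the data-private README remains the reference for the real file layout.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated table into a list of dict rows."""
    return list(csv.DictReader(handle, delimiter="\t"))


def left_join(base_rows, extra_rows, key="strain"):
    """Attach columns from extra_rows to base_rows, matching on `key`.

    Every base row is kept; rows with no match get empty strings.
    Extra columns that share a name with a base column overwrite it.
    """
    extra_by_key = {row[key]: row for row in extra_rows}
    extra_columns = []
    for row in extra_rows:
        for column in row:
            if column != key and column not in extra_columns:
                extra_columns.append(column)

    merged = []
    for row in base_rows:
        match = extra_by_key.get(row[key], {})
        combined = dict(row)
        for column in extra_columns:
            combined[column] = match.get(column, "")
        merged.append(combined)
    return merged


if __name__ == "__main__":
    # Hypothetical tables: public metadata plus private trap coordinates.
    base = read_tsv(io.StringIO("strain\tstate\nA\tWA\nB\tOR\n"))
    extra = read_tsv(io.StringIO(
        "strain\ttrap_latitude\ttrap_longitude\nA\t47.6\t-122.3\n"))
    for row in left_join(base, extra):
        print(row)
```

Keeping the join as a left join means sequences without private annotations stay in the dataset with blank values, so sensitive columns can be added locally without changing which sequences are included.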
135
131
136
132
## Contributing
137
133
Please submit any questions to our [Discussions](insert link here) page. Software issues and requests can be logged as a Git [Issue](insert link here).