Remove augur.utils.read_strains #1749

Merged 3 commits on Feb 25, 2025
4 changes: 3 additions & 1 deletion CHANGES.md
@@ -6,6 +6,7 @@

* Updated default latitudes and longitudes for geography traits that includes location name changes. See the pull request for more details. [#1744][] (@joverlee521)
* curate apply-geolocation-rules: Augur's standard geolocation rules are used by default and rules provided via `--geolocation-rules` are considered custom rules that have precedence over the default rules. The `--no-default-rules` flag can be used to ignore the default rules. See the pull request for more details. [#1745][] (@joverlee521)
+* `augur.utils.read_strains` has been removed as it's been deprecated since January 2024. The same function is available through the public API as `augur.io.read_strains`. [#1749][] (@joverlee521)
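
For downstream scripts, the changelog entry above amounts to a one-line import change: `from augur.utils import read_strains` becomes `from augur.io import read_strains`. The sketch below illustrates that, assuming the `(*files, comment_char="#")` signature and set return value of the wrapper removed in this PR. The fallback function and the `load_strains_from_text` helper are hypothetical, included only so the snippet runs without Augur installed; they are not Augur's implementation.

```python
import os
import tempfile

try:
    # Supported import path after this change (per the CHANGES.md entry).
    from augur.io import read_strains
except ImportError:
    # Assumption: Augur is not installed. This hypothetical stand-in only
    # mirrors the removed wrapper's shape: one entry per line, comments
    # dropped, result returned as a set.
    def read_strains(*files, comment_char="#"):
        strains = set()
        for path in files:
            with open(path, encoding="utf-8") as handle:
                for line in handle:
                    # Keep the text before any comment character.
                    entry = line.split(comment_char, 1)[0].strip()
                    if entry:
                        strains.add(entry)
        return strains


def load_strains_from_text(text):
    """Hypothetical helper: write text to a temp file, read it with read_strains."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write(text)
        path = handle.name
    try:
        return read_strains(path)
    finally:
        os.remove(path)


# Duplicate names collapse because the result is a set.
strains = load_strains_from_text("strainA\n# a comment line\nstrainB\nstrainA\n")
print(sorted(strains))
```

Code that still imports from `augur.utils` will raise an `ImportError` once this change is released, so the import line is the only part that needs updating.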

### Features

@@ -20,8 +21,9 @@ Note that names with spaces in the FASTA header (description line) continue to b

[#1744]: https://github.com/nextstrain/augur/pull/1744
[#1745]: https://github.com/nextstrain/augur/pull/1745
-[#1755]: https://github.com/nextstrain/augur/pull/1755
+[#1749]: https://github.com/nextstrain/augur/pull/1749
[#1750]: https://github.com/nextstrain/augur/pull/1750
+[#1755]: https://github.com/nextstrain/augur/pull/1755

## 28.0.1 (10 February 2025)

2 changes: 1 addition & 1 deletion DEPRECATED.md
@@ -16,7 +16,7 @@ Users who have both 'name' and 'strain' fields in their data, and want to favor

## `augur.utils.read_strains`

-*Deprecated in version 24.0.0 (January 2024). Planned for removal March 2024 or after.*
+*Deprecated in version 24.0.0 (January 2024). Removed in version 29.0.0 (February 2025).*

This is part of a [larger effort](https://github.com/nextstrain/augur/issues/1011)
to formalize Augur's Python API.
17 changes: 4 additions & 13 deletions augur/utils.py
@@ -6,13 +6,11 @@
import pandas as pd
from collections import defaultdict, OrderedDict
from io import RawIOBase
-from textwrap import dedent
from .__version__ import __version__

from augur.data import as_file
from augur.io.file import PANDAS_READ_CSV_OPTIONS, open_file
from augur.io.sequences import read_single_sequence
-from augur.io.print import print_err

from augur.types import ValidationMode
from augur.errors import AugurError
@@ -205,7 +203,7 @@ def _read_nuc_annotation_from_gff(record, reference):
types. Note that 'source' isn't really a GFF feature type, but is used
widely in the Nextstrain ecosystem. If there are multiple we check that the
coordinates agree.

Parameters
----------
record : :py:class:`Bio.SeqRecord.SeqRecord`
@@ -246,7 +244,7 @@ def _read_nuc_annotation_from_gff(record, reference):
if len(nuc.values())>1:
coords = [(name, int(feat.location.start), int(feat.location.end)) for name,feat in nuc.items()]
if not all(el[1]==coords[0][1] and el[2]==coords[0][2] for el in coords):
-raise AugurError(f"Reference {reference!r} contained contradictory coordinates for the seqid/genome. We parsed the following coordinates: " +
+raise AugurError(f"Reference {reference!r} contained contradictory coordinates for the seqid/genome. We parsed the following coordinates: " +
', '.join([f"{el[0]}: [{el[1]+1}, {el[2]}]" for el in coords]) # +1 on the first coord to shift to one-based GFF representation
)

@@ -358,7 +356,7 @@ def _read_nuc_annotation_from_genbank(record, reference):
according to <https://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html>.)

See <https://www.insdc.org/submitting-standards/feature-table/> for more.

Parameters
----------
record : :py:class:`Bio.SeqRecord.SeqRecord` reference: string
@@ -389,7 +387,7 @@ def _read_genbank(reference, feature_names):
Read a GenBank file. We only read GenBank feature keys 'CDS' or 'source'.
We create a "feature name" via:
- for 'source' features use 'nuc'
-- for 'CDS' features use the locus_tag or the gene. If neither, then silently ignore.
+- for 'CDS' features use the locus_tag or the gene. If neither, then silently ignore.

Parameters
----------
@@ -766,13 +764,6 @@ def load_mask_sites(mask_file):
}


-def read_strains(*files, comment_char="#"):
-print_err(dedent("""
-DEPRECATION WARNING: augur.utils.read_strains is no longer maintained and will be removed in the future.
-Please use augur.io.read_strains instead."""))
-return set(read_entries(*files, comment_char=comment_char))
-
-
def read_entries(*files, comment_char="#"):
"""Reads entries (one per line) from one or more plain text files.

5 changes: 5 additions & 0 deletions tests/functional/mask/variants.vcf.gz_maskTemp
@@ -0,0 +1,5 @@
+MTB_anc 1199
+MTB_anc 1200
+MTB_anc 8321
+MTB_anc 8322
+MTB_anc 8323