Cannot include both --query_cover and --subject_cover in my search. #514

Open
KPapac opened this issue Jun 4, 2024 · 5 comments
@KPapac

KPapac commented Jun 4, 2024

Hi there,

I installed eggnog-mapper through conda and built a custom database of Drosophilidae proteins to annotate against. To run the annotation I execute:

emapper.py --output_dir . -o testing_unclustered --decorate_gff yes --go_evidence non-electronic --itype CDS -i unclustered.fna --cpu 30 --dmnd_db /space/Software/eggnog_5.0_drosophilids_DB/myDrosophila.dmnd --decorate_gff longest_func_annotI15_noWil_noLowGCAllIsoforms.gff3 --evalue 0.000001,

which works as expected. Now I want to make this search stricter by requiring that both the query coverage and the subject coverage are at least 80%.

If I run:
emapper.py --output_dir . -o testing_unclustered --decorate_gff yes --go_evidence non-electronic --itype CDS -i unclustered.fna --cpu 30 --dmnd_db /space/Software/eggnog_5.0_drosophilids_DB/myDrosophila.dmnd --decorate_gff longest_func_annotI15_noWil_noLowGCAllIsoforms.gff3 --evalue 0.000001 --query_cover 80.0,

I get outputs as expected. I also get outputs if I use --subject_cover 80.0 instead of the --query_cover 80.0.

But when I add both flags in my command, like:
emapper.py --output_dir . -o testing_unclustered --decorate_gff yes --go_evidence non-electronic --itype CDS -i unclustered.fna --cpu 30 --dmnd_db /space/Software/eggnog_5.0_drosophilids_DB/myDrosophila.dmnd --decorate_gff longest_func_annotI15_noWil_noLowGCAllIsoforms.gff3 --evalue 0.000001 --query_cover 80.0 --subject_cover 80.0

I get this error:
Error running diamond: Error: vector::_M_range_check: __n (which is 8547) >= this->size() (which is 4095)

4095 is the number of FASTA sequences in my query file "unclustered.fna", but I don't recognise the number 8547.

I googled a bit and this error shows up in programs written in C++, which I am not familiar with. Does anyone have any idea what is going on?

I have Python 3.12.3, diamond v2.1.9.163, emapper-2.1.12.

@Cantalapiedra
Collaborator

Dear @KPapac ,

Among the output messages of eggnog-mapper there should be a line showing the diamond command it calls. If so, could you try running that diamond command directly to see whether it works on its own?

You may also try with the same diamond version that is bundled with eggnog-mapper, which I think is 2.0.11.

Thank you.

Best,
Carlos

@KPapac
Author

KPapac commented Jun 4, 2024

Hey Carlos,

Thanks for the quick reply!

Yes there is a message with the diamond command. I execute:
/space/Software/eggnog/bin/diamond blastx -d '/space/Software/eggnog_5.0_drosophilids_DB/myDrosophila.dmnd' -q '/space/no_backup/Kostas/annotating_fly_genomes/unclustered.fna' --threads 30 -o '/space/no_backup/Kostas/annotating_fly_genomes/testing_unclustered.emapper.hits' --sensitive --iterate -e 1e-06 --query-cover 80.0 --subject-cover 80.0 --top 3 --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qcovhsp scovhsp

and I get the same error: Error: vector::_M_range_check: __n (which is 8547) >= this->size() (which is 4095). This is diamond bundled with eggnog via conda.
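To make concrete what the two thresholds ask for, here is a minimal sketch that applies both cut-offs to tabular hits after the search. It assumes the 14-column layout from the `--outfmt 6` list above, where `qcovhsp` is column 13 and `scovhsp` is column 14. The function names and the two example rows are made up for illustration, and this is not necessarily identical to what diamond does internally with `--query-cover` and `--subject-cover`.

```shell
# Hypothetical helper: keep only hits whose query AND subject coverage
# both reach a minimum percentage.
# Assumes tab-separated input with qcovhsp in column 13 and scovhsp in column 14.
filter_by_coverage() {
    # $1 = minimum coverage in percent; hits are read from stdin
    awk -F '\t' -v min="$1" '$13 + 0 >= min && $14 + 0 >= min'
}

# Made-up example data: two hits in the 14-column layout.
# q1 has 90% query / 85% subject coverage, q2 has 85% / 40%.
example_hits() {
    printf 'q1\ts1\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-50\t200\t90.0\t85.0\n'
    printf 'q2\ts2\t90.0\t50\t5\t0\t1\t150\t1\t50\t1e-20\t100\t85.0\t40.0\n'
}

# Only the q1 line passes when both coverages must be at least 80.
example_hits | filter_by_coverage 80
```

On a real run the input would be the `.emapper.hits` file rather than the example rows. Whether eggnog-mapper can then pick up a filtered hits file is not something this sketch addresses.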

> You may also try with the same diamond version that is bundled with eggnog-mapper, which I think is 2.0.11.

Do you mean I should install the earlier emapper v2.0.11 and try the diamond that comes with it?

Cheers,
Kostas

@Cantalapiedra
Collaborator

Cantalapiedra commented Jun 4, 2024

No, sorry, I meant with diamond 2.0.11. Maybe you could try something like:
mamba install -c bioconda diamond=2.0.11
(or conda instead of mamba)
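After installing, it may be worth confirming which diamond version is actually picked up. The sketch below assumes that `diamond --version` prints a line of the form `diamond version 2.0.11`; the helper names are hypothetical.

```shell
# Hypothetical helper: take the last field of a line such as
# "diamond version 2.0.11" (assumed output format of `diamond --version`).
parse_diamond_version() {
    awk '{ print $NF }'
}

# Succeeds if version $1 equals $2 or extends it with further
# dot-separated components (e.g. 2.0.11.149 matches 2.0.11).
version_matches() {
    case "$1" in
        "$2"|"$2".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage, assuming diamond is on PATH:
#   v="$(diamond --version | parse_diamond_version)"
#   version_matches "$v" 2.0.11 || echo "unexpected diamond version: $v" >&2
```

Note that eggnog-mapper may call a specific binary by full path (the command it prints shows which one), so the check is best run against that same path.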

@KPapac
Author

KPapac commented Jun 5, 2024

Hey there, using diamond 2.0.11 works! Thanks for your help. I also opened an issue about this in the diamond GitHub repository, so maybe someone will fix it for v2.1.9.163.

@KPapac KPapac closed this as completed Jun 5, 2024
@Cantalapiedra
Collaborator

Just as a reminder: we should probably fix the bioconda recipe so that it limits the diamond version to the tested ones:

https://github.com/bioconda/bioconda-recipes/blob/master/recipes/eggnog-mapper/meta.yaml
