Hi everyone, this is not an issue, but I’m looking for some advice on following the recipes you provide in the "A few recipes" section of v2.1.5 to v2.1.12.
A little context: I ran an RNA-seq analysis, and my output is the count-genes.tsv file. With reference genomes from RefSeq, the annotation in these files is generally fine, and most genes were mapped to their corresponding gene name. However, some genes have no associated gene symbol and appear only as LOCXXXXXXX (where each X is a digit).
To increase the number of annotated genes, I plan to find the corresponding orthologs for those genes in related species. With this in mind, I ran OrthoFinder on a set of related mammalian species. In short, the output is a set of orthogroup FASTA files, each containing the orthologous proteins of one orthogroup. The protein IDs in these files have the format NP_XXXXXXXX or XP_XXXXXXXX. The plan now is to use eggNOG-mapper to obtain the functional annotations for the proteins in each orthogroup.
Here’s where I’m a little confused about the next step: I will get the annotations, but I’m wondering how I can trace each functional annotation back to its gene and determine whether that gene is a LOCXXXXXXX-type gene. For example, in the "A few recipes" section, you have options like:
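To make the orthogroup step concrete, here is a minimal sketch of how the protein IDs could be collected from one orthogroup FASTA file. It assumes the accession is the first whitespace-delimited token of each header line; the function names are only illustrative and are not part of OrthoFinder or eggNOG-mapper.

```python
def protein_ids_from_fasta(lines):
    """Return the accession (first token) of every FASTA header line."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                ids.append(header.split()[0])
    return ids


def split_refseq_ids(ids):
    """Group accessions by RefSeq prefix: NP_ (curated) or XP_ (predicted)."""
    groups = {}
    for pid in ids:
        prefix = pid.split("_", 1)[0] if "_" in pid else "other"
        key = prefix if prefix in ("NP", "XP") else "other"
        groups.setdefault(key, []).append(pid)
    return groups


# Made-up records standing in for one orthogroup FASTA file.
example = [
    ">NP_000001.1 hypothetical curated protein",
    "MKTAYIAK",
    ">XP_000002.1",
    "MAAGLLQ",
]
example_ids = protein_ids_from_fasta(example)
example_groups = split_refseq_ids(example_ids)
```

With a real file, the same function could be called on an open file handle, for example `protein_ids_from_fasta(open(path))`.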
Run search and annotation, using MMseqs after translating input CDS to proteins. Add the search and annotation results to the attributes of an existing GFF file (GFF decoration), using the GeneID field to link features from the GFF to the annotation results. (This seems the most appropriate to me because I can download GFF files from RefSeq-genomes.)
Run gene prediction using a genome to train Prodigal
Repeat the annotation step, using specific taxa as target and reporting the one-to-one orthologs found (This seems like another option, but I’m concerned that this depends on the number of species in the phylogeny since I don’t have too many.)
Do you think these ideas are realistic? Even once I have the functional annotation of the orthologs, I may still need to trace them back to their positions on the chromosome and check whether the gene symbol there is unknown. Then, maybe I can use some parameter or threshold to confidently replace the unknown gene symbol with that of its ortholog.
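To illustrate the tracing I have in mind, here is a rough sketch rather than a tested pipeline. It assumes that CDS rows in the RefSeq GFF carry `gene=` and `protein_id=` attributes, that the eggNOG-mapper annotations table has a header line starting with `#query` and a `Preferred_name` column, and that the eggNOG-mapper query IDs are the same protein accessions as in the GFF. The function names are mine, not part of eggNOG-mapper.

```python
import csv


def parse_attributes(field):
    """Turn a GFF3 attribute string (key=value;key=value) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def protein_to_gene(gff_lines):
    """Map protein accession -> gene symbol using the CDS rows of a GFF3 file."""
    mapping = {}
    for line in gff_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = parse_attributes(cols[8])
        if "protein_id" in attrs and "gene" in attrs:
            mapping[attrs["protein_id"]] = attrs["gene"]
    return mapping


def preferred_names(emapper_lines):
    """Map query ID -> Preferred_name, skipping '##' comment lines and '-' values."""
    rows = [l for l in emapper_lines if not l.startswith("##")]
    reader = csv.DictReader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    names = {}
    for row in reader:
        name = row.get("Preferred_name") or "-"
        if name != "-":
            names[row["#query"]] = name
    return names


def suggest_symbols(gff_lines, emapper_lines):
    """For genes named LOC..., collect the candidate symbols of their proteins."""
    p2g = protein_to_gene(gff_lines)
    suggestions = {}
    for pid, name in preferred_names(emapper_lines).items():
        gene = p2g.get(pid)
        if gene and gene.startswith("LOC"):
            suggestions.setdefault(gene, set()).add(name)
    return suggestions


# Made-up rows standing in for the GFF and the annotations table.
gff = [
    "##gff-version 3",
    "chr1\tGnomon\tCDS\t100\t200\t.\t+\t0\tID=cds-XP_000002.1;gene=LOC123456;protein_id=XP_000002.1",
    "chr1\tBestRefSeq\tCDS\t300\t400\t.\t+\t0\tID=cds-NP_000001.1;gene=GENEA;protein_id=NP_000001.1",
]
emapper = [
    "## example comment line",
    "#query\tseed_ortholog\tevalue\tscore\tPreferred_name",
    "XP_000002.1\tseedA\t1e-50\t200\tGENEB",
    "NP_000001.1\tseedB\t1e-60\t300\tGENEA",
]
result = suggest_symbols(gff, emapper)
```

A LOC gene that ends up with more than one candidate symbol (for example from different isoforms) would need manual review before any replacement in the count table.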
In general, I’m looking for guidance on using eggnog-mapper for the workflow I have in mind. I’m posting here because some papers have used eggnog-mapper to map genes to their respective orthologs.
Any comment, suggestion, or idea is more than welcome!