move panaroo to iuc repo #6811

Open
wants to merge 5 commits into
base: main

Contributor

@mthang commented Feb 28, 2025

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

@SaimMomin12 I am moving the Panaroo wrapper to the IUC repo. Please review and merge it when you can, and let me know once it has been merged.

@SaimMomin12 self-assigned this Feb 28, 2025

]]></command>
<inputs>
<param name="gff_input_collection" type="data_collection" format="gff" collection_type="list" label="GFF Input Collection" help="A list of GFF files (e.g. from Prokka)"/>
Contributor

Shouldn't this be gff3? I think this was a question from a Galaxy user as well.

xref: https://github.com/gtonkinhill/panaroo/blob/0fc3d0c5cfdae1815a368cbe47edad43bca46d37/panaroo/__main__.py#L52

Contributor Author

Many users have asked about this as well. I looked at the "Different input formats" documentation: to use GFF3 files, they first have to be listed in a text file. In other words, the input is a tab-delimited text file containing the GFF3 and FASTA files (see "Different input formats" for details). We will need to find a way to implement this in order to support GFF3 files.

Contributor

So in this case, would it be a good idea to keep both gff and gff3 as input file formats in Galaxy?
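
One way to read this suggestion, as a minimal sketch: list both datatypes in the `format` attribute of the existing parameter. The parameter is copied from the diff above and only `format` and the help text change. Whether Panaroo parses every GFF3 file that Galaxy would then accept is not settled in this thread, so this is an illustration rather than the agreed fix.

```xml
<!-- Sketch only: "format" takes a comma-separated list of Galaxy datatypes -->
<param name="gff_input_collection" type="data_collection" format="gff,gff3" collection_type="list" label="GFF Input Collection" help="A list of GFF/GFF3 files (e.g. from Prokka)"/>
```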

Contributor

@SaimMomin12 left a comment

A few preliminary comments inline.

Contributor

@SaimMomin12 left a comment

Hi @mthang,

Thanks for your PR moving Panaroo to IUC. I have reviewed the tool and left some comments. Feel free to ping me if anything is unclear.

<section name="graph_correction_option" title="Graph Correction" expanded="false">
<param argument="--min_trailing_support" type="integer" value="2" label="Minimum cluster size to keep a gene called at the end of a contig" help="--min_trailing_support [relaxed mode: 2 is used]"/>
<param argument="--trailing_recursive" type="integer" value="1" label="Number of times to perform recursive trimming of low support nodes near the end of contigs" help="--trailing_recursive [relaxed mode: 1 is used]"/>
<param name="edge_support_threshold" type="integer" value="1" label="Edge support threshold" help="--edge_support_threshold [ Minimal edge 1 is used ]"/>
Contributor

Can all the params below be arguments? (e.g. "--edge_support_threshold", "--remove_by_consensus", etc.)

Contributor Author

Sure. I can make the changes.
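
A minimal sketch of the requested change for one of these parameters, assuming the command section keeps referring to it as `$edge_support_threshold` (possibly with a section prefix). Galaxy derives the parameter name from `argument` by stripping the leading dashes and replacing any remaining dashes with underscores, so that reference should not need to change.

```xml
<!-- Before (from the diff): explicit name, flag repeated by hand in the help text -->
<param name="edge_support_threshold" type="integer" value="1" label="Edge support threshold" help="--edge_support_threshold [ Minimal edge 1 is used ]"/>

<!-- After (sketch): the name "edge_support_threshold" is derived from the argument -->
<param argument="--edge_support_threshold" type="integer" value="1" label="Edge support threshold" help="Minimal edge support; 1 is used"/>
```

Because Galaxy shows the `argument` value next to the help text, the flag no longer has to be spelled out inside `help`.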

<param argument="-a" type="select" label="Output alignments of core genes or all genes." help="-a [optional: core or pan; default: None]">
<expand macro="gene_alignment"/>
</param>
<param argument="--aligner" type="select" label="Specify an aligner" help="--aligner [mafft|prank|clustal][default: mafft]">
Contributor

Can we also add prank and clustal to the tool requirements? AFAIK, prank and clustal are not bundled with Panaroo; it relies on external tools for them.

Contributor Author

Prank is already included in the macro file, but clustal is not. From memory, there are several versions of clustal (e.g. clustalw, clustalo), so I have not added it to the wrapper yet. As you mentioned, both prank and clustal depend on external tools, so a Galaxy admin would need to install them on Galaxy before the wrapper can be tested.

Contributor

I think if we add them as requirements, those tools will be installed automatically in the biocontainer, and the wrapper can then use them.
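
As a rough sketch of what that could look like in the macro file: each aligner becomes its own `requirement`, which Galaxy resolves as a Conda package or as part of a multi-package container. Several things here are assumptions: that "clustal" means Clustal Omega (the Bioconda package `clustalo`) rather than `clustalw`, that the wrapper defines a `@TOOL_VERSION@` token, and the version values, which are placeholders to be replaced with real Bioconda versions.

```xml
<requirements>
    <requirement type="package" version="@TOOL_VERSION@">panaroo</requirement>
    <!-- PRANK_VERSION and CLUSTALO_VERSION are placeholders, not real version strings -->
    <requirement type="package" version="PRANK_VERSION">prank</requirement>
    <!-- Assumption: Panaroo's "clustal" aligner option corresponds to Clustal Omega -->
    <requirement type="package" version="CLUSTALO_VERSION">clustalo</requirement>
</requirements>
```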

<element name="gff11.gff" value="11_small.gff"/>
</collection>
</param>
<output_collection name="output" count="13"/>
Contributor

Can we check the contents of some important outputs, such as gene_presence_absence.csv? This will help maintain the tool in the long run.

Contributor Author

No worries! Let me have a look and update the test section
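
A possible shape for such a check, as a sketch: keep the element count and add an `element` with `assert_contents` for one output. The element name `gene_presence_absence` is hypothetical, since it depends on how the wrapper's dataset discovery names collection elements, and the asserted text assumes the CSV header contains a `Gene` column.

```xml
<output_collection name="output" count="13">
    <!-- Hypothetical element name; adjust to the identifier the wrapper actually produces -->
    <element name="gene_presence_absence">
        <assert_contents>
            <!-- Assumption: the header row of gene_presence_absence.csv contains "Gene" -->
            <has_text text="Gene"/>
        </assert_contents>
    </element>
</output_collection>
```

Content assertions like this tend to catch silent changes in output format across tool updates, which a bare element count would miss.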

**INPUTS**
Panaroo now supports multiple input formats. To use non-standard GFF3 files you must provide the input files as a list in a text file (one per line). Separate GFF and FASTA files can be provided per isolate by listing both files on one line, delimited by a space or a tab. GenBank file formats are also supported, with the extensions '.gbk', '.gb' or '.gbff'. These must be compliant with GenBank/ENA/DDBJ. Compliance can be forced in Prokka by specifying the --compliance parameter.

- a list of gff format in a collection
Contributor

gff3?

Suggested change
- a list of gff format in a collection
- a list of gff3 format in a collection

</test>
</tests>
<help><![CDATA[
Panaroo_ is a bacterial pangenome analysis pipeline.
Contributor

Perhaps adding a few lines about what Panaroo does might be helpful.


2 participants