Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Seqera containers in pipeline #36

Open
wants to merge 12 commits into
base: dev
Choose a base branch
from
15 changes: 15 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@ Thank you to everyone else that has contributed by reporting bugs, enhancements

### Enhancements & fixes

:warning: Bumped minimum Nextflow version required by the pipeline to `24.04.0` to use Seqera Community Containers

[PR #19](https://github.com/seqeralabs/nf-aggregate/pull/19) - Allow underscores in workspace name regex
[PR #22](https://github.com/seqeralabs/nf-aggregate/pull/22) - Copy in nf-test CI from nf-core/fetchngs
[PR #23](https://github.com/seqeralabs/nf-aggregate/pull/23) - Bump Platform CLI version to `0.9.2`
Expand All @@ -30,6 +32,19 @@ Thank you to everyone else that has contributed by reporting bugs, enhancements
[PR #31](https://github.com/seqeralabs/nf-aggregate/pull/31) - Update Platform API URI
[PR #32](https://github.com/seqeralabs/nf-aggregate/pull/32) - Update all nf-core modules and subworkflows
[PR #33](https://github.com/seqeralabs/nf-aggregate/pull/33) - Remove `docker.userEmulation`
[PR #36](https://github.com/seqeralabs/nf-aggregate/pull/36) - Use Seqera containers in pipeline

### Parameters

| Old parameter | New parameter |
| ------------------------------------- | ------------- |
| `--singularity_pull_docker_container` | |

> **NB:** Parameter has been **updated** if both old and new parameter information is present.
>
> **NB:** Parameter has been **added** if just the new parameter information is present.
>
> **NB:** Parameter has been **removed** if new parameter information isn't present.

### Software dependencies

Expand Down
2 changes: 1 addition & 1 deletion modules/local/plot_run_gantt/main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ process PLOT_RUN_GANTT {
tag "$meta.id"

conda 'click=8.0.1 pandas=1.1.5 plotly_express=0.4.1 typing=3.10.0.0'
container 'seqeralabs/nf-aggregate:click-8.0.1_pandas-1.1.5_plotly_express-0.4.1_typing-3.10.0.0--ccea219dc6c3d6a1'
container 'community.wave.seqera.io/library/click_pandas_plotly_express_typing:21adb9e2d1b605a5'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOOOOOOOOOO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We made this to have containers made via the Conda packages declaration. Why bumping the community container image?

Copy link
Member Author

@drpatelh drpatelh May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we figured out how to run these pipelines in offline environments? Is this going to be inherently taken care of by nf-core download and not handled by NF/Wave anymore @ewels ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - if we want to run offline then we cannot use Wave.

The only way that I can see around this is to have the Nextflow Wave plugin create the conda + dockerfile locally and from that figure out what the image name should be. Then this could be checked against the local docker / singularity cache without doing any calls to the internet. But I've spoken to @pditommaso about this already and I think that it was difficult / impossible.

So until we figure that out, I can't see any way to use Wave directly in nf-core pipelines, as it will break all offline usage. So the hardcoded docker + singularity URIs have to stay.

We can of course use Wave to build those images and so my hope is that the developer can specify just the conda requirements and then the automation will pin the image names into a config without human intervention.

If anyone has any better ideas, shout!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are near to have nextflow inspect to report the HTTP uri of the Singularity image files. Would not that allow handling them in the same manner as it's done for Biocontainers right now?

Copy link
Contributor

@adamrtalbot adamrtalbot May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way that I can see around this is to have the Nextflow Wave plugin create the conda + dockerfile locally and from that figure out what the image name should be.

Is it possible to use the hash and check for existing containers before hitting Wave? This doesn't help with offline usage, but does bypass Wave when it isn't needed. Furthermore, it could check for existing Singularity containers for some offline use right now. I'm assuming the hash in the filename is complete enough to do this check which seems like quite a large assumption.

We can of course use Wave to build those images and so my hope is that the developer can specify just the conda requirements and then the automation will pin the image names into a config without human intervention.

I assume this is the end goal but don't see how we'd get there and still include offline support?

Currently the workflow for nf-core would be essentially:

  • play with 'interactive' Wave + Conda for development
  • use https://seqera.io/containers/ to build the 'static' container and grab the URI
  • add this to process container directive(s)
  • open PR to nf-core/modules

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use the hash and check for existing containers before hitting Wave?

Wave generates the hash, so no

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we'll have to stick with process.container for now.


input:
tuple val(meta), path(run_dump)
Expand Down
3 changes: 2 additions & 1 deletion modules/local/seqera_runs_dump/main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,9 @@ include { getRunMetadata } from './functions'

process SEQERA_RUNS_DUMP {
tag "$meta.id"

conda 'tower-cli=0.9.2'
container 'seqeralabs/nf-aggregate:tower-cli-0.9.2--hdfd78af_1'
container 'community.wave.seqera.io/library/tower-cli:0.9.2--dc544a7e574ab73b'

input:
val meta
Expand Down
7 changes: 0 additions & 7 deletions modules/nf-core/multiqc/environment.yml

This file was deleted.

6 changes: 2 additions & 4 deletions modules/nf-core/multiqc/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

21 changes: 7 additions & 14 deletions nextflow.config
Original file line number Diff line number Diff line change
Expand Up @@ -67,11 +67,12 @@ profiles {
cleanup = false
nextflow.enable.configProcessNamesValidation = true
}
wave {
wave.enabled = true
wave.strategy = ['conda', 'container', 'dockerfile', 'spack']
wave.build.repository = 'quay.io/seqeralabs/nf-aggregate'
wave.freeze = true
seqera_containers {
apptainer.ociAutoPull = true
singularity.ociAutoPull = true
Comment on lines +71 to +72

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apptainer.ociAutoPull = true
singularity.ociAutoPull = true

Not needed. Those forces the use of Docker image what's converted by Singularity

wave.enabled = true
wave.freeze = true
wave.strategy = 'conda'
Comment on lines +71 to +75
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are using Wave behind the scenes to get the container URI, do I need to move these out of a profile and into the top-level config so they are always on?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's keep into a separate profile

}
conda {
conda.enabled = true
Expand Down Expand Up @@ -157,14 +158,6 @@ profiles {
}
}

// Set default registry for Apptainer, Docker, Podman and Singularity independent of -profile
// Will not be used unless Apptainer / Docker / Podman / Singularity are enabled
// Set to your registry if you have a mirror of containers
apptainer.registry = 'quay.io'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could set these to community.wave.seqera.io if we want think. For the same reasons we added quay.io here in the first place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

community.wave.seqera.io/library/tower-cli:0.9.2--dc544a7e574ab73b

What does the library mean in the URI? Will this always be the same for different registries? So do we need:

apptainer.registry   = 'community.wave.seqera.io'
docker.registry      = 'community.wave.seqera.io'
podman.registry      = 'community.wave.seqera.io'
singularity.registry = 'community.wave.seqera.io'

or this:

apptainer.registry   = 'community.wave.seqera.io/library'
docker.registry      = 'community.wave.seqera.io/library'
podman.registry      = 'community.wave.seqera.io/library'
singularity.registry = 'community.wave.seqera.io/library'

or none of these based on @pditommaso comment?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to use container with Community registry, hence you don't need registry either

Copy link
Contributor

@adamrtalbot adamrtalbot May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for including this is so people can quickly repoint to their own registry if they are using freeze, i.e. set all of these to myprivateecr.registry.io. On the other hand, they can just do this themselves and disregard all the community containers 🤔 . I don't think we ever worked out the developer workflow here.

@drpatelh you should do either 1 of two things here:

  1. docker.registry = 'community.wave.seqera.io/library' and the container directive is container 'click_pandas_plotly_express_typing:21adb9e2d1b605a5'
  2. Do not set docker.registry and use the directivecontainer 'community.wave.seqera.io/library/click_pandas_plotly_express_typing:21adb9e2d1b605a5'

The docker.registry is added as a prefix if it does not exist in the URI of the container already (i.e. it's idempotent). What you have done here is fine but redundant and the registry config option will be ignored.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s. not certain about library, this depends on Wave + ECR behaviour.

docker.registry = 'quay.io'
podman.registry = 'quay.io'
singularity.registry = 'quay.io'

// Nextflow plugins
plugins {
id 'nf-validation' // Validation of pipeline parameters and creation of an input channel from a sample sheet
Expand Down Expand Up @@ -212,7 +205,7 @@ manifest {
homePage = 'https://github.com/seqeralabs/nf-aggregate'
description = """Pipeline to aggregate pertinent metrics across pipeline runs on the Seqera Platform."""
mainScript = 'main.nf'
nextflowVersion = '!>=23.10.0'
nextflowVersion = '!>=24.04.0'
defaultBranch = 'main'
version = '0.2.0'
doi = ''
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading