FAQ¶
How can I use a custom docker registry to build the containers?
How can I write the config files for the different nextflow profiles?
How can I pass specific options to run docker or singularity containers?
How can a conda environment be activated in docker or singularity containers?
How can I generate all the files automatically created by geniac without installing the pipeline?
How can a process have a label which is defined by a variable?
Why does the conda profile fail to build its environment or take too much time?
How can I use geniac on an existing repository?¶
The structure of the repository is based on nf-core and additional files and folders are expected.
All the resources for geniac are available here:
Example:
useCases.bash
Follow the guidelines below if you want to use geniac on an existing repository.
Create the folder geniac¶
The guidelines and additional utilities we developed are available in the geniac repository and should be located in a folder named geniac in your new repository. The utilities in the geniac folder can either be copied or linked to your pipeline repository as a git submodule.
Note
If geniac is used as a submodule in your repository, execute the command git submodule update --init --recursive once you have created the geniac submodule, otherwise the geniac folder will remain empty.
If you want to create a submodule, you can edit the variables in the file createSubmodule.bash and follow the procedure.
Create additional files and folders¶
The following files are mandatory:
CMakeLists.txt: create a folder named modules/fromSource and copy this file inside if you need to Install from source code. Check that the file is named CMakeLists.txt.
geniac.config: copy the file in the folder conf. This file contains a scope named geniac that defines all the nextflow variables needed to build, deploy and run the pipeline.
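As a rough sketch, the geniac scope in conf/geniac.config looks like the following (the mySoft entry mirrors the example given later in this FAQ; the actual tool names and conda build strings depend on your pipeline):

```groovy
params {
    geniac {
        tools {
            // one entry per tool; this conda build string is illustrative
            mySoft = "conda-forge::mySoft=v0=r351h96ca727_1003"
        }
    }
}
```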
Moreover, depending on which case applies when you Add a process, you can create the following folders whenever you need them:
├── env
├── modules
└── recipes
├── conda
├── dependencies
├── docker
└── singularity
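The optional folders above can be created in one go from the repository root. A minimal sketch, using a temporary directory to stand in for your repository root:

```shell
#!/usr/bin/env bash
set -euo pipefail

# work in a temporary directory standing in for the repository root
REPO_DIR="$(mktemp -d)"
cd "${REPO_DIR}"

# optional folders used by geniac when you add a process
mkdir -p env modules recipes/conda recipes/dependencies recipes/docker recipes/singularity

# show the resulting directory tree
find . -type d | sort
```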
What does the repository look like?¶
The source code of your repository should look like this:
├── assets # assets needed for runtime
├── bin # scripts or binaries for the pipeline
├── conf # configuration files for the pipeline
│ ├── geniac.config # contains the scopes mandatory for geniac
├── docs # documentation of the pipeline
├── env # process specific environment variables
├── geniac # geniac utilities
│ ├── cmake # source files for the configuration step
│ ├── docs # guidelines for installation
│ ├── install # scripts for the build step
├── main.nf
├── modules
│ └── fromSource # tools installed from source code
│ ├── CMakeLists.txt
│ └── helloWorld
├── nextflow.config
├── recipes # installation recipes for the tools
│ ├── conda
│ ├── dependencies
│ ├── docker
│ └── singularity
└── test # data to test the pipeline
└── data
How can I install custom commands in the docker/singularity recipes automatically generated by geniac?¶
For some tools, it might be necessary to add custom commands in the docker/singularity recipes automatically generated by geniac. In the conf/geniac.config file, you can use:
params.geniac.containers.cmd.post: to define commands which will be executed at the end of the default commands generated by geniac.
params.geniac.containers.cmd.envCustom: to define environment variables which will be set inside the docker and singularity images.
For more details on how to proceed, see Add custom commands and environment variables in the docker/singularity recipes automatically generated by geniac.
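For illustration, a hedged sketch of what such a configuration might look like in conf/geniac.config (the tool name mySoft, the command, and the variable are hypothetical; check the linked section for the exact structure expected by your geniac version):

```groovy
params {
    geniac {
        containers {
            cmd {
                post {
                    // hypothetical: extra shell commands appended to the mySoft recipes
                    mySoft = ['mkdir -p /opt/mysoft/db']
                }
                envCustom {
                    // hypothetical: environment variable set inside the mySoft images
                    mySoft = ['MYSOFT_DB=/opt/mysoft/db']
                }
            }
        }
    }
}
```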
How can I use a custom docker registry to build the containers?¶
Geniac automatically generates recipes for Docker and Singularity. To build the containers, it bootstraps from two docker containers hosted on the official 4geniac docker hub registry, which provides the several Linux distributions used for the containers. Instead of using the official docker hub registry, you may want to use a custom registry. In this case, make sure that the exact same container tags available on 4geniac are available on this custom registry and use the following option at the configuration step:
cmake ${SRC_DIR}/geniac -DCMAKE_INSTALL_PREFIX=${INSTALL_DIR} -Dap_docker_registry=my-registry-url/
For example, my-registry-url could be a docker registry available in your gitlab.
How can I write the config files for the different nextflow profiles?¶
The utilities we propose allow the automatic generation of all the config files for the nextflow Profiles. However, if you really want to write them yourself, follow the examples described in Config files for the different nextflow profiles.
How should I define the path to the genome annotations?¶
When the pipeline is installed with geniac, the Structure of the installation directory tree contains a directory named annotations. This directory can be a symlink to the directory with your existing annotations (it can be set during Configure with the option ap_annotation_path). Check that:
The file geniac.config defines the genomeAnnotationPath in the scope params as follows:
params {
genomeAnnotationPath = params.genomeAnnotationPath ?: "${projectDir}/../annotations"
}
All the paths to your annotations are defined using the variable params.genomeAnnotationPath, as shown in the file genomes.config.
You use the variables defined in genomes.config in the main.nf, for example params.genomes['mm10'].fasta.
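For example, a genomes.config entry might look like the following sketch (the mm10 file layout under the annotations directory is hypothetical), and main.nf would then refer to params.genomes['mm10'].fasta:

```groovy
params {
    genomes {
        'mm10' {
            // hypothetical layout under the annotations directory
            fasta = "${params.genomeAnnotationPath}/mm10/genome/mm10.fa"
            gtf   = "${params.genomeAnnotationPath}/mm10/gtf/gencode.gtf"
        }
    }
}
```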
How can I pass specific options to run docker or singularity containers?¶
If needed, you can set the singularityRunOptions and dockerRunOptions values to whatever is needed for your configuration in the geniac.config file. This will set the runOptions parameter (see Nextflow configuration) of the singularity and docker directives respectively to the selected value when the singularity and docker profiles are called.
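As a sketch, assuming the two values live alongside the other container settings in the geniac scope of geniac.config (the bind paths are purely illustrative):

```groovy
params {
    geniac {
        containers {
            // hypothetical run options; adapt the bind mounts to your setup
            singularityRunOptions = "--containall --bind /data:/data"
            dockerRunOptions = "-v /data:/data"
        }
    }
}
```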
How can a conda environment be activated in docker or singularity containers?¶
In order to activate a conda environment with conda activate myConda_env when a docker or singularity container is executed, some configuration is required inside the recipes. This is described in How can a conda environment be activated in docker or singularity containers?.
How can I see the recipes for the containers?¶
As geniac automatically generates the recipes of the containers, they are not available in the git repository. However, they can be easily retrieved in several ways. An example to Generate the recipes for the containers is provided.
How can I generate all the files automatically created by geniac without installing the pipeline?¶
There are several ways:
either using make commands: see Build configuration files and Build recipes and containers,
or using Geniac CLI: see Generate the configuration files and Generate the container recipes.
Is geniac compatible with nextflow DSL2?¶
Since version 20.07.1, Nextflow provides the DSL2 syntax that allows the definition of module libraries and simplifies the writing of complex data analysis pipelines. geniac is fully compatible with DSL2 and we provide geniac demo DSL2 as an example. The guidelines to Add a process remain exactly the same.
The main differences between geniac demo and geniac demo DSL2 are:
each process is located in one dedicated file in the folder nf-modules/local/process
each subworkflow that combines different processes is located in the folder nf-modules/local/subworkflow
the main.nf includes these two folders and uses the workflow directive
├── assets # assets needed for runtime
├── bin # scripts or binaries for the pipeline
├── conf # configuration files for the pipeline
│ ├── geniac.config # contains the geniac scope mandatory for nextflow
├── docs # documentation of the pipeline
├── env # process specific environment variables
├── geniac # geniac utilities
│ ├── cmake # source files for the configuration step
│ ├── docs # guidelines for installation
│ ├── install # scripts for the build step
├── main.nf
├── modules # tools installed from source code
│ ├── CMakeLists.txt
│ ├── helloWorld
├── nextflow.config
├── nf-modules # nextflow files for DSL2
│ └── local
│ ├── process
│ │ ├── alpine.nf
│ │ ├── checkDesign.nf
│ │ ├── execBinScript.nf
│ │ ├── fastqc.nf
│ │ ├── getSoftwareVersions.nf
│ │ ├── helloWorld.nf
│ │ ├── multiqc.nf
│ │ ├── outputDocumentation.nf
│ │ ├── standardUnixCommand.nf
│ │ ├── trickySoftware.nf
│ │ └── workflowSummaryMqc.nf
│ └── subworkflow
│ ├── myWorkflow0.nf
│ └── myWorkflow1.nf
├── recipes # installation recipes for the tools
│ ├── conda
│ ├── dependencies
│ ├── docker
│ └── singularity
└── test # data to test the pipeline
└── data
The geniac demo DSL2 can be run as follows:
export WORK_DIR="${HOME}/tmp/myPipelineDSL2"
export SRC_DIR="${WORK_DIR}/src"
export INSTALL_DIR="${WORK_DIR}/install"
export BUILD_DIR="${WORK_DIR}/build"
export GIT_URL="https://github.com/bioinfo-pf-curie/geniac-demo-dsl2.git"
mkdir -p ${INSTALL_DIR} ${BUILD_DIR}
# clone the repository
# the option --recursive is needed if you use geniac as a submodule
# the option --remote-submodules will pull the last geniac version
# using the release branch from https://github.com/bioinfo-pf-curie/geniac
git clone --remote-submodules --recursive ${GIT_URL} ${SRC_DIR}
cd ${BUILD_DIR}
cmake ${SRC_DIR}/geniac -DCMAKE_INSTALL_PREFIX=${INSTALL_DIR}
make
make install
cd ${INSTALL_DIR}/pipeline
nextflow -c conf/test.config run main.nf -profile multiconda
How can a process have a label which is defined by a variable?¶
With nextflow, it is possible to define a label using a variable instead of a fixed string. In this case, the label value must be given in parentheses. geniac also supports such labels. However, the label must be defined according to the following format: label (params.someValue ?: 'toolPrefix'). In any case, the content of params.someValue must start with the toolPrefix value. The geniac linter will check that there is a tool with a name starting with such a prefix; if it is not the case, it will throw an error.
A typical use case is the possibility to launch a pipeline with a version of a tool given as an option on the nextflow command line. Let’s consider that you have declared three versions of the mySoft tool in the geniac.config file as follows:
params {
    geniac {
        tools {
            mySoft = "conda-forge::mySoft=v0=r351h96ca727_1003"
            mySoft_v1 = "conda-forge::mySoft=v1=r351h96ca727_1003"
            mySoft_v2 = "conda-forge::mySoft=v2=r351h96ca727_1003"
        }
    }
}
Then, in the nextflow process, define the label as follows:
process mySoft {
label (params.mySoftVersion ?: 'mySoft')
label 'minMem'
label 'minCpu'
script:
"""
mySoft --version
"""
}
When you launch nextflow, pass the option --mySoftVersion to set which version of mySoft you want to use.
nextflow run main.nf --mySoftVersion v2 -profile test,singularity
You may also write your nextflow code to use the default version (i.e. v0 with the mySoft label) if no version is specified.
What are the @git_*@ variables?¶
You will find in both the main.nf and nextflow.config some variables surrounded by @ such as @git_repo_name@. These variables are used during the cmake step to extract information from the git repository and replace them with their values. These variables are used in the nextflow manifest for example. If needed, you can remove these variables and set the values to whatever you want.
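For instance, the nextflow manifest could use such placeholders as sketched below (only @git_repo_name@ is named in this FAQ; the other @git_*@ names are hypothetical placeholders of the same kind):

```groovy
manifest {
    name = '@git_repo_name@'
    // hypothetical placeholders following the same @git_*@ pattern
    version = '@git_commit@'
    mainScript = 'main.nf'
}
```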
Why does the conda profile fail to build its environment or take too much time?¶
The conda profile relies on the environment.yml file that is automatically generated by geniac. However, building a Conda recipe can sometimes be very tricky as the order of the channels and the dependencies matters. geniac cannot guess the appropriate order. Moreover, Conda may want to solve conflicts between incompatible packages. Thus, in some cases, you will have no choice but to correct the environment.yml file manually, add it to the git repository (where the main.nf file is located) and install the pipeline with the following options:
cmake ${SRC_DIR}/geniac -DCMAKE_INSTALL_PREFIX=${INSTALL_DIR} -Dap_keep_envyml_from_source=ON
Note that it may be impossible to have a working environment.yml file due to incompatibilities between tools. In that case, use the multiconda profile instead of the conda profile.
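To give an idea of what a manually corrected file can look like, here is a minimal environment.yml sketch (the channel order and package pins are purely illustrative; the point is that their order matters):

```yaml
name: pipeline_env
channels:
  # channel order matters: conda-forge before bioconda here is illustrative
  - conda-forge
  - bioconda
dependencies:
  # hypothetical pinned packages
  - python=3.9
  - fastqc=0.11.9
```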
Why are the tools available from source installed in pipeline/bin/fromSource and not in pipeline/bin?¶
The tools available from source are installed in pipeline/bin/fromSource to ensure that, when using the singularity or docker profiles, the tools installed inside the containers are used. Indeed, nextflow adds the folder pipeline/bin to the environment variable PATH when the container is launched. To illustrate the impact of this setting, assume that the pipeline has been installed with the singularity images in ${INSTALL_DIR}/pipeline. Then, create the file ${INSTALL_DIR}/pipeline/bin/helloWorld which contains the following bash script:
#! /bin/bash
echo "Buenos dias!"
Make this bash script executable with chmod +x ${INSTALL_DIR}/pipeline/bin/helloWorld
and execute the pipeline as follows:
nextflow run main.nf -profile test,singularity
Then, the file ${INSTALL_DIR}/pipeline/results/helloWorld/helloWorld.txt will contain Buenos dias! instead of Hello World!. This means that singularity uses the bash script in ${INSTALL_DIR}/pipeline/bin/helloWorld instead of the helloWorld executable which has been installed inside the image. This raises a reproducibility issue, which we avoid by installing the tools in the folder pipeline/bin/fromSource, which is not in the PATH.
What privileges do I need to build the singularity images?¶
There are several ways.
Using Geniac CLI (see Install the pipeline with the singularity images):
if you have the sudo privileges, use the singularity mode,
if you are allowed to use the fakeroot option, use the singularityfakeroot mode.
Using standard cmake options:
if you have the sudo privileges, pass the option -Dap_install_singularity_images=ON to cmake, and then run sudo make (see Install and run with singularity),
if you are allowed to use the fakeroot option, pass both options -Dap_install_singularity_images=ON and -Dap_singularity_build_options=--fakeroot to cmake, and then run make.
What is the difference between singularity and apptainer?¶
In May 2021, Sylabs, the commercial entity behind Singularity, forked the project. The original Singularity repository has been moved to https://github.com/apptainer/singularity, which will persist as an archive and be set to read-only after the first release of Apptainer (https://github.com/apptainer/apptainer). Apptainer will provide singularity as a command line link and will maintain as much of the CLI and environment functionality as possible. From the user’s perspective, very little, if anything, will change.