R with reproducible environments using renv package¶
The renv package helps you create reproducible environments for your R projects. The renv.lock lockfile records the state of your project’s private library, and can be used to restore the state of that library as required. geniac can use a renv.lock lockfile to install all the package dependencies needed by your R environment. However, this use case requires some manual configuration, as explained below. geniac allows you to add as many tools as you wish using renv. In this section, we provide an example using a tool with the label renvGlad.
Important
For any tool using renv, its label must have the prefix renv!
Create a conda recipe¶
Create the conda recipe in the folder recipes/conda. The recipe defines which R version you want to use. For example, create recipes/conda/renvGlad.yml as follows:
name: renvGlad
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- r-base=4.3.1=h29c4799_3
Add the label in geniac.config¶
In the section params.geniac.tools of the file conf/geniac.config, add the label with the three scopes yml, env and bioc, for example:
renvGlad {
yml = "${projectDir}/recipes/conda/renvGlad.yml"
env = "${params.condaCacheDir}/custom_renvGlad"
bioc = "3.17"
}
renvGlad.yml provides the path to the conda recipe. It should be located in "${projectDir}/recipes/conda".
renvGlad.env defines the name of the environment in the conda cache dir.
renvGlad.bioc sets the Bioconductor version, which may be required to install the R packages.
Create a process to init the renv¶
This process allows the usage of the R software with the multiconda and conda profiles. During this process, the dependencies listed in the renv.lock will be installed. The process renvInit is provided with the documentation: copy the code of renvInit into the file nf-modules/local/process/renvInit.nf:
process renvInit {
label 'onlyLinux'
label 'minCpu'
label 'minMem'
input:
val renvName
output:
val renvInitDone, emit: renvInitDone
script:
def renvYml = params.geniac.tools.get(renvName).get('yml')
def renvEnv = params.geniac.tools.get(renvName).get('env')
def renvBioc = params.geniac.tools.get(renvName).get('bioc')
def renvLockfile = projectDir.toString() + '/recipes/dependencies/' + renvName + '/renv.lock'
// The code below is generic, normally, no modification is required
if (workflow.profile.contains('multiconda') || workflow.profile.contains('conda')) {
renvInitDone = "Conda will be created if it does not exist"
"""
if conda env list | grep -wq ${renvEnv} || [ -d "${params.condaCacheDir}" -a -d "${renvEnv}" ] ; then
echo "prefix already exists, skipping environment creation"
else
CONDA_PKGS_DIRS=. conda env create --prefix ${renvEnv} --file ${renvYml}
fi
set +u
conda_base=\$(dirname \$(which conda))
if [ -f \$conda_base/../../etc/profile.d/conda.sh ]; then
conda_script="\$conda_base/../../etc/profile.d/conda.sh"
else
conda_script="\$conda_base/../etc/profile.d/conda.sh"
fi
echo \$conda_script
source \$conda_script
conda activate ${renvEnv}
set -u
export PKG_CONFIG_PATH=\$(dirname \$(which conda))/../lib/pkgconfig
export PKG_LIBS="-liconv"
R -q -e "options(repos = \\"https://cloud.r-project.org\\") ; install.packages(\\"renv\\") ; options(renv.consent = TRUE, renv.config.install.staged=FALSE, renv.settings.use.cache=TRUE) ; install.packages(\\"BiocManager\\"); BiocManager::install(version=\\"${renvBioc}\\", ask=FALSE) ; renv::restore(lockfile = \\"${renvLockfile}\\")"
"""
} else {
renvInitDone = "Conda env not needed"
"""
echo "profiles: ${workflow.profile} ; skip renv step"
"""
}
}
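The part of renvInit that locates conda.sh is easier to read outside the Nextflow string, where the \$ escapes are not needed. The sketch below restates that lookup as a plain shell function. The name find_conda_script is hypothetical and is not part of geniac. The logic mirrors the process: try two directory levels above the conda executable first, then fall back to one level above.

```shell
# Sketch of the conda.sh lookup performed in renvInit, written as a plain
# shell function (no Nextflow escaping). find_conda_script is a hypothetical
# helper name used only for this illustration.
find_conda_script() {
    # $1: directory that contains the conda executable
    local conda_base="$1"
    if [ -f "$conda_base/../../etc/profile.d/conda.sh" ]; then
        echo "$conda_base/../../etc/profile.d/conda.sh"
    else
        # Fallback used by the process when the first location is absent
        echo "$conda_base/../etc/profile.d/conda.sh"
    fi
}

# Typical use, assuming conda is on the PATH:
#   conda_script="$(find_conda_script "$(dirname "$(which conda)")")"
#   . "$conda_script"
```

Sourcing the returned script is what makes conda activate available in a non-interactive shell, which is why the process does it before activating the environment.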
In your main.nf, include the file ./nf-modules/local/process/renvInit.nf as a nextflow module. The module should be included under a name whose prefix is the label of the process that will use renv. In this example, the process glad has the label renvGlad. Therefore, the renvInit module is included as renvGladInit (i.e. the label name concatenated with the Init suffix):
include { renvInit as renvGladInit } from './nf-modules/local/process/renvInit'
Then, in your main.nf:
invoke the nextflow module renvGladInit using the label of the tool 'renvGlad' as an argument
call the process glad taking as an argument the output of the process renvGladInit
renvGladInit('renvGlad')
glad(renvGladInit.out.renvInitDone)
If you have several processes using renv, follow exactly the same procedure for each of them, using the label of the corresponding process.
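Because the include line and the init call follow a fixed pattern derived from the label, they can be generated rather than typed by hand. The following is a minimal sketch, assuming the module path shown on this page. The function name renv_include_lines is hypothetical, and the only label taken from this documentation is renvGlad. The function rejects labels that do not start with renv, in line with the naming rule stated at the top of this page.

```shell
# Sketch: print the include line and the init call for each renv label.
# renv_include_lines is a hypothetical helper, not a geniac command.
renv_include_lines() {
    local label
    for label in "$@"; do
        # Labels of renv tools must start with the prefix "renv"
        case "$label" in
            renv*) ;;
            *) echo "label '$label' does not start with 'renv'" >&2; return 1 ;;
        esac
        printf "include { renvInit as %sInit } from './nf-modules/local/process/renvInit'\n" "$label"
        printf "%sInit('%s')\n" "$label" "$label"
    done
}

renv_include_lines renvGlad
```

Passing several labels prints one include line and one call per label, which can then be pasted into main.nf.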
Copy your renv.lock file into a subfolder inside recipes/dependencies/¶
We assume that the reader is familiar with renv. In the folder recipes/dependencies/, create a subfolder with the name of the label of the tool, for example recipes/dependencies/renvGlad. Then, copy your renv.lock file into this subfolder. Here is an example of a renv.lock file:
{
"R": {
"Version": "4.3.1",
"Repositories": [
{
"Name": "CRAN",
"URL": "https://cloud.r-project.org"
}
]
},
"Bioconductor": {
"Version": "3.17"
},
"Packages": {
"BiocManager": {
"Package": "BiocManager",
"Version": "1.30.22",
"Source": "Repository",
"Repository": "CRAN",
"Requirements": [
"utils"
],
"Hash": "d57e43105a1aa9cb54fdb4629725acb1"
},
"BiocVersion": {
"Package": "BiocVersion",
"Version": "3.17.1",
"Source": "Bioconductor",
"Requirements": [
"R"
],
"Hash": "f7c0d5521799b7b0d0a211143ed0bfcb"
},
"GLAD": {
"Package": "GLAD",
"Version": "2.64.0",
"Source": "Bioconductor",
"Requirements": [
"R",
"aws"
],
"Hash": "efd50c9f23e052f086c8a22fc92c07c8"
},
"aws": {
"Package": "aws",
"Version": "2.5-3",
"Source": "Repository",
"Repository": "CRAN",
"Requirements": [
"R",
"awsMethods",
"gsl",
"methods"
],
"Hash": "0832bb1e53afaba3a37732c1a5760deb"
},
"awsMethods": {
"Package": "awsMethods",
"Version": "1.1-1",
"Source": "Repository",
"Repository": "CRAN",
"Requirements": [
"R",
"methods"
],
"Hash": "2dfceb7e0b4e9979cc392acb569c6c0d"
},
"gsl": {
"Package": "gsl",
"Version": "2.1-8",
"Source": "Repository",
"Repository": "CRAN",
"Requirements": [
"R"
],
"Hash": "8f8205cbb9f1066a94e22898fe949184"
},
"renv": {
"Package": "renv",
"Version": "1.0.1",
"Source": "Repository",
"Repository": "CRAN",
"Requirements": [
"utils"
],
"Hash": "6523639dd021b32c3199a41cbe6db340"
}
}
}
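Since renvInit builds the lockfile path from the label, a misplaced or misnamed folder only shows up when the pipeline runs. The sketch below checks the layout beforehand and compares the R version of the lockfile with the r-base pin of the conda recipe. Keeping the two versions aligned is an assumption made here, based on the example on this page where both are 4.3.1. The helper names (recipe_r_version, lock_r_version, check_renv_tool) are hypothetical and not part of geniac. The lockfile is parsed with awk on the assumption that it uses the one-key-per-line layout written by renv.

```shell
# Sketch: check that the lockfile sits where renvInit looks for it and that
# its R version matches the r-base pin of the conda recipe.
# All function names are hypothetical helpers, not geniac commands.

# Print the R version pinned in a conda recipe (line "- r-base=<version>...").
recipe_r_version() {
    sed -n 's/^[[:space:]]*-[[:space:]]*r-base=\([^=[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Print the "Version" field that follows the top-level "R" key of a renv.lock.
lock_r_version() {
    awk '/"R"[ \t]*:[ \t]*[{]/ { in_r = 1; next }
         in_r && /"Version"/ { gsub(/[",]/, ""); print $2; exit }' "$1"
}

check_renv_tool() {
    # $1: tool label (for example renvGlad), $2: project directory
    local label="$1"
    local project_dir="$2"
    local recipe="$project_dir/recipes/conda/$label.yml"
    local lockfile="$project_dir/recipes/dependencies/$label/renv.lock"
    [ -f "$recipe" ]   || { echo "missing $recipe" >&2; return 1; }
    [ -f "$lockfile" ] || { echo "missing $lockfile" >&2; return 1; }
    local r_recipe r_lock
    r_recipe="$(recipe_r_version "$recipe")"
    r_lock="$(lock_r_version "$lockfile")"
    if [ "$r_recipe" != "$r_lock" ]; then
        echo "R version mismatch: recipe=$r_recipe lockfile=$r_lock" >&2
        return 1
    fi
    echo "$label: R $r_lock"
}

# Example call from the root of the pipeline repository:
#   check_renv_tool renvGlad .
```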
Add a process which uses the renv¶
Write your process using the label of the renv tool, and always declare val renvInitDone in the input section:
process glad {
label 'renvGlad'
label 'minCpu'
label 'lowMem'
publishDir "${params.outDir}/GLAD", mode: 'copy'
input:
val renvInitDone
output:
path "BkpInfo.tsv"
script:
"""
Rscript ${projectDir}/bin/apGlad.R
"""
}