R with reproducible environments using the renv package

The renv package helps you to create reproducible environments for your R projects. The renv.lock lockfile records the state of your project’s private library, and can be used to restore the state of that library as required. geniac can use a renv.lock lockfile to install all the package dependencies needed by your R environment. However, this is a use case which requires some manual configuration as explained below. geniac allows you to add as many tools as you wish using renv. In this section, we provide an example using a tool with the label renvGlad.

Important

For any tool using renv, its label must have the prefix renv!
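As a quick illustration of this naming rule, the sketch below checks labels before they are added to the configuration. The functions `is_renv_label` and `describe_label` are names made up for this example and are not part of geniac.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of geniac): succeed only if the label starts with "renv".
is_renv_label() {
  case "$1" in
    renv*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Print a one-line report for a label.
describe_label() {
  if is_renv_label "$1"; then
    echo "$1: ok"
  else
    echo "$1: label must start with renv"
  fi
}

# Example: the first label follows the rule, the second does not.
for label in renvGlad glad; do
  describe_label "$label"
done
```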

Create a conda recipe

Create a conda recipe in the folder recipes/conda that defines which R version you want to use. For example, create recipes/conda/renvGlad.yml as follows:

name: renvGlad
channels:
    - conda-forge
    - bioconda
    - defaults
dependencies:
    - r-base=4.3.1=h29c4799_3
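Since the recipe is where the R version is pinned, it can be handy to read that version back and compare it with the one recorded in the renv.lock file of the tool. The sketch below is one possible way to do this with sed. The three function names are made up for this example, and `lock_r_version` assumes that the `R` entry comes first in the lockfile. geniac itself does not perform this check.

```shell
#!/usr/bin/env bash

# Print the R version pinned in a conda recipe,
# e.g. "4.3.1" for the line "- r-base=4.3.1=h29c4799_3".
recipe_r_version() {
  sed -n 's/^[[:space:]]*-[[:space:]]*r-base==*\([^=[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Print the first "Version" value found in a renv.lock file.
# Assumption: the "R" entry is the first entry of the lockfile.
lock_r_version() {
  sed -n 's/.*"Version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Compare both versions; print a message and return non-zero when they differ.
check_r_version() {
  local recipe="$1" lockfile="$2"
  local pinned locked
  pinned=$(recipe_r_version "$recipe")
  locked=$(lock_r_version "$lockfile")
  if [ "$pinned" = "$locked" ]; then
    echo "R $pinned in both files"
  else
    echo "R version mismatch: recipe=$pinned lockfile=$locked" >&2
    return 1
  fi
}
```

A typical call from the project root would be `check_r_version recipes/conda/renvGlad.yml recipes/dependencies/renvGlad/renv.lock`.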

Add the label in geniac.config

In the section params.geniac.tools of the file conf/geniac.config, add the label with the three scopes yml, env and bioc, for example:

renvGlad {
  yml = "${projectDir}/recipes/conda/renvGlad.yml"
  env = "${params.condaCacheDir}/custom_renvGlad"
  bioc = "3.17"
}
  • renvGlad.yml provides the path to the conda recipe. It should be located in "${projectDir}/recipes/conda".

  • renvGlad.env defines the name of the environment in the conda cache dir.

  • renvGlad.bioc sets the Bioconductor version, which may be required to install the R packages.

Create a process to init the renv

This process allows the usage of the R software with the multiconda and conda profiles. During this process, the dependencies provided in the renv.lock will be installed. The process renvInit is provided with the documentation: copy the code renvInit into the file nf-modules/local/process/renvInit.nf:


process renvInit {
  label 'onlyLinux'
  label 'minCpu'
  label 'minMem'

  input:
    val renvName

  output:
    val renvInitDone, emit: renvInitDone

  script:
    def renvYml = params.geniac.tools.get(renvName).get('yml')
    def renvEnv = params.geniac.tools.get(renvName).get('env')
    def renvBioc = params.geniac.tools.get(renvName).get('bioc')
    def renvLockfile = projectDir.toString() + '/recipes/dependencies/' + renvName + '/renv.lock'
    

    // The code below is generic, normally, no modification is required
    if (workflow.profile.contains('multiconda') || workflow.profile.contains('conda')) {
        renvInitDone = "Conda will be created if it does not exist"
        """
        if conda env list | grep -wq ${renvEnv} || [ -d "${params.condaCacheDir}" -a -d "${renvEnv}" ] ; then
            echo "prefix already exists, skipping environment creation"
        else
            CONDA_PKGS_DIRS=. conda env create --prefix ${renvEnv} --file ${renvYml}
        fi
  
        set +u
        conda_base=\$(dirname \$(which conda))
        if [ -f "\$conda_base/../../etc/profile.d/conda.sh" ]; then
          conda_script="\$conda_base/../../etc/profile.d/conda.sh"
        else
          conda_script="\$conda_base/../etc/profile.d/conda.sh"
        fi
  
        echo \$conda_script
        source \$conda_script
        conda activate ${renvEnv}
        set -u
  
        export PKG_CONFIG_PATH=\$(dirname \$(which conda))/../lib/pkgconfig
        export PKG_LIBS="-liconv"
  
        R -q -e "options(repos = \\"https://cloud.r-project.org\\") ; install.packages(\\"renv\\") ; options(renv.consent = TRUE, renv.config.install.staged=FALSE, renv.settings.use.cache=TRUE) ; install.packages(\\"BiocManager\\"); BiocManager::install(version=\\"${renvBioc}\\", ask=FALSE) ; renv::restore(lockfile = \\"${renvLockfile}\\")"
        """
    } else {
        renvInitDone = "Conda env not needed"
        """
        echo "profiles: ${workflow.profile} ; skip renv step"
        """
    }
}
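Before activating the environment, the script above sources conda.sh. It looks for that file two directory levels above the folder holding the conda executable and falls back to one level above. To debug this step outside Nextflow, the lookup can be isolated in a small function. This is only a sketch: `find_conda_script` is a name chosen for this example, and the `$` signs are no longer escaped with a backslash because the code is not embedded in a Nextflow string.

```shell
#!/usr/bin/env bash

# Print the path of conda.sh for a given directory containing the conda executable.
# Mirrors the lookup order of renvInit: two levels up first, then one level up.
find_conda_script() {
  local conda_base="$1"
  if [ -f "$conda_base/../../etc/profile.d/conda.sh" ]; then
    echo "$conda_base/../../etc/profile.d/conda.sh"
  else
    echo "$conda_base/../etc/profile.d/conda.sh"
  fi
}

# Typical call when conda is available on the PATH:
#   find_conda_script "$(dirname "$(which conda)")"
```

If the printed path does not exist on your system, `source` will fail in renvInit as well, which usually points to an unusual conda installation layout.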

In your main.nf, include the file ./nf-modules/local/process/renvInit.nf as a Nextflow module. The module must be included under an alias that starts with the label of the process that will use renv. In this example, the process glad has the label renvGlad. Therefore, the renvInit module is included as renvGladInit (i.e. the label name followed by the suffix Init):

include { renvInit as renvGladInit } from './nf-modules/local/process/renvInit'

Then, in your main.nf:

  • invoke the nextflow module renvGladInit using the label of the tool 'renvGlad' as an argument

  • call the process glad taking as an argument the output of the process renvGladInit

renvGladInit('renvGlad')
glad(renvGladInit.out.renvInitDone)

If you have several processes using renv, follow the exact same procedure for each of them, using the label name of the corresponding process.
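To make the naming convention concrete for more than one tool, the sketch below prints the include line and the init call that main.nf needs for each label. The helper `renv_init_lines` and the second label `renvEdger` are made up for this illustration; only `renvGlad` comes from the example above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the main.nf lines that wire renvInit for one label.
renv_init_lines() {
  local label="$1"
  printf "include { renvInit as %sInit } from './nf-modules/local/process/renvInit'\n" "$label"
  printf "%sInit('%s')\n" "$label" "$label"
}

# renvEdger is a made-up second label used only for illustration.
for label in renvGlad renvEdger; do
  renv_init_lines "$label"
done
```

Each process that uses one of these labels then takes the matching output as its argument, in the same way that glad takes renvGladInit.out.renvInitDone.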

Copy your renv.lock file in a subfolder inside recipes/dependencies/

We assume that the reader is familiar with renv. In the folder recipes/dependencies/, create a subfolder with the name of the label of the tool, for example recipes/dependencies/renvGlad. Then, copy your renv.lock file into this subfolder. Here is an example of a renv.lock file:

{
  "R": {
    "Version": "4.3.1",
    "Repositories": [
      {
        "Name": "CRAN",
        "URL": "https://cloud.r-project.org"
      }
    ]
  },
  "Bioconductor": {
    "Version": "3.17"
  },
  "Packages": {
    "BiocManager": {
      "Package": "BiocManager",
      "Version": "1.30.22",
      "Source": "Repository",
      "Repository": "CRAN",
      "Requirements": [
        "utils"
      ],
      "Hash": "d57e43105a1aa9cb54fdb4629725acb1"
    },
    "BiocVersion": {
      "Package": "BiocVersion",
      "Version": "3.17.1",
      "Source": "Bioconductor",
      "Requirements": [
        "R"
      ],
      "Hash": "f7c0d5521799b7b0d0a211143ed0bfcb"
    },
    "GLAD": {
      "Package": "GLAD",
      "Version": "2.64.0",
      "Source": "Bioconductor",
      "Requirements": [
        "R",
        "aws"
      ],
      "Hash": "efd50c9f23e052f086c8a22fc92c07c8"
    },
    "aws": {
      "Package": "aws",
      "Version": "2.5-3",
      "Source": "Repository",
      "Repository": "CRAN",
      "Requirements": [
        "R",
        "awsMethods",
        "gsl",
        "methods"
      ],
      "Hash": "0832bb1e53afaba3a37732c1a5760deb"
    },
    "awsMethods": {
      "Package": "awsMethods",
      "Version": "1.1-1",
      "Source": "Repository",
      "Repository": "CRAN",
      "Requirements": [
        "R",
        "methods"
      ],
      "Hash": "2dfceb7e0b4e9979cc392acb569c6c0d"
    },
    "gsl": {
      "Package": "gsl",
      "Version": "2.1-8",
      "Source": "Repository",
      "Repository": "CRAN",
      "Requirements": [
        "R"
      ],
      "Hash": "8f8205cbb9f1066a94e22898fe949184"
    },
    "renv": {
      "Package": "renv",
      "Version": "1.0.1",
      "Source": "Repository",
      "Repository": "CRAN",
      "Requirements": [
        "utils"
      ],
      "Hash": "6523639dd021b32c3199a41cbe6db340"
    }
  }
}
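The renvInit process builds the lockfile path as recipes/dependencies/, followed by the label and /renv.lock, below the project directory. The sketch below copies an existing lockfile to that location. Both function names are chosen for this example and are not part of geniac.

```shell
#!/usr/bin/env bash

# Print the path where renvInit looks for the lockfile of a tool.
lockfile_path() {
  local project_dir="$1" label="$2"
  echo "$project_dir/recipes/dependencies/$label/renv.lock"
}

# Hypothetical helper: copy an existing lockfile to that location
# and print the destination path.
install_lockfile() {
  local project_dir="$1" label="$2" source_lock="$3"
  local dest
  dest=$(lockfile_path "$project_dir" "$label")
  mkdir -p "$(dirname "$dest")" && cp "$source_lock" "$dest" && echo "$dest"
}

# Example call from the root of the pipeline repository,
# assuming the lockfile to copy is ./renv.lock:
#   install_lockfile . renvGlad ./renv.lock
```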

Add a process which uses the renv

Write your process using the label of the renv tool, and always declare val renvInitDone in its input section so that the process only starts once the renv initialization has completed:

process glad {
  label 'renvGlad'
  label 'minCpu'
  label 'lowMem'
  publishDir "${params.outDir}/GLAD", mode: 'copy'

  input:
  val renvInitDone

  output:
  path "BkpInfo.tsv"

  script:
  """
  Rscript ${projectDir}/bin/apGlad.R
  """
}