Installing software packages on a workstation

While we try to provide software blueprints with many popular software packages pre-installed, you will often need to install additional software on a workstation. The following is a non-exhaustive list of methods for installing new tools.

Packages installed with the methods on this page persist between stop/start cycles of a workstation.

Using conda/mamba

All software blueprints currently use conda for package management. We recommend using conda to find and install packages whenever possible. Mamba is a drop-in replacement for conda that is faster and offers a better command-line experience, so we recommend using the mamba command whenever possible.

Examples

  1. Install a new package into the base conda environment. Here, we search the bioconda repository for samtools and install the latest version:

    Commands for installing samtools with mamba
    mamba search -c bioconda samtools # samtools version 1.16.1 is the most recent
    mamba install -c bioconda samtools=1.16.1 # select y when asked to confirm
  2. Create a new environment with certain specifications. Sometimes, conflicts between package dependencies result in unsatisfiable environments. In this case, it is best to create a new conda environment with the desired packages. Here, we install an older version of R and the R package ggplot2, and then activate the environment for use.

    Commands for installing an older version of R with mamba
    # install R 4.0.x and a ggplot2 build that is compatible with it
    mamba create -y -n r-4.0 -c conda-forge r-base=4.0 r-ggplot2
    mamba activate r-4.0 # activate the environment for use
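
Before creating an environment, it can help to check whether one with the same name already exists. The following is a minimal sketch, assuming that `mamba env list` prints one environment per line with the environment name in the first column. `env_exists` is a hypothetical helper name, not part of mamba.

```shell
# Hypothetical helper: succeed if a conda environment with the given name exists.
# Assumes "mamba env list" prints the environment name in the first column.
env_exists() {
    mamba env list 2>/dev/null | awk '{print $1}' | grep -qxF -- "$1"
}

if env_exists r-4.0; then
    echo "environment r-4.0 already exists"
else
    echo "environment r-4.0 not found; create it with mamba create"
fi
```

If mamba is not on the PATH, the helper simply reports that the environment was not found.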

Using the Ubuntu Linux package manager, APT

All current software blueprints are based on Ubuntu Linux, and therefore come with the Advanced Package Tool (APT). Many popular software packages, including some computational biology tools, are available through APT. We recommend using conda where possible; the example below uses APT to install the graph visualization software graphviz.

Commands for installing graphviz with apt
sudo apt update           # updates package repository information
apt-cache search graphviz # search for packages with graphviz in the name or description
sudo apt install graphviz # install the package after finding the right name
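
To avoid reinstalling a package that is already present, you can query the dpkg database first. This is a minimal sketch; `is_installed` is a hypothetical helper name, and it relies on the standard `dpkg-query` tool that ships with Ubuntu.

```shell
# Hypothetical helper: succeed only if the named package is fully installed.
# dpkg-query prints a status such as "install ok installed" for installed packages.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'ok installed'
}

if is_installed graphviz; then
    echo "graphviz is already installed"
else
    echo "graphviz is missing; install it with: sudo apt install graphviz"
fi
```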

Installing Python packages

Python users have a few different ways to install packages.

Using conda for Python packages

Many Python packages are present in the conda-forge channel. We recommend using conda to install Python packages whenever possible. See the instructions above for examples of using conda/mamba to install packages.

Using pip

pip is the primary package installer for Python. You can use pip to install Python packages on workstations that have Python installed. For example, you can use the following commands to install Biopython:

Commands for installing Biopython with pip
python --version                # ensure you have a working python installation: should be Python 3.N.N
python -m pip --version         # ensure you have a working pip installation: should be pip 2X.Y.Z
python -m pip install biopython # install it!
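
The name passed to pip is not always the name used in an import statement. Biopython, for example, is imported as `Bio`. The sketch below checks whether a module can be imported by the `python3` on your PATH; `py_has_module` is a hypothetical helper name introduced for illustration.

```shell
# Hypothetical helper: succeed if the named top-level module is importable.
# The module name is passed as an argument rather than pasted into the code.
py_has_module() {
    python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1" 2>/dev/null
}

if py_has_module Bio; then
    echo "Biopython is importable"
else
    echo "Biopython is not importable from this Python"
fi
```

A failed check can also mean that pip installed into a different Python than the one you are running, which is why the commands above call pip as `python -m pip`.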

Using pip to install a package from GitHub

pip can also be used to install packages from GitHub repositories. This is useful for installing a private package or a specific development version of a public package. For example, you can use the following commands to install a specific revision of Scanpy, a package for single-cell gene expression analysis:

Commands for installing Scanpy from GitHub with pip
python3 -m pip install 'scanpy @ git+https://github.com/scverse/scanpy@d7e13025b931ad4afd03b4344ef5ff4a46f78b2b'
python3 -c "import scanpy; print(scanpy.__version__)"
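
The requirement string in the command above has three parts: the package name, the repository URL prefixed with `git+`, and a revision after the final `@`. The revision can be a commit hash, a branch, or a tag. As a small illustration, the hypothetical helper `pip_git_spec` below assembles such a string from its parts.

```shell
# Hypothetical helper: build a pip requirement of the form
#   <name> @ git+<repository URL>@<revision>
pip_git_spec() {
    printf '%s @ git+%s@%s\n' "$1" "$2" "$3"
}

spec=$(pip_git_spec scanpy https://github.com/scverse/scanpy d7e13025b931ad4afd03b4344ef5ff4a46f78b2b)
echo "$spec"

# The result is passed to pip as a single quoted argument:
#   python3 -m pip install "$spec"
```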

Installing a package from a clone of a Git repository

Python packages can be installed from clones of Git repositories. For example, you can use the following commands to install the python-helloworld package:

Commands for installing the helloworld package from a clone of its Git repository
git clone https://github.com/dbarnett/python-helloworld.git
cd python-helloworld
python3 -m pip install . # install from the current directory (running setup.py directly is deprecated)
python3 helloworld.py # will print "Hello, world"

Installing R packages

R users also have a few different ways to install packages.

Using conda to install R packages

Most R packages in CRAN and Bioconductor are now available through the conda-forge and bioconda channels. We recommend using conda/mamba to install R packages whenever possible. These commands must be typed in a terminal, outside of an R session. In RStudio, you can use the "Terminal" tab (next to "Console").

Packages in CRAN are available in the conda-forge channel and have the prefix r-. Packages in Bioconductor are available in the bioconda channel and have the prefix bioconductor-.

Commands for installing ggplot2 (in CRAN) and limma (in Bioconductor)
mamba search -c conda-forge '*ggplot2*'      # search for the right package name (quoted so the shell does not expand the *)
mamba install -c conda-forge r-ggplot2       # install ggplot2
mamba search -c bioconda '*limma*'           # search for the right package name
mamba install -c bioconda bioconductor-limma # install limma
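
The prefix convention can be written down as a small function. This is a sketch only: `conda_r_name` is a hypothetical helper, and it assumes that conda package names are the lowercased CRAN or Bioconductor names, so confirm the result with `mamba search` before installing.

```shell
# Hypothetical helper: map a CRAN or Bioconductor package name to the
# expected conda package name. Assumes conda names are lowercase.
conda_r_name() {
    source_repo="$1"   # "cran" or "bioconductor"
    pkg_name="$2"
    lower=$(printf '%s' "$pkg_name" | tr '[:upper:]' '[:lower:]')
    case "$source_repo" in
        cran)         printf 'r-%s\n' "$lower" ;;
        bioconductor) printf 'bioconductor-%s\n' "$lower" ;;
        *)            echo "unknown source: $source_repo" >&2; return 1 ;;
    esac
}

conda_r_name cran ggplot2               # prints r-ggplot2
conda_r_name bioconductor limma         # prints bioconductor-limma
conda_r_name bioconductor GenomicRanges # prints bioconductor-genomicranges
```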

Using install.packages to install packages from CRAN

The Comprehensive R Archive Network (CRAN) repository offers many general-purpose tools that can be installed with R's built-in install.packages function. For example, you can run the following command in an R session to install the popular packages ggplot2 and tidyr:

Commands for installing ggplot2 and tidyr from CRAN
install.packages(c("ggplot2", "tidyr"))

Using Bioconductor

Bioconductor provides open-source computational biology and bioinformatics packages. For example, you can use the following commands to install GenomicRanges:

Commands for installing GenomicRanges from Bioconductor
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("GenomicRanges")

Using devtools to install packages from their source code

The devtools package can also be used to install packages from their sources, such as from GitHub. For example, you can use the following commands to install ArchR, a package for processing and analyzing single-cell ATAC-seq data:

Commands for installing ArchR with devtools
# install devtools and BiocManager, if needed
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# install ArchR
devtools::install_github("GreenleafLab/ArchR", ref="master", repos = BiocManager::repositories())
# Load ArchR and install extra dependencies
library(ArchR)
ArchR::installExtraPackages()

Compiling packages from their sources

When the above solutions cannot be used, packages can also be compiled and installed from their sources. For example, you can use the following commands to install samtools from its source. Note the argument to ./configure, which changes the default directory for the make install command. Placing the installation in the home directory makes it persistent.

Commands for installing samtools from its source
# samtools is distributed as a tar.bz2 file, which requires the lbzip2 package to uncompress
# other samtools dependencies are installed with apt
sudo apt install --yes --no-install-recommends \
    make gcc zlib1g-dev liblzma-dev \
    lbzip2 libncurses5-dev libbz2-dev
# download samtools source and uncompress it
wget https://github.com/samtools/samtools/releases/download/1.16.1/samtools-1.16.1.tar.bz2
tar xvf samtools-1.16.1.tar.bz2
# change into the samtools directory and configure the build
# we set the prefix to be within the home directory, so the
# installation is persistent between stop/start cycles.
cd samtools-1.16.1
./configure --prefix=/home/bench-user/.local
# build it
make
# install
make install
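
With this prefix, make install places the executables in the bin directory under the prefix, and the shell only finds them if that directory is on the PATH. The following is a minimal sketch, assuming that `$HOME` is `/home/bench-user` as in the prefix above. It adds the directory for the current shell session only.

```shell
# Prepend ~/.local/bin to PATH unless it is already there.
case ":$PATH:" in
    *":$HOME/.local/bin:"*) ;;                        # already on PATH
    *) PATH="$HOME/.local/bin:$PATH"; export PATH ;;  # add for this session
esac

# Report which samtools the shell would run, if any.
if command -v samtools >/dev/null 2>&1; then
    command -v samtools
else
    echo "samtools not found on PATH"
fi
```

To keep the setting across sessions, the same lines can be added to a shell startup file such as `~/.bashrc`.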