Installing software packages on a workstation
While we try to provide software blueprints with many popular software packages pre-installed, you will often need to install additional software on a workstation. The following is a non-exhaustive list of methods for installing new tools.
Installed packages will be persisted between stop/start cycles of a workstation if you follow the methods on this page.
Using conda/mamba
All software blueprints currently use conda for package management. We recommend using conda to find and install packages whenever possible. Mamba, a drop-in replacement for conda, is faster and offers a better command-line experience. We therefore recommend using the mamba command whenever possible.
Examples
Install a new package into the base conda environment. Here, we search the bioconda repository for samtools and install the latest version:
Commands for installing samtools with mamba
mamba search -c bioconda samtools # samtools version 1.16.1 is the most recent
mamba install -c bioconda samtools=1.16.1 # select y when asked to confirm
Create a new environment with certain specifications. Sometimes, conflicts between package dependencies result in unsatisfiable environments. In this case, it is best to create a new conda environment with the desired packages. Here, we install an older version of R and the R package ggplot2, and then activate the environment for use.
Commands for installing an older version of R with mamba
mamba create -y -n r-4.0 -c conda-forge r==4.0 r-ggplot2 # install R version 4.0 and only packages that are compatible with R version 4.0
mamba activate r-4.0 # activate the environment for use
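Because mamba is a drop-in replacement for conda, a setup script can prefer mamba and fall back to conda when mamba is not installed. The following is a minimal sketch of that idea; the function name conda_frontend is hypothetical and is not part of conda or mamba.

```shell
# Print the preferred package-manager command: mamba if present, else conda.
# conda_frontend is a hypothetical helper name, not a conda/mamba feature.
conda_frontend() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    elif command -v conda >/dev/null 2>&1; then
        echo conda
    else
        echo "neither mamba nor conda was found on PATH" >&2
        return 1
    fi
}
```

A script could then run, for example, frontend=$(conda_frontend) && "$frontend" install -c bioconda samtools=1.16.1, which uses whichever command is available.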
Using the Ubuntu Linux package manager, APT
All current software blueprints are based on Ubuntu Linux, and therefore come with the Advanced Package Tool (APT). Many popular software packages, and even computational biology tools, are available through APT. While we recommend using conda where possible, this example installs the graph visualization software graphviz.
sudo apt update # updates package repository information
apt-cache search graphviz # search for packages with graphviz in the name or description
sudo apt install graphviz # install the package after finding the right name
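When the same setup commands are re-run, it can be useful to skip the install if the tool is already available. The sketch below assumes that the presence of a command on the PATH (for graphviz, the dot command) is a good enough sign that the package is installed; ensure_cmd is a hypothetical helper name, not part of APT.

```shell
# Install an APT package only if the command it provides is missing.
# Usage: ensure_cmd <command-name> <apt-package-name>
# ensure_cmd is a hypothetical helper name.
ensure_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 is already installed"
    else
        # refresh the package index first, then install without prompting
        sudo apt update && sudo apt install --yes "$2"
    fi
}
```

For instance, ensure_cmd dot graphviz would install graphviz only when the dot command is not yet on the PATH.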
Installing Python packages
Python users have a few different ways to install packages.
Using conda for Python packages
Many Python packages are present in the conda-forge channel. We recommend using conda to install Python packages whenever possible. See the instructions above for examples of using conda/mamba to install packages.
Using pip
pip is the primary package installer for Python. You can use pip to install Python packages on workstations that have Python installed. For example, you can use the following commands to install Biopython:
python --version # ensure you have a working python installation: should be Python 3.N.N
python -m pip --version # ensure you have a working pip installation: should be pip 2X.Y.Z
python -m pip install biopython # install it!
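After installing, it can help to confirm that the package is importable. Note that the import name may differ from the name given to pip: Biopython is installed as biopython but imported as Bio. The following is a small sketch, in which py_has_module is a hypothetical helper name and python3 is assumed to be the same interpreter that was used for the install.

```shell
# Succeed (exit status 0) if python3 can locate the named module,
# and fail (non-zero exit status) otherwise.
# py_has_module is a hypothetical helper name.
py_has_module() {
    python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}
```

For example, py_has_module Bio && echo "Biopython is importable" prints the message only when the import would succeed.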
Using pip to install a package from GitHub
pip can also be used to install packages from GitHub repositories. This is useful for installing a private package or a specific development version of a public package. For example, you can use the following commands to install a specific revision of Scanpy, a package for single-cell gene expression analysis:
python3 -m pip install 'scanpy @ git+https://github.com/scverse/scanpy@d7e13025b931ad4afd03b4344ef5ff4a46f78b2b'
python3 -c "import scanpy; print(scanpy.__version__)"
Installing a package from a clone of a Git repository
Python packages can be installed from clones of Git repositories. For example, you can use the following commands to install the python-helloworld package:
git clone https://github.com/dbarnett/python-helloworld.git
cd python-helloworld
python3 -m pip install . # build and install the package from the cloned source tree
python3 helloworld.py # will print "Hello, world"
Installing R packages
R users also have a few different ways to install packages.
Using conda to install R packages
Most R packages in CRAN and Bioconductor are now available through the conda-forge and bioconda channels. We recommend using conda/mamba to install R packages whenever possible. These commands must be typed in a terminal, outside of an R session. In RStudio, you can use the "Terminal" tab (next to "Console").
Packages in CRAN are available in the conda-forge channel and have the prefix r-. Packages in Bioconductor are available in the bioconda channel and have the prefix bioconductor-.
mamba search -c conda-forge '*ggplot2*' # search for the right package name (quotes stop the shell from expanding the wildcard)
mamba install -c conda-forge r-ggplot2 # install ggplot2
mamba search -c bioconda '*limma*' # search for the right package name
mamba install -c bioconda bioconductor-limma # install limma
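The naming convention described above can be written down as a small helper. This sketch assumes that the conda name is the lower-cased R package name with the channel's prefix (so GenomicRanges would become bioconductor-genomicranges); conda_r_name is a hypothetical helper name, and mamba search remains the reliable way to confirm that a package exists under that name.

```shell
# Map an R package name to its expected conda package name.
# Usage: conda_r_name cran|bioc <R-package-name>
# conda_r_name is a hypothetical helper name.
conda_r_name() {
    # conda package names are assumed to be lower case
    lower=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    case "$1" in
        cran) printf 'r-%s\n' "$lower" ;;
        bioc) printf 'bioconductor-%s\n' "$lower" ;;
        *)    echo "unknown source: $1 (expected cran or bioc)" >&2; return 1 ;;
    esac
}
```

The result can be passed straight to mamba, for example mamba install -c bioconda "$(conda_r_name bioc limma)".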
Using install.packages to install packages from CRAN
The Comprehensive R Archive Network (CRAN) repository offers many general-purpose tools that can be installed with the built-in install.packages function. For example, you can use the following command in an R session to install the popular packages ggplot2 and tidyr:
install.packages(c("ggplot2", "tidyr"))
Using Bioconductor
Bioconductor provides open-source computational biology and bioinformatics packages. For example, you can use the following commands to install GenomicRanges:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("GenomicRanges")
Using devtools to install packages from their source code
The devtools package can also be used to install packages from their sources, such as from GitHub. For example, you can use the following commands to install ArchR, a package for processing and analyzing single-cell ATAC-seq data:
# install devtools and BiocManager, if needed
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# install ArchR
devtools::install_github("GreenleafLab/ArchR", ref="master", repos = BiocManager::repositories())
# Load ArchR and install extra dependencies
library(ArchR)
ArchR::installExtraPackages()
Compiling packages from their sources
When the above solutions cannot be used, packages can also be installed from their sources. For example, you can use the following commands to install samtools from its source. Note the argument to ./configure, which changes the default directory for the make install command. By placing the installation in the home directory, it becomes persistent.
# samtools is distributed as a tar.bz2 file, which requires the lbzip2 package to uncompress
# other samtools dependencies are installed with apt
sudo apt install --yes --no-install-recommends \
make gcc zlib1g-dev liblzma-dev \
lbzip2 libncurses5-dev libbz2-dev
# download samtools source and uncompress it
wget https://github.com/samtools/samtools/releases/download/1.16.1/samtools-1.16.1.tar.bz2
tar xvf samtools-1.16.1.tar.bz2
# change into the samtools directory and configure the build
# we set the prefix to be within the home directory, so the
# installation is persistent between stop/start cycles.
cd samtools-1.16.1
./configure --prefix=/home/bench-user/.local
# build it
make
# install
make install
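With this prefix, make install typically places the samtools executable in /home/bench-user/.local/bin, following the usual prefix/bin convention. If that directory is not already on your PATH, the shell will not find the new binary. The sketch below adds it for the current session; path_prepend is a hypothetical helper name, and whether your shell's startup files already include this directory is an assumption to check.

```shell
# Prepend a directory to PATH unless it is already listed.
# path_prepend is a hypothetical helper name.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already on PATH: leave it unchanged
        *) PATH="$1:$PATH" ;;        # otherwise put it at the front
    esac
    export PATH
}

# make binaries installed under ~/.local visible in this session
path_prepend "$HOME/.local/bin"
```

Adding these lines to a shell startup file such as ~/.bashrc (assuming bash is your shell) would apply the change to new terminals as well. Running samtools --version should then report version 1.16.1.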