
Downloading files from a workstation

There are several ways to download files from a workstation onto your local computer.

SSH is the best way to download large files from your workstation to your local computer. We recommend using rsync, a free tool for copying files. rsync can be installed with the system package manager on Linux/WSL or with Homebrew on macOS. scp is a similar tool with fewer features that comes pre-installed on most Linux distributions and on macOS.

  • rsync (over SSH)
  • WinSCP (over SSH, for Windows users)
  • scp (over SSH)
  • VS Code server
  • JupyterLab
  • RStudio Server
Tip: We recommend you first complete the advanced SSH configuration to reduce the verbosity in these commands.

Download a single file from a workstation

Run the command below to copy the single file /home/bench-user/sample_1.fastq.gz on your workstation to the fastq directory on your local machine. The fastq directory must already exist; otherwise rsync writes the download to a file named fastq. Replace the following variables:

  • org-id: Replace with the ID of your organization, such as acme-bio.
  • workstation-id: Replace with the ID of your workstation, such as exceptional-panda-g1i.
  • blueprint-id: Replace with the ID of the blueprint, such as python.
  • compute-cluster-id: Replace with the ID of your compute cluster, such as us-west-2.aws.
  • ssh-private-key-file: Replace with the path to your private SSH key, such as ~/.ssh/id_rsa.
rsync
rsync -P -e \
'ssh -i {ssh-private-key-file} -o "ProxyCommand openssl s_client -quiet -connect {compute-cluster-id}.bench.deeporigin.io:2222 -servername {workstation-id}-{blueprint-id}.org-{org-id}"' \
bench-user@{workstation-id}:/home/bench-user/sample_1.fastq.gz \
fastq
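
The placeholders in the command above can be filled in once and reused. The sketch below only assembles and prints the command, using the example values from the variable list; it does not connect to anything. The shell variable names are illustrative and not part of the product, and the scp line is an assumed equivalent built from standard scp options (-i and -o), since this page gives no scp command. It also assumes the key path contains no spaces.

```shell
# Example values taken from the variable list above; replace with your own.
org_id="acme-bio"
workstation_id="exceptional-panda-g1i"
blueprint_id="python"
compute_cluster_id="us-west-2.aws"
ssh_private_key_file="$HOME/.ssh/id_rsa"

# Gateway address and TLS server name, assembled as in the command above.
gateway="${compute_cluster_id}.bench.deeporigin.io:2222"
server_name="${workstation_id}-${blueprint_id}.org-${org_id}"
proxy_command="openssl s_client -quiet -connect ${gateway} -servername ${server_name}"

# Remote shell that rsync is told to use via -e.
ssh_command="ssh -i ${ssh_private_key_file} -o \"ProxyCommand ${proxy_command}\""
remote="bench-user@${workstation_id}:/home/bench-user/sample_1.fastq.gz"

# Print the commands for review; nothing is transferred by this script.
printf "rsync -P -e '%s' %s %s\n" "$ssh_command" "$remote" "fastq"
# Assumed scp equivalent (not taken from this page).
printf "scp -i %s -o 'ProxyCommand %s' %s %s\n" "$ssh_private_key_file" "$proxy_command" "$remote" "fastq"
```

Copy the printed line into your terminal once the values look right.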

Download a directory from a workstation

Run the command below to copy the directory /home/bench-user/fastq on the workstation to the data directory on the local machine, replacing the following variables:

  • org-id: Replace with the ID of your organization, such as acme-bio.
  • workstation-id: Replace with the ID of your workstation, such as exceptional-panda-g1i.
  • blueprint-id: Replace with the ID of the blueprint, such as python.
  • compute-cluster-id: Replace with the ID of your compute cluster, such as us-west-2.aws.
  • ssh-private-key-file: Replace with the path to your private SSH key, such as ~/.ssh/id_rsa.
rsync
rsync -rP -e \
'ssh -i {ssh-private-key-file} -o "ProxyCommand openssl s_client -quiet -connect {compute-cluster-id}.bench.deeporigin.io:2222 -servername {workstation-id}-{blueprint-id}.org-{org-id}"' \
bench-user@{workstation-id}:/home/bench-user/fastq \
data

Leave the trailing slash off the remote path fastq. With a trailing slash, rsync copies only the contents of fastq into data, rather than the fastq directory itself.
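
The trailing-slash rule can be checked locally before running a long transfer. This is a minimal sketch that copies between temporary directories on one machine, so no workstation or SSH connection is involved; the directory names without_slash and with_slash are made up for the demonstration.

```shell
# Local-only demonstration of how a trailing slash changes what rsync copies.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/fastq" "$demo_dir/without_slash" "$demo_dir/with_slash"
touch "$demo_dir/fastq/sample_1.fastq.gz"

# No trailing slash: the fastq directory itself is created inside the destination.
rsync -r "$demo_dir/fastq" "$demo_dir/without_slash"

# Trailing slash: only the contents of fastq are copied into the destination.
rsync -r "$demo_dir/fastq/" "$demo_dir/with_slash"

ls "$demo_dir/without_slash"   # lists: fastq
ls "$demo_dir/with_slash"      # lists: sample_1.fastq.gz
```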

Helpful rsync options

A few particularly helpful rsync flags are listed below. Use man rsync to get a full explanation of all the options.

  • --update, -u
    • This forces rsync to skip any files which exist on the destination and have a modified time that is newer than the source file. (If an existing destination file has a modification time equal to the source file's, it will be updated if the sizes are different.)
  • -P
    • The -P option is equivalent to --partial --progress. Its purpose is to make it much easier to specify these two options for a long transfer that may be interrupted.
  • --partial
    • By default, rsync will delete any partially transferred file if the transfer is interrupted. In some circumstances it is more desirable to keep partially transferred files. Using the --partial option tells rsync to keep the partial file which should make a subsequent transfer of the rest of the file much faster.
  • --progress
    • This option tells rsync to print information showing the progress of the transfer.
  • --recursive, -r
    • This tells rsync to copy directories recursively.
  • --archive, -a
    • This is a quick way of saying you want recursion and want to preserve almost everything, such as permissions, modification times, and symlinks. It is equivalent to -rlptgoD.
  • --dry-run, -n
    • This makes rsync perform a trial run that doesn't make any changes (and produces mostly the same output as a real run). It is most commonly used in combination with the --verbose (-v) and/or --itemize-changes (-i) options to see what an rsync command is going to do before one actually runs it.
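
Several of these flags are often combined: a trial run first, then the real transfer. The sketch below shows that pattern on temporary local directories so it can be tried safely; for a workstation download, the same flags would be added to the rsync commands shown earlier. The variable names are illustrative.

```shell
# Local-only demonstration: preview a transfer, then run it for real.
src="$(mktemp -d)"
dest="$(mktemp -d)"
printf 'reads\n' > "$src/sample_1.fastq.gz"

# Trial run: -a (archive), -n (dry run), -v (verbose), -i (itemize changes).
# rsync reports what it would copy but writes nothing to the destination.
rsync -anvi "$src/" "$dest/"
dry_run_listing="$(ls -A "$dest")"   # still empty after the dry run

# Real run: -u skips files that are newer on the destination,
# and -P keeps partial files and shows progress.
rsync -auP "$src/" "$dest/"
ls -A "$dest"
```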

Downloading with WinSCP

WinSCP is a free tool for managing files on remote servers from a Windows computer. Please follow the instructions for uploading to set up a WinSCP connection.

Downloading when connected to VS Code over SSH

  1. Connect to your workstation with VS Code over SSH.
  2. Navigate to the location on the workstation with the file to download, either through the "Explorer" tab, or by selecting "File > Open Folder" from the menu.
  3. In the "Explorer" tab on the left side of the interface, right click the file and select "Download".
  4. Select a target location on your local computer and click "Download".
  5. The status bar at the bottom of the VS Code window will indicate the download status.

Downloading over the web

Many of the web endpoints provided for workstations have a way to download single files. These may be slower than SSH, generally don't offer the ability to download folders, and can't selectively download newer files or resume failed transfers.

Downloading files in VS Code server

  1. Open the VS Code server web endpoint from the workstation list.
  2. Navigate to the location on the workstation with the file to download, either through the "Explorer" tab, or by selecting "File > Open Folder" from the menu.
  3. In the "Explorer" tab on the left side of the interface, right click the file and select "Download".
  4. The file will appear in your browser's download queue.

Downloading files in JupyterLab

  1. Open the JupyterLab web endpoint from the workstation list.
  2. From the "File Browser" sidebar, locate the file you want to download, and right click it.
  3. Select "Download" from the menu.
  4. The file will appear in your browser's download queue.

Downloading files in RStudio Server

  1. Open the RStudio Server web endpoint from the workstation list.
  2. From the "File" sidebar, locate and select the files and folders you want to download. RStudio will create a zip file if the selection contains more than one file.
  3. Select More > Export from the toolbar.
  4. Specify the file name for the download and click OK.
  5. The file will appear in your browser's download queue.