I have been actively using CircleCI in my projects for years now and I love it! However, I'm ashamed to admit that most failed jobs took me up to an hour to get right. By the time I saw CI pass, I had grown an unrivaled hatred towards the color red.
In this post I would like to share a mind-blowing technique for debugging failed CircleCI jobs suggested by Ricardo Feliciano. As time passed, I adopted this technique even outside of CI, but I believe there are developers who are as alien to this approach as I once was. I will also add a few personal tips and tricks to make the most out of this topic. Let's learn how to resolve failed jobs in a fast and efficient manner.
Reproducing an issue
Reliable issue verification is an absolute must in order to resolve any problem. We follow steps not only to reproduce an issue, but also to verify that it has been successfully fixed. There is a myriad of things that can affect an issue's reproduction: environment, connection speed, asynchronicity, dark magic. When your CI runs remotely, all of those factors are, inevitably, at play.
Thus, whenever a remote job fails, it's in our best interest to match the context in which it failed as closely as possible in order to provide a viable fix. Luckily, there is a way to connect to the very machine that's running our job and explore its state. Enter SSH.
Secure Shell (SSH) is a network protocol that encrypts data sent over an insecure network. SSH establishes an encrypted channel between an authenticated client and a server, protecting any data that flows in between.
Originally designed by Tatu Ylönen at Helsinki University of Technology, SSH has been widely adopted, gradually becoming a standard in software engineering. Many of the servers behind the websites you visit today are managed over SSH, for purposes like remote administration or transferring files.
In this tutorial we are primarily interested in these applications of SSH:
- Logging in to a shell on a remote host (machine);
- Transferring files.
Connecting to a remote host via SSH is like opening a door: one requires a key. An SSH key allows a remote machine to recognize a connecting host and decide whether to grant access. For the purpose of this tutorial I presume that your CircleCI account is connected to your GitHub account. This means that GitHub will be responsible for providing your SSH key to CI.
You can skip this section if you are familiar with SSH keys and have an SSH key generated and connected to your CircleCI account.
Check existing SSH keys
First, check whether you already have any SSH keys on the SSH and GPG keys page in your GitHub account. If you do, that page lists your active SSH keys.
Alternatively, check for any existing SSH keys on your local machine by running:
$ ls -la ~/.ssh
Running the command above will list the available SSH keys, if any. SSH key files usually come in pairs, such as id_rsa (the private key) and id_rsa.pub (the public key).
In case you don't have any SSH keys, or would like to create a new one for the sake of this tutorial, please follow the instructions below.
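If you check for keys often, the lookup can be wrapped in a small function. This is only a sketch: list_public_keys is a hypothetical helper name of my own, not an OpenSSH command, and it assumes public keys follow the usual *.pub naming.

```shell
# Hypothetical helper (not part of OpenSSH): prints every public key file
# found in the given directory, which defaults to ~/.ssh.
list_public_keys() {
  local dir="${1:-$HOME/.ssh}"
  local key
  local found=0
  for key in "$dir"/*.pub; do
    # When nothing matches, the glob stays literal, so skip non-existent paths.
    [ -e "$key" ] || continue
    echo "$key"
    found=1
  done
  [ "$found" -eq 1 ] || echo "No public keys found in $dir"
}

list_public_keys
```

Pass a directory as the first argument to inspect a location other than ~/.ssh.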
Create a new SSH key
First, let's generate a new SSH key:
$ ssh-keygen -t rsa -b 4096 -C "email@example.com"
Use your GitHub user email.
-t, the type of key to create (rsa in the command above);
-b, the number of bits in the key (4096 in the command above);
-C, a key comment. This is a free-form label, conventionally your email, that helps you tell your keys apart.
The ssh-keygen command above will ask you where to save the newly created SSH key. You can accept the default location, which is the ~/.ssh directory. You may also choose to encrypt your new SSH key with a passphrase; just bear in mind that you will have to enter it each time you use that key.
Verify that a new SSH key has been created by running the following command:
$ ls -la ~/.ssh
# Should return the list of files that includes your newly created SSH key:
#
# -rw------- 1 username staff 1234 Feb 1 2020 id_rsa
# -rw-r--r-- 1 username staff 1234 Feb 1 2020 id_rsa.pub
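To experiment with key generation without touching your real keys, you can create a throwaway key pair in a temporary directory. The sketch below assumes ssh-keygen is installed. The make_test_key function and the key_dir variable are hypothetical names used only for illustration.

```shell
# Hypothetical helper: generates a throwaway RSA key pair without any prompts
# and prints the fingerprint of the resulting public key.
make_test_key() {
  local dir="$1"
  local comment="$2"
  if ! command -v ssh-keygen >/dev/null 2>&1; then
    echo "ssh-keygen is not installed" >&2
    return 1
  fi
  mkdir -p "$dir"
  # -f sets the output file, -N "" sets an empty passphrase, -q silences progress output.
  ssh-keygen -q -t rsa -b 4096 -C "$comment" -f "$dir/id_rsa" -N ""
  # -l prints the fingerprint of the key file given with -f.
  ssh-keygen -l -f "$dir/id_rsa.pub"
}

# Generate the key pair in a fresh temporary directory.
key_dir="$(mktemp -d)"
make_test_key "$key_dir" "email@example.com" || echo "could not generate a key" >&2
```

An empty passphrase is acceptable for a disposable key like this one. For the key you actually use with GitHub, a passphrase is the safer choice.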
Add SSH key to ssh-agent
Regardless of which SSH key you've decided to use, it must be added to your ssh-agent. Think of the SSH agent as a keychain that stores your SSH keys.
Start the agent in the background:
$ eval "$(ssh-agent -s)"
# Agent pid 12345
Next, edit your ~/.ssh/config file by adding the following section (UseKeychain is a macOS-specific option; omit it on other systems):
Host *
  AddKeysToAgent yes
  UseKeychain yes
  # Point to the generated SSH key
  IdentityFile ~/.ssh/id_rsa
Lastly, add the SSH key to the agent:
$ ssh-add -K ~/.ssh/id_rsa
-K is a macOS-specific option that tells ssh-add to store your key's passphrase in the keychain. Skip this option when using a different OS.
Verify that you have got your SSH key loaded to the agent by running this command:
$ ssh-add -l
# 4096 SHA256:03eb924754b981b2aed90b699ff513cb0b6f4f93c35 firstname.lastname@example.org (RSA)
Add SSH key to GitHub
Running CircleCI job with SSH
Once our SSH setup is done, let's switch to the CI part. In order to ssh into a remote host responsible for a particular job, that job must be run with SSH first. You can do that directly from the CircleCI UI by following these instructions:
- Open a failed job's detail page.
- Locate the "Rerun workflow" button.
- Open the dropdown and choose the "Rerun job with SSH" option.
Clicking on the option will run the respective job anew, but this time CircleCI will enable SSH access to the remote machine. You will notice a new step that describes the details of that connection.
Those details are likely to be different in your case, but your primary interest should be the machine's host (126.96.36.199) and port number (64536). Remember those. CircleCI will also print a shorthand connection command, which we are going to use in the next step.
Into the remote!
Using default key
(Recommended) If your main SSH key is the one associated with your GitHub account, execute the SSH command CircleCI printed out in the "Enable SSH" step:
$ ssh -p 64536 126.96.36.199
Using explicit key
If you have multiple SSH keys, you have to specify which one to use to connect to the remote machine. Provide the path to the same SSH key you use for GitHub as the value of the -i flag:
$ ssh -i ~/.ssh/id_rsa -p 64536 126.96.36.199
On a UNIX-based computer, you can connect to the remote SSH server by running the command in your terminal:
$ ssh -p 64536 126.96.36.199
Answer "yes" if prompted with "Are you sure you want to continue connecting". Once connected, you will notice your terminal's prompt changing to the remote machine's name. Any commands issued in this session are executed on the remote machine. For example, we can run the entire build again, or each command in isolation (e.g. npm test or npm run build).
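If you only need to run a single command, ssh also accepts the command as a trailing argument, so you do not have to open an interactive session. The sketch below wraps that in a function. The ci_run name and the DRY_RUN switch are hypothetical conveniences of my own, and the host, port and ~/release directory are the example values from this article, so yours will differ.

```shell
# Hypothetical helper: runs one command inside the job's working directory
# on the remote machine. With DRY_RUN=1 it prints the ssh invocation instead.
ci_run() {
  local host="$1"
  local port="$2"
  local workdir="$3"
  shift 3
  # The remaining arguments form the command to execute remotely.
  local remote_cmd="cd $workdir && $*"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'ssh -p %s %s %s\n' "$port" "$host" "'$remote_cmd'"
  else
    ssh -p "$port" "$host" "$remote_cmd"
  fi
}

# Preview the command without connecting anywhere.
DRY_RUN=1 ci_run 126.96.36.199 64536 '~/release' npm test
# prints: ssh -p 64536 126.96.36.199 'cd ~/release && npm test'
```

Keep in mind that a non-interactive session may not load the same shell profile as an interactive one, so tools such as npm might not be on the PATH. If that happens, fall back to the interactive session.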
By default you are going to land in the /home/circleci/ directory. Depending on your CircleCI configuration you will have a different directory tree there. In my case I have configured my job to run in the ~/release directory:
version: 2
jobs:
  build:
    working_directory: ~/release
    steps:
      - checkout
      - ...
Pay attention to your setup and directory structure when applying commands used in this article. I am going to list those according to my setup to remain consistent.
Taking a snapshot
Working with a remote file-system is helpful, but you might quickly find yourself limited. It is a different environment that lacks your favorite IDE and other helpful tools you use for debugging. It may be helpful to know how to download directories and files from the remote machine.
Although you should strive towards your project being reproducible, I highly recommend downloading a complete snapshot of the file-system, including installed dependencies. This way you eliminate any possible deviations and operate on a 1:1 copy of your project from the failed job. Depending on the size of the project and its dependencies, transferring the entire directory may take a significant amount of time. We can compress the working directory into a tarball to decrease its size and make the download faster.
While on the remote machine, let's compress the working directory into a tarball archive:
$ tar -czvf snapshot.tar.gz ./release
-c, create an archive;
-z, compress the archive using gzip;
-v, display progress in the terminal (verbose);
-f, the file name of the archive.
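Before downloading, it can be worth confirming what actually ended up in the archive. Here is a minimal sketch that creates the tarball and reports its entry count and size. make_snapshot is a hypothetical helper name, and the paths in the usage comment simply follow my setup.

```shell
# Hypothetical helper: archives a directory and reports what was captured.
make_snapshot() {
  local src="$1"
  local archive="$2"
  # -C switches to the parent directory so the archive stores relative paths.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
  # -t lists the archive contents without extracting them.
  echo "entries: $(tar -tzf "$archive" | wc -l | tr -d ' ')"
  echo "size: $(du -h "$archive" | cut -f1)"
}

# Usage on the remote machine (paths follow this article's setup):
# make_snapshot /home/circleci/release /home/circleci/snapshot.tar.gz
```

Write the archive outside of the directory being archived, otherwise tar would try to include the archive in itself.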
After the archive is created, you can transfer snapshot.tar.gz using SCP (Secure Copy Protocol). Open a new terminal window, because you need to be on the local machine for this step.
$ scp <OPTIONS> <USER>@<HOST>:/<SRC_PATH> <DEST_PATH>
$ scp -P 64536 circleci@126.96.36.199:/home/circleci/snapshot.tar.gz ~/Desktop
-P, the connection port to use (default is 22).
This command will copy the file at /home/circleci/snapshot.tar.gz from the remote machine to your local ~/Desktop. Unarchive the snapshot of the CI and debug it as if it were a regular folder, because it is now.
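Unpacking can be sketched as follows. The function first lists the archive, which fails on a truncated download, and only then extracts it. unpack_snapshot is a hypothetical helper name, and the destination path in the usage comment is just an example.

```shell
# Hypothetical helper: checks that the archive is readable, then extracts it
# into the destination directory and lists the result.
unpack_snapshot() {
  local archive="$1"
  local dest="$2"
  # Listing the contents fails if the download was cut short or corrupted.
  if ! tar -tzf "$archive" >/dev/null; then
    echo "Archive $archive is unreadable" >&2
    return 1
  fi
  mkdir -p "$dest"
  # -x extracts, -z decompresses gzip, -f names the archive, -C sets the target directory.
  tar -xzf "$archive" -C "$dest"
  ls "$dest"
}

# Usage on the local machine (example paths):
# unpack_snapshot ~/Desktop/snapshot.tar.gz ~/Desktop/snapshot
```

Extracting into a dedicated directory keeps the snapshot separate from your local checkout of the project.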
Thanks for reading through this article! I hope it will be of good use to you when debugging the next failed CI. Let me know your thoughts on the topic on Twitter!