Rebuilding Jenkins from scratch

Restoring an existing Jenkins instance

If you’re restoring a Jenkins instance that’s disappeared for some reason, you can reuse the existing jenkins Terraform module definition in the digitalmarketplace-aws repo.

Run AWS_PROFILE=main-infrastructure make plan then AWS_PROFILE=main-infrastructure make apply from the main account folder.

Creating a new Jenkins instance

To create a new Jenkins instance (e.g. for testing), you’ll need to:

  • Create the AWS resources for the new Jenkins instance using Terraform.
  • Set up authorisation for the new instance
  • Provision the Jenkins app using Ansible
  • Set up a new URL for the instance
  • Copy any data (e.g. build history) from the old instance to the new instance

Warning

These steps assume that the shared AWS infrastructure for all Jenkins instances (the IAM profile/policy document, ELB certificate, EBS snapshot policy and the S3 bucket to store access logs) are already set up for the account. If you need to rebuild the shared resources, check with the team first.

Creating the AWS resources for the new instance

The infrastructure on which each Jenkins instance relies is created by the jenkins module of the main AWS account. The module can be instantiated in the main account as many times as we like (typically once).

Define a new Jenkins module in the main account’s jenkins.tf, with a unique name to allow Terraform to namespace resources:

module "jenkins2_the_jenkinsening" { ...

The module will require the following variables to be defined:

  • source: The relative path to the Jenkins module.
  • name: A unique name for the AWS resources, to allow namespacing. A good value for this would be the module block name.
  • dev_user_ips: A list of IP addresses to whitelist in the security groups. Automatically injected from the credentials repo using var.dev_user_ips.
  • jenkins_public_key_name: The name of the key pair that will be put on the EC2 instance to allow access. This can be reused. If you use the existing variable var.jenkins_public_key you will be able to reuse the private key defined in our credentials repo. Alternatively, generate a new key pair and use its name here.
  • jenkins_instance_profile: The name of the shared instance profile that gives Jenkins its permissions (this should exist already).
  • jenkins_wildcard_elb_cert_arn: The ARN of the shared ELB certificate defined in the main account. Terraform grabs this automatically with aws_acm_certificate.jenkins_wildcard_elb_certificate.arn.
  • ami_id: The ID of the machine image to base the instance on. If you’re upgrading the server to a new operating system, or fixing security issues, this is what you’ll want to update. New images can be found here.
  • instance_type: The type of EC2 instance to use. Currently we use t3.large.
  • dns_zone_id: The ID of our DNS zone. Terraform grabs this automatically with aws_route53_zone.marketplace_team.zone_id.
  • dns_name: The DNS address of the new instance, for example ci2.marketplace.team.
  • log_bucket_name: The name of the shared S3 bucket that Jenkins should log to (this should exist already).
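For illustration, a complete module block might look something like the following. All concrete values here are placeholders or assumptions (the source path, instance profile name, AMI ID and log bucket name in particular); check the real jenkins.tf for current values.

```terraform
module "jenkins2_the_jenkinsening" {
  source                        = "../../modules/jenkins"    # hypothetical relative path
  name                          = "jenkins2_the_jenkinsening"
  dev_user_ips                  = "${var.dev_user_ips}"
  jenkins_public_key_name       = "${var.jenkins_public_key}"
  jenkins_instance_profile      = "jenkins-instance-profile"  # hypothetical; use the shared profile's real name
  jenkins_wildcard_elb_cert_arn = "${aws_acm_certificate.jenkins_wildcard_elb_certificate.arn}"
  ami_id                        = "ami-0123456789abcdef0"     # placeholder AMI ID
  instance_type                 = "t3.large"
  dns_zone_id                   = "${aws_route53_zone.marketplace_team.zone_id}"
  dns_name                      = "ci2.marketplace.team"
  log_bucket_name               = "example-jenkins-logs"      # hypothetical bucket name
}
```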

From the main account Terraform project, digitalmarketplace-aws/terraform/accounts/main, run:

$ make plan

This will plan the new module and output to stdout what it’s going to do. Check it carefully to make sure everything looks correct. If all looks good, run:

$ make apply

This will cause Terraform to actually create the module. Cross your fingers.

If there are any errors, Terraform will report them; fix them, then plan and apply again. Otherwise, all the new infrastructure should be up and running. You may need to wait a little while for DNS records to propagate, but you should be able to ssh into the new instance immediately using the elastic IP address created (find it through the AWS console).

You should now have created:

  • a new EC2 instance
  • a new elastic load balancer (ELB), that uses the shared certificate
  • a new DNS ‘A’ record in Route 53
  • new security groups for the EC2 and ELB instances

Creating a new OAuth app in GitHub

  • Log into GitHub as the dm-ssp-jenkins user. Credentials are in the credentials repo in pass/github.com/jenkins-ci. You will need someone with 2FA for dm-ssp-jenkins handy.

  • Go to Settings / Developer Settings / OAuth Apps and click New OAuth App.

  • Give it a friendly name. Something like Jenkins 2 OAuth app.

  • Set Homepage URL as the full URL to the new instance, for example https://ci2.marketplace.team

  • Application description is optional.

  • Set Authorization callback URL to <Homepage URL>/securityRealm/finishLogin. To follow the example from above you would use https://ci2.marketplace.team/securityRealm/finishLogin

  • Click Register application and take note of the provided Client ID and Client Secret.

  • Add the client ID and client secret for the app to digitalmarketplace-credentials/jenkins-vars/jenkins.yaml. They need to be stored as a nested dict under jenkins_github_auth_by_hostname, with the host name for the new instance as the key, alongside the existing application. For example:

    jenkins_github_auth_by_hostname:
      ci2.marketplace.team:
        client_id: <CLIENT-ID>
        client_secret: <CLIENT-SECRET>
      ci.marketplace.team:
        .....
    
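The Authorization callback URL from the steps above is derived mechanically from the homepage URL; a trivial shell sketch, using the same example hostname as above:

```shell
# The callback URL is always the homepage URL plus Jenkins' fixed
# /securityRealm/finishLogin path.
homepage_url="https://ci2.marketplace.team"
callback_url="${homepage_url}/securityRealm/finishLogin"
echo "$callback_url"
```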

Provision the box with Ansible

In the digitalmarketplace-jenkins repo, update /playbooks/hosts with:

  • The URL of the new instance
  • The data volume ID. Usually this will be jenkins_data_device=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_<VOLUME_ID_WITHOUT_HYPHENS>
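The jenkins_data_device value can be derived from the volume ID by stripping its hyphens; a small bash sketch with a made-up volume ID:

```shell
# Derive the /dev/disk/by-id path for an EBS volume attached via NVMe.
# The volume ID below is a placeholder; use the real one from the AWS console.
volume_id="vol-0123456789abcdef0"
jenkins_data_device="/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_${volume_id//-/}"
echo "$jenkins_data_device"
```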

If you used a new ssh key pair when defining jenkins_public_key_name, you will need to update the Makefile’s make jenkins recipe so that it copies your new private key to the file referenced by the PRIVATE_KEY_FILE variable. If you used the existing key, do nothing.

Run the full Ansible playbook from the root of the Jenkins repo as follows:

$ make jenkins TAGS=all

This will start the Ansible playbook for Jenkins and apply all configuration to the instance. Answer “yes” if prompted by ssh about host authenticity. The process may take around 20 minutes, so get some biscuits.

When running TAGS=all, all jobs will be disabled by default. This is so jobs don’t run unexpectedly once Jenkins is up and running.

Once Ansible is finished, you should be able to access the new Jenkins at the URL you defined earlier.

Checking everything is working and secure

When you login to the new Jenkins server for the first time, there are some things you’ll want to check:

  • Are all the jobs disabled? When you’re first setting up a new Jenkins instance you probably don’t want it to start building things right away. The Ansible playbook should have disabled all jobs, but it’s worth checking before you get some nasty surprises!
  • Is the server accessible outside of the GDS building? It shouldn’t be! See if you can access it when your device is connected to the GovWifi network.
  • You might need to dismiss the “access control for builds” warning (see this screenshot for an example). This warning isn’t relevant for us, as we don’t use access control for Jenkins users.

Copying build history from an old instance to a new instance

It’s important that we maintain the history of some of our jobs, if possible, for audit purposes. We can get the builds of these jobs onto a new box by copying certain files over. We don’t have an automated way of doing this at the moment; the method below is how it has been done before, and there may (read: probably) be better ways.

  • ssh on to the new instance (let’s assume it’s using ci2 as the subdomain):

    $ ssh -i <path to private key> ubuntu@ci2.marketplace.team
    
  • Switch to the jenkins user:

    $ sudo su - jenkins
    
  • Generate a new key pair; the following will work:

    $ ssh-keygen -o -a 100 -t ed25519 -f ~/.ssh/my_cool_keypair
    
  • Copy the public part of the new key pair. You’ll find it in ~/.ssh/my_cool_keypair.pub

  • In a new terminal, ssh on to the old Jenkins instance and switch to the jenkins user.

  • Add the public key you just copied to the end of ~/.ssh/authorized_keys. Create the file if it doesn’t exist. Be careful about accidentally including extra whitespace!

  • Return to the terminal logged in to the new instance.

  • The following script should be available at /data/jenkins/jobs/copy_old_builds.sh. You will need to update it to point to the private key file you just created, and to the hostname of the source server. If the script isn’t there, copy the version below, save it to the new instance, and make it executable with chmod 700 copy_old_builds.sh:

    #!/usr/bin/env bash
    
    set -o errexit
    
    for d in {release-*,functional-tests-*,visual-regression-*,build-image,database-migration-paas,tag-application-deloyment,clean-and-apply-db-dump-*,index-services-*,index-briefs-*,update-*-index-alias,data-retention-*,database-backup,rotate-api-tokens,rotate-production-notify-callback-token}/; do
      rsync -vazh -e "ssh -i ~/.ssh/my_cool_keypair" --progress --include 'builds**' --include 'last*' --include 'nextBuildNumber' --exclude '*' jenkins@ci.marketplace.team:/data/jenkins/jobs/"$d" /data/jenkins/jobs/"$d"
    done
    
  • If you want to copy the history of other jobs, add the glob pattern to those defined in the for loop.

  • Ensure that the ssh key used by rsync is the private key just created, and the URL is for the old instance.

  • Run the script:

    $ ./copy_old_builds.sh
    
  • Once done, the history of the old jobs should appear on the new instance. Any pipelines will need to be restarted, as their current state doesn’t seem to transfer across.
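The job selection in copy_old_builds.sh relies on bash brace expansion combined with pathname globbing: each comma-separated pattern expands to a separate glob, which then matches the job directories. A quick local illustration, using made-up job directory names in a throwaway directory:

```shell
#!/usr/bin/env bash
# Demonstrates the {a-*,b-*}/ pattern style used by copy_old_builds.sh,
# against throwaway directories (all names here are made up).
set -o errexit

demo_dir=$(mktemp -d)
mkdir -p "$demo_dir"/{release-api,functional-tests-smoke,some-other-job}
cd "$demo_dir"

matched=""
for d in {release-*,functional-tests-*}/; do
  matched="$matched $d"
  echo "would copy builds from: $d"
done
```

Only release-api/ and functional-tests-smoke/ are matched; some-other-job/ is ignored because it fits neither pattern.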

Changing the URL for the new instance

  • You may want to change the URL of the new instance. We normally use ci.marketplace.team.
  • If so, update the dns_name variable in the new module definition in the jenkins.tf file of the main account. If another module is already using this name, you may need to change or remove it first.
  • Run make plan then make apply.
  • It may take a few minutes for DNS settings to propagate.