What I learned from the CI of Facebook and their Open Source Projects
October 05, 2018 | Aurélien Le Masson | 8 min read
Stop wasting your time on tasks your CI could do for you.
Find 4 tips on how to better use your CI in order to focus on what matters - and what you love: code. Let's face it: as a developer, a huge part of the value you create is your code.
Note: Some of these tips use the GitHub / CircleCI combo. Don't leave yet if you use Bitbucket or Jenkins! I use GitHub and CircleCI on my personal and work-related projects, so they are the tools I know best. But most of these tips can be set up with any CI on the market.
Tip 1: Automatic Changelogs
I used to work on a library of reusable React components, like Material UI. Several teams were using components from our library, and with our regular updates, we were wasting a lot of time writing changelogs. We decided to use Conventional Commits. Conventional Commits is a fancy name for commits with a standardized message:
The standard format is “TYPE(SCOPE): DESCRIPTION OF THE CHANGES”.
TYPE can be
- feat: a new feature on your project
- fix: a bugfix
- docs: update documentation / Readme
- refactor: a code change that neither fixes a bug nor adds a feature
- or others…
SCOPE (optional parameter) describes what part of your codebase is changed within the commit.
DESCRIPTION OF THE CHANGES is pretty much what you would write in a “traditional” commit message. However, you can use keywords in your commit message to add more information. For instance:
fix(SomeButton): disable by default to fix IE7 behaviour

BREAKING CHANGE: prop `isDisabled` is now mandatory
Why is this useful? Three main reasons:
- Allow scripts to parse the commit names, and generate changelogs with them
- Help developers think about the impact of their changes (Does my feature add a Breaking Change?)
- Allow scripts to choose the correct version bump for your project, depending on “how big” the changes in a commit are (bugfix: x.y.Z, feature: x.Y.z, breaking change: X.y.z)
This standard version bump calculation is called Semantic Versioning. Depending on the version bump, you can anticipate the impact on your app and the amount of work needed.
Be careful though! Not everyone follows this standard, and even those who do can miss a breaking change! You should never update your dependencies without testing everything is fine 😉
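To make the version bump rule above concrete, here is a minimal sketch in Python. It is not how any real release tool is implemented: the function names `bump_for` and `next_version` are made up for this example, and it assumes that only `feat`, `fix` and the `BREAKING CHANGE:` keyword trigger a release.

```python
import re

# Header format from the article: TYPE(SCOPE): DESCRIPTION, with SCOPE optional.
HEADER = re.compile(r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<description>.+)$")


def bump_for(message):
    """Return 'major', 'minor', 'patch' or None for one commit message."""
    lines = message.strip().splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        return None  # not a Conventional Commit: ignored in this sketch
    if "BREAKING CHANGE:" in message:
        return "major"
    if match["type"] == "feat":
        return "minor"
    if match["type"] == "fix":
        return "patch"
    return None  # assumption: docs, refactor, ... do not trigger a release


def next_version(version, messages):
    """Apply the biggest bump found in `messages` to an X.Y.Z version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    bumps = {bump_for(message) for message in messages}
    if "major" in bumps:
        return f"{major + 1}.0.0"
    if "minor" in bumps:
        return f"{major}.{minor + 1}.0"
    if "patch" in bumps:
        return f"{major}.{minor}.{patch + 1}"
    return version


if __name__ == "__main__":
    print(next_version("1.4.2", ["fix: handle empty label", "feat: add an awesome feature"]))
```

With a bugfix and a feature in the same release, the feature wins and only the middle number is bumped.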
How to set up Conventional Commits
- Install Commitizen
- Install semantic-release
- Add GITHUB_TOKEN and NPM_TOKEN to the environment variables of your CI
- Add `npx semantic-release` after the bundle & tests steps on your CI master/production build
- Use `git cz` instead of `git commit` to get used to the commit message standard
- Squash & merge your feature branch on master/production branch
When you get used to the commit message standard, you can go back to `git commit`, but remember the format! (e.g. `git commit -m "feat: add an awesome feature"`)
Now, every developer working on your codebase will create changelogs without even noticing it. Plus, if your project is used by others, they only need a glance at your package version/changelog to know what changes you’ve made, and if they are Breaking.
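If you wonder how a script can turn commit names into a changelog, here is a rough sketch in Python. It only illustrates the idea and does not reproduce the output of any real tool: the `render_changelog` function, the section titles and the sample commit subjects are all assumptions made for this example.

```python
import re
from collections import defaultdict

HEADER = re.compile(r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<description>.+)$")

# Section titles are an assumption of this sketch, not a standard.
SECTIONS = {"feat": "Features", "fix": "Bug Fixes"}


def render_changelog(version, subjects):
    """Group commit subjects by type and render a plain-text changelog entry."""
    grouped = defaultdict(list)
    for subject in subjects:
        match = HEADER.match(subject)
        if match is None or match["type"] not in SECTIONS:
            continue  # skip non-conventional commits and types without a section
        scope = f"{match['scope']}: " if match["scope"] else ""
        grouped[match["type"]].append(f"- {scope}{match['description']}")

    lines = [f"Version {version}"]
    for commit_type, title in SECTIONS.items():
        if grouped[commit_type]:
            lines += ["", title, *grouped[commit_type]]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical commit subjects, for illustration only.
    print(render_changelog("1.5.0", [
        "feat(SomeButton): add loading state",
        "fix: handle empty label",
        "docs: update readme",
    ]))
```

Commits that do not follow the format are simply left out, which is one more reason to squash and merge with a clean commit name.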
Tip 2a: Run parallel tasks on your CI
Why do I say task instead of tests? Because a CI can do a lot more than run tests! You can:
- Generate automatic changelogs 😉 and version your project
- Build and push the bundle on a release branch
- Deploy your app
- Deploy your documentation site
There are several ways to use parallelism to run your tasks.
The blunt approach
This simply consists of using the built-in parallelism of your tasks, combined with a multi-thread CI container.
With Jest, you can choose the number of workers to use for your tests with the `--maxWorkers` flag.
With pytest, try pytest-xdist and its `-n` flag to split your tests across multiple CPUs.
Another way of parallelizing tests is to split the test files between your CI containers, as React does. However, I won't cover this approach in this article since the correct way of doing it is nicely explained in the CircleCI docs.
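Just to give an intuition of what splitting test files between containers means, here is a naive sketch in Python. It is a plain round-robin split, not the method recommended in the CircleCI docs. It assumes the `CIRCLE_NODE_INDEX` and `CIRCLE_NODE_TOTAL` environment variables that CircleCI sets on parallel jobs; the `files_for_container` function and the file names are hypothetical.

```python
import os


def files_for_container(test_files, node_index, node_total):
    """Return the slice of test files that one container should run."""
    if not 0 <= node_index < node_total:
        raise ValueError("node_index must be in [0, node_total)")
    # Sorting makes every container compute the same partition.
    return sorted(test_files)[node_index::node_total]


if __name__ == "__main__":
    # The defaults let the script run locally as a single "container".
    index = int(os.environ.get("CIRCLE_NODE_INDEX", "0"))
    total = int(os.environ.get("CIRCLE_NODE_TOTAL", "1"))
    # Hypothetical file names, for illustration only.
    files = ["test_button.py", "test_modal.py", "test_form.py", "test_table.py"]
    print(" ".join(files_for_container(files, index, total)))
```

Each container runs the same script and gets a different, non-overlapping subset. The weakness of this naive version is that it ignores how long each file takes to run.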
Tip 2b: CircleCI Workflows
With Workflows, we reduced our CI build time by roughly 25% on feature branches (from 11 min to 8 min 30 s) and by 30% on our master branch (from 16 min 30 s to 11 min 30 s). With an average of 7 features merged on master a day, this is 1 hour and 30 minutes less waiting every day for our team.
Workflows are a feature of CircleCI. Group your tasks in Jobs, then order your Jobs however it suits your project best. Let's imagine you are building a library of reusable React components (huh, I think I've already read that somewhere...). Your CI:
- Sets up your project (maybe spawns a Docker container, installs your dependencies, builds your app)
- Runs unit/integration tests
- Runs E2E tests
- Deploys your Storybook
- Publishes your library
Each of those bullet points can be a Job: it may have several tasks in it, but all serve the same purpose. But do you need to wait for your unit tests to pass before launching your E2E tests? Those two jobs are independent and could be running on two different machines.
With Workflows, it is pretty straightforward to re-order jobs or add dependencies between them.
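To see why running independent jobs side by side shortens the build, here is a small sketch in Python that models the five jobs listed above as a dependency graph. It is not CircleCI configuration. The durations are invented, the job names and the `workflow_duration` function are hypothetical, and it assumes that a machine is always free for a job that is ready to run.

```python
from functools import lru_cache

# Hypothetical job durations in minutes; measure your own builds.
DURATIONS = {"setup": 3, "unit_tests": 4, "e2e_tests": 6, "storybook": 2, "publish": 1}

# Each job lists the jobs it has to wait for.
REQUIRES = {
    "setup": [],
    "unit_tests": ["setup"],
    "e2e_tests": ["setup"],
    "storybook": ["unit_tests", "e2e_tests"],
    "publish": ["unit_tests", "e2e_tests"],
}


def workflow_duration(durations, requires):
    """Wall-clock time of the workflow, assuming a free machine for every ready job."""

    @lru_cache(maxsize=None)
    def finish_time(job):
        start = max((finish_time(dep) for dep in requires[job]), default=0)
        return start + durations[job]

    return max(finish_time(job) for job in durations)


sequential = sum(DURATIONS.values())
parallel = workflow_duration(DURATIONS, REQUIRES)
print(f"sequential: {sequential} min, workflow: {parallel} min")
```

The workflow is only as long as its slowest chain of dependent jobs. Note that this model ignores the time spent sharing state between machines.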
Note: Having trouble setting up a workflow? You can SSH into the machine during the build.
But be careful with parallelism: resources are not unlimited. If you share your CI plan with other teams in your organization, make sure that using more resources for parallelism will not be counter-productive at a larger scale. For example, 2 machines running for 10 minutes consume 20 machine-minutes, which is more than 1 machine running for 15 minutes.
Plus, sharing the Workspace (the current state) of one machine with the others (e.g. after running `yarn`, so that your dependencies are installed for every job) costs time, both when saving the state on the first machine and when loading it on the others.
So, when should I parallelize my CI tasks?
The most precise rule would be to split a job when jobDuration > (nb_containers_available * workspaceSharingDuration).
If you want to remember something simpler, a good rule of thumb is to always merge jobs whose duration is under 1 minute.
Workspace sharing can take up to a minute for a large codebase. You should try several workflow configurations to find what's best for you.
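The formula above and the machine-time trade-off fit in a few lines of Python. This is only a sketch: the function names are made up, and the numbers are examples, not measurements.

```python
def should_split(job_duration, nb_containers_available, workspace_sharing_duration):
    """Apply the formula above; all durations must use the same unit."""
    return job_duration > nb_containers_available * workspace_sharing_duration


def machine_minutes(nb_machines, minutes_each):
    """Total CI capacity consumed, which matters when the plan is shared."""
    return nb_machines * minutes_each


if __name__ == "__main__":
    # A 10-minute job, 4 containers, 1 minute to share the workspace: worth splitting.
    print(should_split(10, 4, 1))
    # A 3-minute job with the same setup: keep it merged.
    print(should_split(3, 4, 1))
    # 2 machines for 10 minutes cost more capacity than 1 machine for 15 minutes.
    print(machine_minutes(2, 10), machine_minutes(1, 15))
```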
Tip 3: Set up cron(tab)s
Crontabs help make your CI more reliable without making builds longer.
- Want to run in-depth performance tests that need to send requests to your app? Schedule it for night time with a cron!
- Want to publish a new version of your app every week? Cron.
- Want to train your ML model but it takes hours? Your CI could trigger the training every night.
Some of you may wonder: what is a cron/crontab? Cron is a job scheduler whose name is usually traced to chronos, the Greek word for time; a crontab (short for "cron table") is the file that lists the scheduled jobs. A cron job executes a series of instructions at a given time. It can be once an hour, once a day, once a year...
I worked on a project in finance linking several sources of data and APIs. Regression was the biggest fear of our client. If you give a user outdated or incorrect info, global financial regulators could issue you a huge fine.
Therefore, I built a tool to generate requests with randomized parameters (country, user profile...), play them, and check for regressions. The whole process can take an hour. We run it via our CI, daily, at night, and it saved the client a lot of trouble.
You can easily set up crons on CircleCI if you've already tried Jobs/Workflows. Check out the documentation.
Note: Crons use the POSIX crontab syntax (five fields: minute, hour, day of month, month, day of week), which can be a bit tricky at first. Check out this neat Crontab Tester tool to get used to it!
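To get a feel for the notation, here is a small sketch in Python that checks whether a five-field expression fires at a given moment. It is a simplified reading of the syntax, not a replacement for a real cron: the `expand` and `matches` functions are made up for this example, and they only understand `*`, single numbers, ranges like `1-5` and lists like `0,30` (no names and no step values).

```python
from datetime import datetime

# (lowest, highest) allowed value for each of the five fields, in order:
# minute, hour, day of month, month, day of week (0 = Sunday).
RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand(field, low, high):
    """Expand one field ('*', '5', '1-5' or '0,30') into a set of integers."""
    if field == "*":
        return set(range(low, high + 1))
    values = set()
    for part in field.split(","):
        start, _, end = part.partition("-")
        first, last = int(start), int(end or start)
        if not low <= first <= last <= high:
            raise ValueError(f"{part!r} is outside {low}-{high}")
        values.update(range(first, last + 1))
    return values


def matches(expression, moment):
    """Tell whether a five-field cron expression fires at `moment`."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields: minute hour day-of-month month day-of-week")
    minute, hour, dom, month, dow = (
        expand(field, low, high) for field, (low, high) in zip(fields, RANGES)
    )
    weekday = moment.isoweekday() % 7  # Python: Monday=1..Sunday=7; cron: Sunday=0
    if fields[2] != "*" and fields[4] != "*":
        # Usual cron behaviour: when both day fields are restricted, either may match.
        day_ok = moment.day in dom or weekday in dow
    else:
        day_ok = moment.day in dom and weekday in dow
    return moment.minute in minute and moment.hour in hour and moment.month in month and day_ok


if __name__ == "__main__":
    # "0 2 * * *" means every day at 02:00.
    print(matches("0 2 * * *", datetime(2018, 10, 5, 2, 0)))
```

So a nightly job at 2 a.m. is written `0 2 * * *`, and `30 3 * * 1-5` means 3:30 a.m. from Monday to Friday.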
- Learn Shell! All Continuous Integration / Continuous Delivery systems can run Shell scripts. Don't be afraid to sprinkle some scripts in your build! Add quick checks between/during tasks to make debugging easier, or make your build fail faster: you don't want to wait for the full 10 minutes when you can check after 2 min 30 s that your lock file is not up-to-date!
- Use cache on your project dependencies!
- Add an extra short task to your CI to connect useful tools like Codecov.io or Danger.
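About caching dependencies: the usual idea is to derive the cache key from your lock file, so the cache is reused as long as the dependencies do not change. Here is a minimal sketch of that idea in Python. The `cache_key` function and the `deps-v1` prefix are assumptions for this example; your CI most likely offers a built-in way to do the same thing.

```python
import hashlib
from pathlib import Path


def cache_key(lock_file, prefix="deps-v1"):
    """Build a cache key that changes whenever the lock file content changes."""
    digest = hashlib.sha256(Path(lock_file).read_bytes()).hexdigest()
    # A truncated digest keeps the key short; 16 hex characters is an arbitrary choice.
    return f"{prefix}-{digest[:16]}"


if __name__ == "__main__":
    import sys

    # Usage: python cache_key.py yarn.lock
    print(cache_key(sys.argv[1]))
```

Bumping the prefix (for instance to `deps-v2`) is a simple way to throw away a corrupted cache without touching the lock file.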
If you have any other tip you would like to share, don't hesitate!
Aurélien Le Masson
Architect Developer - Python & CI Enthusiast