Overview
Initially, Linear's infrastructure was concentrated in a single location: Google Cloud's us-east1. While this configuration served most users well, it presented long-term challenges. We identified two primary reasons to diversify our data hosting locations.
First, having a separate region with a full instance of the Linear application makes future scaling simpler. If we can host some workspaces in a particular infrastructure deployment (application servers, databases, etc.), then we can add other regions behind the scenes in the future to avoid hitting scaling limits on, for instance, the size of our primary Postgres server.
Since Linear's early days, we've sought to invest in our foundation preemptively, always keeping an eye on the bottlenecks we might encounter in the future. This lets us build out the best possible infrastructure and application framework, without being forced to rush in sub-par solutions to scaling issues once we're already hitting those bottlenecks. This is also why we decided to tackle multi-region infrastructure earlier than companies typically do.
The Challenge
- When creating a shared record, we always create it in the authentication service first, and use the returned ID to create a corresponding record in the regional database. This is to ensure that any uniqueness constraints (such as on a workspace's URL key) are applied globally first.
- Deleting works in the same way as creating records, with an additional fallback using Postgres triggers to create an audit log of deleted records, which accounts for records that are deleted due to a foreign key cascade from another table.
- For updating records, we already have a lot of logic around creating efficient updates for clients using our sync engine. We were able to reuse this to also schedule an asynchronous update to the authentication service with the new data. This keeps updates simple for developers: they just update the record in the regional service as normal.
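The creation order described in the list above can be sketched as follows. This is a minimal illustration, not Linear's actual code: `AuthService`, `RegionalDatabase`, `createSharedWorkspace`, and the in-memory maps are all hypothetical stand-ins, and the real services communicate over the network rather than through direct method calls.

```typescript
// Hypothetical in-memory stand-ins for the authentication service and a
// regional database. All names here are assumptions for illustration.
type Region = "us" | "eu";

interface WorkspaceInput {
  urlKey: string;
  name: string;
  region: Region;
}

class AuthService {
  private nextId = 1;
  private urlKeys = new Map<string, string>(); // urlKey -> workspace ID

  // Applies the global uniqueness constraint, then returns the new record's ID.
  createWorkspace(input: WorkspaceInput): string {
    if (this.urlKeys.has(input.urlKey)) {
      throw new Error(`URL key "${input.urlKey}" is already taken`);
    }
    const id = `ws_${this.nextId++}`;
    this.urlKeys.set(input.urlKey, id);
    return id;
  }
}

class RegionalDatabase {
  readonly workspaces = new Map<string, WorkspaceInput>();

  insertWorkspace(id: string, input: WorkspaceInput): void {
    this.workspaces.set(id, input);
  }
}

// Create the shared record globally first, then reuse the returned ID for the
// regional record. If the global step throws, no regional record is written.
function createSharedWorkspace(
  auth: AuthService,
  regions: Record<Region, RegionalDatabase>,
  input: WorkspaceInput
): string {
  const id = auth.createWorkspace(input);
  regions[input.region].insertWorkspace(id, input);
  return id;
}
```

Because the globally checked step runs first, a conflicting URL key is rejected before any regional database is touched, regardless of which region the second workspace was meant for.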
Primary Goals
After prototyping a few different options for the internal API between the API service and the authentication service, we settled on GraphQL. It isn't ideal for service-to-service communication, but we already had strong tooling for GraphQL in our codebase (our public API is GraphQL), and we used Zeus to generate a type-safe client for the API service to call the authentication service.
From the start, the biggest requirement was that region selection be invisible to users except when creating a workspace. In practice, this means we didn't want a separate domain for our Europe region: you should be able to use our client (linear.app) and API (api.linear.app) via these primary domains, regardless of where your workspace is hosted. We also extended this requirement to integrations and internal tooling, so that the migration to multi-region would be seamless and require no code changes outside of our application. Every single feature in Linear should work, regardless of which region you are using.
The Solution
We wanted to isolate all multi-region complexity to a few sub-systems and APIs. Engineers should never have to think about multi-region when developing functionality for the Linear backend or clients. They should be able to work in their local development environments and multi-region should simply work without any additions when their code is deployed into production.
We identified a simple architecture that would achieve this: add a proxy in front of all traffic that authenticates requests, associates them with a user and their workspace, and routes each request to the correct region.
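To make the proxy's three steps concrete, here is a minimal sketch of the routing decision. Everything in it is an assumption for illustration: the `routeRequest` function, the token-to-identity and workspace-to-region lookup tables, and the origin URLs are hypothetical, and a real proxy would forward the request rather than just compute an upstream URL.

```typescript
// Hypothetical types and lookup tables; none of these names come from Linear.
type RegionName = "us" | "eu";

interface IncomingRequest {
  path: string;
  token?: string;
}

interface Identity {
  userId: string;
  workspaceId: string;
}

interface RoutingTables {
  sessions: Map<string, Identity>; // token -> user and workspace
  workspaceRegions: Map<string, RegionName>; // workspace ID -> hosting region
  regionOrigins: Record<RegionName, string>; // region -> internal origin
}

type RouteResult =
  | { ok: true; upstream: string; identity: Identity }
  | { ok: false; status: 401 | 404 };

function routeRequest(req: IncomingRequest, tables: RoutingTables): RouteResult {
  // 1. Authenticate the request and associate it with a user and workspace.
  const identity = req.token ? tables.sessions.get(req.token) : undefined;
  if (!identity) {
    return { ok: false, status: 401 };
  }

  // 2. Find the region that hosts the workspace.
  const region = tables.workspaceRegions.get(identity.workspaceId);
  if (!region) {
    return { ok: false, status: 404 };
  }

  // 3. Route to that region's deployment, keeping the original path.
  return { ok: true, upstream: tables.regionOrigins[region] + req.path, identity };
}
```

Keeping the decision in one function mirrors the goal of isolating multi-region logic: code behind the proxy receives an already-routed request and never needs to know which region it runs in.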