Here’s a diagram of the current Build Civitas (and Meta Jon) architecture:
graph LR
beta.buildcivitas.com-->DNS;
meta.jlericson.com-->DNS;
DNS-->nginx;
nginx-->LE[https://letsencrypt.org];
subgraph vm [virtual machine]
style vm fill:#32CD32
nginx-->errorpages;
subgraph web_only
D((Discourse));
end
subgraph data
C[(civitas)];
J[(jlericson)];
end
mail-receiver-->D;
end
nginx-->D;
D-->C ;
D-->J;
C-->S3[S3 backup];
J-->S3;
This is based on Discourse’s multisite facility.[1]
Test environment
Thanks to putting my backups on S3, I can use virtually the same setup for a test/staging environment. The only differences between staging and production are these changes to web_only.yml:
45c45
< DISCOURSE_HOSTNAME: beta.buildcivitas.com
---
> DISCOURSE_HOSTNAME: test.buildcivitas.com
83a84,90
> ## Staging server specific settings
> DISCOURSE_AUTOMATIC_BACKUPS_ENABLED: false
> DISCOURSE_LOGIN_REQUIRED: true
> DISCOURSE_DISABLE_EMAILS: 'non-staff'
> DISCOURSE_S3_DISABLE_CLEANUP: true
> DISCOURSE_ALLOW_RESTORE: true
>
135c143
< - meta.jlericson.com
---
> - test.jlericson.com
It can also help to update the version parameter to match production rather than taking the latest tests-passed version of Discourse.
Upgrading Discourse
Having a separate data container for PostgreSQL and Redis means we can save time on upgrades that don’t require updates to the database. I did some timing on a Droplet with 2 vCPUs, 4GB of RAM, and a 50GB disk, which currently costs $24 a month:
Process | Time |
---|---|
rebuild data | 58s |
rebuild web_only | 14m 26s |
Total | 15m 24s |
As you can see, rebuilding the web_only container takes the bulk of the time, so skipping the data container only shaves a minute off the 15-minute process. The real gain comes from bootstrapping a new web_only container while the site is still running. The launcher rebuild command does three things:
1. bootstrap a new container, but not start it,
2. destroy the old container and
3. start the new container.
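Run one at a time, the three steps might look like the sketch below. The launcher subcommands (bootstrap, destroy, start) are the ones named above. The /var/discourse path, the DRY_RUN switch and the function names are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch: run the three steps of `launcher rebuild` separately so the slow
# bootstrap happens while the old container is still serving the site.
# Assumption: the launcher lives in /var/discourse (override with LAUNCHER).
LAUNCHER="${LAUNCHER:-/var/discourse/launcher}"

# Hypothetical helper: with DRY_RUN=1, print the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the three launcher steps for one container.
rebuild_with_short_downtime() {
  container="$1"
  run "$LAUNCHER" bootstrap "$container" || return 1  # slow part, site stays up
  run "$LAUNCHER" destroy "$container" || return 1    # downtime starts here
  run "$LAUNCHER" start "$container"                  # downtime ends here
}
```

Calling `DRY_RUN=1 rebuild_with_short_downtime web_only` prints the three commands without touching the containers. Stopping at the first failure matters: if the bootstrap fails, the old container is left running.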
During the time it takes to destroy and start the container, the site is down. Fortunately this can take less than half a minute. The bootstrap takes the bulk of the time:
Process | Time |
---|---|
bootstrap web_only | 13m 35s |
destroy web_only | 14s |
start web_only | 10s |
Total | 13m 59s |
This only works if bootstrapping can happen while the site is running. The key limitation is memory. The cheapest Droplet that can run Discourse has 2GB of RAM. That’s not enough to run the site and bootstrap a new container at the same time. 4GB works fine. Fortunately, DigitalOcean allows us to resize the Droplet quickly and easily. The total downtime is about two minutes:
Process | Time |
---|---|
power off Droplet | 21s |
resize Droplet | 44s |
power on Droplet | 10s |
destroy web_only | 14s |
start web_only | 10s |
Total | 1m 39s |
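As a quick check on the totals, the downtime in each scenario can be added up with shell arithmetic. The durations come from the tables in this post. The variable names and the fmt helper are only illustrative.

```shell
#!/bin/sh
# Durations from the timing tables, in seconds.
bootstrap=$((13 * 60 + 35))
destroy=14
start=10
power_off=21
resize=44
power_on=10

# On a Droplet with enough RAM, only destroy + start count as downtime.
downtime_in_place=$((destroy + start))

# Resizing a smaller Droplet first adds the power cycle to the downtime.
downtime_with_resize=$((power_off + resize + power_on + destroy + start))

# Hypothetical helper: format a number of seconds as "Xm Ys".
fmt() {
  printf '%dm %ds\n' $(($1 / 60)) $(($1 % 60))
}

fmt "$downtime_in_place"                # 0m 24s
fmt "$downtime_with_resize"             # 1m 39s
fmt $((bootstrap + destroy + start))    # 13m 59s
```

The bootstrap time is left out of both downtime figures because the old container keeps serving the site while the new one is built.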
At some point we’ll run on a virtual machine with 4GB RAM, but since it doubles the cost, it’s not worth it until the site has more people depending on it. As it is, nobody but me will even notice the downtime.
[1] Which is based, in turn, on the Ruby on Rails multisite feature.