Commit e57c891

Merge pull request #17 from QuLogic/docs
Document maintenance procedures
2 parents 2840dc1 + fb2c59d commit e57c891

File tree

1 file changed

+173
-2
lines changed

README.md

+173-2
@@ -31,12 +31,12 @@ the server with consistent settings.
3131
Setup
3232
-----
3333

34-
Before you can run our ansible playbooks, you need to meet the following
34+
Before you can run our Ansible playbooks, you need to meet the following
3535
prerequisites:
3636

3737
* Create a DigitalOcean API token, and pass it to the inventory generator by
3838
setting the `DO_API_TOKEN` environment variable.
39-
* Set the vault decryption password of the ansible vaulted file with our
39+
* Set the vault decryption password of the Ansible vaulted file with our
4040
secrets. This may be done by setting the `ANSIBLE_VAULT_PASSWORD_FILE`
4141
environment variable to point to a file containing the password.
4242
* Download all the collections the playbooks depend on with the following
@@ -55,3 +55,174 @@ There is currently only one playbook:
5555

5656
* `matplotlib.org.yml`, for the main matplotlib.org hosting. This playbook
5757
operates on droplets with the `website` tag in DigitalOcean.
58+
59+
Adding a new subproject
60+
=======================
61+
62+
When a new repository is added to the Matplotlib organization with
63+
documentation (or an existing repository adds documentation), it will be
64+
necessary to re-configure the server to serve those files. Note, it is
65+
currently assumed that the documentation is on the `gh-pages` branch of the
66+
repository, and it will be served from the top-level subdirectory with the same
67+
name as the repository (similar to GitHub Pages). There are four steps to achieve
68+
this:
69+
70+
1. Generate a secret to secure the webhook. You can follow [GitHub's
71+
instructions for creating
72+
one](https://docs.github.com/en/developers/webhooks-and-events/webhooks/securing-your-webhooks).
73+
2. Add the repository to Ansible:
74+
75+
1. Add an entry to the `repos` variable at the top of `matplotlib.org.yml`.
76+
2. Add the webhook secret to `files/webhook_vars.yml`.
77+
78+
3. Re-run the Ansible playbook as described [below](#running-ansible). This should
79+
clone the new repository and update the webhook handler.
80+
4. Configure a webhook on the new repository with the following settings:
81+
82+
- Payload URL of `https://do.matplotlib.org/gh/<repository>`
83+
- Content type of `application/json`
84+
- Use the secret generated in step 1
85+
- Trigger only on "push" events
86+
87+
If everything is done correctly, the GitHub webhook should have posted an
88+
initial "ping" event successfully, and documentation should be available at
89+
`https://matplotlib.org/<repository>`.
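
As a rough illustration of step 1 and of what the secret is used for, the sketch below generates a random secret and computes the HMAC-SHA256 value that GitHub sends (prefixed with `sha256=`) in the `X-Hub-Signature-256` header of each delivery. The function names are hypothetical and not part of the playbooks, and it assumes `openssl` is installed; how the webhook handler on the server checks the signature is not shown here.

```shell
# Sketch only: hypothetical helper names, assuming openssl is available.

# Generate a random 32-byte secret, hex-encoded (64 characters).
generate_webhook_secret() {
    openssl rand -hex 32
}

# Compute the hex HMAC-SHA256 of the payload read from stdin, keyed by $1.
# The last whitespace-separated field of openssl's output is the digest.
hmac_sha256_hex() {
    openssl dgst -sha256 -hmac "$1" | awk '{print $NF}'
}

# Usage:
#   secret=$(generate_webhook_secret)
#   printf '%s' "$payload" | hmac_sha256_hex "$secret"
```

Comparing this value against the header of a recorded delivery is one way to confirm that the secret stored in `files/webhook_vars.yml` matches the one configured on GitHub.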
90+
91+
Provisioning a new server
92+
=========================
93+
94+
Naming
95+
------
96+
97+
We follow a simplified version of the naming scheme on [this blog
98+
post](https://mnx.io/blog/a-proper-server-naming-scheme/):
99+
100+
* Servers are named `<prefix>.matplotlib.org` in A records.
101+
* Servers get a functional CNAME alias (e.g., `web01.matplotlib.org`).
102+
* matplotlib.org is a CNAME to the functional CNAME of a server.
103+
104+
We use [planets in our Solar System](https://namingschemes.com/Solar_System)
105+
for the name prefix. When creating a new server, pick the next one in the list.
106+
107+
Initial setup
108+
-------------
109+
110+
The summary of the initial setup is:
111+
112+
1. Create the droplet with monitoring and relevant SSH keys.
113+
2. Assign the new droplet to the matplotlib.org project and the Web firewall.
114+
3. Grab the SSH host fingerprints.
115+
4. Reboot.
116+
117+
We currently use a simple $5 droplet from DigitalOcean. You can create one from
118+
the control panel, or using the `doctl` utility. Be sure to enable monitoring,
119+
and add the `website` tag and relevant SSH keys to the droplet. An example of
120+
using `doctl` is the following:
121+
122+
```
123+
doctl compute droplet create \
124+
--image fedora-35-x64 \
125+
--region tor1 \
126+
--size s-1vcpu-1gb \
127+
--ssh-keys <key-id>,<key-id> \
128+
--tag-name website \
129+
--enable-monitoring \
130+
venus.matplotlib.org
131+
```
132+
133+
Note, you will have to use `doctl compute ssh-key list` to get the IDs of the
134+
relevant SSH keys saved on DigitalOcean, and substitute them above. Save the ID
135+
of the new droplet from the output, e.g., in:
136+
137+
```
138+
ID           Name       Public IPv4    Private IPv4    Public IPv6    Memory    VCPUs    Disk    Region    Image            VPC UUID    Status    Tags       Features                    Volumes
139+
294098687    mpl.org                                                  1024      1        25      tor1      Fedora 35 x64                new       website    monitoring,droplet_agent
140+
```
141+
142+
the droplet ID is 294098687.
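
If you would rather capture the ID in a shell variable than copy it by hand, a minimal sketch follows. The helper name `droplet_id_from_table` is hypothetical; it assumes the table layout shown above, with the ID in the first column of the first data row.

```shell
# Sketch only: print the first column of the first data row of a
# doctl table read from stdin (line 1 is the header).
droplet_id_from_table() {
    awk 'NR == 2 { print $1; exit }'
}

# Usage, piping in the output of the create command shown earlier:
#   droplet_id=$(doctl compute droplet create <options> <name> | droplet_id_from_table)
#
# Alternatively, doctl's global --format and --no-header options can
# restrict the output to the ID column:
#   doctl compute droplet create <options> <name> --format ID --no-header
```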
143+
144+
145+
You should also assign the new droplet to the `matplotlib.org` project and the
146+
`Web` firewall:
147+
148+
```
149+
doctl projects list
150+
# Get ID of the matplotlib.org project from the output.
151+
doctl projects resources assign <project-id> --resource=do:droplet:<droplet-id>
152+
153+
154+
doctl compute firewall list
155+
# Get ID of the Web firewall from the output.
156+
doctl compute firewall add-droplets <firewall-id> --droplet-ids <droplet-id>
157+
```
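
The two ID lookups can also be scripted. The sketch below assumes the project is named `matplotlib.org` and the firewall `Web`, as described above, and uses a hypothetical helper `id_for_name` to pick the matching row from `ID Name` output.

```shell
# Sketch only: read "ID Name" rows from stdin and print the ID of the
# row whose name (which may contain spaces) equals $1.
id_for_name() {
    awk -v name="$1" '{
        id = $1
        sub(/^[^ \t]+[ \t]+/, "")   # drop the ID column
        sub(/[ \t]+$/, "")          # drop trailing padding
        if ($0 == name) { print id; exit }
    }'
}

# Usage:
#   project_id=$(doctl projects list --format ID,Name --no-header | id_for_name matplotlib.org)
#   firewall_id=$(doctl compute firewall list --format ID,Name --no-header | id_for_name Web)
```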
158+
159+
Then, to ensure you are connecting to the expected server, you should grab the
160+
SSH host keys via the DigitalOcean Droplet Console:
161+
162+
```
163+
for f in /etc/ssh/ssh_host_*_key; do
164+
  ssh-keygen -l -f "$f";
165+
done
166+
```
167+
168+
Note down the outputs to verify later, e.g.,
169+
170+
```
171+
# Use these for comparison when connecting yourself.
172+
1024 SHA256:ExviVyBRoNKsZpgmIfBaejh1ElOpJ/9fC+ki2Fn5Xj4 root@venus.matplotlib.org (DSA)
173+
256 SHA256:hLA7ePr0D4AgiC21IXowtbpcUNnTGgpPB7NOYepQtxg root@venus.matplotlib.org (ECDSA)
174+
256 SHA256:MggFZQbZ7wID1Se2EmOwAm8AaJeA97L8sD8DhSrKy1g root@venus.matplotlib.org (ED25519)
175+
3072 SHA256:MCkDgfbn0sMTCtvAtfD0HmGJV3LVTjpUj6IcfWRHRQo root@venus.matplotlib.org (RSA)
176+
```
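
One possible way to do the later comparison from your own machine is sketched below. The helper names and the `noted.txt` file are hypothetical; the sketch assumes you saved the console output above to a file and that your OpenSSH version lets `ssh-keygen -lf -` read keys from standard input.

```shell
# Sketch only: print just the fingerprint column (SHA256:...) from
# `ssh-keygen -l` output, skipping comments and other lines.
fingerprints_only() {
    awk '$2 ~ /^SHA256:/ { print $2 }'
}

# Succeed if at least one fingerprint is read from stdin and every one
# of them appears as a whole line in the file of noted fingerprints $1.
all_fingerprints_known() {
    seen=0
    while read -r fp; do
        seen=1
        grep -Fxq -- "$fp" "$1" || return 1
    done
    [ "$seen" -eq 1 ]
}

# Usage:
#   fingerprints_only < console-output.txt > noted.txt
#   ssh-keyscan venus.matplotlib.org 2>/dev/null | ssh-keygen -lf - \
#       | fingerprints_only | all_fingerprints_known noted.txt
```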
177+
178+
Finally, you should reboot the droplet. This is due to a bug in cloud-init on
179+
DigitalOcean, which generates a new machine ID after startup, causing system
180+
logs to seem invisible.
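
A possible check for whether the reboot is still needed is sketched below. It rests on an assumption not stated above: that the journal is stored persistently in a directory named after the machine ID under `/var/log/journal`, so a missing directory for the current ID would suggest the ID changed after logging started. The helper name is hypothetical.

```shell
# Sketch only: succeed if a journal directory exists for the machine ID
# stored in file $1, looking under journal root $2.
journal_matches_machine_id() {
    machine_id=$(cat "$1") || return 1
    [ -n "$machine_id" ] && [ -d "$2/$machine_id" ]
}

# Usage on the droplet:
#   journal_matches_machine_id /etc/machine-id /var/log/journal
#
# Rebooting from your own machine with doctl:
#   doctl compute droplet-action reboot <droplet-id> --wait
```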
181+
182+
DNS setup
183+
---------
184+
185+
1. Add an A record for `<prefix>.matplotlib.org` to the IPv4 address of the new
186+
droplet.
187+
2. Add a CNAME record for `webNN.matplotlib.org` pointing to the given
188+
`<prefix>.matplotlib.org`.
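
Once the records have propagated, they can be checked from the command line. The sketch below assumes `dig` is installed and uses `venus` and `web01` purely as example names; the helper functions are hypothetical.

```shell
# Sketch only: normalize a DNS name for comparison by lower-casing it
# and dropping one trailing dot.
normalize_fqdn() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# Succeed if the CNAME record of $1 points to $2.
cname_points_to() {
    actual=$(dig +short "$1" CNAME)
    [ -n "$actual" ] && [ "$(normalize_fqdn "$actual")" = "$(normalize_fqdn "$2")" ]
}

# Usage:
#   dig +short venus.matplotlib.org A     # should print the droplet's IPv4 address
#   cname_points_to web01.matplotlib.org venus.matplotlib.org && echo "CNAME ok"
```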
189+
190+
Running Ansible
191+
---------------
192+
193+
You must set up Ansible as described above. Verify that the new droplet is
194+
visible to Ansible by running:
195+
196+
```
197+
ansible-inventory --graph
198+
```
199+
200+
which should list the new droplet in the `website` tag:
201+
202+
```
203+
@all:
204+
|--@website:
205+
| |--venus.matplotlib.org
206+
```
207+
208+
Then execute the Ansible playbook on the servers by running:
209+
210+
```
211+
ansible-playbook --user root matplotlib.org.yml
212+
```
213+
214+
During the initial "Gathering Facts" task, you will be prompted to accept the
215+
server's SSH fingerprint, which you should verify against the values found
216+
earlier. If there are existing servers that you don't want to touch, then you
217+
can use the `--limit` option. If you are using a non-default SSH key, you may
218+
wish to use the `--private-key` option.
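
Before running either command above, a small preflight check can catch missing prerequisites early. This is a sketch under the assumption that you supply the vault password through `ANSIBLE_VAULT_PASSWORD_FILE` (the setup section lists this as one option); `check_prereqs` is a hypothetical helper, and the key path in the usage comment is only an example.

```shell
# Sketch only: verify the environment variables from the Setup section.
check_prereqs() {
    if [ -z "${DO_API_TOKEN:-}" ]; then
        echo "DO_API_TOKEN is not set" >&2
        return 1
    fi
    if [ ! -r "${ANSIBLE_VAULT_PASSWORD_FILE:-}" ]; then
        echo "ANSIBLE_VAULT_PASSWORD_FILE does not point to a readable file" >&2
        return 1
    fi
}

# Usage, restricting the run to one server and using a specific key:
#   check_prereqs && ansible-playbook --user root \
#       --limit venus.matplotlib.org \
#       --private-key ~/.ssh/id_ed25519 \
#       matplotlib.org.yml
```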
219+
220+
Flip main DNS
221+
-------------
222+
223+
You can verify that the server is running correctly by connecting to
224+
`https://<prefix>.matplotlib.org` in your browser.
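
The same check can be made without a browser. The sketch below assumes `curl` is installed; the helper names are hypothetical, and `venus` is only an example prefix.

```shell
# Sketch only: print the HTTP status code returned for URL $1.
http_status() {
    curl -sS -o /dev/null -w '%{http_code}' "$1"
}

# Succeed if status code $1 is in the 2xx range.
is_success_status() {
    case "$1" in
        2[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   is_success_status "$(http_status https://venus.matplotlib.org)" && echo "server ok"
```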
225+
226+
Once everything is running, you should flip the DNS for the main site, changing
227+
the `matplotlib.org` CNAME to point to the new server's `webNN.matplotlib.org`
228+
functional name.
