* Tasks
** DONE Provision DigitalOcean's droplet from Terraform
CLOSED: [2019-05-25 Sat 13:29]
** DONE Properly provision Ubuntu droplet
CLOSED: [2019-05-25 Sat 17:50]
** DONE Automate deployment of updates
CLOSED: [2019-05-29 Wed 17:42]
*** DONE Subtasks
CLOSED: [2019-05-29 Wed 17:42]
**** DONE Fix Debian import of GPG keys
CLOSED: [2019-05-26 Sun 14:34]
While the NixOS image isn't fixed, use Debian instead.

The GPG data was all in a single line. I copied and pasted it properly and it was identified correctly.
**** CANCELLED Properly install Nix on Debian image
NixOS patch was applied.
**** DONE Fix NixOS GPG key importing in builds.sr.ht
CLOSED: [2019-05-26 Sun 17:37]
See patch and discussion in [[https://lists.sr.ht/~sircmpwn/sr.ht-dev/%3C20190526162135.1646-1-eu%40euandre.org%3E][sr.ht-dev mailing list]].
**** DONE Use ssh configuration from environment instead of creating an alias for =ssh=
CLOSED: [2019-05-26 Sun 19:44]
***** DONE Relative =IdentityFile= path
CLOSED: [2019-05-26 Sun 19:42]
Used =envsubst= to properly interpolate variables in =ssh.conf=
***** DONE Omit =-F ssh.conf= from command
CLOSED: [2019-05-26 Sun 19:42]
Put it in an environment variable?

Done by appending to content of =~/.ssh/config=.
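The two fixes above could be sketched roughly like this; =sed= stands in for =envsubst= to keep the sketch dependency-free, and all paths, variable names, and template contents are assumptions, not the repository's actual files:
#+BEGIN_SRC shell
# Hypothetical sketch: interpolate variables into an SSH config template and
# append the result to ~/.ssh/config, so no "-F ssh.conf" flag is needed.
tmp=$(mktemp -d)
IDENTITY_FILE="$tmp/id_rsa"

# template with a placeholder, as it would live in the repository
printf 'IdentityFile $IDENTITY_FILE\nStrictHostKeyChecking yes\n' > "$tmp/ssh.conf.in"

# envsubst would do this substitution; sed keeps the example dependency-free
sed "s|\$IDENTITY_FILE|$IDENTITY_FILE|" "$tmp/ssh.conf.in" > "$tmp/ssh.conf"

# append to the user config instead of passing "-F ssh.conf" on every call
mkdir -p "$tmp/home/.ssh"
cat "$tmp/ssh.conf" >> "$tmp/home/.ssh/config"
cat "$tmp/home/.ssh/config"
#+END_SRC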
**** DONE Use DigitalOcean's Floating IP in front of the droplet
CLOSED: [2019-05-28 Tue 23:22]
**** DONE Automate deployment with Terraform and deployment scripts
CLOSED: [2019-05-29 Wed 15:54]
**** DONE Backup data during deployments
CLOSED: [2019-05-28 Tue 00:48]
Is this approach feasible? Will it make the deployment take much longer? What are the alternatives?

Initial sketch of the backup commands:
#+BEGIN_SRC shell
rsync --verbose --progress --stats --update --recursive "$HOME/backups/" "$RSYNC_REMOTE"
borg create -svp -C lzma,6 "$HOME/borgbackup::{hostname}-{now}-$VPS_COMMIT_SHA" "$VOLUME_HOME"
rsync --verbose --progress --stats --update --recursive "$RSYNC_REMOTE" "$HOME/borgbackups/"
#+END_SRC

Implemented with help from https://jstaf.github.io/2018/03/12/backups-with-borg-rsync.html
**** DONE Namecheap whitelist IP limitation
CLOSED: [2019-05-26 Sun 17:14]
Namecheap requires you to specifically whitelist an IP that can perform changes to their API.

[[https://lists.sr.ht/~sircmpwn/sr.ht-discuss/%20%3CCAJk2QMbq8uE1pcG3Uy6w37HUY7W15cQ+sHoj-UBWN-W11AtcrA%40mail.gmail.com%3E][builds.sr.ht]] doesn't guarantee any specific IP, so whitelisting it isn't an option.

The best candidate so far is using DigitalOcean's Floating IP feature to link a hardcoded IP to a droplet, while the droplet's IP may change. This way any new deployment wouldn't change the public IP of the box, and wouldn't require me to change the DNS A and AAAA records on Namecheap.

This also has the advantage of allowing the email server to keep its IP address.

The downside is that the deployment of DNS records isn't fully automated: whenever I have to change a DNS entry, either for adding a new CNAME record or changing an AAAA record, I'll have to:
1. get my own IP;
2. whitelist it on Namecheap's web interface;
3. run a separate Terraform recipe.

The upside is that this should happen less often than a deployment, but it still isn't ideal. The ideal would be to run the Terraform recipe every time, and Terraform would realize that there was no DNS-related change and do nothing.
*** Limitations
During build, decrypt content of files and update the deployment.

How can the Terraform =.tfstate= file be handled in this case?

UPDATE:
Terraform does support so-called "backends" to coordinate locking and usage of the =.tfstate= files. In this regard there are no restrictions on continuously deploying with Terraform from the CI pipelines.

However the current applications do *not* properly support blue/green deployment, like email, Nextcloud, etc.

We could try to use a shared volume, but that would be a consistency nightmare.

The other option is to always recreate everything, with downtime. The advantage is that we get actual immutable deployments with stateful storage, but there would be downtime for every deployment. This is due to the nature of most of the packaged applications being single node *only*.

There's also the IP reputation issue: recreating everything from scratch every time would lead to new droplets with new IP addresses, which is not a good thing to be changing in a server box.

A reasonable alternative would be to redeploy everything on a different node, with a different TLD, and manually check that. But that would be just like a staging environment, with all of its downsides too.

In this situation, if I go on with automating the deployment, I'd rather pick the downtime option.

I'll start with services other than email and consider alternatives later.
** DONE Correctly load the SSH keypair using =user_data=
CLOSED: [2019-06-05 Wed 18:16]
*** DONE Disable the =user_data=
CLOSED: [2019-06-05 Wed 17:39]
*** DONE Generate and manually copy the =user-data.env= file
CLOSED: [2019-06-05 Wed 17:39]
*** CANCELLED Run it on the system
*** DONE Run each step individually and check them
CLOSED: [2019-06-05 Wed 18:15]
Check the content of the generated key files.
*** DONE Try to login
CLOSED: [2019-06-05 Wed 18:15]
The problem was a typo in a file and the private key permissions.

Bonus: change SSH port
** DONE Test key rotation
CLOSED: [2019-06-05 Wed 19:28]
See if it is actually working as expected.
** DONE Use Digital Ocean's Volumes for persistent extended storage
CLOSED: [2019-06-05 Wed 20:38]
** DONE Make VPS provisioning more robust
CLOSED: [2019-06-10 Mon 09:01]
*** DONE Use Ansible (or an equivalent tool) instead of custom Bash scripts
CLOSED: [2019-06-05 Wed 16:41]
They are now more fragile, ad-hoc and imperative than I would like.

Today Terraform won't run =deploy.sh= if no infrastructure changes are required. Split infrastructure provisioning from server configuration with something like Ansible or =nix copy closure=, and add an extra command to the pipeline run.
*** DONE Always perform a blue/green infrastructure deployment with Terraform
CLOSED: [2019-06-10 Mon 09:01]
Recreate a new Droplet from scratch, even if no changes happened.

This way every deployment tests the code path of creating everything from scratch again, from the DNS public IP all the way to restoring backups.
*** DONE Destroy and recreate the volume on deployment
CLOSED: [2019-06-10 Mon 09:01]
Restore from the latest backup with:
#+BEGIN_SRC shell
borg list --short --sort-by timestamp | tail -n 1
#+END_SRC
** DONE Configure DNS from Terraform
CLOSED: [2019-06-09 Sun 22:52]
Handling DNS with DigitalOcean did it. The Namecheap and GoDaddy APIs are bad, and all I had to do manually was configure a [[https://www.digitalocean.com/community/tutorials/how-to-point-to-digitalocean-nameservers-from-common-domain-registrars][custom nameserver to point to DigitalOcean's nameserver]].
*** DONE Test provisioning DNS entries with other DNS registrars
CLOSED: [2019-06-09 Sun 22:52]
DNS registrar APIs are bad in general (from what I've seen). Using DigitalOcean's DNS was more straightforward.
*** DONE Have dynamic Floating IP (a.k.a. =$PINNED_IP=)
CLOSED: [2019-06-09 Sun 22:52]
Floating IP is dynamically attached to the DNS entry in DigitalOcean itself.
** TODO Create snapshots before destroying resources
This way the previous good state can be reverted if the deployment fails or the backup can't be restored.

Can a TTL be added to the Droplet and the Volume's snapshots?
** TODO Harden the server
https://www.reddit.com/r/selfhosted/comments/bw8hqq/top_3_measures_to_secure_your_virtual_private/
https://docs.nextcloud.com/server/stable/admin_manual/installation/harden_server.html
https://ownyourbits.com/2017/03/25/nextcloud-a-security-analysis/
Check for HSTS header configuration
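One concrete check for the HSTS item could look like the sketch below; the helper name is hypothetical, the live =curl= call is left commented out, and the piped canned response only illustrates the matching:
#+BEGIN_SRC shell
# has_hsts: succeed when stdin (a raw HTTP response) carries the
# Strict-Transport-Security header.
has_hsts() {
  grep -qi '^strict-transport-security:'
}

# Against the live server it would be (requires network; $TLD assumed):
#   curl -sI "https://$TLD" | has_hsts && echo "HSTS enabled"

# Illustration with a canned response:
printf 'HTTP/2 200\r\nStrict-Transport-Security: max-age=31536000\r\n' \
  | has_hsts && echo "HSTS enabled"
#+END_SRC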
** TODO Use git-remote-gcrypt instead of git-crypt for vps-state
Also put all of the content of =secrets/*= into vps-state? Maybe rename it to vps-secret?

Right now, secrets are scattered between the two repositories. By moving them I can completely remove =git-crypt= from this repository.
** TODO Run backup on Terraform destroy action instead of manually in =provision.sh=
** DONE Explicitly destroy Droplets before running Terraform apply
CLOSED: [2019-06-05 Wed 19:48]
** DONE Store updated =.tfstate= even in case of deployment failure
CLOSED: [2019-06-10 Mon 21:21]
Right now the script fails on the Terraform commands before reaching the git commands. I should trap the error, store the state on git, and only then fail.
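The trap-the-error idea could look roughly like this sketch, where =false= stands in for a failing Terraform command and the git calls that would persist the =.tfstate= are reduced to an echo:
#+BEGIN_SRC shell
# Capture the failure instead of aborting, persist the state regardless of
# the outcome, and keep the original exit status so the pipeline still fails.
status=0
false || status=$?            # stand-in for a failing "terraform apply"

echo "committing updated .tfstate to vps-state (git add/commit would go here)"
echo "preserved exit status: $status"
# the real script would finish with: exit "$status"
#+END_SRC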
** DONE Fix alias in =bash-profile.sh=
CLOSED: [2019-06-10 Mon 09:01]
** DONE Email verbose (Ansible) log files in case of error
CLOSED: [2019-06-10 Mon 16:59]
builds.sr.ht only emails the link. Should it be extended to support encrypted log attachments?
** TODO Use environment variables for SSH key paths and volume mounts
** DONE Don't allow backups to fail
CLOSED: [2019-06-10 Mon 11:21]
** TODO Don't hardcode =/root/= paths: use =~/= instead to allow for different users
* Services
** DONE =$tld=: Static webhosting
CLOSED: [2019-05-26 Sun 10:17]
** DONE =wallabag.$tld=: Wallabag
CLOSED: [2019-05-25 Sat 18:02]
A bookmark application must:
- allow me to save and read articles on Android and Firefox;
- allow me to have tags for links.
** NEXT =nextcloud.$tld=: Nextcloud: storage, calendar, contacts, notes
https://github.com/nextcloud/docker

Do I need to configure =NEXTCLOUD_TRUSTED_DOMAINS=, or should it work without it?

Start with =cloud.$tld= before =mail.$tld= so I can retire =arrobaponto.org= and reuse it for other projects.

Activate client-side [[https://docs.nextcloud.com/server/11/user_manual/files/encrypting_files.html][encryption]] of files. Activate two-factor authentication for admin and user accounts.

Nextcloud bookmarks instead of Wallabag? Does it have a browser extension and an Android app? How about the password manager? Is it client-side encrypted?

Should I consider using an external storage provider, like S3, instead of solely local (DigitalOcean's attached volume)?
** TODO =mail.$tld=: Email + webmail
https://github.com/tomav/docker-mailserver
https://mailu.io/master/demo.html
https://mailcow.email/
https://poste.io/
https://github.com/hardware/mailserver

=mail.$tld= could be the Nextcloud mail application!
** TODO =www.$tld= and =blog.$tld=: Redirect to =$tld=
** TODO =hydra.$tld=: Hydra
Does Hydra support release management? The source tarball can live in git.sr.ht, but what about compiled outputs?

I'd like to release both pre-compiled binaries and Docker images.
** TODO =cuirass.$tld=: [[https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git][Cuirass]]
** TODO =annex.$tld=: Public content from Git Annex repositories
Only a static file server, with folders for individual assets.
** TODO =pi-hole.$tld=: Pi-hole
** TODO =pwk.$tld=: Piwik
** TODO =git.$tld=: CGit or GitWeb
https://github.com/iconoeugen/docker-gitweb
** TODO =songbooks.$tld=: Songbooks demo application
** TODO =pires.$tld=: Pires deployment
** TODO =paste.$tld=: Static pastebin
Use Hakyll like in =euandre.org/pastebin/*=, but with root at =paste.$tld/paste-title.html=.
** TODO =link.$tld=: YOURLS
No need for short URLs, but maybe useful for tracking link usage?

What are the privacy implications? Related relevant article: http://stage.vambenepe.com/archives/1596
** CANCELLED =perkeep.$tld=: Perkeep
I'm already covered by using Git Annex for almost everything.
** TODO =matrix.$tld=: Matrix Synapse server
I'm not using IRC a lot right now. Wait for me to interact more with mailing lists and gauge the need of IRC.

It's better than IRC, OTR (XMPP) and everything else, and interoperates with everything. As well said by @matthew, I'm using Matrix mostly as [[https://discourse.mozilla.org/t/matrix-and-irc-mozillians-custom-client/2911/7][a glorified IRC bouncer]]. I do use some Matrix rooms, but mostly for IRC itself.

Also from https://matrix.org/blog/2015/06/22/the-matrix-org-irc-bridge-now-bridges-all-of-freenode:

: Doing this is trivial - you just /join #freenode#channelname:matrix.org from a typical Matrix client - e.g. /join #freenode#Node.js:matrix.org will give you access to all of #Node.js via Matrix, effectively using Matrix as a great big distributed IRC bouncer in the sky ;)

I should continue to consider it as I keep using IRC.

Test the Emacs Matrix client along with the server installation.
** WAITING =search.$tld=: Searx instance
Would it be actually more private?
* Questions
** TODO Critiques of Docker?
What are NixOps, Disnix and Dysnomia trying to accomplish that overlaps with Docker? Use sqldiff for NixOps?

Do they do a better job? Why? Why not?

Get a book on advanced Docker, or container fundamentals and dig deeper.
** TODO Should I have an extra backup location?
Maybe rsync the contents of the Borg repository into S3. Should I restore backups from these too?
** TODO Should I be using something like [[https://www.vaultproject.io/][Vault]] instead of git-crypt?
Can it do key rotation?
** TODO Can the =setup.sh= and =provision.sh= scripts be run inside a chroot or a NixOS contained environment?
Right now they can't simply be a derivation because =setup.sh= needs access to the disk and =provision.sh= needs to access the internet.
** TODO Is there a way to commit the /signed/ public key?
That would remove the need for the =--always-trust= option of =gpg=.
** DONE How to dynamically handle Floating IPs?
CLOSED: [2019-06-10 Mon 08:59]
Right now the current Floating IP defined in =.envrc= was created manually in DigitalOcean's web UI and copied from it to the environment variable.

If everything were torn down, I couldn't recreate it from source, because the Floating IP would be different.

The ultimate goal would be to upsert a Floating IP address: if none exists, create one; if one already exists (I don't know how to get a reference to it), use it.

In other words, I don't want any hardcoded IPs in the recipe. The IP address has to be fixed, and the same on the DNS registrar and DigitalOcean's Floating IP.
*** Solution
I provisioned both the Floating IP and the DNS A record in the same recipe. Now everything is recreated from scratch every time.
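A sketch of what that single recipe could contain; the resource names and references are assumptions, written to be consistent with the other snippets in this file:
#+BEGIN_SRC hcl
resource "digitalocean_floating_ip" "vps" {
  droplet_id = "${digitalocean_droplet.vps.id}"
  region     = "${digitalocean_droplet.vps.region}"
}

resource "digitalocean_record" "apex" {
  domain = "${digitalocean_domain.vps_tld.name}"
  type   = "A"
  name   = "@"
  value  = "${digitalocean_floating_ip.vps.ip_address}"
}
#+END_SRC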
** DONE Do I want or need Docker? Should I use it?
CLOSED: [2019-05-25 Sat 18:19]
It was a better path than sticking with NixOps and nixcloud-webservices. It's more widespread and has more things done for it.
** CANCELLED How to share the Nix store across services?
** DONE How to leverage DigitalOcean's block storage?
CLOSED: [2019-05-25 Sat 18:19]
Provision it using Terraform, and use its path as the =$VOLUME_HOME= variable for containers.

This way I can compartmentalize the data storage to easily back it up and duplicate it, but also destroy a running droplet and create a new one.
* Nice to have
** =euandreh.org= as =$tld=
** Nix Terraform provisioning
** WAITING Upgrade =terraform-godaddy= to 0.12 to support looping over CNAME records
When using =terraform-godaddy= this made sense:
#+BEGIN_SRC hcl
locals {
  cname_subdomains = [
    "${var.wallabag_tld_prefix}",
    "${var.nextcloud_tld_prefix}",
  ]
}

resource "godaddy_domain_record" "vps_tld" {
  domain    = "${var.tld}"
  addresses = ["${var.floating_ip}"]

  dynamic "record" {
    for_each = local.cname_subdomains

    content {
      type = "CNAME"
      name = record.value
      data = "${var.tld}"
    }
  }
}
#+END_SRC
However, when transitioning to DNS provisioning using DigitalOcean, there's a catch: the =digitalocean_record= resource in Terraform lives at the top level, not nested. I tried doing a similar thing, but [[https://www.hashicorp.com/blog/hashicorp-terraform-0-12-preview-for-and-for-each][Terraform 0.12 doesn't support =for_each= loops on =resources=]]:

: During the development of Terraform 0.12 we've also laid the groundwork for supporting for_each directly inside a resource or data block as a more convenient way to create a resource instance for each element in a list or map. Unfortunately we will not be able to fully complete this feature for the Terraform 0.12 initial release, but we plan to include this in a subsequent release to make it easier to dynamically construct multiple resource instances of the same type based on elements of a given map.

The equivalent code should look like:
#+BEGIN_SRC hcl
locals {
  cname_subdomains = [
    "${var.wallabag_tld_prefix}",
    "${var.nextcloud_tld_prefix}",
  ]
}

resource "digitalocean_record" "subdomains" {
  for_each = toset(local.cname_subdomains)

  domain = "${digitalocean_domain.vps_tld.name}"
  type   = "CNAME"
  name   = each.value
  value  = "${digitalocean_domain.vps_tld.name}."
}
#+END_SRC
** Upgrade =docker-compose.yaml= file from version 2 to version 3
** Full blue/green deployments without downtime
Only when doing a voluntary restore from backup in a newly created volume.

Is there email software capable of doing this? A distributed email server that doesn't rely so much on the server file system, but on a database instead?
** Do all this in a Raspberry Pi
Even the email server can be in it. Is RAM the biggest limitation for it?

Raspberry Pi vs VPS

Imagine 2 Raspberry Pis, doing immutable blue/green deployments on them, with a few TBs of local storage!
** README with setup instructions
** Improve rotation of SSH port
Remove need for manual intervention
* Resources
** [[https://github.com/mail-in-a-box/mailinabox][Mail-in-a-Box]]
** [[https://sealedabstract.com/code/nsa-proof-your-e-mail-in-2-hours/][NSA-proof your e-mail in 2 hours]]
** [[https://www.iredmail.org/][iRedMail]]
** [[https://blog.harveydelaney.com/hosting-websites-using-docker-nginx/][Hosting Multiple Websites with SSL using Docker, Nginx and a VPS]]
** [[https://github.com/sovereign/sovereign/][Sovereign]]
** [[https://github.com/nixcloud/nixcloud-webservices][nixcloud-webservices]]
** [[https://github.com/Kickball/awesome-selfhosted#email][Awesome-Selfhosted: Email]]
** [[https://arstechnica.com/information-technology/2014/04/taking-e-mail-back-part-4-the-finale-with-webmail-everything-after/][Taking e-mail back]]
* Decisions
** Use external git repository as an encrypted database
Terraform does have the support for "backends" where it can store =.tfstate= files.

From the list of supported backends, the [[https://www.terraform.io/docs/backends/types/s3.html][S3]] option initially stands out as the simplest to configure. It doesn't, however, support state locking unless DynamoDB is also configured.

This extra configuration and complexity isn't attractive, and I can achieve similar outcomes by using the =local= backend and storing the state properly. Even better than sending it to S3 and setting up the proper revision headers is to just keep it in a separate repository.

Using the same repository would create an unwanted cyclic process where the repository pipeline commits in itself.

All data stored on git is encrypted with [[https://www.agwa.name/projects/git-crypt/][git-crypt]], which means git isn't actually being used as a source code repository, but as a versioned filesystem database.

By taking advantage of the sourcehut ecosystem, it was easier to set up the pipeline's access to the ad-hoc Terraform backend.

I created a repository called [[https://git.sr.ht/~euandreh/vps-state/][=vps-state=]] to store the encrypted =.tfstate= and =.tfplan= files. During the CI run, the pipeline creates a new =.tfplan= file and commits it into =vps-state=, and after applying the plan it updates the =.tfstate= file and adds this change to =vps-state=.
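In Terraform terms, this decision is roughly equivalent to a =local= backend whose path points into the cloned state repository (a sketch; the actual path is an assumption):
#+BEGIN_SRC hcl
terraform {
  backend "local" {
    path = "../vps-state/terraform.tfstate"
  }
}
#+END_SRC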
** Configuration of =StrictHostKeyChecking=
There are 3 cases where I'm pushing things to a server, and I'm dealing with each differently:
*** 1. Pushing updates to the =vps-state= repository
I could whitelist the SSH keys from the =git.sr.ht= servers, but this could break on every key rotation of the server.

In case of the server address being spoofed, the content wouldn't be readable by the attacker, since we're doing all the encryption on the client. We would, however, lose a Terraform state file update. As of right now, I'm OK with this trade-off.
*** 2. Running =scp= to the deployed VPS
In this situation I want to be sure I know where I'm pushing to.

In order to avoid disabling =StrictHostKeyChecking= when running =ssh= and =scp=, every time the SSH key is rotated I generate a new =./secrets/ssh/known-hosts.txt= file with the proper SSH public key.

This way we avoid the SSH server fingerprint trust prompt on the CI without disabling =StrictHostKeyChecking= on those calls.
*** 3. Backup server
Even though the backup is encrypted before sending the data, I don't want to risk losing a backup to a spoofed server. I'd rather break the build instead.
** Don't use Ansible as a =local-exec= provisioner from Terraform
Instead, explicitly call =ansible-playbook= after =terraform apply= finished running.

This way we test the DNS A record -> Floating IP -> Droplet IP path. We can't do that inside the Terraform declaration because the =local-exec= provisioning command runs before the =digitalocean_floating_ip_assignment= is created, and we can't create a cyclic dependency between the two resources.

We could use the raw Droplet IP instead of the DNS A record, but I prefer calling it later in order to always test the full DNS resolution.
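The resulting pipeline order could be sketched as below; the =run= stub only prints each command instead of executing it, and the plan file, inventory, and playbook names are assumptions:
#+BEGIN_SRC shell
TLD="${TLD:-example.org}"     # assumed placeholder for the real TLD
run() { echo "+ $*"; }        # stub: the real pipeline executes directly

# 1. infrastructure: droplet, volume, Floating IP, DNS records
run terraform apply -auto-approve plan.tfplan
# 2. configuration: through the public DNS name, exercising the full
#    DNS A record -> Floating IP -> Droplet path
run ansible-playbook --inventory "$TLD," playbook.yml
#+END_SRC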
* COMMENT
** DONE Must
CLOSED: [2019-06-10 Mon 08:51]
*** DONE Fully deployable from code
CLOSED: [2019-06-10 Mon 08:51]
Use +NixOps+ Ansible and Terraform to fully automate all of the configuration.
* Scratch