From 285591f218f1bf245e16d331009bb46a98d39558 Mon Sep 17 00:00:00 2001
From: EuAndreh
Date: Tue, 26 Jan 2021 13:23:38 -0300
Subject: TODOs.md: Mark #task-fec292ff-b9de-4d6c-b156-a9adc4771f35 as done

---
 TODOs.md | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/TODOs.md b/TODOs.md
index 6f69278..9b263a8 100644
--- a/TODOs.md
+++ b/TODOs.md
@@ -50,7 +50,22 @@ The final period is marked as bold, too.
   [b0f64583bf02f84cadcfad9b610d9c02ec6c4ec2](https://git.euandreh.xyz/remembering/commit/?id=b0f64583bf02f84cadcfad9b610d9c02ec6c4ec2).
 - TODO in 2021-01-21
 
-## TODO Optimize for large lists {#task-fec292ff-b9de-4d6c-b156-a9adc4771f35}
+## DONE Optimize for large lists {#task-fec292ff-b9de-4d6c-b156-a9adc4771f35}
+- DONE in 2021-01-26
+
+  Instead of using `while read VAR < $FILE`, and looping through each record,
+  the `$MERGED` and `$FILTERED` files are built differently.
+
+  Building the `$MERGED` file exploits the fact that `sort -u -k2,2` picks the first
+  entry it sees, regardless of what is in column 1, or other columns. With that,
+  we feed the reversed (with `tac`) list to it, and `$MERGED` is built in a single
+  pass of sort.
+
+  Building the `$FILTERED` file is now done with a simple AWK script, that performs much
+  better than a `while read VAR < $FILE` loop.
+
+  Done in commit
+  [000b74b1140f2ac41cb5d00a9070db735abdc9c4](https://git.euandreh.xyz/remembering/commit/?id=000b74b1140f2ac41cb5d00a9070db735abdc9c4).
 - TODO in 2021-01-21
 
 ## DONE Add tests {#task-146fab37-e53b-489e-95d0-3fcdd4c3eaef}
-- 
cgit v1.2.3
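
The `tac` plus `sort -u -k2,2` step described in the patch can be sketched as below. This is only an illustration, not the code from the referenced commit: the function name `merge_latest` and the two-column sample records are hypothetical, and it assumes GNU coreutils, where `tac` is available and `sort -u` outputs the first line of each run of lines with equal keys.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the remembering source.
# Input: one record per line, "<column 1> <key>", oldest record first.
# Output: one record per distinct key, keeping the newest one.
merge_latest() {
	# tac reverses the input so the newest record for each key comes first.
	# sort -u -k2,2 compares only column 2 and keeps one line per key.
	# Assumption: sort keeps the first line it sees among equal keys
	# (documented for GNU sort -u; other implementations may differ).
	tac | LC_ALL=C sort -u -k2,2
}

# Sample data is made up for illustration.
printf '%s\n' '1 apple' '2 banana' '3 apple' | merge_latest
```

With GNU coreutils, the `1 apple` record should be dropped in favour of the later `3 apple` record, since `tac` places the later record ahead of it before `sort -u` deduplicates on column 2.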