| -rw-r--r-- | doc/c10m.adoc | 117 |
1 files changed, 117 insertions, 0 deletions
diff --git a/doc/c10m.adoc b/doc/c10m.adoc
new file mode 100644
index 0000000..081d4a8
--- /dev/null
+++ b/doc/c10m.adoc
@@ -0,0 +1,117 @@
+= C10M: 10 Million Concurrent Connections
+:author: papod
+:revdate: 2026-04-24
+
+== Goal
+
+Cap a single papod process at 10 million concurrent connections across
+multiple networks. This is a C10M target.
+
+== Architecture: Virtual Threads
+
+Each client connection runs in its own Java virtual thread
+(`Thread/ofVirtual`). Virtual threads are cheap (~1KB stack) and
+multiplexed onto a small pool of carrier threads (default: CPU core
+count).
+
+This works well *if* virtual threads never pin their carrier thread.
+
+== Pinning Problem
+
+A virtual thread *pins* its carrier thread when it blocks inside a
+`synchronized` block or JNI call. The carrier thread cannot run other
+virtual threads while pinned. With ~8–16 carrier threads and millions
+of virtual threads, even a small percentage pinning simultaneously
+exhausts the carrier pool and stalls the entire server.
+
+=== Datomic
+
+Datomic Free/Pro was written before Project Loom. Its internals almost
+certainly use `synchronized` rather than `java.util.concurrent.locks`.
+Every `@(d/transact ...)` and `(d/q ...)` call may pin.
+
+At IRC-hobby scale (hundreds of connections) this is harmless. At C10M
+it is a showstopper.
+
+=== Diagnosis
+
+Run with `-Djdk.tracePinnedThreads=short` under load to identify
+pinning sites.
+
+== Mitigation Strategies
+
+=== 1. Bounded Datomic thread pool
+
+
+Offload all `d/transact` and `d/q` calls to a fixed-size pool of
+platform threads. Virtual threads submit work and `await` via
+`CompletableFuture` (no pinning).
+
+[source,clojure]
+----
+(def ^:private datomic-pool
+  (java.util.concurrent.Executors/newFixedThreadPool 64))
+
+(defn transact-async [conn tx-data]
+  (let [cf (java.util.concurrent.CompletableFuture.)]
+    (.submit datomic-pool
+             (reify Runnable
+               (run [_]
+                 (try
+                   (.complete cf @(d/transact conn tx-data))
+                   (catch Exception e
+                     (.completeExceptionally cf e))))))
+    cf))
+----
+
+Pros:: Simple, contained. Virtual threads never touch Datomic directly.
+Cons:: Adds a queue. Pool size must be tuned (too small = backpressure,
+  too large = Datomic contention).
+
+=== 2. Use `d/transact-async` natively
+
+Datomic's `d/transact-async` returns a future without blocking the
+caller. Restructure the reply pipeline to be async: `handle-privmsg`
+returns a deferred result, `send-replies!` awaits it.
+
+Pros:: No extra pool. Uses Datomic's own async path.
+Cons:: Requires restructuring the entire request→reply pipeline from
+  synchronous to async. Large change.
+
+=== 3. Replace Datomic Free
+
+Use a database client that is virtual-thread-safe:
+
+- XTDB v2 (built on Arrow/Kafka, Java 21 aware)
+- Raw JDBC + HikariCP (virtual-thread-friendly since 5.1.0)
+- SQLite via JDBC (single-writer, but no `synchronized` in the driver)
+
+
+Pros:: Eliminates the root cause.
+Cons:: Migration cost. Loss of Datomic's immutable history model (which
+  papod relies on for CHATHISTORY, edit history, audit).
+
+== Recommendation
+
+Start with *Strategy 1* (bounded thread pool). It is the smallest
+change, keeps the existing synchronous handler signatures, and can be
+implemented incrementally:
+
+1. Wrap `d/transact` and `d/q` calls behind helper functions.
+2. Run those helpers on a fixed platform-thread pool.
+3. Virtual threads `(.get future)` on the result — this blocks the
+   virtual thread (fine, no pinning) without blocking a carrier thread.
+
+Revisit if the pool becomes a bottleneck under load testing.
+
+== Open Questions
+
+- What is the actual pinning profile of Datomic Free under load?
+  (Measure before optimizing.)
+- Should the persistence layer be swappable (protocol/interface) to
+  allow future migration?
+- Is the in-memory `clients` atom (`atom {}` with 10M entries)
+  performant enough, or does it need a concurrent map?
+- At C10M, `doseq` over channel members for PRIVMSG fan-out is O(n).
+  Channels with 100K members need a different broadcast strategy.
+
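
The three steps in the Recommendation section of the document above can be sketched roughly as follows. This is a minimal illustration, not code from the patch. The names `db-pool`, `submit-blocking`, and `call-blocking` are hypothetical, and the pool size of 4 is only for the sketch (the document's own example uses 64). The helper takes a plain zero-argument function, so the Datomic call itself is supplied by the caller and the sketch has no Datomic dependency.

```clojure
(import '(java.util.concurrent CompletableFuture ExecutorService Executors))

;; Hypothetical fixed pool of platform threads (step 2 in the document).
;; The threads are non-daemon, so call (.shutdown db-pool) on application stop.
(defonce ^ExecutorService db-pool (Executors/newFixedThreadPool 4))

(defn submit-blocking
  "Runs the zero-argument function `f` on `db-pool` and returns a
  CompletableFuture that completes with its result or its exception."
  ^CompletableFuture [f]
  (let [cf (CompletableFuture.)]
    ;; .execute accepts only a Runnable, which avoids the Runnable/Callable
    ;; overload ambiguity of .submit when passing a Clojure fn.
    (.execute db-pool
              (fn []
                (try
                  (.complete cf (f))
                  (catch Throwable t
                    (.completeExceptionally cf t)))))
    cf))

(defn call-blocking
  "Step 3: waits for the result on the calling thread. A failure inside `f`
  surfaces as java.util.concurrent.ExecutionException wrapping the cause."
  [f]
  (.get (submit-blocking f)))

(comment
  ;; Assumed call sites (step 1), with `conn`, `tx-data`, `query`, and `db`
  ;; standing in for the application's own values:
  (call-blocking (fn [] @(d/transact conn tx-data)))
  (call-blocking (fn [] (d/q query db))))
```

Because `call-blocking` keeps a synchronous signature, existing handlers could adopt it one call site at a time, which matches the incremental rollout the document recommends. Whether the wait in `.get` actually avoids pinning under load is something the `-Djdk.tracePinnedThreads=short` run described in the Diagnosis section would need to confirm.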
