-rw-r--r-- doc/c10m.adoc 117
1 files changed, 117 insertions, 0 deletions
diff --git a/doc/c10m.adoc b/doc/c10m.adoc
new file mode 100644
index 0000000..081d4a8
--- /dev/null
+++ b/doc/c10m.adoc
@@ -0,0 +1,117 @@
+ = C10M: 10 Million Concurrent Connections
+ :author: papod
+ :revdate: 2026-04-24
+
+ == Goal
+
+ Support 10 million concurrent connections in a single papod process,
+ across multiple networks. This is a C10M target.
+
+ == Architecture: Virtual Threads
+
+ Each client connection runs in its own Java virtual thread
+ (`Thread/ofVirtual`). Virtual threads are cheap: their stacks start at
+ roughly 1 KB and live on the heap, growing as needed. They are
+ multiplexed onto a small pool of carrier threads (by default, one per
+ available CPU core).
+
+ This works well *if* virtual threads never pin their carrier thread.
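+
+ A minimal sketch of the thread-per-connection model, assuming JDK 21 or
+ later. `spawn-handlers` and its `handler` argument are hypothetical
+ names for illustration, not papod's actual API.
+
+ [source,clojure]
+ ----
+ (defn spawn-handlers
+   "Starts one virtual thread per element of conns, each running
+   (handler conn). Returns a vector of the started Thread objects."
+   [handler conns]
+   (mapv (fn [conn]
+           ;; Thread/ofVirtual returns a builder; .start creates and
+           ;; schedules the virtual thread.
+           (.start (Thread/ofVirtual) ^Runnable (fn [] (handler conn))))
+         conns))
+
+ ;; Usage: 1000 stand-in "connections", each bumping a shared counter.
+ (def handled (atom 0))
+ (def threads
+   (spawn-handlers (fn [_conn] (swap! handled inc)) (range 1000)))
+ (run! (fn [^Thread t] (.join t)) threads)
+ ----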
+
+ == Pinning Problem
+
+ On JDK 21 to 23, a virtual thread *pins* its carrier thread when it
+ blocks inside a `synchronized` block or while a native (JNI) frame is on
+ its stack. JDK 24 (JEP 491) removes the `synchronized` case; native
+ frames still pin. The carrier thread cannot run other virtual threads
+ while pinned. With ~8–16 carrier threads and millions of virtual
+ threads, even a small percentage pinning simultaneously exhausts the
+ carrier pool and stalls the entire server.
+
+ === Datomic
+
+ Datomic Free/Pro was written before Project Loom. Its internals almost
+ certainly use `synchronized` rather than `java.util.concurrent.locks`.
+ Every `@(d/transact ...)` and `(d/q ...)` call may pin.
+
+ At IRC-hobby scale (hundreds of connections) this is harmless. At C10M
+ it is a showstopper.
+
+ === Diagnosis
+
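+ On JDK 21 to 23, run with `-Djdk.tracePinnedThreads=short` under load to
+ print a stack trace at each pinning site. JDK 24 removed this flag;
+ there, record the `jdk.VirtualThreadPinned` JFR event instead.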
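+
+ To check that the diagnostics report what is expected, pinning can be
+ provoked on purpose. The sketch below is an illustration only
+ (`pin-once` is a hypothetical name): Clojure's `locking` acquires a JVM
+ monitor, the same mechanism as `synchronized`, so sleeping inside it
+ pins the carrier on JDK releases where monitors pin. No output is shown
+ because it depends on the JDK version and flags in use.
+
+ [source,clojure]
+ ----
+ (defn pin-once
+   "Sleeps inside a monitor on a virtual thread, then waits for it."
+   []
+   (let [lock (Object.)
+         t    (.start (Thread/ofVirtual)
+                      ^Runnable (fn []
+                                  ;; Blocking while holding a monitor is
+                                  ;; the pattern the diagnostics report.
+                                  (locking lock
+                                    (Thread/sleep 100))))]
+     (.join t)
+     :done))
+ ----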
+
+ == Mitigation Strategies
+
+ === 1. Bounded Datomic thread pool
+
+
+ Offload all `d/transact` and `d/q` calls to a fixed-size pool of
+ platform threads. Virtual threads submit work and wait on the returned
+ `CompletableFuture` with `.get`, which parks the virtual thread without
+ pinning its carrier.
+
+ [source,clojure]
+ ----
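+ ;; Assumes the namespace requires [datomic.api :as d].
+ (def ^:private datomic-pool
+   ;; Fixed pool of platform threads; 64 is a starting point to tune.
+   (java.util.concurrent.Executors/newFixedThreadPool 64))
+
+ (defn transact-async
+   "Runs d/transact on datomic-pool. Returns a CompletableFuture that
+   completes with the transaction result or with the failure."
+   [conn tx-data]
+   (let [cf (java.util.concurrent.CompletableFuture.)]
+     (.submit datomic-pool
+              (reify Runnable
+                (run [_]
+                  (try
+                    (.complete cf @(d/transact conn tx-data))
+                    ;; Catch Throwable so cf always completes and no
+                    ;; waiting virtual thread is left parked forever.
+                    (catch Throwable e
+                      (.completeExceptionally cf e))))))
+     cf))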
+ ----
+
+ Pros:: Simple, contained. Virtual threads never touch Datomic directly.
+ Cons:: Adds a queue. Pool size must be tuned (too small = backpressure,
+ too large = Datomic contention).
+
+ === 2. Use `d/transact-async` natively
+
+ Datomic's `d/transact-async` returns a future without blocking the
+ caller. Restructure the reply pipeline to be async: `handle-privmsg`
+ returns a deferred result, `send-replies!` awaits it.
+
+ Pros:: No extra pool. Uses Datomic's own async path.
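+ Cons:: Requires restructuring the entire request→reply pipeline from
+ synchronous to async. Large change. Covers writes only: the peer API has
+ no async counterpart to `d/q`, so queries need a separate answer.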
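+
+ The shape of that restructuring can be sketched with plain
+ `CompletableFuture` values. Everything here is a stand-in: the bodies of
+ `handle-privmsg` and `send-replies!` are hypothetical, and the
+ already-completed future takes the place of whatever `d/transact-async`
+ returns (a Datomic future would need adapting, which is not shown).
+
+ [source,clojure]
+ ----
+ (import '(java.util.concurrent CompletableFuture)
+         '(java.util.function Consumer))
+
+ (def sent (atom []))
+
+ (defn handle-privmsg
+   "Returns a CompletableFuture of the replies instead of the replies."
+   [msg]
+   ;; Stand-in for an async database write that yields the replies.
+   (CompletableFuture/completedFuture [(str "ack: " msg)]))
+
+ (defn send-replies!
+   "Registers a callback that records each reply once the future
+   completes. Returns the chained future."
+   [^CompletableFuture replies-future]
+   (.thenAccept replies-future
+                (reify Consumer
+                  (accept [_ replies]
+                    (swap! sent into replies)))))
+
+ ;; Usage: nothing blocks until the final .get.
+ (.get (send-replies! (handle-privmsg "hello")))
+ ----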
+
+ === 3. Replace Datomic Free
+
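+ Use a database client that does not pin. Candidates, each to be verified
+ under load on the target JDK:
+
+ - XTDB v2 (built on Arrow/Kafka, Java 21 aware)
+ - Raw JDBC + HikariCP (virtual-thread-friendly since 5.1.0)
+ - SQLite via JDBC (single-writer; drivers that reach SQLite through JNI
+   can still hold a carrier during native calls, so check the driver)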
+
+
+ Pros:: Eliminates the root cause.
+ Cons:: Migration cost. Loss of Datomic's immutable history model (which
+ papod relies on for CHATHISTORY, edit history, audit).
+
+ == Recommendation
+
+ Start with *Strategy 1* (bounded thread pool). It is the smallest
+ change, keeps the existing synchronous handler signatures, and can be
+ implemented incrementally:
+
+ 1. Wrap `d/transact` and `d/q` calls behind helper functions.
+ 2. Run those helpers on a fixed platform-thread pool.
+ 3. Virtual threads call `(.get future)` on the result. This parks the
+ virtual thread and frees its carrier thread, so nothing is pinned.
+
+ Revisit if the pool becomes a bottleneck under load testing.
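+
+ Steps 2 and 3 can be sketched without Datomic. This is an illustration
+ under assumptions: `db-pool`, `on-db-pool`, and the pool size of 4 are
+ hypothetical, and the thunk stands in for a `d/q` or `d/transact` call.
+
+ [source,clojure]
+ ----
+ (import '(java.util.concurrent Callable Executors ExecutorService Future))
+
+ (def ^ExecutorService db-pool
+   ;; Platform threads have no carrier, so blocking here cannot pin one.
+   (Executors/newFixedThreadPool 4))
+
+ (defn on-db-pool
+   "Runs the zero-argument function f on db-pool and waits for its
+   result. Exceptions thrown by f surface wrapped in an
+   ExecutionException."
+   [f]
+   (let [^Future fut (.submit db-pool ^Callable f)]
+     (.get fut)))
+
+ ;; Usage from a virtual thread: .get parks the virtual thread only.
+ (def result (promise))
+ (.join (.start (Thread/ofVirtual)
+                ^Runnable (fn [] (deliver result (on-db-pool #(+ 1 2))))))
+ ----
+
+ Because the pool threads are not daemon threads, a real service would
+ call `(.shutdown db-pool)` during its own shutdown.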
+
+ == Open Questions
+
+ - What is the actual pinning profile of Datomic Free under load?
+ (Measure before optimizing.)
+ - Should the persistence layer be swappable (protocol/interface) to
+ allow future migration?
+ - Is the in-memory `clients` atom (`atom {}` with 10M entries)
+ performant enough, or does it need a concurrent map?
+ - At C10M, `doseq` over channel members for PRIVMSG fan-out is O(n).
+ Channels with 100K members need a different broadcast strategy.
+