Firebase for [[Conflict Free Replicated Data Type|Conflict Free Replicated Data Types]].

An article recently posted on [[Hacker News]] about [[Local-First Software]] ([[Local-first software You own your data, in spite of the cloud]]) mentions the idea of Firebase for CRDTs. What would such a service look like? At its core, I think it would require a very strong event service, append-mostly datastores (more on this later), and off-the-shelf libraries in several languages that implement the *client-side* portion of a set of blessed CRDT algorithms (maybe just one).

Additionally, to be as widely understood as Firebase is, we would also need an off-the-shelf UI for working with such a system. Visualizing conflicts can (and should) be left to the client, but administration requires things such as snapshot-based compaction of events, user and permissions management, and "document" management.

# Proposal

- [[NATS]] based events
- SQLite based event storage
- [[Golang]] for backend
- Loose spec for frontend with implementations in:
    - [[Golang]], [[Rust]], [[JavaScript]], and [[Python]]
- Command-line application for testing algorithms, data synchronization

Initial goals are to implement the most basic CRDT system possible using only slow event propagation.

## Potential Off the Shelf Tools

- [editorjs](https://editorjs.io/) - [[Open Source Software|Open Source]] [[Notion]]-like editor
- [automerge](https://automerge.org/automerge-binary-format-spec/)

## CRDT Data Structure

The initial data structure should be text-block based, that is, similar to the interaction layout of [[Logseq]] and [[Notion]]. Only one "user" can be editing a block at any given moment, and each block behaves as an element of a singly linked list, pointing to the previous text block at any given document state. Each block is assigned a [[UUID]] v7 identifier, and each edit increments the counter carried in the `rand_a` field (the counter is global to the "document" and starts at 0).

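
The block layout described above can be sketched roughly as follows. This is a minimal illustration, not a settled design: the names `uuid7_with_counter`, `counter_of`, `Block`, and `Document` are hypothetical, and it assumes that creating a block counts as an edit and that the first ID issued carries counter 0. The bit layout follows the UUIDv7 definition in RFC 9562, where `rand_a` is 12 bits wide, so a per-document counter stored there would wrap after 4096 edits unless that is handled some other way.

```python
import os
import time
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional

RAND_A_BITS = 12                       # width of rand_a in UUIDv7 (RFC 9562)
MAX_COUNTER = (1 << RAND_A_BITS) - 1   # 4095


def uuid7_with_counter(counter: int, ts_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7 whose rand_a field carries `counter` (hypothetical helper)."""
    if not 0 <= counter <= MAX_COUNTER:
        raise ValueError("counter does not fit in the 12-bit rand_a field")
    if ts_ms is None:
        ts_ms = time.time_ns() // 1_000_000
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)
    value = (ts_ms & ((1 << 48) - 1)) << 80  # 48-bit Unix timestamp in ms
    value |= 0x7 << 76                       # version nibble = 7
    value |= counter << 64                   # rand_a, used here as the counter
    value |= 0b10 << 62                      # RFC variant bits
    value |= rand_b                          # 62 random bits
    return uuid.UUID(int=value)


def counter_of(u: uuid.UUID) -> int:
    """Read the counter back out of rand_a."""
    return (u.int >> 64) & MAX_COUNTER


@dataclass
class Block:
    id: uuid.UUID
    text: str
    prev: Optional[uuid.UUID]  # ID of the previous block; None for the first block
    deleted: bool = False


class Document:
    """Assumed shape: one counter per document, blocks linked backwards."""

    def __init__(self) -> None:
        self.counter = 0
        self.blocks: Dict[uuid.UUID, Block] = {}
        self.tail: Optional[uuid.UUID] = None

    def next_id(self) -> uuid.UUID:
        new_id = uuid7_with_counter(self.counter)
        self.counter += 1
        return new_id

    def append(self, text: str) -> Block:
        block = Block(id=self.next_id(), text=text, prev=self.tail)
        self.blocks[block.id] = block
        self.tail = block.id
        return block

    def in_order(self) -> List[Block]:
        # Walk the prev pointers from the tail, then reverse into reading order.
        out: List[Block] = []
        cursor = self.tail
        while cursor is not None:
            block = self.blocks[cursor]
            out.append(block)
            cursor = block.prev
        out.reverse()
        return out
```

Calling `doc.append("first")` and then `doc.append("second")` on a fresh `Document` yields two blocks whose IDs carry counters 0 and 1, with the second block's `prev` pointing at the first.
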
Deletes are virtual, marked as a boolean flag in the datastore.

Snapshots are debounce based: after a certain amount of time with no edits (say, 1 minute, though this can be configurable), groups of similar edits (i.e., edits to the same block) are coalesced into a single edit that uses the ID of the last edit.

Note: the terminology "snapshot" may imply the whole document, but that is not the case. Only past a very long timescale (months), or manually as a data-saving measure, might documents be snapshotted as a "whole."

## Identities

[[UUID]] v7 with the optional counter starting at 0 for each document.

## Documents

The unit of collaboration is a "document." A document corresponds to a single file and consists of zero or more nodes, some (optional) metadata, an identity (path and [[UUID]] identifier), and corresponding access control, which may be inherited from a parent "folder."

## Nodes

Nodes are the unit of editability in this system. That is, they are the smallest unit of data that is tracked discretely in the datastore. Nodes can be thought of as coalesced *events*; however, those events are largely ephemeral, and the *identity* of the block becomes the identity of the last event that modified it. When a newly joining client requests the state of the document, it receives a list of _nodes_, not a list of events.
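
The debounce-based coalescing of events into nodes could look something like the sketch below. Everything named here is hypothetical (`Event`, `Node`, `coalesce`, `block_key`, `quiet_ms`). It assumes each event carries the full text of the block after the edit, which seems reasonable given that only one user edits a block at a time, and it uses plain integers as stand-ins for the UUID v7 edit IDs so that "later" simply means "larger."

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    id: int            # stand-in for the UUID v7 edit ID; larger means later
    block_key: str     # assumed stable key saying which block was edited
    text: str          # full block text after this edit (assumption)
    ts_ms: int         # when the edit happened
    deleted: bool = False  # virtual delete flag


@dataclass(frozen=True)
class Node:
    id: int            # identity of the last event that modified the block
    block_key: str
    text: str
    deleted: bool


def coalesce(
    events: List[Event], now_ms: int, quiet_ms: int = 60_000
) -> Tuple[List[Node], List[Event]]:
    """Fold events for blocks that have been quiet for `quiet_ms` into nodes.

    Returns (nodes, pending): pending holds the events of blocks that are
    still being edited and therefore are not coalesced yet.
    """
    by_block: Dict[str, List[Event]] = {}
    for ev in sorted(events, key=lambda e: e.id):
        by_block.setdefault(ev.block_key, []).append(ev)

    nodes: List[Node] = []
    pending: List[Event] = []
    for key, evs in by_block.items():
        last = evs[-1]
        if now_ms - last.ts_ms >= quiet_ms:
            # The node takes over the ID of the last edit; deletes stay as a flag.
            nodes.append(Node(last.id, key, last.text, last.deleted))
        else:
            pending.extend(evs)
    return nodes, pending
```

Under these assumptions, a newly joining client would be sent the `nodes` list (tombstoned entries included), while `pending` events remain in the event store until their block has gone quiet.
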