The commitlog so far assumed that the latest segment is never compressed
and can be opened for writing (if it is intact).
However, restoring the entire commitlog from cold storage results in all
segments being compressed. Make it so the resumption logic reads the
metadata from the potentially compressed last segment, and starts a new
segment for writing if the latest one was indeed compressed.
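The decision described above can be sketched roughly as follows. This is an illustrative model, not the crate's actual API; the type and function names (`SegmentMeta`, `decide_resumption`, `max_tx_offset`) are assumptions made for the example.

```rust
// Hypothetical sketch of the resumption decision: a compressed latest
// segment is read-only, so writing must move to a fresh segment.

#[derive(Debug, PartialEq)]
enum Resumption {
    /// Re-open the latest segment for appending.
    ResumeWriting,
    /// Latest segment is compressed: read its metadata, then start a
    /// fresh segment at the next transaction offset.
    NewSegment { at_offset: u64 },
}

struct SegmentMeta {
    compressed: bool,
    /// Offset of the last commit in the segment.
    max_tx_offset: u64,
}

fn decide_resumption(last: &SegmentMeta) -> Resumption {
    if last.compressed {
        // Metadata is still readable through a decompressing reader,
        // but appending is impossible.
        Resumption::NewSegment { at_offset: last.max_tx_offset + 1 }
    } else {
        Resumption::ResumeWriting
    }
}

fn main() {
    let plain = SegmentMeta { compressed: false, max_tx_offset: 41 };
    assert_eq!(decide_resumption(&plain), Resumption::ResumeWriting);

    let packed = SegmentMeta { compressed: true, max_tx_offset: 41 };
    assert_eq!(
        decide_resumption(&packed),
        Resumption::NewSegment { at_offset: 42 }
    );
}
```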
# Expected complexity level and risk
1.5
# Testing
Added a test.
# Description of Changes
It'd be best to review this commit-by-commit, using
[difftastic](https://difftastic.wilfred.me.uk) to spot changes that are
syntactically minor but that a line-based diff obscures.
# Expected complexity level and risk
3 - edition2024 does bring changes to drop order, which could cause
issues with locks, but I looked through [all of the warnings that
weren't fixed
automatically](ba80f3fecd/warnings.html)
and couldn't find any issues.
# Testing
n/a; internal code change
# Description of Changes
The index offset `truncate` method returns `IndexError::KeyNotFound`
when asked to truncate an empty index offset file, or when the input key
is smaller than the first entry.
This caused the `commitlog::reset_to` method to return an error, leaving
replicas stuck in a re-spawn loop.
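The fixed semantics can be sketched over a sorted in-memory index. The names and exact behavior here are assumptions for illustration, not the crate's real API: the point is that the two formerly-erroring cases now succeed.

```rust
// Illustrative sketch of truncation over a sorted index (key -> byte
// offset). Truncating an empty index, or truncating at a key below the
// first entry, succeeds instead of returning `KeyNotFound`.

struct OffsetIndex {
    entries: Vec<(u64, u64)>, // sorted by key
}

impl OffsetIndex {
    /// Drop all entries with key >= `key`. Never errors: an empty
    /// index or a `key` below the first entry simply leaves nothing.
    fn truncate(&mut self, key: u64) {
        let keep = self.entries.partition_point(|&(k, _)| k < key);
        self.entries.truncate(keep);
    }
}

fn main() {
    // Empty index: previously an error, now a no-op.
    let mut empty = OffsetIndex { entries: vec![] };
    empty.truncate(10);
    assert!(empty.entries.is_empty());

    // Key below the first entry: everything is dropped, no error.
    let mut idx = OffsetIndex { entries: vec![(16, 0), (32, 640)] };
    idx.truncate(5);
    assert!(idx.entries.is_empty());

    // Normal case: keep the strictly-smaller prefix.
    let mut idx = OffsetIndex {
        entries: vec![(16, 0), (32, 640), (48, 1280)],
    };
    idx.truncate(32);
    assert_eq!(idx.entries, vec![(16, 0)]);
}
```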
# API and ABI breaking changes
NA
# Expected complexity level and risk
1
# Testing
Added new tests to cover edge scenarios.
- Extends `commit::Metadata` to include the checksum
- Extends `segment::Metadata` to include `Some(commit::Metadata)`
containing the last commit in the segment (if there is one)
- Changes `committed_meta` to:
- ignore empty segments at the end of the log
- try harder to provide useful metadata, even if only a prefix of the
latest segment is readable
This allows us to eliminate the remaining `Commitlog::open` calls made
solely to query the latest commit (offset). `Commitlog::open`
creates an empty segment if the tail of the log is corrupt, which is a
non-obvious side-effect that can be confusing when debugging.
It also eliminates uses where the `commits_from` iterator is
used to find the latest full commit. The `Commits` iterator requires the
caller to handle the case of a corrupted commit at the end of the log,
by advancing the iterator once more after it has yielded an error in
order to check that it is exhausted, and then deciding whether to ignore
the error. This is easy to forget.
`committed_meta` now just does the right thing, preserving information
about tail corruption for when that's useful.
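A minimal sketch of the lookup described above, using assumed types rather than the crate's real ones: walk segments from the end, skip empty ones, and report the metadata of the last commit found.

```rust
// Simplified model of `committed_meta`: ignore empty segments at the
// end of the log, and return the last commit's metadata if any.

#[derive(Clone, Debug, PartialEq)]
struct CommitMeta {
    tx_offset: u64,
    checksum: u32,
}

struct SegmentMeta {
    /// Last commit in the segment, if there is one.
    last_commit: Option<CommitMeta>,
}

fn committed_meta(segments: &[SegmentMeta]) -> Option<CommitMeta> {
    // Scan from the tail, skipping segments without any commit.
    segments.iter().rev().find_map(|seg| seg.last_commit.clone())
}

fn main() {
    let segments = vec![
        SegmentMeta {
            last_commit: Some(CommitMeta { tx_offset: 7, checksum: 0xdead }),
        },
        // Empty segment at the end of the log is ignored.
        SegmentMeta { last_commit: None },
    ];
    assert_eq!(
        committed_meta(&segments),
        Some(CommitMeta { tx_offset: 7, checksum: 0xdead })
    );
    assert_eq!(committed_meta(&[]), None);
}
```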
Changes the commitlog (and durability) write API, such that the caller
decides how many transactions are in a single commit, and has to supply
the transaction offsets.
This simplifies commitlog-side buffering logic to essentially a
`BufWriter` (which, of course, we must not forget to flush). This will
help throughput, but offers less opportunity to retry failed writes.
This is probably a good thing, as disks can fail in erratic ways, and we
should rather crash and re-verify the commitlog (suffix) than continue
writing.
To that end, this patch liberally raises panics when there is a chance
that internal state could be "poisoned" by partial writes, which may be
debatable.
# Motivation
The main motivation is to avoid maintaining the transaction offset in
two places in such a way that they could diverge. As ordering commits is
the responsibility of the datastore, we make it authoritative on this
matter -- the commitlog will still check that offsets are contiguous,
and refuse to commit if that's not the case.
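The contiguity check can be sketched as follows; the names and error shape are illustrative, not the actual API.

```rust
// Sketch: the datastore supplies transaction offsets, and the
// commitlog refuses to commit unless they continue the log without
// gaps.

#[derive(Debug, PartialEq)]
enum CommitError {
    OutOfOrder { expected: u64, got: u64 },
}

fn check_contiguous(next_expected: u64, tx_offsets: &[u64]) -> Result<(), CommitError> {
    let mut expected = next_expected;
    for &got in tx_offsets {
        if got != expected {
            return Err(CommitError::OutOfOrder { expected, got });
        }
        expected += 1;
    }
    Ok(())
}

fn main() {
    assert_eq!(check_contiguous(10, &[10, 11, 12]), Ok(()));
    assert_eq!(
        check_contiguous(10, &[10, 12]),
        Err(CommitError::OutOfOrder { expected: 11, got: 12 })
    );
}
```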
A secondary, related motivation is the following:
A "commit" is an atomic unit of storage, meaning that a torn (partial)
write of a commit renders the entire commit corrupt. There hasn't been a
compelling case where we would want this, and we have always configured
the server to write exactly one transaction per commit.
The code to handle buffering of transactions is, however, rather
complex, as it tries hard to allow the caller to retry writes at commit
boundaries. An unfortunate consequence of this is that we'd flush to the
OS very often, leaving throughput performance on the table.
So, if there is a compelling case for batching multiple transactions in
a commit, it should be the datastore's responsibility.
# API and ABI breaking changes
Breaks internal APIs only.
# Expected complexity level and risk
5 - Mostly for the risk
# Testing
Existing tests.
# Description of Changes
Standardizes the query builder API across all three language SDKs (Rust,
TypeScript, C#) for consistency.
**Rust:**
- Rename `Query` struct to `RawQuery`, make `Query` a trait with `fn
into_sql(self) -> String`
- All builder types (`Table`, `FromWhere`, `LeftSemiJoin`,
`RightSemiJoin`) implement `Query<T>` trait
- Views can return `-> impl Query<T>` instead of specifying exact
builder types
- The `#[view]` macro auto-detects `impl Query<T>` and rewrites to
`RawQuery<T>`
- Add `Not` variant to `BoolExpr` with `.not()` method
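The reshaped Rust surface can be sketched roughly as below. Signatures are paraphrased from the description; the builder internals and the `Person` view are assumptions made for the example.

```rust
// Sketch: `Query<T>` is now a trait; the old struct is `RawQuery<T>`;
// builder types implement the trait so views can return
// `impl Query<T>` without naming a concrete builder.

use std::marker::PhantomData;

trait Query<T> {
    fn into_sql(self) -> String;
}

// The former `Query` struct, renamed.
struct RawQuery<T> {
    sql: String,
    _row: PhantomData<T>,
}

impl<T> Query<T> for RawQuery<T> {
    fn into_sql(self) -> String {
        self.sql
    }
}

// One of the builder types; `FromWhere`, `LeftSemiJoin`, etc. would
// implement the trait the same way.
struct Table<T> {
    name: &'static str,
    _row: PhantomData<T>,
}

impl<T> Query<T> for Table<T> {
    fn into_sql(self) -> String {
        format!("SELECT * FROM {}", self.name)
    }
}

struct Person;

// A view can return `impl Query<Person>` instead of a concrete type.
fn all_people() -> impl Query<Person> {
    Table { name: "person", _row: PhantomData }
}

fn main() {
    assert_eq!(all_people().into_sql(), "SELECT * FROM person");
    let raw = RawQuery::<Person> {
        sql: "SELECT * FROM person".to_string(),
        _row: PhantomData,
    };
    assert_eq!(raw.into_sql(), "SELECT * FROM person");
}
```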
**TypeScript:**
- Add `ne()` to `ColumnExpression`
- Refactor `BooleanExpr` to `BoolExpr` class with chainable `.and()`,
`.or()`, `.not()` methods
- Make builders valid queries directly (`.build()` deprecated but still
works)
- Deprecate `from()` wrapper — use `tables.person.where(...)` directly
- Merge `query` export into `tables` so table refs are also query
builders
- Add subscription callback form: `subscribe(ctx =>
ctx.from.person.where(...))`
- Unify `useTable` with query builder syntax; deprecate `filter.ts`
**C#:**
- Add `Not()` method to `BoolExpr<TRow>`
- Add `IQuery<TRow>` interface implemented by all builder types
(`Table`, `FromWhere`, `LeftSemiJoin`, `RightSemiJoin`, `Query`)
- Add `ToSql()` to all builder types so `.Build()` is no longer required
- Update `AddQuery` to accept `IQuery<TRow>` instead of `Query<TRow>`
# API and ABI breaking changes
- Rust: `Query<T>` is now a trait (was a struct). The struct is renamed
to `RawQuery<T>`. This is a breaking change for any code that used
`Query<T>` as a type directly.
- TypeScript: `BooleanExpr` is now a `BoolExpr` class (was a
discriminated union type). The `query` export is deprecated in favor of
`tables`.
- C#: `AddQuery` now accepts `Func<QueryBuilder, IQuery<TRow>>` instead
of `Func<QueryBuilder, Query<TRow>>`. Existing `.Build()` calls still
work since `Query<TRow>` implements `IQuery<TRow>`.
# Expected complexity level and risk
3 — Changes touch multiple language SDKs and codegen, but each
individual change is straightforward. The Rust macro rewrite for `impl
Query<T>` detection is the most complex piece. All existing
`.build()`/`.Build()` calls continue to work.
# Testing
- [x] `cargo test -p spacetimedb-query-builder` — 16/16 tests pass
- [x] `cargo check -p spacetimedb` — clean, no warnings
- [x] `cargo check` on views-query, views-sql, views-basic,
views-trapped smoketest modules — all clean
- [x] `cargo test -p spacetimedb-codegen codegen_csharp` — snapshot
updated, passes
- [x] `npm test` (TypeScript) — 101/101 tests pass
- [x] C# QueryBuilder tests — new tests for `Not()`, `IQuery<T>`
interface
- [ ] CI passes
# Description of Changes
Just adds public accessors for some existing internal functionality, to
enable a change in another repo. Review starting from there.
# API and ABI breaking changes
These crates are marked unstable.
# Expected complexity level and risk
0.5
# Testing
See other PR.
---------
Co-authored-by: Kim Altintop <kim@eagain.io>
It is almost always wrong / undesirable to pack more than one
transaction in a commit, so adjust the default accordingly.
This also avoids surprises when using `#[serde(default)]` with nested
structs -- serde evaluates the default depth-first, so overriding a
single field in a nested struct will not consider any
`#[serde(default = "custom_default")]` annotations on the parent.
When a new commitlog segment is created, allocate disk space for it up
to the maximum segment size. Also do this when resuming writes to an
existing segment, such that segments created without preallocation will
allocate as well when the database is opened.
Preallocation is gated behind the feature "fallocate", because it is not
always desirable to preallocate, e.g. for local `standalone` users.
The feature can only be enabled on Linux targets, because allocation is
done using the Linux-specific `fallocate(2)` system call.
Unlike `ftruncate(2)` or the portable `posix_fallocate(3)`,
`fallocate(2)` supports allocating disk space without zeroing. This is
currently required, because the commitlog format does not handle padding
bytes.
If not enough space can be allocated, the commitlog refuses writes. For
commitlogs that were created without preallocation, this means that the
commitlog cannot even be opened in this situation.
The local durability impl will crash if it detects that the commitlog is
unable to allocate enough space.
This means that a database will eventually crash and be unable to start
in an out-of-space situation.
Allocated space is not included in the reported size of the commitlog.
Instead, allocated blocks are reported separately.
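A simplified model of the behavior described above (not the actual implementation; names and the error shape are assumed): writes are refused once the preallocated space is exhausted, and the reported size excludes unused allocated space.

```rust
// Sketch: a segment that tracks preallocated vs. written bytes and
// refuses writes beyond the allocation.

struct Segment {
    allocated: u64, // bytes reserved up front, e.g. via fallocate(2)
    written: u64,   // bytes actually written
}

impl Segment {
    fn write(&mut self, len: u64) -> Result<(), &'static str> {
        if self.written + len > self.allocated {
            // Out of preallocated space: refuse rather than risk a
            // partial write; the local durability impl crashes here.
            return Err("insufficient allocated space");
        }
        self.written += len;
        Ok(())
    }

    /// Reported size covers written bytes only; allocated blocks are
    /// reported separately.
    fn size(&self) -> u64 {
        self.written
    }
}

fn main() {
    let mut seg = Segment { allocated: 100, written: 0 };
    assert!(seg.write(60).is_ok());
    assert!(seg.write(60).is_err()); // would exceed the allocation
    assert_eq!(seg.size(), 60);
}
```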
# Expected complexity level and risk
3 - Disk size monitoring may need to be adjusted.
# Testing
- [x] Adds a test that demonstrates the crash behavior of
[`spacetimedb_durability::Local`] when there is insufficient space. The
test performs I/O against a loop device.
- [x] Modified the `repo::Memory` impl so that it can run out of space.
No test currently utilizes this, but existing tests assuming infinite
space still pass.
The commitlog creates new segments atomically, returning EEXIST if the
segment already exists. This is to break a retry loop in case the
filesystem becomes unwritable.
This error did not contain any context about what does not exist, so
this patch adds some.
Also, an unhandled edge case has been discovered:
When opening an existing log, the commitlog will try to resume the last
segment for writing. If it finds a corrupt commit in that segment, it
won't resume, but instead create a new segment at the corrupt commit's
offset + 1.
However, if the first commit in the last segment is corrupted, the
offset will be that of the last segment -- trying to start a new segment
will thus fail with EEXIST.
Without additional recovery mechanisms, it is not obvious what to do in
this case: the segment could contain valid data after the initial
commit, so we certainly don't want to throw it away.
Instead, we now detect this case and return `InvalidData` with some
context.
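The recovery decision can be sketched as below. This is a paraphrase with assumed names, and the exact offset arithmetic is an illustration of the collision, not a quote of the implementation.

```rust
// Sketch: resuming after a corrupt commit normally starts a new
// segment past the last good commit, but if the corrupt commit is the
// *first* in the segment, the new segment's offset would collide with
// the existing segment (EEXIST), so we report invalid data instead.

#[derive(Debug, PartialEq)]
enum Recovery {
    /// Start a fresh segment at this offset.
    NewSegment { at_offset: u64 },
    /// First commit of the last segment is corrupt: creating a new
    /// segment would hit EEXIST, and the segment may still contain
    /// valid data, so surface `InvalidData` with context instead.
    InvalidData,
}

fn recover(segment_offset: u64, corrupt_commit_offset: u64) -> Recovery {
    if corrupt_commit_offset == segment_offset {
        Recovery::InvalidData
    } else {
        Recovery::NewSegment { at_offset: corrupt_commit_offset }
    }
}

fn main() {
    // Corrupt commit mid-segment: start over at its offset.
    assert_eq!(recover(100, 150), Recovery::NewSegment { at_offset: 150 });
    // Corrupt commit is the segment's first: would collide.
    assert_eq!(recover(100, 100), Recovery::InvalidData);
}
```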
# Expected complexity level and risk
1
# Testing
- [ ] A (regression) test is included
# Description of Changes
We've run into a problem on Maincloud caused by a database that was
writing a relatively small number of very large transactions. This was
accruing many commitlog segments consuming hundreds of gigabytes of
disk, but had not ever taken a snapshot, or compressed or archived any
data, as the database had not progressed past one million transactions.
With this PR, we take a snapshot every time the commitlog segment
rotates. We still also snapshot every million transactions.
One BitCraft database we looked at had 2.5 million transactions per
commitlog segment, meaning that this change will not meaningfully affect
the frequency of snapshots. The offending Maincloud database, however,
had only 50 transactions per segment!
# API and ABI breaking changes
N/a
# Expected complexity level and risk
3: Hastily made changes to finicky code across several crates.
# Testing
I am unsure how to test these changes.
# Description of Changes
Necessary for pulling in rolldown.
# API and ABI breaking changes
None
# Expected complexity level and risk
1, with the caveat that this updates the Rust version and therefore
touches all the code.
# Testing
- [ ] Just the automated testing
# Description of Changes
We recently merged several repos together. This PR clarifies the license
terms for several subdirectories, as well as the relationship between
the licenses.
The licenses in our subdirectories have become symbolic links to
licenses in our toplevel `licenses` directory. For any particular
subdirectory's license file in the diff, you can click `... -> View
file` and then click on the text that says "Symbolic Link" on that page.
This will take you to the license file that it links to.
I have also updated the `tools/upgrade-version` script to update the
change date in the new `licenses/BSL.txt` file.
# API and ABI breaking changes
None.
# Expected complexity level and risk
1
# Testing
None. Only changes to license files.
---------
Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>
Adds methods and free-standing functions to allow folds to stop at an
upper bound, by passing a range instead of only a start offset.
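A small sketch of the idea with an assumed signature (the real functions and commit types differ): accepting any `RangeBounds` over transaction offsets subsumes the old start-offset-only form.

```rust
// Sketch: a fold over commits that honors both ends of a range of
// transaction offsets, instead of only a starting offset.

use std::ops::RangeBounds;

fn fold_commits<R, B, F>(commits: &[(u64, u32)], range: R, init: B, mut f: F) -> B
where
    R: RangeBounds<u64>,
    F: FnMut(B, &(u64, u32)) -> B,
{
    commits
        .iter()
        .filter(|(offset, _)| range.contains(offset))
        .fold(init, |acc, c| f(acc, c))
}

fn main() {
    let commits = [(0, 10), (1, 20), (2, 30), (3, 40)];
    // Previously only `offset..` was expressible; an upper bound now
    // stops the fold early.
    let sum: u32 = fold_commits(&commits, 1..3, 0, |acc, &(_, n)| acc + n);
    assert_eq!(sum, 50);
    // The old behavior is still a special case.
    let all: u32 = fold_commits(&commits, 2.., 0, |acc, &(_, n)| acc + n);
    assert_eq!(all, 70);
}
```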
# Expected complexity level and risk
1
# Testing