200 Commits

Author SHA1 Message Date
Nick Sweeting
95beddc5fc
more migration fixes 2025-12-29 22:12:57 -08:00
Nick Sweeting
2e350d317d
fix initial migrations 2025-12-29 21:27:31 -08:00
Nick Sweeting
3dd329600e
comment updates 2025-12-29 21:05:34 -08:00
Nick Sweeting
80f75126c6
more fixes 2025-12-29 21:03:05 -08:00
Claude
a5654e877f
rename media plugin to ytdlp with backwards-compatible aliases
- Rename archivebox/plugins/media/ → archivebox/plugins/ytdlp/
- Rename hook script on_Snapshot__63_media.bg.py → on_Snapshot__63_ytdlp.bg.py
- Update config.json: YTDLP_* as primary keys, MEDIA_* as x-aliases
- Update templates CSS classes: media-* → ytdlp-*
- Fix gallerydl bug: remove incorrect dependency on media plugin output
- Update all codebase references to use YTDLP_* and SAVE_YTDLP
- Add backwards compatibility test for MEDIA_ENABLED alias
2025-12-29 19:09:05 +00:00
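The alias scheme described in this commit could be resolved roughly as follows. This is a minimal sketch, not ArchiveBox's actual config loader: the `SCHEMA` shape and the `resolve_config` helper are hypothetical, and only the `YTDLP_ENABLED` / `MEDIA_ENABLED` names and the `x-aliases` key come from the commit message.

```python
from typing import Any

# Hypothetical schema fragment: the primary YTDLP_* key lists its legacy
# MEDIA_* name under "x-aliases" (the key named in the commit message).
SCHEMA: dict[str, dict[str, Any]] = {
    "YTDLP_ENABLED": {"default": True, "x-aliases": ["MEDIA_ENABLED"]},
}


def resolve_config(
    user_config: dict[str, Any],
    schema: dict[str, dict[str, Any]] = SCHEMA,
) -> dict[str, Any]:
    """Return values keyed by primary name, honouring legacy aliases."""
    resolved: dict[str, Any] = {}
    for key, spec in schema.items():
        if key in user_config:
            # The primary key always wins over any alias.
            resolved[key] = user_config[key]
            continue
        for alias in spec.get("x-aliases", []):
            if alias in user_config:
                # Fall back to the first legacy alias the user has set.
                resolved[key] = user_config[alias]
                break
        else:
            # Neither the primary key nor an alias was set: use the default.
            resolved[key] = spec["default"]
    return resolved
```

Under these assumptions an existing setup that only sets `MEDIA_ENABLED` keeps working, while a setup that sets both names follows the new `YTDLP_ENABLED` value.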
Nick Sweeting
30c60eef76
much better tests and add page ui 2025-12-29 04:02:11 -08:00
Nick Sweeting
f4e7820533
use full dotted paths for all archivebox imports, add migrations and more fixes 2025-12-29 00:47:08 -08:00
Nick Sweeting
f0aa19fa7d
wip 2025-12-28 17:51:54 -08:00
Claude
057b49ad85
Update status command to use DB as source of truth
Remove imports of deleted folder utility functions and rewrite
status command to query Snapshot model directly. This aligns with
the fs_version refactor where the DB is the single source of truth.

- Use Snapshot.objects queries for indexed/archived/unarchived counts
- Scan filesystem directly for present/orphaned directory counts
- Simplify output to focus on essential status information
2025-12-28 19:19:03 +00:00
Nick Sweeting
bd265c0083
rename extractor to plugin everywhere 2025-12-28 04:43:15 -08:00
Nick Sweeting
50e527ec65
way better plugin hooks system wip 2025-12-28 03:39:59 -08:00
Claude
b632894bc9
Update views, API, and exports for new ArchiveResult output fields
Replace old `output` field with new fields across the codebase:
- output_str: Human-readable output summary
- output_json: Structured metadata (optional)
- output_files: Dict of output files with metadata
- output_size: Total size in bytes
- output_mimetypes: CSV of file mimetypes

Files updated:
- api/v1_core.py: Update MinimalArchiveResultSchema to expose new fields
- api/v1_core.py: Update ArchiveResultFilterSchema to search output_str
- cli/archivebox_extract.py: Use output_str in CLI output
- core/admin_archiveresults.py: Update admin fields, search, and fieldsets
- core/admin_archiveresults.py: Fix output_html variable name bug in output_summary
- misc/jsonl.py: Update archiveresult_to_jsonl() to include new fields
- plugins/extractor_utils.py: Update ExtractorResult helper class

The embed_path() method already uses output_files and output_str,
so snapshot detail page and template tags work correctly.
2025-12-27 20:28:22 +00:00
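The relationship between the new output fields listed in this commit might look like the sketch below. It is illustrative only: the `OutputSummary` class is hypothetical, the per-file metadata keys `size` and `mimetype` are assumptions, and the real model may store `output_size` and `output_mimetypes` as columns rather than deriving them.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class OutputSummary:
    """Hypothetical stand-in for the new ArchiveResult output fields."""

    output_str: str = ""                           # human-readable summary
    output_json: Optional[dict[str, Any]] = None   # optional structured metadata
    # Maps each output file path to its metadata (assumed keys: size, mimetype).
    output_files: dict[str, dict[str, Any]] = field(default_factory=dict)

    @property
    def output_size(self) -> int:
        """Total size in bytes across all output files."""
        return sum(meta.get("size", 0) for meta in self.output_files.values())

    @property
    def output_mimetypes(self) -> str:
        """Comma-separated mimetypes, de-duplicated in first-seen order."""
        seen: list[str] = []
        for meta in self.output_files.values():
            mimetype = meta.get("mimetype")
            if mimetype and mimetype not in seen:
                seen.append(mimetype)
        return ",".join(seen)
```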
Claude
c3acadd528
Remove extractor field from Crawl model and fix tests
- Remove extractor field from Crawl model (moved to config dict)
- Update migration 0002_drop_seed_model to not add extractor
- Update archivebox_add.py to use config['PARSER'] instead
- Update admin.py recrawl to not pass extractor
- Update jsonl.py serialization to not include extractor
- Update test schema SCHEMA_0_8 to not include extractor
- Set default timeout to 60s for test commands
2025-12-27 01:49:09 +00:00
Nick Sweeting
9838d7ba02
tons of ui fixes and plugin fixes 2025-12-25 03:59:51 -08:00
Nick Sweeting
bb53228ebf
remove Seed model in favor of Crawl as template 2025-12-25 01:52:41 -08:00
Nick Sweeting
28e6c5bb65
add mcp server support 2025-12-25 01:51:42 -08:00
Nick Sweeting
866f993f26
logging and admin ui improvements 2025-12-25 01:10:41 -08:00
Nick Sweeting
d95f0dc186
remove huey 2025-12-24 23:40:18 -08:00
Nick Sweeting
6c769d831c
wip 2 2025-12-24 21:46:14 -08:00
Nick Sweeting
1915333b81
wip major changes 2025-12-24 20:10:38 -08:00
Nick Sweeting
c1335fed37
Remove ABID system and KVTag model - use UUIDv7 IDs exclusively
This commit completes the simplification of the ID system by:

- Removing the ABID (ArchiveBox ID) system entirely
- Removing the base_models/abid.py file
- Removing KVTag model in favor of the existing Tag model in core/models.py
- Simplifying all models to use standard UUIDv7 primary keys
- Removing ABID-related admin functionality
- Cleaning up commented-out ABID code from views and statemachines
- Deleting migration files for ABID field removal (no longer needed)

All models now use simple UUIDv7 ids via `id = models.UUIDField(primary_key=True, default=uuid7)`

Note: Old migrations containing ABID references are preserved for database
migration history compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 06:13:49 -08:00
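For context on the `default=uuid7` callable named in this commit, a UUIDv7 generator can be sketched with the standard library alone. This follows the RFC 9562 bit layout and is an assumption about the general technique; the commit does not say which `uuid7` implementation the project imports.

```python
import os
import time
import uuid


def uuid7() -> uuid.UUID:
    """Build a UUIDv7: 48-bit Unix millisecond timestamp plus random bits."""
    ts_ms = time.time_ns() // 1_000_000          # milliseconds since the epoch
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 random bits

    value = (ts_ms & ((1 << 48) - 1)) << 80       # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: rand_a
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand_b                               # bits 61..0: rand_b
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, IDs created in later milliseconds sort after earlier ones, which is the usual reason for choosing UUIDv7 for primary keys.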
Nick Sweeting
930b9bf386
add archivebox worker cli cmd to list of all cmds 2024-12-12 21:44:44 -08:00
Nick Sweeting
5cf7725f0e
add new archivebox worker implementation based on better distributed systems principles 2024-12-12 21:41:45 -08:00
Nick Sweeting
dcd7e2555e
add new archivebox_extract cli command 2024-12-03 02:14:56 -08:00
Nick Sweeting
b948e49013
add urls log to Crawl model 2024-11-19 06:32:33 -08:00
Nick Sweeting
4dd53dc12a
Merge branch 'newchanges' into dev 2024-11-19 05:28:20 -08:00
Nick Sweeting
b852951c58
fix cli loading edge case where setup_django wasn't running when it should 2024-11-19 05:27:35 -08:00
Nick Sweeting
f8e2f7c753
restore missing archivebox_update work 2024-11-19 05:09:19 -08:00
Nick Sweeting
52446b86ba
restore missing archivebox_status work 2024-11-19 05:08:41 -08:00
Nick Sweeting
0f536ff18b
restore missing archivebox_schedule work 2024-11-19 05:07:55 -08:00
Nick Sweeting
fe3320eff0
restore missing archivebox_remove work 2024-11-19 05:07:12 -08:00
Nick Sweeting
230bf34e14
restore missing archivebox_config work 2024-11-19 05:05:06 -08:00
Nick Sweeting
6740202d78
fix cli loading edge case where setup_django wasn't running when it should 2024-11-19 04:20:00 -08:00
Nick Sweeting
f21b86aba8
better cli colors 2024-11-19 04:10:07 -08:00
Nick Sweeting
0f860d40f1
working archivebox_status CLI cmd 2024-11-19 04:05:05 -08:00
Nick Sweeting
292730ebad
working archivebox_schedule cmd 2024-11-19 03:54:47 -08:00
Nick Sweeting
3a64ced697
fix archivebox delete errors 2024-11-19 03:45:44 -08:00
Nick Sweeting
0347b911aa
archivebox add and remove CLI cmds 2024-11-19 03:40:01 -08:00
Nick Sweeting
2595139180
improve statemachine logging and archivebox update CLI cmd 2024-11-19 03:31:05 -08:00
Nick Sweeting
c9a05c9d94
working archivebox update CLI cmd 2024-11-19 02:32:05 -08:00
Nick Sweeting
a0edf218e8
fix archivebox init and archivebox install CLI commands 2024-11-19 01:05:49 -08:00
Nick Sweeting
5f01fc8307
fix archivebox shell and manage CLI commands 2024-11-19 00:48:39 -08:00
Nick Sweeting
328eb98a38
move main funcs into cli files and switch to using click for CLI 2024-11-19 00:18:51 -08:00
Nick Sweeting
569081a9eb
rename abid_utils to base_models 2024-11-18 19:40:05 -08:00
Nick Sweeting
65afd405b1
merge seeds and crawls apps 2024-11-18 19:23:14 -08:00
Nick Sweeting
4a5d607296
move logging_util into archivebox.misc subfolder 2024-11-18 19:08:49 -08:00
Nick Sweeting
e469c5a344
merge queues and actors apps into new workers app 2024-11-18 18:52:48 -08:00
Nick Sweeting
0acd388c02
fix imports and deps 2024-11-18 18:07:34 -08:00
Nick Sweeting
6b83b4c995
leave archivebox running when in archivebox update 2024-11-18 04:27:38 -08:00
Nick Sweeting
eeb2671e4d
API improvements 2024-11-18 04:27:38 -08:00