- Create archivebox/config/ldap.py with LDAPConfig class
- Create archivebox/ldap/ Django app with custom auth backend
- Update core/settings.py to conditionally load LDAP when enabled
- Add LDAP_CREATE_SUPERUSER support to auto-grant superuser privileges
- Add comprehensive tests in test_auth_ldap.py (no mocks, no skips)
- LDAP only activates if django-auth-ldap is installed and LDAP_ENABLED=True
- Helpful error messages when LDAP libraries are missing or config is incomplete
Fixes#1664
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>
Fixes#1445
This PR resolves the issue where SingleFile was not respecting Chrome
user data directory and other Chrome launch options that work for other
Chrome-based extractors (PDF, Screenshot, etc.).
## Changes
- Added `SINGLEFILE_CHROME_ARGS` config option with fallback to
`CHROME_ARGS`
- Updated SingleFile extractor to pass Chrome arguments via
`--browser-args`
- Updated documentation
This ensures SingleFile respects the same Chrome configuration as other
Chrome-based extractors.
Generated with [Claude Code](https://claude.ai/code)
Show small thumbnails of recently completed ArchiveResult content in the
progress header. The thumbnail strip appears below the stats bar and
shows the last 20 successfully archived items with embeddable content
(screenshots, favicons, DOM snapshots, etc.).
Features:
- API returns recent_thumbnails with embed paths for succeeded results
- Thumbnails display with plugin-specific icons as fallback
- New thumbnails animate in with a pop effect
- Clicking a thumbnail navigates to the snapshot admin page
- Horizontal scrollable strip with custom scrollbar styling
<!-- IMPORTANT: Do not submit PRs with only formatting / PEP8 / line
length changes. -->
# Summary
<!--e.g. This PR fixes ABC or adds the ability to do XYZ...-->
# Related issues
<!-- e.g. #123 or Roadmap goal #
https://github.com/pirate/ArchiveBox/wiki/Roadmap -->
# Changes these areas
- [ ] Bugfixes
- [ ] Feature behavior
- [ ] Command line interface
- [ ] Configuration options
- [ ] Internal architecture
- [ ] Snapshot data layout on disk
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Adds a thumbnail strip to the live progress header. It shows previews of
the last 20 successful archived items for quick visual feedback and
one-click navigation.
- **New Features**
- API returns recent_thumbnails with embed paths for succeeded results.
- Horizontal, scrollable thumbnail strip under the header.
- Uses preview images when available; plugin icons as fallback.
- New thumbnails animate in with a pop effect.
- Clicking a thumbnail opens the snapshot admin page.
<sup>Written for commit 17029ba8b8c3d1b405d3d0905506b0063c89dea6.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
…nstall
- Delete chrome/on_Crawl__10_chrome_validate.py (duplicates
chrome_install)
- Rename wget/on_Crawl__11_wget_validate.py →
on_Crawl__06_wget_install.py
All hooks now follow consistent naming: install, launch, or config
<!-- IMPORTANT: Do not submit PRs with only formatting / PEP8 / line
length changes. -->
# Summary
<!--e.g. This PR fixes ABC or adds the ability to do XYZ...-->
# Related issues
<!-- e.g. #123 or Roadmap goal #
https://github.com/pirate/ArchiveBox/wiki/Roadmap -->
# Changes these areas
- [ ] Bugfixes
- [ ] Feature behavior
- [ ] Command line interface
- [ ] Configuration options
- [ ] Internal architecture
- [ ] Snapshot data layout on disk
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Removed the redundant Chrome validate hook, renamed the Wget validate
hook to wget_install, and standardized hook names and priorities to
match the install/launch/config lifecycle. This removes duplicate logic
and fixes priority conflicts across Crawl, Binary, and Snapshot hooks.
- **Refactors**
- Deleted chrome/on_Crawl__10_chrome_validate.py (dup of chrome_install)
- Renamed wget validate to on_Crawl__06_wget_install.py
- Standardized on_Binary hook priorities: npm 10, pip 11, brew 12, apt
13, custom 14, env 15
- Fixed on_Snapshot order: staticfile 32, readability 56, mercury 57,
htmltotext 58
<sup>Written for commit 09a1ca3134847b47ca71576506cbac9c67a360ae.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
Show small thumbnails of recently completed ArchiveResult content in the
progress header. The thumbnail strip appears below the stats bar and shows
the last 20 successfully archived items with embeddable content (screenshots,
favicons, DOM snapshots, etc.).
Features:
- API returns recent_thumbnails with embed paths for succeeded results
- Thumbnails display with plugin-specific icons as fallback
- New thumbnails animate in with a pop effect
- Clicking a thumbnail navigates to the snapshot admin page
- Horizontal scrollable strip with custom scrollbar styling
- Add assertIsNotNone for accessibility_data to ensure test fails if no data generated
- Capture and report JSON decode errors in parse_dom_outlinks test
- Add assertIsNotNone for outlinks_data with error details
- Removes conditional checks that allowed tests to pass without verifying functionality
Addresses review comments from cubic-dev-ai
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>
- test_seo.py: Add assertIsNotNone before conditional to catch SEO extraction failures
- test_ssl.py: Add assertIsNotNone to ensure SSL data is captured from HTTPS URLs
- test_pip_provider.py: Assert jsonl_found variable to verify binary discovery
- dns plugin: Deduplicate NXDOMAIN records using seenResolutions map
Tests now fail when functionality doesn't work (no cheating).
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>
Add input validation and path safety checks to prevent path traversal
attacks in persona name handling:
- Add validate_persona_name() to block dangerous characters (/, \, .., etc)
- Add ensure_path_within_personas_dir() to verify resolved paths stay within PERSONAS_DIR
- Apply validation at persona creation, renaming, and deletion operations
Fixes security issues identified by cubic-dev-ai in PR review.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>
- merkletree: Tests merkle tree generation with real files,
empty directory handling, and disabled mode
- custom: Tests custom bash command execution and binary discovery
Real integration tests using Chrome sessions with example.com:
- accessibility: Tests page outline and accessibility tree extraction
- parse_dom_outlinks: Tests link extraction and categorization
- consolelog: Tests console output capture
The assertion was checking 'has_seo_data or seo_data' inside an 'if seo_data:' block,
making it always truthy. Changed to just check 'has_seo_data' to properly verify
that expected SEO keys were extracted.
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>