mirror of https://github.com/thegeneralist01/archivr synced 2026-05-30 08:36:47 +02:00
archivr/docs/README.md
TheGeneralist 2d59ab0af5
feat: add archiving of platform media files (#1)
* chore: specify non-ignored `.md` files

* refactor: rename youtube downloader to ytdlp

More generic name since yt-dlp supports many sites beyond YouTube.

* feat: add local file downloader

Supports file:// URLs for archiving local files.

* deps: add regex crate for URL pattern matching

* feat: expand source detection with granular YouTube types

- Split Source::YouTube into YouTubeVideo, YouTubePlaylist, YouTubeChannel
- Add Source::X for Twitter/X posts
- Add Source::Local for file:// URLs
- Add regex-based URL pattern matching for YouTube URLs
- Add shorthand schemes (yt:video/ID, youtube:playlist/ID, etc.)
- Add comprehensive tests for all URL patterns

* docs: update README milestones

Mark YouTube videos, Twitter videos, and local files as done.

* chore: update flake.lock

* feat: add shorthand schemes for X/Twitter media

* chore: move docs into docs dir

* Remove temp file using timestamp path

Delete the temp entry at store_path/temp/<timestamp> in both
the hash-exists and success paths. Stop constructing the full filename
with extension and remove the early process::exit to de-duplicate
cleanup.

* Add Nix caches and default flake package

* Add social platform source detection and update milestones

* Tighten social URL matching to avoid false positives

* Mark media archiving milestone complete
2026-03-31 12:39:35 +02:00

# archivr
An open-source self-hosted archiving tool. Work in progress.
## Milestones
- [ ] Archiving
  - [X] Archiving media files from social media platforms
    - [X] YouTube Videos
    - [X] Twitter Videos
    - [X] Instagram
    - [X] Facebook
    - [X] TikTok
    - [X] Reddit
    - [X] Snapchat
    - [ ] YouTube Posts (postponed)
  - [X] Archiving local files
  - [ ] Archiving files from cloud storage services (Google Drive, Dropbox, OneDrive) and from URLs
    - [ ] URLs
    - [ ] Google Drive
    - [ ] Dropbox
    - [ ] OneDrive
    - (Some of these could be postponed for later.)
  - [ ] Archiving Twitter threads
  - [ ] Archive web pages (HTML, CSS, JS, images)
  - [ ] Archiving emails (???)
    - [ ] Gmail
    - [ ] Outlook
    - [ ] Yahoo Mail
- [ ] Management
  - [ ] Deduplication
  - [ ] Tagging system
  - [ ] Search functionality
  - [ ] Categorization
  - [ ] Metadata extraction and storage
- [ ] User Interface
  - [ ] Web-based UI
- [ ] Backup and Sync
  - [ ] Cloud backup (AWS S3, Google Cloud Storage)
  - [ ] Local backup
## Motivation
There are two driving factors behind this project:
- In the age of information, all data is ephemeral. Social media platforms frequently delete content, and cloud storage services can become inaccessible or unreliable. Being able to archive the data that matters is *essential* for preserving personal memories and digital history.
- I plan to create a small encyclopedia for my future family and kids, so I want to make sure that all the information I gather is preserved and accessible for future reference.
This project aims to provide a reliable solution for archiving important data from various sources, ensuring that users can preserve their digital assets for the long term.
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE.md) file for details.