mirror of
https://github.com/thegeneralist01/archivr
synced 2026-05-30 08:36:47 +02:00
feat: add archiving of platform media files (#1)
* chore: specify non-ignored `.md` files
* refactor: rename youtube downloader to ytdlp
  More generic name since yt-dlp supports many sites beyond YouTube.
* feat: add local file downloader
  Supports file:// URLs for archiving local files.
* deps: add regex crate for URL pattern matching
* feat: expand source detection with granular YouTube types
  - Split Source::YouTube into YouTubeVideo, YouTubePlaylist, YouTubeChannel
  - Add Source::X for Twitter/X posts
  - Add Source::Local for file:// URLs
  - Add regex-based URL pattern matching for YouTube URLs
  - Add shorthand schemes (yt:video/ID, youtube:playlist/ID, etc.)
  - Add comprehensive tests for all URL patterns
* docs: update README milestones
  Mark YouTube videos, Twitter videos, and local files as done.
* chore: update flake.lock
* feat: add shorthand schemes for X/Twitter media
* chore: move docs into docs dir
* Remove temp file using timestamp path
  Delete the temp entry at store_path/temp/<timestamp> in both the hash-exists and success paths. Stop constructing the full filename with extension and remove the early process::exit to de-duplicate cleanup.
* Add Nix caches and default flake package
* Add social platform source detection and update milestones
* Tighten social URL matching to avoid false positives
* Mark media archiving milestone complete
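The source detection described in the commit message can be illustrated with a minimal sketch. The variant names `YouTubeVideo`, `YouTubePlaylist`, `YouTubeChannel`, `X`, and `Local` come from the message itself; everything else is an assumption made for illustration, including the `Other` variant, the `detect_source` function name, and the exact URL shapes accepted. The commit uses the regex crate for pattern matching, whereas this sketch swaps in plain standard-library string methods so it stays dependency-free, and it is not the project's actual implementation.

```rust
// Hypothetical sketch: `Other` and `detect_source` are illustrative names,
// and the accepted URL shapes are assumptions, not archivr's real rules.
#[derive(Debug, PartialEq)]
enum Source {
    YouTubeVideo,
    YouTubePlaylist,
    YouTubeChannel,
    X,
    Local,
    Other,
}

fn detect_source(input: &str) -> Source {
    // Local files are addressed with the file:// scheme.
    if input.starts_with("file://") {
        return Source::Local;
    }

    // Shorthand schemes such as yt:video/ID or youtube:playlist/ID.
    for prefix in ["yt:", "youtube:"] {
        if let Some(rest) = input.strip_prefix(prefix) {
            return match rest.split_once('/') {
                Some(("video", id)) if !id.is_empty() => Source::YouTubeVideo,
                Some(("playlist", id)) if !id.is_empty() => Source::YouTubePlaylist,
                Some(("channel", id)) if !id.is_empty() => Source::YouTubeChannel,
                _ => Source::Other,
            };
        }
    }

    // Full URLs: drop the scheme and an optional "www." before inspecting
    // the host, so a platform name buried in the path cannot match.
    let rest = input
        .strip_prefix("https://")
        .or_else(|| input.strip_prefix("http://"))
        .unwrap_or(input);
    let rest = rest.strip_prefix("www.").unwrap_or(rest);
    let (host, path) = rest.split_once('/').unwrap_or((rest, ""));

    match host {
        "youtu.be" if !path.is_empty() => Source::YouTubeVideo,
        "youtube.com" => {
            if path.starts_with("watch?") {
                Source::YouTubeVideo
            } else if path.starts_with("playlist?") {
                Source::YouTubePlaylist
            } else if path.starts_with("channel/") || path.starts_with('@') {
                Source::YouTubeChannel
            } else {
                Source::Other
            }
        }
        "x.com" | "twitter.com" if path.contains("/status/") => Source::X,
        _ => Source::Other,
    }
}
```

Matching on the host rather than searching the whole string is one way to get the "avoid false positives" behaviour the message mentions: a URL like `https://example.com/x.com/status/1` falls through to `Source::Other` in this sketch.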
parent 553cca99ca
commit 2d59ab0af5
12 changed files with 616 additions and 74 deletions
52
README.md
@@ -1,52 +0,0 @@
# archivr

An open-source self-hosted archiving solution. Work in progress.

## Milestones
- [ ] Archiving
  - [ ] Archiving media files from social media platforms
    - [ ] YouTube
    - [ ] Twitter
    - [ ] Instagram
    - [ ] Facebook
    - [ ] TikTok
    - [ ] Reddit
    - [ ] Snapchat
    - (Some of these could be postponed for later.)
  - [ ] Archiving local files
    - [ ] Archive videos (MP4, WebM)
    - [ ] Archive audio files (MP3, WAV)
    - [ ] Archive documents (DOCX, XLSX, PPTX)
    - [ ] Archive PDFs
    - [ ] Archive images (JPEG, PNG, GIF)
  - [ ] Archiving files from cloud storage services (Google Drive, Dropbox, OneDrive) and from URLs
    - [ ] URLs
    - [ ] Google Drive
    - [ ] Dropbox
    - [ ] OneDrive
  - [ ] Archive web pages (HTML, CSS, JS, images)
  - [ ] Archiving emails (???)
    - [ ] Gmail
    - [ ] Outlook
    - [ ] Yahoo Mail
- [ ] Management
  - [ ] Deduplication
  - [ ] Tagging system
  - [ ] Search functionality
  - [ ] Categorization
  - [ ] Metadata extraction and storage
- [ ] User Interface
  - [ ] Web-based UI
- [ ] Backup and Sync
  - [ ] Cloud backup (AWS S3, Google Cloud Storage)
  - [ ] Local backup

## Motivation
There are two driving factors behind this project:
- In the age of information, all data is ephemeral. Social media platforms frequently delete content, and cloud storage services can become inaccessible and unreliable. Being able to archive important data is *very important* for preserving personal memories and digital history.
- I will be creating a small encyclopedia for my future family and kids. Therefore, I want to make sure that all the information I gather is preserved and accessible for future reference.

This project aims to provide a reliable solution for archiving important data from various sources, ensuring that users can preserve their digital assets for the long term.

## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE.md) file for details.