mirror of https://github.com/thegeneralist01/archivr synced 2026-05-30 08:36:47 +02:00

Flatten tweet archives and rearchive tweet assets

TheGeneralist 2026-04-01 14:56:39 +02:00
parent 805916eee7
commit cb0abbb760
Signed by: thegeneralist01
SSH key fingerprint: SHA256:pp9qddbCNmVNoSjevdvQvM5z0DHN7LTa8qBMbcMq/R4
4 changed files with 466 additions and 13 deletions

@@ -50,6 +50,8 @@ This project aims to provide a reliable solution for archiving important data fr
- Tweet media/video: `tweet:media:ID`
- Thread TOML content: `x:thread:ID`, `twitter:thread:ID`
Tweet and thread TOMLs are stored directly in `raw_tweets/`. Downloaded tweet media and avatars are re-archived into the hashed `raw/` store, and the TOMLs point at those archived files using store-relative `raw/...` paths.
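The re-archiving step described above can be illustrated with a minimal Python sketch. It is not taken from the archivr source: the function name `rearchive`, the choice of SHA-256, and the `raw/<prefix>/<digest><ext>` layout are all assumptions made for illustration. The only behavior taken from the README is that files land in a hashed `raw/` store and are referenced by store-relative `raw/...` paths.

```python
import hashlib
import shutil
from pathlib import Path


def rearchive(src: Path, store_root: Path) -> str:
    """Copy src into the hashed store and return its store-relative path.

    Hypothetical helper: the hash function and directory layout are
    assumptions, not archivr's actual scheme.
    """
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    # Assumed layout: raw/<first two hex chars>/<full digest><original suffix>
    rel = Path("raw") / digest[:2] / f"{digest}{src.suffix}"
    dest = store_root / rel
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        # Identical content hashes to the same name, so it is stored only once.
        shutil.copy2(src, dest)
    # A TOML would record this relative string rather than an absolute path.
    return rel.as_posix()
```

Because the returned path is relative to the store root, a tweet TOML that records it stays valid if the whole store directory is moved.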
Twitter tweet/thread scraping requires `ARCHIVR_TWITTER_CREDENTIALS_FILE` to point to a cookies file for the vendored scraper.
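As a rough sketch of how a caller might validate this setting before starting a scrape, the Python snippet below resolves the variable and fails early if it is missing. The function name `twitter_credentials_path` and the error messages are hypothetical; only the variable name `ARCHIVR_TWITTER_CREDENTIALS_FILE` comes from the README, and the format of the cookies file is not checked here.

```python
import os
from pathlib import Path


def twitter_credentials_path(env=None) -> Path:
    """Return the cookies file path named by ARCHIVR_TWITTER_CREDENTIALS_FILE.

    Hypothetical helper; raises RuntimeError if the variable is unset
    or does not point at an existing file.
    """
    env = os.environ if env is None else env
    value = env.get("ARCHIVR_TWITTER_CREDENTIALS_FILE")
    if not value:
        raise RuntimeError("ARCHIVR_TWITTER_CREDENTIALS_FILE is not set")
    path = Path(value)
    if not path.is_file():
        raise RuntimeError(f"cookies file not found: {path}")
    return path
```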
## License