The Tech Stack
Are you an experienced developer with knowledge of GoLang, Postgres, TypeScript/Angular and/or BitTorrent protocols? I’m currently a lone developer with a full-time job and many other commitments, and have been working on this in spare moments for the past few months. This project is too big for one person! If you’re interested in contributing, please review the open issues and feel free to open a PR!
Postgres
Postgres is the primary data store, and powers the search engine and message queue. These make use of several Postgres-specific features and extensions; as such, supporting other storage engines is likely to be complicated and is not a priority at the moment.
GoLang Backend
Some key libraries used include:
- anacrolix/torrent: not heavily used right now, but contains many useful BitTorrent utilities and could drive future features such as in-place seeding
- fx for dependency injection and management of the application lifecycle
- gin for the HTTP server
- goose for database migrations
- gorm for database access
- gqlgen for the GraphQL server implementation
- rex: a regular expression library that makes some of the monstrous classification regexes more manageable
- urfave/cli for the command line interface
- zap for logging
TypeScript/Angular Web UI
The web UI is built with Angular Material components. It is embedded in the GoLang binary and served by the Gin web framework, hence the build artifacts are committed into the repository.
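To illustrate the idea of serving a UI from files bundled into the binary, here is a minimal sketch using only the Go standard library. It is not the project's actual code: the names uiFiles, newUIHandler and fetch are hypothetical, plain net/http stands in for Gin, and an in-memory file system stands in for the committed build artifacts (a real binary would more likely populate an embed.FS with a //go:embed directive).

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"testing/fstest"
)

// uiFiles is a hypothetical stand-in for the embedded web UI build output.
// It is an in-memory fs.FS so that the sketch runs without any files on disk.
var uiFiles fs.FS = fstest.MapFS{
	"index.html": &fstest.MapFile{Data: []byte("<html>web UI</html>")},
	"main.js":    &fstest.MapFile{Data: []byte("console.log('ui loaded');")},
}

// newUIHandler returns a handler that serves static files from the given
// file system. The same handler works for an embed.FS.
func newUIHandler(files fs.FS) http.Handler {
	return http.FileServer(http.FS(files))
}

// fetch performs an in-process GET request against the handler and returns
// the response status code and body.
func fetch(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	h.ServeHTTP(rec, req)
	body, err := io.ReadAll(rec.Result().Body)
	if err != nil {
		return rec.Code, ""
	}
	return rec.Code, string(body)
}

func main() {
	status, body := fetch(newUIHandler(uiFiles), "/main.js")
	fmt.Println(status, body)
}
```

Because the handler only depends on the fs.FS interface, swapping the in-memory file system for an embedded one would not change the serving code.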
Nix dev shell
The repository includes a Nix shell for a reproducible development environment. To use the shell, install Nix, then run nix develop (or better still, use nix-direnv to load the included shell automatically).
Other tooling
- The repository includes a Taskfile containing several useful development scripts
- GitHub Actions are used for CI, for building the Docker image and for building this website
Architecture & Lifecycle of the DHT Crawler
The DHT and BitTorrent protocols are (rather impenetrably) documented at bittorrent.org. Relevant resources include:
- BEP 5: DHT Protocol
- BEP 51: Infohash Indexing
- BEP 33: DHT Scrapes
- BEP 10: Extension Protocol
- The Kademlia paper
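As a small taste of what these documents describe: BEP 5 and the Kademlia paper define the distance between two 160-bit identifiers (node IDs or infohashes) as their bitwise XOR, interpreted as an unsigned integer. The following is a minimal sketch of that metric, not code from this project; the names ID, Distance and Closer are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// ID is a 160-bit identifier, the size BEP 5 uses for node IDs and infohashes.
type ID [20]byte

// Distance returns the XOR distance between a and b.
func Distance(a, b ID) ID {
	var d ID
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// Closer reports whether a is strictly closer to target than b is.
// Distances are compared as big-endian unsigned integers, which for
// equal-length byte slices matches lexicographic byte order.
func Closer(target, a, b ID) bool {
	da, db := Distance(target, a), Distance(target, b)
	return bytes.Compare(da[:], db[:]) < 0
}

func main() {
	var target ID      // all zero bits
	a := ID{19: 0x01}  // differs from target only in the lowest bit
	b := ID{0: 0x80}   // differs from target in the highest bit
	fmt.Println(Closer(target, a, b))
}
```

A crawler or DHT client can use a comparison like Closer to decide which known nodes to query next when looking for nodes near a given identifier.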
The rest of what I’ve figured out about how to implement a DHT crawler was cobbled together from the now archived magnetico project and anacrolix’s BitTorrent libraries.