Quick start
Install, configure Postgres, run migrations, seed projects, and start web + worker.
What you get
- A public website that helps people find datasets for projects.
- A management console for crawler projects, sources, and run history.
- A worker process that runs crawls via pg-boss.
- A Postgres schema that stores datasets, resources, runs, logs, and analytics profiles.
Prerequisites
- Node.js 20+ (22 recommended)
- Postgres 14+
1) Configure
Copy the example config and edit your values:
cp config/registry.config.example.json config/registry.config.local.json
Set:
- db.* connection details (or db.connectionString)
- security.emergencyAdminToken
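As a sketch, a minimal config/registry.config.local.json covering the two settings above might look like this (the connection string and token values are placeholders, and any fields beyond db and security are not shown here; check the example file for the full shape):

```json
{
  "db": {
    "connectionString": "postgres://registry:secret@localhost:5432/registry"
  },
  "security": {
    "emergencyAdminToken": "change-me"
  }
}
```

Alternatively, use the individual db.* connection fields instead of db.connectionString, per the example config.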
2) Install
npm install
3) Migrate + seed
npm run db:migrate
npm run db:seed
Or via the CLI (equivalent):
registry db migrate
registry db seed
Or do a clean reset:
npm run db:reset
CLI equivalent:
registry db reset
4) Start the web app
npm run dev
# or
npm run build && npm run start
Open:
- Public: http://localhost:3000
- Management UI: http://localhost:3000/app
5) Start the worker
In a second terminal:
npm run worker
6) Run a seeded crawler project
- Go to Management → Crawler projects
- Click Run now on a seeded project
- Inspect Crawl runs for results
7) Health checks
- HTTP: GET /api/health (for example: curl http://localhost:3000/api/health)
- CLI: npm run doctor
Notes
- In the current build, cron schedule changes require a worker restart to take effect.
- For production, use a process manager (systemd / pm2 / supervisor).
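If you go the systemd route, a unit for the worker might look like the sketch below. The install path, service user, and unit name are assumptions for illustration; adjust them to your deployment.

```ini
# /etc/systemd/system/registry-worker.service (illustrative)
[Unit]
Description=Registry crawl worker
After=network.target postgresql.service

[Service]
# Assumed install location; point this at your checkout.
WorkingDirectory=/opt/registry
ExecStart=/usr/bin/npm run worker
Restart=on-failure
User=registry

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable --now registry-worker. Run the web app under a second unit (or pm2/supervisor) the same way.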