DataShield MCP Dataset Library

Management UI guide

What each page does and the safest way to operate the crawlers.

Access

Open /app and enter your admin token.

The token is stored only in your browser's localStorage.
Set the token the server expects with EMERGENCY_ADMIN_TOKEN.
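This guide does not describe how the server checks the token. As a minimal sketch, assuming EMERGENCY_ADMIN_TOKEN is read from an environment variable, a check could look like the following. The function name `token_matches` is hypothetical and not part of the product.

```python
import hmac
import os
from typing import Optional


def token_matches(presented: str, expected: Optional[str]) -> bool:
    """Return True only if an expected token is configured and it matches."""
    if not expected:
        # No token configured: refuse everything rather than allow open access.
        return False
    # compare_digest takes time independent of where the strings differ.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


# Assumption: the expected token comes from the process environment.
expected_token = os.environ.get("EMERGENCY_ADMIN_TOKEN")
```

Refusing all requests when no token is set is a deliberate choice in this sketch, so a missing configuration value fails closed.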

Pages

Dashboard (`/app`)

High-level metrics
Provider breakdown
Recent runs
Recent logs

Crawler projects (`/app/crawlers`)

Edit project:
- enabled/paused
- cron schedule
- concurrency
- rate limit (RPS)
- max requests per run
Run now
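To make the editable settings concrete, here is a sketch of a project's settings as a Python dataclass with basic sanity checks. All field names and defaults are hypothetical, and the five-field cron format is an assumption; the UI's real storage format may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProjectSettings:
    # Hypothetical field names mirroring the options on the edit form.
    enabled: bool = False
    cron_schedule: Optional[str] = None  # None means manual runs only
    concurrency: int = 1
    rate_limit_rps: float = 1.0
    max_requests_per_run: int = 500


def validate(settings: ProjectSettings) -> List[str]:
    """Return a list of human-readable problems; empty means the settings look sane."""
    problems = []
    if settings.concurrency < 1:
        problems.append("concurrency must be at least 1")
    if settings.rate_limit_rps <= 0:
        problems.append("rate limit (RPS) must be positive")
    if settings.max_requests_per_run < 1:
        problems.append("max requests per run must be at least 1")
    # Assumption: schedules use the common five-field cron format.
    if settings.cron_schedule is not None and len(settings.cron_schedule.split()) != 5:
        problems.append("cron schedule should have five space-separated fields")
    return problems
```

The defaults start paused with no schedule, so a new project runs only when triggered manually.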

Best practice

Start with enabled=false and manual runs.
Verify that the provider works and that you are not being blocked.
Only then add cron schedules.

Sources (`/app/sources`)

Add portal roots or dataset landing pages.
The auto-detected provider is stored with the source.

Tip: Use multiple projects per provider if you want to split work (e.g. “Socrata - Cities” vs “Socrata - States”).

Runs (`/app/runs`)

Each run records what changed.
Inspect run items if something unexpected happened.

Datasets (`/app/datasets`)

Search the library.
Add/edit:
- steward notes
- notes_json_text
- extras_json_text
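The field names suggest that notes_json_text and extras_json_text hold JSON as text. Assuming that is the case, it can help to check the text locally before pasting it in. The helper below is an illustrative sketch, not part of the product.

```python
import json
from typing import Optional


def json_text_error(text: str) -> Optional[str]:
    """Return None if the text parses as JSON, otherwise a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None
```

For example, `json_text_error('{"license": "CC-BY"}')` returns `None`, while a string with a missing value returns a message pointing at the offending position.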

Alerts (`/app/alerts`)

Quick view of errors and failing sources.

App admin (`/app/admin`)

Doctor health report
Test DB credentials and optionally write the config file
Purge logs
Create/disable API keys
Reset registry schema (testing)

Operating modes

Safe mode (recommended)

Use manual runs only
Keep concurrency <= 2
Keep RPS <= 1
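The safe-mode limits above can be expressed as a simple check. This is a sketch only: the function names are hypothetical, and it assumes the RPS limit applies to the project as a whole.

```python
SAFE_MAX_CONCURRENCY = 2
SAFE_MAX_RPS = 1.0


def is_safe_mode(concurrency: int, rps: float, has_cron_schedule: bool) -> bool:
    """True when a project stays within the safe-mode limits listed above."""
    return (
        not has_cron_schedule  # manual runs only
        and concurrency <= SAFE_MAX_CONCURRENCY
        and rps <= SAFE_MAX_RPS
    )


def min_run_seconds(max_requests: int, rps: float) -> float:
    """Lower bound on run duration if the rate limit is respected."""
    return max_requests / rps
```

At 1 RPS, a run capped at 600 requests takes at least 600 seconds, which is worth keeping in mind when choosing the max requests per run.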

Growth mode

Split providers into multiple projects
Stagger cron schedules
Watch logs and HTTP errors
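Staggering can be as simple as offsetting each project's start time by a fixed gap. The sketch below generates daily schedules, assuming the standard five-field cron format (minute, hour, day of month, month, day of week); the function name and defaults are illustrative.

```python
from typing import List


def staggered_daily_crons(count: int, start_hour: int = 2, gap_minutes: int = 20) -> List[str]:
    """Build `count` daily cron expressions, each starting `gap_minutes` after the previous."""
    crons = []
    for i in range(count):
        total_minutes = start_hour * 60 + i * gap_minutes
        # Wrap past midnight so every expression stays valid.
        hour, minute = divmod(total_minutes % (24 * 60), 60)
        crons.append(f"{minute} {hour} * * *")
    return crons
```

With the defaults, four projects would start at 02:00, 02:20, 02:40, and 03:00. Pick a gap longer than a typical run so projects hitting the same provider do not overlap.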

Agent-managed mode

Create API keys with limited scopes
Use /api/events to record agent usage
Use the MCP server for tool-based access