DataShield MCP Dataset Library

Deploying DataShield MCP Dataset Library

Target: library.myorg.ai — separate project, separate database, same Postgres server.


1. Create the database

On your existing Postgres server, create a dedicated database:

CREATE DATABASE myorg_dataset_library OWNER postgres;

Verify connectivity from the hosting server:

psql postgres://postgres:YOUR_PASSWORD@127.0.0.1:5432/myorg_dataset_library -c "SELECT 1;"
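One detail the connection URI leaves implicit: if YOUR_PASSWORD contains characters such as @, : or /, they need to be percent-encoded before they go into a postgres:// URI. The helper below is a minimal POSIX shell sketch of that encoding; urlencode is a hypothetical name and not part of this project.

```shell
# urlencode STRING: percent-encode everything except RFC 3986 unreserved characters.
urlencode() {
  # od emits the input as one hex byte per token; tr puts each token on its own line.
  printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' | while read -r hex; do
    [ -n "$hex" ] || continue
    # Turn the hex byte back into a character via its octal escape.
    char=$(printf "\\$(printf '%03o' "0x$hex")")
    case "$char" in
      [A-Za-z0-9._~-]) printf '%s' "$char" ;;
      *) printf '%%%s' "$(printf '%s' "$hex" | tr 'a-f' 'A-F')" ;;
    esac
  done
}

# Example (not run here):
#   psql "postgres://postgres:$(urlencode 'p@ss:w/rd')@127.0.0.1:5432/myorg_dataset_library" -c "SELECT 1;"
```

The same encoded value would go into DATABASE_URL or the JSON connectionString in step 3.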

2. Unzip and install

# Wherever you keep projects (e.g. /var/www or /home/user/apps)
mkdir -p /var/www/library.myorg.ai
cd /var/www/library.myorg.ai
# Copy the release zip into this directory first
unzip myorg-public-dataset-library__v0_2_0.zip

npm install

3. Configure

Option A: .env file (simplest)

cp .env.example .env

Edit .env:

DATABASE_URL=postgres://postgres:YOUR_PASSWORD@127.0.0.1:5432/myorg_dataset_library
EMERGENCY_ADMIN_TOKEN=pick-a-strong-token-here
TZ=America/Chicago
LOG_LEVEL=info
MCP_PORT=3100
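EMERGENCY_ADMIN_TOKEN should be hard to guess. One way to produce a value is sketched here with standard Unix tools; generate_token is a hypothetical helper name, and any other source of strong randomness works equally well.

```shell
# generate_token: print a 64-character hex token.
generate_token() {
  # Read 32 bytes from the kernel's random source and hex-encode them.
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

generate_token; echo
```

Because .env then holds both the database password and the admin token, restricting it to the owning user (chmod 600 .env) is a sensible follow-up.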

Option B: JSON config (more control)

cp config/registry.config.example.json config/registry.config.local.json

Edit config/registry.config.local.json:

{
  "app": {
    "publicSiteName": "DataShield Public Dataset Library",
    "publicBaseUrl": "https://library.myorg.ai",
    "adminBaseUrl": "https://library.myorg.ai/app",
    "enablePublicIndexing": true
  },
  "db": {
    "connectionString": "postgres://postgres:YOUR_PASSWORD@127.0.0.1:5432/myorg_dataset_library"
  },
  "security": {
    "emergencyAdminToken": "pick-a-strong-token-here"
  }
}

4. Run migrations + seed

npm run db:migrate
npm run db:seed

Verify:

npm run doctor

5. Build the Next.js app

npm run build

6. Run the three processes

The app has three independent processes. All three should be running:

| Process | Command | Port | Purpose |
|---------|---------|------|---------|
| Web | npm start | 3000 (default) | Next.js frontend + API |
| Worker | npm run worker | none | pg-boss crawl scheduler |
| MCP HTTP | npm run mcp:http | 3100 | Agent tool access (stateless HTTP) |

Using PM2 (recommended for Hostinger/VPS):

npm install -g pm2

# Start all three
pm2 start npm --name "pdl-web" -- start
pm2 start npm --name "pdl-worker" -- run worker
pm2 start npm --name "pdl-mcp" -- run mcp:http

# Register PM2 as a boot service (run the command it prints), then save the
# process list so all three restart on reboot
pm2 startup
pm2 save

Custom PORT for the web process (use this in place of the pdl-web command above, and point the reverse proxy at the same port):

PORT=3001 pm2 start npm --name "pdl-web" -- start
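Before moving on to the reverse proxy, it can help to confirm that the web and MCP processes actually accept connections on their local ports. The following is a minimal polling sketch that assumes curl is installed; wait_for_http is a hypothetical helper, and the /api/health and /mcp paths are the ones used in the verification step of this guide.

```shell
# wait_for_http URL [TRIES] [DELAY_SECONDS]
# Returns 0 as soon as the URL yields any HTTP response, 1 if it never does.
wait_for_http() {
  url="$1"
  tries="${2:-10}"
  delay="${3:-1}"
  while [ "$tries" -gt 0 ]; do
    # Without -f, curl exits 0 on any HTTP response (even 404), which is
    # enough to show the process is listening.
    if curl -s -o /dev/null "$url"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Example (not run here):
#   wait_for_http http://127.0.0.1:3000/api/health && wait_for_http http://127.0.0.1:3100/mcp
```

The worker has no port, so its state still has to be checked through pm2 status or its logs.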

7. Reverse proxy (Nginx / LiteSpeed / Caddy)

Point library.myorg.ai at the Next.js process.

Nginx example:

server {
    listen 443 ssl;
    server_name library.myorg.ai;

    ssl_certificate     /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    # Next.js web app
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # MCP endpoint (stateless HTTP — no SSE, no long-lived connections)
    location /mcp {
        proxy_pass http://127.0.0.1:3100/mcp;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
    }
}

LiteSpeed (Hostinger) — .htaccess proxy:

If using LiteSpeed with the Node.js hosting setup described in docs/devops/deploy-hostinger-litespeed.md, configure the app entry point and set APP_PORT=3000 in the hosting panel. The MCP endpoint can be accessed directly at port 3100 or proxied via a subdomain.


8. Verify deployment

# Health check
curl https://library.myorg.ai/api/health

# Search API
curl "https://library.myorg.ai/api/public/datasets?q=lottery"

# Admin API (requires token)
curl -H "x-admin-token: YOUR_TOKEN" https://library.myorg.ai/api/app/metrics

# MCP endpoint (should return JSON server info)
curl https://library.myorg.ai/mcp

# MCP tool test (initialize); Streamable HTTP clients must accept both content types
curl -X POST https://library.myorg.ai/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'
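The first three checks can also be wrapped in a small script that reports every status code and exits non-zero if any endpoint fails, which is convenient after each deploy. This is a sketch under assumptions: smoke_check and run_smoke_checks are hypothetical names, BASE_URL and ADMIN_TOKEN are illustrative variables, and it assumes each endpoint answers with a 2xx status when healthy.

```shell
BASE_URL="${BASE_URL:-https://library.myorg.ai}"

# smoke_check URL [extra curl args...]
# Prints "<status> <url>" and returns non-zero unless the status is 2xx.
smoke_check() {
  url="$1"
  shift
  # -w '%{http_code}' prints only the status code; the body is discarded.
  status=$(curl -s -o /dev/null -w '%{http_code}' "$@" "$url") || status=000
  printf '%s %s\n' "$status" "$url"
  case "$status" in
    2??) return 0 ;;
    *)   return 1 ;;
  esac
}

# Run every check even if an earlier one fails, then report overall success.
run_smoke_checks() {
  failed=0
  smoke_check "$BASE_URL/api/health" || failed=1
  smoke_check "$BASE_URL/api/public/datasets?q=lottery" || failed=1
  smoke_check "$BASE_URL/api/app/metrics" -H "x-admin-token: ${ADMIN_TOKEN:-YOUR_TOKEN}" || failed=1
  return "$failed"
}
```

Calling run_smoke_checks with ADMIN_TOKEN exported prints one line per endpoint and can gate a deploy script on its exit status.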

9. Connect MCP to your analysis engine

Stdio mode (local, e.g. Claude Desktop):

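Add to your Claude Desktop claude_desktop_config.json (--silent keeps npm's own banner off stdout, which the stdio transport reserves for JSON-RPC messages):

{
  "mcpServers": {
    "dataset-library": {
      "command": "npm",
      "args": ["run", "--silent", "mcp"],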
      "cwd": "/var/www/library.myorg.ai"
    }
  }
}

HTTP mode (remote / DataShield integration):

The MCP server at https://library.myorg.ai/mcp uses the stateless Streamable HTTP transport. Each POST is handled independently, with no sessions, SSE streams, or long-lived connections, so it works reliably behind LiteSpeed and Nginx.
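Since every POST stands alone, a client can send any JSON-RPC request as a single self-contained call. The sketch below builds such a request and posts it with curl. mcp_payload and mcp_post are hypothetical helper names, and it is an assumption (implied by stateless mode but not confirmed here) that the server answers tools/list without a preceding initialize.

```shell
MCP_URL="${MCP_URL:-https://library.myorg.ai/mcp}"

# mcp_payload ID METHOD [PARAMS_JSON]
# Builds a JSON-RPC 2.0 request body; PARAMS_JSON must already be valid JSON.
mcp_payload() {
  params="${3:-}"
  [ -n "$params" ] || params='{}'
  printf '{"jsonrpc":"2.0","id":%s,"method":"%s","params":%s}' "$1" "$2" "$params"
}

# mcp_post ID METHOD [PARAMS_JSON]
# Sends one independent request; no session state is kept between calls.
mcp_post() {
  curl -sS -X POST "$MCP_URL" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json, text/event-stream" \
    -d "$(mcp_payload "$@")"
}

# Example (not run here): list the tools the server exposes
#   mcp_post 2 tools/list
```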

Claude.ai Custom Connector URL: https://library.myorg.ai/mcp

IMPORTANT: The SDK version is pinned to 1.26.0 in package.json. Do NOT use a caret (^1.26.0) — newer SDK versions change Origin validation and connection lifecycle behavior.


10. Logging

Structured JSON logs go to stderr. View with:

# PM2 logs
pm2 logs pdl-worker
pm2 logs pdl-mcp

# Set debug level for troubleshooting (--update-env makes PM2 pick up the new value)
LOG_LEVEL=debug pm2 restart pdl-worker --update-env

Log format:

{"ts":"2026-03-08T12:00:00.000Z","level":"info","event":"crawl.run.started","projectId":"...","runId":"...","sources":3}
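Because each log entry is a single JSON object per line, ordinary text tools are enough for quick triage. The helpers below are a rough sketch that assumes the compact key/value layout shown in the sample line; filter_level and event_names are hypothetical names, and a JSON-aware tool would be more robust if the format varies.

```shell
# Keep only log lines whose "level" field equals $1 (for example: error).
filter_level() {
  grep -F "\"level\":\"$1\""
}

# Print just the "event" value of each JSON log line.
event_names() {
  sed -n 's/.*"event":"\([^"]*\)".*/\1/p'
}

# Example (not run here): count recent worker errors by event name
#   pm2 logs pdl-worker --raw --nostream --lines 200 2>&1 | filter_level error | event_names | sort | uniq -c
```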

Architecture summary

                    ┌──────────────────────────────────────────┐
                    │         library.myorg.ai (Nginx)         │
                    └──────┬─────────────────────┬─────────────┘
                           │                     │
                    ┌──────▼──────┐       ┌──────▼──────┐
                    │  Next.js    │       │  MCP HTTP   │
                    │  :3000      │       │  :3100      │
                    └──────┬──────┘       └──────┬──────┘
                           │                     │
                    ┌──────▼─────────────────────▼───────┐
                    │    Postgres (myorg_dataset_library)│
                    │    registry.* schema               │
                    │    pgboss.* schema (worker)        │
                    └──────▲─────────────────────────────┘
                           │
                    ┌──────┴──────┐
                    │  Worker     │
                    │  (pg-boss)  │
                    └─────────────┘

The database runs on your existing Postgres server as a separate database (myorg_dataset_library). It does not touch DataShield's database or schemas.