Filesystem / network share
Crawl local directories or mounted shares (with strict allow-listing).
Safety first
Filesystem crawling is disabled unless you set allowedRoots in the filesystem configuration.
This default prevents you from accidentally crawling your whole server.
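To illustrate what an allow-list check of this kind can look like, here is a minimal Python sketch. It is not the crawler's actual implementation: the function name is_under_allowed_root is hypothetical, and the rule that an empty allow-list disables crawling is inferred from the description above.

```python
from pathlib import Path


def is_under_allowed_root(path, allowed_roots):
    """Return True if `path` resolves to a location inside one of `allowed_roots`.

    Hypothetical helper for illustration; the real crawler may differ.
    """
    if not allowed_roots:
        # Empty allow-list: treat filesystem crawling as disabled.
        return False
    # resolve() normalizes ".." segments and follows symlinks where they exist,
    # so "/mnt/data/../etc" is not mistaken for a child of "/mnt/data".
    resolved = Path(path).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False
```

Comparing resolved path components, rather than using a plain string prefix test, keeps a sibling such as /mnt/database from matching the root /mnt/data.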
Configure
In config/registry.config.local.json:
{
  "filesystem": {
    "allowedRoots": ["/mnt/data"],
    "maxFiles": 2000,
    "maxDepth": 6,
    "maxFileSizeBytes": 209715200,
    "allowedExtensions": [".csv", ".json", ".parquet", ".xlsx", ".tsv", ".geojson"]
  }
}
Add a source
Use a file:// source URL that points to a directory inside one of your allowedRoots:
file:///mnt/data
The crawler will walk the directory and ingest matching files.
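As a rough illustration of how a file:// source URL maps to a local directory, the following Python sketch converts the URL into a POSIX path. The function name file_url_to_path is hypothetical, and rejecting URLs with a remote host is an assumption made for this example, not documented crawler behavior.

```python
from urllib.parse import urlparse, unquote


def file_url_to_path(url):
    """Convert a file:// URL into a local POSIX path (illustrative helper)."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    # Assumption: only local paths are supported, so a host part other than
    # an empty string or "localhost" is rejected.
    if parsed.netloc not in ("", "localhost"):
        raise ValueError(f"remote host not supported: {parsed.netloc!r}")
    # Decode percent-escapes such as %20 in directory names.
    return unquote(parsed.path)
```

For the source URL above, file_url_to_path("file:///mnt/data") yields "/mnt/data", which can then be checked against allowedRoots before any walking starts.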
Notes
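- Extensions are filtered: only files whose extension is listed in allowedExtensions are ingested.
- Files larger than maxFileSizeBytes are rejected (209715200 bytes is 200 MiB in the example above).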
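The filters above can be sketched as a small directory walk in Python. This is only an approximation under stated assumptions: the function name walk_matching_files is hypothetical, the root directory is counted as depth 0, extensions are compared case-insensitively, and max_files caps the number of accepted files. The real crawler may interpret maxDepth and maxFiles differently.

```python
import os


def walk_matching_files(root, allowed_extensions, max_files, max_depth,
                        max_file_size_bytes):
    """Return paths under `root` that pass the extension and size filters."""
    allowed = {ext.lower() for ext in allowed_extensions}
    root = os.path.abspath(root)
    accepted = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Assumption: the root itself is depth 0, its children are depth 1.
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # stop os.walk from descending any further
        else:
            dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1].lower()
            if ext not in allowed:
                continue  # extension not in the allow-list
            full_path = os.path.join(dirpath, name)
            if os.path.getsize(full_path) > max_file_size_bytes:
                continue  # file exceeds the size limit
            accepted.append(full_path)
            if len(accepted) >= max_files:
                return accepted  # file cap reached
    return accepted
```

With the example configuration, a call might look like walk_matching_files("/mnt/data", [".csv", ".json"], 2000, 6, 209715200).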