MCP Server Setup
labretriever ships an MCP server that exposes a VirtualDB instance as a set of
tools callable by Claude Code (or any other MCP client). Once configured, Claude
can discover datasets, inspect their schemas, and execute DuckDB SQL queries
against your collection without you writing any Python.
Quick Install (Claude Code Plugin)
First, install labretriever so that labretriever-mcp
is available on your PATH.
Then add the marketplace and install the plugin:
The plugin will prompt you for a VirtualDB config file path and an optional
HuggingFace token at enable time. If labretriever-mcp is not found on PATH
when a session starts, Claude will display installation instructions.
For the BrentLab yeast resources collection, download the ready-to-use config from:
https://github.com/BrentLab/tfbpshiny/blob/main/tfbpshiny/brentlab_yeast_collection.yaml
Save it to a stable path and provide that path when the plugin prompts you.
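Before enabling the plugin, it can help to confirm the two prerequisites described above: that labretriever-mcp is on your PATH and that the config file exists at the path you intend to provide. The following is a minimal sketch using only the Python standard library. The `preflight` helper and the example path are hypothetical illustrations, not part of labretriever.

```python
import os
import shutil


def preflight(config_path, executable="labretriever-mcp"):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration; labretriever does not ship it.
    """
    problems = []
    # shutil.which returns None when the executable cannot be found on PATH.
    if shutil.which(executable) is None:
        problems.append(f"{executable} not found on PATH")
    # A relative path depends on the working directory, so require an absolute one.
    if not os.path.isabs(config_path):
        problems.append(f"config path is not absolute: {config_path}")
    elif not os.path.isfile(config_path):
        problems.append(f"config file does not exist: {config_path}")
    return problems


if __name__ == "__main__":
    # Hypothetical location; substitute wherever you saved the YAML file.
    path = os.path.expanduser("~/labretriever/brentlab_yeast_collection.yaml")
    for problem in preflight(path):
        print("WARNING:", problem)
```

An empty result means both checks passed. It does not validate the contents of the YAML file.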
Manual Configuration (without the plugin)
Install the package first — see Installation.
LABRETRIEVER_CONFIG must point to a VirtualDB YAML file that you create or
download — it tells the server which HuggingFace datasets to expose and how to map
their fields. See the VirtualDB Configuration docs
for the full format.
Add the following to .claude/settings.json (or to ~/.claude/settings.json for a
user-level configuration):
{
  "mcpServers": {
    "labretriever": {
      "command": "labretriever-mcp",
      "type": "stdio",
      "env": {
        "LABRETRIEVER_CONFIG": "/absolute/path/to/brentlab_yeast_collection.yaml",
        "HF_TOKEN": "${HF_TOKEN}"
      }
    }
  }
}
HF_TOKEN is only required for private HuggingFace repositories. If it is not set
and a query touches a private or gated repository, the server will return a clear
error naming the repository.
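If your settings file already contains other entries, editing it by hand risks clobbering them. The sketch below shows one way to merge the labretriever entry into an existing file with the Python standard library. The helper names `add_labretriever_server` and `update_settings_file` are hypothetical, and the `include_token` switch is an assumption added for collections that only use public repositories.

```python
import json
from pathlib import Path


def add_labretriever_server(settings, config_path, include_token=True):
    """Return a copy of `settings` with a labretriever entry under mcpServers.

    Hypothetical helper; it mirrors the JSON shown above and leaves all
    other keys and servers untouched.
    """
    env = {"LABRETRIEVER_CONFIG": str(config_path)}
    if include_token:
        # Kept as the literal ${HF_TOKEN} reference from the JSON above,
        # so no secret value is written to disk.
        env["HF_TOKEN"] = "${HF_TOKEN}"
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers["labretriever"] = {
        "command": "labretriever-mcp",
        "type": "stdio",
        "env": env,
    }
    merged["mcpServers"] = servers
    return merged


def update_settings_file(settings_path, config_path, include_token=True):
    """Read the settings file if it exists, merge the entry, and write it back."""
    path = Path(settings_path)
    settings = json.loads(path.read_text()) if path.exists() else {}
    merged = add_labretriever_server(settings, config_path, include_token)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2) + "\n")
    return merged
```

For example, `update_settings_file(".claude/settings.json", "/absolute/path/to/brentlab_yeast_collection.yaml")` would add or overwrite only the labretriever entry.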
Available Tools
Once the server is running, Claude has access to these tools:
| Tool | Description |
|---|---|
| list_datasets | List all registered dataset names (call this first). |
| describe_dataset | Return column names and types for a {name} or {name}_meta view. |
| get_column_metadata | Return semantic roles and condition-level definitions for each column. |
| get_tags | Return provenance tags (assay type, publication, etc.) for a dataset. |
| get_common_fields | Return column names shared across all _meta views. |
| query | Execute DuckDB SQL; returns shape by default, rows when return_data=True. |
Example Session
After connecting, a typical workflow in Claude Code looks like:
1. list_datasets - discover available views (harbison, callingcards, etc.)
2. describe_dataset("harbison_meta") - inspect sample-level columns
3. get_column_metadata("harbison") - understand condition values and measurement roles
4. query("SELECT * FROM harbison_meta WHERE condition = 'GAL'", return_data=True) - explore
5. query("SELECT regulator_symbol, COUNT(*) FROM harbison WHERE condition = 'GAL' AND pvalue < 0.001 GROUP BY 1 ORDER BY 2 DESC") - full analysis
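To make the final aggregation query and the return_data switch concrete, here is a small local sketch. It is not the server's implementation. It uses sqlite3 from the Python standard library as a stand-in for DuckDB so that it runs without extra installs, the rows in the toy harbison table are made up for illustration, and it assumes that "shape" means a (row count, column count) pair.

```python
import sqlite3


def query(conn, sql, return_data=False):
    """Mimic the described tool behaviour: shape by default, rows on request.

    Assumption: "shape" is taken to mean (number of rows, number of columns).
    """
    cursor = conn.execute(sql)
    rows = cursor.fetchall()
    if return_data:
        return rows
    # cursor.description holds one entry per result column.
    return (len(rows), len(cursor.description))


# Toy stand-in for the harbison view; the values are invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE harbison (regulator_symbol TEXT, condition TEXT, pvalue REAL)"
)
conn.executemany(
    "INSERT INTO harbison VALUES (?, ?, ?)",
    [
        ("GAL4", "GAL", 0.0001),
        ("GAL4", "GAL", 0.0005),
        ("GAL4", "YPD", 0.0002),  # excluded: different condition
        ("MIG1", "GAL", 0.0003),
        ("MIG1", "GAL", 0.2),     # excluded: pvalue above the threshold
    ],
)

SQL = (
    "SELECT regulator_symbol, COUNT(*) FROM harbison "
    "WHERE condition = 'GAL' AND pvalue < 0.001 GROUP BY 1 ORDER BY 2 DESC"
)

shape = query(conn, SQL)                   # (2, 2): two regulators, two columns
rows = query(conn, SQL, return_data=True)  # [("GAL4", 2), ("MIG1", 1)]
```

Checking the shape first is a cheap way to see how large a result is before asking for the rows themselves.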
See the VirtualDB tutorial for more query patterns.