Initial commit: gp-mcp server

MCP stdio server for Greenplum 6.x query plan evaluation: - explain_sql / explain_dbt_model tools - read-only session enforcement + statement_timeout - dbt compile integration - all settings via env vars (no hardcoded defaults)
2026-05-31 14:06:21 +03:00
commit 7c9487e0f9
10 changed files with 843 additions and 0 deletions
--- a/.env.example
+++ b/.env.example
@@ -0,0 +1,26 @@
 # Greenplum connection (required)
 GP_HOST=
 GP_PORT=
 GP_USER=
 GP_PASSWORD=
 GP_DATABASE=
 # Greenplum schema search_path (optional, comma-separated)
 GP_SCHEMA=
 # dbt project (required for explain_dbt_model)
 DBT_PROJECT_DIR=
 DBT_PROFILES_DIR=
 DBT_TARGET=
 # Path to dbt executable (optional, defaults to "dbt" on PATH)
 DBT_EXECUTABLE=
 # Statement timeout in milliseconds.
 # STATEMENT_TIMEOUT_MS = default applied to every EXPLAIN ANALYZE.
 # MAX_STATEMENT_TIMEOUT_MS = upper bound; per-call override cannot exceed this.
 STATEMENT_TIMEOUT_MS=
 MAX_STATEMENT_TIMEOUT_MS=
 # Logging: DEBUG, INFO, WARNING, ERROR
 LOG_LEVEL=
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,11 @@
 .env
 .venv/
 venv/
 __pycache__/
 *.pyc
 *.pyo
 *.egg-info/
 .pytest_cache/
 .mypy_cache/
 .ruff_cache/
 .DS_Store
--- a/README.md
+++ b/README.md
@@ -0,0 +1,301 @@
 # gp-mcp
 MCP-сервер для оценки плана запросов dbt-моделей в Greenplum 6.x.
 Запускается локально по `stdio` рядом с AI-агентом, который рефакторит легаси PL/SQL
 в dbt-модели. Сервер:
 1. компилирует выбранную dbt-модель (`dbt compile --select <model>`);
 2. подключается к Greenplum под read-only пользователем
   (`SET default_transaction_read_only = on`, `statement_timeout`);
 3. выполняет `EXPLAIN (ANALYZE, VERBOSE, FORMAT JSON)`;
 4. возвращает JSON-план + краткую сводку с GP-метриками (motion-узлы,
   самый медленный узел, ошибка оценки строк).
 ## Tools
 | Tool | Параметры | Что делает |
 |------|-----------|------------|
 | `explain_sql` | `sql: str`, `statement_timeout_ms?: int` | EXPLAIN ANALYZE для произвольного SQL |
 | `explain_dbt_model` | `model_name: str`, `statement_timeout_ms?: int` | `dbt compile` + EXPLAIN ANALYZE для модели |
 Возвращаемый JSON:
 ```json
 {
  "summary": {
    "total_cost": 12345.6,
    "plan_rows": 100000,
    "actual_rows": 98412,
    "execution_time_ms": 842.3,
    "planning_time_ms": 12.1,
    "slowest_node": { "node_type": "Seq Scan", "actual_total_time_ms": 700.2, "...": "..." },
    "motion_nodes": [{ "node_type": "Redistribute Motion", "...": "..." }],
    "rows_misestimation_factor": 1.02
  },
  "plan": [ /* raw EXPLAIN JSON */ ],
  "statement_timeout_ms": 300000,
  "compiled_sql": "select ...",
  "model_name": "fct_orders"
 }
 ```
 ## Установка
 ```bash
 cd /Users/admin/Projects/vpn
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -r requirements.txt
 ```
 ## Конфигурация
 Все настройки — через переменные окружения. Скопируй `.env.example` в `.env`
 и заполни.
 | Переменная | Обязательная | Назначение |
 |------------|:-:|---|
 | `GP_HOST` | + | Хост Greenplum master |
 | `GP_PORT` | + | Порт |
 | `GP_USER` | + | Read-only пользователь (см. ниже) |
 | `GP_PASSWORD` | + | Пароль |
 | `GP_DATABASE` | + | Имя БД |
 | `GP_SCHEMA` |   | `search_path`, можно через запятую |
 | `DBT_PROJECT_DIR` | + | Каталог dbt-проекта (содержит `dbt_project.yml`) |
 | `DBT_PROFILES_DIR` | + | Каталог с `profiles.yml` |
 | `DBT_TARGET` | + | Имя target из `profiles.yml` (напр. `dev`) |
 | `DBT_EXECUTABLE` |   | Путь к `dbt`, по умолчанию `dbt` из PATH |
 | `STATEMENT_TIMEOUT_MS` | + | Дефолтный `statement_timeout` для EXPLAIN ANALYZE |
 | `MAX_STATEMENT_TIMEOUT_MS` | + | Верхняя граница, агент не сможет превысить |
 | `LOG_LEVEL` |   | `DEBUG`/`INFO`/`WARNING`/`ERROR`, дефолт `INFO` |
 Если обязательная переменная не задана — сервер не стартует и пишет в stderr
 имя недостающей переменной.
 ## Read-only роль в Greenplum
 Сервер требует, чтобы доступ был ограничен на уровне БД. Минимум:
 ```sql
 CREATE ROLE dbt_explain LOGIN PASSWORD '...';
 GRANT CONNECT ON DATABASE <db> TO dbt_explain;
 GRANT USAGE ON SCHEMA <schema> TO dbt_explain;
 GRANT SELECT ON ALL TABLES IN SCHEMA <schema> TO dbt_explain;
 ALTER DEFAULT PRIVILEGES IN SCHEMA <schema>
  GRANT SELECT ON TABLES TO dbt_explain;
 ```
 Сервер дополнительно ставит сессионный `default_transaction_read_only = on`,
 но GRANT-ы — единственная надёжная защита.
 ## Запуск
 Локально (для отладки):
 ```bash
 python -m gp_mcp.server
 ```
 Сервер ничего не печатает в stdout (это канал MCP) — все логи идут в stderr.
 ## Подключение к клиенту
 Сервер общается по `stdio`, поэтому клиент должен сам его запускать.
 Конфиг — стандартный MCP JSON: одинаковая форма для Claude Code и Cursor,
 различаются только пути к файлам настроек.
 Общий блок, который пригодится ниже:
 ```json
 {
  "command": "/Users/admin/Projects/vpn/.venv/bin/python",
  "args": ["-m", "gp_mcp.server"],
  "cwd": "/Users/admin/Projects/vpn/src",
  "env": {
    "GP_HOST": "gp-master.internal",
    "GP_PORT": "5432",
    "GP_USER": "dbt_explain",
    "GP_PASSWORD": "REPLACE_ME",
    "GP_DATABASE": "analytics",
    "GP_SCHEMA": "analytics,public",
    "DBT_PROJECT_DIR": "/Users/admin/Projects/dbt-analytics",
    "DBT_PROFILES_DIR": "/Users/admin/.dbt",
    "DBT_TARGET": "dev",
    "STATEMENT_TIMEOUT_MS": "300000",
    "MAX_STATEMENT_TIMEOUT_MS": "900000",
    "LOG_LEVEL": "INFO"
  }
 }
 ```
 Важно:
 - `command` — **абсолютный** путь к Python из venv проекта. Клиенты MCP
  обычно стартуют без активированного окружения, поэтому полагаться на
  `python` из PATH нельзя.
 - `cwd` указан на `src/`, чтобы Python нашёл пакет `gp_mcp` без установки
  (`pip install -e .` не делаем).
 - Секреты держим в `env` соответствующего конфига клиента, **не** в коде
  и **не** в репозитории.
 ---
 ### Claude Code
 Есть три способа добавить сервер — выбери один.
 **1. Через CLI (быстрее всего)**
 ```bash
 claude mcp add gp-mcp \
  --scope user \
  --env GP_HOST=gp-master.internal \
  --env GP_PORT=5432 \
  --env GP_USER=dbt_explain \
  --env GP_PASSWORD=REPLACE_ME \
  --env GP_DATABASE=analytics \
  --env DBT_PROJECT_DIR=/Users/admin/Projects/dbt-analytics \
  --env DBT_PROFILES_DIR=/Users/admin/.dbt \
  --env DBT_TARGET=dev \
  --env STATEMENT_TIMEOUT_MS=300000 \
  --env MAX_STATEMENT_TIMEOUT_MS=900000 \
  -- /Users/admin/Projects/vpn/.venv/bin/python -m gp_mcp.server
 ```
 Флаг `--scope`:
 - `user` — для всех проектов (пишется в `~/.claude.json`);
 - `project` — общий для команды, кладётся в `.mcp.json` в корне проекта,
  его можно коммитить в git (секреты тогда задают через `${VAR}`-подстановку
  из окружения, а не хардкодом);
 - `local` — только в текущем проекте, только у тебя.
 **2. Вручную, user-scope: `~/.claude.json`**
 ```json
 {
  "mcpServers": {
    "gp-mcp": { /* см. общий блок выше */ }
  }
 }
 ```
 **3. Вручную, project-scope: `.mcp.json` в корне dbt-репозитория**
 ```json
 {
  "mcpServers": {
    "gp-mcp": {
      "command": "/Users/admin/Projects/vpn/.venv/bin/python",
      "args": ["-m", "gp_mcp.server"],
      "cwd": "/Users/admin/Projects/vpn/src",
      "env": {
        "GP_HOST": "${GP_HOST}",
        "GP_PORT": "${GP_PORT}",
        "GP_USER": "${GP_USER}",
        "GP_PASSWORD": "${GP_PASSWORD}",
        "GP_DATABASE": "${GP_DATABASE}",
        "DBT_PROJECT_DIR": "${DBT_PROJECT_DIR}",
        "DBT_PROFILES_DIR": "${DBT_PROFILES_DIR}",
        "DBT_TARGET": "${DBT_TARGET}",
        "STATEMENT_TIMEOUT_MS": "300000",
        "MAX_STATEMENT_TIMEOUT_MS": "900000"
      }
    }
  }
 }
 ```
 **Проверка:**
 ```bash
 claude mcp list             # gp-mcp должен быть в списке
 claude mcp get gp-mcp       # детали конфига
 ```
 В сессии `/mcp` покажет статус подключения и список tool'ов. Если статус
 `failed`, посмотри `~/Library/Logs/Claude/` — сервер пишет ошибки запуска
 (включая отсутствующие env-переменные) в stderr.
 ---
 ### Cursor IDE
 Cursor использует тот же MCP-формат, но свой файл настроек.
 **1. Через UI**
 `Settings` → `Cursor Settings` → `MCP & Integrations` → `New MCP Server` →
 откроется `mcp.json` для редактирования.
 **2. Вручную, глобально: `~/.cursor/mcp.json`**
 Доступно во всех проектах.
 ```json
 {
  "mcpServers": {
    "gp-mcp": {
      "command": "/Users/admin/Projects/vpn/.venv/bin/python",
      "args": ["-m", "gp_mcp.server"],
      "cwd": "/Users/admin/Projects/vpn/src",
      "env": {
        "GP_HOST": "gp-master.internal",
        "GP_PORT": "5432",
        "GP_USER": "dbt_explain",
        "GP_PASSWORD": "REPLACE_ME",
        "GP_DATABASE": "analytics",
        "DBT_PROJECT_DIR": "/Users/admin/Projects/dbt-analytics",
        "DBT_PROFILES_DIR": "/Users/admin/.dbt",
        "DBT_TARGET": "dev",
        "STATEMENT_TIMEOUT_MS": "300000",
        "MAX_STATEMENT_TIMEOUT_MS": "900000"
      }
    }
  }
 }
 ```
 **3. Вручную, для проекта: `.cursor/mcp.json` в корне dbt-репозитория**
 Видно только в этом проекте. Удобно, когда у разных dbt-проектов разные
 `DBT_PROJECT_DIR`/`DBT_TARGET`.
 **Проверка:**
 `Settings` → `MCP & Integrations` — справа от `gp-mcp` должен загореться
 зелёный индикатор и появиться список tool'ов (`explain_sql`,
 `explain_dbt_model`). В чате tools будут доступны Agent-режиму.
 Если индикатор красный — раскрой сервер в этом же окне, там показывается
 stderr запуска (включая `Configuration error: Required environment variable
 '...' is not set`).
 ---
 ### Общие проблемы при подключении
 | Симптом | Причина |
 |---------|---------|
 | `Configuration error: Required environment variable 'X' is not set` | Переменная `X` не задана в `env` конфига клиента |
 | `ModuleNotFoundError: No module named 'gp_mcp'` | Неверный `cwd` — должен указывать на `src/`, или Python не из venv |
 | `ModuleNotFoundError: No module named 'mcp'` | `command` указывает не на Python из venv, где установлены зависимости |
 | Сервер стартует, но tools не появляются | Клиент не перезапущен / нет permissions в Cursor для MCP |
 | `dbt: command not found` при вызове `explain_dbt_model` | Поставь `DBT_EXECUTABLE=/абсолютный/путь/к/dbt` в `env` |
 ## Структура
 ```
 vpn/
 ├── .env.example
 ├── .gitignore
 ├── requirements.txt
 ├── README.md
 └── src/
    └── gp_mcp/
        ├── __init__.py
        ├── config.py       # загрузка и валидация env
        ├── db.py           # psycopg2 + read-only + timeout
        ├── dbt_runner.py   # subprocess dbt compile + чтение compiled SQL
        ├── explain.py      # EXPLAIN ANALYZE + summary
        └── server.py       # FastMCP, регистрация tools, stdio
 ```
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,4 @@
 mcp>=1.2.0
 psycopg2-binary>=2.9.9
 python-dotenv>=1.0.1
 PyYAML>=6.0.1
--- a/src/gp_mcp/init.py
+++ b/src/gp_mcp/init.py
--- a/src/gp_mcp/config.py
+++ b/src/gp_mcp/config.py
@@ -0,0 +1,125 @@
 """Configuration loaded entirely from environment variables.
 No hard-coded defaults for connection or paths — required variables must be
 set explicitly. A missing required variable raises ConfigError at startup with
 the offending variable name.
 """
 from __future__ import annotations
 import os
 from dataclasses import dataclass
 from pathlib import Path
 from dotenv import load_dotenv
 class ConfigError(RuntimeError):
    pass
 def _require(name: str) -> str:
    value = os.environ.get(name)
    if value is None or value.strip() == "":
        raise ConfigError(f"Required environment variable {name!r} is not set")
    return value.strip()
 def _optional(name: str) -> str | None:
    value = os.environ.get(name)
    if value is None or value.strip() == "":
        return None
    return value.strip()
 def _require_int(name: str) -> int:
    raw = _require(name)
    try:
        return int(raw)
    except ValueError as exc:
        raise ConfigError(f"Environment variable {name!r} must be an integer, got {raw!r}") from exc
 def _require_positive_int(name: str) -> int:
    value = _require_int(name)
    if value <= 0:
        raise ConfigError(f"Environment variable {name!r} must be > 0, got {value}")
    return value
 def _require_dir(name: str) -> Path:
    raw = _require(name)
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise ConfigError(f"Environment variable {name!r} points to {raw!r}, which is not a directory")
    return path
@dataclass(frozen=True)
 class GreenplumConfig:
    host: str
    port: int
    user: str
    password: str
    database: str
    schema: str | None
@dataclass(frozen=True)
 class DbtConfig:
    project_dir: Path
    profiles_dir: Path
    target: str
    executable: str
@dataclass(frozen=True)
 class LimitsConfig:
    statement_timeout_ms: int
    max_statement_timeout_ms: int
@dataclass(frozen=True)
 class AppConfig:
    gp: GreenplumConfig
    dbt: DbtConfig
    limits: LimitsConfig
    log_level: str
 def load_config() -> AppConfig:
    """Load and validate the entire configuration from environment."""
    load_dotenv(override=False)
    gp = GreenplumConfig(
        host=_require("GP_HOST"),
        port=_require_positive_int("GP_PORT"),
        user=_require("GP_USER"),
        password=_require("GP_PASSWORD"),
        database=_require("GP_DATABASE"),
        schema=_optional("GP_SCHEMA"),
    )
    dbt = DbtConfig(
        project_dir=_require_dir("DBT_PROJECT_DIR"),
        profiles_dir=_require_dir("DBT_PROFILES_DIR"),
        target=_require("DBT_TARGET"),
        executable=_optional("DBT_EXECUTABLE") or "dbt",
    )
    limits = LimitsConfig(
        statement_timeout_ms=_require_positive_int("STATEMENT_TIMEOUT_MS"),
        max_statement_timeout_ms=_require_positive_int("MAX_STATEMENT_TIMEOUT_MS"),
    )
    if limits.statement_timeout_ms > limits.max_statement_timeout_ms:
        raise ConfigError(
            "STATEMENT_TIMEOUT_MS must be <= MAX_STATEMENT_TIMEOUT_MS "
            f"(got {limits.statement_timeout_ms} > {limits.max_statement_timeout_ms})"
        )
    log_level = (_optional("LOG_LEVEL") or "INFO").upper()
    if log_level not in {"DEBUG", "INFO", "WARNING", "ERROR"}:
        raise ConfigError(f"LOG_LEVEL must be one of DEBUG/INFO/WARNING/ERROR, got {log_level!r}")
    return AppConfig(gp=gp, dbt=dbt, limits=limits, log_level=log_level)
--- a/src/gp_mcp/db.py
+++ b/src/gp_mcp/db.py
@@ -0,0 +1,51 @@
 """Greenplum connections with enforced read-only mode and statement_timeout."""
 from __future__ import annotations
 from contextlib import contextmanager
 from typing import Iterator
 import psycopg2
 from psycopg2 import sql
 from psycopg2.extensions import connection as PgConnection
 from .config import GreenplumConfig
 def connect(gp: GreenplumConfig, statement_timeout_ms: int) -> PgConnection:
    """Open a new connection with read-only and timeout enforced.
    Why session-level (not just transaction-level) read-only: a misbehaving query
    that opens its own transaction inside the session still cannot write.
    """
    conn = psycopg2.connect(
        host=gp.host,
        port=gp.port,
        user=gp.user,
        password=gp.password,
        dbname=gp.database,
        application_name="gp-mcp",
    )
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute("SET SESSION CHARACTERISTICS AS TRANSACTION READ ONLY")
        cur.execute("SET default_transaction_read_only = on")
        cur.execute("SET statement_timeout = %s", (statement_timeout_ms,))
        if gp.schema:
            schemas = [s.strip() for s in gp.schema.split(",") if s.strip()]
            if schemas:
                stmt = sql.SQL("SET search_path TO {}").format(
                    sql.SQL(", ").join(sql.Identifier(s) for s in schemas)
                )
                cur.execute(stmt)
    return conn
@contextmanager
 def open_connection(gp: GreenplumConfig, statement_timeout_ms: int) -> Iterator[PgConnection]:
    conn = connect(gp, statement_timeout_ms)
    try:
        yield conn
    finally:
        conn.close()
--- a/src/gp_mcp/dbt_runner.py
+++ b/src/gp_mcp/dbt_runner.py
@@ -0,0 +1,96 @@
 """Run `dbt compile` and read the compiled SQL for a selected model."""
 from __future__ import annotations
 import re
 import subprocess
 from pathlib import Path
 import yaml
 from .config import DbtConfig
 class DbtCompileError(RuntimeError):
    pass
 _MODEL_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
 def _validate_model_name(model_name: str) -> str:
    """Reject anything that isn't a bare dbt identifier.
    Why: model_name is appended to a `dbt --select` argument and used to locate
    a file on disk. Restricting to identifier characters keeps both subprocess
    and filesystem lookup safe.
    """
    if not _MODEL_NAME_RE.match(model_name):
        raise DbtCompileError(
            f"Invalid dbt model name {model_name!r}: must match {_MODEL_NAME_RE.pattern}"
        )
    return model_name
 def _project_name(project_dir: Path) -> str:
    project_file = project_dir / "dbt_project.yml"
    if not project_file.is_file():
        raise DbtCompileError(f"dbt_project.yml not found in {project_dir}")
    with project_file.open("r", encoding="utf-8") as f:
        data = yaml.safe_load(f) or {}
    name = data.get("name")
    if not isinstance(name, str) or not name:
        raise DbtCompileError(f"`name` not found in {project_file}")
    return name
 def _find_compiled_sql(project_dir: Path, project_name: str, model_name: str) -> Path:
    compiled_root = project_dir / "target" / "compiled" / project_name
    if not compiled_root.is_dir():
        raise DbtCompileError(f"Compiled output dir does not exist: {compiled_root}")
    matches = list(compiled_root.rglob(f"{model_name}.sql"))
    if not matches:
        raise DbtCompileError(
            f"Compiled SQL for model {model_name!r} not found under {compiled_root}"
        )
    if len(matches) > 1:
        # Ambiguous (model name reused across paths). Surface the candidates.
        rels = ", ".join(str(p.relative_to(project_dir)) for p in matches)
        raise DbtCompileError(
            f"Multiple compiled files match model {model_name!r}: {rels}"
        )
    return matches[0]
 def compile_model(cfg: DbtConfig, model_name: str) -> str:
    """Compile a single dbt model and return the resulting SQL."""
    model_name = _validate_model_name(model_name)
    project_name = _project_name(cfg.project_dir)
    cmd = [
        cfg.executable,
        "compile",
        "--select", model_name,
        "--project-dir", str(cfg.project_dir),
        "--profiles-dir", str(cfg.profiles_dir),
        "--target", cfg.target,
    ]
    result = subprocess.run(
        cmd,
        cwd=cfg.project_dir,
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        raise DbtCompileError(
            f"dbt compile failed (exit {result.returncode}):\n"
            f"stdout:\n{result.stdout}\n"
            f"stderr:\n{result.stderr}"
        )
    compiled_path = _find_compiled_sql(cfg.project_dir, project_name, model_name)
    return compiled_path.read_text(encoding="utf-8")
--- a/src/gp_mcp/explain.py
+++ b/src/gp_mcp/explain.py
@@ -0,0 +1,120 @@
 """Run EXPLAIN (ANALYZE, VERBOSE, FORMAT JSON) and summarise the GP plan."""
 from __future__ import annotations
 from dataclasses import dataclass, field
 from typing import Any
 from psycopg2.extensions import connection as PgConnection
 class ExplainError(RuntimeError):
    pass
@dataclass
 class PlanSummary:
    total_cost: float | None
    plan_rows: float | None
    actual_rows: float | None
    execution_time_ms: float | None
    planning_time_ms: float | None
    slowest_node: dict[str, Any] | None
    motion_nodes: list[dict[str, Any]] = field(default_factory=list)
    rows_misestimation_factor: float | None = None
    def as_dict(self) -> dict[str, Any]:
        return {
            "total_cost": self.total_cost,
            "plan_rows": self.plan_rows,
            "actual_rows": self.actual_rows,
            "execution_time_ms": self.execution_time_ms,
            "planning_time_ms": self.planning_time_ms,
            "slowest_node": self.slowest_node,
            "motion_nodes": self.motion_nodes,
            "rows_misestimation_factor": self.rows_misestimation_factor,
        }
 def explain_analyze_json(conn: PgConnection, sql_text: str) -> list[dict[str, Any]]:
    """Run EXPLAIN (ANALYZE, VERBOSE, FORMAT JSON) and return the raw plan list."""
    if not sql_text or not sql_text.strip():
        raise ExplainError("SQL is empty")
    # ANALYZE actually executes the statement. The session is set
    # default_transaction_read_only=on (see db.connect), so writes are rejected
    # by the server. statement_timeout caps runaway plans.
    wrapped = f"EXPLAIN (ANALYZE, VERBOSE, FORMAT JSON) {sql_text}"
    with conn.cursor() as cur:
        cur.execute(wrapped)
        row = cur.fetchone()
    if not row:
        raise ExplainError("EXPLAIN returned no rows")
    payload = row[0]
    if not isinstance(payload, list):
        raise ExplainError(f"EXPLAIN returned unexpected payload type: {type(payload).__name__}")
    return payload
 def _walk(node: dict[str, Any]):
    yield node
    for child in node.get("Plans", []) or []:
        yield from _walk(child)
 def summarise(plan_payload: list[dict[str, Any]]) -> PlanSummary:
    """Extract GP-relevant metrics from the JSON plan."""
    if not plan_payload:
        raise ExplainError("Plan payload is empty")
    root = plan_payload[0]
    plan = root.get("Plan", {})
    total_cost = plan.get("Total Cost")
    plan_rows = plan.get("Plan Rows")
    actual_rows = plan.get("Actual Rows")
    execution_time = root.get("Execution Time")
    planning_time = root.get("Planning Time")
    slowest: dict[str, Any] | None = None
    motions: list[dict[str, Any]] = []
    for node in _walk(plan):
        node_type = node.get("Node Type", "")
        if "Motion" in node_type or node_type.startswith("Gather"):
            motions.append({
                "node_type": node_type,
                "slice": node.get("Slice"),
                "senders": node.get("Senders"),
                "receivers": node.get("Receivers"),
                "actual_rows": node.get("Actual Rows"),
                "actual_total_time_ms": node.get("Actual Total Time"),
            })
        actual_total = node.get("Actual Total Time")
        if actual_total is not None:
            if slowest is None or actual_total > slowest.get("actual_total_time_ms", -1):
                slowest = {
                    "node_type": node_type,
                    "actual_total_time_ms": actual_total,
                    "actual_rows": node.get("Actual Rows"),
                    "plan_rows": node.get("Plan Rows"),
                    "relation": node.get("Relation Name"),
                    "alias": node.get("Alias"),
                }
    misestimation: float | None = None
    if plan_rows and actual_rows and plan_rows > 0 and actual_rows > 0:
        misestimation = max(plan_rows / actual_rows, actual_rows / plan_rows)
    return PlanSummary(
        total_cost=total_cost,
        plan_rows=plan_rows,
        actual_rows=actual_rows,
        execution_time_ms=execution_time,
        planning_time_ms=planning_time,
        slowest_node=slowest,
        motion_nodes=motions,
        rows_misestimation_factor=misestimation,
    )
--- a/src/gp_mcp/server.py
+++ b/src/gp_mcp/server.py
@@ -0,0 +1,109 @@
 """MCP stdio server exposing Greenplum EXPLAIN tools for dbt model review."""
 from __future__ import annotations
 import json
 import logging
 import sys
 from typing import Any
 from mcp.server.fastmcp import FastMCP
 from .config import AppConfig, ConfigError, load_config
 from .db import open_connection
 from .dbt_runner import DbtCompileError, compile_model
 from .explain import ExplainError, explain_analyze_json, summarise
 logger = logging.getLogger("gp_mcp")
 def _resolve_timeout(cfg: AppConfig, override_ms: int | None) -> int:
    if override_ms is None:
        return cfg.limits.statement_timeout_ms
    if override_ms <= 0:
        raise ValueError("statement_timeout_ms must be > 0")
    if override_ms > cfg.limits.max_statement_timeout_ms:
        raise ValueError(
            f"statement_timeout_ms {override_ms} exceeds MAX_STATEMENT_TIMEOUT_MS "
            f"{cfg.limits.max_statement_timeout_ms}"
        )
    return override_ms
 def _explain_payload(cfg: AppConfig, sql_text: str, timeout_ms: int) -> dict[str, Any]:
    with open_connection(cfg.gp, timeout_ms) as conn:
        plan = explain_analyze_json(conn, sql_text)
    summary = summarise(plan)
    return {
        "summary": summary.as_dict(),
        "plan": plan,
        "statement_timeout_ms": timeout_ms,
    }
 def build_server(cfg: AppConfig) -> FastMCP:
    mcp = FastMCP("gp-mcp")
    @mcp.tool()
    def explain_sql(sql: str, statement_timeout_ms: int | None = None) -> str:
        """Run EXPLAIN (ANALYZE, VERBOSE, FORMAT JSON) on the given SQL.
        Returns a JSON string with:
          - summary: GP-relevant metrics (total_cost, execution_time_ms,
            motion_nodes, slowest_node, rows_misestimation_factor)
          - plan: raw EXPLAIN JSON
          - statement_timeout_ms: actual timeout applied
        The session is read-only; writes are rejected by the server.
        """
        timeout_ms = _resolve_timeout(cfg, statement_timeout_ms)
        payload = _explain_payload(cfg, sql, timeout_ms)
        return json.dumps(payload, ensure_ascii=False, default=str)
    @mcp.tool()
    def explain_dbt_model(model_name: str, statement_timeout_ms: int | None = None) -> str:
        """Compile a dbt model and run EXPLAIN (ANALYZE, FORMAT JSON) on it.
        Steps:
          1. `dbt compile --select <model_name>` in the configured project
          2. Read target/compiled/<project>/.../<model>.sql
          3. EXPLAIN ANALYZE against Greenplum (read-only session)
        Returns the same JSON shape as explain_sql, plus `compiled_sql`.
        """
        timeout_ms = _resolve_timeout(cfg, statement_timeout_ms)
        compiled_sql = compile_model(cfg.dbt, model_name)
        payload = _explain_payload(cfg, compiled_sql, timeout_ms)
        payload["compiled_sql"] = compiled_sql
        payload["model_name"] = model_name
        return json.dumps(payload, ensure_ascii=False, default=str)
    return mcp
 def main() -> int:
    try:
        cfg = load_config()
    except ConfigError as exc:
        # stderr — stdout is the MCP transport channel.
        print(f"Configuration error: {exc}", file=sys.stderr)
        return 2
    logging.basicConfig(
        level=cfg.log_level,
        stream=sys.stderr,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger.info(
        "gp-mcp starting (host=%s db=%s schema=%s timeout_ms=%d max_timeout_ms=%d)",
        cfg.gp.host, cfg.gp.database, cfg.gp.schema,
        cfg.limits.statement_timeout_ms, cfg.limits.max_statement_timeout_ms,
    )
    mcp = build_server(cfg)
    mcp.run(transport="stdio")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())