Model Context Protocol (MCP): The Standard for AI-Application Integration

Model Context Protocol (MCP) is emerging as a standard for connecting AI models to external data and tools. Think of it as USB for AI - a universal way to plug capabilities into language models.

What is MCP?

MCP defines how AI applications can:

  • Access external data sources
  • Execute tools and functions
  • Maintain context across sessions
  • Handle authentication and permissions

┌─────────────────┐     MCP      ┌─────────────────┐
│   AI Model      │◄────────────►│  MCP Server     │
│  (Claude, GPT)  │              │  (Your Tools)   │
└─────────────────┘              └────────┬────────┘
                                          │
                     ┌────────────────────┼────────────────────┐
                     ▼                    ▼                    ▼
                ┌──────────┐         ┌──────────┐         ┌──────────┐
                │ Database │         │   APIs   │         │  Files   │
                └──────────┘         └──────────┘         └──────────┘

Building an MCP Server

Let’s create an MCP server that exposes data analytics capabilities:

from mcp.server.fastmcp import FastMCP

# Create an MCP server using the FastMCP API from the official Python SDK.
# The helpers called below (execute_sql, fetch_schema, run_dq_checks, etc.)
# are placeholders for your own data-access layer.
server = FastMCP("data-analytics-mcp")

# Define tools the AI can use
@server.tool()
async def query_warehouse(query: str, warehouse: str = "main") -> str:
    """Execute a SQL query against the data warehouse.

    Args:
        query: SQL query to execute
        warehouse: Target warehouse (main, staging, analytics)

    Returns:
        Query results as formatted text
    """
    # Execute query against your warehouse
    results = await execute_sql(query, warehouse)
    return format_results(results)

@server.tool()
async def get_table_schema(table_name: str) -> str:
    """Get the schema for a database table.

    Args:
        table_name: Name of the table

    Returns:
        Table schema with column names and types
    """
    schema = await fetch_schema(table_name)
    return schema.to_markdown()

@server.tool()
async def run_data_quality_check(table: str, rules: list[str]) -> str:
    """Run data quality checks on a table.

    Args:
        table: Table to check
        rules: List of DQ rules to apply

    Returns:
        Data quality report
    """
    results = await run_dq_checks(table, rules)
    return results.to_report()

# Define resources (data the AI can read)
@server.resource("schema://tables")
async def list_tables() -> str:
    """List all available tables in the warehouse."""
    tables = await get_all_tables()
    return "\n".join(tables)

@server.resource("schema://tables/{table_name}")
async def get_table_info(table_name: str) -> str:
    """Get detailed information about a specific table."""
    info = await fetch_table_info(table_name)
    return info.to_markdown()

# Run the server
if __name__ == "__main__":
    server.run()
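
Before wiring this into a client, you can exercise the tools interactively with the Python SDK's development inspector (mcp dev mcp_server.py).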

Connecting MCP to AI Applications

Claude Desktop Integration

// claude_desktop_config.json
{
  "mcpServers": {
    "data-analytics": {
      "command": "python",
      "args": ["/path/to/mcp_server.py"],
      "env": {
        "WAREHOUSE_CONNECTION": "your-connection-string"
      }
    }
  }
}
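
On macOS this file lives at ~/Library/Application Support/Claude/claude_desktop_config.json (on Windows, %APPDATA%\Claude\). Restart Claude Desktop after editing it and the server's tools become available in the chat.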

Programmatic Integration

import asyncio

from anthropic import Anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Spawn the MCP server as a subprocess and connect over stdio
    server_params = StdioServerParameters(
        command="python", args=["/path/to/mcp_server.py"]
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover the server's tools and map them to Anthropic's format
            tools = await session.list_tools()
            anthropic_tools = [
                {
                    "name": t.name,
                    "description": t.description,
                    "input_schema": t.inputSchema,
                }
                for t in tools.tools
            ]

            # Use with Claude
            anthropic = Anthropic()
            response = anthropic.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=1024,
                messages=[{
                    "role": "user",
                    "content": "What tables are available and what's the schema of the sales table?",
                }],
                tools=anthropic_tools,
            )

            # Execute any tool calls via MCP
            for content in response.content:
                if content.type == "tool_use":
                    result = await session.call_tool(content.name, content.input)
                    # Continue the conversation with the result (see below)

asyncio.run(main())
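
To close the loop, send the tool output back to Claude as a tool_result block so it can produce a final answer. A sketch continuing inside the loop above, reusing anthropic_tools and the original user message:

# Continue the conversation with the tool result
follow_up = anthropic.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What tables are available and what's the schema of the sales table?"},
        {"role": "assistant", "content": response.content},
        {"role": "user", "content": [{
            "type": "tool_result",
            "tool_use_id": content.id,
            "content": result.content[0].text,  # assumes the tool returned text
        }]},
    ],
    tools=anthropic_tools,
)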

MCP for Data Platforms

Microsoft Fabric MCP Server

from mcp.server.fastmcp import FastMCP
from azure.identity import DefaultAzureCredential
import requests
import sempy.fabric as fabric

server = FastMCP("fabric-mcp")

def get_fabric_headers(scope: str = "https://api.fabric.microsoft.com/.default"):
    credential = DefaultAzureCredential()
    token = credential.get_token(scope)
    return {"Authorization": f"Bearer {token.token}", "Content-Type": "application/json"}

@server.tool()
async def query_semantic_model(dax_query: str, dataset: str) -> str:
    """Evaluate a DAX query against a Fabric semantic model."""
    # sempy's semantic link runs the DAX query and returns a DataFrame
    df = fabric.evaluate_dax(dataset=dataset, dax_string=dax_query)
    return df.to_markdown()

@server.tool()
async def run_notebook(workspace_id: str, notebook_id: str, parameters: dict) -> str:
    """Execute a Fabric notebook with parameters via the REST API."""
    headers = get_fabric_headers()
    url = f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items/{notebook_id}/jobs/instances?jobType=RunNotebook"
    response = requests.post(url, headers=headers, json={"executionData": {"parameters": parameters}})
    response.raise_for_status()
    # The API returns 202 Accepted; the job instance URL is in the Location header
    return f"Notebook job started: {response.headers.get('Location')}"

@server.tool()
async def refresh_semantic_model(workspace_id: str, dataset_id: str) -> str:
    """Refresh a Power BI semantic model via the Power BI REST API."""
    headers = get_fabric_headers(scope="https://analysis.windows.net/powerbi/api/.default")
    url = f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}/datasets/{dataset_id}/refreshes"
    response = requests.post(url, headers=headers, json={"type": "Full"})
    response.raise_for_status()
    return f"Refresh initiated for dataset {dataset_id}"

@server.resource("fabric://workspaces/{workspace_id}/lakehouses")
async def list_lakehouses(workspace_id: str) -> str:
    """List all lakehouses in a workspace via the REST API."""
    headers = get_fabric_headers()
    url = f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/lakehouses"
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    lakehouses = response.json().get("value", [])
    return "\n".join(lh["displayName"] for lh in lakehouses)

Azure Data Factory MCP Server

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from mcp.server.fastmcp import FastMCP

server = FastMCP("adf-mcp")

# Shared settings for the ADF management client
subscription_id = "your-subscription-id"
resource_group = "your-resource-group"

def get_adf_client() -> DataFactoryManagementClient:
    return DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

@server.tool()
async def list_pipelines(factory_name: str) -> str:
    """List all pipelines in a data factory."""
    client = get_adf_client()
    pipelines = client.pipelines.list_by_factory(
        resource_group, factory_name
    )
    return "\n".join([p.name for p in pipelines])

@server.tool()
async def trigger_pipeline(
    factory_name: str,
    pipeline_name: str,
    parameters: dict | None = None
) -> str:
    """Trigger a pipeline run."""
    client = get_adf_client()
    run = client.pipelines.create_run(
        resource_group, factory_name, pipeline_name,
        parameters=parameters
    )
    return f"Pipeline run started: {run.run_id}"

@server.tool()
async def get_pipeline_status(
    factory_name: str,
    run_id: str
) -> str:
    """Get the status of a pipeline run."""
    client = get_adf_client()
    run = client.pipeline_runs.get(
        resource_group, factory_name, run_id
    )
    return f"Status: {run.status}, Duration: {run.duration_in_ms}ms"

Security Considerations

The MCP spec defines OAuth 2.1-based authorization, but Python SDK support for it is still maturing; treat the auth and middleware helpers in this section as illustrative sketches and check your SDK version for the interfaces it actually ships.

Authentication

from mcp import Server
from mcp.auth import OAuth2Handler

server = Server("secure-mcp")

# Configure authentication
server.auth = OAuth2Handler(
    provider="azure-ad",
    client_id="your-client-id",
    scopes=["https://graph.microsoft.com/.default"]
)

@server.tool()
async def sensitive_operation(data: str) -> str:
    # Tool only accessible to authenticated users
    user = server.current_user
    if not user.has_permission("data:write"):
        raise PermissionError("Insufficient permissions")
    # ... perform operation

Rate Limiting

from mcp.middleware import RateLimiter

server.add_middleware(
    RateLimiter(
        requests_per_minute=60,
        requests_per_hour=1000
    )
)
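
If your SDK version doesn't ship this middleware, the same effect can be had with a per-tool decorator; a minimal sliding-window sketch using only the standard library:

import time
from functools import wraps

def rate_limited(max_per_minute: int):
    """Reject calls once the per-minute budget is exhausted."""
    calls: list[float] = []

    def decorator(fn):
        @wraps(fn)
        async def wrapper(*args, **kwargs):
            now = time.monotonic()
            calls[:] = [t for t in calls if now - t < 60]  # keep the last 60s
            if len(calls) >= max_per_minute:
                raise RuntimeError("Rate limit exceeded; try again later")
            calls.append(now)
            return await fn(*args, **kwargs)
        return wrapper
    return decorator

@server.tool()
@rate_limited(max_per_minute=60)
async def query_warehouse(query: str, warehouse: str = "main") -> str:
    ...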

Audit Logging

from mcp.middleware import AuditLogger

server.add_middleware(
    AuditLogger(
        backend="azure_monitor",
        log_inputs=True,
        log_outputs=False  # Don't log sensitive results
    )
)
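
The same decorator pattern covers audit logging with nothing but Python's logging module; a sketch that records inputs while deliberately leaving outputs out of the log:

import logging
from functools import wraps

audit_log = logging.getLogger("mcp.audit")

def audited(fn):
    """Log each tool invocation; inputs only, since results may be sensitive."""
    @wraps(fn)
    async def wrapper(*args, **kwargs):
        audit_log.info("tool=%s args=%r kwargs=%r", fn.__name__, args, kwargs)
        return await fn(*args, **kwargs)
    return wrapper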

Benefits of MCP

  1. Standardization: One protocol for all AI integrations
  2. Security: Built-in auth and permission handling
  3. Discoverability: AI can explore available tools
  4. Portability: Switch AI providers without rewriting integrations
  5. Composability: Combine multiple MCP servers

Getting Started

  1. Identify tools your AI applications need
  2. Build MCP servers exposing those capabilities
  3. Configure AI clients to connect to your servers
  4. Implement security and monitoring
  5. Iterate based on usage patterns

MCP is becoming essential infrastructure for enterprise AI. Start building your MCP servers today.

Michael John Peña

Senior Data Engineer based in Sydney. Writing about data, cloud, and technology.