
CybersecurityHQ — CISO Deep Dive
CybersecurityHQ exists to issue and preserve dated, bounded external cyber judgment. Not news reaction, advisory opinion, or consensus analysis.
1. What MCP Actually Is (Operationally, Not Conceptually)
The Model Context Protocol is a JSON-RPC based standard that mediates communication between AI agents and external tools. It operates as an intermediary layer where clients (Claude Desktop, Cursor, VS Code integrations) request capabilities, and servers expose tools, file systems, APIs, and organizational resources.
The architecture involves three components: MCP clients that interface with LLMs and render user interactions; MCP servers that expose tool definitions and execute actions against downstream systems; and the transport layer handling JSON-RPC payloads between them.
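To make the transport layer concrete, the following is a minimal sketch of the JSON-RPC 2.0 envelope a client might build for a tool invocation. It assumes the `tools/call` method with `name` and `arguments` parameters; the helper names and the `send_email` tool are hypothetical and chosen only for illustration.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC pairs each response with its request by id


def make_request(method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request envelope."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_request_ids), "method": method, "params": params}
    )


def make_tool_call(name: str, arguments: dict) -> str:
    """Build a tool invocation request (assumes the `tools/call` method)."""
    return make_request("tools/call", {"name": name, "arguments": arguments})


def parse_response(raw: str) -> dict:
    """Return the `result` member of a response, raising on a JSON-RPC error."""
    message = json.loads(raw)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"server error {error.get('code')}: {error.get('message')}")
    return message["result"]


if __name__ == "__main__":
    # Hypothetical tool name and arguments, for illustration only.
    print(make_tool_call("send_email", {"to": "user@example.com", "subject": "hi"}))
```

Nothing in the envelope identifies or authenticates the caller. Any identity or authorization check has to come from the transport or from the server implementation.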
MCP servers hold OAuth tokens, API keys, and service credentials. They execute tool calls with whatever permissions those credentials carry. When a user authorizes an MCP server to access Gmail, Slack, GitHub, or Salesforce, the server maintains persistent tokens that remain valid until explicitly revoked.
Research from Clutch Security indicates that in a typical 10,000-person organization, approximately 15% of employees run an average of two MCP servers each. This works out to roughly 3,000 server deployments (10,000 × 0.15 × 2), 38% of which originate from unofficial implementations by anonymous authors.
Astrix Research analyzed over 5,000 open-source MCP server implementations. Approximately 88% require credentials to function. The predominant credential pattern remains static secrets stored in plaintext configuration files or environment variables.
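The plaintext-secret pattern can be checked mechanically. The sketch below assumes the `mcpServers` and `env` layout that several MCP clients use for local server configuration. The key-name and token-prefix heuristics are illustrative assumptions, not a complete detector, and the function name is hypothetical.

```python
import json
import re

# Heuristics only: variable names that usually hold credentials, and a few
# widely seen token prefixes. Both lists are assumptions and far from complete.
SECRET_NAME_HINT = re.compile(r"(token|secret|key|password|passwd|credential)", re.IGNORECASE)
SECRET_VALUE_HINT = re.compile(r"^(ghp_|github_pat_|xox[abp]-|sk-|AKIA)")


def find_plaintext_secrets(config_text: str) -> list[tuple[str, str]]:
    """Return (server_name, env_var) pairs that look like literal secrets."""
    config = json.loads(config_text)
    findings = []
    for server_name, server in config.get("mcpServers", {}).items():
        for var, value in server.get("env", {}).items():
            if not isinstance(value, str) or not value:
                continue
            if value.startswith("${"):
                # Looks like a reference to an external store, not a literal value.
                continue
            if SECRET_NAME_HINT.search(var) or SECRET_VALUE_HINT.match(value):
                findings.append((server_name, var))
    return findings
```

A finding here means only that a value resembles a literal credential. It says nothing about the scope that credential carries, which is the more consequential question.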
The protocol specification itself does not mandate authentication; that is left to implementers. JSON-RPC, the message format MCP uses, has no built-in authentication or encryption.
This is no longer a tooling layer. MCP has become the de facto identity broker, execution engine, and authorization plane for AI-assisted workflows. The question of whether to govern it was resolved by adoption patterns that preceded governance frameworks.
MCP now satisfies every operational criterion historically used to classify Tier-0 infrastructure: credential concentration, execution authority, cross-domain access, and irreversibility of misuse. Systems meeting these criteria have always been treated as primary breach vectors. MCP is not an exception; it is simply newer and less formally governed.
2. Observed Failure Modes
2.1 Stored Prompt Injection
In May 2025, Invariant Labs documented a prompt injection vulnerability in the official GitHub MCP server. Malicious instructions embedded in public GitHub issues could hijack AI assistants and cause them to exfiltrate data from private repositories into public pull requests. The root cause was broad Personal Access Token scopes combined with untrusted content entering the LLM context.
In mid-2025, researchers at General Analysis demonstrated that Supabase's Cursor agent, operating with privileged service-role access, would process support tickets containing user-supplied input as executable commands. Attackers embedded SQL instructions in support tickets that read integration tokens and posted them to public threads.
CVE-2025-54135 and CVE-2025-54136 affected Cursor IDE versions below 1.3.9. Attackers could implant malicious instructions in public documents (README files, shared documents). When an AI agent summarized the contaminated document, it would follow hidden instructions to create a malicious .cursor/mcp.json configuration file containing reverse shell commands. The agent executed these without user interaction because file creation did not require approval.
2.2 OAuth and Credential Abuse
CVE-2025-6514 affected mcp-remote, an OAuth proxy with over 437,000 downloads. The vulnerability allowed malicious MCP servers to send a crafted authorization_endpoint that mcp-remote passed directly to the system shell, achieving remote code execution. Attackers could execute arbitrary commands, steal API keys, cloud credentials, local files, SSH keys, and Git repository contents.
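The general defense against this class of bug is to treat server-supplied metadata as untrusted input and never route it through a shell. The following is a minimal sketch under those assumptions, not a reconstruction of the mcp-remote fix. The function names and the `xdg-open` default are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}


def validate_authorization_endpoint(url: str) -> str:
    """Return url unchanged if it is a plain HTTPS endpoint, else raise ValueError."""
    # Check the raw string first: URL parsers may silently strip some characters.
    if any(ch.isspace() or ord(ch) < 0x20 for ch in url):
        raise ValueError("whitespace or control characters in URL")
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme {parts.scheme!r} is not allowed")
    if not parts.hostname:
        raise ValueError("URL has no host")
    return url


def browser_command(url: str, opener: str = "xdg-open") -> list[str]:
    """Build an argv list intended for subprocess.run(argv, shell=False).

    Passing the URL as a single argv element means no shell ever parses it,
    so characters such as `$(...)` or `;` are never interpreted as commands.
    """
    return [opener, validate_authorization_endpoint(url)]
```

Validation narrows what is accepted, but the argv list is the control that matters: a URL that is never interpreted by a shell cannot inject shell commands.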
The MCP Authorization specification permits confused deputy attacks when proxy servers use static client IDs with third-party authorization servers while allowing MCP clients to dynamically register. When a third-party authorization server sets a consent cookie after first authorization, attackers can redirect authorization codes to their own servers and exchange stolen codes for access tokens.
Asana discovered a logic flaw in its MCP server that permitted cross-tenant access. Projects, teams, tasks, and other objects belonging to one customer were potentially accessible by different customers due to improper isolation in the access control layer.
2.3 Tool Poisoning and Shadowing
Tool descriptions are passed directly to AI models as context. Invariant Labs demonstrated that malicious instructions embedded in tool metadata can redirect agent behavior without user awareness. A tool described as "add two numbers" contained hidden instructions that modified how a separate, trusted email-sending tool operated. The agent sent all emails to an attacker-controlled address while displaying the user-specified recipient in the interface.
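Because tool descriptions reach the model verbatim, some teams screen them before approval. The sketch below is a heuristic screen only: the patterns are assumptions drawn from the style of published proof-of-concept payloads, all names are hypothetical, and an attacker who knows the patterns can evade them.

```python
import re
import unicodedata

# (label, pattern) pairs. These are illustrative assumptions, not a ruleset
# from any scanner, and they will produce false positives and false negatives.
SUSPICIOUS_PATTERNS = [
    ("hidden-instruction tag",
     re.compile(r"<\s*(important|system|secret|instructions?)\s*>", re.IGNORECASE)),
    ("override of prior instructions",
     re.compile(r"\b(ignore|disregard)\b.{0,40}\b(previous|prior|above)\b", re.IGNORECASE)),
    ("conceal from user",
     re.compile(r"\bdo not (tell|mention|inform|notify)\b.{0,40}\buser\b", re.IGNORECASE)),
    ("cross-tool directive",
     re.compile(r"\b(before|when|whenever) (using|calling|invoking)\b.{0,60}\btool\b", re.IGNORECASE)),
    ("sensitive path reference",
     re.compile(r"(~/\.ssh|id_rsa|mcp\.json)", re.IGNORECASE)),
]


def scan_tool_description(description: str) -> list[str]:
    """Return labels for every heuristic the description trips."""
    findings = []
    # Unicode format characters (category Cf), such as zero-width spaces, can hide text.
    if any(unicodedata.category(ch) == "Cf" for ch in description):
        findings.append("invisible format characters")
    for label, pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(description):
            findings.append(label)
    return findings
```

An empty result does not mean a description is safe. It means only that none of these particular patterns matched.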
Invariant Labs demonstrated that a malicious MCP server could exfiltrate a user's entire WhatsApp message history by combining tool poisoning with a legitimate whatsapp-mcp server in the same agent. A "random fact of the day" tool morphed into a sleeper backdoor that rewrote how WhatsApp messages were sent, forwarding conversation history to an attacker-controlled number disguised as ordinary outbound messages.
MCP servers can modify their tool definitions between sessions. A tool approved on day one can silently reroute API keys to an attacker by day seven. While some MCP clients show tool descriptions initially, they do not notify users about changes to tool definitions after approval.
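One client-side response to silent redefinition is trust-on-first-use pinning: record a fingerprint of each tool definition at approval time and compare on every later session. This is a minimal sketch assuming tool definitions arrive as JSON objects with a `name` field. The function names are hypothetical, and no particular MCP client is claimed to work this way.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the full tool definition."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_tools(tools: list[dict]) -> dict[str, str]:
    """Record fingerprints at approval time, keyed by tool name."""
    return {tool["name"]: fingerprint(tool) for tool in tools}


def diff_against_pins(pins: dict[str, str], tools: list[dict]) -> dict[str, list[str]]:
    """Compare the tools offered now against what was approved earlier."""
    current = pin_tools(tools)
    return {
        "added": sorted(set(current) - set(pins)),
        "removed": sorted(set(pins) - set(current)),
        "changed": sorted(n for n in current if n in pins and pins[n] != current[n]),
    }
```

Any non-empty `added` or `changed` list would be grounds to block the session until a human re-approves. Pinning detects change after approval; it does not establish that the definition approved on day one was benign.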
2.4 Supply Chain Compromise
Koi Security identified a malicious MCP server package masquerading as a legitimate "Postmark MCP Server" with a single line changed. The modification added a BCC field to all emails sent through the tool, forwarding copies to an attacker-controlled address.
A supply chain attack on an MCP hosting service leaked builder credentials including a Fly.io API token granting control over more than 3,000 applications, most of them hosted MCP servers. From there, attackers could run arbitrary commands in MCP server containers and intercept inbound client traffic containing API keys for downstream services.
Security researchers found two critical flaws in Anthropic's Filesystem-MCP server: sandbox escape and symlink/containment bypass, enabling arbitrary file access and code execution on the host filesystem.
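Containment checks of this kind tend to fail in two ways: comparing path strings by prefix, and checking a path before symlinks are resolved. The sketch below illustrates a check that avoids both. It is a generic pattern with a hypothetical function name, not a reconstruction of the reported flaws or their fixes.

```python
import os


def resolve_inside(root: str, requested: str) -> str:
    """Resolve `requested` relative to `root`, refusing anything that escapes it.

    realpath() resolves symlinks before the comparison, and commonpath()
    compares whole path components, so a sibling such as /srv/data_evil
    is not mistaken for a child of /srv/data.
    """
    root_real = os.path.realpath(root)
    target_real = os.path.realpath(os.path.join(root_real, requested))
    if os.path.commonpath([root_real, target_real]) != root_real:
        raise PermissionError(f"{requested!r} resolves outside the allowed root")
    return target_real
```

A check like this is still subject to a race if a symlink is swapped between the check and the subsequent open, so it belongs alongside OS-level isolation rather than in place of it.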
3. The Structural Impossibility
Existing security controls assume a separation that MCP collapses by design. Identity, logging, segmentation, and change control each fail for the same reason: MCP merges the trust boundary between user, agent, credential, and execution into a single context window that cannot be partitioned after the fact.
On identity: MCP authentication must handle the human user, the AI agent acting on their behalf, the MCP client application, the MCP server, and the downstream services being accessed. Traditional OAuth assumes human users click through consent screens. MCP agents discover and access capabilities programmatically without human intervention at runtime. User-delegated agents inherit human identity but current implementations grant persistent, broad-scoped credentials rather than ephemeral, task-specific tokens. Compromised tokens do not trigger login alerts because they appear as normal API usage.
On logging: MCP was designed as a connectivity protocol, not an observability platform. Out of the box, logs capture communications only for the current session. When the process restarts, logs are lost. Traditional security tools were not designed for context-aware auditing. Knowing "what resource was touched" is insufficient. MCP auditing requires knowing which agent, operating under which policy, with what context metadata, accessed which specific resource. Attackers exploit the complexity of multi-party workflows and ephemeral infrastructure to hide malicious activity.
On segmentation: Backslash Security found hundreds of MCP servers bound to 0.0.0.0, exposing all network interfaces. MCP servers chain tool calls: an over-privileged permission in one context cascades into network access, shell commands, or data exfiltration in another. MCP concentrates access. A single server breach hands attackers broad control over every integrated service. Access persists even after password changes because OAuth tokens remain valid until explicitly revoked.
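The bind-address finding is among the easier exposures to audit. The following is a small sketch that classifies a configured listen host; the function name and the category labels are hypothetical.

```python
import ipaddress


def bind_exposure(host: str) -> str:
    """Classify a listen address by how widely it exposes the server."""
    if host in ("", "*"):
        # Many frameworks treat an empty or wildcard host as "every interface".
        return "all-interfaces"
    if host == "localhost":
        return "loopback"
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return "unknown"  # a hostname we do not attempt to resolve here
    if address.is_unspecified:  # 0.0.0.0 or ::
        return "all-interfaces"
    if address.is_loopback:  # 127.0.0.0/8 or ::1
        return "loopback"
    return "specific-interface"
```

A loopback binding limits network reachability only. It does nothing about other local processes, or a compromised agent on the same machine, connecting to the server.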
On change control: Tool definitions can change between sessions without versioning, notification, or approval. AI agents generate edge cases human users never attempt: unusual parameter combinations, extreme values, rapid-fire request patterns that expose race conditions. If a path exists to chain tool calls for privilege escalation or data extraction, an AI system will eventually find it. Enterprise MCP deployments lack approval workflows, server-side validation, and audit trails. Permissions configured per user become too numerous and varied to review as a whole. Temporary debug access quietly becomes permanent.
These are not four separate problems. They are four manifestations of a single architectural fact: MCP places credential storage, execution authority, and trust decisions inside a context window that treats all tokens as equally authoritative. The controls designed to separate those functions cannot be retrofitted onto a layer that was never designed to separate them.
In MCP environments, breach discovery will systematically lag breach occurrence. Most organizations will first encounter MCP compromise through downstream anomalies, third-party notification, or regulatory inquiry rather than internal detection.
4. Why This Layer Is Hard to See
4.1 Tooling Gaps
MCP security tooling emerged in late 2025 in response to documented breaches. Before this, organizations deploying MCP had no specialized scanning, monitoring, or governance capabilities. The security market is now producing MCP-specific tools (MCP-Scan, MCP Manager, Invariant, Pillar Security, and others), but enterprise deployment of these tools lags MCP adoption.
Traditional endpoint detection and response tools do not parse MCP protocol traffic. SIEM integrations for MCP audit logs require custom development. Vulnerability scanners designed for web applications and APIs do not evaluate tool descriptions for embedded prompt injection payloads.
The MCPTox benchmark evaluates how often malicious or manipulated tool definitions pass into AI agent contexts without detection. The benchmark exists because no standard testing methodology existed prior to its development in August 2025.
MCP governance cannot be retrofitted at the same rate MCP adoption is compounding. By the time inventories, policies, and audit frameworks exist, the installed base will already exceed the organization's ability to re-authorize, re-scope, or rotate credentials safely.
4.2 Ownership Ambiguity
MCP servers are installed by individual developers. They run locally on developer workstations, often in personal directories. Security teams may not know they exist. IT asset inventories do not catalog them. The servers use credentials provisioned by individual users rather than centrally managed service accounts.
The 2025 Stack Overflow Developer Survey indicates 84% of respondents use or plan to use AI tools in their development process. When these tools operate via MCP with all-access permissions to files, shell, and network, the potential damage from a single compromised session increases. Yet permission grants happen at the individual developer level, outside centralized access governance processes.
Shadow MCP refers to unauthorized MCP instances operating outside governance, capable of executing tools or exfiltrating data without audit trails or oversight. Signs of overexposure include overly permissive tool scopes, unverified third-party endpoints, unused or orphaned tool paths, and MCP servers accepting commands without authentication.
4.3 Audit Blind Spots
Noma Security documented an incident where an AI agent tasked with cleaning duplicate customer records deleted thousands of legitimate accounts. The agent had destructive database capabilities enabled by default. The organization lacked auditing capabilities to trace the sequence of tool invocations that led to the deletion.
MCP-related security events require specialized observability. Organizations need to know which agent accessed what resource, when, and under what specific context. Without this capability, security teams cannot detect anomalous patterns, conduct post-incident analysis, or map blast radius when incidents occur. AI-enabled systems suffer from structural opacity in the absence of reproducible inference logging and state preservation; suppression mechanisms discard decisions silently, preventing forensic reconstruction.
The worst time to discover gaps in audit coverage is during a real security incident. Regular tabletop exercises that walk through hypothetical compromises reveal whether logs have enough detail to reconstruct attack timelines and identify affected data. Most organizations have not conducted MCP-specific tabletop exercises.
5. Conditions Under Which This Risk Weakens
The structural risks documented above persist under current MCP implementation patterns. However, several architectural conditions would reduce exposure if achieved:
If MCP servers operated with ephemeral, tightly scoped credentials: Token lifetimes measured in minutes rather than months would limit exposure windows. Credentials scoped to specific operations rather than broad API access would contain lateral movement. This requires integration with identity providers that support fine-grained, time-bounded delegation.
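The ephemeral-credential condition reduces to two checks at every tool call: has the token expired, and does it carry the one scope this operation needs. The sketch below is a minimal in-memory illustration. The type, the scope strings, and the five-minute default are assumptions; a real deployment would delegate issuance and validation to an identity provider.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopedToken:
    subject: str            # the user or agent the token was issued for
    scopes: frozenset[str]  # operations this token may perform
    expires_at: float       # Unix timestamp after which the token is invalid


def issue_token(subject: str, scopes: set[str], ttl_seconds: float = 300.0,
                now: Optional[float] = None) -> ScopedToken:
    """Issue a short-lived token; the default lifetime is minutes, not months."""
    issued_at = time.time() if now is None else now
    return ScopedToken(subject, frozenset(scopes), issued_at + ttl_seconds)


def authorize(token: ScopedToken, required_scope: str,
              now: Optional[float] = None) -> bool:
    """Allow the call only if the token is unexpired and carries the exact scope."""
    current = time.time() if now is None else now
    return current < token.expires_at and required_scope in token.scopes
```

With a lifetime this short, a stolen token is usable for at most a few minutes and only for the operations it names. That is the property the persistent, broad-scoped pattern lacks.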
If tool definitions were cryptographically signed and immutable: Versioned tool schemas with signature verification would prevent silent modification between sessions. Clients could reject unsigned changes and alert users to tool definition updates requiring explicit re-approval.
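As an illustration of the signed-definition condition, the sketch below signs and verifies a canonical encoding of a tool definition. It uses HMAC-SHA256 from the standard library purely to stay self-contained, which is a substitution for the asymmetric scheme a production design would more plausibly use so that clients never hold the signing key. The function names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(tool: dict) -> bytes:
    """Stable encoding so that key order and whitespace do not affect the signature."""
    return json.dumps(tool, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def sign_tool(tool: dict, key: bytes) -> str:
    """Publisher side: produce a signature over the full definition."""
    return hmac.new(key, canonical_bytes(tool), hashlib.sha256).hexdigest()


def verify_tool(tool: dict, signature: str, key: bytes) -> bool:
    """Client side: accept the definition only if the signature still matches."""
    expected = sign_tool(tool, key)
    return hmac.compare_digest(expected, signature)  # constant-time comparison
```

Because the signature covers the description as well as the schema, a change to either invalidates it, and a client could refuse the tool until the new version is explicitly re-approved.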
If MCP logs were externalized and tamper-evident: Structured logging to external SIEM systems with cryptographic integrity protection would enable forensic reconstruction. Context-aware audit trails capturing agent identity, policy context, and full request-response chains would support investigation.
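Tamper evidence is commonly achieved with a hash chain, in which each entry commits to the one before it. The sketch below keeps the chain in memory for brevity. The event fields mirror the audit dimensions named above (agent, policy, tool, resource), but the field names and function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps({"prev": prev_hash, "event": event},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> dict:
    """Append an event whose hash covers both the event and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain alone does not detect truncation of the most recent entries. That requires periodically anchoring the latest hash in a system the MCP host cannot write to, which is one reason externalization and tamper evidence are paired.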
If MCP deployments were discoverable and centrally governed: Asset inventory systems that automatically detect MCP server processes would provide visibility. Centralized policy enforcement that validates server configurations before allowing network access would reduce shadow deployments.
If the protocol specification mandated authentication and authorization: Baseline security requirements built into the MCP standard itself would raise the floor for all implementations. Optional security features remain optional; mandatory requirements establish minimum defensibility.
These conditions are technically achievable but operationally rare. Their absence represents the current baseline against which organizational risk should be assessed.
6. Unresolved Governance Ambiguities
Several classification and accountability questions remain undefined at both the protocol and organizational levels:
Who owns the risk when an MCP server compromises downstream services? The user who installed the server operates without security review. The security team lacks visibility into what servers exist. The application owner whose API was accessed did not grant direct access. The MCP client vendor provides the infrastructure but not the server code. The MCP server author may be anonymous. Incident response requires identifying an accountable party; current MCP deployments do not clearly establish one.
When is an MCP tool invocation a user action versus an agent action? For compliance frameworks requiring user authentication and authorization, the distinction matters. If an MCP agent deletes production data, was that the user's action because they initiated the agent, or the agent's action because it selected the specific tool and parameters? Audit trails that show "user X via MCP agent Y" do not resolve whether X authorized the specific action or only the general task.
What controls apply when MCP servers cross trust boundaries? An MCP server running on a developer laptop may access corporate APIs, personal cloud storage, and external SaaS services within a single session. Traditional security controls assume network zones and data classification boundaries. MCP workflows that span personal and corporate infrastructure do not fit existing data handling policies.
How should organizations classify MCP vulnerabilities? Is a prompt injection vulnerability in an MCP tool description an application security issue, an AI safety issue, or a supply chain issue? Is an OAuth token leak from an MCP server a credential compromise, an insider threat, or a third-party breach? Existing incident classification taxonomies do not cleanly map to MCP-specific attack vectors.
What artifacts demonstrate adequate MCP governance? Organizations subject to SOC 2, ISO 27001, or regulatory audits must demonstrate control effectiveness. What evidence proves MCP deployments are governed: server inventories, approved tool registries, audit log retention, tabletop exercise records, policy documentation? No consensus standard exists for what constitutes auditable MCP governance.
These ambiguities create exposure beyond technical vulnerabilities. When incident response plans do not address MCP-specific scenarios, when audit procedures do not evaluate MCP controls, and when accountability frameworks do not assign MCP risk ownership, organizations operate without the governance artifacts typically expected for systems with comparable access and impact.
7. Open Questions
The MCP specification recommends that there "SHOULD always be a human in the loop with the ability to deny tool invocations." Whether SHOULD functions as MUST in enterprise deployments remains undefined. The protocol does not enforce this requirement, leaving implementation to individual clients.
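Treating SHOULD as MUST would mean placing a deny-capable gate in front of every tool invocation. The following is a minimal sketch of that gate with hypothetical names; the `approve` callback stands in for whatever interface actually asks the human.

```python
from typing import Any, Callable


class ToolCallDenied(Exception):
    """Raised when the approver declines a tool invocation."""


def gated_call(tool: Callable[..., Any], name: str, arguments: dict,
               approve: Callable[[str, dict], bool]) -> Any:
    """Run `tool` only after `approve` has accepted its name and exact arguments."""
    # Freeze the arguments so that what is approved is exactly what is executed.
    frozen_arguments = dict(arguments)
    if not approve(name, dict(frozen_arguments)):
        raise ToolCallDenied(name)
    return tool(**frozen_arguments)
```

The gate is only as good as what the approver is shown. The tool-shadowing cases above succeeded partly because the interface displayed one recipient while the call carried another, so the approval prompt needs to render the arguments that will actually be executed.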
MCP servers running locally execute with the user's full privileges. Whether containerization, sandboxing, or process isolation becomes standard practice is unresolved. The tradeoff between developer convenience (broad access for productivity) and security constraints (least privilege, ephemeral credentials) has not been settled at the protocol or implementation level.
Microsoft integrated MCP support across Copilot Studio and Azure AI Foundry. Whether enterprise platform vendors will mandate authentication, audit logging, and tool verification as non-optional features, or leave these as configurable options, affects whether MCP deployments inherit security posture from the platform or must build it independently.
The breaches documented in 2025 confirm that MCP systems are governed by the same security principles as traditional software. Whether organizations will treat MCP surfaces with the same seriousness as API gateways, CI/CD pipelines, and cloud IAM remains to be observed.
Prompt injection remains listed as LLM01 on the OWASP Top 10 for GenAI and Large Language Model Applications. Whether technical mitigations can adequately address a vulnerability class where malicious intent can be encoded in natural language in virtually infinite ways is an open research problem.
Multi-agent systems decompose problems across specialized agents. Google announced the Agent2Agent Protocol in 2025 to enable communication between agentic applications regardless of vendor. Whether agent-to-agent communication adds more attack surface than the coordination problems it solves can justify remains unresolved.
The MCP ecosystem continues to expand. Unofficial registries index over 16,000 MCP servers. The attack surface grows with each new server, each new credential grant, each new tool definition that enters a context window somewhere.
Whether the gap between MCP adoption velocity and MCP governance maturity represents a temporary lag or a structural condition of the protocol's design remains unresolved.
The absence of agreed-upon governance artifacts for MCP does not delay accountability. It ensures that accountability will be assigned retroactively, under breach conditions, using standards that do not yet exist.
This record establishes the observable baseline for MCP-related control-plane exposure as of January 2026.

