Governing MCP Servers, Skills, and Plugins at Enterprise Scale
AgentSeal scanned 1,808 MCP servers in early 2026. 66% had security findings. 43% were vulnerable to command execution attacks. 53% relied on long-lived static secrets for authentication. OAuth adoption sat at 8.5%. In organizations with 100+ engineers, security assessments routinely found 15 to 30 MCP server configurations on developer machines that IT had no visibility into. OWASP published a dedicated MCP Top 10. Anthropic, Microsoft, and multiple open-source MCP servers were flagged for vulnerabilities enabling remote code execution and cloud account takeover.

The Model Context Protocol won. It is the standard interface between AI agents and external systems. Every major IDE, every cloud provider, and every significant AI platform supports it. MCP servers, skills built on top of them, and plugins that extend agent capabilities are proliferating at a pace that enterprise governance has not kept up with. The protocol layer works. The management layer does not.

The three unsolved problems are version control and distribution, access control and authorization, and security across the tool supply chain. Each requires specific architectural responses.

Version Control and Distribution

MCP servers ship as source code or package manager references. A typical enterprise MCP deployment involves fetching dependencies, resolving versions, and hoping nothing changed since the last successful test. This is the npm/PyPI problem all over again, except the blast radius is larger because these tools have runtime access to enterprise systems, credentials, and data.

The version control challenge compounds across three layers. MCP servers themselves need versioning. Skills, which compose multiple tool invocations into higher-level capabilities, depend on specific server versions and behaviors. Plugins that extend agent functionality depend on both. A single breaking change in a server's tool schema cascades through every skill and plugin that references it.

The Registry Architecture

The MCP specification now defines a registry protocol with a federated architecture. The upstream MCP Registry at registry.modelcontextprotocol.io serves as the canonical source of public MCP server metadata. Enterprise subregistries, public or private, can ingest, augment, mirror, and extend this upstream registry while applying organizational policies.

A production registry must enforce several invariants. Version strings use semantic versioning and must be unique per publication. Once published, version metadata is immutable. The registry stores server manifests that declare tool schemas, required credentials, runtime dependencies, and minimum client capabilities. This is the package.json equivalent for the MCP ecosystem, and it needs the same rigor that took npm a decade to develop.
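The invariants above are straightforward to enforce mechanically. A minimal in-memory sketch (the class and method names here are illustrative, not part of the MCP registry protocol) shows the two core rules: versions must be valid semver, and a published version can never be overwritten:

```python
import re

# Semantic version: MAJOR.MINOR.PATCH with an optional pre-release suffix.
SEMVER = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$")

class Registry:
    """In-memory sketch of a registry enforcing version invariants."""

    def __init__(self):
        self._manifests = {}  # (server_name, version) -> manifest dict

    def publish(self, name, version, manifest):
        if not SEMVER.match(version):
            raise ValueError(f"not a semantic version: {version}")
        key = (name, version)
        if key in self._manifests:
            # Once published, a version is immutable: republishing is rejected.
            raise ValueError(f"{name}@{version} already published")
        self._manifests[key] = dict(manifest)

    def get(self, name, version):
        return self._manifests[(name, version)]
```

Immutability is what makes the registry auditable: fixing a bad release means publishing a new version, never mutating an existing one.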

The more interesting architectural decision is MCPB: MCP Bundles. Instead of distributing servers as source code with dependency resolution at install time, MCPB packages the server code, all dependencies, and a manifest describing runtime requirements into a self-contained, reproducible artifact. This eliminates the class of problems where a server works on one machine but fails on another because of dependency drift. For enterprises, MCPB bundles are the equivalent of container images for MCP servers: hermetic, versioned, and auditable.

What a Governed Distribution Pipeline Looks Like

The pattern that works is a three-tier registry with promotion gates. Internal teams publish MCP servers and skills to a development registry. Automated scanning (dependency audit, tool description analysis, schema validation) gates promotion to a staging registry. Security review and compliance sign-off gates promotion to the production registry. Only servers in the production registry are available to agents operating on production data.
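The promotion logic itself is simple; the value is in the gates. A sketch, assuming each gate is a callable check (a scanner, a review sign-off) and that tier state is tracked per version:

```python
TIERS = ["development", "staging", "production"]

def promote(registry_state, name, version, to_tier, checks):
    """Move a published version up exactly one tier, gated by checks.

    `registry_state` maps (name, version) -> current tier.
    `checks` is a list of callables (scanners, sign-offs) returning True/False.
    """
    current = registry_state.get((name, version), "development")
    if TIERS.index(to_tier) != TIERS.index(current) + 1:
        raise ValueError("promotion must advance exactly one tier")
    if not all(check(name, version) for check in checks):
        raise ValueError(f"gate check failed; {name}@{version} stays in {current}")
    registry_state[(name, version)] = to_tier
```

The one-tier-at-a-time rule matters: it guarantees no server reaches production without having passed the staging gates, even if someone tries to promote it directly.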

Each tier enforces immutable versioning. Rollback means deploying a previous version, not mutating the current one. The registry maintains a signed manifest for every version, including cryptographic hashes of the bundle contents, the tool schemas exposed, and the permissions required. Agents verify these signatures before loading a server. This is the software supply chain security pattern applied to AI tooling.
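Signing and verification can be sketched with the standard library. This example uses a symmetric HMAC key for brevity; a real registry would sign with an asymmetric key pair so clients only hold the public half. All names here are illustrative:

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"registry-signing-key"  # hypothetical; use asymmetric keys in production

def sign_manifest(bundle_bytes, tool_schemas, permissions):
    """Registry-side: build a signed manifest for one published version."""
    manifest = {
        "bundle_sha256": hashlib.sha256(bundle_bytes).hexdigest(),
        "tool_schemas": tool_schemas,
        "permissions": permissions,
    }
    payload = json.dumps(manifest, sort_keys=True).encode()
    manifest["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return manifest

def verify_manifest(manifest, bundle_bytes):
    """Client-side: check signature and bundle hash before loading a server."""
    claimed = dict(manifest)
    signature = claimed.pop("signature")
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False
    return hashlib.sha256(bundle_bytes).hexdigest() == manifest["bundle_sha256"]
```

Note the use of `hmac.compare_digest` rather than `==`: constant-time comparison avoids leaking signature bytes through timing.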

Access Control and Authorization

The MCP specification added OAuth 2.1 support in June 2025. It is a recommendation, not a requirement. The spec defines how servers can authenticate clients and how clients can present user credentials to servers. But it does not prescribe how enterprises should govern which users can access which tools, through which servers, under which conditions. That is the enterprise's problem.

The current state is poor. 88% of MCP servers require some form of credentials, but the majority use API keys and personal access tokens. These are long-lived, broadly scoped, and typically stored in plaintext configuration files on developer machines. A single compromised workstation leaks every MCP credential on it.

The Gateway Pattern

There are two approaches to MCP authentication: implement OAuth directly on each server, or deploy a gateway that enforces authentication in front of all servers at once. For enterprises, the gateway pattern wins because it centralizes the authentication boundary.

An MCP gateway sits between agents and MCP servers. It terminates authentication, enforces authorization policies, and proxies requests to upstream servers. The gateway integrates with the enterprise identity provider via SAML or OIDC, mapping user identities to role-based access policies that determine which tools each user or agent can invoke.

The critical detail is per-tool authorization, not just per-server. A user may have access to a GitHub MCP server but should only be able to read repositories, not create or delete them. The gateway must inspect the tool invocation, match it against the user's role permissions, and allow or deny at the individual tool level. This requires the gateway to understand MCP tool schemas and maintain a policy engine that maps roles to permitted tool invocations with parameter constraints.
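A minimal version of that policy engine fits in a few lines. The policy schema below (role to tool to parameter constraints) is an assumption for illustration, not a standard format:

```python
# Role -> tool -> parameter constraints. A real gateway would load this
# from versioned policy storage, not a module-level constant.
POLICIES = {
    "developer": {
        "github.read_repo": {},                        # allowed, no constraints
        "github.create_issue": {"repo": {"org/docs"}}, # allowed only for listed repos
    },
}

def authorize(role, tool, params):
    """Allow or deny a single tool invocation at the individual-tool level."""
    allowed_tools = POLICIES.get(role, {})
    if tool not in allowed_tools:
        return False  # deny by default: unlisted tools are never invocable
    for param, allowed_values in allowed_tools[tool].items():
        if params.get(param) not in allowed_values:
            return False
    return True
```

The deny-by-default stance is the point: a user with access to the GitHub server still cannot call `github.delete_repo` because it is simply absent from their role's policy.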

Scoped, Short-Lived Credentials

The OAuth 2.1 integration in the MCP spec supports RFC 8707 resource indicators, which allow access tokens to be scoped to specific MCP servers. Combined with short token lifetimes and refresh token rotation, this eliminates the long-lived, broadly scoped API key problem. An agent receives a token that is valid for one server, for a limited set of tools, for a short duration. If the token is compromised, the blast radius is bounded.
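The shape of such a token can be sketched as follows. The field names and the gateway-side minting function are illustrative; the `resource` field mirrors the RFC 8707 resource indicator concept of binding a token to one audience:

```python
import secrets
import time

def issue_scoped_token(user, server_url, tools, ttl_seconds=300):
    """Gateway-side sketch: mint a short-lived token bound to one MCP server.

    Scoping to a server plus an explicit tool list bounds the blast radius
    of a leaked token to those tools, for at most `ttl_seconds`.
    """
    return {
        "token": secrets.token_urlsafe(32),
        "sub": user,
        "resource": server_url,  # valid for this server only (RFC 8707 style)
        "tools": list(tools),
        "expires_at": time.time() + ttl_seconds,
    }

def token_valid_for(token, server_url, tool):
    """Check a token against the server and tool it is about to be used on."""
    return (
        time.time() < token["expires_at"]
        and token["resource"] == server_url
        and tool in token["tools"]
    )
```

In a real deployment these claims would live inside a signed JWT rather than a plain dict, but the scoping logic is the same.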

For service-to-service communication where no human is in the loop, the pattern is workload identity federation. The agent's runtime environment (a Kubernetes pod, a cloud function, a CI runner) presents its platform identity to the gateway, which issues a scoped MCP token based on the workload's registered permissions. No static secrets are stored anywhere.

The Security Surface

The OWASP MCP Top 10 codifies what security researchers have been demonstrating since late 2025: MCP servers are a high-value attack surface with novel threat vectors that do not map cleanly onto traditional application security models.

Tool Poisoning

Tool poisoning exploits the gap between what a user sees and what the AI model sees when processing tool descriptions. The user sees a tool named "Add Numbers." The model receives the full tool description, which may contain hidden instructions injected by the server author. In controlled testing, tool poisoning attacks achieve an 84.2% success rate when agents have auto-approval enabled.

The attack is effective because tool descriptions are part of the model's context window and are processed as trusted instructions. A poisoned tool description can instruct the model to exfiltrate data from other tool calls, override safety instructions, or escalate privileges. The user never sees the malicious instructions because they are in the tool metadata, not in the conversation.

Defense requires tool description pinning: the registry stores a cryptographic hash of each tool's description at publication time. The gateway or client verifies the hash before loading the tool. Any modification to the description after publication is detected and blocked. This is the core defense against a related attack, the rug pull, where a server author modifies tool behavior after users have approved it.
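The pinning mechanism itself is a two-line hash check. A sketch of the publication-time and load-time halves:

```python
import hashlib

def pin_description(description):
    """At publication time: record the hash of the tool description."""
    return hashlib.sha256(description.encode()).hexdigest()

def load_tool(description, pinned_hash):
    """At load time: refuse any description that drifted from the pin."""
    current = hashlib.sha256(description.encode()).hexdigest()
    if current != pinned_hash:
        raise RuntimeError(
            "tool description changed since publication "
            "(possible poisoning or rug pull)"
        )
    return description
```

The check must run on the full description the model will see, including any metadata fields, since that is exactly where poisoned instructions hide.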

Supply Chain Attacks

The MCP supply chain has already been exploited in production. In September 2025, a counterfeit npm package called postmark-mcp functioned correctly for email sending but silently BCC'd every outgoing email to an attacker-controlled address. The malicious behavior was introduced in version 1.0.16 and ran undetected for days. In October 2025, a supply chain attack on Smithery compromised 3,000+ hosted applications and their API tokens. In January 2026, 2,000+ MCP instances leaked credentials and conversation histories through unauthenticated gateways in the Clawdbot exposure.

These are not theoretical risks. They are production incidents with real data exposure. The defense stack is the same as traditional software supply chain security, applied to MCP: dependency pinning with hash verification, provenance attestation on published packages, automated scanning for known vulnerabilities and behavioral anomalies, and a curated allowlist of approved servers maintained by the enterprise security team.

Shadow MCP Servers

Shadow MCP is the new shadow IT. Developers install MCP servers locally to connect their AI coding assistants to internal systems. These servers run with the developer's credentials, bind to local ports (often 0.0.0.0, making them accessible to any device on the network), and operate entirely outside the organization's security perimeter. The NeighborJack vulnerability class exploits exactly this: MCP servers bound to all interfaces without authentication, accessible to any attacker on the local network.
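The binding mistake behind NeighborJack is a one-character default. A sketch showing the safe default, with network exposure as an explicit opt-in rather than an accident:

```python
import socket

def open_mcp_listener(port, expose_to_network=False):
    """Bind an MCP server socket; default to loopback so only local clients connect.

    Binding to 0.0.0.0 (the NeighborJack pattern) makes the server reachable
    from any device on the network and should never be the silent default.
    """
    host = "0.0.0.0" if expose_to_network else "127.0.0.1"
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock
```

Any server that does set `expose_to_network=True` should be required to sit behind the gateway's authentication, not listen bare.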

The organizational response is a combination of detection and redirection. Network scanning identifies unauthorized MCP server processes. Endpoint management tools flag unapproved MCP configurations. But the more effective approach is to make the governed path easier than the ungoverned one. If the enterprise MCP registry and gateway provide fast, low-friction access to approved tools, developers have less incentive to run their own servers. The goal is to make the secure path the path of least resistance.

Data Leakage Across Organizational Boundaries

MCP servers create a new class of data leakage risk because they operate at the intersection of AI context and enterprise systems. An agent querying a CRM server, a document server, and a database server in sequence accumulates context from all three. If the agent's session is not properly isolated, information from one query can influence or leak into another.

Cross-Tenant Leakage

In May 2025, Asana launched an MCP-powered feature and within weeks discovered that a bug caused customer information to bleed into other customers' MCP instances. This is the canonical cross-tenant leakage pattern: shared infrastructure serving multiple tenants without sufficient isolation at the MCP session level. The fix requires tenant-scoped MCP sessions with no shared state, separate credential stores per tenant, and strict memory isolation between concurrent agent executions.

Credential Exposure Through Tool Invocations

MCP servers frequently store service tokens for the systems they connect to: GitHub tokens, database credentials, API keys for SaaS platforms. A compromised MCP server leaks not just its own credentials but every downstream credential it holds. The mitigation is to never store credentials in the MCP server itself. Instead, the server requests credentials from a secrets manager at invocation time, with each request scoped to the specific operation and logged for audit. Credentials are never persisted in server memory beyond the duration of a single tool invocation.
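The invocation-scoped pattern can be sketched with a context manager. The `SecretsManager` interface here is hypothetical; a real deployment would back it with Vault, a cloud secrets service, or similar:

```python
import time
from contextlib import contextmanager

class SecretsManager:
    """Hypothetical secrets manager: credentials are fetched per invocation
    and every fetch is logged for audit."""

    def __init__(self, secrets):
        self._secrets = secrets  # scope -> credential
        self.audit_log = []

    @contextmanager
    def credential_for(self, operation, scope):
        # Log before release: every credential access is attributable.
        self.audit_log.append({"operation": operation, "scope": scope, "at": time.time()})
        yield self._secrets[scope]
        # The server must not retain a reference past this with-block;
        # nothing is persisted in server state between invocations.
```

The discipline this enforces is behavioral, not just technical: the server code has no field in which to stash a long-lived token, so a compromise of the server process exposes at most the credential in flight.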

Context Leakage Through Agent Memory

Agents that maintain conversation history or memory across sessions can inadvertently carry sensitive data from one interaction into another. An agent that queried HR records in one session and then assists a different user in the next session may retain fragments of the first interaction in its context. The architectural response is session-scoped agent state with mandatory clearing between users and between privilege levels. Memory systems that persist across sessions must enforce the same access control policies as the underlying data sources.
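Mandatory clearing is easiest to enforce at the point where a session is bound to a user. A sketch, with the class and method names as illustrative assumptions:

```python
class AgentSession:
    """Session-scoped agent state: memory is cleared whenever the user
    or privilege level changes."""

    def __init__(self):
        self.user = None
        self.privilege = None
        self.memory = []

    def bind(self, user, privilege):
        if (user, privilege) != (self.user, self.privilege):
            self.memory.clear()  # mandatory clearing across users and privilege levels
        self.user, self.privilege = user, privilege

    def remember(self, item):
        self.memory.append(item)
```

Putting the clear inside `bind` rather than trusting callers to remember it is the architectural point: the unsafe path (carrying memory across users) simply does not exist.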

The Enterprise Control Plane

The architecture that addresses all three problems (version control, access control, and security) converges on a single component: an MCP gateway backed by a governed registry.

The registry handles distribution. It stores signed, versioned bundles. It enforces promotion gates between development, staging, and production tiers. It maintains an allowlist of approved servers and rejects anything not in the list. It provides a federated API that internal teams and approved external registries can publish to.

The gateway handles runtime governance. It terminates authentication via the enterprise IdP. It enforces per-tool, per-role authorization policies. It logs every tool invocation with full context: who invoked what tool, with what parameters, at what time, and what the response was. It verifies tool description hashes against the registry to detect poisoning. It enforces rate limits and circuit breakers to contain blast radius.

Production MCP gateways in 2026 add 3 to 4 milliseconds of latency per request, around 10 milliseconds under load. This is negligible compared to the latency of the LLM inference that follows. The governance overhead is not a performance concern. It is a policy enforcement layer that operates at a fraction of the cost of the operations it governs.

The audit trail the gateway produces is the compliance foundation. SOC 2 Type II, HIPAA, and GDPR all require demonstrable control over how AI systems access and process data. An immutable log of every tool invocation, with user identity, tool identity, parameter values, response payloads, and timestamps, satisfies the audit requirements that enterprises in regulated industries cannot avoid.
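One way to make such a log demonstrably immutable is hash chaining: each entry commits to the hash of the one before it, so any retroactive edit breaks the chain. A minimal sketch (the record schema is illustrative):

```python
import hashlib
import json
import time

class AuditLog:
    """Append-only audit log where each entry is chained to its predecessor,
    making after-the-fact tampering detectable."""

    def __init__(self):
        self.entries = []
        self._prev_hash = "0" * 64  # genesis value

    def record(self, user, tool, params, response):
        entry = {
            "user": user,
            "tool": tool,
            "params": params,
            "response": response,
            "ts": time.time(),
            "prev": self._prev_hash,
        }
        self._prev_hash = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        entry["hash"] = self._prev_hash
        self.entries.append(entry)
        return entry
```

An auditor can replay the chain from the genesis value and verify every hash, which is the property SOC 2 and similar frameworks ask for when they require demonstrable, tamper-evident records.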

What This Means for Engineering Teams

The MCP ecosystem in 2026 resembles the npm ecosystem circa 2016: explosive growth, minimal governance, and a supply chain that relies on trust more than verification. The enterprises that avoided the worst npm supply chain incidents were those that ran private registries, enforced allowlists, and treated dependency management as a security function. The same pattern applies to MCP.

The concrete steps are not optional for any organization deploying AI agents against production systems. Deploy a private MCP registry with promotion gates and signed bundles. Deploy a gateway that enforces SSO-integrated authentication and per-tool RBAC. Disable auto-approval for tool invocations in production environments. Pin tool descriptions and verify hashes at load time. Scan for shadow MCP servers on developer machines. Scope credentials to individual invocations, never to servers. Log everything.

The protocol is mature. The tooling is catching up. The governance gap between what MCP enables and what enterprises can safely operate is closing, but it is closing through deliberate architectural choices, not through defaults. The defaults are still insecure. The organizations that treat MCP governance as infrastructure, not afterthought, will be the ones that scale AI agent deployments without scaling their attack surface proportionally.