Every AI agent demo looks great in a notebook. The problem starts the moment you ask the obvious questions. Where does the agent run for eight hours without leaking memory into the next user's session? Who issues the OAuth token when the agent needs to read your Salesforce on your behalf? What stops the agent from using all your CPU when somebody asks it to summarize a 400-page PDF? You can build that scaffolding yourself, and a lot of teams have, badly. Amazon Bedrock AgentCore is AWS trying to ship that scaffolding as a real platform — modular, framework-agnostic, and finally serious about the difference between a hackathon agent and a production one.
This is the version that started shipping in preview during summer 2025 and got a major refresh in April 2026 with the managed harness, filesystem persistence, and a proper CLI. If you tried AgentCore early and bounced off, the 2026 release is where it actually starts feeling like a platform instead of a checklist of microservices.
What AgentCore actually is
The marketing line is that AgentCore is an "agentic platform for building, deploying, and operating effective agents securely at scale." That is technically true and also useless. Here is the engineering version. AgentCore is six independent services — Runtime, Memory, Gateway, Identity, the built-in Code Interpreter and Browser tools, and Observability — that you can adopt one at a time, with whatever agent framework you already like. It does not force you onto a new SDK, a new orchestration model, or a new model provider. Strands Agents, LangGraph, CrewAI, LlamaIndex, Google ADK, OpenAI Agents SDK, Pydantic AI — they all run on top of it unchanged. The opinion baked into AgentCore is not "use our framework." It is "your framework is fine, but the runtime, the auth, and the memory should not be your problem."
That separation matters because every team that built agents before mid-2025 built their own version of these primitives. The migration story for AgentCore is not "rewrite your agent." It is "move your agent into our microVM, plug your tools into our gateway, and stop maintaining the rest."
Six managed services so you don't have to maintain nine microservices of your own. Honest, at least.
Runtime: the part that actually runs your agent
AgentCore Runtime is serverless infrastructure tuned for what agents do, which is fundamentally different from what web requests do. Each user session lands in a dedicated microVM with its own isolated CPU, memory, and filesystem, and the VM is wiped at the end of the session. No noisy neighbors, no shared kernel between tenants, no chance that one user's tool output leaks into another user's context window through a shared cache. Sessions can run for up to eight hours, which is the kind of duration you need when an agent is doing actual work — research synthesis, multi-step refactors, long browser automations — and not just answering a question.
The pricing is consumption-based and reasonably transparent. You pay $0.0895 per vCPU-hour and $0.00945 per GB-hour of memory, billed on active CPU use and peak memory consumption. A small agent that spends most of its time waiting on the model API costs almost nothing in compute, because vCPU billing tracks active CPU. The thing you actually pay for is sustained reasoning loops, large in-process retrieval, and tool calls that hammer the CPU.
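The arithmetic is worth making concrete. A rough sketch of the bill for one session, assuming (as the rates above read) that CPU is billed on active vCPU-seconds and memory on peak GB for the session's wall-clock duration:

```python
# Sketch of AgentCore Runtime session cost from the published rates.
# Assumption: memory is billed as peak GB times session wall-clock hours,
# which is how the pricing described above reads.
VCPU_HOUR = 0.0895   # $ per vCPU-hour of *active* CPU
GB_HOUR = 0.00945    # $ per GB-hour at peak memory

def session_cost(active_cpu_seconds: float, vcpus: float,
                 peak_memory_gb: float, wall_clock_seconds: float) -> float:
    cpu = (active_cpu_seconds / 3600) * vcpus * VCPU_HOUR
    mem = (wall_clock_seconds / 3600) * peak_memory_gb * GB_HOUR
    return cpu + mem

# A chatty agent: one hour of wall clock, but only 90 seconds of active CPU
# on 1 vCPU (the rest is waiting on the model API), 2 GB peak memory.
print(round(session_cost(90, 1, 2, 3600), 4))
```

The point the numbers make: for an I/O-bound agent, memory duration dominates and the whole hour still costs about two cents.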
Two 2026 additions matter here. The first is filesystem persistence — the local session state can be externalized so an agent can suspend mid-task and resume exactly where it left off, including its in-progress files. The second is the managed harness, where you specify a model, a system prompt, and a set of tools, and AgentCore runs the entire agent loop for you: reasoning, tool selection, action execution, response streaming. No orchestration code. The harness is essentially AWS saying you do not need LangGraph for the simple cases, and they are mostly right.
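The loop the harness takes off your hands is small but annoying to own. A minimal sketch of it, with a canned stand-in for the model call (nothing here is AgentCore API, just the shape of the loop: reason, pick a tool, execute, repeat until the model answers directly):

```python
from typing import Callable

def run_harness(call_model: Callable[[list], dict], tools: dict[str, Callable],
                system_prompt: str, user_input: str, max_steps: int = 5) -> str:
    messages = [{"role": "system", "content": system_prompt},
                {"role": "user", "content": user_input}]
    for _ in range(max_steps):
        decision = call_model(messages)            # reasoning step
        if decision["type"] == "final":            # model answered directly
            return decision["content"]
        tool = tools[decision["tool"]]             # tool selection
        result = tool(**decision["args"])          # action execution
        messages.append({"role": "tool", "content": str(result)})
    return "step budget exhausted"

# Stand-in "model" that calls one tool, then answers.
def fake_model(messages):
    if messages[-1]["role"] != "tool":
        return {"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final", "content": f"The sum is {messages[-1]['content']}"}

print(run_harness(fake_model, {"add": lambda a, b: a + b}, "Be brief.", "add 2+3"))
```

Twenty lines here; in production it grows retries, streaming, and budget enforcement, which is exactly the code the harness exists to delete.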
Memory: the thing every framework gets wrong
Most agent frameworks treat memory as a single concept, and then quietly break in production when the chat history is 400KB and the context window is full of last week's noise. AgentCore Memory splits the problem into short-term and long-term, prices them differently, and lets you reason about them as separate things.
Short-term memory is raw events from the current session, billed by event volume. Long-term memory is processed and stored across sessions — episodic memory of what happened, semantic memory of what the user prefers, both queryable through retrieval calls billed per call. The split is not a UX detail. It is what lets you put a per-user memory bound on short-term, then promote distilled facts to long-term without dragging the entire conversation log along for the ride. It is the difference between an agent that remembers the user's name across sessions and an agent that costs $0.40 per turn because every reply replays the entire history.
Yes, you could build this on DynamoDB and a vector store. You probably already have. The question is how many times you want to.
Gateway: turning your APIs into tools without writing wrappers
Every agent codebase has a tools/ directory full of thin wrappers around internal APIs, and every one of those wrappers eventually rots. AgentCore Gateway eats this category. Point it at your existing services and it transforms them into agent-ready tools with minimal code, while applying policy on every invocation. Every time an agent calls a tool through Gateway, the action is checked against your rules to determine whether it is allowed or denied — that is the layer where you stop the helpful junior agent from accidentally deleting the production database.
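The per-invocation check is the part worth internalizing. A toy version, not Gateway's actual policy language: every tool call passes through a rules lookup before it runs, and the default is deny.

```python
from typing import Callable

# Illustrative rules: a predicate per tool name, evaluated on the call's args.
POLICIES = {
    "delete_table": lambda args: args.get("env") != "production",  # never in prod
    "read_record":  lambda args: True,                             # always allowed
}

def invoke_tool(name: str, tool: Callable, args: dict):
    rule = POLICIES.get(name)
    if rule is None or not rule(args):          # unknown tool => denied by default
        raise PermissionError(f"policy denied tool call: {name}")
    return tool(**args)

print(invoke_tool("read_record", lambda record_id: f"record {record_id}",
                  {"record_id": 7}))
try:
    invoke_tool("delete_table", lambda env, table: "gone",
                {"env": "production", "table": "users"})
except PermissionError as e:
    print("blocked:", e)
```

The helpful junior agent from the paragraph above gets stopped at the `PermissionError`, not at the postmortem.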
The pricing model rewards this pattern: tool calls and policy checks are cheap, and you stop maintaining bespoke MCP servers and authentication shims. If you already speak MCP, Gateway speaks it back; AgentCore ships an official MCP server so coding assistants like Claude Code, Cursor, Codex, and Kiro Power can drive your agent infrastructure directly.
Identity: the unsexy service that prevents most disasters
Agentic auth is not the same problem as user auth. The agent is acting on behalf of a specific user, with scoped permissions, possibly across multiple SaaS providers, possibly hours after the user closed their browser tab. AgentCore Identity handles this. It integrates natively with AWS IAM and external providers like Okta, manages OAuth tokens for non-AWS resources, and gives you a clean answer to the question "which human authorized this tool call." When used with Runtime or Gateway, Identity is included at no additional cost; used standalone, it bills per successful OAuth token or API key issued.
The reason this matters is that most "agent gone wrong" stories are not really about model behavior. They are about an agent running with broader permissions than the user behind it, because the team did not have time to build proper delegation. Identity is what removes that excuse.
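What "clean answer" means in practice is a delegation record you can interrogate. A sketch with illustrative field names (this is the concept, not Identity's token format): the grant names the human, carries scopes, and expires.

```python
from dataclasses import dataclass
import time

@dataclass
class DelegatedToken:
    user_id: str          # the human who authorized the agent
    scopes: frozenset     # what the agent may do on their behalf
    expires_at: float     # outlives the browser tab, not forever

    def allows(self, scope: str) -> bool:
        return scope in self.scopes and time.time() < self.expires_at

# Valid for an eight-hour session, scoped to read-only Salesforce access.
token = DelegatedToken("ada@example.com", frozenset({"salesforce:read"}),
                       time.time() + 8 * 3600)

print(token.allows("salesforce:read"))    # acting within the grant
print(token.allows("salesforce:write"))   # broader than the user authorized
```

Every tool call that checks `allows()` is a tool call with a named human behind it, which is the answer to "which human authorized this."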
The good news: your agent can finally act on behalf of a user. The bad news: your agent can finally act on behalf of a user.
Code Interpreter, Browser, Observability
Code Interpreter gives the agent a sandboxed Python execution environment, isolated enough to safely run model-generated code. Browser is a managed headless browsing service for web research, the kind of thing every team rebuilds with Playwright and a bag of duct tape. Both are billed per session and remove a category of "we need somebody who knows containers" work that has nothing to do with the agent's actual job.
Observability is built on OpenTelemetry and ships traces, metrics, and reasoning steps directly to Amazon CloudWatch. Every tool call, memory retrieval, and reasoning hop is captured in a structured trace. This is the part that makes the difference between debugging your production agent and shrugging at it. If you have ever tried to figure out why an agent took a wrong turn three steps ago, you understand why this exists.
How you actually deploy one
The 2026 release adds the AgentCore CLI, which is the ergonomic part the launch was missing. You install it via npm, run a create wizard, pick your framework — Strands, LangGraph, CrewAI, OpenAI Agents SDK, whatever — and choose Python or TypeScript. The CLI scaffolds a project, gives you agentcore dev to run locally with hot reload, and agentcore deploy to push the agent into AgentCore Runtime as managed infrastructure. The same agent that ran on your laptop runs unchanged in production, which is the whole pitch.
Code-wise, the wrapper is a few lines: instantiate a BedrockAgentCoreApp, define an entrypoint that takes a request and yields events from your agent loop, and call app.run(). Whatever you put inside the entrypoint — a Strands graph, a LangGraph state machine, a CrewAI crew — runs on top of the harness. You did not have to choose AgentCore's flavor of agent. You just chose where it runs.
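The shape is easy to show with a stand-in class. In a real project the app comes from the AgentCore SDK as BedrockAgentCoreApp; this toy version only mimics the pattern described above (register an entrypoint, yield events, call run), not the exact API surface.

```python
class AgentApp:  # stand-in for the SDK's BedrockAgentCoreApp
    def __init__(self):
        self._entrypoint = None

    def entrypoint(self, fn):   # decorator registering the request handler
        self._entrypoint = fn
        return fn

    def run(self, request: dict):
        # The real run() starts serving requests; here we just drain the
        # event stream for one request so the shape is visible.
        return list(self._entrypoint(request))

app = AgentApp()

@app.entrypoint
def invoke(request: dict):
    # Whatever lives here — a Strands graph, a LangGraph state machine,
    # a CrewAI crew — is your choice; the wrapper only consumes events.
    yield {"event": "thinking"}
    yield {"event": "final", "text": f"echo: {request['prompt']}"}

print(app.run({"prompt": "hello"}))
```

Everything framework-specific stays inside `invoke`; the wrapper is the only AgentCore-shaped code in the file.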
The other notable 2026 addition is AgentCore Skills, which are pre-built coding-assistant guides that let tools like Claude Code, Cursor, Codex, and Kiro Power generate AgentCore-native code accurately. This sounds like a marketing item until you watch a coding agent burn an hour hallucinating an API surface that does not exist, at which point first-party skills become very valuable.
Where AgentCore fits, and where it does not
AgentCore is the right choice when you want production primitives without rebuilding them, when your agent needs long-running sessions or strict tenant isolation, and when you already have an AWS footprint you want to reuse for IAM, CloudWatch, and networking. It is overkill when your "agent" is a single-shot prompt with one tool call — at that scale you should be using Bedrock Agents or even just plain InvokeModel.
The harder question is what AgentCore is competing with. Inside AWS, it is replacing the older Bedrock Agents abstraction for any non-trivial use case. Outside AWS, it is competing with the same patchwork of LangGraph plus Postgres plus Redis plus your own auth layer that everyone already runs. The argument is not that AgentCore is more powerful than that stack. It is that AgentCore is the same stack, billed by the hour, with one team — not yours — paying the on-call cost.
If your roadmap has the words "agent in production" on it for 2026, you are going to build most of these primitives anyway. The interesting question is whether you build them yourself, again, or rent them.