Ecosystem
January 26, 2026

Amazon Bedrock adds one-hour prompt caching to boost latency and cost efficiency

Amazon Bedrock now supports one-hour prompt caching, allowing developers to reuse context efficiently, reduce inference latency, and lower costs for repetitive or long-running generative AI workloads.

AWS has enhanced Amazon Bedrock by extending prompt caching duration to one hour, a significant upgrade for developers building production-scale generative AI applications.

Prompt caching enables reuse of previously processed context, reducing repeated computation and improving response latency while lowering inference costs. This is particularly valuable for agentic workflows, RAG systems, and conversational applications with stable system prompts.

The update signals AWS’s focus on inference optimization rather than just model access, positioning Bedrock as a more cost-efficient, enterprise-ready platform for scalable Gen AI deployments.

#
Bedrock

Read Our Content

See All Blogs
Gen AI

Anthropic’s Claude Managed Agents platform accelerates AI agent deployment for teams

Deveshi Dabbawala

April 9, 2026
Read more
AI safety

Everything you need to know about Anthropic's Project Glasswing

Deveshi Dabbawala

April 8, 2026
Read more