Ecosystem
January 26, 2026

Amazon Bedrock adds one-hour prompt caching to boost latency and cost efficiency

Amazon Bedrock now supports one-hour prompt caching, allowing developers to reuse context efficiently, reduce inference latency, and lower costs for repetitive or long-running generative AI workloads.

AWS has enhanced Amazon Bedrock by extending prompt caching duration to one hour, a significant upgrade for developers building production-scale generative AI applications.

Prompt caching enables reuse of previously processed context, reducing repeated computation and improving response latency while lowering inference costs. This is particularly valuable for agentic workflows, RAG systems, and conversational applications with stable system prompts.
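As a minimal sketch of how this looks in practice, the snippet below assembles a Bedrock Converse API request that marks a long, stable system prompt as cacheable via a `cachePoint` block, so everything before that boundary can be reused across calls. The model ID and prompt text are placeholders, and how the extended one-hour cache duration is selected is not shown here; consult the Bedrock documentation for the exact cache-TTL configuration.

```python
# Sketch: a Converse API request with a prompt-cache boundary.
# Everything in `system` before the cachePoint block is the stable,
# reusable context; only the user message varies between calls.

LONG_SYSTEM_PROMPT = "You are a support agent. " + "Policy details... " * 50

def build_converse_request(user_text: str) -> dict:
    """Assemble a Converse request whose system prompt is cacheable."""
    return {
        "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",  # placeholder
        "system": [
            {"text": LONG_SYSTEM_PROMPT},
            {"cachePoint": {"type": "default"}},  # cache boundary marker
        ],
        "messages": [
            {"role": "user", "content": [{"text": user_text}]},
        ],
    }

request = build_converse_request("Where is my order?")
# With a boto3 bedrock-runtime client this would be sent as:
#   bedrock_runtime.converse(**request)
```

Because the cached prefix is billed and processed only when it changes, agents and RAG pipelines that re-send the same system prompt on every turn benefit the most from the longer one-hour window.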

The update signals AWS’s focus on inference optimization rather than just model access, positioning Bedrock as a more cost-efficient, enterprise-ready platform for scalable generative AI deployments.
