Amazon Bedrock in Production: from Proof-of-Concept to Scalable AI Workloads
08 April 2026 - 9 min. read
Alessio Gandini
Cloud-native Development Line Manager

Generative AI, Machine Learning, Large Language Models, Foundation Models.
If you work in the Cloud, these terms are buzzing everywhere: from tech conferences to corporate meetings, and in every newsletter landing in your inbox. But how many times, after the initial enthusiasm, have you faced the most uncomfortable question: "Okay, but how do we bring this to production?"
That is exactly where many generative AI projects stall. Between choosing models, configuring infrastructure, optimizing costs, and meeting security constraints, the gap between proof-of-concept and production can widen dramatically and block the project's go-live.
Amazon Bedrock was created precisely to solve these problems. It is not just another service promising miracles, but a serverless platform that makes the most advanced Foundation Models (FM) accessible through managed APIs, allowing you to focus on business value rather than infrastructure. In this guide, we will show you how to use Amazon Bedrock in real-world scenarios, from deployment to cost optimization, covering architectural patterns like RAG and Agents. Because doing a "hello world" with ChatGPT is one thing, but building scalable and efficient production-ready systems is another.
Amazon Bedrock is a fully managed service that offers access to Foundation Models from several leading AI companies through a single API. Think of it as an AI model marketplace where you can choose the one best suited for your use-case without worrying about training, hosting, or scaling.
Bedrock provides both proprietary and open-source models from leading AI companies, including Anthropic, Amazon, Meta, Mistral AI, Cohere, Stability AI, and AI21 Labs.
The variety is not random: every model has specific characteristics in terms of performance, cost, capabilities, and speed. Choosing the right one makes the difference between a sustainable project and an AWS bill spiraling out of control.
"But can't I just use OpenAI's APIs directly or host an open-source model on EC2?" we are often asked. Sure you can. But it is important to consider these aspects:
But be careful: as always in the cloud, "serverless" does not mean it does everything by itself; it is crucial to architect intelligently to optimize costs and performance.
Let's understand how Bedrock works under the hood.
Bedrock offers mainly two usage modes:
- On-Demand: you pay per token processed, with no capacity commitment.
- Provisioned Throughput: you reserve dedicated model capacity for a fixed hourly fee, in exchange for predictable performance and lower unit costs at high, steady volumes.
The choice between the two depends, as usual, on your use-case. If, for example, you have a chatbot with unpredictable spikes, on-demand is the right choice. If you process thousands of documents per day with a constant volume, provisioned throughput can save you up to 50%.
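To make that trade-off concrete, here is a minimal break-even sketch. All prices below are placeholder assumptions for illustration, not real Bedrock rates: always check the official pricing page for your model and region.

```python
# Break-even sketch: on-demand vs. provisioned throughput.
# Prices are HYPOTHETICAL placeholders, not real Bedrock pricing.

def monthly_on_demand_cost(tokens_in: int, tokens_out: int,
                           price_in_per_1k: float,
                           price_out_per_1k: float) -> float:
    """Cost of a month of on-demand usage for the given token volumes."""
    return (tokens_in / 1000 * price_in_per_1k
            + tokens_out / 1000 * price_out_per_1k)

def cheaper_mode(tokens_in: int, tokens_out: int,
                 price_in_per_1k: float, price_out_per_1k: float,
                 provisioned_monthly: float) -> str:
    """Return which mode is cheaper for a steady monthly volume."""
    on_demand = monthly_on_demand_cost(tokens_in, tokens_out,
                                       price_in_per_1k, price_out_per_1k)
    return "provisioned" if provisioned_monthly < on_demand else "on-demand"

# Example: 500M input / 100M output tokens per month at placeholder rates.
print(cheaper_mode(500_000_000, 100_000_000, 0.003, 0.015, 2500.0))
```

Running this kind of projection before committing to provisioned throughput avoids paying for capacity you never use.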
A typical Bedrock implementation includes:
- Foundation model invocation through the unified Bedrock API
- Knowledge Bases: managed RAG over your own data
- Agents: orchestration of multi-step tasks and tool calls
- Guardrails: content and sensitive-data filtering
- Custom models: fine-tuning base models (when the off-the-shelf models are not enough)
Theory is done. Now let's see how to bring Bedrock to production using Infrastructure as Code.
Before starting, ensure you have:
- An AWS account with access to Amazon Bedrock in your target region
- Model access enabled from the Bedrock console (each model must be explicitly requested)
- Credentials configured for the AWS CLI and your IaC tool of choice
Important: Not all models are available in all regions. Always check regional availability before designing the architecture.
Once deployed, we can use our foundation model for our workloads; a typical example is RAG (Retrieval-Augmented Generation).
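The core of a RAG flow can be sketched in a few lines: retrieve relevant passages (the vector search step is assumed to happen elsewhere), assemble them into a grounded prompt, and call the model. The model ID follows the naming used earlier in this article; the function names are our own.

```python
def build_rag_prompt(question: str, chunks: list[str]) -> str:
    """Assemble retrieved passages and the question into a grounded prompt."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return ("Answer using only the context below, citing passage numbers.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")

def answer_with_rag(question: str, chunks: list[str],
                    model_id: str = "anthropic.claude-4-haiku-v1:0") -> str:
    """Send the grounded prompt to Bedrock via the Converse API."""
    import boto3  # lazy import: only needed for the actual API call
    client = boto3.client("bedrock-runtime")
    resp = client.converse(
        modelId=model_id,
        messages=[{"role": "user",
                   "content": [{"text": build_rag_prompt(question, chunks)}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]
```

The Converse API is model-agnostic, so swapping models later only means changing the `model_id` string.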
You might be dealing with an existing application that doesn't use native Amazon Bedrock APIs. It's not a problem at all: during re:Invent 2025, Project Mantle was announced. It is a distributed inference engine designed to offer OpenAI-compatible API endpoints. The idea is extremely simple: it acts as a drop-in replacement, allowing developers to migrate existing applications (built using the OpenAI API set) to Bedrock, simply by changing the API endpoint and generating a new key from the AWS console. This way, you can use the models present on Bedrock without modifying the application code, reducing porting time for applications to zero.
The official Project Mantle documentation page has everything you need to get started.
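In practice, the switch boils down to two values. The endpoint URL and key below are illustrative assumptions (the real ones are generated from the AWS console):

```python
# ILLUSTRATIVE values: the real endpoint URL and API key are generated
# from the AWS console when enabling Project Mantle.
MANTLE_ENDPOINT = "https://bedrock-mantle.eu-west-1.amazonaws.com/v1"
MANTLE_API_KEY = "YOUR_MANTLE_API_KEY"

def openai_compatible_config(endpoint: str, api_key: str,
                             model_id: str) -> dict:
    """The only pieces that change versus a stock OpenAI setup."""
    return {"base_url": endpoint, "api_key": api_key, "model": model_id}

cfg = openai_compatible_config(MANTLE_ENDPOINT, MANTLE_API_KEY,
                               "anthropic.claude-4-haiku-v1:0")

# With the official OpenAI SDK, migrating is then a two-line change:
#   client = OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"])
#   client.chat.completions.create(model=cfg["model"], messages=[...])
```

Everything downstream of the client object stays untouched, which is the whole point of a drop-in replacement.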
Let's talk about money. Because a chatbot that costs €10,000 a month to serve 1,000 users is not sustainable.
Bedrock uses a pay-per-use model based on:
- Input tokens: the text you send to the model (prompt plus context)
- Output tokens: the text the model generates, usually billed at a higher rate
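A tiny estimator makes the token-based billing tangible. The per-1K prices below are placeholder assumptions, not real Bedrock rates:

```python
# Per-1K-token prices are HYPOTHETICAL placeholders -- check the official
# Amazon Bedrock pricing page before relying on any figure.
PRICES_PER_1K = {
    "anthropic.claude-4-haiku-v1:0":  {"in": 0.00025, "out": 0.00125},
    "anthropic.claude-4-sonnet-v1:0": {"in": 0.003,   "out": 0.015},
}

def estimate_cost(model_id: str, tokens_in: int, tokens_out: int) -> float:
    """Estimated cost in dollars for a single request."""
    p = PRICES_PER_1K[model_id]
    return tokens_in / 1000 * p["in"] + tokens_out / 1000 * p["out"]

print(round(estimate_cost("anthropic.claude-4-sonnet-v1:0", 2000, 500), 4))
```

Multiply a per-request estimate like this by your expected daily traffic and the sustainability question answers itself quickly.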
Pricing varies significantly from model to model and from region to region; always check the official Amazon Bedrock pricing page for up-to-date figures. Two practices keep the bill under control.
1. Smart model selection: route every task to the cheapest model that can handle it.
```python
# model_selector.py
def select_model_for_task(task_type, complexity, context_length):
    """Pick the cheapest model that can handle the given task."""
    if task_type in ('classification', 'extraction'):
        return 'anthropic.claude-4-haiku-v1:0'
    elif task_type == 'summarization':
        if context_length < 10000:
            return 'anthropic.claude-4-haiku-v1:0'
        else:
            return 'anthropic.claude-4-sonnet-v1:0'
    elif task_type == 'reasoning' or complexity == 'high':
        if context_length > 50000:
            return 'anthropic.claude-4-opus-v1:0'
        else:
            return 'anthropic.claude-4-sonnet-v1:0'
    elif task_type == 'code_generation':
        return 'anthropic.claude-4-sonnet-v1:0'
    else:
        return 'anthropic.claude-4-haiku-v1:0'
```
2. Monitoring and cost alerts: implement monitoring from day one.
```python
# cost_monitoring.py
import boto3
from datetime import datetime, timezone

cloudwatch = boto3.client('cloudwatch')

def put_cost_metric(model_id, tokens_used, cost):
    """Publish token usage and estimated cost metrics to CloudWatch."""
    cloudwatch.put_metric_data(
        Namespace='Bedrock/Usage',
        MetricData=[
            {
                'MetricName': 'TokensUsed',
                'Value': tokens_used,
                'Unit': 'Count',
                'Timestamp': datetime.now(timezone.utc),
                'Dimensions': [
                    {'Name': 'ModelId', 'Value': model_id}
                ]
            },
            {
                'MetricName': 'EstimatedCost',
                'Value': cost,
                'Unit': 'None',
                'Timestamp': datetime.now(timezone.utc),
                'Dimensions': [
                    {'Name': 'ModelId', 'Value': model_id}
                ]
            }
        ]
    )
```
At this point, with this metric, you can create a CloudWatch alarm on token usage to detect anomalies and avoid unexpected costs.
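A sketch of what that alarm could look like, built on the `Bedrock/Usage` namespace and `TokensUsed` metric published above. The SNS topic ARN and threshold are illustrative; the builder function is our own naming, and the resulting dict is passed to CloudWatch's real `put_metric_alarm` API:

```python
def token_alarm_params(model_id: str, threshold: int,
                       sns_topic_arn: str) -> dict:
    """Build put_metric_alarm kwargs for the TokensUsed metric.

    Fires when hourly token usage for a model exceeds the threshold.
    The SNS topic ARN is an assumption: point it at your own alerts topic.
    """
    return {
        "AlarmName": f"bedrock-tokens-{model_id}",
        "Namespace": "Bedrock/Usage",
        "MetricName": "TokensUsed",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Sum",
        "Period": 3600,               # evaluate hourly sums
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }

# import boto3
# cloudwatch = boto3.client("cloudwatch")
# cloudwatch.put_metric_alarm(**token_alarm_params(
#     "anthropic.claude-4-haiku-v1:0", 1_000_000,
#     "arn:aws:sns:eu-west-1:123456789012:bedrock-alerts"))
```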
Bedrock handles potentially sensitive information. Security is not optional.
Bedrock Guardrails allows you to automatically filter problematic content, both in user inputs and model responses, regardless of the model used.
It works across six policy categories: content filters, denied topics, word filters, sensitive information filters, contextual grounding check, and Automated Reasoning checks. Each one is independently configurable, so you can build exactly the level of protection you need.
Content filters cover six predefined categories of harmful content: Hate, Insults, Sexual, Violence, Misconduct, and Prompt Attack. For each one you can adjust the filter strength, and it's not a binary choice: you can tune it based on the application's context.
For sensitive data, you can choose from a predefined list of PII types or define custom patterns using regular expressions. Particularly useful in regulated industries like finance or healthcare.
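As a sketch, here is what a guardrail configuration combining these policies could look like. The field names follow Bedrock's `create_guardrail` API, but verify them against the current documentation; the regex (an Italian fiscal code) and messages are illustrative choices of ours:

```python
def guardrail_config(name: str) -> dict:
    """Sketch of a create_guardrail payload: block prompt attacks,
    anonymize emails, and block a custom regex pattern."""
    return {
        "name": name,
        "contentPolicyConfig": {
            "filtersConfig": [
                # For prompt-attack filters the output strength must be NONE
                {"type": "PROMPT_ATTACK",
                 "inputStrength": "HIGH", "outputStrength": "NONE"},
            ]
        },
        "sensitiveInformationPolicyConfig": {
            "piiEntitiesConfig": [{"type": "EMAIL", "action": "ANONYMIZE"}],
            "regexesConfig": [{
                "name": "fiscal-code",  # illustrative custom pattern
                "pattern": r"[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]",
                "action": "BLOCK",
            }],
        },
        "blockedInputMessaging": "Sorry, I can't process that request.",
        "blockedOutputsMessaging": "Sorry, I can't share that content.",
    }

# import boto3
# bedrock = boto3.client("bedrock")
# bedrock.create_guardrail(**guardrail_config("prod-chat-guardrail"))
```

Because guardrails are applied at the platform level, the same configuration protects every model behind it, regardless of provider.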
We have covered a lot of ground: infrastructure-as-code deployment, RAG for knowledge retrieval, Agents for task automation, cost optimization, and security best practices.
But here is the truth: bringing Foundation Models to production is not a technological project, it is a business project. Technology is the means, not the end.
Before writing the first line of code, ask yourselves: which business problem are we solving? How will we measure success? What cost per user interaction is sustainable?
Amazon Bedrock makes the technology accessible. But building successful AI systems still requires planning, intelligent architecture, and continuous optimization. The good news? Now you have the tools to do it.
In the next articles, we will delve into specific use cases: how to build a multi-language RAG system, how to optimize an Agent to reduce hallucinations, how to implement A/B testing between different models.
Stay tuned on #Proud2beCloud!