AWS Bedrock

Overview

Amazon Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available via a single API. It allows you to build and scale generative AI applications using FMs from providers like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon.

Core Concepts

Foundation Models (FMs): Up-to-date models for text generation, chat, image generation, and embeddings.
Agents: Manage and automate complex tasks by breaking them down into multiple steps and invoking APIs.
Guardrails: Implement safety and responsible AI policies to filter content.
Knowledge Bases: Connect FMs to your data sources for RAG (Retrieval Augmented Generation).

Real-Time Implementation Example: Generative AI Text Summarizer

Scenario: You want to build a simple tool that summarizes product reviews using the Anthropic Claude model via Bedrock.

1. Shell (AWS CLI)

Directly invoke the model from the command line. Useful for testing and quick scripts.

# 1. List available base models to find the ID
aws bedrock list-foundation-models --by-provider anthropic

# 2. Invoke the model (Claude 3 Sonnet example)
# Save the payload in a json file
echo '{"anthropic_version": "bedrock-2023-05-31", "max_tokens": 1024, "messages": [{"role": "user", "content": "Summarize this review: The product is great but the delivery was late."}]}' > payload.json

# Run the command
aws bedrock-runtime invoke-model \
    --model-id anthropic.claude-3-sonnet-20240229-v1:0 \
    --body file://payload.json \
    --cli-binary-format raw-in-base64-out \
    response.json

# Check output
cat response.json
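
The raw response.json is a JSON document in Claude's Messages format, so the summary text sits inside a list of content blocks. As a minimal sketch (the helper names extract_text and read_summary are illustrative, not part of any AWS tooling, and the sample values are invented), the text can be pulled out like this:

```python
import json


def extract_text(response_body: dict) -> str:
    """Join the text of every text block in a Claude Messages response."""
    blocks = response_body.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


def read_summary(path: str = "response.json") -> str:
    """Load the file written by `aws bedrock-runtime invoke-model` and return its text."""
    with open(path, encoding="utf-8") as f:
        return extract_text(json.load(f))


# Sample shaped like a Claude 3 response; the values are invented for illustration.
sample = {
    "role": "assistant",
    "content": [{"type": "text", "text": "Great product, late delivery."}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 20, "output_tokens": 8},
}
print(extract_text(sample))
```

After running the CLI commands above, read_summary() would return the same text from the real response.json.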

2. Terraform

This section provisions permissions. Bedrock models are serverless, so there is no model infrastructure to create, but an application usually needs an IAM role that allows it to invoke them.

# IAM role that a Lambda function can assume to access Bedrock
# (for an EC2 instance, the trusted service would be ec2.amazonaws.com)
resource "aws_iam_role" "bedrock_access_role" {
  name = "bedrock_access_role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "lambda.amazonaws.com"
        }
      }
    ]
  })
}

# Policy to allow invoking specific models
resource "aws_iam_policy" "bedrock_invoke_policy" {
  name        = "bedrock_invoke_policy"
  description = "Allow invoking Claude models"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream"
        ]
        Resource = "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "attach_bedrock" {
  role       = aws_iam_role.bedrock_access_role.name
  policy_arn = aws_iam_policy.bedrock_invoke_policy.arn
}
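
To review the policy document outside Terraform, or to generate it for several models at once, the same JSON can be built in a few lines of Python. This is only a sketch: the function name bedrock_invoke_policy is hypothetical, and it assumes the foundation-model ARN format used in the Terraform resource above.

```python
import json


def bedrock_invoke_policy(model_ids, region="us-east-1"):
    """Build an IAM policy document allowing invocation of the given models."""
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


policy = bedrock_invoke_policy(["anthropic.claude-3-sonnet-20240229-v1:0"])
print(json.dumps(policy, indent=2))
```

Listing explicit model ARNs, rather than using a wildcard, keeps the role limited to the models the application actually needs.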

3. Python (Boto3)

The most common way to integrate Bedrock into applications.

import boto3
import json

# Initialize the client
bedrock = boto3.client(service_name='bedrock-runtime', region_name='us-east-1')

def summarize_review(review_text):
    # Define the payload for Claude 3
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 500,
        "messages": [
            {
                "role": "user", 
                "content": f"Please provide a one-sentence summary of the following review: {review_text}"
            }
        ]
    })

    modelId = 'anthropic.claude-3-sonnet-20240229-v1:0'
    accept = 'application/json'
    contentType = 'application/json'

    try:
        response = bedrock.invoke_model(
            body=body, 
            modelId=modelId, 
            accept=accept, 
            contentType=contentType
        )

        response_body = json.loads(response.get('body').read())
        # Parse the response based on Claude's output structure
        summary = response_body['content'][0]['text']
        return summary

    except Exception as e:
        print(f"Error invoking Bedrock: {e}")
        return None

# Example Usage
review = "I've been using this coffee machine for a month. While the coffee tastes amazing and it's easy to clean, the water tank is too small and requires constant refilling."
print(f"Summary: {summarize_review(review)}")
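
The IAM policy above also grants bedrock:InvokeModelWithResponseStream, which returns the answer incrementally instead of in one response. The sketch below shows one way to consume that stream for Claude's Messages format. The helper names are illustrative, the event list at the bottom is hand-built for an offline check, and stream_summary expects a bedrock-runtime client like the one created earlier.

```python
import json

MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"


def iter_text_deltas(events):
    """Yield text fragments from an invoke_model_with_response_stream event stream."""
    for event in events:
        chunk = event.get("chunk")
        if not chunk:
            continue  # skip non-chunk events
        payload = json.loads(chunk["bytes"])
        if payload.get("type") == "content_block_delta":
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                yield delta.get("text", "")


def stream_summary(client, review_text, max_tokens=500):
    """Print a summary piece by piece; `client` is a boto3 bedrock-runtime client."""
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [
            {"role": "user", "content": f"Summarize this review in one sentence: {review_text}"}
        ],
    })
    response = client.invoke_model_with_response_stream(modelId=MODEL_ID, body=body)
    for piece in iter_text_deltas(response["body"]):
        print(piece, end="", flush=True)
    print()


def _event(payload):
    """Wrap a payload the way the stream delivers it (hand-built, for offline use)."""
    return {"chunk": {"bytes": json.dumps(payload).encode("utf-8")}}


fake_events = [
    _event({"type": "message_start", "message": {}}),
    _event({"type": "content_block_delta", "index": 0,
            "delta": {"type": "text_delta", "text": "Tasty coffee, "}}),
    _event({"type": "content_block_delta", "index": 0,
            "delta": {"type": "text_delta", "text": "small tank."}}),
    _event({"type": "message_stop"}),
]
streamed = "".join(iter_text_deltas(fake_events))
print(streamed)
```

Keeping the parsing in iter_text_deltas, separate from the network call, makes it possible to check the parsing logic without AWS credentials.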

Detailed Real-Life Scenario: Intelligent Customer Support Agent

Challenge: A retail company is overwhelmed by customer emails asking about order status and return policies.

Solution: Build a RAG (Retrieval Augmented Generation) solution using AWS Bedrock (Knowledge Bases) and Lambda.

1. Ingest: Upload PDF policy documents to S3.
2. Index: Use Bedrock Knowledge Bases to chunk these documents and store embeddings in OpenSearch Serverless.
3. Chat: When a user asks a question via a web UI (hosted on Amplify), the query is sent to Bedrock Agents.
4. Retrieve & Generate: The Agent searches the Knowledge Base for relevant policy info and sends the context + user query to Claude 3.
5. Response: The LLM generates a polite, accurate answer based only on the company policy.
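
The retrieve-and-generate part of this flow can be approximated without a full Agent by calling the Knowledge Bases RetrieveAndGenerate API from a Lambda function. The following is a sketch under assumptions: the knowledge base ID is a placeholder, the helper names are hypothetical, and StubClient exists only so the request shape can be checked offline.

```python
MODEL_ARN = "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"


def build_rag_request(question, knowledge_base_id, model_arn=MODEL_ARN):
    """Build keyword arguments for bedrock-agent-runtime retrieve_and_generate."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


def answer_from_policies(client, question, knowledge_base_id):
    """Ask the knowledge base a question and return the generated answer text."""
    response = client.retrieve_and_generate(**build_rag_request(question, knowledge_base_id))
    return response["output"]["text"]


def make_client(region="us-east-1"):
    """Create the real client; boto3 is imported lazily so the rest runs offline."""
    import boto3
    return boto3.client("bedrock-agent-runtime", region_name=region)


class StubClient:
    """Stand-in for the real client, for offline checks only."""

    def retrieve_and_generate(self, **kwargs):
        self.last_request = kwargs
        return {"output": {"text": "Returns are accepted within 30 days."}, "citations": []}


stub = StubClient()
# "KB_ID_PLACEHOLDER" is a made-up value; use the ID of your own knowledge base.
answer = answer_from_policies(stub, "What is the return policy?", "KB_ID_PLACEHOLDER")
print(answer)
```

In a deployed Lambda, make_client() would replace the stub, and the response's citations list can be used to show which policy passages supported the answer.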

Interview Questions

Conceptual Questions

What is Amazon Bedrock and how does it differ from Amazon SageMaker?
- Amazon Bedrock is a fully managed service for using Foundation Models (FMs) via a simple API. It is designed for application developers who want to consume models without managing infrastructure.
- Amazon SageMaker is a comprehensive platform for data scientists to build, train, and tune their own machine learning models from scratch. Use Bedrock to consume FMs; use SageMaker to build/train custom models.
What are Bedrock Agents?
- Agents allow Bedrock to execute multi-step tasks. They can break down a user request (e.g., "Book a flight and send receipt"), identify the necessary API calls (defined in OpenAPI schemas), and execute them securely.
Explain the concept of RAG (Retrieval-Augmented Generation) in Bedrock.
- RAG is a technique where the model retrieves relevant data from an external knowledge source (like company documents in S3) to augment its answer. In Bedrock, Knowledge Bases fully manage the RAG workflow: ingesting docs, chunking them, creating embeddings, and retrieving relevant chunks to send to the LLM.
What are Guardrails in Bedrock?
- Guardrails provide a safety layer that filters inputs and outputs. You can define denied topics, filter hate speech/toxicity, and redact PII (personally identifiable information) regardless of the underlying model being used.
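
As an illustration of the Guardrails answer above: a guardrail created in the console can be applied to a model call by passing its identifier and version alongside the usual InvokeModel parameters. This is a minimal sketch; the guardrail ID is a made-up placeholder and guarded_invoke_kwargs is a hypothetical helper.

```python
import json


def guarded_invoke_kwargs(prompt, guardrail_id, guardrail_version="DRAFT",
                          model_id="anthropic.claude-3-sonnet-20240229-v1:0"):
    """Build invoke_model keyword arguments that apply a guardrail to the call."""
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 500,
        "messages": [{"role": "user", "content": prompt}],
    })
    return {
        "modelId": model_id,
        "body": body,
        "contentType": "application/json",
        "accept": "application/json",
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": guardrail_version,
    }


# "gr-example-id" is a placeholder, not a real guardrail.
kwargs = guarded_invoke_kwargs("Summarize this review: great product.", "gr-example-id")
print(sorted(kwargs))

# With credentials and a real guardrail, the call would look like:
#   bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
#   response = bedrock.invoke_model(**kwargs)
```

Because the guardrail is attached per request, the same policy can be reused across different underlying models.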

Scenario-Based Questions

You need to build a chatbot that answers questions based strictly on your internal HR PDF documents. You don't want the model to use its general knowledge to hallucinate answers. How do you implement this?
- Use Bedrock Knowledge Bases. Upload the HR PDFs to S3 and sync them to a Knowledge Base (backed by a vector store like OpenSearch Serverless). When querying the model using the RetrieveAndGenerate API, Bedrock will fetch the relevant chunks from the PDF and the model will generate the answer based only on that context.
You are using the Anthropic Claude model via Bedrock, but you are hitting throughput limits (ThrottlingExceptions). How can you increase the throughput?
- By default, Bedrock uses "On-Demand" throughput, which is shared and subject to account-level quotas. To get guaranteed, higher throughput, purchase Provisioned Throughput. This reserves a specific number of "Model Units" for a base model or a custom model, typically with a fixed-term commitment (1 or 6 months).
A developer wants to test a new prompt for a marketing email generator but doesn't want to write code yet. What tool should they use?
- They should use the Bedrock Playgrounds (Text, Image, or Chat) in the AWS Console. It allows them to experiment with different models, inference parameters (temperature, top-p), and prompts interactively.
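
For the throttling question above, a common client-side mitigation for occasional ThrottlingExceptions is retrying with exponential backoff; it complements Provisioned Throughput rather than replacing it. The sketch below is an assumption-laden illustration: call_with_backoff and is_throttled are hypothetical helpers, and FakeThrottle only mimics the response attribute that botocore's ClientError carries.

```python
import random
import time


def is_throttled(exc):
    """Return True if the exception looks like a Bedrock ThrottlingException."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying throttled calls with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not throttling, or the final failure.
            if not is_throttled(exc) or attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


class FakeThrottle(Exception):
    """Offline stand-in mimicking the .response attribute of botocore's ClientError."""
    response = {"Error": {"Code": "ThrottlingException"}}


attempts = []


def flaky():
    """Fail twice with a throttling error, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise FakeThrottle()
    return "ok"


result = call_with_backoff(flaky, sleep=lambda seconds: None)
print(result, len(attempts))
```

In application code, fn would wrap the real call, for example lambda: bedrock.invoke_model(**kwargs).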