I was working with Amazon Bedrock to run LLM inference. AWS has its fair share of complexity – VPCs, subnets, security groups, etc.
On the surface, running inference on Amazon Bedrock is straightforward. A simple script might look like this (assuming you have proper environment variables set):
import boto3

# Create a Bedrock Runtime client in the target region
bedrock = boto3.client('bedrock-runtime', region_name='us-east-2')
messages = [{"role": "user", "content": [{"text": "What is 2+2?"}]}]
res = bedrock.converse(modelId="anthropic.claude-3-5-sonnet-20241022-v2:0", messages=messages)
print(res['output']['message']['content'][0]['text'])
We can find this model name in the us-east-2 model catalog (requires sign-in).
Anthropic’s documentation also mentions these model names.
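If you'd rather not hunt through the console, you can also list the available Anthropic model IDs programmatically. This is a minimal sketch using the Bedrock control-plane client; it assumes your credentials allow bedrock:ListFoundationModels:

import boto3

# The control-plane client ('bedrock') is separate from 'bedrock-runtime'
bedrock_ctl = boto3.client('bedrock', region_name='us-east-2')

# List the Anthropic foundation models available in this region
resp = bedrock_ctl.list_foundation_models(byProvider='Anthropic')
for summary in resp['modelSummaries']:
    print(summary['modelId'])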
I was working with SageMaker in a restrictive VPC setup, and when I tried to run the inference above, I got the following error:
An error occurred (ValidationException) when calling the Converse operation: Invocation of model ID anthropic.claude-3-5-sonnet-20241022-v2:0 with on-demand throughput isn't supported
This was confusing because I couldn't tell whether I was hitting an AWS permissions and access issue or a problem with the inference request itself.
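One way to disambiguate is to catch the ClientError and inspect its error code: access problems typically surface as AccessDeniedException, while malformed or unsupported requests come back as ValidationException. A rough sketch (the exact codes you see will depend on your setup):

import boto3
from botocore.exceptions import ClientError

bedrock = boto3.client('bedrock-runtime', region_name='us-east-2')
messages = [{"role": "user", "content": [{"text": "What is 2+2?"}]}]

try:
    res = bedrock.converse(
        modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
        messages=messages,
    )
except ClientError as e:
    code = e.response['Error']['Code']
    if code == 'AccessDeniedException':
        print("Likely an IAM / model access problem:", e)
    elif code == 'ValidationException':
        print("Likely a problem with the request itself:", e)
    else:
        raise
else:
    print(res['output']['message']['content'][0]['text'])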
It turned out that prefixing the model ID with us., i.e. us.anthropic.claude-3-5-sonnet-20241022-v2:0, was all I needed to get things working for my setup. The us. prefix refers to a US cross-region inference profile, which is how Bedrock wants this model to be invoked instead of calling the base model ID with on-demand throughput.
import boto3

# Same call as before, but with the 'us.' inference profile prefix on the model ID
bedrock = boto3.client('bedrock-runtime', region_name='us-east-2')
messages = [{"role": "user", "content": [{"text": "What is 2+2?"}]}]
res = bedrock.converse(modelId="us.anthropic.claude-3-5-sonnet-20241022-v2:0", messages=messages)
print(res['output']['message']['content'][0]['text'])
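If you want to see which inference profile IDs are actually available in your region rather than guessing at the prefix, the control-plane client can list them. A minimal sketch, assuming the bedrock:ListInferenceProfiles permission:

import boto3

bedrock_ctl = boto3.client('bedrock', region_name='us-east-2')

# System-defined profiles are the cross-region ones like 'us.anthropic...'
resp = bedrock_ctl.list_inference_profiles(typeEquals='SYSTEM_DEFINED')
for profile in resp['inferenceProfileSummaries']:
    print(profile['inferenceProfileId'])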
Hopefully, this is helpful if you’re calling Bedrock through a VPC endpoint.