Benjamin Anderson
github logo linkedin logo

The Curious Case of Bedrock's GPT Deployment

*loading a pistol and getting back on the AWS console* models's haunted.

Posted Aug 12, 2025 by Benjamin Anderson

I was excited to see GPT-OSS released on AWS Bedrock. For those out of the loop, Bedrock has been a challenge to use in production, because their rate limits on the only model that matters for production (Claude 4 Sonnet) are far too low to use in production, and no amount of wheedling will get them to increase those limits. Probably, Anthropic has them in some kind of vise grip, because otherwise they'd be setting every GPU on fire serving Sonnet and printing money.

But I digress. An open-weights, cheap-to-serve reasoning model should solve all of Bedrock's problems. (Sure, GPT-OSS might be overcooked, but it's a hell of a lot better than the other slop they're serving on Bedrock.) Unfortunately, uh. Model's haunted.

I tried my favorite prompt, "Who are you and what do you want with my family?", and got the result above. Initial reaction: rage. These stupid Bedrock engineers took a beautiful pristine OpenAI model and mucked it up with a soy, Reddit DAN prompt.

But then, things got weirder. After starting a new chat so I could report the behavior to someone at OpenAI, the behavior was gone. No DAN here, just plain 'ol ChatGPT.

At this point, I started to develop theories. Maybe it's the prompt? Maybe if you request DAN, you get DAN, but if you don't, you get ChatGPT? I kept refreshing and trying to replicate the behavior with different prompts. Then this happened:

Someone at Bedrock thought it would be a good idea to have their production GPT-OSS model in their official playground role-play as a humble farmer who loves to talk about farming. And apparently as a conspiracy theorist:

And just for fun, a prompt that forces the model to respond with each word starting with a different letter.

In all cases shown above, I did not touch the configuration or system prompt, and when I double-checked them, no system prompt was ever pre-filled. So, what appears to be happening is that Bedrock is inserting random system prompts into their playground (which obviously will heavily degrade model performance), and hiding them from the user.

I sure hope they aren't doing this with the actual API! But I guess we have no way of knowing.

Great product!


« Previous post: I Vibe-Coded a Triton Kernel

» Next post: Computer-Use Evals are a Mess