Anthropic has very tight limits, so you're basically using the worst (pricing-wise) SOTA cloud model as your baseline. I have $200 subs for both Claude and OpenAI, and I bump into limits with Claude all the time, whether for coding or research. With Codex, I've run into the limit only once so far, and that's after a month of very heavy use (sometimes literally around the clock, leaving long-running tasks overnight).
I bought Gemini Ultra to try for a month (at the discounted price). I've been using it non-stop for Opus 4.6 Thinking, which is much better than Gemini 3 Pro (High), and it's been a blast. The most I've managed to consume is 60% of my 5-hour quota, and that was with 2-3 instances running in parallel.
I hope too many of us don't do this and cause Google to add limits! My hope is that Google sees the benefit and goes all in - keeps letting people choose which Google-hosted model to use, including their own.
Not OP, but I'm pretty sure they're using Opencode with a certain antigravity plugin. Not going to link it, since it technically enables breaking the TOS. If you're not using Opencode yet, I wholeheartedly recommend the switch.
Getting CC to work with other models is quite straightforward -- set a few env vars (pointing the client at your own endpoint) and run a thin proxy that rewrites the requests/responses into the format each side expects.
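A minimal sketch of the translation such a proxy performs, assuming the client speaks the Anthropic Messages format and the backend speaks an OpenAI-style chat-completions format. The field names covered here are illustrative, not exhaustive (real proxies also handle streaming, tool calls, etc.), and the model name is a placeholder:

```python
def anthropic_to_openai(req: dict, model: str) -> dict:
    """Rewrite an Anthropic-style Messages request into an OpenAI-style chat request."""
    messages = []
    # Anthropic carries the system prompt as a top-level field,
    # OpenAI-style APIs expect it as the first message.
    if "system" in req:
        messages.append({"role": "system", "content": req["system"]})
    messages.extend(req.get("messages", []))
    return {
        "model": model,  # whichever backend model the proxy targets
        "messages": messages,
        "max_tokens": req.get("max_tokens", 1024),
        "temperature": req.get("temperature", 1.0),
    }


def openai_to_anthropic(resp: dict) -> dict:
    """Rewrite an OpenAI-style chat response back into Anthropic's shape."""
    choice = resp["choices"][0]
    finish = choice.get("finish_reason")
    return {
        "type": "message",
        "role": "assistant",
        # Anthropic responses use a list of typed content blocks
        "content": [{"type": "text", "text": choice["message"]["content"]}],
        "stop_reason": "end_turn" if finish == "stop" else finish,
    }
```

The env-var side is then just pointing the client's base URL at the proxy (e.g. `ANTHROPIC_BASE_URL=http://localhost:8080`), which listens, translates with functions like these, forwards to the real backend, and translates the reply back.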