One huge problem with these "cheap" models is that they happen to be more expensive in the typical agent workflow if the provider does not support caching.
Input and output costs are peanuts compared to the order-of-magnitude (or more) larger volume of tokens that could hit the cache.
At that point you might as well use GPT-5. It will be the same price or cheaper, and more capable.
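To see why caching dominates the bill in an agent loop, here is a back-of-the-envelope sketch. The prices, discount, and hit rate below are hypothetical placeholders, not any provider's published rates; the point is only the shape of the arithmetic: an agent resends the growing conversation every turn, so most input tokens are repeats, and without a cache discount every repeat is billed at full price.

```python
# Hypothetical numbers for illustration -- not real published pricing.
def effective_input_cost(price_per_mtok, cached_discount, cache_hit_rate, tokens):
    """Total input cost when a fraction of tokens are served from cache."""
    cached = tokens * cache_hit_rate
    uncached = tokens - cached
    return (uncached * price_per_mtok
            + cached * price_per_mtok * cached_discount) / 1e6

# 10M input tokens over an agent session, $0.50 per 1M tokens (made up).
no_cache = effective_input_cost(0.50, 1.0, 0.0, 10_000_000)
# Same session with a 90% cache hit rate and cache hits at 1/10 price.
with_cache = effective_input_cost(0.50, 0.1, 0.9, 10_000_000)

print(no_cache)    # $5.00
print(with_cache)  # $0.95
```

Under these made-up numbers the cached run costs roughly a fifth of the uncached one, which is why a nominally "cheap" model without caching can end up pricier than a more expensive model that discounts cache hits.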
> One huge problem with these "cheap" models is that they happen to be more expensive in the typical agent workflow if the provider does not support caching.
DeepSeek supports caching and cache hits are a tenth of the cost.
First you complained about lack of caching. When you were informed that the model supports caching, instead of admitting your error you switched to an unrelated complaint. I hope that you do not use similar strategies for discussion in your personal and work life.
Caching is not a function of the model but of the provider: any model's prompts can be cached, and the provider serving the model decides whether to cache them. OpenRouter is not a provider but a middleman between providers, so some of its providers for DeepSeek might support caching and some might not; if you route to just any of them, you might run into the issue. Likewise, some of its providers might use your data for training and some might not. You have to look at the provider list, and you can cherry-pick ones that won't train on your data and that also provide caching.
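Picking providers like that can be expressed in the request itself. The sketch below builds an OpenRouter-style chat payload with provider-routing preferences; the field names (`provider`, `data_collection`, `order`, `allow_fallbacks`) follow my understanding of OpenRouter's routing options and should be treated as assumptions to verify against the current docs.

```python
import json

# Sketch of an OpenRouter chat request that constrains which upstream
# providers may serve the call. Field names are assumptions based on
# OpenRouter's provider-routing options; check the docs before relying
# on them.
payload = {
    "model": "deepseek/deepseek-chat",
    "messages": [{"role": "user", "content": "hello"}],
    "provider": {
        "data_collection": "deny",   # skip providers that train on your prompts
        "order": ["DeepSeek"],       # prefer a provider known to cache
        "allow_fallbacks": False,    # fail rather than route to anyone else
    },
}

# This JSON body would be POSTed to the chat completions endpoint.
body = json.dumps(payload)
```

The point is that with a middleman you don't get the first-party provider's caching or privacy behavior by default; you have to opt into providers that offer it.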