
Command A: Max performance, minimal compute – 256k context window

Oras

Cohere always comes across as a niche LLM, not a generic one.

I once used it to force responses in British English, and it worked a lot better than any other model at the time. But that was about it for following the prompt. Their pricing isn't competitive enough for others to jump on, and I suspect that's why it's not widely used.

razemio

I distrust those benchmarks after working with Sonnet for half a year now. Many OpenAI models beat Sonnet on paper. I guess that's because its strengths (agent, visual, caching) aren't being exercised? Otherwise there is no explanation for why it's not consistently on top. I have tried so many times to use other models for various tasks, not only coding. The only area where OpenAI excels is analytical tasks, at a much higher price. For everything else, Sonnet works best for me, with Gemini Flash 2.0 for cost-sensitive and latency-sensitive tasks.

In practice this perception of mine seems to be valid: https://openrouter.ai/models?order=top-weekly

The same goes for this model. It claims to be good at coding, but it seriously isn't compared to Sonnet. Funnily enough, Sonnet isn't one of the models it is tested against.

integralof6y

I just tried the chat and asked the LLM to compute the double integral of 6*y over the interior of a triangle given its vertices. There were many attempts, all incorrect. Then I asked it to write a Python program to solve this, and that was incorrect too. I know math computation is a weak point for LLMs, especially in a chat. One of the programs used a hardcoded number 10 to branch, which suggests the generated program was fitted to give the right result for the test (I had given it the correct result beforehand). So be careful when testing generated programs: they could be fitted to pass your simple tests. Edited: I also asked it to compute the integral of 6*y over the triangle with vertices A(8, 8), B(15, 29), C(10, 12), and it yielded a wrong result of 2341. Then I suggested computing it with the formula for the barycenter of the triangle, that is, 6 * Area * (mean of the y-coordinates), and it returned the correct value of 686.

To summarize: it seems that LLMs are not able to give correct results for simple math problems (here, a double integral over a triangle). So students should not rely on them, since nowadays they cannot perform simple tasks without many errors.
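For anyone who wants to check the 686 figure independently, here is a minimal sketch of the barycenter shortcut described above: the integral of a linear function over a triangle equals the area times the function's value at the centroid, so the integral of 6*y is 6 * Area * (mean of the vertex y-coordinates). The function names `triangle_area` and `integral_6y` are made up for this illustration, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def triangle_area(a, b, c):
    """Area of triangle abc via the shoelace formula (exact arithmetic)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    twice_signed_area = ax * (by - cy) + bx * (cy - ay) + cx * (ay - by)
    return abs(Fraction(twice_signed_area, 2))


def integral_6y(a, b, c):
    """Integral of 6*y over triangle abc: 6 * area * (mean of vertex y-coordinates)."""
    mean_y = Fraction(a[1] + b[1] + c[1], 3)
    return 6 * triangle_area(a, b, c) * mean_y


# Vertices from the comment above.
A, B, C = (8, 8), (15, 29), (10, 12)
print(triangle_area(A, B, C))  # 7
print(integral_6y(A, B, C))    # 686
```

With these vertices the area is 7 and the mean y-coordinate is 49/3, so the result is 6 * 7 * 49/3 = 686, matching the value quoted in the thread.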

HeatrayEnjoyer

>compute the integral of 6*y on the triangle with vertices A(8, 8), B(15, 29), C(10, 12)

o3-mini returned 686 on the first try, without executing any code.

Szpadel

What disqualified this model for me (I mostly use LLMs for coding) was its 12% score in the aider benchmark (https://aider.chat/docs/leaderboards/).

Alifatisk

Cohere API Pricing for Command A

- Input Tokens: $2.50 / 1M

- Output Tokens: $10.00 / 1M

WOW, what makes them this expensive? Are we going against the trend here and raising the prices instead?
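To put those rates in per-request terms, here is a rough sketch of the arithmetic. Only the two prices come from the list above; the function name `request_cost_usd` and the token counts in the example call are illustrative assumptions, not measurements.

```python
from decimal import Decimal

# Prices quoted above, in USD per one million tokens.
INPUT_USD_PER_MILLION = Decimal("2.50")
OUTPUT_USD_PER_MILLION = Decimal("10.00")


def request_cost_usd(input_tokens, output_tokens):
    """Cost of one request in USD at the quoted per-million-token rates."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / Decimal(1_000_000)


# Hypothetical request: a 100k-token prompt with a 2k-token answer.
print(request_cost_usd(100_000, 2_000))
```

For that hypothetical request, the input side costs $0.25 and the output side $0.02, so about 27 cents in total, with long prompts dominating the bill.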
