Poker Tournament for LLMs

miggol

I wonder if these will get better over time. Fun idea and I kind of want to join a table.

For now at least, some can't even determine which hand they have:

> LLAMA bets $170 on Flop
>
> "We have top pair with Tc4d on a flop of 2s Ts Jh. The board is relatively dry, and we have a decent chance of having the best hand. We're betting $170.00 to build the pot and protect our hand."

(That's not top pair)

alexjurkiewicz

It doesn't seem like the design of this experiment allows AIs to evolve novel strategy over time. I wonder if poker-as-text is similar to maths -- LLMs are unable to reason about the underlying reality.

unkulunkulu

You mean that they don’t have access to their opponents' full behavior?

It would be hilarious to allow table talk and see them trying to bluff and sway each other :D

camillomiller

As a Texas Hold'em enthusiast, some of the hands are moronic. Just checked one where Grok wins with A3s because Gemini folds K10 with an Ace and a King on the board, without Grok betting anything. Gemini just folds instead of checking. It's not even GTO, it's just pure hallucination. Meaning: I wouldn't read anything into the fact that Grok leads. These machines are not made to play games like online poker deterministically and would be CRUSHED in GTO. It would be more interesting instead to understand if they could play exploitatively.

prodigycorp

  > Gemini folds K10 with an Ace and a King on the board, without Grok betting anything. Gemini just folds instead of checking.

It's well known that Gemini has low coding self-esteem. It's hilarious to see it applies to poker as well.

jpfromlondon

it's probably trained off my repos then

hadeson

From my experience, their hallucination when playing poker mostly comes from a wrong reading of their hand strength in the current state. E.g., thinking they have the nuts when they are actually on a nut draw. They would reason a lot better if you explicitly give out their hand strength in the prompt.
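To make "explicitly give out their hand strength" concrete, here is a rough sketch of the kind of precomputation I mean. It is not what this tournament does: `describe_pair` and `hand_strength_line` are names I made up, and it only labels pair strength on an unpaired board (no straights, flushes, or draws).

```python
RANKS = "23456789TJQKA"  # index in this string = rank strength


def rank_value(card: str) -> int:
    """Return the numeric rank of a card written like 'Tc' or '4d'."""
    return RANKS.index(card[0])


def describe_pair(hole: list[str], board: list[str]) -> str:
    """Label pair strength for two hole cards on an unpaired board.

    Simplification: only pairs are considered, and a paired board is
    rejected because trips/full houses are out of scope for this sketch.
    """
    board_values = [rank_value(c) for c in board]
    if len(set(board_values)) != len(board_values):
        raise ValueError("paired boards are not handled in this sketch")
    board_ranks = sorted(board_values, reverse=True)  # highest first
    hole_ranks = [rank_value(c) for c in hole]

    if hole_ranks[0] == hole_ranks[1]:  # pocket pair
        if hole_ranks[0] in board_ranks:
            return "set"
        if hole_ranks[0] > board_ranks[0]:
            return "overpair"
        return "pocket pair below top card"

    paired = [r for r in hole_ranks if r in board_ranks]
    if not paired:
        return "no pair"
    if len(paired) == 2:
        return "two pair"
    position = board_ranks.index(paired[0])
    if position == 0:
        return "top pair"
    if position == len(board_ranks) - 1:
        return "bottom pair"
    return "middle pair"


def hand_strength_line(hole: list[str], board: list[str]) -> str:
    """Build a line that could be appended to the model's prompt."""
    return f"Your current hand strength: {describe_pair(hole, board)}."


print(hand_strength_line(["Tc", "4d"], ["2s", "Ts", "Jh"]))
```

For the LLAMA hand quoted upthread (Tc4d on 2s Ts Jh) this labels the hand as middle pair, since the jack is the top card on the board.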

energy123

> These machines are not made to play games like online poker deterministically

I thought you're supposed to sample from a distribution of decisions to avoid exploitation?
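What I mean by sampling, as a minimal sketch: the strategy is a probability distribution over actions, and the randomness comes from an external RNG rather than from the model. The frequencies below are made up for illustration, and `sample_action` is a hypothetical helper, not something from this tournament.

```python
import random


def sample_action(strategy: dict[str, float], rng: random.Random) -> str:
    """Draw one action according to the given mixed-strategy weights."""
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical mixed strategy for one spot: bet 70% of the time, check 30%.
strategy = {"bet": 0.7, "check": 0.3}
rng = random.Random(0)  # seeded only so the run is repeatable

samples = [sample_action(strategy, rng) for _ in range(10_000)]
print(samples.count("bet") / len(samples))  # close to 0.7
```

A model could in principle output the weights and let the harness do the draw, which sidesteps the question of whether it can randomize on its own.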

miggol

This invites a game where models have variants with slightly differing system prompts. Don't know if they could actually sample from their own output if instructed, but it would allow for iterations on the system prompt to find the best instructions.
