AI's Biggest Flaw? The Blinking Cursor Problem
44 comments
February 24, 2025
lisper
No, the blinking cursor is a feature, not a bug. Alec Watson over at Technology Connections has a much better argument for this than I could ever hope to muster, so I'll just hand it over to him:
https://www.youtube.com/watch?v=QEJpZjg8GuA
psadri
I think it depends on who is using the system. For a power user who is already familiar with the domain, a free-form, open-ended UI that can understand anything is very powerful and liberating.
For a novice user, or someone from outside the domain, it can be challenging because they may not know where to start.
There is so much that can be done in this space by fully leaning into the AI. It could, for example, figure out the user's level and offer varying degrees of guidance and help.
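As a rough sketch of what I mean, assuming an OpenAI-style chat API (the prompts, model names, and level labels are invented for illustration):

  // Sketch: guess the user's expertise from their first message,
  // then pick a system prompt with more or less hand-holding.
  import OpenAI from "openai";

  const client = new OpenAI();

  async function classifyLevel(msg: string): Promise<"novice" | "expert"> {
    const res = await client.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [
        { role: "system", content: "Reply with exactly one word: novice or expert." },
        { role: "user", content: `How domain-savvy does this message sound?\n\n${msg}` },
      ],
    });
    return res.choices[0].message.content?.trim() === "expert" ? "expert" : "novice";
  }

  async function reply(msg: string): Promise<string> {
    const guidance = (await classifyLevel(msg)) === "novice"
      ? "Offer concrete next steps and example prompts the user could try." // novices get hand-holding
      : "Be terse. Skip tutorials and answer directly.";                    // power users get brevity
    const res = await client.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: "system", content: guidance },
        { role: "user", content: msg },
      ],
    });
    return res.choices[0].message.content ?? "";
  }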
joe_the_user
Just to be clear, the long video you link to essentially says that lack of discoverability is an intentional misfeature of social media.
Which is to say that the host and the OP agree that lack of discoverability is a problem (Watson just views it as a maliciously inserted one). And so your "No" involves a bit of misrepresentation...
lisper
> lack of discoverability is an intentional misfeature of social media
That's not the message at all. The message is that the problem with social media is that it feeds you content without any prompting, and so it turns the user into a purely passive consumer and robs them of their agency. There's plenty of discoverability in social media. The problem is you don't have to use it, and so people don't. A blinking cursor forces you to take the wheel.
ojschwa
This is a tantalizing problem for me as a UX designer. My approach, which I'm quite proud of, places a UI primitive (todo lists) center stage, with the chat thread on the side, similar to Canvas or Claude's Artifacts. The interaction works like this:
1. The user gets shown a list GUI based on their requirement (meal planning, shopping list...)
2. The user speaks directly to the list while the LLM listens in real time
3. The LLM acknowledges with emojis that flash to confirm understanding
4. The LLM creates, updates, or deletes the list items in turn (stored in localStorage or a Durable Object -> shout out to https://tinybase.org/); a sketch of this step follows below
The lists are React components, designed to be malleable. They can be rewritten in-app by the LLM while still taking todos. The React code also provides great context for the LLM: a shared contract between user and AI. I'm excited to experiment with streaming real-time screenshots of user interactions with the lists for even deeper mind-melding.
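Here's a minimal sketch of step 4 using TinyBase's store API (the ToolCall shape is my own illustration, not any particular provider's tool-call format):

  // The LLM's tool calls mutate a TinyBase store; the React list
  // component subscribes to the "todos" table and re-renders.
  import { createStore } from "tinybase";

  const store = createStore();

  type ToolCall =
    | { op: "create"; id: string; text: string }
    | { op: "update"; id: string; text?: string; done?: boolean }
    | { op: "delete"; id: string };

  function applyToolCall(call: ToolCall) {
    switch (call.op) {
      case "create":
        store.setRow("todos", call.id, { text: call.text, done: false });
        break;
      case "update":
        if (call.text !== undefined) store.setCell("todos", call.id, "text", call.text);
        if (call.done !== undefined) store.setCell("todos", call.id, "done", call.done);
        break;
      case "delete":
        store.delRow("todos", call.id);
        break;
    }
  }

On the React side, something like the useTable hook from tinybase/ui-react can keep the list component in sync with the store.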
I believe the cursor and chat thread remain critical. They ground the user and visually express the shared context between LLM and user. And of course, all these APIs are fundamentally structured around sequential message exchanges. So it will be an enduring UI pattern.
If you're curious I have a demo here -> https://app.tinytalkingtodos.com/
psadri
I’m a fan of this co-authoring-a-document model. It’s a nice bridge between free-form interaction and the rigid UIs of yore.
xg15
> Once you’ve overcome the intimidation of the blinking cursor and typed something in, one of the next obstacles you’ll likely meet is the lack of clarity regarding the overall capabilities of these general-purpose AI systems.
The article presents this as a UX problem, but isn't it actually a much deeper issue? We straight up don't know what these models can and cannot do (i.e., which tasks can be done reliably with high correctness, and which will just lead to endless hallucinations), because the mechanism by which the models generalize across tasks is still not fully understood. This is still an active area of research.
antonkar
Yep, and we could instead build an Artificial Static Place Intelligence. Rather than creating AI/AGI agents that act like librarians, handing you quotes from books while never letting you into the library to read the whole books yourself, why not expose the entire library, the whole multimodal language model, to real people, for example in a computer game?
To make this place easier to visit and explore, we could make a digital copy of our planet Earth and somehow expose the contents of the multimodal language model to everyone in the familiar, user-friendly UI of our planet.
We should not keep it hidden behind a strict librarian (the AI/AGI agent) that imposes rules on us, letting us read only the little quotes it spits out, while it itself was built on the whole stolen output of humanity.
We could explore The Library without any strict guardian, in the comfort of a simulated Earth on our devices, in VR, and eventually through some wireless brain-computer interface. It would always remain a game that no one is forced to play, unlike the agentic AI world that is being imposed on us more and more right now, and potentially forever.
marginalia_nu
These seem to mostly be human problems.
Out of the large number of things you can do, most likely you're only consciously aware of a small number of them, and even among those, you're fairly likely to fall back on doing the things you've done before.
You could potentially do something new, something you haven't even considered, something wildly out of character. There's any number of such things you could do, but most likely you won't; you'll follow your routines and do the same proven things over and over again.
You Can Just Do Things (TM), sure, but first you need to have the idea of doing them. That's the hard part: fishing an interesting idea out of the dizzying expanse of possibilities.
light_triad
A chat interface is great in the sense that it's open, flexible and intuitive.
The downside is that there's a tendency to anthropomorphise AI, and you might not want to talk to your computer: it takes too long to explain all the details, it can be clunky for certain tasks, and, as the author argues, it's actually limiting if you don't already know what it can do.
There's a need to get past the "Turing test" phase and integrate AI into more workflows so that chat is one interface among many options depending on the job to be done.
42lux
You know, I kinda want to, but more like in Star Trek: interconnected voice commands, terminals, and screens. The problem is that we won't get a well-integrated AI. Apple probably has the best shot, because they usually get the interconnections between their products right… but they have other problems when it comes to AI.
rafaelmn
Chat works because humans are really impressed by natural language responses, regardless of their actual correctness or quality.
Once you build it into a product the failure modes become obvious.
mozzieman
For programming, the tooling and UI are progressing: reasoning models, and tooling around them that writes unit tests, compiles the code, and runs the tests; if something fails, the code is regenerated. This creates other UI problems yet to be solved, like longer iterations between rounds of user feedback, but those problems are not for lack of progress.
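Roughly the loop I mean, as a sketch; generateCode and runTests here are hypothetical stand-ins for the model call and the compile/test harness:

  // Regenerate until the tests pass, feeding failures back to the model.
  declare function generateCode(spec: string, feedback: string): Promise<string>;
  declare function runTests(code: string): Promise<{ passed: boolean; errors: string }>;

  async function writeUntilGreen(spec: string, maxTries = 5): Promise<string> {
    let feedback = "";
    for (let i = 0; i < maxTries; i++) {
      const code = await generateCode(spec, feedback); // LLM call
      const result = await runTests(code);             // compile + run unit tests
      if (result.passed) return code;
      feedback = result.errors; // the "redo" step: failures go back into the prompt
    }
    throw new Error(`No passing solution after ${maxTries} attempts`);
  }

This loop is also where the longer iterations come from: the user only sees output after several model/test round trips.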
binarymax
I see some good points here, but overall I disagree. Traditionally, all UIs have required people to adapt to how machines work. We need to memorize commands and navigate clunky interfaces that are painstakingly assembled (often unsuccessfully) by UX research and UI teams.
Chat reverses this: it is now machines adapting to how we communicate. I can see some UI sugar finding its way into this new way of interacting, but we should start over and force the change, to keep it on our terms.
jltsiren
A chat UI can be intuitive if it sees the context. If you can make single-sentence queries, and the AI understands enough of the context to guess what you actually meant and to extrapolate the details, it can be very powerful. But if you actually need precise and detailed queries for good outcomes, it's not very natural. Explaining things clearly is often harder than understanding them or doing them yourself.
recursive
> These AI systems are not able to describe their own capabilities or strengths and are not aware of their limitations and weaknesses
I've experienced this with GitHub Copilot. At the beginning of a Copilot chat, there's a short paragraph telling you to use "slash commands" for various purposes. I ask for a list of the slash commands that are available. It responds by giving me a general definition of the term "slash command". No, I want to know which slash commands you support. Then it tells me it doesn't actually support slash commands.
I definitely feel like I'm falling into the non-power-user category described here in most of my AI interactions. So often I just end up arguing with them in circles, with them constantly agreeing and correcting, but never addressing my original goal.
yorwba
To find out about slash commands, you should type "/help". Of course, you'd only know about the "/help" slash command if you were already at least a bit familiar with slash commands. It is a conundrum.
recursive
Or... it could say "type /help to learn more". But maybe that would make it too easy.
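Something like this toy sketch would do it (all names invented): keep a registry so /help can truthfully enumerate what's supported, and advertise /help in the greeting.

  type Command = { help: string; run: (arg: string) => string };

  const commands: Record<string, Command> = {
    "/help": {
      help: "list the available slash commands",
      run: () =>
        Object.entries(commands)
          .map(([name, c]) => `${name}: ${c.help}`)
          .join("\n"),
    },
    "/explain": {
      help: "explain the selected code",
      run: (arg) => `Explaining: ${arg}`,
    },
  };

  const greeting = "Hi! Type /help to see what I can do.";

  function handle(input: string): string {
    const [name, ...rest] = input.split(" ");
    const cmd = commands[name];
    // Unknown input points back at /help instead of replying with a
    // generic definition of "slash command".
    return cmd ? cmd.run(rest.join(" ")) : `Unknown command. ${greeting}`;
  }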
ddxv
Another issue is trust. When it does give you information, how do you know you can trust it?
I treat it now more like advice from a friend: great information that isn't necessarily right, and often wrong without having any idea that it is wrong.
Syonyk
> I treat it now more like advice from a friend: great information that isn't necessarily right, and often wrong without having any idea that it is wrong.
"Drunken uncle at a bar, known for spinning tales, and a master BSer who hustled his way through college in assorted pool halls" is my personal model of it. Often right, or nearly so. Frequently wrong. Sometimes has made things up on the spot. Absolutely zero ability to tell which it is, from the conversation.
skydhash
You actually have a confidence measure for your friend's advice. I’d trust a mechanic friend if he says I should have someone take a look at my car, or my librarian friend when he recommends a few books. Not everyone tells a lie and the truth in the same breath. And there are qualifiers like “I think…”, “I believe…”, “Maybe…”
amelius
WhatsApp has the same blinking cursor, and everybody is happy with it.
layer8
An important feature of WhatsApp is that it lets you communicate with different people, who each have different pre-existing contexts and roles for you. Role selection is one of the possible solutions proposed in the article.
wepple
I tend to know that people on chat are human, and therefore what they’re likely capable of and not capable of.
And I’m not expected to use them as a tool. By contrast, I can probably pick up any Ryobi power tool I’ve never seen before and work out how to make it do its thing, and probably what its purpose is.
kleiba
The blinking cursor is a metaphor: it's about having to craft prompts, and what that implies from a UX perspective.
matthewmueller
This seems true if you want to get the most out of an LLM, but you could also say Google has this problem too.
It seems like not a huge stretch to apply how you use Google to LLMs and get good mileage.