GPT-4o with scheduled tasks (jawbone) is available in beta
97 comments
January 14, 2025 · ttul
qgin
Recurring schedules across time zones are an unbelievably maddening thing to implement. At first glance it seems simple, but it gets very weird very quickly.
wkat4242
Yeah, summer time in different countries switches on different days and often in a different direction (other hemisphere). I used to work on such matters, and those weeks were the toughest.
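For a concrete picture of the hemisphere problem, here's a minimal sketch (assuming only Python 3.9+ and its standard zoneinfo module, nothing specific to ChatGPT): the same 08:30 local alarm gains an hour of UTC offset in Berlin and loses one in Sydney over the same two weeks.

    from datetime import datetime
    from zoneinfo import ZoneInfo  # standard library since Python 3.9

    # The "same" 08:30 local alarm before and after the late-March /
    # early-April 2025 transitions: Europe springs forward while Australia
    # falls back, so the UTC offsets move in opposite directions.
    for tz_name in ("Europe/Berlin", "Australia/Sydney"):
        tz = ZoneInfo(tz_name)
        for local in (datetime(2025, 3, 28, 8, 30, tzinfo=tz),
                      datetime(2025, 4, 7, 8, 30, tzinfo=tz)):
            print(tz_name, local.isoformat(), "offset:", local.utcoffset())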
echeese
Considering my iPhone alarm still sometimes fails to go off (it just shows the alarm screen silently), I'd be inclined to believe you.
ineedasername
Thanks for that. I thought I was going crazy (well, still could be, I guess) or had some strange habit or gesture I didn't realize was silencing the alarm somehow.
yakz
Whenever I have to wake for something that I absolutely can’t miss, I set 2-3 extra reminders 5 minutes apart precisely because of this “silent alarm” bug. It’s only happened to me a couple of times but twice was enough to completely destroy my trust in the alarm. The first time I thought I just did something in my sleep to cause it, but the UI shows it as if the alarm worked. I’m lucky to have the privilege that if I oversleep an hour or so it’s no big deal, otherwise ye olde tabletop alarm clock would be back.
android521
And Gmail's scheduled send just won't work if you want to email yourself a month later.
imsotiredspacex
This is the prompt describing the function call parameters:
When calling the automation, you need to provide three main parameters:

1. Title (title): A brief descriptive name for the automation. This helps identify it at a glance. For example, "Check for recent news headlines".

2. Prompt (prompt): The detailed instruction or request you want the automation to follow. For example: "Search for the top 10 headlines from multiple sources, ensuring they are published within the last 48 hours, and provide a summary of any recent Russian military strikes in the Lviv Oblast."

3. Schedule (schedule): This uses the iCalendar (iCal) VEVENT format to specify when the automation should run. For example, if you want it to run every day at 8:30 AM, you might provide:

    BEGIN:VEVENT
    RRULE:FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0
    END:VEVENT

Optionally, you can also include:

• DTSTART (start time): If you have a specific starting point, you can include it. For example:

    BEGIN:VEVENT
    DTSTART:20250115T083000
    RRULE:FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0
    END:VEVENT

In summary, the call typically includes:

• title (string): A short name.
• prompt (string): What you want the automation to do.
• schedule (string): The iCal VEVENT defining when it should run.
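If you want to sanity-check what a given RRULE actually expands to, here's a minimal sketch using python-dateutil (an assumption on my part; nothing says OpenAI uses it) that computes the next few occurrences of the rule quoted above:

    from datetime import datetime
    from dateutil.rrule import rrulestr  # pip install python-dateutil

    # Expand the RRULE quoted above to see what "every day at 8:30 AM"
    # resolves to. The datetimes are naive; the user's time zone has to be
    # layered on separately, which is exactly where things tend to go wrong.
    rule = rrulestr(
        "FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0",
        dtstart=datetime(2025, 1, 15),
    )
    for occurrence in rule[:3]:
        print(occurrence)  # 2025-01-15 08:30:00, 2025-01-16 08:30:00, ...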
dmadisetti
The beta is inconsistently showing (required a few refreshes to get something to show up), but my limited usage of it showed a plethora of issues:
- Assumed UTC instead of EST. Corrected it and it still continued to bork
- Added random time deltas to the times I asked for (+2, -10 min).
- A couple of notifications didn't go off at all.
- The one that did go off didn't provide a push notification.
---
On top of that, it's only usable without search mode. In search mode, it was totally confused and gave me a Forbes article.
Seems half-baked to me.
Doing scheduled research behind the scenes or sending a push notification to my phone would be cool, but I'm surprised they thought this was OK for a public beta.
gukov
You'd think OpenAI's dev velocity and quality would be off the charts, since they live and breathe "AI." If the company building ChatGPT itself often delivers buggy features, it doesn't bode well for this whole 'AI will eat the world' notion.
golergka
So far, I've found AI to be a great force multiplier on small, green-field projects. In a huge corporate codebase, it has the power of advanced refactoring (which doesn't touch more than a handful of files at a time) and a CSS wizard.
practice9
Well, none of the labs have good frontend or mobile engineers, or even infra engineers.
Anthropic is ahead in this because they keep their UIs simplistic, so the failure modes are also simple (bad connection).
OpenAI is just pushing half-baked stuff to prod and moving on (GPTs, Canvas).
I find it hilarious and sad that o1-pro just times out thinking on very long or image-heavy chats. You need to reload the page multiple times after it fails to reply, and maybe the answer will appear (or not? or in five minutes?). It kinda shows they're not testing enough and not eating their own dog food; it feels like the ChatGPT 3.5 UI before the redesign.
lolinder
> Anthropic is ahead in this because they keep their UIs simplistic ... OpenAI is just pushing half-baked stuff to prod and moving on (GPTs, Canvas).
What's funny is that OpenAI's Canvas was their attempt to copy Anthropic's Artifacts! So it's not like Anthropic is stagnant and OpenAI is at least shipping; Anthropic is shipping, and OpenAI can't even copy them right.
jeffgreco
It's a good point: Anthropic is being VERY choosy and winds up knocking it out of the park with stuff like Artifacts. Meanwhile their macOS app is junk, but that's obviously not a priority.
cma
> because they keep their UIs simplistic
How do I edit a sent message in the Claude Android app? It's so simplistic I can't find it.
cruffle_duffle
According to all the magazines I've been reading, all that is required is to just prompt it with "please fix all of these issues" and give it a bulleted list with a single sentence describing each issue. I mean, it's AI powered and therefore much better than overpaid prima-donna engineers, so obviously it should "just work" and all the problems will get fixed. I'm sure most of the bugs were the result of humans meddling in the AI's brilliant output.
Right now, in fact, my understanding is that OpenAI is using their current LLMs to write the next-generation ones, which will far surpass anything a developer can currently do. Obviously we'll need to keep management around to tell these things what to do, but the days of being a paid software engineer are numbered.
xarope
I think you forgot the /s (sarcasm) in your post!
ineedasername
When I have it do a search, I have to tell it to just get all the info it can in the search but wait for the next request. Then I explicitly tell it we're done searching and to treat the next prompt as a new request, but using the new info it found.
That's the only way I get it to have a halfway decent brain after a web search. Something about that mode makes it more like a PR-drone version of whatever I asked it to search for, repeating things verbatim even when I ask for more specifics in follow-up.
imsotiredspacex
I posted the system prompt part describing the function call; if you read it and adjust your prompt for creating the task, it works way better.
potatoman22
I'd rather have buggy things now than perfect things in a year.
dmadisetti
It doesn't need to be perfect, but using this would actively reduce productivity.
sprobertson
First impressions matter; if the experience is this bad, you're probably waiting a year to come back anyway.
jahewson
Worked out great for Sonos when their timers and alarms didn’t work.
broknbottle
Found the PM
arthurcolle
DateTime stuff is generally super annoying to debug, so I can't fault them too badly. Adding a scheduler is a key enabling idea for a ton of use cases.
sensanaty
> Can't fault them too badly
The same company that touts to the world their super hyper advanced AI tool that can do everyone's jobs (except the C-level's, apparently) can't figure out how to make a functional cron job happen? And we're giving them a pass, despite the bajillions of dollars that M$ and VCs are funneling their way?
Quite interesting that they wouldn't just throw the "proven to be AGI because it passes some IQ tests sometimes" tooling at it and be done with it.
arthurcolle
It would explain the bugs if they used the AI to write the datetime implementation, though.
dmadisetti
Yeah, they're not exactly a scrappy startup; I'd be surprised if they had zero QA.
Makes me wonder if they internally track "press releases / Q" as a metric to keep up the hype.
airstrike
Maybe that's the Q* we've been hearing rumors about
cbeach
Agreed on date/time being a frustrating area of software development.
But wouldn't a company like OpenAI use a tick-based system in this architecture? I.e., there's an event emitter that ticks every second (or maybe every minute), and consumers that act on these events in real time. Obviously things get complicated due to the time consumed by the inference models, but if OpenAI knows the task upfront, it could make an allowance for the inference time.
If the logic is event driven and deterministic, it's easy to test and debug, right?
singron
The original cron was programmed this way, but it has to examine every job every tick to check whether it should run, which doesn't scale well. Instead, you predict when the next run for a job will be and insert that into an indexed schedule. Then each tick it checks the front of the schedule in ascending order of timestamps until the remaining jobs are in the future.
This is also a bad case in terms of queueing theory. Looking at Kingman's formula, the arrival variance is very high (a ton of jobs will run at 00:00 and far fewer at 00:01), and the service time also has pretty high variance. That combo will require either high queue-delay variance, low utilization (i.e. over-provisioning), or a sophisticated auto-scaler that aggressively starts and stops instances to anticipate the schedule. Most of the time it's OK to let jobs queue, since most use cases don't care if a daily or weekly job is 5 minutes late.
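A minimal sketch of that indexed-schedule idea (just an illustration with made-up job names, not anything OpenAI has described): keep (next_run, job) entries in a min-heap, pop whatever is due on each tick, and push each job back with its next predicted run.

    import heapq
    from datetime import datetime, timedelta, timezone

    def next_daily_run(hour, minute, now):
        """Predict the next time a daily job at hour:minute should fire."""
        candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        return candidate if candidate > now else candidate + timedelta(days=1)

    # Heap of (next_run, job_name, hour, minute); only the front is inspected.
    now = datetime.now(timezone.utc)
    schedule = []
    for name, hour, minute in [("morning digest", 8, 30), ("midnight batch", 0, 0)]:
        heapq.heappush(schedule, (next_daily_run(hour, minute, now), name, hour, minute))

    def tick():
        now = datetime.now(timezone.utc)
        while schedule and schedule[0][0] <= now:
            due, name, hour, minute = heapq.heappop(schedule)
            print("running", name, "scheduled for", due)
            heapq.heappush(schedule, (next_daily_run(hour, minute, now), name, hour, minute))

    # A real service would call tick() every second or minute from a timer loop.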
sky2224
Pretty useless so far. I'm not sure what the intended application of this is, but I wanted it to schedule some work for me.
It only scheduled the first thing, and that was after I had to be specific by saying "7:30pm-11pm". I wanted to say "from now to 11pm", but it couldn't process "now".
PittleyDunkin
Where are the release notes?
Edit: I suppose they'll be here at some point: https://help.openai.com/en/articles/9624314-model-release-no...
These seem like extremely shitty release notes. I have no clue why anybody pays for this model.
ben_w
You might want this? It's more technical than the one you linked to:
throwup238
The docs for the beta seem to already be up: https://help.openai.com/en/articles/10291617-scheduled-tasks...
speedgoose
It has consistently been the best model for the last two years, and only Gemini is perhaps slightly better now.
TheJCDenton
Nothing yet
phgn
What am I supposed to see at the link?
swifthesitation
You click the drop-down menu for model selection and choose "4o with scheduled tasks".
picografix
Why are they trying to be a model provider as well as a service provider?
simple10
The UI is different in the macOS desktop app. The ability to edit the scheduled task is only available in the web UI for me.
I got the best results by not enabling Search the Web when I was trying to create tasks. It confuses the model. But scheduled tasks can successfully search the web.
It's flaky, but looks promising!
throwaway314155
Less relevant, but why isn't Canvas available in the desktop app? I thought they had feature parity, but it seems not.
reversethread
Does the world need another reminder/todo app?
Many existing apps (like Todoist) have already had LLM integrations for a while now, and have more features like calendars and syncing.
Or do I completely not understand what this product is trying to be?
bogdan
Why not? I already pay for ChatGPT but I don't pay for Todoist, so that doesn't help me.
elyase
There is more information in these Twitter threads:
https://x.com/karinanguyen_/status/1879270529066262733
https://x.com/OpenAI/status/1879267276291203329
cbeach
I'm sure it's brilliant, but I have no idea what it's capable of. What will it do? Send me a push notification? Have an answer waiting for me when I come back to it in a while?
I switched over to the "GPT-4o with scheduled tasks" model and there were no UI hints as to how I might use the feature. So I asked it, "What can you follow up on later, and how?"
It replied, "Could you clarify what specifically you'd like me to follow up on later?"
This is a truly awful way to launch a new product.
benaduggan
After asking it to schedule something, it prompted me to allow or block notifications, so it sounds like this is just ChatGPT scheduling push notifications? We'll see!
jerpint
So basically cannibalizing Siri?
1propionyl
Siri has access to a wealth of private existing and future on-device APIs to fuel context-sensitive responses to queries on vendor-locked devices used all day long. (Which Apple has apparently decided just not to use yet.)
OpenAI doesn't, they just have a ton of funding and (up to recently) a good mass media story, and the best natural language responses.
The moat around Siri is much deeper, and I don't really see any evidence OpenAI has any special sauce that can't be reproduced by others.
My prediction is that OpenAI's reliance on AI doomerism to generate a regulatory moat falters as they become unable to produce step changes in new models, while Apple's efforts despite being halting and incomplete become ubiquitous thanks to market share and access to on device context.
I wouldn't (and don't) put my money in OpenAI anymore. I don't see a future for them beyond being the first mover in an "LLM as a service" space in which they have no moat. On top of that, they've managed to absorb the worst of the criticism as a sort of LLM lightning rod. Worst of all, it may turn out that off-device inference isn't even really necessary for most consumer applications, in which case they'll have to make most of their money on corporate contracts.
Maybe something will change, but right now OpenAI is looking like a bubble company with no guarantee of its dominant position. Because it is what it is: simply the largest pooling of money to try to corner this market. What else do they have?
siva7
Yep, this is a truly bad feature launch. I have no clue what this model does. Did they somehow lose their competent product people?
cbeach
Ah, I've just stumbled on some hints after clicking around: click on your avatar image (top right) and then click "Tasks".
Then there are some UI hints.
"Remind me of your mom's birthday on [X] date"
Wow, really maximising that $10bn GPU investment!
Amazon had an insane number of people working on just the alarms feature in Alexa when they interviewed me for a position years ago. They had entire teams devoted to the tiniest edge case within the realm of scheduling things with Alexa. This is no doubt one of the biggest use cases in computing: getting your computer to tell you what to do at a given time.