The surprising struggle to get a Unix Epoch time from a UTC string in C or C++

d_burfoot

My personal rule for time processing: use the language-provided libraries for ONLY 2 operations: converting back and forth between a formatted time string with a time zone, and a Unix epoch timestamp. Perform all other time processing in your own code based on those 2 operations, and whenever you start with a new language or framework, just learn those 2.
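
The second of those two operations, going from an epoch timestamp back to a formatted UTC string, might look like this in C. It is a minimal sketch assuming a POSIX system (for `gmtime_r`); `format_utc` is a hypothetical helper name, and the ISO 8601 output format is just one possible choice.

```c
#define _POSIX_C_SOURCE 200809L  /* for gmtime_r() */
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: writes e.g. "2025-01-20T06:07:07Z" into buf.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if the conversion failed or the buffer was too small. */
size_t format_utc(time_t t, char *buf, size_t size) {
    struct tm tm;

    /* gmtime_r() breaks the timestamp down as UTC, with no time zone lookup. */
    if (gmtime_r(&t, &tm) == NULL) {
        return 0;
    }

    /* The trailing 'Z' is a literal marking the string as UTC. */
    return strftime(buf, size, "%Y-%m-%dT%H:%M:%SZ", &tm);
}
```

Calling `format_utc(0, buf, sizeof buf)` with a 32-byte buffer should leave "1970-01-01T00:00:00Z" in `buf`.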

I've wasted so many dreary hours trying to figure out crappy time processing APIs and libraries. Never again!

chikere232

Is it a struggle though?

They needed to have a locale matching the language of the localised time string they wanted to parse, they needed to use strptime to parse the string, they needed to use timegm() to convert the result to seconds when seen as UTC. The man pages pretty much describe these things.

The interface of these things could certainly be nicer, but most of the things they bring up as issues aren't even relevant to the task they're trying to do. Why do they talk about daylight saving time being confusing when they're only trying to deal with UTC, which doesn't have it?

johnisgood

It is not.

  #define _GNU_SOURCE  // exposes strptime() and timegm() on glibc
  #include <stdio.h>
  #include <time.h>

  int main(void) {
      struct tm tm = {0};
      const char *time_str = "Mon, 20 Jan 2025 06:07:07 GMT";
      const char *fmt = "%a, %d %b %Y %H:%M:%S GMT";

      // Parse the time string into broken-down time
      if (strptime(time_str, fmt, &tm) == NULL) {
          fprintf(stderr, "Error parsing time\n");
          return 1;
      }

      // Convert to Unix timestamp, interpreting tm as UTC
      time_t timestamp = timegm(&tm);
      if (timestamp == -1) {
          fprintf(stderr, "Error converting to timestamp\n");
          return 1;
      }

      // time_t has no printf specifier of its own, so cast explicitly
      printf("Unix timestamp: %lld\n", (long long)timestamp);
      return 0;
  }
It is a C snippet that parses the UTC time string and safely converts it to a Unix timestamp. (Strictly speaking, strptime() comes from POSIX and timegm() is a GNU/BSD extension, so it needs more than plain C99.) It follows best practices from the SEI CERT C standard, avoiding locale and timezone issues by using UTC and timegm().

You can avoid the pitfalls of mktime() by using timegm(), which works directly with UTC time.

Where is the struggle? Am I misunderstanding it?

Oh by the way, must read: https://www.catb.org/esr/time-programming/ (Time, Clock, and Calendar Programming In C by Eric S. Raymond)

paxcoder

[dead]

pif

What's a man page? [cit]

johnisgood

"manual pages", type "man man" in your terminal.

https://man7.org/linux/man-pages/man1/man.1.html

jonstewart

The headline doesn’t match the article. As it points out, C++20 has a very nice, and portable, time library. I quibble with the article here, though: in 2025, C++20 is widely available.

1970-01-01

13 more years to go until the 2038 problem.

Surely we'll have everything patched up by then..

ahubert

wow that is dedication 1970-01-01! :-)

amelius

You can probably find the solution on Stack Overflow, or by asking an LLM.

sylware

Once you understand that the core of Unix time is the "day", you only need to know the first leap year after the epoch (if I recall correctly, it is 1972) and handle the leap-year "rules", and you will be OK (see Wikipedia, I think; I don't use Google anymore since they now force JavaScript upon new web engines).

I did write such code in RISC-V assembly (for a custom command-line tool on Linux that prints the output of the statx syscall). So don't be scared: with a bit of motivation, you'll figure it out.

DamonHD

The core of the UNIX time is seconds since epoch, nothing else. 'Day' has no special place at all. There are calendars for converting to and from dates, including Western-style, but the days in those calendars vary in length because of daylight saving switches and leap seconds for example.

Kwpolska

UNIX time ignores leap seconds, so every day is exactly 86400 seconds, and every year is either 365*86400 or 366*86400 seconds. This makes converting from yyyy-mm-dd to UNIX time quite easy, as you can just do `365*86400*(yyyy-1970) + leap_years*86400` to get to yyyy-01-01.
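
As a rough illustration of that formula, here is a minimal C sketch. The function names are hypothetical. It assumes Gregorian leap-year rules, years from 1970 onward, and that `leap_years` means the number of leap years from 1970 up to but not including the target year.

```c
#include <stdint.h>

/* Number of leap years in [1970, year), using Gregorian rules.
 * The count of leap years in [1, y] is y/4 - y/100 + y/400. */
static int64_t leap_years_since_1970(int64_t year) {
    int64_t y = year - 1;
    int64_t up_to_prev_year = y / 4 - y / 100 + y / 400;
    int64_t up_to_1969 = 1969 / 4 - 1969 / 100 + 1969 / 400; /* 477 */
    return up_to_prev_year - up_to_1969;
}

/* Unix time of yyyy-01-01 00:00:00 UTC, valid for year >= 1970. */
int64_t unix_time_at_year_start(int64_t year) {
    return 365 * 86400 * (year - 1970)
         + leap_years_since_1970(year) * 86400;
}
```

For 2025 there are 14 leap years since 1970 (1972 through 2024), so `unix_time_at_year_start(2025)` gives 86400 * (55 * 365 + 14) = 1735689600.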

udidnmem

You cannot since it's missing time zone

RHSeeger

UTC is a timezone, though. Or am I misunderstanding what you're saying?

cvadict

That is fine as long as the input / output is always in UTC... but at the end of the day you often want to communicate that timepoint to a human user (e.g. an appointment time, the time at which some event happened, etc.), which is when our stupid monkey brains expect the ascii string you are showing us to actually make sense in our specific locale (including all of the warts each of those particular timezones have, including leap second, DST, etc.)

jxjsndbxbd

UTC would be marked as +Z

Without any marking, it could be anything

loeg

The article explicitly mentions UTC all over the place. It's UTC.

DamonHD

No '+'.

Noon UTC is "12:00Z".

emcell

Again, one of the many reasons I don't code in C or C++ anymore.

pif

And the fox said: "These grapes are sour".

Dylan16807

That makes no sense in this context. What's the situation you're imagining where they wanted to use C but something else prevented them so they made up an excuse to call C bad?

You know sometimes people just dislike things, right?

zX41ZdbW

The first rule of thumb is to never use functions from glibc (gmtime, localtime, mktime, etc.) because half of them are not thread-safe, the other half use a global mutex, and they are unreasonably slow. The second rule of thumb is to never use functions from C++, because iostreams are slow, and a stringstream can lead to silent data loss if an exception is thrown during memory allocation.

ClickHouse has the "parseDateTimeBestEffort" function: https://clickhouse.com/docs/en/sql-reference/functions/type-... and here is its source code: https://github.com/ClickHouse/ClickHouse/blob/74d8551dadf735...
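
For already-split UTC fields, the conversion can also be done with integer arithmetic alone, so no libc time function, lock, or global state is involved. The sketch below is not ClickHouse's implementation. It uses the well-known "days from civil" calculation for the proleptic Gregorian calendar; the function names are hypothetical, and the fields are assumed to be validated by the caller.

```c
#include <stdint.h>

/* Days since 1970-01-01 for a proleptic Gregorian date.
 * month is 1..12, day is 1..31; inputs are assumed valid. */
static int64_t days_from_civil(int64_t y, unsigned m, unsigned d) {
    y -= (m <= 2);  /* count Jan/Feb as the tail of the previous year */
    int64_t era = (y >= 0 ? y : y - 399) / 400;   /* 400-year cycle */
    unsigned yoe = (unsigned)(y - era * 400);     /* year of era: 0..399 */
    unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; /* day of era */
    return era * 146097 + (int64_t)doe - 719468;
}

/* Unix time for a UTC date and time; leap seconds are ignored,
 * as they are in Unix time itself. */
int64_t utc_to_unix(int64_t year, unsigned month, unsigned day,
                    unsigned hour, unsigned min, unsigned sec) {
    return days_from_civil(year, month, day) * 86400
         + (int64_t)hour * 3600 + (int64_t)min * 60 + (int64_t)sec;
}
```

With the timestamp from the earlier example, `utc_to_unix(2025, 1, 20, 6, 7, 7)` gives 1737353227. Both functions are pure, so they can be called from any thread.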