
How often does Python allocate?

13 comments

November 1, 2025

sushibowl

With respect to tagged pointers, there seems to be some recent movement on that front in CPython: https://github.com/python/cpython/issues/132509
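
For readers unfamiliar with the idea, here is a toy sketch (my own illustration, not CPython's design) of what "tagging" a pointer means: steal the low bit of a machine word so small integers can live inline in the word itself rather than on the heap.

  # Toy sketch of tagged pointers -- not how CPython implements it.
  def tag_int(n: int) -> int:
      # shift the value up and set the low "this is an int" bit
      return (n << 1) | 1
  def is_tagged_int(word: int) -> bool:
      return word & 1 == 1
  def untag_int(word: int) -> int:
      # arithmetic shift recovers the original value (negatives included)
      return word >> 1
  word = tag_int(42)
  assert is_tagged_int(word) and untag_int(word) == 42
  assert untag_int(tag_int(-7)) == -7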

notatallshaw

Unfortunately that was posted 1 month before the Faster CPython project was disbanded by Microsoft, so I imagine things have slowed.

nu11ptr

I admit it may just be because I'm a PL nerd, but I thought it was general knowledge that pretty much EVERYTHING in Python is an object, and an object in Python is, AFAIK, always heap allocated. This goes deeper than just integers. Things most people think of as declarative (classes, modules, and so on) are also objects. It is both the best thing (for dynamism and fun/tinkering) and the worst thing (for performance optimization) about Python.

If you've never done it, I recommend using the `dir` function in a REPL, finding interesting things inside your objects, calling `dir` on those, and keeping the recursion going. It is a very eye-opening experience as to just how deep the objects in Python go.
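
A tiny sketch of that exercise (my own illustration), walking a couple of levels of attributes starting from a plain integer:

  # Walk a few levels of dir() starting from an int, to show how deep the
  # object graph goes. The limits are arbitrary, just to keep output short.
  def explore(obj, depth=2, indent=0):
      for name in dir(obj)[:5]:
          print(" " * indent + name)
          if depth > 1:
              explore(getattr(obj, name), depth - 1, indent + 2)
  explore(42)  # even a bare integer is an object full of other objects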

boothby

Once upon a time, I wanted to back a stack with a linked list in Python. I had been reading a lot of compiled bytecode, and had recently learned that CPython's interpreter is a stack-based virtual machine capable of unpacking and popping tuples in its hot path. I also learned about the freelist.

I ended up with the notation

  Initialization:
    head = ()
  Push:
    head = data, head
  Safe Pop:
    if head:
        data, head = head
  Safe Top:
    head[0] if head else None
And for many stack-based algorithms I've found this to be close to optimal, in part because the length-2 tuples get recycled (and also because there are no function calls, attribute accesses, etc.). But I'm rather embarrassed to put it into a codebase, because others expect Python to be beautiful and this looks weird.
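
To make the pattern concrete, here is a minimal sketch (my illustration, not the commenter's actual code) of the same cons-cell stack driving an iterative depth-first traversal:

  def flatten(tree):
      """Yield the leaves of arbitrarily nested lists, depth-first,
      using the stack of 2-tuples described above."""
      head = (tree, ())                  # push the root onto an empty stack
      while head:
          node, head = head              # pop
          if isinstance(node, list):
              for child in reversed(node):
                  head = child, head     # push children, leftmost ends up on top
          else:
              yield node
  print(list(flatten([1, [2, [3, 4]], 5])))  # -> [1, 2, 3, 4, 5]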

mgkuhn

There are reasons why the same program in Julia can be 60x faster than in Python; see, e.g., slide 5 in https://www.cl.cam.ac.uk/teaching/2526/TeX+Julia/julia-slide... for an example.

godshatter

C gets a lot of crap, sometimes for good reason, but one thing I like about it is that the question of whether C is allocating something is easy to answer, at least for your own code.

bee_rider

It is nice.

Although there are also modern, beautiful, user-friendly languages where allocation is mostly obvious. Like Fortran.

manwe150

Python is entirely a C program, so by this article's own reasoning, this seems like one of those fallacies C programmers believe that justify using C.

zahlman

A caveat applies to the entire analysis: CPython may be the reference implementation, but it's still just one implementation. This sort of thing may work totally differently in PyPy, and especially in implementations that run on another garbage-collected runtime, such as Jython.

> Let’s take out the print statement and see if it’s just the addition:

Just FWIW: the assignment is not required to prevent the useless addition from being optimized away. CPython isn't doing any static analysis, so it doesn't know that `range` is the builtin, and thus doesn't know that `i` is an integer, and thus doesn't know that `+` will be side-effect-free.
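
One quick way to check this (my own test, not from the article) is to disassemble a loop whose body is a bare addition and confirm the add is still emitted:

  import dis
  src = """
  for i in range(100_000):
      i + i
  """
  # On recent CPython versions you should still see a BINARY_OP (the add)
  # followed by a POP_TOP discarding its result -- nothing proves the
  # expression is pure, so nothing removes it.
  dis.dis(compile(src, "<example>", "exec"))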

> Nope, it seems there is a pre-allocated list of objects for integers in the range of -5 -> 1025. This would account for 1025 iterations of our loop but not for the rest.

1024 iterations, because the check is for numbers strictly less than `_PY_NSMALLPOSINTS` and the value computed is `i + 1` (so, `1` on the first iteration).

Interesting. I knew of them only ranging up to 256 (https://stackoverflow.com/questions/306313).

It turns out (https://github.com/python/cpython/commit/7ce25edb8f41e527ed4...) that the change is barely a month old in the repository; so it's not in 3.14 (https://github.com/python/cpython/blob/3.14/Include/internal...) and won't show up until 3.15.
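
An easy way to poke at the cache boundary yourself (illustrative only; as noted above, the exact upper bound depends on the interpreter version):

  def cached(n: int) -> bool:
      # Go through str()/int() so compile-time constant folding can't make
      # the two operands the same object by accident; identity then reflects
      # the runtime small-int cache.
      return int(str(n)) is int(str(n))
  print(cached(5))       # True on CPython: small ints are preallocated
  print(cached(10**9))   # Typically False: two distinct heap objects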

> Our script appears to actually be reusing most of the PyLongObject objects!

The interesting part is that it can somehow do this even though the values are increasing throughout the loop (i.e., to values not seen on previous iterations), and it also doesn't need to allocate for the value of `i` retrieved from the `range`.
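
A crude way to see why reuse is plausible (my own illustration, not the article's measurement): the temporary produced in one iteration becomes garbage before the next value is created, so its memory can be handed straight back, and `id()` tends to repeat.

  ids = set()
  for i in range(10_000):
      # i + 2**40 is a fresh, non-cached int that is discarded right after
      # id() returns, so the next iteration can reuse its slot.
      ids.add(id(i + 2**40))
  print(len(ids))  # usually far smaller than 10_000 on CPython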

> But realistically the majority of integers in a program are going to be less than 2^30 so why not introduce a fast path which skips this complicated code entirely?

This is the sort of thing where PRs to CPython are always welcome, to my understanding. It probably isn't a priority, or something that other devs have thought of, because that allocation presumably isn't a big deal compared to the time taken for the actual conversion, which in turn is normally happening because of some kind of I/O request. (Also, real programs probably do simple arithmetic on small numbers much more often than they string-format them.)

petters

> that make use of another garbage-collecting runtime, such as Jython

I think that is mostly of historical interest. For example, it still does not support Python 3 and has not been updated in a very long time.

cogman10

GraalPy [1] is where it's at if you want Python on a JVM.

[1] https://www.graalvm.org/python/

zahlman

Ah, sorry to hear it. I'd lost track. It looks like IronPython is still active, but way behind. Or rather, maybe they have no interest in implementing Python 3 features beyond 3.4.