Python as a second language empathy (2018)

planckscnst

I learned several languages before Python, but the one that made it the most difficult was Ruby. After having done Ruby for years and becoming really familiar with idioms and standard practices, coming to python feels like trying to cuddle with a porcupine.

List (or set, or dict) comprehensions are really cool... one level deep. The moment you do 'in...in' my brain shuts off. It takes me something like 30 minutes to finally get it, but then ten seconds later, I've lost it again. And there are a bunch of other things that just feel really awkward and uncomfortable in Python

whytevuhuni

There's a pretty easy trick to nested comprehensions.

Just write the loop as normal:

    for name, values in map_of_values.items():
        for value in values:
            yield name, value
Then concatenate everything on one line:

    [for name, values in map_of_values.items(): for value in values: yield name, value]
Then move the yield to the very start, wrap it in parentheses (if necessary), and remove colons:

    [(name, value) for name, values in map_of_values.items() for value in values]
And that's your comprehension.

Maybe this makes more sense if you're reading it on a wide screen:

    [              for name, values in map_of_values.items(): for value in values: yield name, value]
                                                                                   ^^^^^^^^^^^^^^^^^
           /------------------------------------------------------------------------------/
     vvvvvvvvvvvvv
    [(name, value) for name, values in map_of_values.items()  for value in values                   ]
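To check the recipe, here is a minimal sketch that runs the generator-function form and the comprehension side by side. The sample `map_of_values` dict and the function name `pairs` are made up for illustration.

```python
# Hypothetical sample data; any mapping of name -> iterable works.
map_of_values = {"a": [1, 2], "b": [3]}


def pairs(mapping):
    # The "normal" nested loop, written as a generator function.
    for name, values in mapping.items():
        for value in values:
            yield name, value


# Same loops in the same order, with the yielded value moved to the front.
as_comprehension = [
    (name, value)
    for name, values in map_of_values.items()
    for value in values
]

print(list(pairs(map_of_values)))
print(as_comprehension)
```

Both prints should show the same list of pairs, since the loop order is unchanged.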

JodieBenitez

I write Python every day and I would have just written the "normal" loop, which is the only one that is really readable in your examples, at least to me.

whytevuhuni

I wrote it on one line to make the example make sense, but most Python comprehensions are plenty readable if they're split into many lines:

    new_list = [
        (name, value)
        for name, values in map_of_values.items()
        for value in values
    ]
The normal loop needs either a yield (which means you have to put it inside a generator function, and then call and list() the generator), or you have to make a list and keep appending things into it, which looks worse.

skeledrew

Wow, this is nice. I've been doing Python for quite a few years and whenever I needed a nested comprehension I'd always have to go to my cheatsheet. Now that I've seen how it's composed that's one less thing I'll need to lookup again. Thank you.

TheOtherHobbes

It's a fairly clunky idiom and there's no reason to use it if you prefer more explicit code.

I can see the attraction for terse solutions, but even the name is questionable. ("Comprehension?" Of what? It's not comprehending anything. It's a filter/processor.)

xk3

The first example is a generator. If you wanted to keep that just use `()` instead of `[]`:

    ((name, value) for name, values in map_of_values.items() for value in values)
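One thing to keep in mind with the `()` form is that it is lazy and can only be consumed once. A small sketch, using a made-up `map_of_values`:

```python
map_of_values = {"a": [1, 2], "b": [3]}  # hypothetical sample data

lazy_pairs = (
    (name, value)
    for name, values in map_of_values.items()
    for value in values
)

first = next(lazy_pairs)   # pulls exactly one item; nothing else has run yet
rest = list(lazy_pairs)    # consumes whatever is left
again = list(lazy_pairs)   # the generator is exhausted, so this is empty

print(first, rest, again)
```

If the result needs to be iterated more than once, the `[]` form (or wrapping in `list()`) is the safer choice.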

dijksterhuis

i have a rule of thumb around “one liner” [0] comprehensions that go beyond two nested for loops and/or multiple if conditions — turn it into a generator function

    def gen_not_one_y_from_xs(xs):
        for ys in xs:
            for y in ys:
                if y != 1:
                    yield y

    # … later, actual call
    
    not_one_ys = list(gen_not_one_y_from_xs(my_xs))
the point of the rule being — if there are two nested for loops in a comprehension then it’s more complicated than previously thought and should be written out explicitly. list comprehensions are a convenience shortcut, so treat them as such and only use them for simple/obvious loops, basically.

edit — also, now it’s possible to unit test the for loop logic and debug each step, which you cannot do with a list comprehension (unless a function is just returning the result of a list comprehension… which is… yeah… i mean you could do that…)

[0]: 9 times out of 10, multiple for loop comprehensions are not one liners and end up sprawling over multiple lines and become an abject misery to debug.
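As a rough illustration of the testability point, here is the same generator function with a tiny test next to it, plus the comprehension it replaces. The test inputs and `my_xs` values are made up.

```python
def gen_not_one_y_from_xs(xs):
    for ys in xs:
        for y in ys:
            if y != 1:
                yield y


def test_gen_not_one_y_from_xs():
    # Hypothetical test case; a test runner such as pytest would pick this up.
    assert list(gen_not_one_y_from_xs([[1, 2], [3, 1], []])) == [2, 3]


# The equivalent comprehension, for comparison: two loops and one condition.
my_xs = [[1, 2], [3, 1], []]
not_one_ys = [y for ys in my_xs for y in ys if y != 1]

test_gen_not_one_y_from_xs()
print(not_one_ys)
```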

mixmastamyk

The loops are in standard order, with the yield value up front. Write the loops, surround with brackets, move the value up to the first line.

a_e_k

Oddly, trying to occasionally place on the Advent of Code global leaderboard was the thing that got me most fluent in Python as its own language.

Before that, I tended to write Python in a much more imperative style, since C++ is my main language; think for loops and appending to lists. Thanks to AoC, I'm now addicted to comprehensions for transforming and filtering, know the standard library cold, and write much more concise, Pythonic code.

dbcurtis

Ha ha, yeah, for sure. It is easy to tell the Python code coming from an experienced C programmer. The thing about Python is, you can just give C a semi-colon-ectomy and it pretty much works. But it isn't Python... it took me a while to get completely pickled in the Python way of doing things. Ramalho's book did more than anything to put my brain into Python mode.

cozzyd

Fortunately for me, python doesn't care about my semicolons

analog31

I do a lot of hardware hacking, where I'm writing C for the hardware, and Python for my testing. Probably half of my syntax errors are typing C into the Python editor, or Python into the C editor.

emmelaich

I did my first ever AoC last year and I'm old. It was fun. Didn't get very far though. First thing that mucked me up was forgetting negative indices in Python arrays are valid.
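A minimal sketch of that trap, with made-up data: an off-by-one that would be an error (or undefined behavior) in many other languages quietly wraps around in Python.

```python
readings = [10, 20, 30]  # hypothetical sample data

# Intended: "the element before index 0", which does not exist.
i = 0
previous = readings[i - 1]  # index -1 is the LAST element, not an error

# Running off the other end does still raise.
try:
    readings[3]
    out_of_range = False
except IndexError:
    out_of_range = True

print(previous, out_of_range)
```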

I'm an admin/SRE and don't spend enough time programming, in too many languages. C++, C, Perl, Ruby, Python, Awk, etc.

ddejohn

Python's standard library is phenomenal.

sgarland

And vast. IME, people often underestimate what it can do. For example, if you need to store IP addresses in a DB, you can use the stdlib module ipaddress to validate them, then store them as an INT, casting it back to dotted-quad with Python if needed.

Also, of course, Postgres has a native inet type, which performs validation and other operations, storing a v4 in 7 bytes – not quite as good as 4 bytes for an INT, but much better than the maximum 16 bytes to store a dotted-quad as a VARCHAR. But if you’ve got MySQL (or anything else without a native type), this is a solid way to save space – and more importantly, memory – that can add up over hundreds of millions of rows (session information for a large SaaS product, for example).
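A minimal sketch of the validate, store-as-integer, and convert-back round trip with `ipaddress`. The addresses are made up, and this assumes the column is an unsigned 4-byte integer, since IPv4 values go up to 4294967295.

```python
import ipaddress

# Validation happens in the constructor; bad input raises AddressValueError.
addr = ipaddress.IPv4Address("192.168.0.1")
as_int = int(addr)                          # value to store in the INT column
back = str(ipaddress.IPv4Address(as_int))   # dotted-quad again when reading

try:
    ipaddress.IPv4Address("300.1.1.1")
    valid = True
except ipaddress.AddressValueError:
    valid = False

print(as_int, back, valid)
```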

xg15

> But annotations are much less powerful than Python decorators.

Coming from Java and seeing what kind of stuff is practically done there with annotations (looking at you, Spring), Python decorators actually feel more constrained than annotations, in a fully beneficial sense.

Yes, Python decorators run arbitrary code, but at least, there is a straight-forward relationship between the decorator and the location of that code. Decorators are just syntactic sugar for a function call that takes the decorated function as an argument - so you just have to look up the definition of the decorator to find out what it does.
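To make the "syntactic sugar" point concrete, here is a small sketch. The decorator `shout` and the functions around it are hypothetical names used only for illustration.

```python
import functools


def shout(func):
    # Hypothetical decorator: upper-cases whatever the wrapped function returns.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


@shout
def greet(name):
    return f"hello, {name}"


def greet_plain(name):
    return f"hello, {name}"


# The @ form above amounts to this explicit call on the function object.
greet_manual = shout(greet_plain)

print(greet("ada"), greet_manual("ada"))
```

Finding out what `@shout` does only requires reading the body of `shout`.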

In contrast, a Java annotation does exactly nothing by itself - it's literally a piece of metadata stapled to a function or field or argument, etc.

This metadata is accessible either at compile-time using annotation processors, or at runtime using reflection. The code that accesses the annotation and implements its behavior can be essentially anywhere in your class path. In the worst case, you'll have to go through every single usage site of the annotation to find the place where its behavior is defined.

(As a bonus, annotations can be annotated themselves, too - so you'll potentially have to repeat the process recursively)

zahlman

> Who here has another language that they knew pretty well before learning Python? (most hands go up) Great! Terrific! That’s a superpower you have that I can never have. I can never unlearn Python, become fluent in another language, and then learn Python again. You have this perspective that I can’t have.

It felt a little strange to read this. I knew many languages "pretty well" (or would have said so at the time) before Python, but it's been my primary programming language pretty much the entire time since I first picked it up 20 years ago. So I do have that perspective (I also spend a fair amount of time thinking about PL design in the abstract), but I don't feel "PSL" at all.

> Does the value stored in `a` know the name of that variable?

Okay, I'm stumped - in what other language would it "know" (I assume this means automatically recording the string "a" as part of the object's state in some way)? Or is this just referring to debugging information added in some environments?

> Python = is like Java, it’s always a reference and never a copy (which it is by default in C++).

Java has primitive types, for which it is a copy. But it's more accurate IMO to attribute this as a property of names, rather than a property of assignment. Python's names (and most Java values, and some C# values) have reference semantics. Having understood that properly, saying that Python (like Java) is "pass by assignment" then just follows naturally, and all the usual confusion around this point vanishes. Passing an argument to a function is simply assigning it to the corresponding parameter. (C++'s idea of "references" in the type system is far more complex IMO - e.g. https://langdev.stackexchange.com/q/3798/_/3807#3807 .)
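A short sketch of what "pass by assignment" means in practice, with made-up function names: mutating the object is visible to the caller, while rebinding the parameter name is not.

```python
def append_item(items):
    items.append(99)   # mutates the object the caller's name also refers to


def rebind(items):
    items = [0]        # rebinds only the local name; the caller is unaffected
    return items


data = [1, 2]
alias = data           # plain assignment: a second name for the same list
append_item(data)
rebind(data)

print(data, alias is data)
```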

> We have both reference counting and garbage collection in Python.

This is strange phrasing, because reference counting is a garbage collection strategy. I assume that here "garbage collection" is meant to refer to mark-and-sweep etc. strategies that operate in a separate pass, timed independently of the explicit code. But also, the reference-counting system is a CPython implementation detail. (Although it does look like PyPy tries to emulate that, at least: https://doc.pypy.org/en/default/discussion/rawrefcount.html )
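A rough sketch of the two mechanisms as they show up in CPython. This relies on CPython's reference counting and is not guaranteed to behave the same way on other implementations. The class `Node` is made up for illustration.

```python
import gc
import weakref


class Node:
    pass


gc.disable()  # keep the cycle collector from running on its own during the demo

# No cycle: in CPython the object is freed as soon as its last reference goes away.
solo = Node()
solo_ref = weakref.ref(solo)
del solo
freed_by_refcount = solo_ref() is None

# A reference cycle: the counts never reach zero by themselves.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
alive_before_collect = cycle_ref() is not None

gc.collect()  # the separate cycle-detecting pass finds and frees the pair
freed_by_collector = cycle_ref() is None

gc.enable()
print(freed_by_refcount, alive_before_collect, freed_by_collector)
```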

> I’ll translate this one to the transcript later - for now you’ll have to watch it because the visual is important: explicit super.

Raymond Hettinger's video is also a useful reference here: https://www.youtube.com/watch?v=EiOglTERPEo

> It takes less characters in Python!

This checks out, but it would be fairer to show what it looks like on the declaration side rather than just the usage side.

ForTheKidz

> This is strange phrasing, because reference counting is a garbage collection strategy.

Oof, don't open that can of worms. Let me summarize the result: 1) the distinction is arbitrary and 2) people are very passionate about their personal view being considered canonically correct. To the extent that I've started referring to both mark-and-sweep/Boehm garbage collection and reference counting as "automatic memory management".

KerrAvon

I think people get unhappy that their favorite language doesn’t have real GC and it becomes an extension of their language advocacy. Python didn’t have GC for a long time.

makeitdouble

Not everyone wants GC either, so the mix of feelings must have been complex.

I didn't much follow these kinds of holy wars, but I've seen people leave Java partly to get away from GC, same as people in the Objective-C community being pretty happy to get ARC instead of heavier GC systems.

ForTheKidz

Let's just agree to disagree. I don't see any value fighting over these words.

MrJohz

> Java has primitive types, for which it is a copy. But it's more accurate IMO to attribute this as a property of names, rather than a property of assignment. Python's names (and most Java values, and some C# values) have reference semantics. Having understood that properly, saying that Python (like Java) is "pass by assignment" then just follows naturally, and all the usual confusion around this point vanishes. Passing an argument to a function is simply assigning it to the corresponding parameter. (C++'s idea of "references" in the type system is far more complex IMO - e.g. https://langdev.stackexchange.com/q/3798/_/3807#3807 .)

I use pass-by-label for this term, after a visual demonstration of variables that showed them as labels that attach to different values/objects. But I like "pass-by-assignment" a lot — it gives the correct intuition that whatever is happening when I do `x = 5` is also happening when I do `def func(x): pass; func(5)`.

hexane360

Well, except in the case of variable shadowing.

FreakLegion

> > Does the value stored in `a` know the name of that variable?

> Okay, I'm stumped - in what other language would it "know" (I assume this means automatically recording the string "a" as part of the object's state in some way)? Or is this just referring to debugging information added in some environments?

Python is (or can be) the anomaly here because of descriptors, specifically __set_name__: https://docs.python.org/3/reference/datamodel.html#object.__...

I'm partial to using descriptors to reference attribute names (as opposed to using arbitrary strings) so that refactoring tools recognize them. That looks roughly like:

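    from typing import Any, Generic, TypeVar, overload

    # `Param` is a separate type whose definition is not shown here.
    _T_co = TypeVar("_T_co", covariant=True)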

    class Parameter(Generic[_T_co]):
        """
        Here we push descriptors a bit further by using overloads to distinguish
        type-level vs. instance-level references. Given:

            ```
            class C:
                param: Parameter[Param] = Param(...)

            c = C(param=Param.VALUE)
            ```

            `c.param` operates as attribute accesses normally do, invoking the
            descriptor with `instance=c` and transparently returning `Param.VALUE`.

            `C.param`, on the other hand,  calls `__get__` with `instance=None`,
            which is recognized by the overload as returning `Param` itself.

            Thus `C.param.parameter_name` returns the string `param`, is typed
            correctly, and if you need to find or update references to `param`
            in the future, `setattr(c, C.param.parameter_name)` will be handled,
            where `setattr(c, "param")` wouldn't have been.
        """

        @overload
        def __get__(self, instance: None, owner: type) -> Param: ...

        @overload
        def __get__(self, instance: Any, owner: type) -> _T_co: ...

        def __get__(self, instance: object, owner: type) -> Param | _T_co: ...

        def __set__(self, instance: object, value: Param | _T_co) -> None: ...

        parameter_name: str
        """The parameter name this param has been assigned to."""

        def __set_name__(self, _owner: type, parameter_name: str) -> None:
            # Descriptor protocol to capture the field name as a string for reference
            # later. This spares us having to write parameter names as strings, which
            # is both error-prone and not future-proof.

            # The type may be immutable, with dot assignment and normal setattr not
            # available.
            object.__setattr__(self, "parameter_name", parameter_name)
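For readers who want something runnable, here is a stripped-down sketch of the same idea without the typing overloads. The class names `Named` and `C` are hypothetical, and storing the value in the instance `__dict__` is a simplification, not necessarily how the version above does it.

```python
class Named:
    # Hypothetical minimal descriptor: records the attribute name it was
    # assigned to in the class body, via __set_name__.
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access returns the descriptor itself
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class C:
    param = Named()


c = C()
setattr(c, C.param.name, 42)  # no hard-coded "param" string to miss in a rename

print(C.param.name, c.param)
```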

zahlman

Neat. That actually was beyond what I'd committed to memory about the object/class model. But this only applies within a class body, which seems to be excluded in the article's example (since it isn't indented).

Townley

In C# you can do

    string foo = "bar";
    string nameOfFoo = nameof(foo); // "foo"

Kinda nice for iterating through lists of variables and saving them to a key/value map

zahlman

Interesting, but it isn't causing the value (here, the "bar" string object) to know about that name.

yyyk

nameof() is a compile time method, not a runtime method like most dynamic language equivalents.

8n4vidtmkvmk

Name of! Never knew this existed. I think it would make writing some debug statements easier and immune to renames.

daveguy

Title needs (2018). Data on language usage growth goes to 2018.

zmmmmm

I consider pretty much anything written pre-LLM about learning languages to be dated these days. LLMs don't necessarily help you learn, but they completely transform the challenge of doing it into an entirely different problem space.

svilen_dobrev

i have taught quite a few people of various backgrounds and levels. And still provide a short crash-course, once in a while.

IMO in this article the elephant is missing, esp. for C/C++/java people: there are no variables as storage-spaces in Python. There are names/labels in namespaces, attached/referencing some value (stored in VM), and that is taken/examined at runtime-usage of those labels/names. The values themselves are living on other side of the fence, inside the virtual machine.

The most important piece of the language Python in docs is the execution model, and in it, the "naming and binding" part:

https://docs.python.org/3.11/reference/executionmodel.html#n...

i.e. what constitutes a namespace, how are they stacked at declare-time or run-time (visibility/shadowing), lifetimes, scopes and all that.

The most usual gotcha is that inside of class xyz: ... is only a temporary namespace - and becomes class' attributes when closed (well, subject to metaclass behaviour).
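A small sketch of that gotcha, with a made-up class: names bound in the class body are not visible as bare names inside methods, because the class body is not an enclosing scope for them.

```python
class Settings:
    default = 10  # bound in the temporary class-body namespace

    def broken(self):
        # Bare name lookup checks locals, enclosing functions, module globals,
        # and builtins. The class body is skipped, so this raises NameError.
        return default

    def works(self):
        # Attribute lookup on the instance finds the class attribute.
        return self.default


s = Settings()
try:
    s.broken()
    raised = False
except NameError:
    raised = True

print(raised, s.works())
```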

i usually present the "storage" as "imagine 2 notebooks of ~transparent-sheets" : one is the names' visibility (so called "variables") - namespace-hierarchy, where each sheet is one namespace level, and you see only the inner-most page that has something in certain row - shadowing all those towards the root at module level; the other is the attribute-on-something visibility, where the instance (if any, of whatever type) is (only one and) always on top, and then inherited classes (not instances) underneath one under another. Also, all instance-or-static methods in Python are virtual, i.e. using the topmost available - while class-methods are not, they use the specified-or-below but not above.

Frankly, took me ~10 years, occasionally rereading that chapter each time when hitting new category of errors, to grasp the whole of it. Each comma there matters. Take your time.

Edit: this "names-only" thing is similar in javascript, and other interpreted languages. Python also exposes the whole VM-runtime - from programmer's side of the fence - i.e. locals() is the current namespace, inspect.stack() gives the current namespace hierarchy, etc. Self-reflection can be taken quite far
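A rough sketch of that introspection, with made-up function names. Here `locals()` shows the names bound in the current function, and `inspect.stack()` reports the call stack (which function called which), innermost first.

```python
import inspect


def outer():
    label = "outer value"

    def inner():
        label = "inner value"  # a new name in inner's namespace; shadows outer's
        # Function names of the two innermost frames on the call stack.
        callers = [frame.function for frame in inspect.stack()[:2]]
        return dict(locals()), callers

    inner_names, callers = inner()
    return label, inner_names, callers


label, inner_names, callers = outer()
print(label, inner_names["label"], callers)
```

The assignment inside `inner` leaves `outer`'s `label` untouched, which is the shadowing behaviour described in the "naming and binding" chapter.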

jebarker

> The most important piece of the language Python in docs is...

Asking as an intermediate python programmer (in my assessment!): what is the importance of knowing this? Is it essential to know to _use_ python correctly or to _understand_ the language implementation?

svilen_dobrev

it's only about usage. The Rules of the language. implementation does not matter, CPython behaves the same as IronPython or Jython or what-have-you - at least for this basic stuff.

if you use python in very procedural way, and never reuse things in time, then probably does not matter. But, thing is, python is not very procedural, and, most of the stuff happens and depends in time as well as in space/storage/memory. And those rules draw the boundaries which is where and when, and why.

One weird way of looking at these may be that "variables" - i.e. names - are somewhat like attributes of some superficial "instance" called namespace, with leaner syntax; but with different behavior than plain attributes on things (btw those are quite deep). And there are no other ways to access and juggle values/things - so one had better know them.

Read that chapter once. You may or may not grasp everything (and don't need to). Then forget about it. Next time you step on weird error of some xyz being 4 when it was assigned 67, re-read it. Repeat..