
Pipelining might be my favorite programming language feature

invalidator

The author keeps calling it "pipelining", but I think the right term is "method chaining".

Compare with a simple pipeline in bash:

  grep needle < haystack.txt | sed 's/foo/bar/g' | xargs wc -l
Each of those components executes in parallel, with the intermediate results streaming between them. You get a similar effect with coroutines.
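For a taste of that coroutine effect in Ruby, here's a toy sketch using Fiber (Ruby's coroutine primitive); it streams one stripped line at a time instead of materializing arrays:

  # producer yields one stripped line at a time; the consumer resumes it on demand
  producer = Fiber.new do
    File.foreach("haystack.txt") { |line| Fiber.yield(line.strip) }
    nil # exhausted: makes the loop below terminate
  end

  while (line = producer.resume)
    puts line.gsub('foo', 'bar') if line.include?('needle')
  end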

Compare Ruby:

  data = File.readlines("haystack.txt")
    .map(&:strip)
    .grep(/needle/)
    .map { |i| i.gsub('foo', 'bar') }
    .map { |i| File.readlines(i).count }
In that case, the steps run one after another, each consuming the whole array and materializing a complete new array before the next step begins. Nothing actually gets pipelined.

Despite it being clean and readable, I don't tend to do it any more, because it's harder to debug. More often these days, I write things like this:

  data = File.readlines("haystack.txt")
  data = data.map(&:strip)
  data = data.grep(/needle/)
  data = data.map { |i| i.gsub('foo', 'bar') }
  data = data.map { |i| File.readlines(i).count }
It's ugly, but you know what? I can set a breakpoint anywhere and inspect the intermediate states without having to edit the script in prod. Sometimes ugly and boring is better.

wahern

> The author keeps calling it "pipelining", but I think the right term is "method chaining". [...] You get a similar effect with coroutines.

The inventor of the shell pipeline, Douglas McIlroy, always understood the equivalence between pipelines and coroutines; it was deliberate. See https://www.cs.dartmouth.edu/~doug/sieve/sieve.pdf. It goes even deeper than it appears, too. The way pipes were originally implemented in the Unix kernel was that when the pipe buffer was filled[1] by the writer, the kernel continued execution directly in the blocked reader process without bouncing through the scheduler. Effectively, arguably literally, coroutines: one process calls the write function and execution continues with a read call returning the data.

Interestingly, Solaris Doors operate the same way by design--no bouncing through the scheduler--unlike pipes today: long ago, I think, most Unix kernels moved away from direct execution switching to better support multiple readers, etc.

[1] Or even on the first write? I'd have to double-check the source again.

marhee

I don't find your "seasoned developer" version ugly at all. It just looks more mature and relaxed. It also has the benefit that you can actually do error handling and have space to add comments. Maybe people don't like it because of the repetition of "data =", but in fact you could use descriptive new variable names, making the code even more readable (self-documenting). I've always felt method chaining looks "cramped", if that's the right word. Like a person drawing on paper but only using the upper left corner. However, this surely is also a matter of preference, or what you're used to.

freehorse

I have a lot of code like this. The reason I prefer pipelines now is the mental overhead of understanding the intermediate step variables.

Something like

  lines = File.readlines("haystack.txt")
  stripped_lines = lines.map(&:strip)
  needle_lines = stripped_lines.grep(/needle/)
  transformed_lines = needle_lines.map { |line| line.gsub('foo', 'bar') }
  line_counts = transformed_lines.map { |file_path| File.readlines(file_path).count }
is hell to read and understand later, imo. You have to wade through a lot of intermediate variables that do not matter anywhere else in the code after you set them up, but you do not necessarily know in advance which matter and which don't unless you read and understand all of it. It also pollutes your workspace with too much stuff, so while this style makes debugging easier, it makes reading the code some time later harder. It becomes even clumsier if you need to reuse the code; you probably need to define a function block then, which moves the clumsiness there.

What I do now is start by defining each step's transformation as a pure function, chain them once everything works, and enclose the chain in an error handler so that I depend on breakpoint debugging less.
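Something like this, as a minimal Ruby sketch of that workflow (names invented; Proc#>> composes procs left to right):

  # each step is a pure function; compose them only once they work
  strip_all   = ->(lines) { lines.map(&:strip) }
  find_needle = ->(lines) { lines.grep(/needle/) }
  fix_foo     = ->(lines) { lines.map { |l| l.gsub('foo', 'bar') } }

  pipeline = strip_all >> find_needle >> fix_foo

  begin
    result = pipeline.call(File.readlines("haystack.txt"))
  rescue => e
    warn "pipeline failed: #{e.message}" # one place to handle errors
  end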

There is certainly a trade-off, but as a codebase grows larger and deals with more cases where the same code needs to be applied, the benefits of a concise yet expressive notation show.

deredede

Code in this "named-pipeline" style is already self-documenting: using the same variable name makes it clear that we are dealing with a pipeline/chain. Using more descriptive names for the intermediate steps hides this, making each line more readable (and even then you're likely to end up with `dataStripped = data.map(&:strip)`) at the cost of making the block as a whole less readable.

pragma_x

> Maybe people don’t like it because of the repetition of “data =“

Eh, at first glance it looks "amateurish" due to all the repeated stuff. Chaining explicitly eliminates redundant operations - a more minimal representation of data flow - so it looks more "professional". But I also know better than to act on that impulse. ;)

That said, it really depends on the language at play. Some will compile all the repetition of `data =` away such that the variable's memory isn't re-written until after the last operation in that list; it'll hang out in a register or on the stack somewhere. Others will run the code exactly as written, bouncing data between the heap, stack, and registers - inefficiencies and all.

IMO, a comment like "We wind up debugging this a lot, please keep this syntax" would go a long way to help the next engineer. Assuming that the actual processing dwarfs the overhead present in this section, it would be even better to add discrete exception handling and post-conditions to make it more robust.

ehnto

In most debuggers I have used, if you put a breakpoint on the first line of the method chain, you can "step over" each function in the chain until you get to the one you want.

Bit annoying, but serviceable. Though there's nothing wrong with your approach either.

grimgrin

Debuggers can take it even further if they want that UX. In Firefox, given a chain of foo().bar().baz(), you can set a breakpoint on any of 'em.

https://gist.github.com/user-attachments/assets/3329d736-70f...

runeks

> The author keeps calling it "pipelining", but I think the right term is "method chaining".

Allow me, too, to disagree. I think the right term is "function composition".

Instead of writing

  h(g(f(x)))
as a way to say "first apply f to x, after which g is applied to the result of this, after which h is applied to the result of this", we can use function composition to compose f, g and h, and then "stuff" the value x into this "pipeline of composed functions".

We can use whatever syntax we want for that, but I like Elm syntax which would look like:

  x |> f >> g >> h

billdueber

If you add in a call to `.lazy`, it won't create all the intermediate arrays. It's been there since at least 2.7. https://ruby-doc.org/core-2.7.0/Enumerator/Lazy.html
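Applied to the example upthread, a sketch: `.lazy` wraps the chain in an Enumerator::Lazy and `.force` realizes it at the end, so each line flows through every step without intermediate arrays.

  data = File.readlines("haystack.txt")
    .lazy                               # defer: no intermediate arrays between steps
    .map(&:strip)
    .grep(/needle/)
    .map { |i| i.gsub('foo', 'bar') }
    .force                              # evaluate the whole chain here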

dorfsmay

I do the same with Python, replacing multilevel comprehensions with intermediate steps of generator expressions, which are lazy and therefore avoid the performance and memory cost of materializing each step.

https://peps.python.org/pep-0289/
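For instance, a minimal sketch (hypothetical filters; each step is a generator expression, so lines stream through the chain one at a time instead of materializing intermediate lists):

  stripped    = (line.strip() for line in open("haystack.txt"))
  matched     = (line for line in stripped if "needle" in line)
  transformed = (line.replace("foo", "bar") for line in matched)

  for line in transformed:  # nothing is evaluated until we iterate here
      print(line)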

zelphirkalt

Ultimately it will depend on the functions being chained. If they can work with one part of the result, or a subset of parts, then they might not block; otherwise they will still need the complete result, and laziness cannot help.

hbogert

Not much different from having a `sort` in a shell pipeline, I guess?

snthpy

I think the best term is "function composition", but with a particular syntax, so "pipelining" seems alright. Method chaining is a common special case: some base object is repeatedly modified by an action, and the object reference is returned by the "method", thus allowing the "chaining". But what if you're not dealing with objects and methods? The pipelined composition pattern is more general than method chaining, imho.

You make an interesting point about debugging which is something I have also encountered in practice. There is an interesting tension here which I am unsure about how to best resolve.

In PRQL we use the pipelining approach by using the output of the last step as the implicit last argument of the next step. In M Lang (MS Power BI/Power Query), which is quite similar in many ways, they use the second approach, in which each step has to be named. This is very useful for debugging, as you point out, but also a lot more verbose and can be tedious. I like both but prefer the ergonomics of PRQL for interactive work.

Update: Actually, PRQL has a decent answer to this. Say you have a query like:

    from invoices
    filter total > 1_000
    derive invoice_age = @2025-04-23 - invoice_date
    filter invoice_age > 3months
and you want to figure out why the result set is empty. You can pipe the results into an intermediate reference like so:

    from invoices
    filter total > 1_000
    into tmp
    
    from tmp
    derive invoice_age = @2025-04-23 - invoice_date
    filter invoice_age > 3months
So, good ergonomics on the happy path and a simple enough workaround when you need it. You can try these out in the PRQL Playground btw: https://prql-lang.org/playground/

AdieuToLogic

> The author keeps calling it "pipelining", but I think the right term is "method chaining".

I believe the correct definition for this concept is the Thrush combinator[0]. In some ML-based languages[1], such as F#, the |> operator is defined[2] for this purpose:

  [1..10] |> List.map (fun i -> i + 1)
Other functional languages have libraries which also provide this operator, such as the Scala Mouse[3] project.

0 - https://leanpub.com/combinators/read#leanpub-auto-the-thrush

1 - https://en.wikipedia.org/wiki/ML_(programming_language)

2 - https://fsharpforfunandprofit.com/posts/defining-functions/

3 - https://github.com/typelevel/mouse?tab=readme-ov-file

ehnto

I'm not sure that's right: method chaining is just immediately acting on the return value of the previous call, directly. It doesn't pass the return value into the next function like a pipeline does; the method must exist on the returned object. That is different from pipelines or thrush operators. Evaluation happens in the order it is written.

Unless I misunderstood the author, because method chaining is super common where I feel thrush operators are pretty rare, I would be surprised if they meant the latter.

bccdee

They cite Gleam explicitly, which has a thrush operator in place of method chaining.

I get the impression (though I haven't checked) that the thrush operator is a backport of OOP-style method chaining to functional languages that don't support dot-method notation.

ses1984

Shouldn’t modern debuggers be able to handle that easily? You can step in, step out, until you get where you want, or you could set a breakpoint in the method you want to debug instead of at the call site.

abirch

Even if your debugger can't do that, an AI agent can easily change the code for you to add intermediate output.

bccdee

...an AI agent can independently patch your debugger to modify the semantics? Wow that's crazy.

Incidentally, have you ever considered investing in real estate? I happen to own an interest in a lovely bridge which, for personal reasons, I must suddenly sell at a below-market price.

bnchrch

I'm personally someone who advocates for languages to keep their feature set small and shoot to achieve a finished feature set quickly.

However.

I would be lying if I didn't secretly wish that all languages adopted the `|>` syntax from Elixir.

  params
  |> Map.get("user")
  |> create_user()
  |> notify_admin()

Cyykratahk

We might be able to cross one more language off your wishlist soon: JavaScript is on the way to getting a pipeline operator, and the proposal is currently at Stage 2.

https://github.com/tc39/proposal-pipeline-operator

I'm very excited for it.

chilmers

It also has barely seen any activity in years. It is going nowhere. The TC39 committee is utterly dysfunctional and anti-progress, and will not let this or any other new syntax into JavaScript. Records and Tuples has just been killed, despite being cited in surveys as a major missing feature[1]. Pattern matching is stuck in stage 1 and hasn't been presented since 2022. Ditto for type annotations and a million other things.

Our only hope is if TypeScript finally gives up on the broken TC39 process and starts to implement its own syntax enhancements again.

[1] https://2024.stateofjs.com/en-US/usage/#top_currently_missin...

tkcranny

I wouldn’t hold your breath for TypeScript introducing any new supra-JS features. In the old days they did a little bit, but now those features (namely enums) are considered harmful.

More specifically, with the (also ironically gummed up in tc39) type syntax [1], and importantly node introducing the --strip-types option [2], TS is only ever going to look more and more like standards compliant JS.

[1] https://tc39.es/proposal-type-annotations/

[2] https://nodejs.org/en/blog/release/v22.6.0

johnny22

Records and Tuples weren't stopped because of tc39, but rather the engine developers. Read the notes.

TehShrike

I was excited for that proposal, but it veered off course some years ago – some TC39 members have stuck to the position that without member property support or async/await support, they will not let the feature move forward.

It seems like most people are just asking for the simple function piping everyone expects from the |> syntax, but that doesn't look likely to happen.

packetlost

I don't actually see why `|> await foo(bar)` wouldn't be acceptable if you must support futures.

I'm not a JS dev so idk what member property support is.

zdragnar

I worry about "soon" here. I've been excited for this proposal for years now (8 maybe? I forget), and I'm not sure it'll ever actually get traction at this point.

gregabbott

A while ago, I wondered how close you could get to a pipeline operator using existing JavaScript features. In case anyone might like to have a look, I wrote a proof-of-concept function called "Chute" [1]. It chains function and method calls in a dot-notation style like the basic example below.

  chute(7)        // setup a chute and give it a seed value
  .toString       // call methods of the current data (parens optional)
  .parseInt       // send the current data through global native Fns
  .do(x=>[x])     // through a chain of one or more local / inline Fns
  .JSON.stringify // through nested global functions (native / custom)
  .JSON.parse
  .do(x=>x[0])
  .log            // through built in Chute methods
  .add_one        // global custom Fns (e.g. const add_one=x=>x+1)
  ()              // end a chute with '()' and get the result
[1] https://chute.pages.dev/ | https://github.com/gregabbott/chute

hinkley

All of their examples are wordier than just function chaining and I worry they’ve lost the plot somewhere.

They list this as a con of F# (also Elixir) pipes:

    value |> x=> x.foo()
The insistence on an arrow function is pure hallucination:

    value |> x.foo()
Should be perfectly achievable as it is in these other languages. What’s more, doing so removes all of the handwringing about await. And I’m frankly at a loss why you would want to put yield in the middle of one of these chains instead of after.

hoppp

Cool I love it, but another thing we will need polyfills for...

hathawsh

I believe you meant to say we will need a transpiler, not polyfill. Of course, a lot of us are already using transpilers, so that's nothing new.

bobbylarrybobby

How do you polyfill syntax?

valenterry

I prefer Scala. You can write

  params.get("user") |> create_user |> notify_admin
Even more concise, and it doesn't even require a special language feature; it's just regular syntax of the language (`|>` is a method like `.get(...)`, so you could even write `params.get("user").|>(create_user)` if you wanted to).

elbasti

In Elixir, `params |> Map.get("user") |> create_user |> notify_admin` would also be valid, standard Elixir, just not idiomatic (parens are optional but preferred in most cases, and one-line pipes are also frowned upon except for scripting).

MaxBarraclough

With the disclaimer that I don't know Elixir and haven't programmed with the pipeline operator before: I don't like that special () syntax. That syntax denotes application of the function without passing any arguments, but the whole point here is that an argument is being passed. It seems clearer to me to just put the pipeline operator and the name of the function that it's being used with. I don't see how it's unclear that application is being handled by the pipeline operator.

Also, what if the function you want to use is returned by some nullary function? You couldn't just do |> getfunc(), as presumably the pipeline operator will interfere with the usual meaning of the parentheses and will try to pass something to getfunc. Would |> ( getfunc() ) work? This is the kind of problem that can arise when one language feature is permitted to change the ordinary behaviour of an existing feature in the name of convenience. (Unless of course I'm just missing something.)

valenterry

Oh that's nice!

agent281

Isn't it being a method call not quite equivalent? Are you able to define the method over arbitrary data types?

In Elixir, it is just a macro so it applies to all functions. I'm only a Scala novice so I'm not sure how it would work there.

valenterry

> Are you able to define the method over arbitrary data types?

Yes exactly, which is why it is not equivalent. No macro needed here. In Scala 2 syntax:

  implicit class AnyOps[A](private val a: A) extends AnyVal {
    def |>[B](f: A => B) = f(a)
  }

AdieuToLogic

> I would be lying if I didn't secretly wish that all languages adopted the `|>` syntax from Elixir.

This is usually the Thrush combinator[0]. It exists in other languages as well and can be informally defined as:

  f(g(x)) = g(x) |> f
0 - https://leanpub.com/combinators/read#leanpub-auto-the-thrush

Munksgaard

Not quite. Note that the Elixir pipe puts the left-hand side of the pipe as the first argument of the right-hand function. E.g.

    x |> f(y) = f(x, y)
As a result, the Elixir variant cannot be defined as a well-typed function, but must be a macro.

AlchemistCamp

I've been using Elixir for a long time and had that same hope, after having experienced how clear, concise and maintainable apps can be when the core is all a bunch of pipelines (and the boundary does error handling using cases and withs). But having seen the pipe operator in Ruby, I now think it was a bad idea.

The problem is that method-chaining is common in several OO languages, including Ruby. This means the functions on an object return an object, which can then call other functions on itself. In contrast, the pipe operator calls a function, passing in what's on the left side of it as the first argument. To work properly, this means you'll need functions that take the data as the first argument and return the same shape, whether that's a list, a map, a string or a struct, etc.

When you add a pipe operator to an OO language where method-chaining is common, you'll start getting two different types of APIs and it ends up messier than if you'd just stuck with chaining method calls. I much prefer passing immutable data into a pipeline of functions as Elixir does it, but I'd pick method chaining over a mix of method chaining and pipelines.

rkangel

I'm a big fan of the Elixir operator, and it should be standard in all functional programming languages. You need it because everything is just a function and you can't do anything like method chaining: none of the return values have anything like methods. The |> is "just" syntax sugar for a load of nested functions, whereas Rust-style method chaining doesn't need language support; it's more of a programming style.

Note also that it works well in Elixir because it was created at the same time as most of the standard library. That means that the standard library takes the relevant argument in the first position all the time. Very rarely do you need to pipe into the second argument (and you need a lambda or convenience function to make that work).

matthewsinclair

Agree. This is absolutely my fave part of Elixir. Whenever I can get something to flow elegantly thru a pipeline like that, I feel like it’s a win against chaos.

mvieira38

R has a lovely toolkit for data science using this syntax, called the tidyverse. It's my favorite dev experience; it's so easy to just write code.

jasperry

Yes, a small feature set is important, and adding the functional-style pipe to languages that already have chaining with the dot seems to clutter up the design space. However, dot-chaining has the severe limitation that you can only pass to the first or "this" argument.

Is there any language with a single feature that gives the best of both worlds?


bnchrch

FWIW you can pass to arguments other than the first in this syntax:

  params
  |> Map.get("user")
  |> create_user()
  |> (&notify_admin("signup", &1)).()
or

  params
  |> Map.get("user")
  |> create_user()
  |> (fn user -> notify_admin("signup", user) end).()

Terr_

BTW, there's a convenience macro, Kernel.then/2 [0], which IMO looks a little cleaner:

    params
    |> Map.get("user")
    |> create_user()
    |> then(&notify_admin("signup", &1))

    params
    |> Map.get("user")
    |> create_user()
    |> then(fn user -> notify_admin("signup", user) end)

[0] https://hexdocs.pm/elixir/1.18.3/Kernel.html#then/2

AndyKluger

Do concatenative langs like Factor fit the bill?

Straw

Lisp macros allow a general solution to this that doesn't just handle chained collection operators but allows you to decide the order in which you write any chain of calls.

For example, we can write: (foo (bar (baz x))) as (-> x baz bar foo)

If there are additional arguments, we can accommodate those too: (sin (* x pi)) as (-> x (* pi) sin)

where the expression so far gets inserted as the first argument to any form. If you want it inserted as the last argument, you can use ->> instead:

(filter positive? (map sin x)) as (->> x (map sin) (filter positive?))

You can also get full control of where to place the previous expression using as->.

Full details at https://clojure.org/guides/threading_macros

gleenn

I find the threading operators in Clojure bring much joy and increase readability. I think it's interesting because it makes me actually consider function argument order much more because I want to increase opportunities to use them.

aeonik

These threading macros can increase performance; the library's developer even provides a parallelizing threading macro.

I use these with xforms transducers.

https://github.com/johnmn3/injest

benrutter

Yeah, I found this when I was playing around with Hy a while back. I wanted a generic `->` style operator, and it wasn't too much trouble to write a macro to introduce one.

That's sort of an argument for the existence of macros as a whole: you can't really do this as neatly in something like Python (although I've tried). I can see the downside of working in a codebase with hundreds of these kinds of custom language features, though.

sooheon

Yes threading macros are so much nicer than method chaining, because it allows general function reuse, rather than being limited to the methods that happen to be defined in your initial data object.

duped

A pipeline operator is just partial application with less power. You should be able to bind any number of arguments in any positions to create a new function, and "pipe" its output(s) to any number of other functions.

One day, we'll (re)discover that partial application is actually incredibly useful for writing programs and (non-Haskell) languages will start with it as the primitive for composing programs instead of finding out that it would be nice later, and bolting on a restricted subset of the feature.
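A quick Ruby illustration of that restriction (made-up lambdas; Proc#curry binds strictly left to right, so binding any other position needs a wrapper):

  div = ->(x, y) { x.fdiv(y) }

  ten_over = div.curry[10]         # binds the FIRST slot: becomes y -> 10/y
  halve    = ->(x) { div.(x, 2) }  # binding the SECOND slot needs a wrapper lambda

  ten_over.(5)           # => 2.0
  [2, 4, 5].map(&halve)  # => [1.0, 2.0, 2.5]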

zelphirkalt

I like partial application like in Standard ML, but it also means that one must be very careful with the order of arguments, unless we get a variant of partial application that is flexible enough to let you specify which arguments you want to provide, instead of always assuming the first n arguments. I use "cut" for this in Scheme. Threading/pipelines are still very useful though, and can shorten things and make them very readable.

dayvigo

Sure. But how do you write that in a way that is expressive, terse, and readable all at once? Nothing beats x | y | z or (-> x y z). The speed of both writing and reading (and comprehending), the sheer simplicity, is what makes pipelining useful in the first place.

gpderetta

for loops are also gotos with less power, yet we usually prefer them.

choult

... and then recreate the scripting language...

stogot

I was just thinking does this not sound like a shell language? Using | instead of .function()

SimonDorfman

The tidyverse folks in R have been using that for a while: https://magrittr.tidyverse.org/reference/pipe.html

thom

I've always found magrittr mildly hilarious. R has vestigial Lisp DNA, but somehow the R implementation of pipes was incredibly long and complex and polluted stack traces, so it moved to a native C implementation, which nevertheless has to manipulate the SEXPs that secretly underlie the language. Compared to something like Clojure's threading macros, it's wild how much work is needed.

madcaptenor

And base R has had a pipe for a couple years now, although there are some differences between base R's |> and tidyverse's %>%: https://www.tidyverse.org/blog/2023/04/base-vs-magrittr-pipe...

steine65

R, specifically the tidyverse, has a special place in my heart. Tidy principles make data analysis easy to read and new functions easy to adopt, since there are standards that must be met to call a function "tidy."

Recently I started using Nushell, which feels very similar.

flobosg

Base R as well: |> was implemented as a pipe operator in 4.1.0.

tylermw

Importantly, the base R pipe implements the operation at the language parsing level, so it has basically zero overhead.

zelphirkalt

I would assume that most languages do that, or alternatively have a compiler that is smart enough to ensure there is no actual overhead in the compiled code.

mvieira38

R + tidyverse is the gold standard for working with data quickly in a readable and maintainable way, IMO. It's just absolutely seamless. Shoutout to tidyverts (https://tidyverts.org/) for working with time series, too

amai

Pipelining looks nice until you have to debug it. And exception handling is also very difficult, because it means adding forks into your pipelines. Pipelines are only good for programming the happy path.

mpalmer

At the risk of overgeneralized pronouncements: ease of debugging is usually down to how well-designed your tooling happens to be. Most of the time the framework/language does that for you, but it's not the only option.

And for exceptions, why not solve it in the data model, and reify failures? Push it further downstream, let your pipeline's nodes handle "monadic" result values.

Point being, it's always a tradeoff, but you can usually lessen the pain more than you think.

And that's without mentioning that a lot of "pipelining" is pure sugar over the same code we're already writing.

eikenberry

Pipelining simplifies debugging. Each step is obvious and it is trivial to insert logging between pipeline elements. It is easier to debug than the patterns compared in the article.

Exception handling is only a problem in languages that use exceptions. Fortunately there are many modern alternatives in wide use that don't use exceptions.

switchbak

This is my experience too - when the errors are encoded into the type system, this becomes easier to reason about (which is much of the work when you’re debugging).

w4rh4wk5

Yes, certainly!

I've encountered and used this pattern in Python, Ruby, Haskell, Rust, C#, and maybe some other languages. It often feels nice to write, but reading can easily become difficult -- especially in Haskell where obscure operators can contain a lot of magic.

Debugging them interactively can be equally problematic, depending on the tooling. I'd argue it's commonly harder to debug a pipeline than the equivalent imperative code and that, in the best case, it's equally hard.

jim-jim-jim

I don't know what you're writing, but this sounds like language smell. If you can represent errors as data instead of exceptions (Either, Result, etc) then it is easy to see what went wrong, and offer fallback states in response to errors.

Programming should be focused on the happy path. Much of the syntax in primitive languages concerning exceptions and other early returns is pure noise.

rusk

Established debugging tools and logging rubrics are not suitable for debugging heavily pipelined code. Stack traces and debuggers rely heavily on line-based references, which are less useful in this style and can make diagnostic practices feel a little clumsy.

The old adage of not writing code so smart you can’t debug it applies here.

Pipelining runs contrary enough to standard imperative patterns. You don’t just need a new mindset to write code this way. You need to think differently about how you structure your code overall and you need different tools.

That’s not to say that doing things a different way isn’t great, but it does come with baggage that you need to be in a position to carry.

hnlmorg

Pipelining is just syntactic sugar for nested function calls.

If you need to handle an unhappy path in a way that isn't optimal for nested function calls, then you shouldn't be nesting your function calls. Pipelining doesn't magically make things easier or harder in that regard.

But if a particular sequence of function calls does suit nesting, then pipelining makes the code much more readable, because you're not mixing right-to-left syntax (function nests) with left-to-right syntax (i.e. your typical language syntax).

EVa5I7bHFq9mnYK

I think they are talking about nested loops, not nested function calls.

hnlmorg

Nested loops aren't pipelining. Some of the examples make heavy use of lambdas, so they do have nested loops happening as well, but in those examples the pipelining logic is still the nesting of the lambda functions.

Crudely put, in C-like languages, pipelining is just a way of turning

  fn(fn(fn()))
Where the first function call is in the inner, right-most, parentheses,

into this:

  fn | fn | fn
…which can be easily read sequentially from left-to-right.


bsder

Pipelining is also nice until you have to use it for everything because you can't do alternatives (like default function arguments) properly.

Rust chains everything because of this. It's often unpleasant (see: all the Rust GUI toolkits).

kordlessagain

While the author claims "semantics beat syntax every day of the week," the entire article focuses on syntax preferences rather than semantic differences.

Pipelining can become hard to debug when chains get very long. The author doesn't address how hard it can be to identify which step in a long chain caused an error.

They do make fun of Python, however, but don't say much about why they don't like it, other than showing a low-res photo of a rock with a pipe routed around it.

Ambiguity about what constitutes "pipelining" is the real issue here. The definition keeps shifting throughout the article. Is it method chaining? Operator overloading? First-class functions? The author uses examples that function very differently.

Mond_

> Pipelining can become hard to debug when chains get very long. The author doesn't address how hard it can be to identify which step in a long chain caused an error.

Yeah, I agree that this can be a problem when you lean heavily into monadic handling (i.e. you have fallible operations and then pipe the error or null all the way through, losing the information of where it came from).

But that doesn't have much to do with the article: You have the same problem with non-pipelined functional code. (And in either case, I think that it's not that big of a problem in practice.)

> The author uses examples that function very differently.

Yeah, this is addressed in one of the later sections. Imo, having a unified word for such a convenience feature (no matter how it's implemented) is better than thinking of these features as completely separate.

zelphirkalt

You can add peek steps in pipelines and inspect the in-between results. Not really any different from normal function call debugging, imo.
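In Ruby, for instance, Object#tap gives you that peek without restructuring the chain (a sketch on the thread's running example):

  data = File.readlines("haystack.txt")
    .map(&:strip)
    .tap { |x| p x.first(3) }             # peek: prints, then passes x through unchanged
    .grep(/needle/)
    .tap { |x| warn "#{x.size} matches" } # another peek mid-chain
    .map { |i| i.gsub('foo', 'bar') }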

krapht

Yes, but here's my hot take - what if you didn't have to edit the source code to debug it? Instead of chaining method calls you just assign to a temporary variable. Then you can set breakpoints and inspect variable values like you do normally without editing source.

It's not like you lose that much readability from

  foo(bar(baz(c)))

  c |> baz |> bar |> foo

  c.baz().bar().foo()

  t = c.baz()
  t = t.bar()
  t = t.foo()

Mond_

I feel like a sufficiently good debugger should allow you to place a breakpoint at any of the lines here, and it should break exactly at that specific line.

  fn get_ids(data: Vec<Widget>) -> Vec<Id> {
      data.iter()
          .filter(|w| w.alive)
          .map(|w| w.id)
          .collect()
  }
It sounds to me like you're asking for linebreaks. Chaining doesn't seem to be the issue here.

erichocean

The Clojure equivalent of `c |> baz |> bar |> foo` are the threading macros:

    (-> c baz bar foo)
But people usually put it on separate lines:

    (-> c
        baz
        bar
        foo)

andyferris

A debugger should let you inspect the value of any expression, not just variables.

bena

I think you may have misinterpreted his motive here.

Just before that statement, he says that it is an article/hot take about syntax. He acknowledges your point.

So I think when he says "semantics beat syntax every day of the week", that's him acknowledging that while he prefers certain syntax, it may not be the best for a given situation.

fsckboy

the paragraph you quoted (atm, 7 mins ago, did it change?) says:

>Let me make it very clear: This is [not an] article it's a hot take about syntax. In practice, semantics beat syntax every day of the week. In other words, don’t take it too seriously.

AYBABTME

It's just as difficult to debug when function calls are nested inline instead of assigning to variables and passing the variables around.

steine65

Agreed that long chains are hard to debug. I like to keep chains around the size of a short paragraph.

pavel_lishin

The article also clearly points that that it's just a hot-take, and to not take it too seriously.

epolanski

I personally like how effect-ts allows you to write both pipelines and imperative code to express the very same things.

Building pipelines:

https://effect.website/docs/getting-started/building-pipelin...

Using generators:

https://effect.website/docs/getting-started/using-generators...

Having both options is great (at the beginning Effect had only pipe-based pipelines). After years of writing Effect, I'm convinced that most of the time you'd rather write and read imperative code than pipelines, which definitely have their place in code bases.

In fact most of the community, at large, has converged on using imperative-style generators over pipelines. Having onboarded many devs, and having seen many long-time pipeliners converge on classical imperative control flow, seems to confirm that both debugging and maintenance are easier.

vitus

I think the biggest win for pipelining in SQL is the fact that we no longer have to explain that SQL execution order has nothing to do with query order, and we no longer have to pretend that we're mimicking natural language. (That last point stops being the case when you go beyond "SELECT foo FROM table WHERE bar LIMIT 10".)

No longer do we have to explain that expressions are evaluated in the order of FROM -> JOIN -> ON -> SELECT -> WHERE -> GROUP BY -> HAVING -> ORDER BY -> LIMIT (and yes, I know I'm missing several other steps). We can simply just express how our data flows from one statement to the next.

(I'm also stating this as someone who has yet to play around with the pipelining syntax, but honestly anything is better than the status quo.)
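For reference, the toy query above in pipe syntax, as I understand the GoogleSQL pipe-syntax proposal (untested sketch):

  FROM table
  |> WHERE bar
  |> SELECT foo
  |> LIMIT 10
Each step consumes the result of the previous one, so the written order matches the evaluation order.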

_dark_matter_

You flipped SELECT and WHERE, which probably just solidifies your point. I can't count the number of times I've seen this trip up analysts.

osigurdson

C# has had "Pipelining" (aka Linq) for 17 years. I do miss this kind of stuff in Go a little.

bob1029

I don't see how LINQ provides an especially illuminating example of what is effectively method chaining.

It is an exemplar of expressions [0] more than anything else, which have little to do with the idea of passing results from one method to another.

[0]: https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...

osigurdson

Example from article:

  fn get_ids(data: Vec<Widget>) -> Vec<Id> {
      data.iter()              // get iterator over elements of the list
          .filter(|w| w.alive) // use lambda to ignore tombstoned widgets
          .map(|w| w.id)       // extract ids from widgets
          .collect()           // assemble iterator into data structure (Vec)
  }
Same thing in 15 year old C# code:

  List<Guid> GetIds(List<Widget> data)
  {
      return data
          .Where(w => w.IsAlive())
          .Select(w => w.Id)
          .ToList();
  }

hahn-kev

So many things have been called LINQ over the years that it's hard to talk about at this point. I've written C# for many years now, and I'm not even sure what I would say it refers to, so I avoid the term.

In this case I would say extension methods are what he's really referring to, on top of which LINQ to Objects is built.

osigurdson

I'd say there are just two things:

1) The method chaining extension methods on IEnumerable<T> like Select, Where, GroupBy, etc. This is identical to the rust example in the article.

2) The weird / bad (in my opinion) language keywords analogous to the above such as "from", "where", "select" etc.

delusional

You might be talking about LINQ queries, while the person you are responding to is probably talking about LINQ in Method Syntax[1]

[1]: https://learn.microsoft.com/en-us/dotnet/csharp/linq/get-sta...

vjvjvjvjghv

Agreed. It would be nice if SQL databases supported something similar.

sidpatil

PRQL [1] is a pipeline-based query language that compiles to SQL.

[1] https://prql-lang.org/

NortySpock

I've used "a series of CTEs" to apply a series of transformations and filters, but it's not nearly as elegant as the pipe syntax.

singularity2001

I tried to convince the Julia authors to make a.b(c) synonymous with b(a,c), like in Nim (for similar reasons as in the article). They didn't like it.

sparkie

I don't like it either, because it promotes the method `b` to the global namespace. There may be many such `b` methods on different, unrelated types. I think the latter should be prefixed with the type name or module name:

   a.b(c) == AType.b(a, c)   (or AType::b(a, c) , C++ style)

singularity2001

It's the other way around: in Julia, functions like b are globally visible by default, and I just suggested optionally hiding them, or finding them via the object a.

queuebert

What were their reasons?

pansa2

I suspect:

Julia's multiple dispatch means that all arguments to a function are treated equally. The syntax `b(a, c)` makes this clear, whereas `a.b(c)` makes it look like `a` is in some way special.

0xf00ff00f

First example doesn't look bad in C++23:

    auto get_ids(std::span<const Widget> data)
    {
        return data
            | filter(&Widget::alive)
            | transform(&Widget::id)
            | to<std::vector>();
    }

uzerfcwn

To me, the cool part about C++ ranges (uncommon in other languages' standard libraries) is that they reify pipelines, so you can cut and paste them into variables, like so:

    auto get_ids(std::span<const Widget> data)
    {
        auto pipeline = filter(&Widget::alive) | transform(&Widget::id);
        auto sink = to<std::vector>();
        return data | pipeline | sink;
    }

Shorel

This looks awesome!

I really want to start playing with some C++23 in the future.

0xf00ff00f

I cheated a bit: I omitted the namespaces. Here's a working version: https://godbolt.org/z/1rE9o3Y95

inetknght

This is not functionally different from operator<<, which std::cout has taught us is a neat trick but generally a bad idea.

senderista

Unlike the iostreams shift operators, the ranges pipe operator isn't stateful.

jjmarr

There's state when you try to use the final result, though. It's not threadsafe due to caching.

https://www.youtube.com/watch?v=c1gfbbE2zts