Exo: Exocompilation for productive programming of hardware accelerators

alex7o

Halide does something similar, but as a DSL embedded in C++: https://halide-lang.org/

gotoeleven

The Exo docs mention Halide as an example of a language similar to Exo but "lowering-based", while Exo is "rewrite-based." This seems to mean that Halide is more of a DSL where what you want is specified at a higher level, while Exo is a set of transformations you can apply to an existing kernel of code (though, at least in the examples, the kernel is written in Python and C is somehow generated from it after the transformations are applied). Do you know what the relative strengths of these two approaches are?
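To make the "rewrite-based" idea concrete: the kernel stays ordinary code, and a transformation rewrites it into an equivalent form. A minimal sketch in plain Python of loop tiling as such a rewrite; the function names are invented for illustration and this is not Exo's actual API:

```python
# Toy illustration of a "rewrite-based" schedule: the kernel is written
# as concrete loops, and a tiling transformation rewrites it into an
# equivalent form. Names are invented; this is not Exo's real API.

def matvec(A, x):
    # Straightforward kernel: y = A @ x
    n, m = len(A), len(A[0])
    y = [0.0] * n
    for i in range(n):
        for j in range(m):
            y[i] += A[i][j] * x[j]
    return y

def matvec_tiled(A, x, tile=2):
    # The same kernel after a tiling rewrite of the j loop.
    # The arithmetic is identical; only the iteration order changed.
    n, m = len(A), len(A[0])
    y = [0.0] * n
    for jo in range(0, m, tile):
        for i in range(n):
            for j in range(jo, min(jo + tile, m)):
                y[i] += A[i][j] * x[j]
    return y

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
x = [1.0, 1.0, 1.0]
assert matvec(A, x) == matvec_tiled(A, x)  # rewrite preserves semantics
```

In a lowering-based system like Halide the user never sees the tiled loops; here the rewritten loop nest is itself the artifact, which is roughly the distinction the Exo docs draw.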

Also, are there any languages in this vein that take more of a declarative (as opposed to imperative) approach?

erdaniels

I'm not the target audience but the GitHub and website getting started page feel so poorly explained. What the hell is a schedule?

gnabgib

This MIT article covers it a bit more (with a slightly too generic title): "High-performance computing, with much less code" https://news.mit.edu/2025/high-performance-computing-with-mu... (https://news.ycombinator.com/item?id=43357091)

ajb

A schedule is the order in which machine instructions get executed.

So, I've done this professionally (written assembler code, and then scheduled it manually to improve performance). Normally you don't need to do that these days, as even mobile CPUs use out-of-order cores which dynamically schedule at runtime.

It's only going to be useful if you're writing code for some machine that doesn't do that (they give examples like TPUs).

almostgotcaught

> out-of-order cores which dynamically schedule at runtime.

OOO architectures don't reschedule dynamically - that's impossible - they just have multiple instruction buffers that can issue the instructions. So scheduling is still important for OOO; it's just done at the level of the DDG (data-dependence graph) instead of the literal linear order in the binary.
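What "scheduling at the DDG level" means can be sketched with a greedy list scheduler: any emission order consistent with the data-dependence graph is legal, and the scheduler chooses among the ready instructions. Instruction names and the priority heuristic here are invented for illustration, not tied to any real ISA:

```python
# Toy list scheduler over a data-dependence graph (DDG).
# Instructions may be emitted in any order respecting their dependences,
# so "scheduling" means choosing among the ready ones.
# Instruction names and the heuristic are invented for illustration.

deps = {            # instr -> instructions it depends on
    "load_a": [],
    "load_b": [],
    "mul":    ["load_a", "load_b"],
    "load_c": [],
    "add":    ["mul", "load_c"],
    "store":  ["add"],
}

def list_schedule(deps, priority):
    # Greedy: repeatedly pick the highest-priority ready instruction.
    remaining = dict(deps)
    done, order = set(), []
    while remaining:
        ready = [i for i, ds in remaining.items() if all(d in done for d in ds)]
        pick = max(ready, key=priority)
        order.append(pick)
        done.add(pick)
        del remaining[pick]
    return order

# One classic heuristic: issue long-latency loads as early as possible.
order = list_schedule(deps, priority=lambda i: i.startswith("load"))
assert order.index("mul") > order.index("load_a")   # dependences respected
assert order[-1] == "store"                         # store comes last
```

An OOO core does something like this in hardware over a small window; a compiler (or a tool like Exo) does it over the whole kernel, which is why the linear order in the binary still matters.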

Edit: just want to emphasize

> It's only going to be useful if you're writing code for some machine that doesn't do that

There is no architecture for which instruction scheduling isn't crucial.

ajb

If you're talking about modifying the DDG, I would not call that scheduling. Because then you need to do serious work to prove that your code is actually doing the same thing. But I haven't spent a lot of time in the compiler world, so maybe they do call it that. Perhaps you could give your definition?


almostgotcaught

"[compilation] for productive programming of hardware accelerators"

But 93% of the codebase is Python lol. Whatever you think about Python, it is not a systems programming language. Takeaway: this is not a serious compiler project (and it's clearly not, it's a PhD student project).

Deeper take: this is just a DSL that behind the scenes calls a (SMT) solver for tuning (what they call "scheduling"). There are a million examples of this approach for every architecture under the sun. My org is literally building out the same thing right now. Performance is directly a function of how good your model of the architecture is. At such a high-level it's very likely to produce suboptimal results because you have no control over ISA/intrinsic level decisions. Lower-level implementations are much more robustly "powerful".
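The "solver for tuning" idea can be sketched without a real SMT backend: enumerate candidate schedules against a cost model of the architecture and keep the best. Everything below (the cache parameters, the cost formula, the candidate tile sizes) is a toy stand-in for illustration; real systems model the target in far more detail, and may hand the search to an SMT solver instead of brute force:

```python
# Toy autotuner: search over tile sizes against a simplified cost model.
# All numbers and the cost formula are invented for illustration.

CACHE_LINES = 64       # assumed cache capacity, in lines
LINE_ELEMS = 8         # elements per cache line
N = 1024               # problem size

def estimated_cost(tile):
    # Penalize tiles whose working set overflows the toy cache,
    # plus a per-tile loop overhead that shrinks as tiles grow.
    lines_needed = (tile * tile) // LINE_ELEMS
    miss_penalty = 0 if lines_needed <= CACHE_LINES else lines_needed * 100
    loop_overhead = (N // tile) ** 2
    return miss_penalty + loop_overhead

candidates = [4, 8, 16, 32, 64, 128]
best = min(candidates, key=estimated_cost)  # largest tile that still fits
```

As the parent says, the quality of the result is exactly the quality of the cost model: the search machinery itself is trivial.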

https://dl.acm.org/doi/10.1145/3332373

rscho

Well, this is clearly an attempt at abstracting the kind of low-level stuff you describe. Perhaps it doesn't work (yet), but that shouldn't prevent people from trying? Involving an SMT solver suggests that the solver is doing the heavy lifting, not Python. PhDs often produce inapplicable stuff, but they are also the basis for industry/application R&D, such as what your org is doing... PhDs are the slaves of science. They make stuff happen for peanuts in return and deserve our respect for that, even if what happens is oftentimes a dead end. It's really sad seeing people shitting on PhDs.


fancyfredbot

Your take seems to contradict the article? You say SMT solvers give "no control over ISA/intrinsic level decisions" but their design.md says "user-defined scheduling operations can encapsulate common optimization patterns and hardware-specific transformations". Are they wrong about this? Can you explain why?

QuadmasterXLII

any sufficiently powerful compiler is going to run an interpreted language at compile time, and there’s no reason it can’t be Python instead of C++ template metaprograms or CMake
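In that spirit, a sketch of Python playing the compile-time metaprogram role: specialize and emit a C kernel as text, the way a DSL front end might. The kernel, names, and unrolling scheme are illustrative, not taken from Exo:

```python
# Sketch of Python as the compile-time metaprogram: specialize a C
# saxpy kernel for a fixed size and unroll factor, emitting C source
# as text. Names and structure are invented for illustration.

def emit_saxpy(n, unroll):
    assert n % unroll == 0, "size must be divisible by the unroll factor"
    body = "\n".join(
        f"        y[i + {u}] += a * x[i + {u}];" for u in range(unroll)
    )
    return (
        f"void saxpy_{n}(float a, const float *x, float *y) {{\n"
        f"    for (int i = 0; i < {n}; i += {unroll}) {{\n"
        f"{body}\n"
        f"    }}\n"
        f"}}\n"
    )

src = emit_saxpy(1024, 4)
assert "saxpy_1024" in src
assert src.count("y[i +") == 4  # loop body unrolled 4x
```

The "93% Python" in such a repo is this layer; the generated C (or the solver underneath) is where the systems-level work actually lands.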
