Tensor evolution: A framework for fast tensor computations using recurrences

sakras

> Tensor Evolution (TeV)

Oh no, I'm never not going to read this as "tera-electron Volts". I wish they picked a different acronym!

null

[deleted]

thomasahle

This is neat! I've been thinking about how to best express recursion with diagrams for tensorgrad (https://github.com/thomasahle/tensorgrad) and this might just be it.

So much computation these days are done on tensors, mostly for easy parallelism reasons. But we can't get completely rid of recursion. Great to have work like this making them more computationally compatible!

pertymcpert

I don't see a lot of use for this since it only extends element-wise operations and not things like multiplication which are the more interesting operations on tensors. Yeah you can model tensors in the compiler with this but no loops in the compiler are actually doing this. Unlike SCEV which is used for almost every loop.

almostgotcaught

> not things like multiplication

in order to perform that kind of canonicalization/folding/cse you effectively need to embed a whole-ass tensor ops interpreter into your compiler. note this isn't so far-fetched

https://github.com/openxla/stablehlo/blob/main/docs/interpre...

the problem is it needs to be performant as well. so what some people do is jit these canonicalizations using the compiler (yo dawg i heard you like compiling...).

almostgotcaught

lol who is interested in this on hn? this has nothing to do with frameworks or training or inference or whatever. this is purely about "when can i infer statically XYZ property about tensor constants in a program". mind you, small tensor constants, because this thing is effectively performing actual tensor arithmetic and you don't want slow tensor arithmetic in your compiler.

also FYI they're not magically calculating closed form solutions to recurrence relations here (that's not possible) - they're basically just iterating the loop arithmetic.

saagarjha

This kind of thing is quite useful for people implementing block-level compilers; in particular spending time on optimizations for “small” tensors that are being operated on in a block are very worthwhile because the generated code is then reused across all blocks and multiple invocations of the kernel.

For this case, you can actually calculate a closed form for many recurrences, in the same way that SCEV works. You don’t need to interpret the loop to do this. If you mark your induction variables and put the loop in the right form then it’s possible to do recurrence analysis in constant time. This isn’t particularly complex or novel (and, to be fair, the paper doesn’t do much other than go “hmm what if we did this for tensors”) so I suggest you look up how it’s done in traditional compilers.

almostgotcaught

> closed form for many recurrences, in the same way that SCEV works ... so I suggest you look up how it’s done in traditional compilers.

SCEV does not compute closed forms ... because that's impossible. if you don't understand why it's impossible you can consult this same team's slides on SCEV in LLVM, a "traditional compiler":

https://llvm.org/devmtg/2018-04/slides/Absar-ScalarEvolution...

specifically pages 16-18

saagarjha

I feel like we must be talking about different things, because I don't see how any of this follows, and the slides just seem to explain how LLVM does its SCEV? Consider the following program (in C and not LLVM IR because I am very lazy):

  int foo = 0;
  for (int i = 0; i < n; ++i) {
   foo += 1;
  }

Both foo and i have an addrec of {0, +, 1}. Given the loop iteration count n it is pretty clear that you can calculate the value at exit to be n without actually running the loop n times. In fact you don't even know n and can still convert it to a closed form at compile time. So what's the point of disagreement here?

pstoll

> lol who is interested in this on hn?

I’m guessing lots of people who are on HN. I’ll go with - me for one. Guessing I’m not alone.

Remember - it’s a big world. Lots of POVs.

llm_trw

Anyone working on deep learning for one.

almostgotcaught

if you can explain to me the relevance of this paper for your stack i'll paypal you $50.

saagarjha

The people who are in a position to respond to you are also unlikely to be moved by an offer of $50.

gaze

I am. Why would you assume otherwise?