Interprocedural Sparse Conditional Type Propagation

Animats

You can do that. But as a language feature, inter-procedural inference creates problems for humans. There's no way for humans to see what type a function wants.

Language design seems to be converging on the idea that functions should have human-readable type signatures, while within function bodies automatic inference can be used freely. From one direction, we have the "auto" inference mechanism of C and C++, and the local inference of Go and Rust. From the other direction, we have TypeScript for JavaScript and Python's rather strange unchecked optional typing. All of those changes move toward the same middle ground.
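
A small sketch of that middle ground in Python's optional-typing style (my example, not Animats's): the function boundary carries an explicit, human-readable signature, while the locals inside the body are left to inference.

    def mean(xs: list[float]) -> float:
        # The signature is the human-readable contract; nothing inside needs annotating.
        total = 0.0            # local: type is obvious from the initializer
        for x in xs:
            total += x
        return total / len(xs)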

(Then, of course, there are "lambdas", but that's another subject.)

js8

Well, it kinda works in Haskell, but you get weird type signatures. I haven't tried Liquid Haskell but it looks like a potential solution.

But you're right. Even in Haskell, I always write type signatures for functions. It makes type inference debugging easier.

But as I understand it, this feature in Ruby is for compilers. It's a compiler's job to translate human-readable (hopefully) source code to an efficient mess of machine code.

tekknolagi

One of the authors here. Would enjoy hearing what you think

DannyBee

I remember implementing this decades ago. Brings me back to see it again :)

Your implementation looks nice.

You are right that real implementations often play lots of lattice tricks and the like.

If you are worried about speed and memory usage while dealing with context sensitivity, and you want to make the lattice taller or wider, incorporate BDDs or a similar mechanism. It's somewhat pointless for single functions, but as context sensitivity grows, you either need BDDs or summaries (or some form of automatic sharing) if you want to represent nice large lattices.

Lattice-value BDDs are a thing too, so you don't have to spend a bunch of time figuring out how to represent the lattice as a BDD. (Alternatively, you can use something like Souffle and friends, which take care of it under the covers.) :)

Historically, this has been the standard mechanism for representing this sort of problem efficiently, because it handles the sharing of equivalent values for you.
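
A minimal Python sketch of the property BDDs buy you here (my illustration, not code from the thread or the article): hash-consing, so structurally identical subgraphs get a single shared representation and equality is a cheap identity check.

    class NodeTable:
        """Toy hash-consed decision-node table, BDD-style."""

        def __init__(self):
            self.table = {}                  # (var, low, high) -> node id
            self.FALSE, self.TRUE = 0, 1     # terminal nodes

        def mk(self, var, low, high):
            if low == high:                  # reduction: the test is redundant
                return low
            key = (var, low, high)
            if key not in self.table:        # sharing: identical subgraphs reuse one node
                self.table[key] = len(self.table) + 2
            return self.table[key]

    # Building the same structure twice yields the same node, which is what keeps
    # large context-sensitive lattices from exploding in memory.
    t = NodeTable()
    assert t.mk(0, t.FALSE, t.TRUE) == t.mk(0, t.FALSE, t.TRUE)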

Also, if you want later passes to take advantage of the final lattice information, there are ways to represent it in the IR effectively, the same way you would with value ranges.

For value ranges, you would do something like:

if (a0 > 5) { a1 = range_info(a0, >5) } else { a2 = range_info(a0, <=5) }

You can do the same thing with the final type info to pass it along to later passes.
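
As a hedged sketch, the analogous type split might look like this in Python (the TypeInfo name is mine, not DannyBee's or the article's): each branch gets a fresh name carrying the fact its guard proved, just as range_info does for ranges.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class TypeInfo:
        source: str   # the value being refined, e.g. "a0"
        fact: str     # what this branch proved about it

    # if is_a(a0, Integer):
    a1 = TypeInfo("a0", "Integer")         # true edge: a0 refined to Integer here
    # else:
    a2 = TypeInfo("a0", "not Integer")     # false edge: Integer excluded
    # Later passes read a1/a2 directly instead of re-deriving facts about a0.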

The limit of this kind of splitting is basically SSI and friends.

If you are interested in the general theory of splitting and using it to represent info or solve other types of problems faster, see: https://arxiv.org/pdf/1403.5952 and friends.

ashton314

I was wondering what you would use to find function callers—happy to see k-CFA employed!

I wasn’t able to get too deep in your article; have you looked at demand-CFA for making some analysis cheaper?

How is analysis going for higher-order functions?

hahnchen

Cool article! I was wondering how high k goes in production. You can easily get call stacks that are 20-100 frames deep, so the theoretical maximum is around 100, but I imagine that would be very, very slow.

tekknolagi

I don't know how deep the call stacks go (and separately, I don't know whether I'm allowed to quote stats like that), but people tend not to do points-to analysis with k higher than 2 or 3, because it explodes the whole graph.
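
For readers who haven't met k-CFA: a tiny sketch (my example) of k-limited call strings, the mechanism that bounds k. Each analysis context keeps only the last k call sites, so the number of distinct contexts grows roughly like (number of call sites)^k, which is why small k is the norm.

    def push_context(context, call_site, k):
        """Append a call site, keeping only the most recent k entries."""
        return (context + (call_site,))[-k:] if k else ()

    # With k = 2, older call sites fall off the end of the context.
    ctx = ()
    for site in ["main:3", "foo:7", "bar:9", "baz:2"]:
        ctx = push_context(ctx, site, k=2)
    print(ctx)   # ('bar:9', 'baz:2')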

ashton314

IIRC the complexity of k-CFA is exponential in k, which is not great. There's another version called m-CFA; I don't remember exactly what it does differently, but it doesn't blow up exponentially.

tempodox

[flagged]

tekknolagi

It's not really supposed to be for any particular language. We didn't even want to say Ruby in the title, because the experiment is on a very, very (very!) stripped-down Ruby.

IMO the more interesting thing here is incrementally discovering and building the interprocedural flows-to relation.
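
As a rough sketch of the shape of that idea (my own code, not the authors'): type facts propagate along flows-to edges with a worklist, and resolving a call site adds new edges, which re-triggers propagation over the enlarged graph.

    from collections import defaultdict

    BOTTOM, TOP = "bottom", "object"   # flat lattice: bottom < one concrete type < top

    def join(a, b):
        if a == BOTTOM:
            return b
        if b == BOTTOM:
            return a
        return a if a == b else TOP

    class Propagator:
        def __init__(self):
            self.edges = defaultdict(set)             # var -> vars it flows to
            self.types = defaultdict(lambda: BOTTOM)  # var -> current lattice value
            self.work = []

        def add_edge(self, src, dst):
            # A newly discovered flows-to edge, e.g. argument -> parameter
            # once a call target becomes known.
            self.edges[src].add(dst)
            self.work.append(src)                     # re-send whatever src already has

        def set_type(self, var, t):
            new = join(self.types[var], t)
            if new != self.types[var]:
                self.types[var] = new
                self.work.append(var)

        def run(self, on_change=lambda var, t: None):
            while self.work:
                v = self.work.pop()
                for dst in self.edges[v]:
                    self.set_type(dst, self.types[v])
                on_change(v, self.types[v])           # hook: resolve calls, add edges

The on_change hook is where the interprocedural part would live: once a receiver's type becomes concrete, the callee is known, and add_edge can wire arguments to parameters and the return value to the call's result.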

tines

> We know from experience that YJIT compiles over 9000 methods when running Shopify’s production code

All your methods are belong to us?
