Pretty State Machine Patterns in Rust

phibz

I'm a bit surprised that they don't use the name "type state". Perhaps it wasn't in wide use when this post was originally written?

The important idea here is that each state is moved into the method that transitions to the next state. This way you're "giving away" your ownership of the data. This is great for preventing errors: you cannot retain access to stale state.

And by using the From trait to implement transitions, you ensure that improper transitions are impossible to represent.

It's a great pattern and has only grown in use since this was written.
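
A minimal sketch of both points, not taken from the article: the `Waiting` and `Filling` states, their fields, and the `start_filling` helper are hypothetical names chosen for illustration.

```rust
// Hypothetical states for a bottle-filling machine (illustrative names only).
#[derive(Debug)]
struct Waiting {
    waited_secs: u32,
}

#[derive(Debug)]
struct Filling {
    rate: u32,
}

// The only transition defined out of `Waiting`.
impl From<Waiting> for Filling {
    fn from(_prev: Waiting) -> Self {
        Filling { rate: 1 }
    }
}

fn start_filling() -> u32 {
    let waiting = Waiting { waited_secs: 0 };

    // `waiting` is moved into `from`, so the old state is gone after this line.
    let filling = Filling::from(waiting);

    // println!("{:?}", waiting);        // would not compile: use of moved value
    // let w: Waiting = filling.into();  // would not compile: no `From<Filling> for Waiting`

    filling.rate
}
```

Uncommenting either commented line should turn the mistake into a compile error rather than a runtime bug.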

wging

I think I first learned of that term from this 2019 article: https://cliffle.com/blog/rust-typestate/ I can't be the only one...

zokier

Typestates were also a notable feature in early Rust, albeit in a very different form. I do recall them being mentioned often in presentations/talks/etc. at the time.

Tbh it would make an interesting blog post to compare modern typestate patterns to the historical built-in typestate mechanism.

https://github.com/rust-lang/rust/issues/2178

mijoharas

Maybe I'm missing something, but it seems that this approach doesn't allow you to store external input that's provided when you transition states.

Say stateB is transitioned to from stateA and needs to store a value that is _not_ directly computed from stateA but is externally supplied at the point of transition.

As far as I understand this isn't possible with the proposed solution? Am I missing something? This seems like a pretty common use case to me.

vlovich123

impl From<(OldState, Input)> for NewState would be one way.
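
A rough sketch of what that could look like. `StateA`, `StateB`, their fields, and the `advance` helper are made-up names, and the externally supplied input is assumed to be a `String`.

```rust
// Hypothetical states: `StateB` needs a label that `StateA` cannot compute itself.
struct StateA {
    counter: u32,
}

struct StateB {
    counter: u32,
    label: String,
}

// The tuple carries both the old state and the externally supplied input.
impl From<(StateA, String)> for StateB {
    fn from((prev, label): (StateA, String)) -> Self {
        StateB {
            counter: prev.counter, // carried over from the old state
            label,                 // supplied at the point of transition
        }
    }
}

// Illustrative helper showing the call site.
fn advance(a: StateA, label: String) -> StateB {
    (a, label).into()
}
```

The old state is still consumed by the transition, so the ownership guarantees stay the same; the tuple just widens what the transition is allowed to see.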

gsliepen

I don't understand why you would code these explicit state machines when you can just write normal code that is much more readable. The state machine example they start with could be written as:

    while (true) {
        wait();
        fill();
        finish();
    }

I don't think the approach from the article would have any benefits like fewer bugs or higher performance.

MrJohz

For a very simple example like this, your version will probably be okay, but it has its own set of problems:

* It's difficult to introspect the current state of the system. If I were to build an API that fetches the current state, I'd need to do something like add an extra state variable that the different functions could then update, which makes all the code messier. By turning this into an explicit state machine, I can encode the states directly in the system and introspect them.

* Similarly it's often useful to be able to listen to state transitions in order to update other systems. I could include that as part of the functions themselves, but again, if I just encode this operation as an explicit state machine, the transition points fall out very nicely from that.

* Here there is no branching, but state machines can often have complicated branching logic as to which state should be called next. It's possible to write this directly as code, but in my experience, it often gets complicated more quickly than you'd think. This is just a simple example, but in practice a bottle filler probably has extra states to track whether the machine has been turned off, in which case if it's in the `fill` state it will switch to the `finish` or `wait` state to ensure the internal machinery gets reset before losing power. Adding this extra logic to a state machine is usually easier than adding it to imperative code.

* In your version, the different functions need to set up the state ready to be used by the next function (or need to rely on the expected internal state at the end of the previous function). This gives the functions an implicit order that is not really enforced anywhere. You can imagine in a more complicated state machine, easily switching up two functions and calling them in the wrong order. In OP's version, because the state is encoded in such a type-safe way, any state transitions must handle the data from the previous state(s) directly, and provide all the correct data for the next state. Even if you were to get some of the states the wrong way round, you'd still need to correctly handle that transition, which prevents anything from breaking even if the behaviour is incorrect.
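
To make the first three points concrete, here is a small sketch using a plain enum rather than the article's type-level encoding. Every name in it (`State`, `Event`, `Filler`, the `transitions` log) is an assumption for illustration, as is the rule that a power-off during filling goes through `Finishing` before `Off`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Waiting,
    Filling,
    Finishing,
    Off,
}

#[derive(Debug, Clone, Copy)]
enum Event {
    BottleArrived,
    Full,
    Done,
    PowerOff,
}

struct Filler {
    state: State,
    power_off_requested: bool,
    // Every transition is recorded here; a real system might notify listeners instead.
    transitions: Vec<(State, State)>,
}

impl Filler {
    fn new() -> Self {
        Filler { state: State::Waiting, power_off_requested: false, transitions: Vec::new() }
    }

    // Introspection is just reading a field.
    fn state(&self) -> State {
        self.state
    }

    fn handle(&mut self, event: Event) {
        if let Event::PowerOff = event {
            self.power_off_requested = true;
        }
        let next = match (self.state, event) {
            (State::Waiting, Event::PowerOff) => State::Off,
            (State::Waiting, Event::BottleArrived) => State::Filling,
            // Powering off mid-fill still goes through Finishing so the machinery resets.
            (State::Filling, Event::Full) | (State::Filling, Event::PowerOff) => State::Finishing,
            (State::Finishing, Event::Done) if self.power_off_requested => State::Off,
            (State::Finishing, Event::Done) => State::Waiting,
            // Anything else is ignored and leaves the state unchanged.
            (current, _) => current,
        };
        if next != self.state {
            self.transitions.push((self.state, next));
            self.state = next;
        }
    }
}
```

All the branching lives in one `match`, and the single place where `self.state` changes is also the natural hook for observing transitions.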

IshKebab

One very common reason is to make the code non-blocking. In fact Rust's async/await system works by converting the code into a state machine.

Unfortunately Rust doesn't have proper support for coroutines or generators yet so often you'll want to use a hand written state machine anyway.

Even if it did, sometimes the problem domain means that a state machine naturally fits the semantics better than using a control flow based approach anyway.
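
As a rough illustration of the non-blocking point, here is a hand-written machine that does one small step per call and returns immediately. The `Job` type, its variants, and `run_to_completion` are hypothetical; `std::task::Poll` is borrowed only as a convenient "ready or not yet" return type, and this is not how the compiler actually lowers async code.

```rust
use std::task::Poll;

// Each variant holds exactly the data needed to resume from that point.
enum Job {
    Waiting { ticks_left: u32, capacity: u32 },
    Filling { level: u32, capacity: u32 },
    Finished { level: u32 },
}

impl Job {
    // Performs one unit of work and returns without blocking.
    fn poll(&mut self) -> Poll<u32> {
        match *self {
            Job::Waiting { ticks_left: 0, capacity } => {
                *self = Job::Filling { level: 0, capacity };
                Poll::Pending
            }
            Job::Waiting { ticks_left, capacity } => {
                *self = Job::Waiting { ticks_left: ticks_left - 1, capacity };
                Poll::Pending
            }
            Job::Filling { level, capacity } if level >= capacity => {
                *self = Job::Finished { level };
                Poll::Ready(level)
            }
            Job::Filling { level, capacity } => {
                *self = Job::Filling { level: level + 1, capacity };
                Poll::Pending
            }
            Job::Finished { level } => Poll::Ready(level),
        }
    }
}

// A trivial driver; a real caller would interleave other work between polls.
fn run_to_completion(mut job: Job) -> u32 {
    loop {
        if let Poll::Ready(level) = job.poll() {
            return level;
        }
    }
}
```

The blocking `wait(); fill(); finish();` loop has been turned inside out: the caller decides when the next step runs.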

joshka

There's definitely a missing part of this which talks about when to use this sort of approach. The answer is often when a non-trivial amount of stuff happens between the end of one method and the start of the next, and that stuff is under the control of an external system. That said, I often argue that async/await solves the majority of that problem by implicitly modeling the state machine while keeping the code readable.

sdenton4

I tend to think of state machines as becoming important when you're forced to deal with the unpredictability of the real world, rather than just pummeling bits until they repent.

You've got some complicated Thing to control which you don't have full visibility into... Like, say, a Bluetooth implementation, or a robot. You have a task to complete, which goes through many steps, and requires some careful reset operations before you can try again when things inevitably don't go according to plan. What steps are needed for the reset depends on where you're at in the process. Maybe you only need to retry from three steps ago instead of going all the way back to the beginning... The states help you keep track of where things are at, and more rigorously define the paths available.

crq-yml

It's a formal model that we can opt into surfacing, or subsume into convenient pre-packaged idioms. For engineering purposes you want to be aware of both.

It's way easier to make sense of why it's relevant to write towards a formalism when you are working in assembly code and what is near at hand is load and store, push and pop, compare and jump.

Likewise, if the code you are writing is actually concurrent in nature (such as the state machines written for video games, where execution is being handed off cooperatively across the game's various entities to produce state changes over time), most prepackaged idioms are insufficient or address the wrong issue. Utilizing a while loop and function calls for this assumes you can hand off something to the compiler and it produces comparisons, jumps, and stack manipulations, and that that's what you want. But in a concurrent environment, your concerns shift towards how to "synchronize, pause and resume" computations and effects, which is a much different way of thinking about control flow that makes the formal model relevant again.

asimpletune

Because reasoning about things becomes harder as the stakes are raised. We had to implement Paxos for distributed systems in college, and my partner and I started over probably about three times trying to code it normally. Then we switched to just focusing on defining states and the conditions that transition between them, and our solution became much easier to code.

skavi

allows trivially mocking effects for testing
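
One way to read this, sketched under assumptions: if the transition function is pure and returns the effects it wants performed as plain data, a test can assert on that data without touching real hardware or I/O. The `State`, `Event`, `Effect`, and `step` names below are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Waiting,
    Filling,
}

#[derive(Debug, Clone, Copy)]
enum Event {
    BottleArrived,
    Full,
}

// Effects are described as data; some outer layer is responsible for performing them.
#[derive(Debug, PartialEq)]
enum Effect {
    OpenValve,
    CloseValve,
}

// Pure function: same inputs always give the same next state and effects.
fn step(state: State, event: Event) -> (State, Vec<Effect>) {
    match (state, event) {
        (State::Waiting, Event::BottleArrived) => (State::Filling, vec![Effect::OpenValve]),
        (State::Filling, Event::Full) => (State::Waiting, vec![Effect::CloseValve]),
        // Unexpected events change nothing and request no effects.
        (current, _) => (current, Vec::new()),
    }
}
```

A test then only needs to compare the returned tuple against the expected one; no mock valve object is required.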

theOGognf

It’s funny seeing this blog post again. This is actually a reference I used to make a poker game as a state machine last year: https://github.com/theOGognf/private_poker

It made the development feel a lot safer and it’s nice knowing the poker game state cannot be illegally transitioned with the help of the type system

gnabgib

(2016) Popular, but barely discussed at the time (62 points, 3 comments) https://news.ycombinator.com/item?id=12703623

michalsustr

Well this is timely :) I’m in the middle of writing a library, based on the rust-fsm crate, that adds nice support for Mealy automata, with extensions like

- transition handlers

- guards

- clocks

- composition of automata into a system.

The idea is to allow writing tooling that exports the automata into UPPAAL for model checking. This way you don’t need much additional effort to ensure your model and implementation match and are up to date: you can run the checker during CI tests to ensure you don’t have code that deadlocks, that some states are always reachable, etc.

I plan to post a link here to HN once finished.

locusofself

Is the title a nod to the Nine Inch Nails album "Pretty Hate Machine" ?

pjmlp

Kind of interesting seeing folks rediscovering ideas from Standard ML.

skavi

i think stable coroutines [0] would be huge for rust. they would enable writing pure state machines in the form of straight line imperative code.

currently they’re used in the implementation of async/await, but aren’t themselves exposed.

[0]: https://doc.rust-lang.org/beta/unstable-book/language-featur...

jll29

How about state machines with millions of transitions such as letter transducers?

raphinou

Is there any crate advised to be used when developing state machines? Any experience to share?

joshka

I prefer giving the transitions explicit names over relying on the From implementations defined on the machine (defining them on the states still prevents bad transitions). The raft example drops a bunch of syntactic noise and repetition this way:

https://play.rust-lang.org/?version=stable&mode=debug&editio...

    fn main() {
        let is_follower = Raft::new(/* ... */);
        // Raft typically comes in groups of 3, 5, or 7. Just 1 for us. :)
        
        // Simulate this node timing out first.
        let is_candidate = is_follower.on_timeout();
        
        // It wins! How unexpected.
        let is_leader = is_candidate.on_wins_vote();
        
        // Then it fails and rejoins later, becoming a Follower again.
        let is_follower_again = is_leader.on_disconnected();
        
        // And goes up for election...
        let is_candidate_again = is_follower_again.on_timeout();
        
        // But this time it fails!
        let is_follower_another_time = is_candidate_again.on_lose_vote();
    }
    
    
    // This is our state machine.
    struct Raft<S> {
        // ... Shared Values
        state: S
    }
    
    // The three cluster states a Raft node can be in
    
    // If the node is the Leader of the cluster services requests and replicates its state.
    struct Leader {
        // ... Specific State Values
    }
    
    // If it is a Candidate it is attempting to become a leader due to timeout or initialization.
    struct Candidate {
        // ... Specific State Values
    }
    
    // Otherwise the node is a follower and is replicating state it receives.
    struct Follower {
        // ... Specific State Values
    }
    
    impl<S> Raft<S> {
        fn transition<T: From<S>>(self) -> Raft<T> {
            let state = self.state.into();
            // ... Logic prior to transition
            Raft {
                // ... attr: val.attr 
                state,
            }
        }
    }
    
    // Raft starts in the Follower state
    impl Raft<Follower> {
        fn new(/* ... */) -> Self {
            // ...
            Raft {
                // ...
                state: Follower { /* ... */ }
            }
        }
        
        // When a follower timeout triggers it begins to campaign
        fn on_timeout(self) -> Raft<Candidate> {
            self.transition()
        }
    }
    
    
    
    impl Raft<Candidate> {
        // If it doesn't receive a majority of votes it loses and becomes a follower again.
        fn on_lose_vote(self) -> Raft<Follower> {
            self.transition()
        }
    
        // If it wins it becomes the leader.
        fn on_wins_vote(self) -> Raft<Leader> {
            self.transition()
        }
    }
    
    impl Raft<Leader> {
        // If the leader becomes disconnected it may rejoin to discover it is no longer leader
        fn on_disconnected(self) -> Raft<Follower> {
            self.transition()
        }
    }
    
    // The following are the defined transitions between states.
    
    // When a follower timeout triggers it begins to campaign
    impl From<Follower> for Candidate {
        fn from(state: Follower) -> Self {
            Candidate { /* ... */ }
        }
    }
    
    // If it doesn't receive a majority of votes it loses and becomes a follower again.
    impl From<Candidate> for Follower {
        fn from(state: Candidate) -> Self {
            Follower { /* ... */ }
        }
    }
    
    // If it wins it becomes the leader.
    impl From<Candidate> for Leader {
        fn from(val: Candidate) -> Self {
            Leader { /* ... */ }
        }
    }
    
    // If the leader becomes disconnected it may rejoin to discover it is no longer leader
    impl From<Leader> for Follower {
        fn from(val: Leader) -> Self {
            Follower { /* ... */ }
        }
    }

gnatolf

I did the same. In short examples like the ones used in the article, it's easy to reason about the states and transitions. But in a much larger codebase, it gets so much harder to even discover available transitions if one is leaning too much on the from/into implementations. Nice descriptive function names go a long way in terms of ergonomic coding.