
Notebooks as reusable Python programs

abdullahkhalids

One design decision they made is that outputs are not stored. This means these notebooks are not a suitable replacement for heavy computation routines, where the notebook serves as a record of the final results. Other people shouldn't be expected to run a minutes- or hours-long computation just to see what the author intended.

You can work around this by storing the results in separate file(s) and writing the boilerplate to let the reader load them. Or they let you export to ipynb - which still means sharing two files.

Presumably the reason for this decision is making git diffs short. But to me the solution is to fix git diff to operate on JSON nicely, rather than changing the entire notebook format.

cantdutchthis

(someone from the marimo team here)

The `export` command can generate a rendered artifact if that's what you're after, but there is also another avenue here: have you seen the caching feature? The one that caches to disk and persists?

https://docs.marimo.io/guides/expensive_notebooks/?h=cache#d...

This can automatically store the output of expensive functions, keeping the previous state of the cells in mind. If a re-compute would otherwise be needed, it just loads the result straight from the cache.
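
Roughly, a minimal sketch of the `with`-block form (best to check the docs for the exact arguments; the sleep is just a stand-in for a long computation):

    import marimo as mo
    import time

    with mo.persistent_cache(name="expensive_step"):
        # re-runs only if this code or its inputs change;
        # otherwise the saved result is loaded from disk
        time.sleep(5)  # stand-in for an expensive simulation
        result = sum(range(10_000))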

Another option is to run in lazy mode, documented here:

https://docs.marimo.io/guides/expensive_notebooks/?h=cache#l...

This will prevent the notebook from rerunning cells by accident.

We're thinking about adding features that would make marimo great for running long-running batch work, but there's not a whole lot I can share about it yet. If you have specific thoughts or concerns though, feel free to join our discord!

https://marimo.io/discord?ref=nav

abdullahkhalids

The caching is a very nice feature, and will stop me from keeping my computer running for days/weeks while I work on a notebook.

If I understand it correctly, `@mo.persistent_cache(name="my_cache")` creates a binary file `my_cache` that I should commit as well if I don't want others to repeat the computation?

This kinda solves the problem, except that you end up with two files per notebook, and marimo notebooks are no longer viewable with output directly on GitHub.

mscolnick

The default "store" is a local FileStore. In your case, it will save the outputs to a file on disk called `my_cache`.

We plan to add more stores, like Redis, an S3 bucket, or an external server, since you may not always want to commit this file but, like you said, still want others to avoid the computation.

fragmede

> This will prevent the notebook from rerunning cells by accident.

Is that really what's wanted? Maybe there's some cell I need to run twice for a reason I tried to debug but couldn't figure out, or for debugging I run cells 1-5 in order, but on specific other (prod) systems I skip cell 4 and run 3 before 2. Now, arguably well-written software would handle those things automatically, but we're not talking about battle-hardened software that's had an SRE team vigorously refactor it until it's been proven suitable for the purpose by being on call for it for months if not years. We're talking about notebooks, which have their time and place, but the entire point, I would argue, is to make it easier to run notebooks in production without the added overhead of said SRE team. And in that world, the reality is the PhD is gonna have some things they know they should fix, but it's easier to just comment out the unneeded cells on prod and hit run all. So shouldn't the tools better support that use case (by saving what was actually run and offering to rerun that) over caching based on an input string and hoping for the best?

mscolnick

> One design decision they made is that outputs are not stored

This is not quite true. Outputs are not stored... in the Python file. marimo does store outputs in the `__marimo__` folder when the setting is enabled.

> writing the boiler plate to let the reader load the results.

There are some primitives that do this for you, such as mo.persistent_cache. It can be used as a decorator or as a 'with' block, and it intelligently knows when either the source code or the inputs change.
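
For instance, a sketch of what the decorator form can look like (hypothetical function and cache name, just to illustrate):

    import marimo as mo

    @mo.persistent_cache(name="features_cache")
    def compute_features(n: int) -> list[int]:
        # cached to disk; recomputed only when the source or `n` changes
        return [i * i for i in range(n)]

    features = compute_features(1_000_000)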

The plan is to take this one step further than storing just the output. Because marimo knows the dependencies of each cell, in a future version, it will store each output AND let you know which are stale based on code changes. This is being built by Dylan (from the blog) and inspired by Nix.

akshayka

It’s true that we don’t store outputs in the file format. This is the main tradeoff, as discussed in the blog. But that doesn’t mean marimo notebooks aren’t suitable for heavy computation.

marimo lets you automatically snapshot outputs in an auxiliary file while you work. We also have a persistent cache that lets you pick up where you left off — saving not just outputs but also computed data.

Many of our users do very heavy computation.

https://docs.marimo.io/guides/expensive_notebooks/

duped

> Other people are not expected to run minutes/hours long computation to see what the author intended

Arguably this is a good thing. You shouldn't distribute things you can't prove have the same results, and one way to do that is to require others to run the same computations.

0cf8612b2e1e

You lose out on so many use cases without the stored output. The first that comes to mind is all of the learning resources that are now presented in notebooks.

Most learners do not need to fact-check the instructor, but do want to see the operations that were run. Those that are curious can run/edit the notebook themselves.

Edit: The JupyterBook ecosystem (https://executablebooks.org/en/latest/gallery/) is an example of what is possible with stored plots/calculations. Most learners are just going to follow along with the material, but being able to optionally play with the data, with minimal friction, is the superpower of the platform.

mscolnick

The parent comment wasn't fully correct. marimo doesn't store outputs in the notebook file, but it does offer many ways to store outputs alongside the notebook, or remotely if you'd like: HTML, ipynb, pickle.

epistasis

It's hard for me to imagine the use case where this is appropriate.

I have looked at marimo several times, and while it's great for interactive computing, and it has a fantastic team, it's not a replacement for notebooks and I find their use of the term "notebook" confusing. As a scientist, I don't understand what use case they are exploring, but I do know it's not the use case where Jupyter was created and it's not my current use case for Jupyter on teams I work with.

abdullahkhalids

If you read what they think the problems to solve are, you will get it.

    small edits to code yield enormous Git diffs;
    code is copy-pasted across notebooks, instead of reused;
    magic commands limit the portability of notebook code;
    logic that would be useful as a script or library gets thrown away;
    logic that should be tested almost never is.
These are problems primarily a software engineer has, not problems a scientist thinks are important. If you asked a scientist to list problems with Jupyter notebooks (and there are many), it would be a very different list, primarily about science.

aaplok

> You shouldn't distribute things you can't prove have the same results

Why not? Can you expand on this because I don't see why this is not a good thing.

Besides if you distribute your code alongside your output, aren't you providing that proof anyway? People can run your code and see they are getting the same result.

bulletmarker

This is the top reply here but I completely disagree with it. Storing the output of running your source file back into your source file is one of the design decisions in Jupyter that I have always considered abhorrent, and I am very happy to see it fixed by Marimo.

jdaw0

i wanted to like marimo, but the best notebook interface i've tried so far is vscode's interactive window [0]. the important thing is that it's a python file first, but you can divide up the code into cells to run in the jupyter kernel either all at once or interactively.

0: https://code.visualstudio.com/docs/python/jupyter-support-py

aaplok

Spyder also has these, possibly for longer than vscode [0]. I don't know who had this idea first but I remember some vim plugins doing that long ago, so maybe the vim community?

[0] https://docs.spyder-ide.org/current/panes/editor.html#code-c...

westurner

Jupytext docs > The percent format: https://github.com/mwouts/jupytext/blob/main/docs/formats-sc... :

  # %% [markdown]
  # Another Markdown cell

  # %%
  # This is a code cell
  class A():
      def one():
          return 1


  # %% Optional title [cell type] key="value"
MyST Markdown has: https://mystmd.org/guide/notebooks-with-markdown :

  ```{code-cell} LANGUAGE
  :key: value

  CODE TO BE EXECUTED
  ```
And :

  ---
  kernelspec:
    name: javascript
    display_name: JavaScript
  ---

  # Another markdown cell

  ```{code-cell} javascript
  // This is a code cell
  console.log("hello javascript kernel");
  ```
But it also does not store the outputs in the markdown.

aaplok

Thanks, this is a good read. I did not know MyST, it is very cool.

The vim plugin I was talking about was vim-slime [0], which seems to date from 2007 and does have regions delimited with #%%.

Slime comes from Emacs originally, but I could not find if the original Emacs slime has regions.

Matlab also has those, which they call code sections [1]. Hard to find when they were introduced. Maybe 2021, but I suspect older.

None of those store the output of the command.

[0] https://vimawesome.com/plugin/vim-slime

[1] https://www.mathworks.com/help/matlab/matlab_prog/create-and...

0cf8612b2e1e

This is also where I have landed. Gives you all of your nice IDE tooling alongside the REPL environment. No need for separate notebook-aware code formatters/linters/etc. That they version cleanly is just the cherry on top.

darkteflon

Looks very interesting. Could you elaborate on why you prefer this over the .ipynb notebook interface built into VS Code? The doc you linked mentions debugging, but I have found that the VS Code debugger is already fairly well-integrated into .ipynb notebooks. Is it mainly the improved diffing and having a REPL?

jdaw0

my impetus for exploring it was that vim modal editing and keyboard navigation are just really clunky in the notebook integration.

whether or not it's better for you depends on your use case for notebooks — i use them mostly for prototyping and exploratory data analysis so separating the code from the output might be more convenient for me than for you

cantdutchthis

Out of curiosity, does this approach also allow for interactive widgets?

luke-stanley

Yes. Though it is split into a code section and an interactive section, like with Markdown previews. It really is driven by the code cells though.

kylebarron

Agreed, I find this to be a super productive environment, because you get all of vscode's IDE plus the niceties of Jupyter and IPython.

I wrote a small vscode extension that builds upon this to automatically infer code blocks via indentation, so that you don't have to select them manually: [0]

[0]: https://github.com/kylebarron/vscode-jupyter-python

floathub

One approach to this is org-mode with babel.

You can have a plaintext file which is also the program which is also the documentation/notebook/website/etc. It's extremely powerful, and is a compelling example of literate programming.

Decent overview here: https://www.johndcook.com/blog/2022/08/02/org-babel-vs-jupyt...

[edit: better link]

addisonbeck

I'm writing this comment as I write a literate README.org that documents and tangles a PoC program I'm writing for an upcoming big initiative at my job.

You can do most work in org instead of editing programs directly, and it'll save time and produce good documentation. Can't recommend it enough.

This approach has the recently developed benefit of being really great as context stores for interacting with LLMs.

TheAlchemist

This looks really very very neat.

One (not a great) workflow I have is that I use notebooks as quick UIs to visualize some results: 1. Run a simulation that outputs results to some file. 2. Load the results in a notebook and do some quick processing + visualization.

Very often, I want to quickly compare between 2 different runs, and end up copying the visualization cell further down, then just re-running the data load + processing + visualization and comparing them.

My understanding is that this would not be possible with marimo, since it will automatically re-run the cell with my previous data, right?

cantdutchthis

(marimo team-member here)

I have had a similar situation and my "hack" for this was to start the same notebook twice and have two tabs open. This worked for some things...

Other times I just bit the bullet and made two forms bound to two variables so that everything would fit in a single notebook. By having two variables that contain all the inputs, you can make sure that only the cells that need to update actually do. It takes a bit more effort but makes a lot of sense for some apps that you want to share with a colleague.

If you share more details about your setup I might be able to give better advice/think along more.

mscolnick

It may be preferable to create a variable tied to a UI element that can be used as a toggle to view each analysis.

    choice = mo.ui.dropdown(['train', 'split'])
    data = load(choice.value)
    processed = process(data)
    visualize(processed)

This way, you can toggle between more than two if needed. If you need to see both at once, you'd want to refactor the processing and visualizing steps into functions, and then just duplicate the final cell(s).

marimo has a multi-column mode, so you can view them side-by-side

blooalien

Marimo also has a nice Tabs widget that can contain whatever else Marimo can display in cleanly separated tabs.

florbnit

> When working with Jupyter, too often you end up with directories strewn with spaghetti-code notebooks, counting up to Untitled12.ipynb or higher. You the notebook author don’t know what’s in these notebooks

This is such a small UX thing but it's so damn important. The simple fix is to not auto-name notebooks untitled-#: when the user clicks new notebook, just ask for the name straight away; if they can't name it, don't create it. It might add the smallest amount of friction to the UX, but it's so damn important.

Also the choice of JSON as the file format is just plain wrong. Why the project hasn't just abandoned that entirely and done a JSON-to-Python back and forth when writing to file is beyond me. There are extensions that do this, but that's a really clunky interface, and while I can set it up for myself, it's difficult to force upon others in a corporate environment.

Great to see someone is taking the seemingly small things up, because they mean a world of difference to the overall ecosystem.

BrenBarn

> The simple fix is to not auto-name notebooks untitled-#: when the user clicks new notebook, just ask for the name straight away; if they can't name it, don't create it.

The even simpler fix is to just not name them until the user does. That's the way other programs work. If you create a new document in a word processor, it will say "Untitled" at the top of the window, but it doesn't create a file called untitled.doc on disk until you do "Save as" and choose a filename. It has always irritated me that Jupyter insists on having an on-disk file right from the beginning.

janalsncm

It would be neat if it worked similarly to ChatGPT sessions, which give themselves a name based on the conversation. You could have a small local model that gives a default name automatically.

It would also be great to have a decent search across notebooks in a project so I could quickly find an old function.

cantdutchthis

(someone from the marimo team here)

How you start the marimo notebook, via

`marimo edit must-give-name-to-this-file.py`

is indeed one of my teeny but favourite features of it. When you start a new notebook, you're kind of forced to name it immediately.

nemoniac

"until recently, Jupyter notebooks were the only programming environment that let you see your data while you worked on it."

This is false. Org-mode has had this functionality for over two decades.

https://orgmode.org/

jarpineh

And since in Lisp code is data and data is code you could go even farther back. A tad sensationalist claim from the article authors.

paddy_m

I develop an open source notebook widget. Working with marimo has been a joy compared to developing on top of any other notebook environment.

The team is responsive and they care about getting it right. Having a sane file format for serializing notebooks is an example of this. They are thinking about core problems. They are also building in the open.

The core jupyter team is very unresponsive and unfocused. When you have a bug, you need to figure out which one of many, many interrelated projects caused it, and issues go weeks without a response. It's a mess.

Then there are the proprietary notebook-like environments, VSCode notebooks and Google Colab in particular. They frequently rely on opaque undocumented APIs and are also very unresponsive.

cantdutchthis

(someone from the marimo team here)

Happy to hear it! Got a link to your widget? I am assuming it's an anywidget?

jarpineh

I find it baffling that the popularity of Jupyter and the successes of notebook analysis in science haven't brought a change in Python to better support this user base. Packaging has (slowly) progressed, and uv nicely made the experience smooth, fast, and above all coherent. Yet the Python runtime and parser are the same as ever. The ipynb notebook format and now Marimo's decorator approach had to be invented on top. Python might never attain the heights of Lisp's REPL-driven development, yet I wonder if it couldn't be better. As much as I enjoy using Jupyter, it's always been something tacked on top of infrastructure that doesn't want to accommodate it. Thus you need to take care of cell order yourself or learn to use a helper tool (Jupytext, Nbdev).

Me, I'd have support in the parser for a language structure or magic comment that marks the cell boundaries. I would make dynamic execution of code a first-party feature, with history tracking and export of all the code sent into the runtime through it. Thus what the runtime actually saw happen could be committed, rather than what the user thought they did. Also, a better, notebook-aware evaluation with extension hooks for different usage methods (interactive, scripting, testing).

I have no solution to the ipynb JSON problem. I do think it is bad that our seemingly only solution for version control can manage only simple text, and users of all the other formats have to adapt or suffer.

Kydlaw

I discovered Marimo a couple of weeks/months ago here, iirc. This really lands on a sweet spot for me for data exploration. For me the features that really nail it are the easy imports from other modules, the integrated UI components, and the app mode.

Being able to build models/simulations easily and share them with others, who can then even interact with the results, has truly motivated me to try more stuff and build more. I've been deploying more and more of these apps as PoCs to prospects, and people really like them as well.

Big thanks to the team!

dchuk

I’ve been tinkering with Marimo, it’s pretty sweet (and you can use cursor or other AI IDEs pretty easily with it).

On running notebooks as scripts: I can't find in the docs what happens if you have plotting and other notebook-oriented code. Like, I'm using pygwalker to explore data through transformation steps, and end with saving to csv. If I just run the notebook as a script, is all of the plotting automatically skipped?

cantdutchthis

(someone from the marimo team here)

It depends a bit on how the notebook is written. There is `mo.app_meta()`, which allows you to detect how the notebook is running: in "app mode", "edit mode", or "script mode".

https://docs.marimo.io/api/app/?h=meta#marimo.app_meta

Effectively this could allow you to do things like "only run this bit when not in script mode" if you want to skip things.
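
A rough sketch of what such a guard could look like (assuming the `mode` attribute described in the linked docs; the messages are just placeholders):

    import marimo as mo

    # mode is "edit", "run", or "script" depending on how the notebook was started
    if mo.app_meta().mode != "script":
        output = mo.md("Interactive session: rendering exploratory plots here.")
    else:
        output = None  # script mode: skip the plots, just write the CSV
    output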

Alternatively you can also run the notebook via the `marimo export` command if you care about the charts and want to have a rendered notebook as an artifact.

Gotta ask out of curiosity, anything you can share about your cursor/marimo workflow? Are you using the llm tools from within marimo or outside of it?

epistasis

I have looked at Marimo in the past, and read this blog post with great interest, but I still don't "get" Marimo. What it does well: have a sane way to create and interact with widgets. Lots of widget authors and tooling authors, people I respect a lot, admire Marimo and like how it does stuff.

However, I'm not sure what the use case is for Marimo. I see Jupyter notebooks being used in two primary use cases: 1) prototyping new code and interactions with services and databases and datasets, as a record of the REPL used to understand something, with interactive notes and plots and pasted in images from docs, etc. 2) a record of how a calculation was performed, experimental data analyzed, and in a permanent artifact that others can look up later. For both of these, outputs and markdown/image cells are just as important as the code cells. These are both "write once" types of things where changes in git are rare, and ideally would never happen.

With Marimo, can I check the outputs directory into version control in a reasonable way and have it stored for posterity? Is that .ipynb?

Is there a way to convert a stored .ipynb checkpoint back into the marimo format?

And why does a small .ipynb change lead to many lines of change in the git diff? It's because the outputs changed. Deciding to not store outputs in version control and counting it as a win for pretty git diffs is saying "this core feature of .ipynb should be ignored because it's inconvenient". I'd much rather educate people about turning on GitHub's visual Jupyter diff rather than switch to an environment where I can no longer store outputs inline.

Similarly, being able to import one cell into a different notebook seems like the wrong direction to solve the problem of "it's time to turn the prototype notebook into a reusable module." If it's time to reuse a cell, it's time to make a cleaned-up Python code module file, not have the code interspersed with all the rest of the stuff.

I'd like to learn more about the use cases where Marimo is useful. As a scientist, it's not useful to me. I don't care about smaller git diffs on a notebook; in fact, if a notebook is getting changed and re-checked into version control, then a big awkward diff is not a problem and probably a feature, because notebooks should not be getting changed. They are a notebook: something that you write in once and it's done!

akshayka

Hi; I'm the author of this post, and one of marimo's original developers. Thanks for taking the time to read this post, and for looking at marimo.

> With Marimo, can I check the outputs directory into version control in a reasonable way and have it stored for posterity? Is that .ipynb?

You can snapshot to ipynb automatically while a marimo notebook is running, or after the fact. You can also snapshot as HTML, which marimo can hydrate back into a notebook.
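
For reference, something along these lines (a sketch; double-check the exact subcommands and flags against `marimo export --help`):

    marimo export ipynb notebook.py -o notebook.ipynb
    marimo export html notebook.py -o notebook.html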

> Is there a way to convert a stored .ipynb checkpoint back into the marimo format?

Yes, `marimo convert notebook.ipynb -o notebook.py`.

marimo was originally commissioned by, and designed in collaboration with, scientists at Stanford's SLAC National Laboratory. These scientists were heavy users of Jupyter, but decided they needed something like marimo to solve two main problems they encountered with Jupyter notebooks: computational reproducibility, and publishing interactive science communication on the web.

We have co-authored an article with these scientists explaining these problems and how marimo solves them, which you can read here: https://marimo.io/blog/slac-marimo.

From our article:

> In 2019, a study from New York University and Federal Fluminense University found that of the 863,878 Jupyter notebooks on GitHub with valid execution orders, only 24% could be re-run, and just 4% reproduced the same results.

Many scientists (these scientists and myself) use notebooks as a critical part of the scientific process. Jupyter notebooks are notorious for being plagued with reproducibility issues due to hidden state, to the extent that the authors of Jupyter wrote a paper titled "Ten simple rules for reproducible research in Jupyter notebooks" [1]. In my mind, these rules are neither simple nor convenient. One is to periodically restart the kernel to make sure you don't accumulate hidden state — untenable for expensive notebooks. Another rule is in fact to use version control (Git):

> Version control is a critical adjunct to notebook use, because the interactive nature of notebooks makes it easy to accidentally change or delete important content. Furthermore, since notebooks contain code, and code inevitably contains bugs, being able to determine the history of when a given bug you have discovered was introduced to the code vs when it was fixed – and thus what analyses it may have affected – is a key capability in scientific computation.

In my own PhD, I used Jupyter notebooks extensively to see my data while I worked on it, and to produce figures for papers [2]; sometimes, these notebooks took a very long time to run. I have a background in software engineering, so I was able to make my notebooks more or less reproducible, though I wasted many hours debugging inconsistencies due to hidden state (one example: delete a cell, but its variables are still in memory!). My co-authors, however, often did not have a background in software. They also used Jupyter notebooks. But when I tried to run their notebooks, either they didn't work at all (packages and computational environment not properly documented), or when running them I got different results than the ones serialized in the notebook (the notebook author ran cells out of order, or ran side-effecting cells multiple times, etc. -- hidden state). This was a huge problem and really a non-starter for computational science.

Yes, you can get Jupyter/Ipykernel to work for you if you try hard enough, if you are experienced enough, and if you are willing to eat enough pain (sometimes literally -- I know of someone who got an incorrect tattoo due to hidden state in a Jupyter notebook that rendered a design for their tattoo; the notebook author forgot to "restart and run all" after making a change). But the imperative nature of the REPL makes it a fundamentally error-prone experience. Our philosophy with marimo is that notebooks should be reproducible by default.

Hope this helps. It's still possible that marimo is not for you, but at least for scientists who use notebooks to conduct research, or want to share interactive articles on the web without paying for compute, I do believe it has something compelling to offer over traditional notebooks like Jupyter.

[1] https://arxiv.org/pdf/1810.08055 [2] https://web.stanford.edu/~boyd/papers/min_dist_emb.html

epistasis

Thanks for such a detailed response, extremely helpful! I do share many of the frustrations, and hope that you are very successful. I have not yet discovered how to make marimo work for me or my teams, but I'm keeping a close eye on it because lots of very smart people like it a lot.

dmadisetti

I'm from a computational mechanics background, and my biggest issue with jupyter notebooks was doing analysis, playing around with my models, getting results I was happy with; then having my kernel break on a fresh session whenever I wanted to revisit the result.

You could argue this is a user issue, and that being sloppy with cell state is a me problem, not a notebook problem, but marimo prevents this in the first place. Even one such incident is already a wasted day's work. And even if one is especially diligent in Jupyter, they do not have the guarantees marimo provides. Unless you are pressing "restart and run all" in a Jupyter kernel every time, you are accumulating state.

I got started with marimo and helped build out the caching because I wanted to open up a notebook from scratch and automatically restart at the point I was at, without having to worry whether my cell order is correct or not. I wanted my code to automatically rerun and update outputs if I changed something like a filtering or transform step on my data. I also wanted to go back to previous states easily without having to rerun long computations or deal with data-dump versioning.

marimo's caching makes this all automatic.

The plan is to bring caching to an opt-in cell basis so this happens behind the scenes, with some ideas for handling side-effects like randomness and other "non-pure" cases.

But to answer your points, my use case has been specifically scientific computing. Yes for 1) prototyping, but also knowing my prototype does not hold some hidden state or bug. I have shown my advisor results, gone to deploy, and been unable to replicate them at scale because of some hidden bug in my notebook execution. To me this is unacceptable. But also 2) documentation, notes, and report sharing. Not saving the outputs directly with the code isn't a blocker here, because caching automatically restores my state, and I can export to a more suitable format like a PDF or HTML file if I need to share it. Saving generated binaries in source for any other setup would be considered poor form; I don't understand why Jupyter gets a pass here. "Write once" is a Google Doc; notebooks should be "run many times", i.e. reproducible and sharable.

Hopefully, we get to address some of the other pain points with existing infrastructure, like auto-scaling on HPCs (easy for marimo, it's already a DAG); integrating code more directly into papers (we have mkdocs, and an experimental quarto extension exists); and built-in environments (no more `pip install -r requirements.txt` on a paper project and hoping the authors remembered to freeze and pin versions).

---

Hope my tone doesn't come off as too contentious; I've just been burned by Jupyter and I'm excited because I feel like marimo addresses its major issues.

epistasis

Fantastic, thanks for such a detailed response, it is very much appreciated and I learned a lot! (And you don't come off as contentious at all!)

I think your use case makes a ton of sense for Marimo, iterative modeling and mucking around. Most of my modeling has been less like that, and goes straight to a single computation with the result. That is a significant workflow difference.

The jupyter lab web interface has many, many UX problems that make it extremely difficult to use, as does the whole kernel-launching infrastructure. Having a desktop app, or another way to launch kernels in a daemon-like mode without keeping a terminal open, could save a ton of headaches.

Marimo is great in that it accommodates pure functional programming fantastically for notebooks. Unfortunately too much of the Python API ecosystem does not, but that doesn't mean that it couldn't in the future.

blooalien

My own personal experience of Marimo vs JupyterLab is that they're both really useful tools, but with somewhat differing (but overlapping) purposes. Marimo has become my go-to for when I want something "notebook-like" but which operates more like a fully-fledged application than a Jupyter notebook would, whereas JupyterLab still (at least for me) is the better choice for random explorations, learning, and note-taking/documentation of my learnings. At this point I'm really feeling like it's a good thing to have both handy and available, since neither one is really a complete replacement for the other.