ASK HN: How to engineer a JavaScript to Python migration?
33 comments
· March 14, 2025
barrkel
I did a reasonably big rewrite from JavaScript (Nashorn, long story) to Kotlin/JVM recently (with 60x speedup and elimination of huge variance in runtime).
Keys to success in a larger scale translation:
- don't redesign anything, do a port (see also, Typescript compiler to Go port)
- leverage LLMs interactively: per chunk (e.g. function), copy the old code into a comment in the new code, then use LLM completion to quickly fill out the translation
- get something basic up and running ASAP that you can test, ideally driven by data, i.e. (inputs, expectations) tuples, so that you can write scaffolds that execute both the old and the new code
- for every method / control flow ported, add tests that target the newly added code, validating it does the same as the old code
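A minimal sketch of the data-driven scaffold described in the list above, assuming the ported functions are pure (same inputs, same output). All names here are hypothetical. `legacy_pad` is a Python stand-in for the old code; in a real port it would wrap a call into the old JavaScript, which is not shown. The expectations are recorded from the old code, then replayed against the new code.

```python
from typing import Any, Callable, Iterable

Case = tuple[tuple[Any, ...], Any]  # (inputs, expected output)


def record_cases(old_impl: Callable[..., Any], inputs: Iterable[tuple]) -> list[Case]:
    """Build expectations by running the old code: its output is the spec."""
    return [(args, old_impl(*args)) for args in inputs]


def run_cases(impl: Callable[..., Any], cases: Iterable[Case]) -> list[str]:
    """Run impl on every case and return a list of human-readable failures."""
    failures = []
    for args, expected in cases:
        try:
            actual = impl(*args)
        except Exception as exc:  # a crash is also a behavioral difference
            failures.append(f"{args!r}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{args!r}: expected {expected!r}, got {actual!r}")
    return failures


def legacy_pad(value: int, width: int) -> str:
    """Hypothetical stand-in for the old code (mimics JS String(v).padStart(w, "0"))."""
    return str(value).rjust(width, "0")


def ported_pad(value: int, width: int) -> str:
    """Hypothetical 'obvious' port that differs subtly on negative numbers."""
    return str(value).zfill(width)


cases = record_cases(legacy_pad, [(7, 3), (42, 2), (-5, 4)])
failures = run_cases(ported_pad, cases)
for line in failures:
    print(line)
```

The negative input is the only case that fails, which is the kind of divergence this scaffold is meant to surface before the old code is retired.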
Some of this may be less applicable to scripts or harder to apply to imperative code, for which you might want to spend time converting side-effecting actions into data that can be asserted on (e.g. instead of performing commands, emit a list of commands); do this refactoring on the old code before porting.
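As a rough illustration of turning side effects into data, here is a sketch with a hypothetical log-archiving script. `plan_archive`, `execute`, and `Command` are made-up names, not from any real codebase. The planning function only returns the commands it would run, so tests can assert on the list without touching the filesystem.

```python
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """One shell command, represented as data instead of being run directly."""
    argv: tuple[str, ...]


def plan_archive(files: list[str], dest: str) -> list[Command]:
    """Decide what to do, but do not do it: return the commands as data."""
    commands = [Command(("mkdir", "-p", dest))]
    for name in sorted(files):
        if name.endswith(".log"):
            commands.append(Command(("mv", name, dest)))
    return commands


def execute(commands: list[Command]) -> None:
    """The only side-effecting part, kept small enough to review by eye."""
    for command in commands:
        subprocess.run(list(command.argv), check=True)


plan = plan_archive(["b.log", "a.txt", "a.log"], "old")
for command in plan:
    print(" ".join(command.argv))
```

The same split would be applied to the old code first, so that both versions can be compared on the emitted list.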
Don't get tempted into doing refactorings as you go. When you notice an opportunity to refactor, create a bug for it. What you don't want to do is build up a list of transformations that increases the more code you port, and makes finishing everything harder and harder.
anonzzzies
> - don't redesign anything, do a port (see also, Typescript compiler to Go port)
> Don't get tempted into doing refactorings as you go.
I would say those are the most important. We did so many migrations in the past 30 years and the only ones that went ok were the ones that held to these rules. If you don't, you are rapidly stuck in a lot of pain and probably you won't be able to get out.
nailer
Do add TODO comments about proposed refactorings for later though.
from-nibly
How to translate JavaScript to Python is a bike shed.
The thing that really matters is how are you going to ship this?
You should figure out if there is a way it can be delivered incrementally.
Make sure it's easy to roll back from new to old on as small a chunk as possible.
Make sure rollbacks and deploys don't require manual futzing.
Make sure it's easy for outside people to KNOW the status of things without asking you.
Make sure you have a way to coordinate with feature devs on when it's OK to work on a specific chunk.
Make sure you can test if things are working after you deploy a change.
After that you'll probably come up with like 30 ways to translate the code and use all of them until you find one that's actually tolerable.
michaelrpeskin
Was it Larry Wall that said "it's easier to port a shell than a shell script"?
I've done similar inter-language changes, and I have always found it easier to not change the language of the business logic. (Unless the change gives you something really big - for me I often port stuff to numpy because I need the vectorized code, but that's only for very specific problems).
If I had this task, the place where my brain would go would be to find a way to compile JS into C and then use C calling conventions to call the functions from Python. Keep the JS code around so that if you need to change anything, you change it in JS and then recompile to C.
I don't know the JS space very well, but can you get a JS interpreter that lives in Python? That way you can call JS functions from Python?
I don't like transpiling, there's always enough differences between the languages that something bad happens. When I've run into issues like this, since I'm an "old guy", I tend to try to get everything into C calling conventions and use that as my base interface.
Worst case, there has to be some good JS interpreter that can give you a C interface that you could call from Python. So you'd have Python -> C -> JS and your business logic can still live in JS (if your port is because of efficiency and you need compiled code, then you can ignore me.)
JimDabell
Unless you’re dealing with a lot of third-parties who can’t port their code, all of this seems like overkill. Just port the workflows to Python instead of trying to transpile them.
If you have an ecosystem to keep compatibility with, I would look at compiling the JavaScript to WASM and running the WASM from Python, or some kind of sandboxing to continue running the JavaScript as-is.
huem0n
If you have any async JS, that's going to seriously complicate things. There's no AST mapping for that (Python async is not the same).
Pitfalls to watch out for? Tons of them. Comparison is very different, modulus is different, .sort is different, object destructuring doesn't map nicely to Python, lambdas won't map nicely to Python, promises won't map to Python. Labelled loops won't map nicely to Python.
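Two of these pitfalls can be shown in a few lines. This is only a sketch for integer inputs, and `js_mod` and `js_default_sort` are hypothetical helper names. Known gaps: `b == 0` raises here where JS would give NaN, floats are formatted differently by the two languages, and strings outside the Basic Multilingual Plane may sort differently because JS compares UTF-16 code units.

```python
import math


def js_mod(a: int, b: int) -> int:
    """JS `%` for integers: the result takes the sign of the dividend.

    Python's `%` takes the sign of the divisor instead.
    """
    return int(math.fmod(a, b))


def js_default_sort(items: list[int]) -> list[int]:
    """JS `Array.prototype.sort()` without a comparator compares elements as strings."""
    return sorted(items, key=str)


print(-7 % 3)                            # Python semantics: 2
print(js_mod(-7, 3))                     # JS semantics: -1
print(sorted([10, 9, 1, 100]))           # numeric order: [1, 9, 10, 100]
print(js_default_sort([10, 9, 1, 100]))  # string order: [1, 10, 100, 9]
```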
If your JS snippets are truly simple, just LLM translate and manually check. They're pretty good at the simple stuff.
egeozcan
Random idea: Couldn't Babel translate async code to callbacks?
LunaSea
I would simply dockerize the Airflow tasks and keep them in JS as-is.
Then you write the short DAG description in Python but make the task executor launch the Docker containers.
And then you're done.
viceconsole
This was my immediate thought. Just because Airflow is written in Python doesn't mean the tasks you're running need to be in Python.
Separate the concerns: migrate the task orchestration to Airflow (or whatever) while keeping the actual Javascript task code largely unchanged.
KolmogorovComp
It's hard to give proper advice without knowing what order of magnitude of LOC you are talking about.
jhfdhsldhdlflj
1. Ensure there are tests for EVERYTHING important on the JS side of things.
2. Port the tests (if necessary -- if there is a REST interface, just use the same tests)
3. Port parts of the code and run against the tests.
This way you have an accurate idea of how your code is working before and after the port.
willquack
Check out PythonMonkey [1], it's an actively maintained project which embeds the SpiderMonkey JS engine inside a Python library. It reuses the same memory buffers whenever possible and allows for pretty impressive interop like executing functions back and forth [2].
At my last job we used PythonMonkey to port our complex distributed computing JS Library to Python enabling us to reuse all the code and keep almost all the performance.
[1] https://pythonmonkey.io/ and https://github.com/Distributive-Network/PythonMonkey
[2] https://distributive.network/jobs/python-monkey
antiobli
Working on something similar but in reverse (Python to JavaScript). A few tips:
- If the scripts are primarily by the same author, it's likely they copy-pasted a lot of their functions throughout their scripts. Do something like a "Find All" for the function name in the repository holding the scripts. Regex will help a lot with this, and help you see if a function ever has variations in the number of arguments, slight name changes or misspellings, etc.
- Refine an AI prompt as you convert scripts over and over.
- AI (ChatGPT, Copilot, etc.) is going to be the best automation you can get for this, because often times conversions don't match up one-to-one in a language. Especially if your scripts use npm, you might not find one-to-one matches in PyPI. AI refactors will also force you to examine the conversions.
- Understand what tech debt could be introduced in a one-to-one conversion. There may be some language workarounds that make a lot more sense to introduce over exact conversions of workaround functions for the language. I've had to work with several legacy Python scripts that called functions that can be better written in your target language. For example, one of our functions was conditionally_get_keys(level1, level2, level3) to kind of recursively get a nested value. This didn't need to be a function when I rewrote it in JS, rather I just wrote the variable as something like `city = user.location?.city`. No need to one-to-one convert a function that can be better written with a JS language feature. You'll probably encounter this when you can more succinctly write list manipulations with slicing (e.g. `steppedList = fooList[::2]` instead of converting a function necessary in JS to do the same thing).
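For the first tip in the list above (finding copy-pasted functions whose signatures drifted), a small script can do the "Find All" across every file at once. This is a sketch only: the function names are hypothetical, the regex only sees plain `function name(...)` declarations (no arrow functions or methods), and it will misread default values that contain parentheses. The sources are passed in as a filename-to-text mapping; reading them from disk is left out.

```python
import re
from collections import defaultdict

# Matches `function name(params)` declarations only.
FUNCTION_DECL = re.compile(r"\bfunction\s+([A-Za-z_$][\w$]*)\s*\(([^)]*)\)")


def collect_signatures(sources: dict[str, str]) -> dict[str, set[tuple[str, ...]]]:
    """Map each function name to the set of distinct parameter lists seen for it."""
    signatures = defaultdict(set)
    for text in sources.values():
        for name, params in FUNCTION_DECL.findall(text):
            names = tuple(p.strip() for p in params.split(",") if p.strip())
            signatures[name].add(names)
    return dict(signatures)


def find_variants(sources: dict[str, str]) -> dict[str, set[tuple[str, ...]]]:
    """Keep only the functions that were declared with more than one parameter list."""
    return {
        name: sigs
        for name, sigs in collect_signatures(sources).items()
        if len(sigs) > 1
    }


sources = {
    "a.js": "function getKeys(a, b) { return 1; }\nfunction log(msg) {}",
    "b.js": "function getKeys(a, b, c) { return 2; }\nfunction log( msg ) {}",
}
print(find_variants(sources))
```

Functions that show up with several parameter lists are good candidates to port once and reconcile by hand.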
Sorry if you have a time-crunch with converting things, but I would recommend a more hands-on conversion strategy, leveraging AI to do the basic conversion and then manually testing, debugging the solutions.
dvh
I once needed to convert a 2000-line Excel formula to PHP, but PHP linters suck and are generally not helpful, so I converted it to JS first, then just added $ signs in front of the variable names, made a few minor tweaks, and it worked. It was easier than going directly to PHP.
harvey9
Do you literally mean Excel formula and not VBA? That's mind-blowing.
anonzzzies
A lot of insurers etc. do their work in Excel; I've seen 10000s of 'lines' of formulas in 1000s of sheets needing to be translated into Java. Most of them try, every few years, one of those 'why can't we just run Excel on the backend?' projects with one of those tools, commercial or not, spend a bunch of money, find it's a crap idea (scalability, maintenance etc.) and then port it to a 'real language'.
nextts
Semantics are gonna get you. Especially if they use idiomatic stuff like (!x && x == y) and rely on JS type coercion.
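To make the coercion point concrete: a literal translation of `!x` to `not x` changes behavior for some values, and `0 == ""` is true in JS but False in Python. Below is a sketch of a helper that approximates JS truthiness for plain values. `js_truthy` is a hypothetical name, and it assumes JS `null`/`undefined` arrive as `None`, arrays as lists, and objects as dicts.

```python
import math


def js_truthy(value) -> bool:
    """Approximate JS truthiness for values that came over from JS data."""
    if value is None or value is False:
        return False
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        is_nan = isinstance(value, float) and math.isnan(value)
        return not (value == 0 or is_nan)
    if isinstance(value, str):
        return value != ""
    return True  # arrays and objects are always truthy in JS, even when empty


# Values where Python's own truthiness disagrees with JS:
for sample in ([], {}, float("nan")):
    print(repr(sample), "python:", bool(sample), "js:", js_truthy(sample))
```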
In this sense an LLM or hand crafted approach may win out.
Also, the APIs will likely be different.
Context: I was tasked with migrating a legacy workflow system (Broadcom CA Workflow Automation) to Airflow.
There are some jobs that contain rather simple JavaScript snippets, and I was trying to design a first prototype that simply takes the JS parts and runs them in a transpiler.
In this respect, I found a couple of packages that could be leveraged:
- js2py: https://github.com/PiotrDabkowski/Js2Py
- mini-racer: https://github.com/bpcreech/PyMiniRacer
Yet, both seem to be abandoned packages that might not be suitable for usage in production.
Therefore, I was thinking about parsing JavaScript's abstract syntax trees and translating them to Python, whereas a colleague suggested I set up an LLM pipeline instead.
How much of an overkill might that be? Has anyone else ever dealt with a JavaScript-to-Python migration and could share heads-ups on strategies or pitfalls to avoid?