ASK HN: How to engineer a JavaScript to Python migration?
41 comments
March 14, 2025
barrkel
I did a reasonably big rewrite from JavaScript (Nashorn, long story) to Kotlin/JVM recently (with 60x speedup and elimination of huge variance in runtime).
Keys to success in a larger scale translation:
- don't redesign anything, do a port (see also, Typescript compiler to Go port)
- leverage LLMs interactively: per chunk (e.g. function), copy the old code into a comment in the new code, then use LLM completion to quickly fill out the translation
- get something basic up and running ASAP that you can test, ideally driven by (input, expectation) tuples, so that you can write scaffolds that execute both the old and the new code against the same data
- for every method / control flow ported, add tests that target the newly added code, validating it does the same as the old code
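A minimal sketch of the data-driven approach in the list above, assuming the expectations were recorded by running the same inputs through the old implementation. The function `normalize_name`, the helper `run_cases`, and the cases themselves are hypothetical and only stand in for one ported unit.

```python
# Hypothetical ported function; stands in for one unit translated from the old code.
def normalize_name(raw: str) -> str:
    return " ".join(raw.split()).title()


# (input, expectation) tuples. The expectations are assumed to have been
# captured by running the same inputs through the old implementation.
CASES = [
    ("  ada   lovelace ", "Ada Lovelace"),
    ("GRACE HOPPER", "Grace Hopper"),
    ("", ""),
]


def run_cases(func, cases):
    """Return (input, expected, actual) for every case that does not match."""
    failures = []
    for given, expected in cases:
        actual = func(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


if __name__ == "__main__":
    for given, expected, actual in run_cases(normalize_name, CASES):
        print(f"MISMATCH for {given!r}: expected {expected!r}, got {actual!r}")
```

Because the cases are plain data, the same list could in principle be stored in a file and fed to a runner on the old side as well.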
Some of this may be less applicable to scripts or harder to apply to imperative code, for which you might want to spend time converting side-effecting actions into data that can be asserted on (e.g. instead of performing commands, emit a list of commands); do this refactoring on the old code before porting.
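One way to read "emit a list of commands" is sketched below: the decision logic returns data, and a separate step performs it. The names `Command`, `plan_cleanup`, and `execute` are made up for illustration and are not from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """A side effect described as data instead of being performed."""
    action: str
    target: str


def plan_cleanup(files, keep_suffix=".keep"):
    """Decide what to do, without touching the filesystem."""
    return [Command("delete", name) for name in files if not name.endswith(keep_suffix)]


def execute(commands, runner):
    """Perform the planned commands through an injected runner."""
    for command in commands:
        runner(command)


if __name__ == "__main__":
    plan = plan_cleanup(["a.tmp", "b.keep", "c.log"])
    execute(plan, print)  # a real runner would perform the deletion instead
```

A test can then assert on the output of `plan_cleanup` directly, and the same expected command list can be compared against what the old code emits.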
Don't get tempted into doing refactorings as you go. When you notice an opportunity to refactor, create a bug for it. What you don't want to do is build up a list of transformations that increases the more code you port, and makes finishing everything harder and harder.
anonzzzies
> - don't redesign anything, do a port (see also, Typescript compiler to Go port)
> Don't get tempted into doing refactorings as you go.
I would say those are the most important. We did so many migrations in the past 30 years and the only ones that went ok were the ones that held to these rules. If you don't, you are rapidly stuck in a lot of pain and probably you won't be able to get out.
nailer
Do add TODO comments about proposed refactorings for later though.
datadrivenangel
110% this. Resist the urge to make changes until everything is moved over. Any system 'enhancements' may also be viewed as bugs/defects, and they reduce trust, requiring lengthier validation.
mooreds
> for every method / control flow ported, add tests that target the newly added code, validating it does the same as the old code
Nice, and as a bonus you end up with a well tested system. Can't speak highly enough of data driven testing for this kind of system. Gives you such confidence.
> When you notice an opportunity to refactor, create a bug for it.
Did you have success in getting time to revisit all these bugs? Did you get pressure to fix them along the way (in either codebase)?
barrkel
I think data driven tests are underused by engineers generally. One advantage that shines in a porting scenario is that they can be language agnostic.
There was pressure from a reviewer in one area where the code could obviously be improved and I pushed back fairly hard in principle, but this was our first big project together and we were building trust, so I compromised in some leaf functions that presented the same API.
from-nibly
How to translate JavaScript to Python is a bike shed.
The thing that really matters is how are you going to ship this?
You should figure out if there is a way it can be delivered incrementally.
Make sure it's easy to roll back from new to old on as small a chunk as possible.
Make sure rollbacks and deploys don't require manual futzing.
Make sure it's easy for outside people to KNOW the status of things without asking you.
Make sure you have a way to coordinate with feature devs on when it's OK to work on a specific chunk.
Make sure you can test if things are working after you deploy a change.
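A rough sketch of what per-chunk rollback could look like, assuming each workflow can be routed independently. The rollout table, the workflow names, and the two runner functions are hypothetical placeholders.

```python
# Hypothetical per-workflow switch: "new" routes to the Python port,
# anything else falls back to the legacy implementation.
ROLLOUT = {"nightly_report": "new", "invoice_sync": "old"}


def run_legacy(name, payload):
    return f"legacy:{name}"  # placeholder for calling the old system


def run_ported(name, payload):
    return f"ported:{name}"  # placeholder for the new Python code


def run_workflow(name, payload, rollout=None):
    """Dispatch one workflow; unknown workflows default to the old path."""
    table = ROLLOUT if rollout is None else rollout
    if table.get(name, "old") == "new":
        return run_ported(name, payload)
    return run_legacy(name, payload)


if __name__ == "__main__":
    for workflow in ("nightly_report", "invoice_sync", "unlisted_job"):
        print(workflow, "->", run_workflow(workflow, {}))
```

With this shape, rolling one workflow back is a one-entry data change, and the table doubles as a status page that other people can read without asking.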
After that you'll probably come up with like 30 ways to translate the code and use all of them until you find one that's actually tolerable.
JimDabell
Unless you’re dealing with a lot of third-parties who can’t port their code, all of this seems like overkill. Just port the workflows to Python instead of trying to transpile them.
If you have an ecosystem to keep compatibility with, I would look at compiling the JavaScript to WASM and running the WASM from Python, or some kind of sandboxing to continue running the JavaScript as-is.
willquack
Check out PythonMonkey [1], it's an actively maintained project which embeds the SpiderMonkey JS engine inside a Python library. It reuses the same memory buffers whenever possible and allows for pretty impressive interop like executing functions back and forth [2].
At my last job we used PythonMonkey to port our complex distributed computing JS Library to Python enabling us to reuse all the code and keep almost all the performance.
1. https://pythonmonkey.io/ and https://github.com/Distributive-Network/PythonMonkey
2. https://distributive.network/jobs/python-monkey
LunaSea
I would simply dockerize the Airflow tasks and keep them in JS as-is.
Then you write the short DAG description in Python but make the task executor launch the Docker containers.
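As a sketch of the "launch the Docker containers" step, the helper below only builds the `docker run` command line for one JS task. The image name, script path, and environment variable are assumptions; in Airflow itself this role would more likely be filled by an operator from the Docker provider than by hand-built commands.

```python
import shlex


def docker_run_argv(image, script, env=None):
    """Build the argument list for running one JS task in a container."""
    argv = ["docker", "run", "--rm"]
    for key, value in sorted((env or {}).items()):
        argv += ["-e", f"{key}={value}"]  # pass task parameters as env vars
    argv += [image, "node", script]
    return argv


if __name__ == "__main__":
    # Hypothetical image and script names, for illustration only.
    argv = docker_run_argv(
        "workflows-js:latest", "/app/tasks/cleanup.js", {"RUN_DATE": "2025-03-14"}
    )
    print(shlex.join(argv))
```

The list form can be handed to `subprocess.run(argv, check=True)` without any shell quoting concerns.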
And then you're done.
viceconsole
This was my immediate thought. Just because Airflow is written in Python doesn't mean the tasks you're running need to be in Python.
Separate the concerns: migrate the task orchestration to Airflow (or whatever) while keeping the actual Javascript task code largely unchanged.
KolmogorovComp
It's hard to give proper advice without knowing what order of magnitude of LOC you are talking about.
huem0n
If you have any async JS, that's going to seriously complicate things. There's no AST mapping for that (Python async is not the same).
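One concrete difference, sketched under the assumption that the JS code uses `async` functions and `Promise.all`: calling a JS async function starts running its body right away, while calling a Python coroutine function only creates a coroutine object that runs once it is awaited or scheduled. The function `fetch` here is a made-up stand-in, not a real API.

```python
import asyncio

log = []


async def fetch(label):
    log.append(f"start {label}")
    await asyncio.sleep(0)  # yield to the event loop once
    return label.upper()


async def main():
    pending = fetch("a")      # In JS this call would already have started the body.
    log.append("after call")  # In Python nothing from fetch("a") has run yet.
    first = await pending     # The coroutine body only starts here.
    # Rough counterpart of Promise.all: run several coroutines concurrently.
    rest = await asyncio.gather(fetch("b"), fetch("c"))
    return [first, *rest]


if __name__ == "__main__":
    print(asyncio.run(main()))
    print(log)
```

A line-by-line translation that relies on "fire now, await later" ordering can therefore change behavior; `asyncio.create_task` is the usual way to get the eager scheduling back.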
Pitfalls to watch out for? Tons of them. Comparison is very different, modulus is different, .sort is different, object destructuring doesn't map nicely to Python, lambdas won't map nicely to Python, promises won't map to Python. Labelled loops won't map nicely to Python.
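Two of these pitfalls can be made concrete. The sketch below covers integers only: JS `%` takes the sign of the dividend while Python `%` takes the sign of the divisor, and JS `sort()` without a comparator compares elements as strings. The helper names `js_mod` and `js_default_sort` are invented for this example.

```python
def js_mod(a, b):
    """Integer remainder with the sign of the dividend, like JS `%`.

    Raises ZeroDivisionError for b == 0, where JS would produce NaN.
    """
    remainder = abs(a) % abs(b)
    return -remainder if a < 0 else remainder


def js_default_sort(values):
    """Mimic JS sort() without a comparator for a list of integers:
    elements are ordered by their string form, not numerically."""
    return sorted(values, key=str)


if __name__ == "__main__":
    print(-7 % 3, js_mod(-7, 3))      # Python's operator vs. JS-style remainder
    print(sorted([10, 9, 1]), js_default_sort([10, 9, 1]))
```

Floats are deliberately left out, because number-to-string conversion differs between the two languages and `key=str` would no longer match.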
If your JS snippets are truly simple, just LLM translate and manually check. They're pretty good at the simple stuff.
jhfdhsldhdlflj
1. Ensure there are tests for EVERYTHING important on the JS side of things.
2. Port the tests (if necessary -- if there is a REST interface, just use the same tests).
3. Port parts of the code and run against the tests.
This way you have an accurate idea of how your code is working before and after the port.
TZubiri
Line by line, don't overthink it.
Programmers have an unhealthy aversion to repetitive tasks. Sometimes you just have to do work-work. Happens all the time in other industries: clock in at 9, do the same thing for 2 hours, take a break, do the same thing for 2 hours, lunch, 2 hours, break, 2 hours, go home.
Repeat this for weeks if necessary. You can plan it out and predict when it will be done, and if need be ask for more resources.
michaelrpeskin
Was it Larry Wall that said "it's easier to port a shell than a shell script"?
I've done similar inter-language changes, and I have always found it easier to not change the language of the business logic. (Unless the change gives you something really big - for me I often port stuff to numpy because I need the vectorized code, but that's only for very specific problems).
If I had this task, the place where my brain would go would be to find a way to compile JS into C and then use C calling conventions to call the functions from Python. Keep the JS code around so that if you need to change anything, you change it in JS and then recompile to C.
I don't know the JS space very well, but can you get a JS interpreter that lives in Python? That way you can call JS functions from Python?
I don't like transpiling, there's always enough differences between the languages that something bad happens. When I've run into issues like this, since I'm an "old guy", I tend to try to get everything into C calling conventions and use that as my base interface.
Worst case, there has to be some good JS interpreter that can give you a C interface that you could call from Python. So you'd have Python -> C -> JS and your business logic can still live in JS (if your port is because of efficiency and you need compiled code, then you can ignore me.)
willquack
> can you get a JS interpreter that lives in Python
The PythonMonkey library is a full JS interpreter running in the same Python process. It allows for JS functions to be called from Python and vice versa
austin-cheney
> How much of an overkill that might be?
It sounds like a complete waste of time. If you are talking about small code snippets then simply write new original Python to replace them.
ninocan
Yep, I thought about that... Still, there are a few hundred workflows to migrate, so I was looking for a systematic approach
simonw
LLMs are absolutely the right thing to look at for migrating hundreds of "simple" workflows like this.
The hard work will be validating that the code they write for you is exactly right. You would have to do that if you wrote the code yourself, too. The LLMs will accelerate the writing-the-code part but the manual QA work will still be on you: https://simonwillison.net/2025/Mar/11/using-llms-for-code/#y...
antiobli
Working on something similar but in reverse (Python to JavaScript). A few tips:
- If the scripts are primarily by the same author, it's likely they copy-pasted a lot of their functions throughout their scripts. Do something like a "Find All" in the repository holding the scripts for the function name. Regex will help a lot in this, and help you see if a function ever has variations on number of arguments, slight name changes or misspellings, etc.
- Refine an AI prompt as you convert scripts over and over.
- AI (ChatGPT, Copilot, etc.) is going to be the best automation you can get for this, because often times conversions don't match up one-to-one in a language. Especially if your scripts use npm, you might not find one-to-one matches in PyPI. AI refactors will also force you to examine the conversions.
- Understand what tech debt could be introduced in a one-to-one conversion. There may be some language workarounds that make a lot more sense to introduce over exact conversions of workaround functions for the language. I've had to work with several legacy Python scripts that called functions that can be better written in your target language. For example, one of our functions was conditionally_get_keys(level1, level2, level3) to kind of recursively get a nested value. This didn't need to be a function when I rewrote it in JS, rather I just wrote the variable as something like `city = user.location?.city`. No need to one-to-one convert a function that can be better written with a JS language feature. You'll probably encounter this when you can more succinctly write list manipulations with slicing (e.g. `steppedList = fooList[::2]` instead of converting a function necessary in JS to do the same thing).
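The "Find All" tip from the list above could be automated along these lines. This is a sketch that assumes the JS snippets use plain `function name(...)` declarations; arrow functions and methods are not matched, and the `inventory` helper and file names are made up.

```python
import re
from collections import defaultdict

# Matches simple `function name(arg, arg)` declarations only.
FUNC_DECL = re.compile(r"function\s+([A-Za-z_$][\w$]*)\s*\(([^)]*)\)")


def inventory(sources):
    """Map each declared function name to the set of parameter counts seen.

    `sources` maps a file name to that file's JS text.
    """
    arities = defaultdict(set)
    for text in sources.values():
        for name, params in FUNC_DECL.findall(text):
            count = len([p for p in params.split(",") if p.strip()])
            arities[name].add(count)
    return dict(arities)


if __name__ == "__main__":
    scripts = {
        "a.js": "function getKeys(a, b) { return a[b]; }",
        "b.js": "function getKeys(a, b, c) { return a[b][c]; }\nfunction noop() {}",
    }
    for name, counts in sorted(inventory(scripts).items()):
        flag = "  <- varies between scripts" if len(counts) > 1 else ""
        print(name, sorted(counts), flag)
```

Names that show up with more than one parameter count are good candidates for a single shared, ported helper.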
Sorry if you have a time-crunch with converting things, but I would recommend a more hands-on conversion strategy, leveraging AI to do the basic conversion and then manually testing, debugging the solutions.
Context: I was tasked with migrating a legacy workflow system (Broadcom CA Workflow Automation) to Airflow.
There are some jobs that contain rather simple JavaScript snippets, and I was trying to design a first prototype that simply takes the JS parts and runs them in a transpiler.
In this respect, I found a couple of packages that could be leveraged:
- js2py: https://github.com/PiotrDabkowski/Js2Py
- mini-racer: https://github.com/bpcreech/PyMiniRacer
Yet, both seem to be abandoned packages that might not be suitable for usage in production.
Therefore, I was thinking about parsing JavaScript into abstract syntax trees and translating those to Python, whereas a colleague suggested I set up an LLM pipeline.
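To give a feel for the AST route, here is a very small sketch that assumes a JS parser has already produced an ESTree-style tree as nested dicts. It handles only literals, identifiers, and a handful of binary operators, and it ignores semantic differences such as JS string concatenation with `+`. The function names and operator tables are illustrative choices, not part of any library.

```python
import ast

# Operators whose meaning is close enough for this sketch. `===` is mapped to
# `==`, which is only safe when both operands have the same type.
BIN_OPS = {"+": ast.Add, "-": ast.Sub, "*": ast.Mult}
CMP_OPS = {"===": ast.Eq, "!==": ast.NotEq, "<": ast.Lt, ">": ast.Gt}


def to_python(node):
    """Convert one ESTree-style expression node into a Python AST node."""
    kind = node["type"]
    if kind == "Literal":
        return ast.Constant(value=node["value"])
    if kind == "Identifier":
        return ast.Name(id=node["name"], ctx=ast.Load())
    if kind == "BinaryExpression":
        left, right = to_python(node["left"]), to_python(node["right"])
        op = node["operator"]
        if op in BIN_OPS:
            return ast.BinOp(left=left, op=BIN_OPS[op](), right=right)
        if op in CMP_OPS:
            return ast.Compare(left=left, ops=[CMP_OPS[op]()], comparators=[right])
    raise NotImplementedError(f"unsupported node: {node}")


def translate(node):
    """Return Python source for the expression (needs Python 3.9+ for ast.unparse)."""
    return ast.unparse(to_python(node))


if __name__ == "__main__":
    # Tree for the JS expression `price * qty + 1`.
    tree = {
        "type": "BinaryExpression", "operator": "+",
        "left": {"type": "BinaryExpression", "operator": "*",
                 "left": {"type": "Identifier", "name": "price"},
                 "right": {"type": "Identifier", "name": "qty"}},
        "right": {"type": "Literal", "value": 1},
    }
    print(translate(tree))
```

Every unsupported node raises, which at least makes the gaps visible; the number of node types and semantic mismatches left to cover is a fair measure of how much work this route involves.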
How much of an overkill that might be? Has anyone else ever dealt with a JavaScript-to-Python migration and could share heads-ups on strategies or pitfalls to avoid?