Show HN: Globstar – Open-source static analysis toolkit

12 comments

·February 28, 2025

Hey HN! We’re Jai and Sanket, co-founders of DeepSource (YC W20). We're open-sourcing Globstar (https://github.com/DeepSourceCorp/globstar), a static analysis toolkit that lets you easily write and run custom code quality and security checkers in YAML [1] or Go [2].

After 5+ years of building AST-based static analyzers that process millions of lines of code daily at DeepSource, we kept hearing a common request from customers: "How do we write custom checks specific to our codebase?" AppSec and DevOps teams have accumulated many anti-patterns and security rules they want to enforce across their orgs, and being able to do that without being a static analysis expert came up as an important need.

We initially built an internal framework using tree-sitter [3] for our proprietary infrastructure-as-code analyzers, which enabled us to rapidly create new checkers. We realized that making the framework open-source could solve this problem for everyone.

Our key insight was that writing checkers isn't the hard part anymore. Modern AI assistants like ChatGPT and Claude are excellent at generating tree-sitter queries with very high accuracy. We realized that tree-sitter's gnarly s-expression syntax isn’t a problem anymore (since the AI will be doing all the generation anyway), and we can instead focus on building a fast, flexible, and reliable checker runtime around it.

So instead of creating yet another DSL, we use tree-sitter's native query syntax. Yes, the expressions look more complex than simplified DSLs, but they give you direct access to your code's actual AST structure – which means your rules work exactly as you'd expect them to. When you need to debug a rule, you're working with the actual structure of your code, not an abstraction that might hide important details.

We've also designed Globstar to have a gradual learning curve: the YAML interface works well for simple checkers, and the Go interface can handle complex scenarios when you need features like cross-file analysis, scope resolution, data flow analysis, and context awareness. The Go API gives you direct access to tree-sitter bindings, so you can write arbitrarily complex checkers on day one.

Key features:

- Written in Go with native tree-sitter bindings, distributed as a single binary

- MIT-licensed

- Write all your checkers in a “.globstar” folder in your repo, in YAML or Go, and just run “globstar check” without any build steps

- Multi-language support through tree-sitter (20+ languages today)

We have a long way to go and a very exciting roadmap for Globstar, and we’d love to hear your feedback!

[1] https://globstar.dev/guides/writing-yaml-checker

[2] https://globstar.dev/guides/writing-go-checker

[3] https://tree-sitter.github.io/tree-sitter/

markrian

Interesting! Do you have a page which compares globstar against other similar tools, like Semgrep, ast-grep, Comby, etc?

For instance, something like https://ast-grep.github.io/advanced/tool-comparison.html#com....

etyp

I really love that static analyzers are pushing in this direction! I loved writing Clippy lints and I think applying that "it's just code" with custom checks is a powerful idea. I worked on a static analysis product and the rules for that were horrible, I don't blame the customers for not really wanting to write them.

Is there a general way to apply/remove/act on taint in Go checkers? I may not be digging deeply enough but it seems like the example just uses some `unsafeVars` map that is made with a magic `isUserInputSource` method. It's hard for me to immediately tell what the capabilities there are, I bet I'm missing a bit.

injuly

Flow analysis, especially propagation, is a hard problem to solve in the general case. IMO, the one tool that had the best, if language-specific, approach was Pyre – Facebook's type checker and static analyzer for Python.
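To make the taint question above concrete, here is a minimal sketch of name-based taint propagation over straight-line assignments, roughly the idea behind a map of "unsafe" variables. It is not Globstar's API: the `Assign` type and `propagateTaint` function are hypothetical names invented for illustration, and the sketch ignores branches, loops, aliasing, and function calls, which is exactly where the general problem gets hard.

```go
package main

import "fmt"

// Assign is a hypothetical, simplified statement of the form
// `Target = f(Sources...)`. A real checker would derive this from the AST.
type Assign struct {
	Target  string
	Sources []string
}

// propagateTaint walks the statements in order and returns, for each
// variable it has seen, whether that variable may carry user input.
// seeds are the variables assumed to hold user input at the start.
func propagateTaint(seeds []string, stmts []Assign) map[string]bool {
	tainted := make(map[string]bool)
	for _, s := range seeds {
		tainted[s] = true
	}
	for _, st := range stmts {
		fromTainted := false
		for _, src := range st.Sources {
			if tainted[src] {
				fromTainted = true
				break
			}
		}
		// In straight-line code an assignment overwrites the old value,
		// so the target takes on exactly the taint of its sources.
		tainted[st.Target] = fromTainted
	}
	return tainted
}

func main() {
	stmts := []Assign{
		{Target: "query", Sources: []string{"prefix", "userInput"}},
		{Target: "safe", Sources: []string{"prefix"}},
		{Target: "out", Sources: []string{"query"}},
	}
	tainted := propagateTaint([]string{"userInput"}, stmts)
	fmt.Println(tainted["query"], tainted["safe"], tainted["out"]) // true false true
}
```

Removing taint (sanitization) falls out of the same rule here: reassigning a variable from clean sources marks it clean again.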

xxpor

Another rule engine checker that doesn't support the language that needs this type of thing the most: C

In this case, it's inexplicable to me since tree-sitter supports C fine.

sanketsaurav

Supporting C / C++ is in our roadmap. It needs some additional work to handle preprocessor directives [1] [2], which is why we didn't focus on it for the initial release.

[1] https://github.com/tree-sitter/tree-sitter-c/issues/13

[2] https://github.com/tree-sitter/tree-sitter-c/issues/108

ievans

For C, you might be interested in https://github.com/weggli-rs/weggli or https://github.com/semgrep/semgrep (I work on the latter). Both are also tree-sitter based.

pdimitar

Wow this looks great. I will be giving it a go VerySoon™!

Looking forward to writing some enhanced linters.

codepathfinder

Nothing comes closer to CodeQL!

If anyone is interested, please check out codepathfinder.dev, a truly open-source CodeQL alternative.

Feedback is appreciated!

injuly

Admirable effort :)

But in its current state I don't think it actually replaces any of CodeQL's use cases. The most straightforward way to do what CodeQL does today would be to implement a flow analysis IR (say CFG+CallGraph) on top of tree-sitter.

Even the QL grammar itself can be in tree-sitter.

henning

Is there a way to add a comment to disable the check rule similar to what you can do in ESLint to ignore a rule?

sanketsaurav

Not yet, but this is in our roadmap: https://github.com/DeepSourceCorp/globstar/issues/135

We're planning to implement a `skipcq` mute word.
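For readers wondering what a mute word might look like in practice, here is a rough sketch of line-level suppression. The syntax is an assumption, not Globstar's design: a bare `skipcq` on the line mutes every rule, and `skipcq: id1, id2` mutes only the listed rule IDs. The function name `isSuppressed` and the rule ID `no-eval` are made up for illustration, and the sketch does a plain text match without checking that the marker actually sits inside a comment.

```go
package main

import (
	"fmt"
	"strings"
)

// isSuppressed reports whether a finding for ruleID on the given source
// line is muted by a `skipcq` marker (hypothetical syntax, see above).
func isSuppressed(line, ruleID string) bool {
	idx := strings.Index(line, "skipcq")
	if idx < 0 {
		return false // no marker on this line
	}
	rest := strings.TrimSpace(line[idx+len("skipcq"):])
	if !strings.HasPrefix(rest, ":") {
		return true // bare marker mutes every rule
	}
	// Marker with a list: mute only the rule IDs that are named.
	for _, id := range strings.Split(rest[1:], ",") {
		if strings.TrimSpace(id) == ruleID {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isSuppressed("eval(x)  # skipcq: no-eval", "no-eval")) // true
	fmt.Println(isSuppressed("eval(x)  # skipcq: no-eval", "other"))   // false
	fmt.Println(isSuppressed("eval(x)  # skipcq", "other"))            // true
}
```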