Show HN: TypeSchema – A JSON specification to describe data models

51 comments

·October 24, 2024

fridental

Three big turn-offs for me:

1) They do not put the rationale for why the world needs yet another protocol / language / framework on the homepage. It is hidden in https://typeschema.org/history

2) In the history page, they confuse strongly typed and statically typed languages. I have a prejudice about people doing this.

3) The biggest challenge about data models is not auto-generated code (which many people would avoid on principle anyway), but compressed, optimized wire serialization. So you START by selecting this for your application (e.g. Avro, Cap'n Proto, MessagePack etc.) and then use the schema definition language that comes with the serialization tool you've chosen.

deskr

> ... auto-generated code (that many people would avoid in principle anyway)

Auto generated code is 100% enough, sometimes.

dirkt

I still have not found any way to use autogenerated code for Java/Spring that can handle updates to an external OpenAPI spec.

Any pointers?

(Serious question).

ericyd

Point #1 was my biggest turn off. Numbers 2 and 3 are good points too.

owlstuffing

> 1) Yet another protocol etc.

Agreed.

> 3) The biggest challenge about data models is not auto-generated code

I would say auto-generated code is most definitely the harder problem to solve, and I’d also go out on a limb and say it is THE problem to solve.

Whether it’s JSON, XML, JavaScript, SQL, or what have you, integrating both data and behavior between languages is paramount. But nothing has changed in the last 40+ years in how we solve this problem: we still generate code the same clumsy way… Chinese wall between systems, separate build steps, and all the problems that go with it.

Something like Project Manifold[1] for the JVM world is in my view the way forward. Shrug.

1. https://github.com/manifold-systems/manifold

dominicrose

Also, the output in Markdown and PHP doesn't seem good.

nsjdjwnn

I mean, Java and Go are strongly typed languages if you consider Object a = new Integer(1); a = new Float(1f); to be strong.

They are also strict, of course.
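For anyone who wants the static-versus-strong distinction spelled out, here is a minimal sketch of the same shape in Rust rather than Java or Go. The `describe` helper is made up for illustration. The variable's static type only says "some value", yet the runtime still refuses to treat an `f32` as an `i32`.

```rust
use std::any::Any;

/// Reports the runtime type behind a reference whose static type is "anything".
/// Hypothetical helper, only here to make the check visible.
fn describe(value: &dyn Any) -> &'static str {
    if value.is::<i32>() {
        "i32"
    } else if value.is::<f32>() {
        "f32"
    } else {
        "unknown"
    }
}

fn main() {
    // Statically, `a` is only known to hold some value (like `Object` in Java).
    let mut a: Box<dyn Any> = Box::new(1_i32);
    println!("{}", describe(&*a));

    // Reassigning a value of a different concrete type is allowed.
    a = Box::new(1.5_f32);
    println!("{}", describe(&*a));

    // The stale assumption is rejected at runtime instead of being reinterpreted.
    assert!(a.downcast_ref::<i32>().is_none());
}
```

The weak static type decides what may be assigned, while the strong runtime check decides what the value may be used as. These are the two properties that get conflated.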

benatkin

Have you heard of wit? I suspect we'll see use outside of WebAssembly. https://component-model.bytecodealliance.org/design/wit.html

It has non-nullable types, via option, which makes non-nullable the default, since you have to explicitly wrap it in option. https://component-model.bytecodealliance.org/design/wit.html...

A way to represent types commonly found in major languages would be nice, but it would be better to start with something like wit and build on top of it, or at least have a lot of overlap with it.
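To illustrate the "non-nullable by default" idea, here is a rough sketch in Rust, whose `Option` plays a similar role to the `option` wrapper described above. The `User` record and `display_name` helper are hypothetical and not taken from the WIT docs.

```rust
// Hypothetical record: `name` can never be absent, `nickname` can.
#[derive(Debug)]
struct User {
    name: String,
    nickname: Option<String>,
}

/// Picks the nickname when present, otherwise falls back to the required name.
fn display_name(user: &User) -> &str {
    user.nickname.as_deref().unwrap_or(user.name.as_str())
}

fn main() {
    let a = User { name: "Ada".to_string(), nickname: None };
    let b = User {
        name: "Grace".to_string(),
        nickname: Some("Amazing Grace".to_string()),
    };
    println!("{} / {}", display_name(&a), display_name(&b));
}
```

Only the field that is explicitly wrapped needs a fallback. The compiler never asks you to handle a missing `name`.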

pdimitar

That was a great read and it gave me several ideas. Thank you.

matthewtovbin

Why reinvent https://json-schema.org ?? Pros/cons?

michaelsalim

From my understanding, JSON schema describes the schema of JSON objects with JSON. This one describes a variety of types of schemas with JSON.

So it could be TypeScript, Go, GraphQL, etc. It seems to output to JSON Schema as well. I guess its main purpose is to share the schema between different languages. Which I imagine works with JSON Schema too, but this takes it a step further and handles all the mapping you'd need to do otherwise.

froh

JSON Schema has nuanced and expressive constraints to validate information exchanged in JSON serialization.

TypeSchema, in contrast, seems to focus on describing just the structure of data, with the goal of generating stubs in a wide variety of programming languages.

oaiey

So why not subset JSON Schema, as was done with XML Infoset relative to XSD, for example? Extensions are also possible to capture POCO details as needed.

cmgriffing

I find it interesting that the Go serialization just duplicates the props rather than using composition: https://typeschema.org/example/go

Seems a bit naively implemented.

Ideally, the duplicated props in Student would just be a single line of `Human`.
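Go would embed `Human` directly. As a rough sketch of the same idea in Rust, which has no struct embedding, the shared fields can live in one struct that `Student` holds as a field. The field names here are hypothetical and not copied from the generated output.

```rust
// Field names are hypothetical; they are not taken from the generator's output.
#[derive(Debug, Clone, PartialEq)]
struct Human {
    first_name: String,
    last_name: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Student {
    // The shared fields are declared once, in `Human`, and reused here.
    human: Human,
    matriculation_number: u32,
}

impl Student {
    /// Reads the shared fields through the composed `Human` value.
    fn full_name(&self) -> String {
        format!("{} {}", self.human.first_name, self.human.last_name)
    }
}

fn main() {
    let s = Student {
        human: Human {
            first_name: "Ada".to_string(),
            last_name: "Lovelace".to_string(),
        },
        matriculation_number: 42,
    };
    println!("{} ({})", s.full_name(), s.matriculation_number);
}
```

With serde, putting `#[serde(flatten)]` on the `human` field should keep the JSON shape flat, much as embedding does for Go's `encoding/json`.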

Onawa

Comparison between TypeSchema and LinkML for those interested as I was. https://www.perplexity.ai/search/please-compare-and-contrast...

whizzter

What's the benefit over existing variants like Swagger/OpenAPI/JSON Schema?

mariocesar

It feels like a converter, as it can transform TypeSchema into JSON Schema.

8338550bff96

Yeah, I'm not really following the line of reasoning presented on the "/history" page: https://typeschema.org/history

It seems to me like a mischaracterization of JSON Schema to say you can't define a concrete type without actual data.

I am a very stupid individual so I could be misunderstanding the argument.

andix

I can't really follow those arguments either. For example the empty object example {}. Why is this bad? Types without properties are a real thing. Also an empty schema is a real thing.

The thought I do get: JSON Schema primarily describes one main document (object/thing). And additionally defines named types (#/definitions/Student). But it's totally fine to just use the definitions for code generation.

The reference semantics of JSON Schema is quite powerful, a little bit like XML with XSD and all the different imports and addons.

llamaLord

Maybe it's just me, but I've never been able to get a complex type schema to work properly with JSON schema.

The moment you have types referencing other types in a way that can become recursive in ANY way, the whole thing seems to explode.

RedShift1

Heh, feels like JSON Schema to me too... Same, but different.

drdaeman

Feels much weaker and more naive than JSON Schema, as TypeSchema barely has any constraints.

The TypeSchema spec is hard to comprehend, as it doesn't delve into any details and looks more like a bunch of random examples with comments than a proper definitive document (e.g. they never seem to define what the "date-time" string format is). I don't see a way to say, e.g., that a string must be a UUIDv7, or that an integer must be non-negative, or any support for heterogeneous collections, etc.

Maybe it has some uses for code generation across multiple languages for very simple JSON structures, but that feels like a very niche use case. And even then, if you have to hook up per-language validation logic anyway (and probably language-specific patterns too, to express concepts idiomatically), what's the point of a code generator?

amanzi

"What is the difference to JSON Schema? JSON Schema is a constraint system which is designed to validate JSON data. Such a constraint system is not great for code generation, with TypeSchema our focus is to model data to be able to generate high quality code."

They have more details on the History page.

mchicken

It looks far more constrained, especially when it comes to the validation logic, which makes sense validation-wise but honestly quickly becomes a "fate shovels shit in my face" kind of situation when it comes to code generation. As much as I love these sorts of constraints, I also find the union-type discrimination style "meh".

ssousa666

Kotlin classes are (seemingly) all generated as open classes, rather than data classes. Surprising choice - is this an intentional design decision? Wondering if I am missing something

tauntz

The output in various languages is rather questionable. Not wrong per se, as it's totally valid code, but just... not idiomatic and not how a developer fluent in that language would implement it.

nicholaswmin

Hi man - Don't take my tone the wrong way but it's the only way I can express this. I will never, ever - EVER use your craft project without a complete series of unit tests. Especially one like yours. I stop reading immediately and just go on about my life.

Good effort though.

Edit: Oh I thought it was yours. Well I'll leave this up anyway.

cernocky

I once read a paper about Apache/Meta Thrift [1,2]. Similarly, it allows the definition of data types/interfaces and code generation for many programming languages. It was specifically designed for RPCs and microservices.

[1]: https://thrift.apache.org/

[2]: https://github.com/facebook/fbthrift

bobbylarrybobby

The rust generator seems not to place generic parameters on the type itself?

    use serde::{Serialize, Deserialize};

    #[derive(Serialize, Deserialize)]
    pub struct Map {
        #[serde(rename = "totalResults")]
        total_results: Option<u64>,

        #[serde(rename = "entries")]
        entries: Option<Vec<T>>,
    }
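For comparison, a hand-written sketch of what I would expect instead, with the type parameter declared on the struct. This is an assumption about the intended shape, not the generator's actual output. The serde derives are left out so it builds with the standard library alone, and `entry_count` is a made-up helper.

```rust
// Hypothetical corrected shape: `T` is declared on the type, so `Vec<T>` resolves.
// serde attributes are omitted to keep the sketch dependency-free.
#[derive(Debug, Clone, PartialEq)]
pub struct Map<T> {
    pub total_results: Option<u64>,
    pub entries: Option<Vec<T>>,
}

impl<T> Map<T> {
    /// Number of entries actually present, treating a missing list as empty.
    pub fn entry_count(&self) -> usize {
        self.entries.as_ref().map_or(0, |e| e.len())
    }
}

fn main() {
    let page: Map<String> = Map {
        total_results: Some(2),
        entries: Some(vec!["a".to_string(), "b".to_string()]),
    };
    println!("{} entries, {:?} total", page.entry_count(), page.total_results);
}
```

Without the `<T>` on `Map`, the compiler has no `T` in scope and the original struct does not build.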

sevensor

Why is everything nullable?

gregw2

Kinda crazy question, but why not support SQL table/column DDL (nested JSON or arrays within those for bonus points)?