Hacker News with Generative AI: Data Formats

Cramming Scrapscript into Msgpack (taylor.town)
msgpack is a lovely little serialization format. As a JSON replacement, it saves bandwidth while preserving native language features (e.g. tuples, records, objects, dates).
Nobody gets fired for picking JSON, but maybe they should? (mcyoung.xyz)
JSON is extremely popular but deeply flawed. This article discusses the details of JSON’s design, how it’s used (and misused), and how seemingly helpful “human readability” features cause headaches instead. Crucially, you rarely find JSON-based tools (except dedicated tools like jq) that can safely handle arbitrary JSON documents without a schema—common corner cases can lead to data corruption!
Nobody Gets Fired for Picking JSON, but Maybe They Should? (mcyoung.xyz)
JSON is extremely popular but deeply flawed. This article discusses the details of JSON’s design, how it’s used (and misused), and how seemingly helpful “human readability” features cause headaches instead. Crucially, you rarely find JSON-based tools (except dedicated tools like jq) that can safely handle arbitrary JSON documents without a schema—common corner cases can lead to data corruption!
Pg_parquet – Postgres to Parquet Interoperability (i-programmer.info)
ASCII Delimited Text – Not CSV or Tab Delimited Text (wordpress.com)
Unfortunately a quick google search on “ASCII Delimited Text” shows that IBM and Oracle failed to read the ASCII specification and both define ASCII Delimited Text as a CSV format.  ASCII Delimited Text should use the record separators defined as ASCII 28-31.
New better alterative to XML, JSON and YAML (xenondata.org)
Xenon is the best way to represent information: Terse. Readable multiple line indented text. Native support for arrays Native support for a graph structure, elements may have multiple parents. Native support for a types used in serialization. Unambiguous choice of data structure. Efficient to write by hand. Can be implemented to be blazingly fast or using a mode-less tokenizer/parser. The xenon document is named.
JSON Patch (zuplo.com)
JSON Patch is a standardized format defined in RFC 6902 for describing how to modify a JSON document.
JSON is usually the least bad option for machine-readable output formats (utoronto.ca)
CSVs Are Kinda Bad. DSVs Are Kinda Good (matthodges.com)
Recommended Formats Statement (loc.gov)
Why CSV is still king (konbert.com)
Data Formats: 3D, Audio, Image (paulbourke.net)
TSV – Alternative to CSV (wikipedia.org)
LSON: JSON with binary in 260 lines of public domain Lua (github.com/civboot)
Demystifying the protobuf wire format (kreya.app)
Buckets of Parquet Files Are Awful (scratchdata.com)
The Birth of Parquet (sympathetic.ink)
Object Linking and Embedding (wikipedia.org)
Show HN: ZSV (Zip Separated Values) columnar data format (github.com/Hafthor)