Hacker News with Generative AI: ONNX

Low-Latency Bayesian Inference: Deploying Models with PyTorch and ONNX (world.hey.com)
Deploying Bayesian models in production often requires balancing predictive accuracy with low-latency inference.
Transposing Tensor Files (mmapped.blog)
The safetensors library from Huggingface is popular for representing tensors on disk, and its data layout is fully compatible with the onnx raw tensor data format.