Hacker News with Generative AI: Machine Learning

Skyvern Browser Agent 2.0: How We Reached State of the Art in Evals (skyvern.com)
We’ve been working hard cooking up something new to share with you all!
MuJoCo Playground (mujoco.org)
We introduce MuJoCo Playground, a fully open-source framework for robot learning built with MJX, with the express goal of streamlining simulation, training, and sim-to-real transfer onto robots.
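For orientation, here is a minimal sketch of the MJX substrate that Playground builds on, stepping a trivial model under jax.jit; this is plain MuJoCo/MJX usage, not the Playground API itself.

```python
# Compile a tiny model, move it to device, and JIT one physics step with MJX.
import jax
import mujoco
from mujoco import mjx

xml = """
<mujoco>
  <worldbody>
    <body>
      <freejoint/>
      <geom size="0.1"/>
    </body>
  </worldbody>
</mujoco>
"""
mj_model = mujoco.MjModel.from_xml_string(xml)
mj_data = mujoco.MjData(mj_model)

mjx_model = mjx.put_model(mj_model)          # device-side model
mjx_data = mjx.put_data(mj_model, mj_data)   # device-side state
mjx_data = jax.jit(mjx.step)(mjx_model, mjx_data)
print(mjx_data.qpos)
```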
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget (github.com/SonyResearch)
This repository provides a minimalistic implementation of our approach to training large-scale diffusion models from scratch on an extremely low budget.
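As a reminder of what "training a diffusion model" optimizes, here is a generic sketch of a denoising training step on toy data; the tiny MLP and the simplified linear corruption below are illustrative stand-ins, not the repository's architecture or noise schedule.

```python
# Generic denoising objective: corrupt the input, predict the injected noise, regress with MSE.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32 + 1, 128), nn.SiLU(), nn.Linear(128, 32))  # toy noise predictor
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

x0 = torch.randn(64, 32)                 # stand-in for a batch of (latent) images
t = torch.rand(64, 1)                    # noise level in [0, 1]
noise = torch.randn_like(x0)
xt = (1 - t) * x0 + t * noise            # simplified linear corruption (real schedules differ)

opt.zero_grad()
pred = model(torch.cat([xt, t], dim=1))  # predict the noise from the corrupted input
loss = ((pred - noise) ** 2).mean()
loss.backward()
opt.step()
```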
4M-Token Context Model (github.com/MiniMax-AI)
Train faster static embedding models with sentence transformers (huggingface.co)
This blog post introduces a method to train static embedding models that run 100x to 400x faster on CPU than state-of-the-art embedding models, while retaining most of the quality. This unlocks a lot of exciting use cases, including on-device and in-browser execution, edge computing, low power and embedded applications.
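A quick usage sketch with sentence-transformers; the checkpoint name below is assumed to be the English retrieval model released alongside the post.

```python
# Encoding with a static embedding model on CPU; static models reduce to token-embedding
# lookups plus pooling, which is where the large speedup comes from.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/static-retrieval-mrl-en-v1", device="cpu")
emb = model.encode(["how do I reset my password?", "Follow these steps to reset your password."])
print(emb.shape)
```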
Show HN: Curator – an open-source library for synthetic data generation (github.com/bespokelabsai)
Bespoke Curator makes it easy to create synthetic data pipelines. Whether you are training a model or extracting structure, Curator will prepare high-quality data quickly and robustly.
ε, a Nuisance No More (zna.do)
For a while now I have been advocating for tuning ε in various parts of the modern deep learning stack, and in this post I’ll explain why.
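Two of the usual hiding places for ε are shown in the sketch below; the specific values are illustrative, the point being that ε is a hyperparameter you can expose and tune rather than leave at library defaults.

```python
# ε appears in normalization denominators and in Adam's update denominator, among other places.
import torch.nn as nn
import torch.optim as optim

layer = nn.LayerNorm(512, eps=1e-5)              # ε inside the LayerNorm denominator
opt = optim.Adam(layer.parameters(), lr=3e-4,
                 eps=1e-7)                       # ε added to sqrt(v_t) in Adam's update
```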
Transformer^2: Self-Adaptive LLMs (sakana.ai)
Adaptation is one of the most remarkable phenomena in nature. From the way an octopus changes its skin color to blend into its surroundings, to how the human brain rewires itself after an injury, allowing individuals to recover lost functions and adapt to new ways of thinking or moving, living organisms exhibit an adaptability that allows life to flourish in diverse and ever-changing environments.
Don't use cosine similarity carelessly (migdal.pl)
Midas turned everything he touched into gold. Data scientists turn everything into vectors. We do it for a reason — as gold is the language of merchants, vectors are the language of AI.
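A small numeric illustration of the "carelessly" part: cosine similarity and raw dot products can rank the same candidates differently, so the choice is not a no-op (toy vectors, not the article's examples).

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 0.0, 3.0])
b = np.array([2.0, 0.1, 6.0])      # nearly the same direction as a
c = np.array([10.0, 10.0, 10.0])   # different direction, much larger magnitude

print(cosine(a, b), cosine(a, c))  # cosine ranks b above c (~1.00 vs ~0.73)
print(a @ b, a @ c)                # the raw dot product ranks c above b (20 vs 40)
```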
Show HN: Value likelihoods for OpenAI structured output (arena-ai.github.io)
structured-logprobs is an open-source Python library that enhances OpenAI's structured outputs by providing detailed information about token log probabilities.
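For context, this is the raw material the library post-processes: an OpenAI structured (JSON) completion requested together with token log probabilities. The model name and prompt are placeholders, and this calls the OpenAI SDK directly rather than the library's own helpers.

```python
from openai import OpenAI

client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Return the city as JSON: 'I live in Paris.'"}],
    response_format={"type": "json_object"},   # structured output
    logprobs=True,                             # per-token log probabilities
)
print(completion.choices[0].message.content)        # the JSON answer
print(completion.choices[0].logprobs.content[:3])   # first few token logprobs
```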
Voyage-code-3 (voyageai.com)
TL;DR – Introducing voyage-code-3, our next-generation embedding model optimized for code retrieval. It outperforms OpenAI-v3-large and CodeSage-large by an average of 13.80% and 16.81% on a suite of 32 code retrieval datasets, respectively. By supporting smaller dimensions with Matryoshka learning and quantized formats like int8 and binary, voyage-code-3 can also dramatically reduce storage and search costs with minimal impact on retrieval quality.
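Back-of-the-envelope math for why smaller Matryoshka dimensions plus binary quantization cut storage so sharply; the corpus size and dimensions below are illustrative, not Voyage's published figures.

```python
n_vectors = 100_000_000
full_bytes = n_vectors * 2048 * 4        # 2048-dim float32 vectors
binary_bytes = n_vectors * 256 // 8      # 256-dim binary vectors (1 bit per dimension)
print(full_bytes / 1e9, "GB vs", binary_bytes / 1e9, "GB")   # ~819 GB vs ~3.2 GB, a 256x reduction
```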
Titans: Learning to Memorize at Test Time (arxiv.org)
For more than a decade, there has been an extensive research effort on how to effectively utilize recurrent models and attention.
AI Engineer Reading List (latent.space)
We picked 50 paper/models/blogs across 10 fields in AI Eng: LLMs, Benchmarks, Prompting, RAG, Agents, CodeGen, Vision, Voice, Diffusion, Finetuning. If you're starting from scratch, start here.
VideoRAG: Retrieval-Augmented Generation over Video Corpus (arxiv.org)
Retrieval-Augmented Generation (RAG) is a powerful strategy to address the issue of generating factually incorrect outputs in foundation models by retrieving external knowledge relevant to queries and incorporating it into their generation process.
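The retrieve-then-generate pattern the paper extends to video, sketched over a toy text corpus with placeholder embeddings; a real system would use a trained embedding model and an LLM for the final step.

```python
import numpy as np

corpus = ["How to fix a flat tire", "Baking sourdough at home", "Replacing a bike chain"]
corpus_emb = np.random.randn(len(corpus), 64)                 # placeholder embeddings
corpus_emb /= np.linalg.norm(corpus_emb, axis=1, keepdims=True)

query_emb = corpus_emb[0] + 0.1 * np.random.randn(64)         # pretend the query is near doc 0
query_emb /= np.linalg.norm(query_emb)

top_k = np.argsort(-(corpus_emb @ query_emb))[:2]             # retrieve the 2 most relevant docs
prompt = "Context:\n" + "\n".join(corpus[i] for i in top_k) + "\n\nQuestion: ..."
print(prompt)                                                 # this augmented prompt goes to the generator
```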
Nvidia Tensor Core Programming (leimao.github.io)
NVIDIA Tensor Cores have been dedicated accelerators for general matrix multiplication (GEMM) operations on NVIDIA GPUs since the Volta architecture.
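A quick way to see the GEMM path the article then programs by hand: on Volta-or-newer GPUs, cuBLAS routes half-precision matmuls like the one below through Tensor Cores.

```python
import torch

if torch.cuda.is_available():
    a = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
    b = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
    c = a @ b                     # fp16 GEMM, dispatched to Tensor Cores by cuBLAS
    torch.cuda.synchronize()
    print(c.shape, c.dtype)
```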
Show HN: SemHash – Fast Semantic Text Deduplication for Cleaner Datasets (github.com/MinishLab)
Homomorphic Encryption in iOS 18 (boehs.org)
You are Apple. You want to make search work like magic in the Photos app, so the user can find all their “dog” pictures with ease. You devise a way to numerically represent the concepts of an image, so that you can find how closely images are related in meaning. Then, you create a database of known images and their numerical representations (“this number means car”), and find the closest matches. To preserve privacy, you put this database on the phone.
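The plaintext version of that lookup is just embedding nearest-neighbor search, sketched below with random stand-in vectors; the article's subject is how iOS 18 runs such a query under homomorphic encryption so the server never sees the photo's vector.

```python
import numpy as np

db = np.random.randn(1000, 128)                    # stand-in for the concept database
db /= np.linalg.norm(db, axis=1, keepdims=True)

photo_vec = np.random.randn(128)                   # stand-in for one photo's embedding
photo_vec /= np.linalg.norm(photo_vec)

scores = db @ photo_vec                            # cosine similarity against every entry
print(np.argsort(-scores)[:5])                     # indices of the 5 closest concepts
```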
Experiments with Byte Matrix Multiplication (github.com/serge-sans-paille)
It's quite common in machine learning operations to multiply a matrix of unsigned bytes by a matrix of signed bytes.
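The operation in question, shown naively with NumPy: widen to 32-bit before multiplying, because the per-element u8 × s8 products already exceed 8 bits; the post explores doing this efficiently with SIMD.

```python
import numpy as np

A = np.random.randint(0, 256, size=(4, 8), dtype=np.uint8)     # unsigned 8-bit activations
B = np.random.randint(-128, 128, size=(8, 3), dtype=np.int8)   # signed 8-bit weights

C = A.astype(np.int32) @ B.astype(np.int32)   # accumulate in int32 to avoid overflow
print(C)
```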
Learning how to think with Meta Chain-of-Thought (arxiv.org)
We propose a novel framework, Meta Chain-of-Thought (Meta-CoT), which extends traditional Chain-of-Thought (CoT) by explicitly modeling the underlying reasoning required to arrive at a particular CoT.
An Introduction to Neural Ordinary Differential Equations [pdf] (diposit.ub.edu)
Apple's Machine Learning Research can now detect Heart Murmurs with 95% accuracy (myhealthyapple.com)
Apple has been at the forefront of cardio tech since it rolled out the Apple Watch close to a decade ago. Many of the company’s innovations, such as atrial fibrillation detection and irregular heartbeat detection, have proven to be life-saving.
Embedding Models for Information Retrieval in 2025 (datastax.com)
The just-released Voyage-3-large is the surprise leader in embedding relevance
Show HN: TabPFN v2 – A SOTA foundation model for small tabular data (nature.com)
Tabular data, spreadsheets organized in rows and columns, are ubiquitous across scientific fields, from biomedicine to particle physics to economics and climate science.
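A sketch of the scikit-learn-style interface the accompanying tabpfn package exposes (the class name and the behavior of fit as in-context conditioning are assumptions here; toy dataset from scikit-learn).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()       # one pre-trained transformer, no per-dataset gradient training
clf.fit(X_train, y_train)      # stores the training set as in-context examples
print(clf.score(X_test, y_test))
```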
How to become a Data Scientist? My journey, overview of skill set, practice tips (mljar.com)
In recent years, many people have been drawn to the field of data science, often believing it to be a fast track to wealth.
SOTA on swebench-verified: relearning the bitter lesson (aide.dev)
SWE-bench is a dataset that tests systems' ability to solve GitHub issues automatically.
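To peek at the benchmark itself, one can load it with the datasets library; the dataset id and field names below are assumed from the public SWE-bench release.

```python
from datasets import load_dataset

ds = load_dataset("princeton-nlp/SWE-bench_Verified", split="test")
print(len(ds))                                          # number of verified GitHub issues
print(ds[0]["repo"], ds[0]["problem_statement"][:200])  # which repo, and the issue text
```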
Phi-4 weights have been released under MIT license (huggingface.co)
phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning. phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
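With the weights now MIT-licensed, loading them is a standard transformers call; the Hub repo id below is assumed to be microsoft/phi-4.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/phi-4")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-4", torch_dtype="auto", device_map="auto")

inputs = tok("Explain direct preference optimization in one sentence.", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```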
AI in the 80s? How a Simple Animal Guessing Game Pioneered Machine Learning (medium.com)
Recently, I stumbled upon an old programming book on the shelf in the library of my childhood. Yellowed pages, the smell of dust, and lines printed in monochrome style. Among examples of seemingly long-outdated algorithms, I came across a game called “Guess the Animal.”
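The program the article revisits is a binary tree of yes/no questions that grows a new node whenever it guesses wrong; a minimal Python sketch of that learn-by-asking loop is below.

```python
tree = {"guess": "cat"}  # leaves hold a guess; internal nodes hold a question plus yes/no branches

def play(node):
    if "question" in node:
        branch = "yes" if input(node["question"] + " (y/n) ") == "y" else "no"
        play(node[branch])
    elif input(f"Is it a {node['guess']}? (y/n) ") == "y":
        print("Got it!")
    else:
        animal = input("I give up. What was it? ")
        question = input(f"Give a yes/no question that is true for a {animal} but not a {node['guess']}: ")
        node["question"] = question          # the wrong leaf becomes a question node:
        node["yes"] = {"guess": animal}      # the program has "learned" a new animal
        node["no"] = {"guess": node.pop("guess")}

play(tree)
```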
Parquet and ORC's many shortfalls for machine learning, and what to do about it? (starburst.io)
At the turn of the century (around a quarter of a century ago), over 99% of the data management industry used row-oriented storage to store data for all workloads involving structured data — including transactional and analytical workloads.
Data from macaque monkeys reveals flaws in deep neural networks (seas.harvard.edu)
Among the marvels of the human brain is its ability to generalize. We see an object, like a chair, and we know it’s a chair, even when it’s a slightly different shape, or it’s found in an unexpected place or in a dimly lit environment.