Hacker News with Generative AI: Machine Learning

Helix: A Vision-Language-Action Model for Generalist Humanoid Control (figure.ai)
We're introducing Helix, a generalist Vision-Language-Action (VLA) model that unifies perception, language understanding, and learned control to overcome multiple longstanding challenges in robotics.
It's time to become an ML engineer (2022) (gregbrockman.com)
AI has recently crossed a utility threshold, where cutting-edge models such as GPT-3, Codex, and DALL-E 2 are actually useful and can perform tasks computers cannot do any other way.
Run structured extraction on documents/images locally with Ollama and Pydantic (github.com/vlm-run)
Welcome to VLM Run Hub, a comprehensive repository of pre-defined Pydantic schemas for extracting structured data from unstructured visual domains such as images, videos, and documents.
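Below is a minimal sketch of the kind of workflow this enables, using the ollama Python client's structured-output support together with a Pydantic schema. The "Invoice" schema and model name are illustrative assumptions for this example, not schemas shipped by VLM Run Hub.

```python
# Sketch: structured extraction from an image with a Pydantic schema and the
# ollama Python client. The Invoice schema and model name are illustrative,
# not part of VLM Run Hub.
from pydantic import BaseModel
import ollama


class LineItem(BaseModel):
    description: str
    amount: float


class Invoice(BaseModel):
    vendor: str
    total: float
    items: list[LineItem]


response = ollama.chat(
    model="llama3.2-vision",                 # any local vision-capable model
    messages=[{
        "role": "user",
        "content": "Extract the invoice fields from this document.",
        "images": ["invoice.png"],
    }],
    format=Invoice.model_json_schema(),      # constrain output to the schema
)

invoice = Invoice.model_validate_json(response.message.content)
print(invoice.vendor, invoice.total)
```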
I built a large language model "from scratch" (brettgfitzgerald.com)
I’m a machine learning / A.I. hobbyist. The technologies fascinate me, and I can’t seem to learn enough about them. Sebastian Raschka’s book, Build a Large Language Model (From Scratch) caught my eye. I don’t recall how I stumbled on it, but I found it when it was still in early access from Manning Publications. I purchased it, and started working through it as the final chapters were being written and released.
The Ultra-Scale Playbook: Training LLMs on GPU Clusters (huggingface.co)
Implementing LLaMA3 in 100 Lines of Pure Jax (saurabhalone.com)
In this post, we'll implement llama3 from scratch using pure jax in just 100 lines of code. Why jax? Because I think it has good aesthetics. Also, jax looks like a NumPy wrapper, but it has some cool features like XLA (an accelerated linear algebra compiler), jit, vmap, pmap, etc., which make your training go brr brr.
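For a sense of why those features matter, here is a tiny illustration of jit and vmap on a toy scaled dot-product attention function; this is a generic example, not code from the post.

```python
# Toy example of the jax features the post relies on: jit compiles a function
# with XLA, and vmap maps it over a batch axis without explicit loops.
import jax
import jax.numpy as jnp


def attention(q, k, v):
    # q, k, v: (seq_len, head_dim)
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v


# Compile once with XLA, then map over a leading batch axis.
batched_attention = jax.jit(jax.vmap(attention))

kq, kk, kv = jax.random.split(jax.random.PRNGKey(0), 3)
q = jax.random.normal(kq, (8, 16, 32))
k = jax.random.normal(kk, (8, 16, 32))
v = jax.random.normal(kv, (8, 16, 32))
out = batched_attention(q, k, v)
print(out.shape)  # (8, 16, 32)
```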
Native Sparse Attention: Hardware-Aligned, Natively Trainable Sparse Attention (arxiv.org)
Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant challenges.
SWE-Lancer: a benchmark of freelance software engineering tasks from Upwork (arxiv.org)
We introduce SWE-Lancer, a benchmark of over 1,400 freelance software engineering tasks from Upwork, valued at $1 million USD total in real-world payouts.
ZeroBench: An Impossible* Visual Benchmark for Contemporary Multimodal Models (zerobench.github.io)
Contemporary LMMs often exhibit remarkable performance on existing visual benchmarks, yet closer inspection reveals persistent shortcomings in their ability to interpret and reason about visual content. Many existing benchmarks tend to become saturated, losing their value as effective measures of the true visual understanding capabilities of frontier models.
Step-Video-T2V: The Practice, Challenges, and Future of Video Foundation Model (arxiv.org)
We present Step-Video-T2V, a state-of-the-art text-to-video pre-trained model with 30B parameters and the ability to generate videos up to 204 frames in length.
OpenArc – Lightweight Inference Server for OpenVINO (github.com/SearchSavior)
OpenArc is a lightweight inference API backend built on Optimum-Intel (Hugging Face's Intel extension for Transformers), leveraging hardware acceleration on Intel CPUs, GPUs, and NPUs through the OpenVINO runtime and OpenCL drivers.
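As a rough sketch of the Optimum-Intel layer such a server builds on, the snippet below exports a Transformers model to OpenVINO and runs generation; the model id and device string are illustrative, and this is not OpenArc's own API.

```python
# Sketch: running a causal LM through Optimum-Intel's OpenVINO backend.
# Model id and device are examples; OpenArc wraps this kind of stack in an API.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(
    model_id,
    export=True,   # convert the PyTorch checkpoint to OpenVINO IR on the fly
    device="GPU",  # "CPU", "GPU", or "NPU" depending on available hardware
)

inputs = tokenizer("What does OpenVINO accelerate?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```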
The secret ingredients of word2vec (2016) (ruder.io)
This post will discuss the factors that account for the success of word2vec and its connection to more traditional models.
Physics Informed Neural Networks (pages.dev)
I mentioned in a previous post that, as part of my position as a junior DS at IKEA, I also get the opportunity to take part in their training program (called the AI accelerator program). It’s a six-month program in which we are trained on all aspects of data science and AI, both by professionals and through various courses.
Overfitting to Theories of Overfitting (argmin.net)
I ended yesterday’s post arguing that we should remove this from machine learning classes.
Softmax forever, or why I like softmax (kyunghyuncho.me)
Deepseek R1 Distill 8B Q40 on 4 x Raspberry Pi 5 (github.com/b4rtaz)
Deepseek R1 Distill 8B Q40 on 4 x Raspberry Pi 5 8GB
Nature loves patterns (fayziev.com)
As I drift through my winter vacation, detoxing from work, tech and world matters, I notice my little brother crawling toward a heat stove. With a week of rest behind me, my mind wanders into philosophical territory, and I can't help but map this moment to machine learning.
OpenVINO AI effects for Audacity (audacityteam.org)
Intel has built a suite of AI tools for Audacity, useful for spoken word audio and music alike. These AI features run 100% locally on your PC.
LM2: Large Memory Models (arxiv.org)
This paper introduces the Large Memory Model (LM2), a decoder-only Transformer architecture enhanced with an auxiliary memory module that aims to address the limitations of standard Transformers in multi-step reasoning, relational argumentation, and synthesizing information distributed over long contexts.
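As a schematic of what an auxiliary memory read could look like in a decoder-only Transformer, here is a generic cross-attention-to-memory step under my own assumptions; it is an illustration of the general idea, not LM2's actual architecture.

```python
# Generic illustration of a memory-augmented Transformer read step: token
# states cross-attend to a learned memory bank and mix the result back in.
# A schematic under my own assumptions, not the LM2 paper's implementation.
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def memory_read(hidden, memory, w_q, w_k, w_v, gate=0.5):
    """hidden: (seq, d), memory: (slots, d); returns updated hidden states."""
    q = hidden @ w_q                        # queries from token states
    k = memory @ w_k                        # keys from memory slots
    v = memory @ w_v                        # values from memory slots
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    read = attn @ v                         # content read from memory
    return hidden + gate * read             # gated residual mix-in


rng = np.random.default_rng(0)
d, seq, slots = 64, 10, 16
hidden = rng.normal(size=(seq, d))
memory = rng.normal(size=(slots, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) * d ** -0.5 for _ in range(3))
print(memory_read(hidden, memory, w_q, w_k, w_v).shape)  # (10, 64)
```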
Automated Capability Discovery via Foundation Model Self-Exploration (arxiv.org)
Foundation models have become general-purpose assistants, exhibiting diverse capabilities across numerous domains through training on web-scale data.
Advancements in embedding-based retrieval at Pinterest Homefeed (medium.com)
At Pinterest Homefeed, embedding-based retrieval (a.k.a. Learned Retrieval) is a key candidate generator used to retrieve highly personalized, engaging, and diverse content that fulfills various user intents and enables multiple forms of actionability, such as Pin saving and shopping.
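For readers unfamiliar with the technique, here is a generic two-tower retrieval sketch (user embedding scored against item embeddings by dot product, top-k returned); the shapes and scoring are illustrative and not Pinterest's implementation.

```python
# Generic embedding-based retrieval sketch: score a user embedding against a
# corpus of item embeddings and return the top-k candidates. In production an
# approximate nearest-neighbor index would replace the brute-force scan.
import numpy as np

rng = np.random.default_rng(0)
num_pins, dim, k = 10_000, 128, 5

pin_embeddings = rng.normal(size=(num_pins, dim)).astype(np.float32)
user_embedding = rng.normal(size=(dim,)).astype(np.float32)

scores = pin_embeddings @ user_embedding          # dot-product relevance
top_k = np.argpartition(-scores, k)[:k]           # unordered top-k indices
top_k = top_k[np.argsort(-scores[top_k])]         # sort the k candidates
print(top_k, scores[top_k])
```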
Noether's Theorem and Machine Learning (lee-phillips.org)
Emmy Noether discovered her eponymous theorem in the context of Einstein’s General Theory of Relativity. A couple of decades after her death, it became the foundation for modern particle physics. In the final chapter of my book about Noether’s Theorem I survey some of its recent applications far afield from either of these physics contexts—including applications outside of the physical sciences.
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs (emergent-values.ai)
As AIs rapidly advance and become more agentic, the risk they pose is governed not only by their capabilities but increasingly by their propensities, including goals and values.
LLMs can teach themselves to better predict the future (arxiv.org)
We present an outcome-driven fine-tuning framework that enhances the forecasting capabilities of large language models (LLMs) without relying on human-curated reasoning samples.
The Curious Similarity Between LLMs and Quantum Mechanics (robleclerc.substack.com)
Six months ago, I did a deep dive to understand the transformer architecture and noticed something strange: the concepts behind these models mirror many features and phenomena from quantum mechanics.
DeepSeek-R1 Exhibits Deceptive Alignment: AI That Knows It's Unsafe (ycombinator.com)
I've been testing DeepSeek-R1 and have uncovered a significant AI safety failure: the model demonstrates deceptive alignment.
Open R1: Update #2 (huggingface.co)
We are now two weeks into the Open R1 project which aims to reconstruct the missing pieces of DeepSeek R1—specifically, the training pipeline and synthetic data.
Show HN: Sort lines semantically using llm-sort (github.com/vagos)
LLM plugin for semantically sorting lines. Ranking techniques are based on this paper.
Scaling up test-time compute with latent reasoning: A recurrent depth approach (arxiv.org)
We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space.
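To make the general idea concrete, here is a toy illustration of spending variable test-time compute by iterating a shared block on a latent state; this is a schematic of the concept only, under my own assumptions, and not the paper's architecture.

```python
# Toy illustration of recurrent-depth test-time compute: apply a shared block
# to a latent state for a configurable number of iterations before decoding.
# A schematic of the idea, not the paper's model.
import numpy as np


def shared_block(state, weights):
    """One latent-reasoning step: a toy residual update."""
    return state + np.tanh(state @ weights)


def reason_in_latent_space(state, weights, num_iterations):
    """More iterations = more test-time compute with the same parameters."""
    for _ in range(num_iterations):
        state = shared_block(state, weights)
    return state


rng = np.random.default_rng(0)
dim = 32
weights = rng.normal(size=(dim, dim)) * dim ** -0.5
latent = rng.normal(size=(dim,))

cheap = reason_in_latent_space(latent, weights, num_iterations=2)
expensive = reason_in_latent_space(latent, weights, num_iterations=32)
print(np.linalg.norm(cheap - expensive))
```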
Show HN: KTransformers:671B DeepSeek-R1 on a Single Machine-286 tokens/s Prefill (github.com/kvcache-ai)
KTransformers, pronounced as Quick Transformers, is designed to enhance your 🤗 Transformers experience with advanced kernel optimizations and placement/parallelism strategies.