Hacker News with Generative AI: Neural Networks

The Matrix Calculus You Need for Deep Learning (explained.ai)
Most of us last saw calculus in school, but derivatives are a critical part of machine learning, particularly deep neural networks, which are trained by optimizing a loss function.
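The article's core move is exactly this kind of matrix-calculus identity. As a minimal sketch (example of my own, not from the article): the gradient of a squared-error loss over a linear model, checked against finite differences.

```python
import numpy as np

# Gradient of L(w) = ||Xw - y||^2 via the matrix-calculus identity
# dL/dw = 2 X^T (Xw - y), verified numerically.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.normal(size=5)
w = rng.normal(size=3)

loss = lambda w: np.sum((X @ w - y) ** 2)
grad_analytic = 2 * X.T @ (X @ w - y)

# Central finite-difference approximation, one coordinate at a time.
eps = 1e-6
grad_fd = np.array([
    (loss(w + eps * np.eye(3)[i]) - loss(w - eps * np.eye(3)[i])) / (2 * eps)
    for i in range(3)
])
print(np.allclose(grad_analytic, grad_fd, atol=1e-4))  # True
```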
Tied Crosscoders: Tracing How Chat LLM Behavior Emerges from Base Model (lesswrong.com)
We are interested in model-diffing: finding what is new in the chat model compared to the base model. One way of doing this is to train a crosscoder, which here just means training a sparse autoencoder (SAE) on the concatenation of the activations at a given layer of the base and chat models. When training this crosscoder, we find some latents whose decoder vector mostly helps reconstruct the base-model activation and barely affects the reconstruction of the chat-model activation.
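The setup can be sketched structurally (names and shapes are illustrative, not the authors' code): concatenate the two models' activations, run them through an SAE, and diff per-latent decoder norms across the two halves.

```python
import numpy as np

# Crosscoder sketch: an SAE over concatenated base+chat activations.
rng = np.random.default_rng(0)
d, n_latents, n_tokens = 16, 64, 100
base_acts = rng.normal(size=(n_tokens, d))
chat_acts = rng.normal(size=(n_tokens, d))
x = np.concatenate([base_acts, chat_acts], axis=1)   # (n_tokens, 2d)

W_enc = rng.normal(size=(2 * d, n_latents)) * 0.1
W_dec = rng.normal(size=(n_latents, 2 * d)) * 0.1

z = np.maximum(x @ W_enc, 0.0)   # ReLU latents (sparsity via L1 in training)
x_hat = z @ W_dec                # reconstructs both halves at once

# Model-diffing diagnostic: for each latent, compare its decoder vector's
# norm on the base half vs. the chat half. A latent with base_norm >> chat_norm
# reconstructs base activations while barely touching the chat reconstruction.
base_norm = np.linalg.norm(W_dec[:, :d], axis=1)
chat_norm = np.linalg.norm(W_dec[:, d:], axis=1)
relative = base_norm / (base_norm + chat_norm)
print(relative.shape)  # one base-vs-chat score per latent
```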
CHM Releases AlexNet Source Code (computerhistory.org)
In partnership with Google, CHM has released the source code to AlexNet, the neural network that in 2012 kick-started today’s prevailing approach to AI. It is available as open source here.
Neurosymbolic Decision Trees (arxiv.org)
Neurosymbolic (NeSy) AI studies the integration of neural networks (NNs) and symbolic reasoning based on logic.
Deep Learning Is Not So Mysterious or Different (arxiv.org)
Deep neural networks are often seen as different from other model classes by defying conventional notions of generalization.
Arbitrary-Scale Super-Resolution with Neural Heat Fields (therasr.github.io)
Thera is the first arbitrary-scale super-resolution method with a built-in physical observation model.
Francis Crick – The Recent Excitement About Neural Networks [pdf] (1989) (wordpress.com)
Deriving Muon (jeremybernste.in)
We recently proposed Muon: a new neural net optimizer. Muon has garnered attention for its excellent practical performance: it was used to set NanoGPT speed records, leading to interest from the big labs.
Get Started with Neural Rendering Using Nvidia RTX Kit (Vulkan) (nvidia.com)
Neural rendering is the next era of computer graphics. By integrating neural networks into the rendering process, we can take dramatic leaps forward in performance, image quality, and interactivity to deliver new levels of immersion.
Analogies Between Startups and Neural Networks (mikealche.com)
I’m consulting for a company that had a crucial team member taking “extended vacations”. Ouch!
Computer Simulation of Neural Networks Using Spreadsheets (2018) (arxiv.org)
The article makes the case for developing methods of teaching computer simulation of neural networks in a spreadsheet environment.
Z-Ant: An Open-Source SDK for Neural Network Deployment on Microprocessors (github.com/ZIGTinyBook)
Zant (Zig-Ant) is an open-source SDK designed to simplify deploying Neural Networks (NN) on microprocessors.
AI Designed Computer Chips That the Human Mind Can't Understand (popularmechanics.com)
A new neural network process has designed wireless chips that can outperform existing ones.
An overview of gradient descent optimization algorithms (2016) (ruder.io)
Gradient descent is one of the most popular algorithms to perform optimization and by far the most common way to optimize neural networks.
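The baseline the post builds on is the vanilla update rule w ← w − η∇f(w). A minimal sketch on a convex quadratic, assuming nothing from the post itself:

```python
import numpy as np

# Vanilla (batch) gradient descent on f(w) = ||w - w_star||^2,
# whose gradient is 2 * (w - w_star).
w_star = np.array([3.0, -2.0])
grad = lambda w: 2 * (w - w_star)

w = np.zeros(2)
lr = 0.1
for _ in range(100):
    w -= lr * grad(w)   # the update rule: w <- w - lr * grad f(w)

print(np.allclose(w, w_star))  # True: the error shrinks by 0.8x per step
```

The variants the article surveys (momentum, Adagrad, RMSprop, Adam, ...) all modify how the step is scaled or accumulated, not this basic loop.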
An Introduction to Neural Ordinary Differential Equations [pdf] (diposit.ub.edu)
NNCP: Lossless Data Compression with Neural Networks (bellard.org)
NNCP is an experiment to build a practical lossless data compressor with neural networks.
The Structure of Neural Embeddings (seanpedersen.github.io)
A small collection of insights on the structure of embeddings (latent spaces) produced by deep neural networks.
A Gentle Introduction to Graph Neural Networks (2021) (distill.pub)
Neural networks have been adapted to leverage the structure and properties of graphs. We explore the components needed for building a graph neural network - and motivate the design choices behind them.
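The central component is a message-passing layer: aggregate neighbor features, then transform them. A minimal sketch of one such layer (my own illustrative example; real GNN layers add normalization schemes, edge features, and attention on top of this skeleton):

```python
import numpy as np

# One round of message passing on a tiny 4-node graph: each node averages
# features over itself and its neighbors, applies a linear map, then a ReLU.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # adjacency matrix
A_hat = A + np.eye(4)                       # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))    # degree normalization (mean pooling)

H = rng.normal(size=(4, 8))                 # one feature vector per node
W = rng.normal(size=(8, 8)) * 0.1           # learned layer weights

H_next = np.maximum(D_inv @ A_hat @ H @ W, 0.0)  # aggregate, transform, ReLU
print(H_next.shape)  # (4, 8): updated features, one per node
```

Stacking k such layers lets information propagate k hops across the graph.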
No More Adam: Learning Rate Scaling at Initialization Is All You Need (arxiv.org)
In this work, we question the necessity of adaptive gradient methods for training deep neural networks.
Sequence to sequence learning with neural networks: what a decade (youtube.com)
Neuroevolution of augmenting topologies (NEAT algorithm) (wikipedia.org)
NeuroEvolution of Augmenting Topologies (NEAT) is a genetic algorithm (GA) for the generation of evolving artificial neural networks (a neuroevolution technique) developed by Kenneth Stanley and Risto Miikkulainen in 2002 while at The University of Texas at Austin.
Inferring neural activity before plasticity for learning beyond backpropagation (nature.com)
For both humans and machines, the essence of learning is to pinpoint which components of the information-processing pipeline are responsible for an error in its output, a challenge known as ‘credit assignment’.
Quark: Real-Time, High-Resolution, and General Neural View Synthesis (quark-3d.github.io)
We present a novel neural algorithm for performing high-quality, high-resolution, real-time novel view synthesis.
Bayesian Neural Networks (cs.toronto.edu)
Bayesian inference allows us to learn a probability distribution over possible neural networks. We can approximately solve inference with a simple modification to standard neural network tools. The resulting algorithm mitigates overfitting, enables learning from small datasets, and tells us how uncertain our predictions are.
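The "distribution over networks" idea can be sketched concretely (illustrative example of mine, not the page's code): give each weight a mean and standard deviation, sample many networks, and read uncertainty off the spread of their predictions. In practice the mean and standard deviation are fit by approximate inference rather than assumed.

```python
import numpy as np

# A one-weight "network" y = x * w with an assumed Gaussian posterior over w.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 20).reshape(-1, 1)

w_mu, w_sigma = np.array([[1.5]]), np.array([[0.3]])  # assumed posterior

preds = []
for _ in range(200):
    w = w_mu + w_sigma * rng.normal(size=w_mu.shape)  # sample one network
    preds.append(x @ w)                               # its prediction
preds = np.stack(preds)

mean = preds.mean(axis=0)   # predictive mean
std = preds.std(axis=0)     # predictive uncertainty; here it grows with |x|
print(mean.shape, std.shape)
```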
Neural Optical Flow for PIV in Fluids (synthical.com)
Physics-informed Shadowgraph Network: End-to-end Density Field Reconstruction (arxiv.org)
This study presents a novel approach for quantitatively reconstructing density fields from shadowgraph images using physics-informed neural networks.
SharpNEAT – evolving NN topologies and weights with a genetic algorithm (sourceforge.io)
Neuroevolution of Augmenting Topologies (NEAT) is an evolutionary algorithm for evolving artificial neural networks.
It all started with a perceptron (medium.com)
In homage to John Hopfield and Geoffrey Hinton, Nobel Prize winners for their “fundamental discoveries and inventions that made machine learning and artificial neural networks possible,” I propose to explore the foundations of connectionist AI.
Implementing neural networks on the "3 cent" 8-bit microcontroller (wordpress.com)
Buoyed by the surprisingly good performance of neural networks with quantization-aware training on the CH32V003, I wondered how far this could be pushed. How much can we compress a neural network while still achieving good test accuracy on the MNIST dataset?
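The compression knob in play is weight quantization. A minimal sketch of symmetric uniform quantization to n bits (illustrative, not the post's exact code): fewer bits means fewer representable levels and a larger reconstruction error, which is why quantization-aware training is needed to keep accuracy.

```python
import numpy as np

def quantize(w, bits):
    """Symmetric uniform quantization of weights to signed n-bit integers."""
    levels = 2 ** (bits - 1) - 1              # e.g. 127 for 8-bit signed
    scale = np.abs(w).max() / levels
    q = np.round(w / scale).astype(np.int32)  # integers stored on-device
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)

q8, s8 = quantize(w, 8)
q2, s2 = quantize(w, 2)                       # extreme compression: 3 levels

err8 = np.abs(w - q8 * s8).max()
err2 = np.abs(w - q2 * s2).max()
print(err8 < err2)  # True: fewer bits, larger reconstruction error
```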