Hacker News with Generative AI: Generative Models

GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting (ucwxb.github.io)
Gaze estimation encounters generalization challenges when dealing with out-of-distribution data.

Computer Vision, Machine Learning, Generative Models, Research

4 points by Hard_Space 226 days ago | 0 comments

The geometry of data: the missing metric tensor and the Stein score [Part II] (christianperone.com)
I’m writing this second part of the series because I couldn’t find any formalisation of this metric tensor that naturally arises from the Stein score (especially when used with learned models), let alone blog posts or articles about it, which is surprising given its deep connection to score-based generative models, diffusion models and the geometry of the data manifold.

Machine Learning, Data Science, Generative Models, Geometry

64 points by perone 233 days ago | 7 comments

GenXD: Generating Any 3D and 4D Scenes (arxiv.org)
Recent developments in 2D visual generation have been remarkably successful. However, 3D and 4D generation remain challenging in real-world applications due to the lack of large-scale 4D data and effective model design.

3D Modeling, 4D Modeling, Computer Vision, Artificial Intelligence, Generative Models

9 points by sandwichsphinx 242 days ago | 0 comments

Lotus: Diffusion-Based Visual Foundation Model for High-Quality Dense Prediction (lotus3d.github.io)
We present Lotus, a diffusion-based visual foundation model for dense geometry prediction. With minimal training data, Lotus achieves SoTA performance in two key geometry perception tasks, i.e., zero-shot depth and normal estimation.

Computer Vision, Artificial Intelligence, Generative Models, Machine Learning

47 points by jasondavies 262 days ago | 2 comments

Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency (loopyavatar.github.io)

Artificial Intelligence, Computer Vision, Generative Models, Audio Processing

10 points by caohongyuan 302 days ago | 2 comments

Degas: Detailed Expressions on Full-Body Gaussian Avatars (initialneil.github.io)

Artificial Intelligence, Art, Generative Models

150 points by smusamashah 318 days ago | 13 comments

The Path to StyleGan2 – Implementing the Progressive Growing GAN (ym2132.github.io)

Deep Learning, Computer Vision, Generative Models, Artificial Intelligence

58 points by Two_hands 335 days ago | 13 comments

Stable Diffusion 3 Medium Released (huggingface.co)