Hacker News with Generative AI: Training Data

There's No Longer Any Doubt That Hollywood Writing Is Powering AI (theatlantic.com)
Dialogue from these movies and TV shows has been used by companies such as Apple and Anthropic to train AI systems.
SwiGLU activation function causes instability in FP8 LLM training (arxiv.org)
We train, for the first time, large language models using FP8 precision on datasets up to 2 trillion tokens -- a 20-fold increase over previous limits.
Leaked Docs Show Nvidia Scraping a Human Lifetime of Videos per Day to Train AI (404media.co)
Apple, Nvidia, Anthropic Used Swiped YouTube Videos to Train AI (proofnews.org)
YouTube creators surprised to find Apple and others trained AI on their videos (arstechnica.com)
Figma will use your content to train its AI (stackdiary.com)
OpenAI destroyed a trove of books used to train AI models (businessinsider.com)