Hacker News with Generative AI: AI Research

DeepRAG: Thinking to retrieval step by step for large language models (arxiv.org)
Large Language Models (LLMs) have shown remarkable potential in reasoning, yet they still suffer from severe factual hallucinations due to the timeliness, accuracy, and coverage of their parametric knowledge.
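
The excerpt covers only the motivation (parametric knowledge alone is stale or incomplete); the remedy the title points at is deciding, step by step, whether to fetch external evidence before answering. Below is a minimal, hypothetical sketch of such a retrieve-or-answer loop; every helper (llm_decompose, llm_knows, retrieve, llm_answer) is a placeholder assumption, not the paper's actual method.

```python
# Hypothetical sketch of a step-by-step retrieve-or-answer loop in the spirit of
# the DeepRAG title; all helpers below are stand-ins, not the paper's API.

def llm_decompose(question: str) -> list[str]:
    """Placeholder: ask an LLM to break a question into subqueries."""
    return [question]  # trivial stand-in

def llm_knows(subquery: str) -> bool:
    """Placeholder: ask the LLM whether its parametric knowledge suffices."""
    return False

def retrieve(subquery: str, k: int = 3) -> list[str]:
    """Placeholder: fetch top-k passages from an external index."""
    return [f"passage about {subquery}"]

def llm_answer(question: str, evidence: list[str]) -> str:
    """Placeholder: generate the final answer conditioned on gathered evidence."""
    return f"answer to {question!r} using {len(evidence)} passages"

def answer_step_by_step(question: str) -> str:
    evidence: list[str] = []
    for sub in llm_decompose(question):
        # At each step, decide whether to trust parametric knowledge or retrieve.
        if not llm_knows(sub):
            evidence.extend(retrieve(sub))
    return llm_answer(question, evidence)

print(answer_step_by_step("Who won the 2024 Nobel Prize in Physics?"))
```
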
Scaling the Tülu 3 post-training recipes to surpass the performance of DeepSeek V3 (allenai.org)
Following the success of our Tülu 3 release in November, we are thrilled to announce the launch of Tülu 3 405B, the first application of fully open post-training recipes to the largest open-weight models. With this release, we demonstrate the scalability and effectiveness of our post-training recipe applied at the 405B parameter scale.
OpenAI: DeepSeek "found some of the core ideas that we did on our way to o1" (twitter.com)
Bespoke-Stratos: The unreasonable effectiveness of reasoning distillation (bespokelabs.ai)
We trained Bespoke-Stratos-32B, our reasoning model distilled from DeepSeek-R1 using Berkeley NovaSky’s Sky-T1 data pipeline.
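
For readers unfamiliar with the recipe, reasoning distillation here means sampling long reasoning traces from a teacher model, filtering them, and fine-tuning a smaller student on the surviving (prompt, trace) pairs. The sketch below is a hedged illustration of that loop under stated assumptions; it is not the Sky-T1 pipeline itself, and every helper function is hypothetical.

```python
# Minimal sketch of reasoning distillation: collect teacher traces, then fine-tune
# a student on them. teacher_generate / answer_is_correct / finetune_student are
# assumptions standing in for teacher inference, answer checking, and SFT tooling.

def teacher_generate(prompt: str) -> str:
    """Placeholder: sample a chain-of-thought plus answer from the teacher model."""
    return f"<think>steps for {prompt}</think> final answer"

def answer_is_correct(prompt: str, completion: str) -> bool:
    """Placeholder: reject traces whose final answer fails verification."""
    return "final answer" in completion

def finetune_student(dataset: list[dict]) -> None:
    """Placeholder: run supervised fine-tuning of the student on the traces."""
    print(f"fine-tuning on {len(dataset)} distilled examples")

prompts = ["Prove that sqrt(2) is irrational.", "How many primes are below 100?"]

distilled = []
for p in prompts:
    trace = teacher_generate(p)
    if answer_is_correct(p, trace):          # keep only verified reasoning traces
        distilled.append({"prompt": p, "completion": trace})

finetune_student(distilled)
```
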
FrontierMath Was Funded by OpenAI (lesswrong.com)
FrontierMath was funded by OpenAI.
Explaining Large Language Models Decisions Using Shapley Values (arxiv.org)
The emergence of large language models (LLMs) has opened up exciting possibilities for simulating human behavior and cognitive processes, with potential applications in various domains, including marketing research and consumer behavior analysis.
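
The underlying idea is to treat pieces of the prompt as players in a cooperative game and attribute the model's output score to each piece via its Shapley value. Below is a toy, self-contained illustration: the exact-enumeration formula is the standard Shapley computation, but the components, weights, and additive value function are made-up stand-ins for a real model score.

```python
# Toy illustration of Shapley-value attribution over prompt components.
# The value function is a fabricated stand-in for a real model score
# (e.g. the probability the LLM gives a particular answer).
from itertools import combinations
from math import factorial

components = ["brand name", "price", "review snippet", "discount banner"]

def value(subset: frozenset) -> float:
    """Placeholder model score for a prompt built from this subset of components."""
    weights = {"brand name": 0.2, "price": 0.4, "review snippet": 0.3, "discount banner": 0.1}
    return sum(weights[c] for c in subset)

def shapley(i: str, players: list[str]) -> float:
    others = [p for p in players if p != i]
    n = len(players)
    total = 0.0
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            S = frozenset(S)
            # Standard Shapley weighting of the marginal contribution of i to S.
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (value(S | {i}) - value(S))
    return total

for c in components:
    print(f"{c}: {shapley(c, components):.3f}")
```

With an additive value function like this one, each component's Shapley value recovers its weight exactly, which makes the toy easy to sanity-check.
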
Ilya Sutskever NeurIPS talk [video] (youtube.com)
OpenAI’s cofounder and former chief scientist, Ilya Sutskever, made headlines earlier this year after he left the company to start his own AI lab, Safe Superintelligence Inc.
Ethical Challenges Related to the NeurIPS 2024 Best Paper Award (var-integrity-report.github.io)
To AI Research Community: This report is written to convey our serious concerns about the recent recipient of the Best Paper award at NeurIPS 2024, Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction (VAR). While we acknowledge that this NeurIPS paper is technically sound, we must emphasize that it involves serious misconduct by the first author (Keyu Tian), which fundamentally undermines the core values of integrity and trust upon which our academic community is built.
The Lost Reading Items of Ilya Sutskever's AI Reading List (tensorlabbet.com)
In this post: An attempt to reconstruct Ilya Sutskever's 2020 AI reading list (8 min read)
GPTs Are Maxed Out (thealgorithmicbridge.com)
March 2024. OpenAI CEO Sam Altman joins podcaster Lex Fridman for the second time since ChatGPT came out a year prior. The stakes are high and anticipation is tangible. GPT-5 appears to be around the corner. Altman, elusive as always, provides only one data point for us hungry spectators: The next-gen model (he doesn’t name it) will be better than GPT-4 to the same degree that GPT-4 was better than GPT-3.
Hunyuan-Large: An Open-Source MoE Model with 52B Activated Parameters (arxiv.org)
In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture-of-experts model, with a total of 389 billion parameters and 52 billion activated parameters, capable of handling up to 256K tokens.
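
The gap between 389B total and 52B activated parameters comes from the mixture-of-experts structure: every expert's weights exist in the model, but a router sends each token through only a few of them, so the parameters actually exercised per token are a fraction of the total. The toy NumPy layer below is a generic top-k MoE sketch under assumed, far smaller dimensions; it is not Hunyuan-Large's actual architecture.

```python
# Toy top-k mixture-of-experts layer in NumPy, illustrating total vs activated
# parameters: all experts exist, but each token only runs through the k experts
# chosen by the router. Dimensions and k are assumptions for demonstration.
import numpy as np

d_model, d_ff, n_experts, top_k = 64, 256, 8, 2
rng = np.random.default_rng(0)

router = rng.normal(size=(d_model, n_experts))                    # gating weights
experts = [(rng.normal(size=(d_model, d_ff)),                     # W_in per expert
            rng.normal(size=(d_ff, d_model))) for _ in range(n_experts)]  # W_out

def moe_layer(x: np.ndarray) -> np.ndarray:
    """x: (d_model,) token representation -> (d_model,) output."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]                           # indices of top-k experts
    gates = np.exp(logits[chosen]) / np.exp(logits[chosen]).sum()  # renormalised softmax
    out = np.zeros(d_model)
    for g, idx in zip(gates, chosen):
        w_in, w_out = experts[idx]
        out += g * (np.maximum(x @ w_in, 0.0) @ w_out)             # ReLU FFN expert
    return out

token = rng.normal(size=d_model)
print(moe_layer(token).shape)  # (64,) -- only 2 of the 8 experts did any work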
Meta FAIR refuses to cite a pre-existing open-source project – to claim novelty (granadacoders.es)
Large Enough (mistral.ai)
Chat with Meta Llama 3.1 405B (replicate.dev)
ChatGPT is better at generating code for problems written before 2021 (ieee.org)
OpenAI's GPT-5 Pushed Back to Late 2025, but Promises PhD-Level Abilities (mashable.com)
Getting 50% (SoTA) on ARC-AGI with GPT-4o (redwoodresearch.substack.com)
AI Appears to Rapidly Be Approaching Brick Wall Where It Can't Get Smarter (futurism.com)
GPT-4o's Memory Breakthrough – Needle in a Needlestack (llmonpy.ai)
Evidence that LLMs are reaching a point of diminishing returns (garymarcus.substack.com)