Hacker News with Generative AI: Reasoning

DeepThought-8B: A small, capable reasoning model (ruliad.co)
Today we're releasing DeepThought-8B, a small, capable AI reasoning model built on LLaMA-3.1 8B.
The Problem with Reasoners (notion.site)
LLaVA-O1: Let Vision Language Models Reason Step-by-Step (arxiv.org)
Large language models have demonstrated substantial advancements in reasoning capabilities, particularly through inference-time scaling, as illustrated by models such as OpenAI's o1. However, current Vision-Language Models (VLMs) often struggle to perform systematic and structured reasoning, especially when handling complex visual question-answering tasks.
Reasonable Person Principle (cs.cmu.edu)
Everyone will be reasonable. Everyone expects everyone else to be reasonable. No one is special. Do not be offended if someone suggests you are not being reasonable.
Detecting when LLMs are uncertain (thariq.io)
This post tries to explain the new reasoning techniques developed by XJDR in a new project called Entropix.
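For a rough sense of the idea, here is a minimal sketch of entropy-based uncertainty detection over next-token logits, assuming the entropy/varentropy framing the post attributes to Entropix; the thresholds, function names, and strategy labels below are illustrative, not Entropix's actual code.

```python
import numpy as np

def entropy_varentropy(logits: np.ndarray) -> tuple[float, float]:
    """Entropy and varentropy of the next-token distribution given raw logits."""
    logits = logits - logits.max()                  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax
    logp = np.log(probs + 1e-12)
    entropy = float(-(probs * logp).sum())                      # H = -sum p log p
    varentropy = float((probs * (-logp - entropy) ** 2).sum())  # Var of -log p
    return entropy, varentropy

def choose_strategy(logits: np.ndarray,
                    ent_thresh: float = 2.0,
                    vent_thresh: float = 2.0) -> str:
    """Illustrative decision rule: act greedily when confident, otherwise
    sample, branch, or prompt the model to keep thinking."""
    ent, vent = entropy_varentropy(logits)
    if ent < ent_thresh and vent < vent_thresh:
        return "greedy"           # confident: take the argmax token
    if ent < ent_thresh:
        return "branch"           # a few sharp competing options: explore them
    if vent < vent_thresh:
        return "insert_think"     # uniformly unsure: nudge the model to reason more
    return "sample_high_temp"     # broadly uncertain: sample with higher temperature
```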
Use Prolog to improve LLM's reasoning (shchegrikovich.substack.com)
On the one hand, LLMs show unprecedented reasoning capabilities; on the other, their reasoning is far from ideal.
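The title's suggestion is to have the LLM translate a question into Prolog and delegate the actual deduction to a solver. A minimal sketch of that pattern, assuming SWI-Prolog (`swipl`) is installed; the hand-written facts and rules stand in for what the LLM would generate.

```python
import subprocess
import textwrap

# Prolog program of the kind an LLM might emit for the question
# "Who are Tom's grandchildren?"; the solver, not the LLM, does the deduction.
program = textwrap.dedent("""
    parent(tom, bob).
    parent(bob, ann).
    parent(bob, pat).
    grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
""")

with open("kb.pl", "w") as f:
    f.write(program)

# Enumerate all solutions; requires `swipl` on PATH.
query = "forall(grandparent(tom, G), (write(G), nl))"
result = subprocess.run(
    ["swipl", "-q", "-s", "kb.pl", "-g", query, "-t", "halt"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.split())   # ['ann', 'pat']
```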
Google is working on AI software with human-like reasoning ability (msn.com)
LLMs still can't reason like humans (freethink.com)
Imagine what would happen if you attempted the following experiment: First, place a washed, fresh tomato and an equally clean carrot on top of a normal kitchen plate. With one hand behind your back, flip the non-stick plate upside-down, inspecting the underside of the plate for marks. Now, slowly turn the plate right-side up and count the number of vegetables remaining on top. How many are on the plate?
Deductive Verification for Chain-of-Thought Reasoning in LLMs (arxiv.org)
Large Language Models (LLMs) significantly benefit from Chain-of-Thought (CoT) prompting in performing various reasoning tasks.
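As a rough illustration of the verification idea in the title (not necessarily the paper's exact procedure), here is a sketch in which a hypothetical `llm(prompt) -> str` callable first produces numbered reasoning steps and is then asked to check each step against only the question and the steps verified so far.

```python
from typing import Callable, List

def solve_with_cot(llm: Callable[[str], str], question: str) -> List[str]:
    """Elicit a chain of thought as numbered steps and split them for checking."""
    answer = llm(
        f"Question: {question}\n"
        "Let's think step by step. Put each numbered step on its own line."
    )
    return [line.strip() for line in answer.splitlines() if line.strip()]

def verify_step(llm: Callable[[str], str], premises: List[str], step: str) -> bool:
    """Deductive check: does this step follow from the given premises alone?"""
    context = "\n".join(premises)
    verdict = llm(
        f"Premises:\n{context}\n\n"
        f"Proposed step: {step}\n"
        "Does the proposed step follow logically from the premises alone? "
        "Answer YES or NO."
    )
    return verdict.strip().upper().startswith("YES")

def verified_chain(llm: Callable[[str], str], question: str) -> bool:
    steps = solve_with_cot(llm, question)
    premises = [question]
    for step in steps:
        if not verify_step(llm, premises, step):
            return False          # reject chains containing an unsupported step
        premises.append(step)     # verified steps become premises for later ones
    return True
```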
Inductive or deductive? Rethinking the fundamental reasoning abilities of LLMs (arxiv.org)
OpenAI working on reasoning tech under code name 'Strawberry' (reuters.com)
Claude uses a hidden chain of thought to plan artifact use (ycombinator.com)
Q*: Improving Multi-Step Reasoning for LLMs with Deliberative Planning (arxiv.org)
RAR-B: Reasoning as Retrieval Benchmark (arxiv.org)
Simple tasks showing reasoning breakdown in state-of-the-art LLMs (arxiv.org)
Can large language models reason? (arnaldur.be)
GitHub: Awesome-reasoning, a curated list of datasets for reasoning AIs (github.com/neurallambda)