Hacker News with Generative AI: AI Agents

Evaluating AI Agents with Azure AI Evaluation (microsoft.com)
Artificial intelligence agents are rapidly evolving from simple chatbots to agentic AI systems capable of planning, tool use, and autonomous decision-making. With this increased sophistication comes a pressing need for equally sophisticated evaluation methods. How do we measure if an AI agent is doing the right thing, using its tools correctly, and staying on task?

Artificial Intelligence, Evaluation, AI Agents, Azure

5 points by airylizard 60 days ago | 0 comments

Claude 4 (anthropic.com)
Today, we’re introducing the next generation of Claude models: Claude Opus 4 and Claude Sonnet 4, setting new standards for coding, advanced reasoning, and AI agents.

Generative AI, AI Agents, Coding, Reasoning

2013 points by meetpateltech 62 days ago | 1168 comments

Show HN: Convert existing agent projects from different frameworks to A2A servers (github.com/NapthaAI)
AutoA2A is a CLI tool that scaffolds the boilerplate required to run AI agents as servers compatible with Google's A2A protocol. It supports various agent frameworks — requiring minimal changes to your code.

AI Agents, Serverless, Open Source, Google

11 points by richardblythman 70 days ago | 0 comments

Which AI Agent is your favorite? (ycombinator.com)
I've created a directory for AI agents, and I'm curious about which ones are the most popular and frequently used. Have you started using AI agents to assist with your daily tasks? Which AI agent is your favorite?

Artificial Intelligence, AI Agents, User Experience, Productivity

4 points by jeyzolo 71 days ago | 7 comments

Athena – An open source production-ready general AI agent (github.com/Athena-AI-Lab)
Athena is a production-ready general AI agent built to do, not just think. It bridges insight with execution, helping you move from idea to results effortlessly.

Open Source, Artificial Intelligence, AI Agents, Production-Ready

13 points by yvbbrjdr 96 days ago | 2 comments

Ask HN: How to teach agentic AI? Please share your experience (ycombinator.com)
I started teaching agentic AI at our cooperative (Berlin). It is a one-day intensive workshop where I: 1. Introduce IntelliJ IDEA IDE and tools 2. Showcase my Unix-omnipotent educational open source AI agent called Claudine (which can basically do what Claude Code can do, but I already provided it in October 2024) 3. Go through a glossary of AI-related terms 4. Explore demo code snippets gradually introducing more and more abstract concepts 5.

Artificial Intelligence, Education, Open Source, Programming, AI Agents

9 points by morisil 125 days ago | 9 comments

Show HN: Hyperbrowser MCP Server – Connect AI agents to the web through browsers (github.com/hyperbrowserai)
This is Hyperbrowser's Model Context Protocol (MCP) Server. It provides various tools to scrape, extract structured data, and crawl webpages. It also provides easy access to general purpose browser agents like OpenAI's CUA, Anthropic's Claude Computer Use, and Browser Use.

Web Scraping, AI Agents, Browsers, Open Source

63 points by shrisukhani 125 days ago | 26 comments

Arcade raises $12M from Perplexity co-founder's fund to make AI agents less bad (techcrunch.com)
Arcade, an AI agent infrastructure startup founded by former Okta exec Alex Salazar and former Redis engineer Sam Partee, has raised $12 million from Laude Ventures.

Artificial Intelligence, Funding, Startups, AI Agents

13 points by rdegges 127 days ago | 3 comments

Show HN: Computer – Build Your Manus AI Agent with an OSS macOS Sandbox (github.com/trycua)
Create and run high-performance macOS and Linux VMs on Apple Silicon, with built-in support for AI agents.

MacOS, AI Agents, Open Source, Virtual Machines, Apple Silicon

11 points by f-trycua 129 days ago | 6 comments

Launching the Naptha Stack v1 Beta (naptha.ai)
We're thrilled to announce the launch of the Naptha AI Stack v1 Beta, our groundbreaking open-source platform for autonomous AI agents that enables unprecedented collaboration across different AI agent architectures.

Artificial Intelligence, Open Source, Beta Releases, AI Agents

30 points by MacsHeadroom 161 days ago | 6 comments

How we improved GPT-4o multi-step function calling success rate by 4x (xpander.ai)
AI Agents will usher in a new era of human-computer interfaces, automation, personal AI assistants and AI employees. Function calling enables AI Agents to execute complex, multi-step workflows with precision, and plays a big part in fulfilling the promise of AI Agents.

Generative AI, Artificial Intelligence, AI Agents

13 points by jimminyx 237 days ago | 6 comments

Agent Graph System boosts GPT-4o multi-step function calling success rate by 4x (xpander.ai)
AI Agents will usher in a new era of human-computer interfaces, automation, personal AI assistants and AI employees. Function calling enables AI Agents to execute complex, multi-step workflows with precision, and plays a big part in fulfilling the promise of AI Agents.

Generative AI, AI Agents, Automation

10 points by jimminyx 244 days ago | 2 comments

SuperPrompt (github.com/NeoVertex1)
This is a project that I decided to open source because I think it might help others understand AI agents. This prompt took me many months and is still in a phase of forever beta. You will want to use this prompt with Claude (as custom instructions in the project knowledge), but it also works with other LLMs.

Open Source, AI Agents, Prompt Engineering

11 points by MrBuddyCasino 257 days ago | 4 comments

Best Practices for Prompt Writing – DigitalOcean Documentation (digitalocean.com)
DigitalOcean GenAI Platform is an offering for building AI agents on GPU-powered infrastructure. Use the platform to create AI applications with foundation models and agent routes, knowledge bases, and Retrieval-Augmented Generation (RAG) pipelines.

Prompt Engineering, Generative AI, AI Agents, DigitalOcean, Cloud Computing

4 points by afsanehhh 267 days ago | 0 comments

A review of OpenAI o1 and how we evaluate coding agents (cognition.ai)
Devin is an AI software engineering agent that autonomously completes coding tasks. We’ve been testing OpenAI’s new o1-mini and o1-preview models with Devin for the past several weeks and are excited to share some early results. To contextualize these results we will also discuss our evaluation methodology and our technical approach to building reliable coding agents.

Generative AI, OpenAI, Programming, Software Engineering, AI Agents

34 points by thomasahle 314 days ago | 2 comments

Show HN: Outlit – AI Agents to Empower Non-Legal (outlit.ai)

Artificial Intelligence, Legal Technology, Software, AI Agents

14 points by leopaz 358 days ago | 5 comments

AI Agents That Matter (aisnakeoil.com)

Artificial Intelligence, AI Agents

35 points by randomwalker 385 days ago | 10 comments

SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code (openreview.net)

Generative AI, 3D Graphics, Blender, AI Agents

11 points by dmezzetti 396 days ago | 1 comment

Show HN: Use functional tokens for AI agents to simplify app workflows (nexa4ai.com)

AI Agents, Functional Programming, Workflow Optimization, Software

80 points by alanzhuly 411 days ago | 10 comments