Hacker News with Generative AI: Monitoring

Show HN: rtcollector - A modular, RedisTimeSeries-native observability agent (github.com/xe-nvdk)
rtcollector is a lightweight, plugin-based agent for collecting system and application metrics, and pushing them to RedisTimeSeries.

Open Source, Monitoring, Redis

18 points by ignaciovdk 57 days ago | 0 comments

Monitoring Node.js: Key Metrics You Should Track (last9.io)
Understand which metrics matter in Node.js applications, why they’re important, and how to track them effectively in production.

Node.js, Monitoring, Performance, Metrics, Software Development

45 points by unripe_syntax 60 days ago | 2 comments

Grafana Assistant, a context-aware LLM agent built into Grafana Cloud (grafana.com)
Today, as part of the GrafanaCON 2025 keynote in Seattle, we previewed Grafana Assistant, our new LLM-powered agent in Grafana Cloud that helps you learn and solve problems in Grafana easier than ever.

Grafana, Cloud Computing, Monitoring, Artificial Intelligence

6 points by vidamon 64 days ago | 0 comments

Pgwatch: PostgreSQL Monitoring Solution (github.com/cybertec-postgresql)
🔬PGWATCH: PostgreSQL metrics monitor/dashboard

Databases, PostgreSQL, Monitoring, Open Source, Tools

8 points by klaussilveira 66 days ago | 0 comments

Monitoring my Minecraft server with OpenTelemetry and Prometheus (dash0.com)
One of the secret pleasures of life is to be paid for things you would do for free. On a completely unrelated note, this blog post documents my time figuring out how to monitor a Minecraft server with OpenTelemetry, Prometheus and Dash0.

Minecraft, OpenTelemetry, Prometheus, Monitoring, Gaming

94 points by mmanciop 71 days ago | 59 comments

Cloudflare's approach to global service health metrics and software releases (cloudflare.com)

Cloud Computing, Software Engineering, Monitoring, Release Management

7 points by jgrahamc 72 days ago | 0 comments

Show HN: Neurox – GPU Observability for AI Infra (github.com/neuroxhq)
This Helm chart is designed to install Neurox. Neurox helps monitor your AI workloads running on your Kubernetes GPU cluster. Purpose-built dashboards and reports combine metrics and live Kubernetes runtime state data to help admins, developers, researchers, and finance auditors surface relevant insights. Visit our main website for information.

AI, Machine Learning, Kubernetes, Monitoring, Observability

25 points by leeab 80 days ago | 22 comments

Show HN: Raindrop – Sentry for AI Products (raindrop.ai)
Raindrop sends you alerts when your AI misbehaves and links straight to the events, so you can dig into the conversations or traces, understand the root cause, and fix it—fast.

AI, Monitoring, Debugging, Software, Tools

11 points by alexisgauba 81 days ago | 2 comments

I gave up on self-hosted Sentry (2024) (bugsink.com)
In the early 2010s, I was a big fan of Sentry. It was a great tool for tracking errors in web applications. At the time, I was making software for law firms, so sending error reports to a third-party service was out of the question; I needed to host it myself. So I did.

Software, Self-hosting, Debugging, Monitoring, Sentry

186 points by roywashere 92 days ago | 150 comments

Show HN: Coroot – eBPF-based, open source observability with actionable insights (github.com/coroot)
Coroot is an open-source APM & Observability tool, a DataDog and NewRelic alternative. Metrics, logs, traces, continuous profiling, and SLO-based alerting, supercharged with predefined dashboards and inspections.

Open Source, Observability, Monitoring, Tools, Software

162 points by openWrangler 101 days ago | 31 comments

Engineering a Trace Details Page That Handles a Million Spans (signoz.io)

Engineering, Monitoring, Performance Optimization, Distributed Systems

10 points by vikrantgupta25 106 days ago | 1 comment

Show HN: Dish: A lightweight HTTP and TCP socket monitoring tool written in Go (github.com/thevxn)
Tiny one-shot monitoring service; remote configuration of independent 'dish network' (via -source ${REMOTE_JSON_API_URL} flag); fast concurrent testing, low overall execution time, 10-sec timeout per socket by default; 0 dependencies.

Networking, Tools, Monitoring

39 points by tackx 113 days ago | 1 comment

JEP Draft: JFR Method Timing and Tracing (openjdk.org)
Extend JDK Flight Recorder (JFR) to support bytecode-based method timing and tracing for quick and easy use.

Java, Performance, Debugging, Monitoring, OpenJDK

7 points by mfiguiere 116 days ago | 0 comments

Gravity CI (gravity.ci)
Gravity monitors build artifact sizes to prevent accidental increases – right in your CI pipeline.

CI/CD, Software Development, DevOps, Build Tools, Monitoring

11 points by marcoow 122 days ago | 3 comments

Some notes on Grafana Loki's new "structured metadata" (utoronto.ca)
Grafana Loki somewhat bills itself as "Prometheus for logs", and so it's unsurprising that it started with a data model much like Prometheus.

Monitoring, Grafana, Logging, Prometheus

117 points by valyala 125 days ago | 44 comments

WinRing0: Why Windows is flagging your monitoring and fan control apps as a threat (theverge.com)
On Tuesday morning, some PC gamers woke up to discover their computers were seemingly under threat.

Windows, Security, Gaming, Software, Monitoring

6 points by zdw 126 days ago | 1 comment

Show HN: Pulse – Maintain healthy OpenSearch and Elasticsearch clusters (pulse.support)
Pulse puts you in control of your search cluster monitoring and maintenance. Get more clarity, better performance, and lower costs.

Open Source, Elasticsearch, Monitoring, Performance Optimization

19 points by zevir 149 days ago | 12 comments

Show HN: Subtrace – Wireshark for Docker Containers (github.com/subtrace)
Subtrace is Chrome DevTools for your backend. It tracks the API requests coming in and going out of your servers so that you can solve problems in production quickly.

Docker, Debugging, Monitoring, Development Tools

369 points by adtac 150 days ago | 73 comments

Ask HN: What do you run instead of Datadog? (ycombinator.com)
Datadog has turned into an ever loving piece of shit. I am sick of their sales team grabbing us by the ankles and "Accidentally" charging for services we don't use. Now, this morning they changed something with their AWS integration that is causing 10X the API calls against our accounts (and thus, 10X guardduty costs on our end analyzing those API requests).

Software, Monitoring, Cloud Computing, SaaS, Alternatives

13 points by theorlandog 157 days ago | 21 comments

Grafana: Why observability needs FinOps, and vice versa (grafana.com)
Observability tools have changed the way we monitor infrastructure and applications, as teams get complete visibility into performance across complex, multi-cloud environments.

Observability, FinOps, Infrastructure, Monitoring, Cloud

62 points by StratusBen 162 days ago | 43 comments

Datadog Dollars: Why Your Monitoring Bill Is Breaking the Bank (oneuptime.com)
Have you ever opened your monitoring bill and felt your heart skip a beat? You're not alone. In the world of digital infrastructure, many companies are experiencing sticker shock when they see their Datadog invoices. Let's unpack why your monitoring bill might be breaking the bank and explore how you can rein it back in.

Monitoring, Costs, Datadog, Cloud Infrastructure, SaaS

4 points by ndhandala 165 days ago | 1 comment

How to monitor and debug Terraform (Terragrunt/OpenTofu) using OpenTelemetry (dash0.com)
This blog post provides a comprehensive guide on monitoring and debugging Terragrunt, Terraform/OpenTofu using OpenTelemetry.

Terraform, OpenTelemetry, Debugging, Monitoring

10 points by marbir1981 165 days ago | 1 comment

Perforator – cluster-wide continuous profiling tool for large data centers (github.com/yandex)
Perforator is a production-ready, open-source Continuous Profiling app that can collect CPU profiles from your production without affecting its performance, made by Yandex and inspired by Google-Wide Profiling.

Open Source, Monitoring, Profiling, Data Centers, Performance

26 points by simonpure 168 days ago | 1 comment

Top OpenTelemetry Collector Components (dash0.com)
Understanding and managing the performance of your applications can be a significant challenge – but it doesn’t have to be. This is where OpenTelemetry comes in, offering a powerful framework for collecting and exporting telemetry data (traces, metrics, and logs) from your applications.

OpenTelemetry, Monitoring, Observability, Performance, Software

11 points by de107549 173 days ago | 0 comments

Kubestatus: Open source tool to easily add status page to your K8s cluster (github.com/soub4i)
Kubestatus is a free and open-source tool to easily add a status page to your Kubernetes cluster that currently displays the status (operational, degraded or DOWN) of services. It is written in Go and uses the Kubernetes API to fetch information about the clusters and resources. Check the kubestatus-operand image.

Kubernetes, Open Source, Monitoring, DevOps, Tools

10 points by soubai 175 days ago | 3 comments

Slum: The Shadow Library Uptime Monitor (open-slum.org)

Software, Open Source, Libraries, Monitoring

107 points by rand0mx1 187 days ago | 6 comments

Datadog acquires Quickwit (datadoghq.com)
To help our customers meet these requirements without sacrificing visibility or introducing multiple logging tools, we are pleased to announce that Quickwit—a popular open source distributed search engine—is joining Datadog.

Acquisitions, Open Source, Search Engines, Monitoring, Software

17 points by louis-paul 190 days ago | 2 comments

Kubernetes horizontal pod autoscaling powered by an OpenTelemetry-native tool (dash0.com)
This blog post shows how to use Dash0 as the source of truth to automatically scale applications running on Kubernetes.

Kubernetes, OpenTelemetry, Cloud Computing, DevOps, Monitoring

33 points by mmanciop 191 days ago | 6 comments

37signals Dev – Monitoring 10 Petabytes of Data in Pure Storage (37signals.com)
How we use Prometheus to have metrics and alerts for Pure Storage.

Monitoring, Data Storage, Cloud Infrastructure, Prometheus

15 points by doppp 196 days ago | 0 comments

Using AZs can eat up your budget – From Prometheus to VictoriaMetrics (prezi.com)
By 2024, Prezi’s monitoring system, built around Prometheus, was becoming outdated. It was already 5+ years old, running on a deprecated internal platform and accumulating a significant amount of costs every month.

Monitoring, Cost Optimization, Open Source, Cloud Computing, Prometheus

71 points by shscs911 205 days ago | 58 comments