Hacker News with Generative AI: Monitoring

Show HN: Pulse – Maintain healthy OpenSearch and Elasticsearch clusters (pulse.support)
Pulse puts you in control of your search cluster monitoring and maintenance. Get more clarity, better performance, and lower costs.
Show HN: Subtrace – Wireshark for Docker Containers (github.com/subtrace)
Subtrace is Chrome DevTools for your backend. It tracks the API requests coming in and going out of your servers so that you can solve problems in production quickly.
Ask HN: What do you run instead of Datadog? (ycombinator.com)
Datadog has turned into an ever loving piece of shit. I am sick of their sales team grabbing us by the ankles and "Accidentally" charging for services we don't use. Now, this morning they changed something with their AWS integration that is causing 10X the API calls against our accounts (and thus, 10X guardduty costs on our end analyzing those API requests).
Grafana: Why observability needs FinOps, and vice versa (grafana.com)
Observability tools have changed the way we monitor infrastructure and applications, as teams get complete visibility into performance across complex, multi-cloud environments.
Datadog Dollars: Why Your Monitoring Bill Is Breaking the Bank (oneuptime.com)
Have you ever opened your monitoring bill and felt your heart skip a beat? You're not alone. In the world of digital infrastructure, many companies are experiencing sticker shock when they see their Datadog invoices. Let's unpack why your monitoring bill might be breaking the bank and explore how you can rein it back in.
How to monitor and debug Terraform (Terragrunt/OpenTofu) using OpenTelemetry (dash0.com)
This blog post provides a comprehensive guide on monitoring and debugging Terragrunt, Terraform/OpenTofu using OpenTelemetry.
Perforator – cluster-wide continuous profiling tool for large data centers (github.com/yandex)
Perforator is a production-ready, open-source Continuous Profiling app that can collect CPU profiles from your production without affecting its performance, made by Yandex and inspired by Google-Wide Profiling.
Top OpenTelemetry Collector Components (dash0.com)
Understanding and managing the performance of your applications can be a significant challenge – but it doesn’t have to be. This is where OpenTelemetry comes in, offering a powerful framework for collecting and exporting telemetry data (traces, metrics, and logs) from your applications.
Kubestatus: Open source tool to easily add status page to your K8s cluster (github.com/soub4i)
Kubestatus is a free and open-source tool to easily add a status page to your Kubernetes cluster. It currently displays the status (operational, degraded or DOWN) of services. It is written in Go and uses the Kubernetes API to fetch information about the clusters and resources (check the kubestatus-operand image).
Slum: The Shadow Library Uptime Monitor (open-slum.org)
Datadog acquires Quickwit (datadoghq.com)
To help our customers meet these requirements without sacrificing visibility or introducing multiple logging tools, we are pleased to announce that Quickwit—a popular open source distributed search engine—is joining Datadog.
Kubernetes horizontal pod autoscaling powered by an OpenTelemetry-native tool (dash0.com)
This blog post shows how to use Dash0 as the source of truth to automatically scale applications running on Kubernetes.
37signals Dev – Monitoring 10 Petabytes of Data in Pure Storage (37signals.com)
How we use Prometheus to have metrics and alerts for Pure Storage.
Using AZs can eat up your budget – From Prometheus to VictoriaMetrics (prezi.com)
By 2024, Prezi’s monitoring system, built around Prometheus, was becoming outdated. It was already 5+ years old, running on a deprecated internal platform and accumulating a significant amount of costs every month.
Reads Causing Writes in Postgres (jesipow.com)
It is good practice to regularly inspect the statements running in the hot path of your Postgres instance. One way to do this is to examine the pg_stat_statements view, which shows various statistics about the SQL statements executed by the Postgres server.
Monitoring Bluesky firehose to quickly detect spam accounts as they spawn (bsky.app)
Ask HN: Product to improve monitoring and documentation: a good idea? (ycombinator.com)
Hi HN, I have an idea for a product, but I want to test the waters before building anything.
Prometheus 3.0 (prometheus.io)
Following the recent release of Prometheus 3.0 beta at PromCon in Berlin, the Prometheus Team is excited to announce the immediate availability of Prometheus Version 3.0!
The Enshittification of Netdata (djsumdog.com)
Ask HN: Light self-hosted logs/metrics/traces dashboard? (ycombinator.com)
I’m aware of enterprise-scale observability tools like the ELK stack and Datadog.
Mongoose IM 6.3.0 – Erlang Solutions robust, scalable and efficient XMPP server (erlang-solutions.com)
MongooseIM is a scalable, efficient, high-performance instant messaging server using the proven, open, and extensible XMPP protocol. With each new version, we introduce new features and improvements. For example, version 6.2.0 introduced our new CETS in-memory storage, making setup and autoscaling in cloud environments easier than before (see the blog post for details). The latest release 6.3.0 is no exception. The main highlight is the complete instrumentation rework, allowing seamless integration with modern monitoring solutions like Prometheus.
Show HN: Open-source Kibana alternative for logs and traces in ClickHouse (github.com/hyperdxio)
HyperDX helps engineers quickly figure out why production is broken by making it easy to search & visualize logs and traces on top of any ClickHouse cluster (imagine Kibana, for ClickHouse).
Monitor WiFi with Raspberry Pi (netbeez.net)
Since our company’s inception, we have recognized the immense potential of the Raspberry Pi as a cost-effective and versatile platform for distributed network monitoring.
Netdata 2.0 Released (github.com/netdata)
Netdata 2.0 has arrived!
Ask HN: Hosting on Digital Ocean, any advice for monitoring and deployments? (ycombinator.com)
I'm moving over from Lambda to Digital Ocean since I found it much easier to test locally and iterate.
An OpenTelemetry Python Example – Building a Tesla Monitor (greptime.com)
This article demonstrates how to use OpenTelemetry to monitor the charging and driving data of a Tesla Model 3.
AI agents invade observability: snake oil or the future of SRE? (monitoring2.substack.com)
This newsletter was started 5 years ago to explore emerging observability and monitoring startups. In the most boring sense, these companies take operational data and create insights from that data for humans. This always involves a lot of dashboards, alerts, API integrations, and a large monthly bill.
A Beginner's Guide to the OpenTelemetry Collector (betterstack.com)
The first step towards observability with OpenTelemetry is instrumenting your application to enable it to generate essential telemetry signals such as traces, logs, and metrics.
A Look at the New Prometheus 3.0 UI (promlabs.com)
The Prometheus Team has just announced Prometheus version 3.0 at PromCon, with an official blog post detailing all the exciting new changes and features. A very visible highlight of Prometheus 3.0 is its new web UI that is enabled by default.
The Rise of Open Source Time Series Databases (victoriametrics.com)
Time series databases allow you to store and query metrics efficiently. For example, if you want to forecast load on your servers, or identify intermittent faults with your production services, time series databases can help. Besides infrastructure monitoring, time series databases have been invaluable in finance, IoT applications, manufacturing, and more.