Alert aggregation is an Enterprise Edition feature. This article explains how it works in Nightingale, covering its principles and data flow, to help users understand the alerting process and troubleshoot alert issues.

Feature Overview

Alert aggregation is an important noise-reduction feature. By merging similar or related alert events into a single notification, it effectively reduces the number of alert notifications, avoids the disruption caused by alert storms, and lets incident responders focus on the problems that truly matter.

Use Cases

1. Bulk Server Failures

When a data-center network fails or servers are rebooted in bulk, hundreds or thousands of alerts can fire at once. Aggregating by dimensions such as cluster or alert severity merges the alerts from the same batch into a single notification.

2. Microservice Chain Alerts

In a microservice architecture, the failure of an upstream service can trigger chained alerts in many downstream services. Aggregating by business group or service name helps quickly locate the root cause.

3. Bulk Alerts on the Same Type of Metric

For example, when multiple servers run out of disk space simultaneously, aggregating by alert rule or metric name lets you see the entire list of affected servers in a single notification.

Configuration Steps

Step 1: Enter the Notification Rule Configuration

  1. Log in to the platform and go to Alert Management > Notification Rules.
  2. Click New Notification Rule or edit an existing rule.

Step 2: Enable the Aggregation Feature

On the notification rule configuration page, find the Aggregation Configuration section:

  1. Turn on the “Enable Aggregation” switch.

Step 3: Configure Aggregation Dimensions

The aggregation feature supports two configuration modes:

1. Default Aggregation

This is the simplest configuration. The system automatically aggregates alerts by preset dimensions:

  • Alert Rule: alerts triggered by the same alert rule are aggregated.
  • Alert Severity: alerts with the same severity are aggregated.

Aggregation Time Window: alerts received within this window are aggregated together (30-120 seconds is recommended).
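
To make the idea of dimensions plus a time window concrete, here is a minimal Python sketch. It is not Nightingale's implementation: the AlertEvent shape, the aggregate_default function, and the choice to combine rule and severity into one grouping key and to open a window at the first event of each group are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertEvent:
    # Hypothetical event shape for illustration; not Nightingale's schema.
    rule_name: str
    severity: int
    host: str
    timestamp: float  # seconds since epoch


def aggregate_default(events, window_seconds=60):
    """Group events by (rule, severity), then split each group into windows.

    Assumption: a window opens when the first event of a group arrives and
    lasts window_seconds; an event arriving after that opens a new window.
    Returns {(rule_name, severity): [batch, batch, ...]} where each batch
    is the list of events that would share one notification.
    """
    groups = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        batches = groups[(event.rule_name, event.severity)]
        # Compare against the first event of the most recent batch.
        if batches and event.timestamp - batches[-1][0].timestamp < window_seconds:
            batches[-1].append(event)
        else:
            batches.append([event])
    return dict(groups)
```

With a 60-second window, two events from the same rule and severity that arrive 20 seconds apart land in one batch, while a third event arriving 90 seconds after the first starts a new batch and therefore a separate notification.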

2. Fine-Grained Aggregation

If the default dimensions don’t meet your needs, use fine-grained aggregation for more precise control:

  1. Click the Add Fine-Grained Aggregation button.

  2. Set filter conditions:

    • Label Filter: for example, service=nginx aggregates only alerts from the nginx service.
    • Property Filter: for example, choose a specific business group or data source.
  3. Set aggregation dimensions:

    • By Label: choose label keys to aggregate by, such as host or region.
    • By Property: choose system properties as aggregation dimensions.
  4. Set the aggregation time window: alerts that arrive within this window will be aggregated.
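
The filter-then-group logic in the steps above can be sketched as follows. This is an illustrative Python sketch only: the event dictionaries, the matches_filter and group_by_labels functions, and the exact-match filter semantics are assumptions, not the product's API. The time window is left out to keep the focus on filtering and dimensions.

```python
from collections import defaultdict


def matches_filter(labels, label_filter):
    """Return True when every filter key/value pair is present in labels."""
    return all(labels.get(key) == value for key, value in label_filter.items())


def group_by_labels(events, label_filter, group_keys):
    """Keep events that pass the label filter, then bucket them by group_keys.

    events is a list of dicts with a "labels" dict (hypothetical shape).
    Returns {tuple_of_label_values: [event, ...]}.
    """
    buckets = defaultdict(list)
    for event in events:
        labels = event["labels"]
        if not matches_filter(labels, label_filter):
            continue  # this event is outside the fine-grained rule
        # A missing label is treated as an empty value so it still groups.
        key = tuple(labels.get(k, "") for k in group_keys)
        buckets[key].append(event)
    return dict(buckets)
```

For example, calling group_by_labels(events, {"service": "nginx"}, ["region"]) would ignore every alert that does not carry service=nginx and produce one bucket per region for the rest.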

Step 4: Save and Apply

  1. After completing the configuration, click Save at the bottom of the page.
  2. The configuration takes effect immediately, and newly produced alerts will be aggregated by the configured rules.

FAQ

Q1: Why are alerts still sent individually even though aggregation is configured?

Possible reasons:

  • Alerts don’t meet the aggregation conditions (different dimensions or beyond the time window).
  • Aggregation is not properly enabled.
  • The alert severity is not within the configured scope.
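
When troubleshooting, it can help to compare two events that you expected to be merged against the list above. The sketch below is a hypothetical checklist written in Python, not a Nightingale tool: the event dictionaries, field names, and the why_not_aggregated function are assumptions for illustration.

```python
def why_not_aggregated(a, b, group_keys, window_seconds,
                       enabled=True, severities=None):
    """Return the reasons two events would not share one notification.

    a and b are dicts with "timestamp", "severity", and the fields named
    in group_keys (hypothetical shape). An empty list means none of the
    checked conditions rules out aggregation.
    """
    reasons = []
    if not enabled:
        reasons.append("aggregation is disabled")
    for key in group_keys:
        if a.get(key) != b.get(key):
            reasons.append(
                f"dimension '{key}' differs: {a.get(key)!r} vs {b.get(key)!r}"
            )
    if abs(a["timestamp"] - b["timestamp"]) >= window_seconds:
        reasons.append("events are further apart than the time window")
    if severities is not None:
        for event in (a, b):
            if event["severity"] not in severities:
                reasons.append(
                    f"severity {event['severity']} is outside the configured scope"
                )
    return reasons
```

Each returned reason maps to one bullet in the list above, so an empty result suggests looking elsewhere, for example at whether the notification rule itself matched the events.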

Q2: Will aggregation introduce notification delays?

Answer: Yes. Notifications are delayed by the configured aggregation time window.

Q3: How do I view the details of aggregated alerts?

Answer: The aggregated notification includes a link to the alert events list. Click it to view the full list of alert events.
