Nightingale ES Log Alert Rules

ES log alerting allows you to detect abnormal logs through query analysis and trigger alerts accordingly.
First, select the ES data source, then configure query conditions and alert rules. Below is a detailed explanation of each numbered function.
1 Select Index
Supports multiple configuration methods:
- Specify a single index: gb searches all documents in the gb index
- Specify multiple indices: gb,us searches all documents in both the gb and us indices
- Specify index prefixes: g*,u* searches all documents in any index starting with g or u
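To make the three forms concrete, here is a minimal Python sketch of how a comma-separated index expression selects index names. The function `matching_indices` and the sample index list are hypothetical, and `fnmatchcase` only approximates Elasticsearch's own wildcard handling (it also gives `?` and `[...]` special meaning).

```python
from fnmatch import fnmatchcase


def matching_indices(index_expr, existing):
    """Return the names in `existing` selected by a comma-separated expression."""
    # Split "g*,u*" into ["g*", "u*"], ignoring empty parts.
    patterns = [p.strip() for p in index_expr.split(",") if p.strip()]
    # Keep every index name that matches at least one pattern.
    return [name for name in existing if any(fnmatchcase(name, p) for p in patterns)]


# Hypothetical index names, for illustration only.
indices = ["gb", "us", "gb-2024", "uk", "fr"]

print(matching_indices("gb", indices))
print(matching_indices("gb,us", indices))
print(matching_indices("g*,u*", indices))
```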
2 Set Filter Conditions
Currently supports query string syntax
You can query by field name:
- status:active - Query records where status field contains “active”
- title:(quick OR brown) - Query records where title field contains “quick” or “brown”
- author:"John Smith" - Query records where author field contains the exact phrase “John Smith”
Supports ? and * wildcards:
- qu?ck - ? matches any single character
- bro* - * matches zero or more characters
Use ~ operator for fuzzy matching:
- quikc~ - Matches words similar to “quick”
- "fox quick"~5 - Phrase query where words can be up to 5 positions apart
Supports numeric and date ranges:
- count:[1 TO 5] - Closed interval, includes 1 and 5
- date:[2022-01-01 TO 2022-12-31]
- age:>=10 - Greater than or equal to 10
Can use boolean operators like AND, OR, NOT:
- quick AND brown - Contains both words
- quick OR brown - Contains either word
- quick NOT fox - Contains quick but not fox
For detailed syntax, refer to the ES documentation
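As a rough illustration of how a filter expression like the ones above could be sent to Elasticsearch, the sketch below wraps it in a `query_string` clause. The helper name `query_string_filter` is made up for this example, and this is not necessarily how Nightingale builds its requests.

```python
import json


def query_string_filter(expression, default_operator="OR"):
    """Wrap a query string expression in an Elasticsearch query_string clause."""
    return {
        "query_string": {
            "query": expression,
            # How terms without an explicit operator are combined.
            "default_operator": default_operator,
        }
    }


clause = query_string_filter('level:ERROR AND author:"John Smith"')
print(json.dumps(clause, indent=2))
```

Note that phrases must use straight double quotes; typographic quotes are not treated as phrase delimiters.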
3 Set Date Field
Select the date field in the logs. It is used as the time basis when querying a log time range.
4 Set Log Query Time Range
If set to 5 minutes, it will query logs from the past 5 minutes when performing alert queries
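One way to picture how the date field and the time range combine with the filter is a `bool` query whose `filter` list holds both clauses. The sketch below assumes a date field named `@timestamp` and uses hypothetical helper names; the request Nightingale actually sends may differ.

```python
def time_range_filter(date_field, minutes):
    """Build a range clause covering the last `minutes` minutes."""
    # "now-5m" is Elasticsearch date math for "five minutes ago".
    return {"range": {date_field: {"gte": f"now-{minutes}m", "lt": "now"}}}


def build_query(expression, date_field, minutes):
    """Combine the filter expression with the time window."""
    return {
        "bool": {
            "filter": [
                {"query_string": {"query": expression}},
                time_range_filter(date_field, minutes),
            ]
        }
    }


# "@timestamp" is an assumed field name; use the date field selected in step 3.
query = build_query("level:ERROR", "@timestamp", 5)
print(query)
```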
5 Value Extraction
Applies a statistical function to the matched logs, such as count, sum, avg, min, or max.
6 Group By
Group logs by one or more fields. For example, grouping by the host field with a count statistic produces one count per host.
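In Elasticsearch terms, value extraction plus Group By roughly corresponds to a `terms` aggregation with an optional metric sub-aggregation. The following is a sketch under that assumption; `grouped_stat` and the aggregation names `groups` and `A` are illustrative choices, not Nightingale internals.

```python
METRICS = {"sum", "avg", "min", "max"}


def grouped_stat(group_field, func, value_field=None, size=100):
    """Build a request body that groups by one field and computes one statistic."""
    terms = {"terms": {"field": group_field, "size": size}}
    if func == "count":
        # Each terms bucket already carries doc_count, so no sub-aggregation is needed.
        return {"size": 0, "aggs": {"groups": terms}}
    if func not in METRICS or value_field is None:
        raise ValueError(f"unsupported statistic or missing field: {func}")
    # Metric sub-aggregation computed once per group.
    terms["aggs"] = {"A": {func: {"field": value_field}}}
    return {"size": 0, "aggs": {"groups": terms}}


print(grouped_stat("host", "count"))
print(grouped_stat("path", "avg", "response_time"))
```

Setting the top-level `size` to 0 skips returning individual documents, since only the aggregated values are needed.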
7 Alert Conditions
The extracted statistical values are assigned to variables A, B, C, and so on, which alert conditions reference with a $ prefix. For example, with a count statistic in A, $A > 10 triggers an alert when the log count exceeds 10.
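The idea of checking a variable against a threshold can be sketched as below. This is a simplified stand-in that only handles a single comparison of the form `$NAME op number`; the function `evaluate` is hypothetical and does not reflect Nightingale's actual expression engine.

```python
import operator
import re

OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}

# Matches e.g. "$A > 10": variable name, comparison operator, numeric threshold.
PATTERN = re.compile(r"\s*\$(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*")


def evaluate(condition, values):
    """Return True if the condition holds for the given variable values."""
    match = PATTERN.fullmatch(condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    name, op, threshold = match.groups()
    return OPS[op](values[name], float(threshold))


print(evaluate("$A > 10", {"A": 12}))
print(evaluate("$A > 10", {"A": 10}))
```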
8 Advanced Configuration
In some scenarios logs arrive late (e.g., with a 3-minute delay), so querying the last 3 minutes may return no data. In the advanced settings you can set a query delay, such as 180s, which shifts both the start and the end of the query window 180s into the past.
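The effect of the delay on the query window can be worked through with a small sketch. The function `query_window` is hypothetical; it only illustrates the arithmetic of shifting both ends of the window by the delay.

```python
from datetime import datetime, timedelta, timezone


def query_window(now, range_seconds, delay_seconds=0):
    """Return (start, end) of the query window, shifted into the past by the delay."""
    end = now - timedelta(seconds=delay_seconds)
    start = end - timedelta(seconds=range_seconds)
    return start, end


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# A 5-minute range with a 180s delay covers 11:52:00 to 11:57:00 instead of 11:55:00 to 12:00:00.
start, end = query_window(now, range_seconds=300, delay_seconds=180)
print(start.time(), end.time())
```

The window length stays the same; only its position moves, so late-arriving logs have time to be indexed before they are queried.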
Usage Examples
Example 1: Error Log Monitoring
- Index: app-logs-*
- Query condition: level:ERROR AND service:payment
- Time range: 5 minutes
- Value extraction: count()
- Alert condition: $A > 10
Description: Monitor whether payment service error logs exceed 10 entries within 5 minutes.
Example 2: API Response Time Monitoring
- Index: nginx-access-*
- Query condition: path:\/api\/v1\/order* AND response_time:>500
- Time range: 10 minutes
- Value extraction: avg(response_time)
- Group By: path
- Alert condition: $A > 1000
Description: Monitor whether the average response time of order-related APIs exceeds 1 second.
Example 3: Error Status Code Monitoring
- Index: nginx-*
- Query condition: status:[500 TO 599]
- Time range: 15 minutes
- Value extraction: count()
- Group By: host, status
- Alert condition: $A > 50
Description: Group 5xx errors by host and status code, and alert if any single status code on any host occurs more than 50 times.
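To show what grouping by two fields means for the alert in Example 3, the sketch below flattens a nested `terms` aggregation response into (host, status, count) rows and applies the threshold. The aggregation names `by_host` and `by_status` and all sample values are invented for illustration; Nightingale may structure its query and results differently.

```python
def flatten(response):
    """Turn nested host/status buckets into (host, status, count) rows."""
    rows = []
    for host in response["aggregations"]["by_host"]["buckets"]:
        for status in host["by_status"]["buckets"]:
            rows.append((host["key"], status["key"], status["doc_count"]))
    return rows


# Made-up response in the shape Elasticsearch returns for nested terms aggregations.
sample = {
    "aggregations": {
        "by_host": {
            "buckets": [
                {"key": "web-1", "doc_count": 80, "by_status": {"buckets": [
                    {"key": 500, "doc_count": 72},
                    {"key": 502, "doc_count": 8},
                ]}},
                {"key": "web-2", "doc_count": 12, "by_status": {"buckets": [
                    {"key": 503, "doc_count": 12},
                ]}},
            ]
        }
    }
}

# Each (host, status) pair is checked separately against the threshold of 50.
firing = [row for row in flatten(sample) if row[2] > 50]
print(firing)  # [('web-1', 500, 72)]
```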
Example 4: Business Exception Keyword Monitoring
- Index: business-logs-*
- Query condition: message:("timeout" OR "connection refused" OR "out of memory")
- Time range: 30 minutes
- Value extraction: count()
- Alert condition: $A > 5
Description: Monitor the number of logs containing specific error keywords.