ES log alerting allows you to detect abnormal logs through query analysis and trigger alerts accordingly.
First, select the ES data source, then configure query conditions and alert rules. Below is a detailed explanation of each numbered function.
1 Select Index
Supports multiple configuration methods:
- Specify a single index: gb - Searches all documents in the gb index
- Specify multiple indices: gb,us - Searches all documents in both the gb and us indices
- Specify an index prefix: g*,u* - Searches all documents in any index whose name starts with g or u
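The three forms of index expression can be illustrated with a small Python sketch. It is only a local approximation: Elasticsearch resolves these expressions itself, `resolve_indices` and the list of index names are hypothetical, and Python's `fnmatch` also gives `?` and `[...]` a special meaning that index expressions may not share.

```python
from fnmatch import fnmatchcase


def resolve_indices(expression: str, existing: list[str]) -> list[str]:
    """Return the names in `existing` selected by a comma-separated index expression."""
    # Split "g*,u*" into ["g*", "u*"], ignoring empty parts.
    patterns = [p.strip() for p in expression.split(",") if p.strip()]
    # Keep every index that matches at least one pattern, in its original order.
    return [name for name in existing if any(fnmatchcase(name, p) for p in patterns)]


# Hypothetical index names, for illustration only.
existing = ["gb", "us", "gb-2024", "uk", "fr"]

print(resolve_indices("gb", existing))     # ['gb']
print(resolve_indices("gb,us", existing))  # ['gb', 'us']
print(resolve_indices("g*,u*", existing))  # ['gb', 'us', 'gb-2024', 'uk']
```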
2 Set Filter Conditions
Currently supports query string syntax
You can query by field name:
- status:active - Query records where status field contains “active”
- title:(quick OR brown) - Query records where title field contains “quick” or “brown”
- author:"John Smith" - Query records where author field contains the exact phrase “John Smith”
Supports ? and * wildcards:
- qu?ck - ? matches any single character
- bro* - * matches zero or more characters
Use ~ operator for fuzzy matching:
- quikc~ - Matches words similar to “quick”
- "fox quick"~5 - Phrase query where the words can be up to 5 positions apart
Supports numeric and date ranges:
- count:[1 TO 5] - Closed interval, includes 1 and 5
- date:[2022-01-01 TO 2022-12-31]
- age:>=10 - Greater than or equal to 10
Conditions can be combined with the boolean operators AND, OR, and NOT:
- quick AND brown - Contains both words
- quick OR brown - Contains either word
- quick NOT fox - Contains quick but not fox
For the full syntax, refer to the query string syntax section of the Elasticsearch documentation.
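When a filter is assembled from literal values, characters that query string syntax treats as operators have to be escaped with a backslash. The sketch below shows one way to do that and to wrap the result in a `query_string` clause. The helper names are hypothetical, the reserved-character list is taken from the Elasticsearch documentation, and whitespace inside a value is not handled.

```python
import re

# Characters with special meaning in query string syntax. "<" and ">" cannot
# be escaped at all, so this sketch simply drops them from the value.
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')


def escape_term(value: str) -> str:
    """Backslash-escape reserved characters so `value` is matched literally."""
    value = value.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", value)


def query_string_clause(filter_text: str) -> dict:
    """Wrap a filter expression in a `query_string` query clause."""
    return {"query_string": {"query": filter_text}}


# A literal path prefix followed by an unescaped * wildcard.
filter_text = "path:" + escape_term("/api/v1/order") + "* AND status:active"
print(filter_text)  # path:\/api\/v1\/order* AND status:active
print(query_string_clause(filter_text))
```

Escaping only the literal part and appending `*` afterwards keeps the wildcard active while the slashes are matched as plain characters.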
3 Set Date Field
Click to select the date field in the logs. Queries use this field to restrict results to the configured time range.
4 Set Log Query Time Range
This is the window of logs that each alert query covers. For example, if it is set to 5 minutes, every alert query searches the logs from the past 5 minutes.
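The date field and the time range together amount to a range filter on that field. A minimal sketch, assuming the date field is named `@timestamp` and that the window includes its start and excludes its end (the product's exact boundaries are not documented here); `time_range_filter` is a hypothetical helper:

```python
from datetime import datetime, timedelta, timezone


def time_range_filter(date_field: str, window: timedelta, now: datetime) -> dict:
    """Build an Elasticsearch `range` clause covering the last `window` before `now`."""
    start = now - window
    return {
        "range": {
            date_field: {
                "gte": start.isoformat(),  # inclusive start (assumption)
                "lt": now.isoformat(),     # exclusive end (assumption)
            }
        }
    }


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(time_range_filter("@timestamp", timedelta(minutes=5), now))
```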
5 Value Extraction
Applies a statistical function to the matched logs, such as count, sum, avg, min, or max.
6 Group By
Groups logs by one or more fields before the statistic is computed. For example, grouping by the host field with count produces a separate log count for each host.
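The effect of value extraction combined with Group By can be shown on a few in-memory records. This is a local illustration only, not how the query is executed: the sample logs are made up and `extract_values` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# The statistical functions named in this section, mapped to Python callables.
FUNCS = {"count": len, "sum": sum, "avg": mean, "min": min, "max": max}


def extract_values(logs, func, field=None, group_by=None):
    """Apply `func` to `field`, producing one result per distinct `group_by` value."""
    groups = defaultdict(list)
    for log in logs:
        key = log.get(group_by) if group_by else None
        # count only needs the number of records; other functions need the field value.
        groups[key].append(1 if func == "count" else log[field])
    return {key: FUNCS[func](values) for key, values in groups.items()}


# Made-up records, for illustration only.
logs = [
    {"host": "web-1", "response_time": 120},
    {"host": "web-1", "response_time": 480},
    {"host": "web-2", "response_time": 900},
]

print(extract_values(logs, "count", group_by="host"))  # {'web-1': 2, 'web-2': 1}
print(extract_values(logs, "max", "response_time", "host"))  # {'web-1': 480, 'web-2': 900}
```

Without a Group By field, all records fall into a single group and one value is produced.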
7 Alert Conditions
Each extracted value is assigned to a variable (A, B, C, and so on), and the alert condition is an expression over these variables. For example, $A > 10 triggers an alert when the log count exceeds 10.
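How a condition such as $A > 10 is evaluated can be sketched as follows. The sketch handles only a single comparison between one variable and a number; the expression grammar the product actually accepts may be richer, and `condition_met` is a hypothetical helper.

```python
import operator
import re

_OPS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
}
# "$<name> <operator> <number>"; two-character operators are tried first.
_COND = re.compile(r"^\s*\$(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def condition_met(condition: str, values: dict) -> bool:
    """Evaluate a condition like "$A > 10" against extracted values such as {"A": 12}."""
    match = _COND.match(condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    name, op, threshold = match.groups()
    return _OPS[op](values[name], float(threshold))


print(condition_met("$A > 10", {"A": 12}))  # True
print(condition_met("$A > 10", {"A": 10}))  # False
```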
8 Advanced Configuration
When logs arrive late (for example, 3 minutes late), a query over the most recent 3 minutes may return no data. In the advanced settings, you can set a query delay, such as 180s, which shifts both the start and the end of the query window back by 180 seconds.
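The shift can be written out as simple arithmetic on the window boundaries. A minimal sketch, where `query_window` is a hypothetical helper and the timestamps are chosen only for illustration:

```python
from datetime import datetime, timedelta, timezone


def query_window(now: datetime, window: timedelta, delay: timedelta = timedelta(0)):
    """Return (start, end) of the query window, shifted back by `delay`."""
    end = now - delay     # the end moves back by the delay
    start = end - window  # the window length stays the same
    return start, end


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# 5-minute window without delay: 11:55 to 12:00
print(query_window(now, timedelta(minutes=5)))
# 5-minute window with a 180s delay: 11:52 to 11:57
print(query_window(now, timedelta(minutes=5), timedelta(seconds=180)))
```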
Usage Examples
Example 1: Error Log Monitoring
- Index: app-logs-*
- Query condition: level:ERROR AND service:payment
- Time range: 5 minutes
- Value extraction: count()
- Alert condition: $A > 10
Description: Monitor if payment service error logs exceed 10 entries within 5 minutes
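For orientation, the settings of Example 1 roughly correspond to an Elasticsearch search request like the one sketched below. This is an assumption about the shape of such a request, not the request the product actually sends: the date field name `@timestamp` and both function names are hypothetical, and the response is hand-written in the Elasticsearch 7+ format, where `hits.total` is an object.

```python
import json


def example_1_request(date_field: str = "@timestamp") -> dict:
    """Search body counting payment-service ERROR logs from the last 5 minutes."""
    return {
        "size": 0,                 # only the count is needed, not the documents
        "track_total_hits": True,  # ask for an exact total
        "query": {
            "bool": {
                "filter": [
                    {"query_string": {"query": "level:ERROR AND service:payment"}},
                    {"range": {date_field: {"gte": "now-5m", "lt": "now"}}},
                ]
            }
        },
    }


def should_alert(response: dict, threshold: int = 10) -> bool:
    """Apply the condition $A > threshold, with A taken from the total hit count."""
    return response["hits"]["total"]["value"] > threshold


print(json.dumps(example_1_request(), indent=2))

# Hand-written response fragment, for illustration only.
print(should_alert({"hits": {"total": {"value": 14, "relation": "eq"}}}))  # True
```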
Example 2: API Response Time Monitoring
- Index: nginx-access-*
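- Query condition: path:\/api\/v1\/order* AND response_time:>500 (the path is left unquoted because wildcards are not expanded inside quotes, and each / is escaped because it is a reserved character)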
- Time range: 10 minutes
- Value extraction: avg(response_time)
- Group By: path
- Alert condition: $A > 1000
Description: Monitor if order-related API average response time exceeds 1 second
Example 3: Error Status Code Monitoring
- Index: nginx-*
- Query condition: status:[500 TO 599]
- Time range: 15 minutes
- Value extraction: count()
- Group By: host, status
- Alert condition: $A > 50
Description: Group 5xx errors by host and status code, alert if any host’s specific status code occurs more than 50 times
Example 4: Business Exception Keyword Monitoring
- Index: business-logs-*
- Query condition: message:("timeout" OR "connection refused" OR "out of memory")
- Time range: 30 minutes
- Value extraction: count()
- Alert condition: $A > 5
Description: Monitor log count containing specific error keywords