Oracle is a widely used relational database whose stability is critical. This article explains how to monitor Oracle.
There are multiple ways to collect Oracle monitoring data — you can use tools such as Categraf, Cprobe, and others. The principle is essentially the same: connect to the Oracle instance and execute commands to retrieve monitoring data. This article uses Categraf v0.4.15 or later as an example to explain how to configure Oracle monitoring data collection.
Overview of the Oracle Plugin Configuration
All Categraf plugin configurations are located by default under the conf directory. The Oracle plugin configuration directory is conf/input.oracle, which contains two configuration files:
oracle.toml: The main configuration file for the Oracle plugin. It defines connection and authentication information for different Oracle instances. Categraf can connect to multiple Oracle instances simultaneously — multiple instances can be configured via separate[[instances]]sections.metric.toml: The Oracle plugin collects monitoring data by executing various SQL queries. Some SQL queries are generic and should be executed against all Oracle instances; others are specific to a particular instance. Generic SQL is configured inmetric.toml, while instance-specific SQL is configured inoracle.toml.
oracle.toml
A sample oracle.toml configuration:
# 默认的采集频率,下面配置的所有的 oracle 的实例默认都会使用这个采集频率
# 如果某个实例需要不同的采集频率,可以在实例配置中使用 interval_times 来调整
# 各个实例的最终采集频率 = interval * interval_times
# 如果这里的 interval 也没有配置,那就使用 Categraf 全局配置中的 interval(默认是 15 秒)
# 单位是秒,所以默认是 15 秒采集一次监控数据
interval = 15
# 这是第一个 Oracle 实例的配置,使用一大块 [[instances]] 来配置
# [[instances]] 使用双中括号包裹,双中括号在 TOML 中表示数组
# 即可以配置多个 [[instances]] 区块,也就是可以配置多个 Oracle 实例
# 建议不要使用 sys 用户来进行采集,因为在 oracle 12c 及之后版本,go-ora 的 ping 方法在判断 oracle up 状态时,不准确
[[instances]]
address = "10.1.2.3:1521/orcl"
username = "monitor"
password = "123456"
is_sys_dba = false
is_sys_oper = false
disable_connection_pool = false
max_open_connections = 5
# 这个实例最终的采集频率是 interval * interval_times
interval_times = 1
# 这里可以为当前实例附加一些维度标签,这些维度标签最终会附加到当前实例的监控数据上面
labels = { region="cloud" }
# instances 下面的 metrics 配置段,表示当前实例需要采集的监控数据
# 注意,这个 metrics 配置段是当前实例特有的,其他实例不会去执行这些 SQL
[[instances.metrics]]
mesurement = "sessions"
label_fields = [ "status", "type" ]
metric_fields = [ "value" ]
timeout = "3s"
request = '''
SELECT status, type, COUNT(*) as value FROM v$session GROUP BY status, type
'''
[[instances]]
address = "192.168.10.10:1521/orcl"
username = "monitor"
password = "123456"
is_sys_dba = false
is_sys_oper = false
disable_connection_pool = false
max_open_connections = 5
labels = { region="local" }
# 第二个实例下面没有对应的 instances.metrics 配置段,说明没有独属于这个实例的采集 SQL
# 即:第二个实例仅会执行 metric.toml 中配置的通用 SQL
The principle of Oracle monitoring data collection: periodically execute SQL, then convert the returned results into Prometheus time-series data format and send them to the server. The result of a SQL query is a two-dimensional table with multiple rows and columns, so we need to tell Categraf via configuration which columns should be treated as labels and which columns should be treated as metric values.
- mesurement: A custom prefix for the metric name
- request: The SQL statement used to query monitoring data
- label_fields: Among the columns returned by the SQL, which columns should be used as labels of the time-series data
- metric_fields: Among the columns returned by the SQL, which columns should be used as values of the time-series data
- field_to_append: Whether to append the content of a specific column to the metric name as a suffix
- timeout: The timeout for SQL execution
- ignore_zero_result: Whether to ignore rows whose values are 0 in the query results. If set to false and no data is returned, an error log line will be printed; if set to true, no error log will be printed when no data is returned
metric.toml
This file contains commonly used SQL statements for collecting Oracle monitoring data. Categraf periodically executes these SQL statements to collect monitoring data from all Oracle instances. The specific meaning and purpose of these SQL statements may be more familiar to Oracle DBAs. We welcome Oracle DBAs to write articles explaining the meaning and purpose of these SQL statements — feel free to submit a PR with the article link to this documentation page so that more people can benefit.