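EMQX collection plugin
Use categraf v0.3.164 or later.
The emqx plugin collects metrics through the EMQX dashboard API.
- Configuration and notes
The value of address is attached to every metric as the value of the cluster label.
With the configuration below, every generated metric carries the label cluster="http://192.168.11.11:18083,http://192.168.11.12:18083".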
# # Collection interval; the interval set here overrides the interval in config.toml
# interval = 60
[[instances]]
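## EMQX dashboard address (ip:port). For a single cluster, separate multiple addresses with commas; one available address is picked at random for each request
## For multiple clusters, use a separate instance block per cluster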
address="http://192.168.11.11:18083,http://192.168.11.12:18083"
## EMQX version; v3 and v5 are supported, corresponding to EMQX 3.X and 5.X
version="v3"
## The API requires authentication; set the username and password
username="admin"
password="public"
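The comments above describe two behaviours: the whole address string becomes the cluster label, and one available address is picked at random for each request. The following is a minimal Python sketch of that idea, not the plugin's actual code. The helper names `parse_address` and `pick_endpoint` and the `is_available` callback are hypothetical and exist only for illustration.

```python
import random


def parse_address(address):
    """Return (cluster_label, endpoints) for one [[instances]] block.

    The cluster label is the address string exactly as configured;
    the endpoints are its comma-separated parts.
    """
    endpoints = [part.strip() for part in address.split(",") if part.strip()]
    return address, endpoints


def pick_endpoint(endpoints, is_available=lambda url: True):
    """Pick one endpoint at random among those reported as available.

    is_available is a hypothetical health-check callback; by default
    every endpoint is treated as available.
    """
    candidates = [url for url in endpoints if is_available(url)]
    if not candidates:
        raise RuntimeError("no available EMQX dashboard endpoint")
    return random.choice(candidates)


address = "http://192.168.11.11:18083,http://192.168.11.12:18083"
cluster, endpoints = parse_address(address)
chosen = pick_endpoint(endpoints)
print("cluster label:", cluster)
print("request goes to:", chosen)
```

Because the label is the full string, changing the order or the number of addresses changes the cluster label value, which is worth keeping in mind when writing queries against it.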
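The API responses below are shown in v3 format; the v5 responses differ in some places.
- The plugin first calls /api/${version}/nodes to get each node's version, running status, connection count, and similar information. The API returns: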
{
"code": 0,
"data": [
{
"connections": 0,
"load1": "0.33",
"load15": "0.37",
"load5": "0.36",
"max_fds": 1048576,
"memory_total": "191.77M",
"memory_used": "129.38M",
"name": "emqx@node2.emqx.io",
"node": "emqx@node2.emqx.io",
"node_status": "Running",
"otp_release": "R21/10.3.5.6",
"process_available": 2097152,
"process_used": 509,
"uptime": "3 hours, 12 minutes, 25 seconds",
"version": "3.2.6"
},
{
"connections": 0,
"load1": "0.33",
"load15": "0.37",
"load5": "0.36",
"max_fds": 1048576,
"memory_total": "190.37M",
"memory_used": "129.82M",
"name": "emqx@node1.emqx.io",
"node": "emqx@node1.emqx.io",
"node_status": "Running",
"otp_release": "R21/10.3.5.6",
"process_available": 2097152,
"process_used": 508,
"uptime": "3 hours, 12 minutes, 45 seconds",
"version": "3.2.6"
}
]
}
The corresponding generated metrics are:
emqx_node_connections{node="xxx",cluster="yyy"} 10
emqx_node_status{node="xxx",cluster="yyy"} 1 # 1 means running
emqx_node_info{node="xxx",cluster="yyy", name="xxxx", version="3.2.6", otp_release="R21/10.3.5.6"} 1
emqx_node_load1{node="xxx",cluster="yyy"} 0.10
emqx_node_load5{node="xxx",cluster="yyy"} 0.37
emqx_node_load15{node="xxx",cluster="yyy"} 0.32
emqx_node_max_fds{node="xxx",cluster="yyy"} 1048576
emqx_node_memory_total_bytes{node="xxx",cluster="yyy"} 194938
emqx_node_memory_used_bytes{node="xxx",cluster="yyy"} 132956
emqx_node_process_available{node="xxx",cluster="yyy"} 2097152
emqx_node_process_used{node="xxx",cluster="yyy"} 508
emqx_node_uptime_seconds{node="xxx",cluster="yyy"} 1234567 # in seconds
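Two additional metrics are generated, reporting how many nodes in the cluster are running and how many have stopped. The cluster label is the address from the configuration.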
emqx_cluster_node_running{cluster="yyy"} 2
emqx_cluster_node_stopped{cluster="yyy"} 0
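Several fields in the /nodes response are strings ("0.33", "129.38M", "3 hours, 12 minutes, 25 seconds"), while the generated metrics are numeric. The sketch below shows one way such a conversion could look. It is an illustration under stated assumptions, not the plugin's implementation: the function names are hypothetical, the K/M/G suffixes are assumed to be binary multiples, and a status other than "Running" is assumed to map to 0.

```python
import re

# Assumption: size suffixes are binary multiples; the plugin may differ.
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def memory_bytes(text):
    """Convert a string such as '129.38M' to a byte count."""
    match = re.fullmatch(r"([\d.]+)([KMG]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised memory value: {text!r}")
    return int(float(match.group(1)) * UNITS[match.group(2)])


def uptime_seconds(text):
    """Convert '3 hours, 12 minutes, 25 seconds' to seconds."""
    parts = re.findall(r"(\d+)\s+(day|hour|minute|second)s?", text)
    return sum(int(count) * SECONDS[unit] for count, unit in parts)


def node_samples(node):
    """Map one entry of the /nodes response to metric name -> value."""
    return {
        "emqx_node_connections": node["connections"],
        # 1 means running (from the document); 0 otherwise is an assumption.
        "emqx_node_status": 1 if node["node_status"] == "Running" else 0,
        "emqx_node_load1": float(node["load1"]),
        "emqx_node_memory_used_bytes": memory_bytes(node["memory_used"]),
        "emqx_node_uptime_seconds": uptime_seconds(node["uptime"]),
    }


def cluster_samples(nodes):
    """Count running and stopped nodes for the two cluster-level metrics."""
    running = sum(1 for node in nodes if node["node_status"] == "Running")
    return {
        "emqx_cluster_node_running": running,
        "emqx_cluster_node_stopped": len(nodes) - running,
    }


# A trimmed copy of the sample response shown earlier.
nodes = [
    {"connections": 0, "load1": "0.33", "memory_used": "129.38M",
     "node": "emqx@node2.emqx.io", "node_status": "Running",
     "uptime": "3 hours, 12 minutes, 25 seconds"},
    {"connections": 0, "load1": "0.33", "memory_used": "129.82M",
     "node": "emqx@node1.emqx.io", "node_status": "Running",
     "uptime": "3 hours, 12 minutes, 45 seconds"},
]
for node in nodes:
    print(node["node"], node_samples(node))
print(cluster_samples(nodes))
```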
The plugin calls /api/${version}/stats to get per-node statistics. The API returns:
{
"code": 0,
"data": [
{
"node": "emqx@node2.emqx.io",
"subscriptions.shared.max": 0,
"subscriptions.max": 0,
"subscribers.max": 0,
"resources.max": 0,
"topics.count": 0,
"subscriptions.count": 0,
"suboptions.max": 0,
"topics.max": 0,
"sessions.persistent.max": 0,
"connections.max": 0,
"sessions.persistent.count": 0,
"actions.count": 5,
"retained.count": 5,
"rules.count": 0,
"routes.count": 0,
"subscriptions.shared.count": 0,
"suboptions.count": 0,
"sessions.count": 0,
"actions.max": 5,
"retained.max": 5,
"sessions.max": 0,
"rules.max": 0,
"routes.max": 0,
"resources.count": 0,
"subscribers.count": 0,
"connections.count": 0
},
{
"node": "emqx@node1.emqx.io",
"subscriptions.shared.max": 0,
"subscriptions.max": 0,
"subscribers.max": 0,
"resources.max": 0,
"topics.count": 0,
"subscriptions.count": 0,
"suboptions.max": 0,
"topics.max": 0,
"sessions.persistent.max": 0,
"connections.max": 0,
"sessions.persistent.count": 0,
"actions.count": 5,
"retained.count": 5,
"rules.count": 0,
"routes.count": 0,
"subscriptions.shared.count": 0,
"suboptions.count": 0,
"sessions.count": 0,
"actions.max": 5,
"retained.max": 5,
"sessions.max": 0,
"rules.max": 0,
"routes.max": 0,
"resources.count": 0,
"subscribers.count": 0,
"connections.count": 0
}
]
}
Metrics generated from the returned data:
emqx_subscriptions_shared_max{node="xxx",cluster="yyy"} 123
emqx_subscriptions_max{node="xxx",cluster="yyy"} 123
emqx_subscribers_max{node="xxx",cluster="yyy"} 123
emqx_resources_max{node="xxx",cluster="yyy"} 123
emqx_topics_count{node="xxx",cluster="yyy"} 0
emqx_subscriptions_count{node="xxx",cluster="yyy"} 0
emqx_suboptions_max{node="xxx",cluster="yyy"} 0
emqx_topics_max{node="xxx",cluster="yyy"} 0
emqx_sessions_persistent_max{node="xxx",cluster="yyy"} 0
emqx_connections_max{node="xxx",cluster="yyy"} 0
emqx_sessions_persistent_count{node="xxx",cluster="yyy"} 0
emqx_actions_count{node="xxx",cluster="yyy"} 5
emqx_retained_count{node="xxx",cluster="yyy"} 5
emqx_rules_count{node="xxx",cluster="yyy"} 0
emqx_routes_count{node="xxx",cluster="yyy"} 0
emqx_subscriptions_shared_count{node="xxx",cluster="yyy"} 0
emqx_suboptions_count{node="xxx",cluster="yyy"} 0
emqx_sessions_count{node="xxx",cluster="yyy"} 0
emqx_actions_max{node="xxx",cluster="yyy"} 5
emqx_retained_max{node="xxx",cluster="yyy"} 5
emqx_sessions_max{node="xxx",cluster="yyy"} 0
emqx_rules_max{node="xxx",cluster="yyy"} 0
emqx_routes_max{node="xxx",cluster="yyy"} 0
emqx_resources_count{node="xxx",cluster="yyy"} 0
emqx_subscribers_count{node="xxx",cluster="yyy"} 0
emqx_connections_count{node="xxx",cluster="yyy"} 0
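Comparing the response keys with the metric names above suggests a simple mapping: dots become underscores, an emqx_ prefix is added, and the node and cluster values become labels. The Python sketch below reproduces that mapping for illustration only. The helper names `metric_name` and `to_lines` are hypothetical, and the plugin itself may build its samples differently.

```python
def metric_name(key, prefix="emqx_"):
    """Turn a dotted API key such as 'topics.count' into 'emqx_topics_count'."""
    return prefix + key.replace(".", "_")


def to_lines(entry, cluster):
    """Render one per-node entry as exposition-style lines with node/cluster labels."""
    node = entry["node"]
    lines = []
    for key, value in entry.items():
        if key == "node":
            continue  # the node name is used as a label, not as a metric
        lines.append(f'{metric_name(key)}{{node="{node}",cluster="{cluster}"}} {value}')
    return lines


# A trimmed entry from the sample response; "yyy" stands in for the cluster label.
entry = {"node": "emqx@node2.emqx.io", "subscriptions.shared.max": 0, "actions.count": 5}
for line in to_lines(entry, "yyy"):
    print(line)
```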
The plugin calls /api/${version}/nodes/${node}/metrics to get the metrics of each node. The API returns:
{
"code": 0,
"data": {
"rules.matched": 0,
"messages.sent": 0,
"packets.disconnect.sent": 0,
"bytes.sent": 0,
"packets.disconnect.received": 0,
"packets.pingresp.sent": 0,
"packets.pingreq.received": 0,
"packets.unsubscribe.received": 0,
"packets.pubcomp.missed": 0,
"packets.puback.missed": 0,
"packets.pubcomp.sent": 0,
"packets.pubcomp.received": 0,
"packets.pubrec.missed": 0,
"auth.mqtt.anonymous": 0,
"packets.connack.auth_error": 0,
"actions.failure": 0,
"packets.suback.sent": 0,
"packets.puback.sent": 0,
"messages.retained": 5,
"messages.received": 0,
"packets.connect.received": 0,
"messages.forward": 0,
"packets.pubrel.missed": 0,
"packets.publish.received": 0,
"packets.connack.sent": 0,
"packets.subscribe.received": 0,
"packets.pubrel.received": 0,
"packets.pubrec.received": 0,
"packets.puback.received": 0,
"packets.sent": 0,
"packets.received": 0,
"bytes.received": 0,
"messages.expired": 0,
"messages.dropped": 0,
"messages.qos2.dropped": 0,
"messages.qos2.expired": 0,
"packets.pubrel.sent": 0,
"packets.pubrec.sent": 0,
"packets.publish.sent": 0,
"actions.success": 0,
"packets.publish.error": 0,
"packets.unsubscribe.error": 0,
"messages.qos2.received": 0,
"messages.qos1.received": 0,
"messages.qos0.received": 0,
"packets.auth.sent": 0,
"messages.qos2.sent": 0,
"messages.qos1.sent": 0,
"messages.qos0.sent": 0,
"packets.auth.received": 0,
"packets.unsuback.sent": 0,
"packets.connack.error": 0,
"packets.publish.auth_error": 0,
"packets.subscribe.error": 0,
"packets.subscribe.auth_error": 0
}
}
The corresponding generated metrics:
emqx_rules_matched{node="xxx",cluster="yyy"} 0
emqx_messages_sent{node="xxx",cluster="yyy"} 0
emqx_packets_disconnect_sent{node="xxx",cluster="yyy"} 0
emqx_bytes_sent{node="xxx",cluster="yyy"} 0
emqx_packets_disconnect_received{node="xxx",cluster="yyy"} 0
emqx_packets_pingresp_sent{node="xxx",cluster="yyy"} 0
emqx_packets_pingreq_received{node="xxx",cluster="yyy"} 0
emqx_packets_unsubscribe_received{node="xxx",cluster="yyy"} 0
emqx_packets_pubcomp_missed{node="xxx",cluster="yyy"} 0
emqx_packets_puback_missed{node="xxx",cluster="yyy"} 0
emqx_packets_pubcomp_sent{node="xxx",cluster="yyy"} 0
emqx_packets_pubcomp_received{node="xxx",cluster="yyy"} 0
emqx_packets_pubrec_missed{node="xxx",cluster="yyy"} 0
emqx_auth_mqtt_anonymous{node="xxx",cluster="yyy"} 0
emqx_packets_connack_auth_error{node="xxx",cluster="yyy"} 0
emqx_actions_failure{node="xxx",cluster="yyy"} 0
emqx_packets_suback_sent{node="xxx",cluster="yyy"} 0
emqx_packets_puback_sent{node="xxx",cluster="yyy"} 0
emqx_messages_retained{node="xxx",cluster="yyy"} 5
emqx_messages_received{node="xxx",cluster="yyy"} 0
emqx_packets_connect_received{node="xxx",cluster="yyy"} 0
emqx_messages_forward{node="xxx",cluster="yyy"} 0
emqx_packets_pubrel_missed{node="xxx",cluster="yyy"} 0
emqx_packets_publish_received{node="xxx",cluster="yyy"} 0
emqx_packets_connack_sent{node="xxx",cluster="yyy"} 0
emqx_packets_subscribe_received{node="xxx",cluster="yyy"} 0
emqx_packets_pubrel_received{node="xxx",cluster="yyy"} 0
emqx_packets_pubrec_received{node="xxx",cluster="yyy"} 0
emqx_packets_puback_received{node="xxx",cluster="yyy"} 0
emqx_packets_sent{node="xxx",cluster="yyy"} 0
emqx_packets_received{node="xxx",cluster="yyy"} 0
emqx_bytes_received{node="xxx",cluster="yyy"} 0
emqx_messages_expired{node="xxx",cluster="yyy"} 0
emqx_messages_dropped{node="xxx",cluster="yyy"} 0
emqx_messages_qos2_dropped{node="xxx",cluster="yyy"} 0
emqx_messages_qos2_expired{node="xxx",cluster="yyy"} 0
emqx_packets_pubrel_sent{node="xxx",cluster="yyy"} 0
emqx_packets_pubrec_sent{node="xxx",cluster="yyy"} 0
emqx_packets_publish_sent{node="xxx",cluster="yyy"} 0
emqx_actions_success{node="xxx",cluster="yyy"} 0
emqx_packets_publish_error{node="xxx",cluster="yyy"} 0
emqx_packets_unsubscribe_error{node="xxx",cluster="yyy"} 0
emqx_messages_qos2_received{node="xxx",cluster="yyy"} 0
emqx_messages_qos1_received{node="xxx",cluster="yyy"} 0
emqx_messages_qos0_received{node="xxx",cluster="yyy"} 0
emqx_packets_auth_sent{node="xxx",cluster="yyy"} 0
emqx_messages_qos2_sent{node="xxx",cluster="yyy"} 0
emqx_messages_qos1_sent{node="xxx",cluster="yyy"} 0
emqx_messages_qos0_sent{node="xxx",cluster="yyy"} 0
emqx_packets_auth_received{node="xxx",cluster="yyy"} 0
emqx_packets_unsuback_sent{node="xxx",cluster="yyy"} 0
emqx_packets_connack_error{node="xxx",cluster="yyy"} 0
emqx_packets_publish_auth_error{node="xxx",cluster="yyy"} 0
emqx_packets_subscribe_error{node="xxx",cluster="yyy"} 0
emqx_packets_subscribe_auth_error{node="xxx",cluster="yyy"} 0
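The three API paths used in this section share the same ${version} placeholder, and the last one is requested once per node. As a rough sketch of how the URLs could be assembled from one dashboard address, consider the following Python snippet. The function name `endpoint_urls` and the dictionary keys are hypothetical, and authentication is deliberately left out.

```python
def endpoint_urls(base_url, version, node_names):
    """Build the URLs for the nodes, stats, and per-node metrics endpoints."""
    base = base_url.rstrip("/")
    urls = {
        "nodes": f"{base}/api/{version}/nodes",
        "stats": f"{base}/api/{version}/stats",
    }
    # One metrics request per node name returned by the nodes endpoint.
    for name in node_names:
        urls[f"metrics:{name}"] = f"{base}/api/{version}/nodes/{name}/metrics"
    return urls


urls = endpoint_urls(
    "http://192.168.11.11:18083",
    "v3",
    ["emqx@node1.emqx.io", "emqx@node2.emqx.io"],
)
for key, url in urls.items():
    print(key, url)
```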
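- Other
EMQX 5.X exposes a Prometheus endpoint, so you can configure the input.prometheus plugin to scrape http://ip:18083/api/v5/prometheus/stats. No additional username or password is required.
The metrics returned by the Prometheus endpoint differ somewhat from those returned by the node/stats API. The table below lists the metrics unique to each.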
Prometheus metrics | API metrics |
---|---|
emqx_cert_expiry_at (certificate expiry time) | emqx_messages_dropped_await_pubrel_timeout (messages dropped after a PUBREL wait timeout) |
emqx_conf_sync_txid (config sync transaction ID) | emqx_messages_persisted (persisted messages) |
emqx_durable_subscriptions_count (durable subscriptions) | emqx_messages_transformation_failed (failed message transformations) |
emqx_durable_subscriptions_max (maximum durable subscriptions) | emqx_messages_transformation_succeeded (successful message transformations) |
emqx_messages_dropped_expired (messages dropped due to expiry) | emqx_messages_validation_failed (failed message validations) |
emqx_messages_retained (retained messages) | emqx_messages_validation_succeeded (successful message validations) |
emqx_mria_last_intercepted_trans (last intercepted transaction) | emqx_node_cluster_connections (cluster connections) |
emqx_mria_replicants (replicant node information) | emqx_node_connections (node connections) |
emqx_mria_server_mql (Mria server MQL) | emqx_node_live_connections (live node connections) |
emqx_mria_weight (Mria weight) | emqx_node_load15 (15-minute load) |
emqx_packets_connect (CONNECT packets) | emqx_node_load1 (1-minute load) |
emqx_vm_cpu_idle (CPU idle rate) | emqx_node_load5 (5-minute load) |
emqx_vm_cpu_use (CPU usage) | emqx_node_max_fds (maximum file descriptors) |
emqx_vm_process_messages_in_queues (messages in process queues) | emqx_node_memory_total_bytes (total node memory) |
emqx_vm_run_queue (run queue length) | emqx_node_memory_used_bytes (used node memory) |
emqx_vm_total_memory (total memory) | emqx_node_process_available (available processes) |
emqx_vm_used_memory (used memory) | emqx_node_process_used (used processes) |
emqx_node_uptime_seconds (node uptime) | |
emqx_overload_protection_delay_ok (overload protection delay successes) | |
emqx_overload_protection_delay_timeout (overload protection delay timeouts) | |
emqx_overload_protection_gc (overload protection GC count) | |
emqx_overload_protection_hibernation (overload protection hibernation count) | |
emqx_overload_protection_new_conn (overload protection new connections) | |
emqx_packets_connect_received (CONNECT packets received) | |
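To check which metric names are unique to each source in your own environment, one option is to compare the names found in two text dumps in Prometheus exposition format. The Python sketch below is only an illustration: the function name `metric_names` is hypothetical, and the two sample texts are made up for the example rather than captured from a real EMQX instance.

```python
import re


def metric_names(exposition_text):
    """Collect metric names from text in Prometheus exposition format."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first "{" (labels) or whitespace (value).
        names.add(re.split(r"[{\s]", line, maxsplit=1)[0])
    return names


# Hypothetical samples, shortened for illustration.
prometheus_text = """
# TYPE emqx_vm_cpu_use gauge
emqx_vm_cpu_use 12.5
emqx_messages_received 0
"""
api_text = """
emqx_node_load1{node="xxx",cluster="yyy"} 0.33
emqx_messages_received{node="xxx",cluster="yyy"} 0
"""

prometheus_only = metric_names(prometheus_text) - metric_names(api_text)
api_only = metric_names(api_text) - metric_names(prometheus_text)
print("Prometheus only:", sorted(prometheus_only))
print("API only:", sorted(api_only))
```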
Dashboard: