
Nightingale v9 self-healing script history tasks: view execution history of scripts triggered by alerts or manually dispatched ad-hoc tasks, including per-target output, exit codes, and elapsed time.

Overview

History Tasks lists every execution record produced when a self-healing script is triggered.

Sidebar path: Alerts → Alert Self-Healing → History Tasks tab, URL /job-tasks.

Every time a self-healing script is triggered, the platform generates a task record that captures:

Task source (triggered by alert / user-initiated / ad-hoc task created via API)
Which target machines it was dispatched to
Per-machine execution results (stdout / stderr / exit code)
Execution duration
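
As a rough mental model, one task record can be pictured as the structure below. This is a hypothetical sketch for illustration only: the class and field names (TaskRecord, HostResult, source, and so on) are assumptions, not Nightingale's actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HostResult:
    """Outcome of the script on one target machine (hypothetical shape)."""
    host: str
    stdout: str
    stderr: str
    exit_code: int
    duration_seconds: float


@dataclass
class TaskRecord:
    """One entry in History Tasks (field names are illustrative only)."""
    task_id: int
    title: str
    source: str  # assumed values: "alert", "manual", "api"
    results: List[HostResult] = field(default_factory=list)

    def failed_hosts(self) -> List[str]:
        # A non-zero exit code marks the host as failed.
        return [r.host for r in self.results if r.exit_code != 0]


record = TaskRecord(
    task_id=1,
    title="restart nginx",
    source="manual",
    results=[
        HostResult("web-01", "restarted\n", "", 0, 1.2),
        HostResult("web-02", "", "unit not found\n", 5, 0.3),
    ],
)
print(record.failed_hosts())  # ['web-02']
```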

Applicable scenarios:

A self-healing script “seems” not to be working? Check the history to see whether it never triggered, or triggered but failed.
After a large bulk operation, view per-machine output to confirm status.
One-off ad-hoc operations (restart some service, clean up disk space) — use Create Ad-hoc Task to dispatch to a group of machines with one click.
Compliance / postmortem: who, when, on which machines, ran which scripts.

List & Filters

Filters at the top of the page:

Control	Description
Keyword search	Search within the task title
Time range	Default last 7 days, no upper limit
Only mine	Checked by default — show only tasks created by the current account; uncheck to see tasks across the whole business group
Business group (left side)	Filter by business group

List columns:

Column	Meaning
ID	Database primary key of the task, used for cross-page reference
Title	Task title; alert-triggered tasks show the alert rule name; ad-hoc tasks show the title entered by the user
Action	Click to enter task details and view per-target output
Creator	Who triggered it: username (manual) / alert engine name (automatic)
Created at	When the task was dispatched

Create Ad-hoc Task

The Create Ad-hoc Task button in the upper right dispatches a script to a group of machines as a one-off run, without binding it to an alert rule. Common uses:

Bulk restart a service;
Clean up disk space as an emergency stopgap;
One-off operational inspection (e.g., collect the OS version of all machines).
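
For the inspection case above, the script pasted into the editor could be as small as the following sketch. It assumes a Python interpreter is available on the target machines; the function name os_summary is made up for this example.

```python
import platform


def os_summary() -> str:
    """Return one line with this machine's hostname, OS name, and release."""
    return f"{platform.node()} {platform.system()} {platform.release()}"


# Anything printed is what would appear as this machine's stdout.
# The script ends normally, so its exit code is 0.
print(os_summary())
```

Because every machine prints a single line, the per-machine output in the task details can be compared at a glance.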

Clicking it opens the task creation form. The main fields are:

Title: makes it easier to identify later in the history task list;
Script content: bash / python / PowerShell etc., written directly in the editor;
Target machines: select from the device list, supporting filtering by business group / labels;
Timeout: maximum execution time per machine; the script is killed automatically when it times out;
Execution mode: all machines in parallel / rolling (batch by batch) / pause mode (each next step is started manually).
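
The difference between parallel and rolling execution can be pictured with the sketch below. It is not Nightingale's scheduler: rolling_batches and its batch_size parameter are hypothetical names, used here only to show how a host list splits into batches that run one after another.

```python
from typing import Iterator, List, Sequence


def rolling_batches(hosts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield hosts in consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(hosts), batch_size):
        yield list(hosts[start:start + batch_size])


hosts = ["web-01", "web-02", "web-03", "web-04", "web-05"]

# Parallel mode corresponds to a single batch holding every host.
print(list(rolling_batches(hosts, len(hosts))))

# Rolling mode with a batch size of 2: later batches wait for earlier ones.
print(list(rolling_batches(hosts, 2)))
```

Smaller batches limit how many machines a faulty script can affect at once, at the cost of a longer total run.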

After an ad-hoc task completes, the task record remains and can be reviewed repeatedly — it won’t disappear when the machine restarts or the session expires.

Task Details: Per-Machine Execution Results

Click a list row to enter task details:

Overview: task title, script preview, target count, success/failure counts, execution timeline;
Target machine list: one row per machine, showing the current status (pending / running / success / failed / killed);
Click a single row to expand the specific output for that machine:
- stdout: script’s standard output
- stderr: error output
- Exit code: 0 = success, non-zero = failure
- Duration: single-machine execution time
Failed machines: can be re-run with one click individually, without re-dispatching the whole task.
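
The four per-machine fields map directly onto what any process runner can observe. The sketch below shows one way to collect them locally with Python's standard library. It is an illustration under stated assumptions, not how Categraf executes tasks: RunResult, run_script, and the use of -1 as the exit code for a killed process are all choices made for this example.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import List


@dataclass
class RunResult:
    status: str        # "success", "failed", or "killed"
    exit_code: int     # -1 here means the process was killed on timeout
    stdout: str
    stderr: str
    duration_seconds: float


def _text(data) -> str:
    # Output attached to TimeoutExpired may be None, bytes, or str.
    if data is None:
        return ""
    return data if isinstance(data, str) else data.decode(errors="replace")


def run_script(argv: List[str], timeout_seconds: float) -> RunResult:
    """Run a command and collect the fields shown in the task details."""
    started = time.monotonic()
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before raising TimeoutExpired.
        elapsed = time.monotonic() - started
        return RunResult("killed", -1, _text(exc.stdout), _text(exc.stderr), elapsed)
    elapsed = time.monotonic() - started
    status = "success" if proc.returncode == 0 else "failed"
    return RunResult(status, proc.returncode, proc.stdout, proc.stderr, elapsed)


ok = run_script([sys.executable, "-c", "print('done')"], timeout_seconds=10)
bad = run_script([sys.executable, "-c", "import sys; sys.exit(3)"], timeout_seconds=10)
print(ok.status, ok.exit_code)    # success 0
print(bad.status, bad.exit_code)  # failed 3
```

When writing a self-healing script, returning a distinct non-zero exit code for each failure cause makes the details page easier to read than a generic failure.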

FAQ

Q1: The alert rule is configured with a self-healing script, but no execution record appears in History Tasks?

A: Troubleshoot in this order:

Did the alert actually fire: check Active Alerts for events; if there is no event, the self-healing script is never triggered;
Is the self-healing script bound to that alert rule: confirm in the “Self-Healing” section of the alert rule edit page;
Are the target machines online: self-healing scripts are dispatched via Categraf; machines without heartbeats are skipped;
Alert engine logs: go to Alert Engine and check the server logs for errors like job dispatch failed.

Q2: After dispatching an ad-hoc task, some machines remain in pending state?

A: Common reasons:

No heartbeat: Categraf is offline and therefore cannot receive dispatched tasks;
Agent version too old: very old Categraf versions don’t support task dispatch; upgrade to a current version;
Concurrency throttling: the task is in “rolling mode” and the previous batch hasn’t finished, so the subsequent ones are queued.

Q3: How long is history task data retained?

A: Retained long-term by default (not subject to automatic cleanup). For large task volumes, DBAs can configure retention days and automatic cleanup in n9e.toml as needed.

Q4: Will script output be leaked / can sensitive information be seen?

A: Both the script itself and the execution output are fully stored in the database — do not write plaintext passwords or tokens in the script. Recommendations:

Pull sensitive credentials via environment variables from the target machine’s /etc/environment or from vault;
Control task viewing permissions via Role Management, granting access only to operations roles;
After a P0 incident, clean up history records involving sensitive operations (requires DBA action).
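
The first recommendation can be sketched as follows. The example assumes the credential is already present in the environment of the process that runs the script; the variable name HEAL_API_TOKEN and the helper names are hypothetical. The point is that neither the script text nor its output ever contains the secret.

```python
import os
import sys
from typing import Mapping

TOKEN_VAR = "HEAL_API_TOKEN"  # hypothetical variable name


def load_token(env: Mapping[str, str]) -> str:
    """Return the token from the environment, or raise if it is missing."""
    token = env.get(TOKEN_VAR, "")
    if not token:
        raise KeyError(f"{TOKEN_VAR} is not set")
    return token


def main(env: Mapping[str, str]) -> int:
    try:
        token = load_token(env)
    except KeyError as exc:
        # Report the variable name only; stderr is stored with the task.
        print(f"missing credential: {exc.args[0]}", file=sys.stderr)
        return 1
    # Use `token` for the real work here; never print or log it.
    print("credential loaded")
    return 0


# A dispatched script would pass this value to sys.exit().
exit_code = main(os.environ)
```

A missing variable produces exit code 1 and a message on stderr, so the failure is visible in the task details without exposing anything sensitive.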
