6.2.1.11. health
6.2.1.11.1. Introduction
The health command of the show subgroup displays statistics and health information for the GCap.
6.2.1.11.2. Prerequisites
Users: setup, gviewadm
Dependencies: N/A
6.2.1.11.3. Command
show health
6.2.1.11.4. Example
Enter the following command.
(gcap-cli) show health
Validate.
The system displays the following information:
block counters - Mass storage statistics
cpu_stats counters - Processor statistics
disks counters - Mount point occupancy statistics
emergency counters - GCap emergency mode information
gcenter counters - Paired GCenter information
high_availability counters - High Availability (HA) information
interfaces counters - Statistics on network interfaces
loadavg counters - Statistics on the average load of the GCap
meminfo counters - Statistics on the RAM
numastat counters - Non Uniform Memory Access (NUMA) node statistics
softnet counters - Statistics on received packets according to processor cores
suricata counters - Sigflow (monitoring-engine) information
systemd counters - System initialisation information
uptime counters - Uptime
virtualmemory counters - Swap space information (swap)
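As a minimal sketch, assuming the output of show health has been captured as JSON text (how it is captured is not covered here), the counter groups listed above can be enumerated with a few lines of Python. The sample string is a shortened, hypothetical excerpt, and list_counters is an illustrative helper name, not part of the GCap tooling.

```python
import json

# Shortened, hypothetical excerpt of the JSON printed by `show health`.
sample = '{"emergency": {"emergency_active": false}, "uptime": {"up_seconds": 874179.8}}'


def list_counters(raw: str) -> list[str]:
    """Return the sorted names of the top-level counter groups."""
    return sorted(json.loads(raw))


print(list_counters(sample))
```

The same pattern applies to the full output: each top-level key corresponds to one of the counter groups described in the following subsections.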
6.2.1.11.4.1. block
Counter details - Mass storage statistics
sdN - Statistics for disk N, where N is a letter of the alphabet
  read_bytes - Bytes read since startup
  written_bytes - Bytes written since startup
Example:
{
"block": {
"sda": {
"read_bytes": 302867968,
"written_bytes": 4837645312
},
"sdb": {
"read_bytes": 3894272,
"written_bytes": 4096
}
},
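Because read_bytes and written_bytes are raw byte counts, a small helper can make them easier to read. The sketch below is illustrative only: human_bytes is a hypothetical helper name, and the values are copied from the example above.

```python
def human_bytes(n: float) -> str:
    """Format a byte count using binary units (KiB, MiB, ...)."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit where the value fits, or at the largest unit.
        if n < 1024 or unit == "TiB":
            return f"{n:.1f} {unit}"
        n /= 1024


# Values taken from the "block" example.
block = {
    "sda": {"read_bytes": 302867968, "written_bytes": 4837645312},
    "sdb": {"read_bytes": 3894272, "written_bytes": 4096},
}

for disk, counters in block.items():
    print(disk, human_bytes(counters["read_bytes"]), human_bytes(counters["written_bytes"]))
```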
6.2.1.11.4.2. cpu_stats
Counter details - CPU statistics
cpus - CPU usage statistics
  cpu - Overall core usage statistics
  cpuX - Statistics for CPU core X
    idle - Time spent idle, in milliseconds
    iowait - Time spent waiting for disk operations, in milliseconds
    irq - Time spent servicing hardware interrupts (IRQs), in milliseconds
    nice - Time spent in user space on low-priority processes, in milliseconds
    softirq - Time spent servicing software interrupts (softirqs), in milliseconds
    system - Time spent in kernel space, in milliseconds
    user - Time spent in user space, in milliseconds
interrupts - Number of interrupts since startup
processes_blocked - Number of blocked or dead processes
processes_running - Number of running processes
Example:
...
"cpu_stats": {
"cpus": {
"cpu": {
"idle": 961816208,
"iowait": 11419,
"irq": 0,
"nice": 0,
"softirq": 397899,
"system": 21788203,
"user": 50806194
},
"cpu0": {
"idle": 79960857,
"iowait": 985,
"irq": 0,
"nice": 0,
"softirq": 234748,
"system": 1795880,
"user": 4357374
},
"cpu1": {
"idle": 80166571,
"iowait": 951,
"irq": 0,
"nice": 0,
"softirq": 88078,
"system": 1830370,
"user": 4138182
}
},
"interrupts": 12942835029,
"processes_blocked": 0,
"processes_running": 1
},
...
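The per-core counters are cumulative, so a ratio between them gives an average since startup rather than an instantaneous load. The following sketch assumes that the seven fields together account for all CPU time; busy_percent is a hypothetical helper name, and the figures are those of cpu0 in the example above.

```python
def busy_percent(cpu: dict) -> float:
    """Share of time not spent in idle or iowait, as a percentage."""
    total = sum(cpu.values())
    if total == 0:
        return 0.0
    return 100.0 * (total - cpu["idle"] - cpu["iowait"]) / total


# Counters of cpu0 from the "cpu_stats" example.
cpu0 = {
    "idle": 79960857, "iowait": 985, "irq": 0, "nice": 0,
    "softirq": 234748, "system": 1795880, "user": 4357374,
}

print(f"cpu0 busy since startup: {busy_percent(cpu0):.1f}%")
```

To estimate the current load instead, the same calculation can be applied to the difference between two successive samples.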
6.2.1.11.4.3. disks
Counter details - Mount point occupancy statistics
/mountpoint/path - Mount point path
  block_free - Number of free blocks
  block_total - Total number of blocks
  inode_free - Number of remaining inodes
  inode_total - Total number of inodes
Example:
...
"disks": {
"/": {
"block_free": 247909,
"block_total": 249830,
"inode_free": 64258,
"inode_total": 65536
},
"/data": {
"block_free": 7150076,
"block_total": 7161801,
"inode_free": 1827417,
"inode_total": 1827840
}
},
...
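Occupancy is easier to judge as a percentage than as raw block and inode counts. This is a minimal sketch using the values from the example above; used_percent is a hypothetical helper name.

```python
def used_percent(free: int, total: int) -> float:
    """Percentage of blocks (or inodes) in use; 0.0 when total is zero."""
    return 100.0 * (total - free) / total if total else 0.0


# Values taken from the "disks" example.
disks = {
    "/": {"block_free": 247909, "block_total": 249830,
          "inode_free": 64258, "inode_total": 65536},
    "/data": {"block_free": 7150076, "block_total": 7161801,
              "inode_free": 1827417, "inode_total": 1827840},
}

for path, s in disks.items():
    blocks = used_percent(s["block_free"], s["block_total"])
    inodes = used_percent(s["inode_free"], s["inode_total"])
    print(f"{path}: {blocks:.2f}% of blocks used, {inodes:.2f}% of inodes used")
```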
6.2.1.11.4.4. emergency
Counter details - GCap emergency mode information
emergency_active - Whether emergency mode is active (true) or inactive (false)
Example:
...
"emergency": {
"emergency_active": false
},
...
6.2.1.11.4.5. gcenter
Counter details - Paired GCenter information
chronyc_sync - Status of the NTP synchronisation with the GCenter
reachable - Whether the GCenter is reachable (true) or not (false)
Example:
...
"gcenter": {
"chronyc_sync": false,
"reachable": false
},
...
6.2.1.11.4.6. high_availability
Counter details - High Availability (HA) information
healthy - HA health status
last_status - Last known HA status
last_transition - Date of the last known HA status change, in ISO 8601 format
leader - True for a GCap leader, false for a GCap follower
status - Whether HA is active (true) or inactive (false)
Example:
...
"high_availability": {
"healthy": false,
"last_status": -1,
"last_transition": "0001-01-01T00:00:00Z",
"leader": false,
"status": false
},
...
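Since last_transition is an ISO 8601 string, it can be parsed with the Python standard library. In this sketch, treating the zero timestamp shown in the example (0001-01-01T00:00:00Z) as "no transition recorded" is an assumption, not documented behaviour, and parse_transition is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Zero timestamp as it appears in the example output.
ZERO = datetime(1, 1, 1, tzinfo=timezone.utc)


def parse_transition(value: str):
    """Return a timezone-aware datetime, or None for the zero timestamp."""
    # Replace the "Z" suffix so that older Python versions can parse it.
    ts = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return None if ts == ZERO else ts


print(parse_transition("0001-01-01T00:00:00Z"))
```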
6.2.1.11.4.7. interfaces
Counter details - Statistics on network interfaces
bond0 - Name of the network interface
  rx_bytes - Number of bytes received
  rx_drops - Number of bytes lost in reception
  rx_errs - Number of invalid bytes received
  rx_packets - Total number of packets received from this interface
  tx_bytes - Number of bytes sent
  tx_drops - Number of bytes lost while sending
  tx_errs - Number of invalid bytes sent
  tx_packets - Total number of packets sent from this interface
Example:
...
"interfaces": {
"bond0": {
"rx_bytes": 0,
"rx_drops": 0,
"rx_errs": 0,
"rx_packets": 0,
"tx_bytes": 0,
"tx_drops": 0,
"tx_errs": 0,
"tx_packets": 0
},
"gcp0": {
"rx_bytes": 138433006,
"rx_drops": 82901,
"rx_errs": 0,
"rx_packets": 2143236,
"tx_bytes": 796294,
"tx_drops": 0,
"tx_errs": 0,
"tx_packets": 3635
},
"gcp1": {
"rx_bytes": 137642525,
"rx_drops": 82902,
"rx_errs": 0,
"rx_packets": 2135060,
"tx_bytes": 0,
"tx_drops": 0,
"tx_errs": 0,
"tx_packets": 0
}
},
...
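A quick way to use these counters is to list the interfaces that report any loss or error. The sketch below makes no assumption about the unit of the counters and only checks whether they are non-zero; interfaces_with_losses is a hypothetical helper name, and the data is reduced to the relevant keys of the example above.

```python
LOSS_KEYS = ("rx_drops", "rx_errs", "tx_drops", "tx_errs")


def interfaces_with_losses(interfaces: dict) -> list[str]:
    """Names of interfaces where any drop or error counter is non-zero."""
    return sorted(
        name for name, counters in interfaces.items()
        if any(counters.get(key, 0) for key in LOSS_KEYS)
    )


# Drop and error counters from the "interfaces" example.
interfaces = {
    "bond0": {"rx_drops": 0, "rx_errs": 0, "tx_drops": 0, "tx_errs": 0},
    "gcp0": {"rx_drops": 82901, "rx_errs": 0, "tx_drops": 0, "tx_errs": 0},
    "gcp1": {"rx_drops": 82902, "rx_errs": 0, "tx_drops": 0, "tx_errs": 0},
}

print(interfaces_with_losses(interfaces))
```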
6.2.1.11.4.8. loadavg
Counter details - Statistics on the average load of the GCap
active_processes - Number of processes started
load_average_15_mins - Average load over the last fifteen minutes
load_average_1_min - Average load over the last minute
load_average_5_mins - Average load over the last five minutes
running_processes - Number of running processes
Example:
...
"loadavg": {
"active_processes": 561,
"load_average_15_mins": 0.99,
"load_average_1_min": 0.67,
"load_average_5_mins": 1,
"running_processes": 2
},
...
6.2.1.11.4.9. meminfo
Counter details - Statistics on the RAM
available - Memory available for new allocations without swapping, in kilobytes
buffers - Memory used by disk operations in kilobytes
cached - Memory used by the cache in kilobytes
dirty - Memory used by pending write operations in kilobytes
free - Unused memory in kilobytes
hugepages_anonymous - Number of anonymous transparent huge pages used
hugepages_free - Number of available transparent huge pages
hugepages_reserved - Number of reserved transparent huge pages
hugepages_shmem - Number of shared transparent huge pages
hugepages_surplus - Number of extra transparent huge pages
hugepages_total - Total number of huge pages
kernel_stack - Memory used by kernel stack allocations in kilobytes
page_tables - Memory used for page management in kilobytes
s_reclaimable - Cache memory that can be reallocated in case of memory shortage, in kilobytes
shmem - Memory used by shared pages in kilobytes
slab - Memory used by kernel data structures in kilobytes
swap_cached - Memory used by the swap cache in kilobytes
swap_free - Available memory in swap in kilobytes
swap_total - Total swap memory in kilobytes
total - Total memory in kilobytes
v_malloc_used - Memory used by large memory areas allocated by the kernel
For more information, refer to the meminfo documentation.
Example:
...
"meminfo": {
"available": 13608896,
"buffers": 380932,
"cached": 1155824,
"dirty": 28,
"free": 13128080,
"hugepages_anonymous": 423936,
"hugepages_free": 0,
"hugepages_reserved": 0,
"hugepages_shmem": 0,
"hugepages_surplus": 0,
"hugepages_total": 0,
"kernel_stack": 9152,
"page_tables": 8400,
"s_reclaimable": 43168,
"shmem": 794564,
"slab": 210008,
"swap_cached": 0,
"swap_free": 16777212,
"swap_total": 16777212,
"total": 15977468,
"v_malloc_used": 66592
},
...
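Two figures are often derived from these counters: the share of RAM in use and the amount of swap in use. The sketch below assumes that available reports the memory still usable by applications (as MemAvailable does in Linux /proc/meminfo); memory_summary is a hypothetical helper name, and the values are those of the example above.

```python
def memory_summary(meminfo: dict) -> dict:
    """Derive the RAM usage percentage and the swap in use (kilobytes)."""
    total = meminfo["total"]
    used_pct = 100.0 * (total - meminfo["available"]) / total if total else 0.0
    swap_used_kb = meminfo["swap_total"] - meminfo["swap_free"]
    return {"used_percent": round(used_pct, 1), "swap_used_kb": swap_used_kb}


# Relevant values from the "meminfo" example.
meminfo = {
    "available": 13608896,
    "total": 15977468,
    "swap_free": 16777212,
    "swap_total": 16777212,
}

print(memory_summary(meminfo))
```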
6.2.1.11.4.10. numastat
Counter details - Non Uniform Memory Access (NUMA) node statistics
nodes - List of NUMA nodes
  nodeX - Statistics for NUMA node X
    interleave_hit - Interleaved memory successfully allocated in this node
    local_node - Memory allocated in this node while a process was running on it
    numa_foreign - Memory planned for this node, but currently allocated in a different node
    numa_hit - Memory successfully allocated in this node as expected
    numa_miss - Memory allocated in this node despite process preferences. Each numa_miss has a numa_foreign in another node
    other_node - Memory allocated in this node while a process was running in another node
Example:
...
"numastat": {
"nodes": {
"node0": {
"interleave_hit": 3871,
"local_node": 4410557829,
"numa_foreign": 0,
"numa_hit": 4410454203,
"numa_miss": 0,
"other_node": 14170
},
"node1": {
"interleave_hit": 3869,
"local_node": 4224990850,
"numa_foreign": 0,
"numa_hit": 4224964539,
"numa_miss": 0,
"other_node": 21531
}
}
},
...
6.2.1.11.4.11. softnet
Counter details - Statistics on received packets according to processor cores
cpus - Usage statistics per CPU
  cpuX - Statistics for CPU core X
    backlog_len -
    dropped - Number of packets dropped
    flow_limit_count - Number of times the throughput limit was reached
    processed - Number of packets processed
    received_rps - Number of times the CPU was woken up
    time_squeeze - Number of times the thread could not process all the packets in its backlog within the budget
summed - Overall core usage statistics
  backlog_len -
  dropped - Number of packets dropped
  flow_limit_count - Number of times the throughput limit was reached
  processed - Number of packets processed
  received_rps - Number of times the CPU was woken up
  time_squeeze - Number of times the thread could not process all the packets in its backlog within the budget
Example:
...
"softnet": {
"cpus": {
"cpu0": {
"backlog_len": 0,
"dropped": 0,
"flow_limit_count": 0,
"processed": 448550,
"received_rps": 0,
"time_squeeze": 2
},
"cpu1": {
"backlog_len": 0,
"dropped": 0,
"flow_limit_count": 0,
"processed": 36250,
"received_rps": 0,
"time_squeeze": 0
}
},
"summed": {
"backlog_len": 0,
"dropped": 0,
"flow_limit_count": 0,
"processed": 5239450,
"received_rps": 0,
"time_squeeze": 27
}
},
...
6.2.1.11.4.12. suricata
Counter details - Sigflow (monitoring-engine) information
detailed_status - Sigflow container status
up - Status of Sigflow and the detection engine

| detailed_status | up | Meaning |
|---|---|---|
| "Container down" | false | Engine off |
| "Container down" | true | Impossible state: the engine cannot run in a stopped container |
| "Container UP" | false | Unstable state: contact Gatewatcher support |
| "Container UP" | true | Engine on |
Example:
...
"suricata": {
"detailed_status": "Container down",
"up": false
},
...
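The four combinations of detailed_status and up described above can be turned into a single readable state. This sketch assumes that detailed_status only takes the two values shown in this section and treats anything other than "Container UP" as a stopped container; engine_state is a hypothetical helper name.

```python
def engine_state(detailed_status: str, up: bool) -> str:
    """Map the two suricata counters to the meaning given in the status table."""
    # Assumption: any value other than "Container UP" means the container is down.
    container_up = detailed_status == "Container UP"
    if container_up and up:
        return "engine on"
    if container_up:
        return "unstable: contact Gatewatcher support"
    if up:
        return "impossible state"
    return "engine off"


# Values from the "suricata" example.
print(engine_state("Container down", False))
```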
6.2.1.11.4.13. systemd
Counter details - System initialisation information
failed_services - List of failed services reported by systemctl --failed.
Example:
...
"systemd": {
"failed_services": [ "netdata.service" ]
},
...
6.2.1.11.4.14. uptime
Counter details - Uptime
up_seconds - Number of seconds since start-up.
Example:
...
"uptime": {
"up_seconds": 874179.8
},
...
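A raw number of seconds is hard to read at a glance. The following sketch converts up_seconds into days, hours and minutes; uptime_parts is a hypothetical helper name, and the input is the value from the example above.

```python
def uptime_parts(up_seconds: float) -> tuple[int, int, int]:
    """Split an uptime in seconds into whole days, hours and minutes."""
    days, rem = divmod(int(up_seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes = rem // 60
    return days, hours, minutes


days, hours, minutes = uptime_parts(874179.8)
print(f"up {days} days, {hours} hours, {minutes} minutes")
```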
6.2.1.11.4.15. virtualmemory
Counter details - Swap space information (swap)
disk_in - Number of pages saved to disk since start-up.
disk_out - Number of pages out of disk since start-up.
pagefaults_major - Number of page faults per second.
pagefaults_minor - Number of page faults per second to load a memory page from disk to RAM.
swap_in - Number of kilobytes the system swapped from disk to RAM per second.
swap_out - Number of kilobytes the system swapped from RAM to disk per second.
Example:
...
"virtualmemory": {
"disk_in": 307828,
"disk_out": 4724267,
"pagefaults_major": 1210,
"pagefaults_minor": 14233474300,
"swap_in": 0,
"swap_out": 0
}
}
...