6.2.1.11. health

6.2.1.11.1. Introduction

The health command of the show subgroup displays the statistics and health information of the GCap.


6.2.1.11.2. Prerequisites

  • Users: setup, gviewadm

  • Dependencies: N/A


6.2.1.11.3. Command

show health


6.2.1.11.4. Example

  • Enter the following command.

(gcap-cli) show health

  • Validate.

The system displays the following information:

  • block counters - Mass storage statistics

  • cpu_stats counters - Processor statistics

  • disks counters - Mount point occupancy statistics

  • emergency counters - GCap emergency mode information

  • gcenter counters - Paired GCenter information

  • high_availability counters - High Availability (HA) information

  • interfaces counters - Statistics on network interfaces

  • loadavg counters - Statistics on the average load of the GCap

  • meminfo counters - Statistics on the RAM

  • numastat counters - Non Uniform Memory Access (NUMA) node statistics

  • softnet counters - Statistics on received packets according to processor cores

  • suricata counters - Sigflow (monitoring-engine) information

  • systemd counters - System initialisation information

  • uptime counters - Uptime

  • virtualmemory counters - Swap space information (swap)
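
The section names listed above can be checked programmatically. The following is a minimal sketch, assuming the output of show health has been captured as a JSON string. The capture method, the helper name missing_sections, and the truncated sample are illustrative assumptions, not part of gcap-cli.

```python
import json

# Top-level sections documented for `show health`
# (key names as they appear in the JSON examples of this section).
EXPECTED_SECTIONS = {
    "block", "cpu_stats", "disks", "emergency", "gcenter",
    "high_availability", "interfaces", "loadavg", "meminfo",
    "numastat", "softnet", "suricata", "systemd", "uptime",
    "virtualmemory",
}


def missing_sections(raw_output: str) -> list:
    """Return the documented sections absent from a captured JSON output."""
    health = json.loads(raw_output)
    return sorted(EXPECTED_SECTIONS - set(health))


# Hypothetical, heavily truncated capture used only for illustration.
sample = '{"emergency": {"emergency_active": false}, "uptime": {"up_seconds": 874179.8}}'
print(missing_sections(sample))
```

An empty list means every documented section is present in the captured output.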


6.2.1.11.4.1. block counters details - Mass storage statistics

  • sdN - Statistics for disk sdN, where N is a letter of the alphabet

    • read_bytes - Bytes read since startup

    • written_bytes - Bytes written since startup

    Example:

{
    "block": {
        "sda": {
            "read_bytes": 302867968,
            "written_bytes": 4837645312
        },
        "sdb": {
            "read_bytes": 3894272,
            "written_bytes": 4096
        }
    },
...

6.2.1.11.4.2. cpu_stats counter details - CPU statistics

  • cpus - CPU usage statistics

    • cpu - Overall core usage statistics

    • cpuX - CPU X core statistics

      • idle - Elapsed time doing nothing in milliseconds

      • iowait - Elapsed time waiting for disk operations in milliseconds

      • irq - Elapsed time on hardware IRQs in milliseconds

      • nice - Elapsed time in user space on low-priority processes in milliseconds

      • softirq - Elapsed time on software IRQs in milliseconds

      • system - Elapsed time in kernel space in milliseconds

      • user - Elapsed time in user space in milliseconds

    • interrupts - Number of interrupts since startup

    • processes_blocked - Number of blocked or dead processes

    • processes_running - Number of running processes

    Example:

    ...
    "cpu_stats": {
        "cpus": {
            "cpu": {
                "idle": 961816208,
                "iowait": 11419,
                "irq": 0,
                "nice": 0,
                "softirq": 397899,
                "system": 21788203,
                "user": 50806194
            },
            "cpu0": {
                "idle": 79960857,
                "iowait": 985,
                "irq": 0,
                "nice": 0,
                "softirq": 234748,
                "system": 1795880,
                "user": 4357374
            },
            "cpu1": {
                "idle": 80166571,
                "iowait": 951,
                "irq": 0,
                "nice": 0,
                "softirq": 88078,
                "system": 1830370,
                "user": 4138182
            }
        },
        "interrupts": 12942835029,
        "processes_blocked": 0,
        "processes_running": 1
    },
    ...
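
Because the cpu_stats time counters are cumulative, a usage percentage has to be derived from them. The sketch below assumes the counters of one entry together cover all the time accounted for that core. It computes the share of time not spent in idle or iowait. The function name busy_percent is illustrative.

```python
def busy_percent(counters: dict) -> float:
    """Share of time since startup not spent in idle or iowait, in percent."""
    total = sum(counters.values())
    if total == 0:
        return 0.0
    waiting = counters["idle"] + counters["iowait"]
    return 100.0 * (total - waiting) / total


# Aggregate "cpu" entry copied from the example above.
cpu = {
    "idle": 961816208, "iowait": 11419, "irq": 0, "nice": 0,
    "softirq": 397899, "system": 21788203, "user": 50806194,
}
print(f"{busy_percent(cpu):.2f}%")
```

This gives an average since startup (roughly 7% for the values above). To estimate current usage, the same formula would be applied to the difference between two successive snapshots.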

6.2.1.11.4.3. disks counters details - Mount point occupancy statistics

  • /mountpoint/path - Mount point path

    • block_free - Number of free blocks

    • block_total - Total number of blocks

    • inode_free - Number of remaining inodes

    • inode_total - Total number of inodes

    Example:

    ...
    "disks": {
        "/": {
            "block_free": 247909,
            "block_total": 249830,
            "inode_free": 64258,
            "inode_total": 65536
        },
        "/data": {
            "block_free": 7150076,
            "block_total": 7161801,
            "inode_free": 1827417,
            "inode_total": 1827840
        }
    },
    ...
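
The free and total counters can be combined into an occupancy percentage per mount point. This is a minimal sketch using the values from the example. The helper name usage_percent is an assumption made for illustration.

```python
def usage_percent(free: int, total: int) -> float:
    """Percentage of a resource (blocks or inodes) currently in use."""
    if total == 0:
        return 0.0
    return 100.0 * (total - free) / total


# Values copied from the example above.
disks = {
    "/": {"block_free": 247909, "block_total": 249830,
          "inode_free": 64258, "inode_total": 65536},
    "/data": {"block_free": 7150076, "block_total": 7161801,
              "inode_free": 1827417, "inode_total": 1827840},
}

for mount, stats in disks.items():
    blocks = usage_percent(stats["block_free"], stats["block_total"])
    inodes = usage_percent(stats["inode_free"], stats["inode_total"])
    print(f"{mount}: blocks {blocks:.2f}% used, inodes {inodes:.2f}% used")
```

Checking inodes as well as blocks matters because a mount point can run out of inodes while free blocks remain.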

6.2.1.11.4.4. emergency counters details - GCap emergency mode information

  • emergency_active - Active or inactive status of the emergency mode

Example:

    ...
    "emergency": {
        "emergency_active": false
    },
    ...

6.2.1.11.4.5. gcenter counters details - Paired GCenter information

  • chronyc_sync - Status of the NTP synchronisation with the GCenter

  • reachable - Whether the GCenter is reachable (true) or not (false)

Example:

    ...
    "gcenter": {
        "chronyc_sync": false,
        "reachable": false
    },
    ...

6.2.1.11.4.6. high_availability counters details - High Availability (HA) information

  • healthy - HA health status

  • last_status - Last known HA status

  • last_transition - Date of last known HA status change in ISO8601 format

  • leader - True for a GCap leader, false for a GCap follower

  • status - Whether HA is active (true) or inactive (false)

Example:

...
    "high_availability": {
        "healthy": false,
        "last_status": -1,
        "last_transition": "0001-01-01T00:00:00Z",
        "leader": false,
        "status": false
    },
...

6.2.1.11.4.7. interfaces counter details - Statistics on network interfaces

  • bond0 - Name of the network interface

    • rx_bytes - Number of bytes received

    • rx_drops - Number of bytes lost in reception

    • rx_errs - Number of invalid bytes received

    • rx_packets - Total number of packets received from this interface

    • tx_bytes - Number of bytes sent

    • tx_drops - Number of bytes lost while sending

    • tx_errs - Number of invalid bytes sent

    • tx_packets - Total number of packets sent from this interface

Example:

...
    "interfaces": {
        "bond0": {
            "rx_bytes": 0,
            "rx_drops": 0,
            "rx_errs": 0,
            "rx_packets": 0,
            "tx_bytes": 0,
            "tx_drops": 0,
            "tx_errs": 0,
            "tx_packets": 0
        },
        "gcp0": {
            "rx_bytes": 138433006,
            "rx_drops": 82901,
            "rx_errs": 0,
            "rx_packets": 2143236,
            "tx_bytes": 796294,
            "tx_drops": 0,
            "tx_errs": 0,
            "tx_packets": 3635
        },
        "gcp1": {
            "rx_bytes": 137642525,
            "rx_drops": 82902,
            "rx_errs": 0,
            "rx_packets": 2135060,
            "tx_bytes": 0,
            "tx_drops": 0,
            "tx_errs": 0,
            "tx_packets": 0
        }
    },
...
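
A common use of these counters is to spot interfaces that report drops or errors. The sketch below uses the key names from the example output, on a sample trimmed to the relevant counters. The helper name interfaces_with_problems is illustrative.

```python
# Counters that indicate a problem when they are non-zero.
PROBLEM_KEYS = ("rx_drops", "rx_errs", "tx_drops", "tx_errs")


def interfaces_with_problems(interfaces: dict) -> dict:
    """Map each interface to its non-zero drop and error counters."""
    report = {}
    for name, stats in interfaces.items():
        nonzero = {key: stats[key] for key in PROBLEM_KEYS if stats.get(key, 0) > 0}
        if nonzero:
            report[name] = nonzero
    return report


# Trimmed copy of the example above (drop and error counters only).
interfaces = {
    "bond0": {"rx_drops": 0, "rx_errs": 0, "tx_drops": 0, "tx_errs": 0},
    "gcp0": {"rx_drops": 82901, "rx_errs": 0, "tx_drops": 0, "tx_errs": 0},
}
print(interfaces_with_problems(interfaces))
```

Since the counters are cumulative, a non-zero value does not by itself mean that drops are still occurring. Comparing two snapshots shows whether the counter is growing.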

6.2.1.11.4.8. loadavg counter details - Statistics on the average load of the GCap

  • active_processes - Number of processes started

  • load_average_15_mins - Average load over the last fifteen minutes

  • load_average_1_min - Average load over the last minute

  • load_average_5_mins - Average load over the last five minutes

  • running_processes - Number of running processes

Example:

...
    "loadavg": {
        "active_processes": 561,
        "load_average_15_mins": 0.99,
        "load_average_1_min": 0.67,
        "load_average_5_mins": 1,
        "running_processes": 2
    },
...

6.2.1.11.4.9. meminfo counter details - Statistics on the RAM

  • available - Estimated memory available for new allocations without swapping, in kilobytes

  • buffers - Memory used by disk operations in kilobytes

  • cached - Memory used by the cache in kilobytes

  • dirty - Memory used by pending write operations in kilobytes

  • free - Unused memory in kilobytes

  • hugepages_anonymous - Number of anonymous transparent huge pages used

  • hugepages_free - Number of available transparent huge pages

  • hugepages_reserved - Number of reserved transparent huge pages

  • hugepages_shmem - Number of shared transparent huge pages

  • hugepages_surplus - Number of extra transparent huge pages

  • hugepages_total - Total number of huge pages

  • kernel_stack - Memory used by kernel stack allocations in kilobytes

  • page_tables - Memory used for page management in kilobytes

  • s_reclaimable - Cache memory that can be reallocated in case of memory shortage in kilobytes

  • shmem - Memory used by shared pages in kilobytes

  • slab - Memory used by kernel data structures in kilobytes

  • swap_cached - Memory used by the swap cache in kilobytes

  • swap_free - Available memory in swap in kilobytes

  • swap_total - Total swap memory in kilobytes

  • total - Total memory in kilobytes

  • v_malloc_used - Memory used by large memory areas allocated by the kernel

For more information, refer to the Linux meminfo documentation.

Example:

...
    "meminfo": {
        "available": 13608896,
        "buffers": 380932,
        "cached": 1155824,
        "dirty": 28,
        "free": 13128080,
        "hugepages_anonymous": 423936,
        "hugepages_free": 0,
        "hugepages_reserved": 0,
        "hugepages_shmem": 0,
        "hugepages_surplus": 0,
        "hugepages_total": 0,
        "kernel_stack": 9152,
        "page_tables": 8400,
        "s_reclaimable": 43168,
        "shmem": 794564,
        "slab": 210008,
        "swap_cached": 0,
        "swap_free": 16777212,
        "swap_total": 16777212,
        "total": 15977468,
        "v_malloc_used": 66592
    },
...
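
The memory counters can be turned into usage percentages. This is a sketch under the assumption that available mirrors the Linux MemAvailable estimate and total mirrors MemTotal. The function names are illustrative.

```python
def memory_used_percent(meminfo: dict) -> float:
    """Percentage of RAM that is not available for new allocations."""
    total = meminfo["total"]
    if total == 0:
        return 0.0
    return 100.0 * (total - meminfo["available"]) / total


def swap_used_percent(meminfo: dict) -> float:
    """Percentage of swap space currently in use."""
    total = meminfo["swap_total"]
    if total == 0:
        return 0.0
    return 100.0 * (total - meminfo["swap_free"]) / total


# Values copied from the example above (kilobytes).
meminfo = {
    "available": 13608896, "total": 15977468,
    "swap_free": 16777212, "swap_total": 16777212,
}
print(f"RAM used: {memory_used_percent(meminfo):.1f}%")
print(f"Swap used: {swap_used_percent(meminfo):.1f}%")
```

Using available rather than free avoids counting reclaimable cache as used memory.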

6.2.1.11.4.10. numastat counter details - Non Uniform Memory Access (NUMA) node statistics

  • nodes - List of NUMA nodes

    • nodeX - NUMA X node statistics

      • interleave_hit - Interleaved memory successfully allocated in this node

      • local_node - Memory allocated in this node while a process was running on it

      • numa_foreign - Memory planned for this node, but currently allocated in a different node

      • numa_hit - Memory successfully allocated in this node as expected

      • numa_miss - Memory allocated in this node despite process preferences. Each numa_miss has a numa_foreign in another node

      • other_node - Memory allocated in this node while a process was running in another node

Example:

...
    "numastat": {
        "nodes": {
            "node0": {
                "interleave_hit": 3871,
                "local_node": 4410557829,
                "numa_foreign": 0,
                "numa_hit": 4410454203,
                "numa_miss": 0,
                "other_node": 14170
            },
            "node1": {
                "interleave_hit": 3869,
                "local_node": 4224990850,
                "numa_foreign": 0,
                "numa_hit": 4224964539,
                "numa_miss": 0,
                "other_node": 21531
            }
        }
    },
...
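
One way to summarise these counters is a per-node hit ratio, that is, the share of allocations on a node that landed there as intended. The sketch below derives it from numa_hit and numa_miss. The helper name numa_hit_ratio and the choice to return 1.0 for a node with no allocations are assumptions.

```python
def numa_hit_ratio(node: dict) -> float:
    """Fraction of allocations on this node that were intended for it."""
    attempts = node["numa_hit"] + node["numa_miss"]
    if attempts == 0:
        return 1.0  # no allocations recorded, nothing was missed
    return node["numa_hit"] / attempts


# Trimmed copy of the example above.
nodes = {
    "node0": {"numa_hit": 4410454203, "numa_miss": 0},
    "node1": {"numa_hit": 4224964539, "numa_miss": 0},
}
for name, stats in nodes.items():
    print(f"{name}: {numa_hit_ratio(stats):.4f}")
```

A ratio noticeably below 1 would suggest that processes often obtain memory from a node other than the preferred one.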

6.2.1.11.4.11. softnet counter details - Statistics on received packets according to processor cores

  • cpus - Usage statistics per CPU

    • cpuX - CPU X core statistics

      • backlog_len - Length of the packet backlog of the core

      • dropped - Number of packets dropped

      • flow_limit_count - Number of times the throughput limit was reached

      • processed - Number of packets processed

      • received_rps - Number of times the CPU was woken up

      • time_squeeze - Number of times the thread could not process all the packets in its backlog within the budget

    • summed - Overall core usage statistics

      • backlog_len - Length of the packet backlog, summed over all cores

      • dropped - Number of packets dropped

      • flow_limit_count - Number of times the throughput limit was reached

      • processed - Number of packets processed

      • received_rps - Number of times the CPU was woken up

      • time_squeeze - Number of times the thread could not process all the packets in its backlog within the budget

Example:

...
    "softnet": {
        "cpus": {
            "cpu0": {
                "backlog_len": 0,
                "dropped": 0,
                "flow_limit_count": 0,
                "processed": 448550,
                "received_rps": 0,
                "time_squeeze": 2
            },
            "cpu1": {
                "backlog_len": 0,
                "dropped": 0,
                "flow_limit_count": 0,
                "processed": 36250,
                "received_rps": 0,
                "time_squeeze": 0
            }
        },
        "summed": {
            "backlog_len": 0,
            "dropped": 0,
            "flow_limit_count": 0,
            "processed": 5239450,
            "received_rps": 0,
            "time_squeeze": 27
        }
    },
...
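
The dropped and time_squeeze counters are the ones most likely to indicate that a core cannot keep up with incoming packets. The following sketch lists the cores where either counter is non-zero. The helper name cores_under_pressure is illustrative, and the sample is trimmed from the example above.

```python
def cores_under_pressure(softnet: dict) -> list:
    """Names of the cores that dropped packets or ran out of budget."""
    return sorted(
        name
        for name, stats in softnet["cpus"].items()
        if stats["dropped"] > 0 or stats["time_squeeze"] > 0
    )


# Trimmed copy of the example above.
softnet = {
    "cpus": {
        "cpu0": {"dropped": 0, "processed": 448550, "time_squeeze": 2},
        "cpu1": {"dropped": 0, "processed": 36250, "time_squeeze": 0},
    }
}
print(cores_under_pressure(softnet))
```

A few time_squeeze events over a long uptime are usually harmless. A steadily growing value between two snapshots is the more meaningful signal.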

6.2.1.11.4.12. suricata counter details - Sigflow (monitoring-engine) information

  • detailed_status - Sigflow container status

  • up - Status of Sigflow and the detection engine

The combination of detailed_status and up is interpreted as follows:

  detailed_status     up      Meaning
  ------------------  ------  ---------------------------------------------------------------
  "Container down"    false   Engine off
  "Container down"    true    Impossible state: the engine cannot run in a stopped container
  "Container UP"      false   Unstable state: contact GATEWATCHER support
  "Container UP"      true    Engine on

Example:

...
    "suricata": {
        "detailed_status": "Container down",
        "up": false
    },
...
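
The status combinations described in this section can be encoded as a lookup table. This is a minimal sketch. The function name sigflow_state and the fallback text for undocumented combinations are assumptions.

```python
# (detailed_status, up) -> meaning, as documented in this section.
MEANINGS = {
    ("Container down", False): "engine off",
    ("Container down", True): "impossible state",
    ("Container UP", False): "unstable state: contact support",
    ("Container UP", True): "engine on",
}


def sigflow_state(suricata: dict) -> str:
    """Interpret the suricata section of the health output."""
    key = (suricata["detailed_status"], suricata["up"])
    # Any value not listed in the documentation is reported as unknown.
    return MEANINGS.get(key, "unknown combination")


# Values copied from the example above.
print(sigflow_state({"detailed_status": "Container down", "up": False}))
```

The comparison is case-sensitive, so the strings must match the output exactly ("Container UP" with upper-case UP).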

6.2.1.11.4.13. systemd counter details - System initialisation information

  • failed_services - List of failed services reported by systemctl --failed.

Example:

...
    "systemd": {
        "failed_services": [ "netdata.service" ]
    },
...

6.2.1.11.4.14. uptime counter details - Uptime

  • up_seconds - Number of seconds since start-up.

Example:

...
    "uptime": {
        "up_seconds": 874179.8
    },
...
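
A raw number of seconds is hard to read at a glance. The sketch below converts up_seconds into days, hours, minutes and seconds. The function name format_uptime is illustrative, and fractional seconds are truncated.

```python
def format_uptime(up_seconds: float) -> str:
    """Render a duration in seconds as days, hours, minutes and seconds."""
    days, rest = divmod(int(up_seconds), 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m {seconds}s"


# Value copied from the example above.
print(format_uptime(874179.8))
```

For the example value, this corresponds to a little over ten days of uptime.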

6.2.1.11.4.15. virtualmemory counter details - Swap space information (swap)

  • disk_in - Number of pages saved to disk since start-up.

  • disk_out - Number of pages out of disk since start-up.

  • pagefaults_major - Number of page faults per second that required loading a memory page from disk to RAM.

  • pagefaults_minor - Number of page faults per second resolved without disk access.

  • swap_in - Number of kilobytes the system swapped from disk to RAM per second.

  • swap_out - Number of kilobytes the system swapped from RAM to disk per second.

Example:

...
    "virtualmemory": {
        "disk_in": 307828,
        "disk_out": 4724267,
        "pagefaults_major": 1210,
        "pagefaults_minor": 14233474300,
        "swap_in": 0,
        "swap_out": 0
    }
}
...