Key hardware specifications of the server executing this step.
Network discovery indicates the following cloud environment was utilized for this step.
Average and peak CPU and memory usage for the current run and the last five successful runs.
Based on recent average CPU usage, historical peak memory usage, and observed GPU usage.
CPU usage for both the server and specific tasks is calculated by summing the user, nice, and system CPU times (in clock ticks), then dividing by the total elapsed time multiplied by the number of clock ticks per second. Task CPU usage includes all child processes.
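The calculation described above can be sketched as follows. This is a minimal illustration, not the monitoring tool's actual code: it assumes the tick counters come from the aggregate `cpu` line of `/proc/stat` on Linux, and the function names `parse_cpu_ticks` and `cpu_usage` are hypothetical.

```python
def parse_cpu_ticks(stat_text):
    """Return user + nice + system ticks from the aggregate 'cpu' line.

    Assumes the /proc/stat layout: 'cpu user nice system idle ...'.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            user, nice, system = (int(value) for value in fields[1:4])
            return user + nice + system
    raise ValueError("no aggregate 'cpu' line found")


def cpu_usage(stat_before, stat_after, elapsed_seconds, ticks_per_second):
    """Busy ticks between two snapshots, normalized to CPU cores in use."""
    busy_ticks = parse_cpu_ticks(stat_after) - parse_cpu_ticks(stat_before)
    return busy_ticks / (elapsed_seconds * ticks_per_second)
```

On a real system, `ticks_per_second` would typically be read with `os.sysconf("SC_CLK_TCK")`. The result is expressed in cores, so a value of 1.0 means one core was fully busy over the sampling interval.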
Server application memory usage is monitored through the size of anonymous memory pages, while task memory usage is measured by summing all Proportional Set Size (PSS) rollups of subprocesses.
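As a rough sketch of the task-level measurement, the PSS values could be summed as shown below. It assumes each process's rollup is read from `/proc/<pid>/smaps_rollup` on Linux, where the `Pss:` line is reported in kB. The helper names `parse_pss_kb` and `task_memory_kb` are hypothetical, and discovering which processes belong to a task is left out.

```python
def parse_pss_kb(rollup_text):
    """Extract the 'Pss:' value (in kB) from one smaps_rollup dump.

    Only the exact 'Pss:' line is matched, so breakdown lines such as
    'Pss_Anon:' are ignored.
    """
    for line in rollup_text.splitlines():
        if line.startswith("Pss:"):
            return int(line.split()[1])
    raise ValueError("no 'Pss:' line found")


def task_memory_kb(rollup_texts):
    """Sum the PSS of every process belonging to a task."""
    return sum(parse_pss_kb(text) for text in rollup_texts)
```

PSS splits shared pages evenly among the processes that map them, which is why summing it across subprocesses does not double-count shared memory.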
Task-specific disk usage tracking is unreliable; therefore, it is recommended to monitor disk usage at the server level, encompassing all mounted disks.
Server-level disk space usage on all mounted disks.
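One possible way to aggregate disk space over several mount points is sketched below. It is an illustration under assumptions, not the tool's implementation: the caller supplies the list of mount points, and paths on the same device are counted once. The function name `disk_space` is hypothetical.

```python
import os
import shutil


def disk_space(mount_points):
    """Sum total, used, and free bytes over distinct mounted devices."""
    seen_devices = set()
    total = used = free = 0
    for path in mount_points:
        device = os.stat(path).st_dev
        if device in seen_devices:
            # Same filesystem reachable via another path: skip it.
            continue
        seen_devices.add(device)
        usage = shutil.disk_usage(path)
        total += usage.total
        used += usage.used
        free += usage.free
    return {"total": total, "used": used, "free": free}
```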
Network usage is monitored solely at the server level across all interfaces.
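Server-level network counters can be summed across interfaces roughly as follows. The sketch assumes the Linux `/proc/net/dev` layout (two header lines, then one line per interface with received bytes in the first column and transmitted bytes in the ninth) and includes the loopback interface. The function name `network_bytes` is hypothetical.

```python
def network_bytes(net_dev_text):
    """Return (received_bytes, transmitted_bytes) summed over all interfaces."""
    received = transmitted = 0
    # The first two lines of /proc/net/dev are column headers.
    for line in net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        _name, counters = line.split(":", 1)
        fields = counters.split()
        received += int(fields[0])      # Receive: bytes
        transmitted += int(fields[8])   # Transmit: bytes
    return received, transmitted
```

These counters are cumulative since boot, so a usage rate would be the difference between two snapshots divided by the elapsed time.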
nvidia-smi-reported utilization ratios, scaled to a value between 0 and the GPU count, approximating how many GPUs are fully (100%) utilized. Note that task-specific GPU usage is less reliable than server-level GPU usage and is limited to at most 4 GPUs.
nvidia-smi-reported number of GPUs with a utilization greater than 0. Note that task-specific GPU usage is less reliable than server-level GPU usage and is limited to at most 4 GPUs.
nvidia-smi-reported VRAM usage, summed over all GPUs. Note that task-specific GPU usage is less reliable than server-level GPU usage and is limited to at most 4 GPUs.
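The three server-level GPU metrics could be derived from nvidia-smi output along these lines. The sketch assumes the text was produced by `nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader,nounits`, which prints one line per GPU with utilization in percent and memory in MiB. The function name `gpu_metrics` is hypothetical, and non-numeric placeholder values are not handled.

```python
def gpu_metrics(csv_text):
    """Return (utilization_sum, gpus_in_use, vram_mib) for all GPUs.

    utilization_sum ranges from 0 to the GPU count; for example, 1.5
    corresponds to one and a half fully utilized GPUs.
    """
    utilization_percent = []
    vram_mib = 0
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        utilization, memory_used = (field.strip() for field in line.split(","))
        utilization_percent.append(int(utilization))
        vram_mib += int(memory_used)
    utilization_sum = sum(utilization_percent) / 100
    gpus_in_use = sum(1 for value in utilization_percent if value > 0)
    return utilization_sum, gpus_in_use, vram_mib
```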