Metrics

Visent Telemetry collects metrics on GPU performance, system resources, and application usage. Every metric includes a timestamp, a node identifier, and GPU device information to support detailed analysis and alerting. Metrics are collected at configurable intervals and can be filtered, aggregated, and exported to external monitoring systems.
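To make the shape of a metric concrete, here is a minimal Python sketch of a sample that carries a timestamp, node identifier, and GPU device index, followed by a simple filter and per-node aggregation. The `MetricSample` class, its field names, and the helper functions are hypothetical and only illustrate the idea; they are not the Visent Telemetry schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MetricSample:
    """Hypothetical record layout; the real schema may differ."""
    timestamp: float  # Unix time in seconds
    node_id: str      # identifier of the host node
    gpu_index: int    # GPU device index on that node
    name: str         # metric name, e.g. "gpu_utilization"
    value: float


def filter_samples(samples, name, node_id=None):
    """Keep samples for one metric name, optionally restricted to one node."""
    return [
        s for s in samples
        if s.name == name and (node_id is None or s.node_id == node_id)
    ]


def mean_by_node(samples):
    """Aggregate sample values into a mean per node."""
    grouped = defaultdict(list)
    for s in samples:
        grouped[s.node_id].append(s.value)
    return {node: mean(values) for node, values in grouped.items()}


# Illustrative data only; these values are made up for the example.
samples = [
    MetricSample(1700000000.0, "node-a", 0, "gpu_utilization", 80.0),
    MetricSample(1700000000.0, "node-a", 1, "gpu_utilization", 60.0),
    MetricSample(1700000000.0, "node-b", 0, "gpu_utilization", 30.0),
    MetricSample(1700000000.0, "node-b", 0, "gpu_temperature", 55.0),
]

utilization = filter_samples(samples, "gpu_utilization")
averages = mean_by_node(utilization)
print(averages)
```

Keeping the node and device identifiers on every sample is what makes this kind of grouping possible without a separate lookup step.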

GPU Metrics

Coming soon - detailed list of GPU-specific metrics including utilization, memory, and performance counters.

Core GPU Metrics

| Metric | Description | Unit |
| --- | --- | --- |
| gpu_utilization | GPU compute utilization | Percentage |
| gpu_memory_used | GPU memory usage | Bytes |
| gpu_temperature | GPU temperature | Celsius |
| gpu_power_draw | GPU power consumption | Watts |
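When displaying these values, the unit of each metric matters: memory is reported in bytes, so it usually needs scaling before it is readable. The following Python sketch maps the four core metric names to their units and formats a raw value for display. The `METRIC_UNITS` mapping and the `format_metric` helper are hypothetical conveniences, not part of Visent Telemetry.

```python
# Units for the four core GPU metrics, taken from the table above.
METRIC_UNITS = {
    "gpu_utilization": "percent",
    "gpu_memory_used": "bytes",
    "gpu_temperature": "celsius",
    "gpu_power_draw": "watts",
}


def format_metric(name, value):
    """Render a raw metric value as a human-readable string."""
    unit = METRIC_UNITS[name]  # raises KeyError for unknown metric names
    if unit == "bytes":
        # Scale bytes to gibibytes (1 GiB = 1024**3 bytes).
        return f"{value / 1024**3:.2f} GiB"
    if unit == "percent":
        return f"{value:.1f}%"
    if unit == "celsius":
        return f"{value:.1f} °C"
    return f"{value:.1f} W"


print(format_metric("gpu_memory_used", 8 * 1024**3))
print(format_metric("gpu_utilization", 87.5))
```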

Memory Metrics

Coming soon - GPU memory usage, allocation, and fragmentation metrics.

Performance Metrics

Coming soon - throughput, latency, and efficiency measurements.

System Metrics

Coming soon - host system metrics including CPU, memory, network, and storage.

Process Metrics

Coming soon - per-process GPU usage and resource consumption.

Custom Metrics

Coming soon - how to define and collect application-specific metrics.

Metric Retention

Coming soon - data retention policies and storage optimization.

Next Steps