JAWS Performance Metrics
JAWS tracks performance metrics by analyzing HTCondor history logs collected across all compute sites. These logs are shipped via Filebeat to a centralized Elasticsearch backend, where they are indexed and made searchable. This document explains the key metrics shown on the JAWS Dashboard and how they can be used to assess compute and memory resource utilization.
Units
Cores: Logical compute cores (or threads).
Seconds: Real time that passes on the clock while the job runs.
Core-seconds: Total time all cores combined spent actively working on the job.
Example: A job running for 60 seconds using 2 cores = 120 core-seconds.
GB: Gigabytes of memory used.
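The core-seconds arithmetic above can be sketched as a one-line helper. The function name `core_seconds` is hypothetical and is not part of JAWS; it assumes every listed core is busy for the whole interval.

```python
def core_seconds(cores: int, seconds: float) -> float:
    """Core-seconds accumulated when `cores` cores are all busy for `seconds` seconds."""
    return cores * seconds


# The example from the Units list: 2 cores busy for 60 seconds.
print(core_seconds(2, 60))  # 120
```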
Metrics
RequestCores
Unit: Cores (Integer)
Definition: Logical cores (or threads) requested by the user.
CommittedSec
Unit: Seconds (Integer)
Definition: The total wall-clock time the job spent running successfully on a machine (i.e., it finished without failing).
Includes core time + idle time + I/O waits, etc.
Excludes time spent on failed attempts, retries, or time in the queue.
ActiveComputeSec
Unit: Core-seconds (Integer)
Definition: Total time, summed over all cores, spent actively computing for the job.
AvgComputeCores
Unit: Cores (Float)
Definition: An estimate of how many cores were in use on average during a job’s final, successful run.
Formula:
AvgComputeCores = ActiveComputeSec / CommittedSec
Example 1: Mixed Concurrency Over Time
A job used 1 core for 60 seconds, then 2 cores for 60 seconds.
ActiveComputeSec = (1 core × 60 s) + (2 cores × 60 s) = 180 core-seconds
CommittedSec = 120 seconds
AvgComputeCores = 180 / 120 = 1.5 cores
This is an accurate representation of concurrency.
Example 2: Consistent High Utilization
A job used 4 cores continuously for 90 seconds.
ActiveComputeSec = 4 × 90 = 360 core-seconds
CommittedSec = 90 seconds
AvgComputeCores = 360 / 90 = 4.0 cores
This is an accurate representation of concurrency.
Example 3: Burst Followed by Idle
A job used 4 cores for 10 seconds, then was idle for 590 seconds.
ActiveComputeSec = 4 × 10 = 40 core-seconds
CommittedSec = 600 seconds
AvgComputeCores = 40 / 600 ≈ 0.067 cores
Despite briefly using 4 cores concurrently, AvgComputeCores was very low due to the extended idle time.
This does not reflect peak concurrency.
Example 4: Declining Core Usage
A job used 4 cores for 30 seconds, then 1 core for 90 seconds.
ActiveComputeSec = (4 × 30) + (1 × 90) = 120 + 90 = 210 core-seconds
CommittedSec = 120 seconds
AvgComputeCores = 210 / 120 = 1.75 cores
Concurrency declined over time.
Example 5: Underutilization Despite High Request
A job requested 4 cores, but consistently used only 2 cores for 300 seconds.
ActiveComputeSec = 2 × 300 = 600 core-seconds
CommittedSec = 300 seconds
AvgComputeCores = 600 / 300 = 2.0 cores
Cores were underused, despite AvgComputeCores being > 1.
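The worked examples above can be reproduced with a short sketch. It assumes a job's run can be described as a list of (busy cores, duration in seconds) segments; that representation and the function names are hypothetical, chosen only for illustration, and JAWS itself derives these values from HTCondor history logs rather than from segments like these.

```python
from typing import List, Tuple

# Hypothetical representation: (cores actively computing, duration in seconds).
Segment = Tuple[float, float]


def active_compute_sec(segments: List[Segment]) -> float:
    """Sum of core-seconds over all segments."""
    return sum(cores * secs for cores, secs in segments)


def avg_compute_cores(segments: List[Segment]) -> float:
    """ActiveComputeSec divided by CommittedSec (total wall-clock seconds)."""
    committed_sec = sum(secs for _, secs in segments)
    if committed_sec <= 0:
        raise ValueError("total duration must be positive")
    return active_compute_sec(segments) / committed_sec


examples = {
    "Example 1": [(1, 60), (2, 60)],   # mixed concurrency
    "Example 3": [(4, 10), (0, 590)],  # burst followed by idle
    "Example 4": [(4, 30), (1, 90)],   # declining core usage
}

for name, segments in examples.items():
    print(name, round(avg_compute_cores(segments), 3))
```

Modeling idle time as a segment with zero busy cores keeps CommittedSec equal to the full wall-clock duration, which is why the burst in Example 3 is averaged down so sharply.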
ComputeUseFactor
Unit: Unitless
Definition: The fraction of requested cores that were actively used for computing (on average) during the job’s successful run.
Formula:
ComputeUseFactor = AvgComputeCores / RequestCores
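As a minimal sketch, the formula can be applied to the numbers from Example 5 (4 cores requested, 600 core-seconds of active compute over 300 seconds). The function name and argument names are hypothetical and only mirror the metric names used in this document.

```python
def compute_use_factor(active_compute_sec: float,
                       committed_sec: float,
                       request_cores: int) -> float:
    """AvgComputeCores / RequestCores, with AvgComputeCores derived inline."""
    if committed_sec <= 0 or request_cores <= 0:
        raise ValueError("committed_sec and request_cores must be positive")
    avg_compute_cores = active_compute_sec / committed_sec
    return avg_compute_cores / request_cores


# Example 5: 2 of the 4 requested cores were busy on average.
print(compute_use_factor(600, 300, 4))  # 0.5
```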
NonComputeSec
Unit: Seconds (Integer)
Definition: The portion of the job’s runtime not actively spent on computation, including time spent in I/O waits, sleeping, blocking, or other non-CPU-bound activities.
Formula:
NonComputeSec = CommittedSec − (ActiveComputeSec)
Note
This metric is not currently calculated but may be added in a future update.
A low ComputeUseFactor (i.e., low average core usage relative to RequestCores) does not necessarily imply low application code efficiency. However, if it is consistently low, it may indicate that:
The workload is I/O-bound or memory-bound.
The job is not parallelized efficiently.
Fewer cores should be requested (RequestCores may be over-provisioned).
PeakMemoryGB
Unit: GB (Float)
Definition: The peak memory used by the job during its successful run.
RequestMemoryGB
Unit: GB (Float)
Definition: The amount of memory requested by the job.
MemoryUseFactor
Unit: Unitless
Definition: The ratio of peak memory used to memory requested during the job’s successful run.
Formula:
MemoryUseFactor = PeakMemoryGB / RequestMemoryGB
A low MemoryUseFactor suggests memory over-allocation. In such cases, users are encouraged to reduce their memory requests.
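The ratio can be sketched as follows. The function name is hypothetical and the input values are made up for illustration; they are not taken from any real job.

```python
def memory_use_factor(peak_memory_gb: float, request_memory_gb: float) -> float:
    """PeakMemoryGB / RequestMemoryGB."""
    if request_memory_gb <= 0:
        raise ValueError("request_memory_gb must be positive")
    return peak_memory_gb / request_memory_gb


# Illustrative values: a job that peaked at 2 GB after requesting 8 GB.
print(memory_use_factor(2.0, 8.0))  # 0.25
```

A value this far below 1 would be a candidate for a smaller memory request on the next submission.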
HTCondor Attribute Mapping (Optional)
Note
This section is intended for users interested in understanding which HTCondor attributes JAWS metrics are based on. Most users can safely ignore this.
| JAWS Metric | HTCondor ClassAd Field |
|---|---|
| | |
| | |
| | |
| PeakMemoryGB | MemoryUsage (converted to GB) |
| RequestMemoryGB | RequestMemory (converted to GB) |
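The "converted to GB" step for the two memory attributes might look like the sketch below. HTCondor's documentation describes MemoryUsage and RequestMemory in megabytes; the divisor of 1024, the function name, and the sample values are assumptions, since this document does not state which factor JAWS uses.

```python
# Assumption: 1024 MB per GB. The factor JAWS actually applies is not stated here.
MB_PER_GB = 1024


def memory_metrics_from_classad(ad: dict) -> dict:
    """Map the two HTCondor memory attributes to their JAWS metric names."""
    return {
        "PeakMemoryGB": ad["MemoryUsage"] / MB_PER_GB,
        "RequestMemoryGB": ad["RequestMemory"] / MB_PER_GB,
    }


# Made-up ClassAd fragment for illustration only.
sample_ad = {"MemoryUsage": 2048, "RequestMemory": 8192}
print(memory_metrics_from_classad(sample_ad))
```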