GPU Usage Guide for WDL Workflows
This guide shows WDL users how to configure GPU resources in JAWS workflows using the runtime stanza.
Quick Start: Test GPU Access
Use this minimal WDL to verify GPU configuration before running production workflows:
version 1.0
workflow GPU_Quick_Test {
call test_gpu
}
task test_gpu {
command <<<
nvidia-smi || echo "nvidia-smi not found"
python3 -c "import torch; print('CUDA:', torch.cuda.is_available())"
>>>
output {
File log = stdout()
}
runtime {
docker: "pytorch/pytorch:latest"
memory: "4GiB"
cpu: 1
gpu: true
runtime_minutes: 10
}
}
Expected output: nvidia-smi prints the GPU details, and the Python check prints CUDA: True.
If this fails, see the Troubleshooting section below.
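If you want to check the quick-test result automatically, the log file can be scanned for the two strings the task's command prints. The following is a minimal sketch, not part of JAWS; gpu_test_passed is a hypothetical helper name, and it assumes you have already read the contents of the task's stdout log into a string.

```python
def gpu_test_passed(log_text: str) -> bool:
    """Return True if the quick-test log shows both nvidia-smi and CUDA working."""
    # The task's command echoes this message when nvidia-smi is missing.
    nvidia_smi_ok = "nvidia-smi not found" not in log_text
    # The Python one-liner prints "CUDA: True" when PyTorch can see a GPU.
    cuda_ok = "CUDA: True" in log_text
    return nvidia_smi_ok and cuda_ok


# Hypothetical log content, for illustration only.
sample_log = "CUDA: True\n"
print(gpu_test_passed(sample_log))
```

If the PyTorch import itself fails, no CUDA line reaches stdout, so the check also returns False in that case.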
Runtime Stanza: GPU Attributes
GPU Configuration Attributes
The runtime stanza supports two GPU-specific attributes:
runtime {
gpu: true # REQUIRED to enable GPU allocation
gpuCount: 1 # OPTIONAL, defaults to 1 when gpu is true
}
Attribute Details
| Attribute | Type | Default | Description |
|---|---|---|---|
| gpu | Boolean | false | Set to true to enable GPU allocation. |
| gpuCount | Int | 1 | Number of GPUs to allocate. Only applies when gpu is true. |
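The way the two attributes interact can be modelled in a few lines of Python. This is only a sketch of the semantics described in this guide (gpuCount defaults to 1 and is ignored unless gpu is true); effective_gpu_count is a hypothetical name, and the code is not how JAWS itself resolves the request.

```python
from typing import Optional


def effective_gpu_count(gpu: bool = False, gpu_count: Optional[int] = None) -> int:
    """Model how many GPUs a runtime stanza requests, per this guide."""
    if not gpu:
        # gpuCount only applies when gpu is true.
        return 0
    if gpu_count is None:
        # Omitting gpuCount is the same as requesting one GPU.
        return 1
    return gpu_count


print(effective_gpu_count(gpu=True))               # gpu: true, gpuCount omitted
print(effective_gpu_count(gpu=True, gpu_count=4))  # gpu: true, gpuCount: 4
```

The sketch does not validate unusual combinations such as gpu: true with gpuCount: 0, because the guide does not say how those are handled.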
Runtime Stanza Examples
Minimal GPU Runtime (Recommended Starting Point)
runtime {
docker: "pytorch/pytorch:latest"
memory: "16GiB"
cpu: 4
gpu: true # Enable GPU
runtime_minutes: 60
}
This requests 1 GPU (default when gpu: true).
Explicit Single GPU
runtime {
docker: "pytorch/pytorch:latest"
memory: "16GiB"
cpu: 4
gpu: true
gpuCount: 1 # Explicit, same as omitting gpuCount
runtime_minutes: 60
}
Multiple GPUs (Only if your code supports multi-GPU)
runtime {
docker: "nvcr.io/nvidia/pytorch:24.01-py3"
memory: "64GiB"
cpu: 16
gpu: true
gpuCount: 4 # Request 4 GPUs
runtime_minutes: 240
}
⚠️ Warning: Requesting gpuCount > 1 does NOT automatically parallelize your code. Your application must explicitly use multi-GPU frameworks (e.g., PyTorch DDP, Horovod).
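To illustrate what "explicitly use" means, one simple pattern short of a full distributed-training framework is to start one worker process per device. The sketch below only builds the environment for each worker. It assumes the allocated GPUs appear inside the task as device indices 0 through gpuCount - 1, which this guide does not confirm, and per_gpu_environments is a hypothetical helper name. CUDA_VISIBLE_DEVICES is the standard CUDA environment variable that restricts which devices a process can see.

```python
import os
from typing import Dict, List


def per_gpu_environments(gpu_count: int) -> List[Dict[str, str]]:
    """Build one environment per worker, each pinned to a single GPU index."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    environments = []
    for index in range(gpu_count):
        env = dict(os.environ)  # copy, so the parent environment is untouched
        env["CUDA_VISIBLE_DEVICES"] = str(index)  # assumption: indices start at 0
        environments.append(env)
    return environments


for env in per_gpu_environments(4):
    print(env["CUDA_VISIBLE_DEVICES"])
```

Each environment could then be passed to subprocess.Popen(..., env=env) to launch a worker. For training jobs, a framework such as PyTorch DDP is usually the better fit.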
Dynamic GPU Count from Inputs
You can parameterize gpuCount using WDL inputs:
task flexible_gpu {
input {
Int num_gpus = 1
Boolean use_gpu = true
}
command <<<
python3 train.py --gpus ~{num_gpus}
>>>
runtime {
docker: "pytorch/pytorch:latest"
memory: "32GiB"
cpu: 8
gpu: use_gpu
gpuCount: if use_gpu then num_gpus else 0
runtime_minutes: 120
}
}
In your inputs.json (replace workflow with the name of your workflow):
{
"workflow.flexible_gpu.num_gpus": 2,
"workflow.flexible_gpu.use_gpu": true
}
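When the GPU count varies between submissions, the inputs file can be generated rather than edited by hand. The sketch below assumes the key layout shown above; build_inputs and its workflow_name parameter are hypothetical, and the check on num_gpus is an added safeguard rather than a JAWS rule.

```python
import json


def build_inputs(num_gpus: int, use_gpu: bool = True, workflow_name: str = "workflow") -> str:
    """Return inputs.json content for the flexible_gpu task as a JSON string."""
    if use_gpu and num_gpus < 1:
        raise ValueError("num_gpus must be at least 1 when use_gpu is true")
    prefix = f"{workflow_name}.flexible_gpu"
    inputs = {
        f"{prefix}.num_gpus": num_gpus,
        f"{prefix}.use_gpu": use_gpu,
    }
    return json.dumps(inputs, indent=2)


print(build_inputs(2))
```

Writing the returned string to inputs.json gives a file equivalent to the one shown above.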
Critical: Docker Container Requirements
GPU Support Depends on Your Container
The most common GPU failure is using a CPU-only container.
Setting gpu: true in the runtime stanza does NOT add GPU support to your container. Your container must already include:
NVIDIA CUDA drivers/runtime
GPU-accelerated libraries (PyTorch, TensorFlow, etc.)
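A quick way to inspect a candidate image is to run a small script inside it and see which GPU-related pieces are present. The following is a rough sketch using only the Python standard library; container_gpu_report is a hypothetical name, and PyTorch stands in here for whichever GPU library your workflow uses.

```python
import importlib.util
import shutil
from typing import Dict


def container_gpu_report() -> Dict[str, bool]:
    """Report which GPU-related components are present in this environment."""
    return {
        # True if an nvidia-smi executable is found on PATH.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # True if the torch package is installed (it is not imported here).
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


print(container_gpu_report())
```

Both entries being True does not prove that CUDA works. Only a runtime check such as torch.cuda.is_available() on a GPU node confirms that.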
Complete Runtime Stanza Reference
All Runtime Attributes with GPU
runtime {
# Container (REQUIRED, must be GPU-enabled for GPU tasks)
docker: "pytorch/pytorch:latest"
# Compute Resources
memory: "32GiB" # RAM allocation
cpu: 8 # CPU threads (useful for data loading)
# GPU Resources
gpu: true # Enable GPU (REQUIRED for GPU access)
gpuCount: 1 # Number of GPUs (default: 1)
# Time Limit
runtime_minutes: 120 # Maximum runtime
}
Available GPU Hardware
JAWS provides GPU access at these sites:
| Site | GPU Model | Nodes | GPUs/Node | Memory/GPU |
|---|---|---|---|---|
| Perlmutter (NERSC) | NVIDIA A100 | 1536 | 4 | 40GB |
| Tahoma (EMSL) | NVIDIA Tesla V100 | 24 | 2 | 32GB |
Site selection: specify the site as the last argument when submitting:
jaws submit workflow.wdl inputs.json tahoma
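The hardware table can be turned into a small lookup to decide which site suits a given request. This sketch assumes that all GPUs for one task must come from a single node, which this guide does not state. The site ID "perlmutter" is also an assumption (only "tahoma" appears in the command above), and sites_that_fit is a hypothetical helper.

```python
from typing import Dict, List

# Values copied from the hardware table; the site IDs are assumed.
GPU_SITES: Dict[str, Dict[str, int]] = {
    "perlmutter": {"gpus_per_node": 4, "memory_gb_per_gpu": 40},
    "tahoma": {"gpus_per_node": 2, "memory_gb_per_gpu": 32},
}


def sites_that_fit(gpu_count: int, min_memory_gb_per_gpu: int = 0) -> List[str]:
    """List sites whose nodes have enough GPUs and enough memory per GPU."""
    return [
        site
        for site, spec in GPU_SITES.items()
        if gpu_count <= spec["gpus_per_node"]
        and min_memory_gb_per_gpu <= spec["memory_gb_per_gpu"]
    ]


print(sites_that_fit(4))      # a gpuCount: 4 request
print(sites_that_fit(2, 32))  # two GPUs with at least 32 GB each
```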
Troubleshooting
Common Runtime Stanza Errors
Issue: Task runs on CPU instead of GPU
Symptom:
>>> torch.cuda.is_available()
False
Causes & Fixes:
Missing gpu: true in runtime stanza
❌ Wrong:
runtime {
docker: "pytorch/pytorch:latest"
memory: "16GiB"
cpu: 4
# gpu missing
}
✅ Fix:
runtime {
docker: "pytorch/pytorch:latest"
memory: "16GiB"
cpu: 4
gpu: true # Add this
}
CPU-only container
❌ Wrong:
runtime {
docker: "ubuntu:22.04" # No CUDA
gpu: true
}
✅ Fix:
runtime {
docker: "pytorch/pytorch:latest" # Has CUDA
gpu: true
}
Issue: nvidia-smi command not found
Cause: Container does not include NVIDIA drivers.
Fix: Use a CUDA-enabled base image:
runtime {
docker: "nvidia/cuda:12.0.0-runtime-ubuntu22.04" # or pytorch/pytorch:latest
gpu: true
}
Issue: Expected X GPUs but found Y
Symptom: Your code requests more GPUs than allocated.
Cause: Mismatch between code and runtime stanza.
Fix: Align your code with gpuCount:
task train {
input {
Int num_gpus = 2
}
command <<<
python3 train.py --gpus ~{num_gpus}
>>>
runtime {
docker: "pytorch/pytorch:latest"
gpu: true
gpuCount: num_gpus # Match code expectation
}
}
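On the script side, a mismatch is easier to diagnose if it is detected at startup instead of partway through a run. The sketch below shows how a script like train.py might validate its --gpus argument. Both function names are hypothetical, and the number of visible GPUs is passed in as a plain integer so the example runs without PyTorch; in a real script it could come from torch.cuda.device_count().

```python
import argparse
from typing import List


def parse_args(argv: List[str]) -> argparse.Namespace:
    """Parse the --gpus flag that the WDL command passes to the script."""
    parser = argparse.ArgumentParser(description="Training script (sketch)")
    parser.add_argument("--gpus", type=int, default=1, help="number of GPUs to use")
    return parser.parse_args(argv)


def check_gpu_request(requested: int, visible: int) -> None:
    """Fail fast when the script asks for more GPUs than the task can see."""
    if requested > visible:
        raise RuntimeError(
            f"Expected {requested} GPUs but found {visible}; "
            "check gpuCount in the runtime stanza"
        )


args = parse_args(["--gpus", "2"])
check_gpu_request(args.gpus, visible=2)  # passes: request matches allocation
```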
FAQ
Q: What’s the minimum runtime stanza for GPU?
A: Two attributes are required: docker (a CUDA-enabled image) and gpu: true.
runtime {
docker: "pytorch/pytorch:latest"
gpu: true
}
Q: What happens if I omit gpuCount?
A: It defaults to 1 GPU when gpu is true. These are equivalent:
runtime {
gpu: true
}
runtime {
gpu: true
gpuCount: 1
}
Q: Can I mix GPU and CPU tasks in one workflow?
A: Yes! Only add gpu: true to tasks that need GPUs:
task preprocess {
runtime {
docker: "ubuntu:22.04"
memory: "16GiB"
cpu: 4
# No gpu → runs on CPU
}
}
task train {
runtime {
docker: "pytorch/pytorch:latest"
memory: "16GiB"
cpu: 4
gpu: true # Runs on GPU
}
}
Q: Does gpuCount: 4 automatically parallelize my code?
A: No. Your code must explicitly use multi-GPU frameworks (PyTorch DDP, Horovod, etc.).