JAWS API v3 - Runs Endpoint
Warning
The JAWS API v3 is currently experimental and under active development. Endpoints, request/response formats, and behavior may change without notice.
Note
This documentation is for users accessing the JAWS API programmatically. If you’re using
the JAWS CLI (jaws command), see JAWS Commands instead.
Overview
The /runs endpoint provides programmatic access to submit, monitor, and search workflow runs in JAWS.
All requests require authentication via API key.
Key Features:
Submit Runs - Upload workflow files and create new runs
Monitor Status - Track run progress and retrieve detailed status
Search & Filter - Query runs by team, user, status, site, and more
API Environments
JAWS provides multiple environments:
Production: https://jaws-api.jgi.doe.gov/api/v3
Staging: https://jaws-api-staging.jgi.doe.gov/api/v3 (for testing with select users)
Note
The examples below use the production environment. To use staging, simply replace the base URL.
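Switching environments can be reduced to a lookup. The sketch below assumes nothing beyond the two base URLs listed above; the `BASE_URLS` mapping and the `runs_url` helper are illustrative names, not part of the JAWS API.

```python
# Base URLs copied from the environment list above.
BASE_URLS = {
    "prod": "https://jaws-api.jgi.doe.gov/api/v3",
    "staging": "https://jaws-api-staging.jgi.doe.gov/api/v3",
}


def runs_url(env: str = "prod") -> str:
    """Return the /runs endpoint URL for the given environment."""
    try:
        return f"{BASE_URLS[env]}/runs"
    except KeyError:
        raise ValueError(f"unknown environment: {env!r}") from None


print(runs_url("staging"))
```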
Authentication
All requests require an API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
Contact your JAWS administrator for an API key.
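To keep the key out of source files, one option is to read it from an environment variable and build the header in one place. This is only a sketch: `JAWS_API_KEY` is a hypothetical variable name chosen for the example, and `auth_headers` is an illustrative helper.

```python
import os


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header in the Bearer format shown above."""
    if not api_key:
        raise ValueError("API key is empty")
    return {"Authorization": f"Bearer {api_key}"}


# JAWS_API_KEY is a hypothetical variable name; fall back to a placeholder.
api_key = os.environ.get("JAWS_API_KEY") or "YOUR_API_KEY"
headers = auth_headers(api_key)
```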
Submitting a Run
Create a new workflow run by uploading WDL and inputs files.
API Endpoint
POST https://jaws-api.jgi.doe.gov/api/v3/runs
Request Parameters
Multipart Form Data:
wdl_file (file, required): WDL workflow file
inputs_json (file, required): Inputs JSON file
subworkflows (file, optional): Subworkflows ZIP archive
data (JSON string, required): Run metadata with the following fields:
Required metadata fields:
compute_site_id (string): Target compute site (e.g., “dori”, “jgi”, “perlmutter”)
input_site_id (string): Site where input files are located
team_id (string): Team identifier
workflow_name (string): Semantic workflow name
max_ram_gb (integer): Maximum RAM in GB (1-1024). This should match the maximum RAM specified in your WDL.
Optional metadata fields:
output_site_id (string): Where to transfer outputs (defaults to input_site_id)
workflow_tag (string): Version or tag for the workflow
caching (boolean): Enable Cromwell call caching (default: true)
tag (string): User-defined tag for this run
manifest (array): List of input file paths to transfer
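The required fields and the 1-1024 range for max_ram_gb can be checked locally before a request is sent. The following is a minimal sketch of such a check, not the validation the API itself performs; `REQUIRED_FIELDS` and `validate_metadata` are illustrative names, and the server remains the authority on what is accepted.

```python
# Required metadata fields and their expected Python types, per the list above.
REQUIRED_FIELDS = {
    "compute_site_id": str,
    "input_site_id": str,
    "team_id": str,
    "workflow_name": str,
    "max_ram_gb": int,
}


def validate_metadata(metadata: dict) -> list:
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        value = metadata.get(field)
        if value is None:
            problems.append(f"missing required field: {field}")
        # bool is a subclass of int, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(f"{field} must be of type {expected_type.__name__}")
    ram = metadata.get("max_ram_gb")
    if isinstance(ram, int) and not isinstance(ram, bool) and not 1 <= ram <= 1024:
        problems.append("max_ram_gb must be between 1 and 1024")
    return problems


example = {
    "compute_site_id": "dori",
    "input_site_id": "dori",
    "team_id": "my_team",
    "workflow_name": "example_workflow",
    "max_ram_gb": 16,
}
print(validate_metadata(example))
```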
Input Data Staging (v3 Requirement)
Important
In API v3, you are responsible for staging your input data. Unlike previous versions where the JAWS client handled data movement, v3 requires you to:
Move your input files to the staging area on the input site before submission
Set correct file permissions to ensure JAWS can access the files
Reference the staged paths in your inputs JSON file using the full absolute path
Staging Directory Structure:
The staging directory path format is:
/clusterfs/jgi/scratch/dsi/aa/jaws/dori-[ENV]/inputs/[SITE]/
Where:
[ENV] = environment (prod or staging)
[SITE] = site name (currently dori for MVP)
Examples:
Production Dori: /clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/
Staging Dori: /clusterfs/jgi/scratch/dsi/aa/jaws/dori-staging/inputs/dori/
How Staging Works:
Your staged file path = staging directory + your data’s absolute path
For example, if your original data is at /home/user/project/data/sample.fastq, you would:
Move it to: /clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq
Reference in inputs.json: /clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq
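The path rule above (staging directory plus the data's absolute path) can be expressed as a small function. This is a sketch that only encodes the format documented here; `STAGING_TEMPLATE` and `staged_path` are illustrative names.

```python
import posixpath

# Staging directory format from this section, with [ENV] and [SITE] as fields.
STAGING_TEMPLATE = "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-{env}/inputs/{site}"


def staged_path(original: str, env: str = "prod", site: str = "dori") -> str:
    """Return the staged location for an absolute path to the original data."""
    if not posixpath.isabs(original):
        raise ValueError("original path must be absolute")
    root = STAGING_TEMPLATE.format(env=env, site=site)
    # Strip the leading slash so the original path nests under the staging root.
    return posixpath.join(root, posixpath.normpath(original).lstrip("/"))


print(staged_path("/home/user/project/data/sample.fastq"))
```

The returned string is the value to place in inputs.json for that file.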
File Permission Requirements:
Files must be readable by the JAWS execution user
Recommended permissions: chmod 644 for files, chmod 755 for directories
Ensure parent directories are executable (chmod +x)
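When inputs are staged from a script, the recommended modes can be applied recursively. The sketch below uses only the Python standard library and demonstrates on a temporary directory rather than the real staging area; `set_staging_permissions` is an illustrative name. It does not touch directories above the given root, so those still need the execute bit set separately.

```python
import os
import stat
import tempfile


def set_staging_permissions(root: str) -> None:
    """Apply 755 to root and its subdirectories, and 644 to the files inside."""
    os.chmod(root, 0o755)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            os.chmod(os.path.join(dirpath, name), 0o755)
        for name in filenames:
            os.chmod(os.path.join(dirpath, name), 0o644)


# Demonstrate on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "inputs", "sample.fastq")
    os.makedirs(os.path.dirname(sample))
    open(sample, "w").close()
    os.chmod(sample, 0o600)  # start from a mode that is too restrictive
    set_staging_permissions(tmp)
    file_mode = stat.S_IMODE(os.stat(sample).st_mode)
    dir_mode = stat.S_IMODE(os.stat(os.path.dirname(sample)).st_mode)
    print(oct(file_mode), oct(dir_mode))
```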
Example workflow for Dori (production):
# Assume your data is in /home/jdoe/myproject/inputs/
STAGING_PATH="/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori"
DATA_PATH="/home/jdoe/myproject/inputs"
# 1. Create the directory structure in staging
mkdir -p "$STAGING_PATH/home/jdoe/myproject/inputs"
# 2. Copy your input files to the staging area
cp "$DATA_PATH/sample.fastq" "$STAGING_PATH/home/jdoe/myproject/inputs/"
cp "$DATA_PATH/reference.fasta" "$STAGING_PATH/home/jdoe/myproject/inputs/"
# 3. Set proper permissions
chmod 755 "$STAGING_PATH/home/jdoe/myproject/inputs"
chmod 644 "$STAGING_PATH/home/jdoe/myproject/inputs"/*
Example inputs.json with staged file paths:
{
  "workflow.input_fastq": "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/jdoe/myproject/inputs/sample.fastq",
  "workflow.reference": "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/jdoe/myproject/inputs/reference.fasta"
}
After staging your data and creating your inputs.json with the full staged paths,
submit the run via API with input_site_id="dori" and compute_site_id="dori"
(see API submission examples below).
Example with cURL
curl -X POST https://jaws-api.jgi.doe.gov/api/v3/runs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "wdl_file=@workflow.wdl" \
  -F "inputs_json=@inputs.json" \
  -F 'data={
    "compute_site_id": "dori",
    "input_site_id": "dori",
    "output_site_id": "dori",
    "team_id": "my_team",
    "workflow_name": "example_workflow",
    "max_ram_gb": 16,
    "workflow_tag": "v1.0",
    "caching": true,
    "tag": "test_run",
    "manifest": [
      "/path/to/input1.fastq",
      "/path/to/input2.fastq"
    ]
  }'
Example with Python
import requests
import json
API_KEY = "your_api_key_here"
API_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"
# Prepare metadata
metadata = {
    "compute_site_id": "dori",
    "input_site_id": "dori",
    "output_site_id": "dori",
    "team_id": "my_team",
    "workflow_name": "example_workflow",
    "max_ram_gb": 16,
    "workflow_tag": "v1.0",
    "caching": True,
    "tag": "test_run",
    "manifest": [
        "/path/to/input1.fastq",
        "/path/to/input2.fastq"
    ]
}
# Prepare files and data
files = {
    'wdl_file': open('workflow.wdl', 'rb'),
    'inputs_json': open('inputs.json', 'rb'),
    'data': (None, json.dumps(metadata), 'application/json')
}
# Optional: include subworkflows
# files['subworkflows'] = open('subworkflows.zip', 'rb')
headers = {
    "Authorization": f"Bearer {API_KEY}"
}
# Submit the run
response = requests.post(API_URL, headers=headers, files=files)
if response.status_code == 201:
    run_data = response.json()
    print("✓ Run submitted successfully!")
    print(f"  Run ID: {run_data['run_id']}")
    print(f"  Workflow: {run_data['workflow_name']}")
    print(f"  Submission ID: {run_data['submission_id']}")
    print(f"  Submitted at: {run_data['submitted_at']}")
else:
    print(f"✗ Failed to submit run: {response.status_code}")
    print(response.json())
# Close files (the 'data' entry is a tuple and has no close method)
for f in files.values():
    if hasattr(f, 'close'):
        f.close()
Response (201 Created)
{
  "run_id": 12345,
  "submission_id": "abc123def456",
  "workflow_id": "wf_789xyz",
  "workflow_name": "example_workflow",
  "submitted_at": "2026-05-11T14:30:00.000000Z",
  "submitted_by": "jdoe"
}
Monitoring a Run
Retrieve detailed status information for a specific run.
API Endpoint
GET https://jaws-api.jgi.doe.gov/api/v3/runs/{run_id}
Example with cURL
curl -H "Authorization: Bearer YOUR_API_KEY" \
https://jaws-api.jgi.doe.gov/api/v3/runs/12345
Example with Python
import requests
API_KEY = "your_api_key_here"
RUN_ID = 12345
API_URL = f"https://jaws-api.jgi.doe.gov/api/v3/runs/{RUN_ID}"
headers = {"Authorization": f"Bearer {API_KEY}"}
response = requests.get(API_URL, headers=headers)
if response.status_code == 200:
    run = response.json()
    print(f"Run ID: {run['id']}")
    print(f"Status: {run['status']}")
    print(f"Result: {run.get('result', 'N/A')}")
    print(f"Workflow: {run['workflow_name']}")
    print(f"Team: {run['team_id']}")
    print(f"Compute Site: {run['compute_site_id']}")
    print(f"Output Dir: {run.get('output_dir', 'N/A')}")
    print(f"CPU Hours: {run.get('cpu_hours', 0)}")
elif response.status_code == 404:
    print(f"✗ Run {RUN_ID} not found")
else:
    print(f"✗ Error: {response.status_code}")
Response (200 OK)
{
  "id": 12345,
  "user_id": "jdoe",
  "submission_id": "abc123def456",
  "status": "running",
  "result": null,
  "workflow_name": "example_workflow",
  "workflow_id": "wf_789xyz",
  "team_id": "my_team",
  "compute_site_id": "dori",
  "input_site_id": "dori",
  "output_site_id": "dori",
  "output_dir": "/path/to/outputs/12345",
  "max_ram_gb": 16,
  "caching": true,
  "tag": "test_run",
  "cromwell_run_id": "cromwell-abc-123",
  "wdl_file": "workflow.wdl",
  "json_file": "inputs.json",
  "cpu_hours": 2.5,
  "submitted": "2026-05-11T14:30:00Z",
  "updated": "2026-05-11T15:45:00Z",
  "workflow_root": "/cromwell/executions/workflow/cromwell-abc-123",
  "webhook": null
}
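The submitted and updated fields in this response are ISO 8601 timestamps in UTC, so the time between them can be computed with the standard library. A small sketch, assuming the timestamps always end in "Z" as in the example; `parse_timestamp` and `elapsed` are illustrative names.

```python
from datetime import datetime, timedelta


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp that uses a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def elapsed(run: dict) -> timedelta:
    """Time between submission and the most recent update of a run."""
    return parse_timestamp(run["updated"]) - parse_timestamp(run["submitted"])


run = {"submitted": "2026-05-11T14:30:00Z", "updated": "2026-05-11T15:45:00Z"}
print(elapsed(run))  # 1:15:00
```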
Polling for Status Updates
To monitor a run until completion, poll the endpoint at regular intervals:
import requests
import time
API_KEY = "your_api_key_here"
RUN_ID = 12345
API_URL = f"https://jaws-api.jgi.doe.gov/api/v3/runs/{RUN_ID}"
headers = {"Authorization": f"Bearer {API_KEY}"}
print(f"Monitoring run {RUN_ID}...")
while True:
    response = requests.get(API_URL, headers=headers)
    if response.status_code == 200:
        run = response.json()
        status = run['status']
        print(f"Status: {status} (Last updated: {run['updated']})")
        # Check if run reached terminal state
        if status in ['succeeded', 'failed', 'cancelled', 'done']:
            print("\n✓ Run completed")
            print(f"  Final Status: {status}")
            print(f"  Result: {run.get('result', 'N/A')}")
            print(f"  CPU Hours: {run.get('cpu_hours', 0)}")
            print(f"  Output Dir: {run.get('output_dir', 'N/A')}")
            break
    elif response.status_code == 404:
        print("✗ Run not found (may have been deleted)")
        break
    else:
        print(f"✗ Error checking status: {response.status_code}")
    # Wait before next poll (recommended: 30-60 seconds)
    time.sleep(30)
Run Status Values
Common status values you’ll encounter:
created - Run accepted, ID assigned
upload queued - Waiting to transfer inputs to compute site
uploading - Transferring input files
upload complete - Inputs transferred successfully
ready - Run transferred to compute site
submitted - Submitted to Cromwell
queued - At least one task is queued
running - Workflow is executing
succeeded - Run completed successfully
failed - Run failed
cancelled - Run was cancelled
download complete - Outputs transferred to output site
done - Run fully complete
For complete status descriptions, see JAWS Commands.
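A polling loop that waits for one of these statuses runs forever if the run never reaches a terminal state, so a timeout is worth adding. The sketch below separates the waiting logic from the HTTP call: `fetch_status` is a hypothetical zero-argument callable that returns the current status string (for example, a wrapper around the GET request shown earlier), and the set of terminal statuses is the one used in this document's examples.

```python
import time

# Statuses treated as terminal by the examples in this document.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled", "done"}


def wait_for_run(fetch_status, interval=30.0, timeout=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until a terminal status appears or timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout} seconds")
        sleep(interval)


# Simulated run: the status sequence and the recorded sleeps stand in for
# real API calls and real waiting.
statuses = iter(["queued", "running", "succeeded"])
naps = []
final_status = wait_for_run(lambda: next(statuses), sleep=naps.append)
print(final_status, naps)
```

Passing `sleep` and `clock` as parameters keeps the function testable without waiting in real time.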
Searching for Runs
Query runs with filtering, pagination, and sorting.
API Endpoint
GET https://jaws-api.jgi.doe.gov/api/v3/runs
Query Parameters (all optional)
user_id (string): Filter by user ID
team_id (string): Filter by team ID
status (string): Filter by status (e.g., “running”, “succeeded”)
compute_site_id (string): Filter by compute site
input_site_id (string): Filter by input site
output_site_id (string): Filter by output site
order_by (string): Sort field (prefix with - for descending, e.g., -id, user_id)
offset (integer): Pagination offset (default: 0)
limit (integer): Max results to return (max: 100, default: 25)
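These parameters combine into an ordinary query string. As a sketch, the helper below builds the search URL with the standard library and rejects parameter names not listed above as well as a limit over 100; `build_search_url` is an illustrative name, and the checks mirror this list rather than the server's own validation.

```python
from urllib.parse import urlencode

# Query parameters documented for GET /runs.
ALLOWED_PARAMS = {
    "user_id", "team_id", "status", "compute_site_id", "input_site_id",
    "output_site_id", "order_by", "offset", "limit",
}
MAX_LIMIT = 100


def build_search_url(base_url: str, **filters) -> str:
    """Return base_url with the given filters encoded as a query string."""
    unknown = set(filters) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unsupported query parameters: {sorted(unknown)}")
    limit = filters.get("limit")
    if limit is not None and limit > MAX_LIMIT:
        raise ValueError(f"limit must not exceed {MAX_LIMIT}")
    # Drop filters left as None so they are not sent at all.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{base_url}?{query}" if query else base_url


RUNS_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"
print(build_search_url(RUNS_URL, team_id="my_team", limit=50))
```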
Example with cURL
# Get all runs for a team
curl -H "Authorization: Bearer YOUR_API_KEY" \
"https://jaws-api.jgi.doe.gov/api/v3/runs?team_id=my_team&limit=50"
# Get running runs for a specific user
curl -H "Authorization: Bearer YOUR_API_KEY" \
"https://jaws-api.jgi.doe.gov/api/v3/runs?user_id=jdoe&status=running"
# Get recent runs (sorted by ID descending)
curl -H "Authorization: Bearer YOUR_API_KEY" \
"https://jaws-api.jgi.doe.gov/api/v3/runs?order_by=-id&limit=10"
Example with Python
import requests
API_KEY = "your_api_key_here"
API_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"
headers = {"Authorization": f"Bearer {API_KEY}"}
# Example 1: Get all runs for a team
params = {
    "team_id": "my_team",
    "limit": 50
}
response = requests.get(API_URL, headers=headers, params=params)
if response.status_code == 200:
    runs = response.json()
    print(f"Found {len(runs)} runs for team 'my_team'")
    for run in runs:
        print(f"  Run {run['id']}: {run['workflow_name']} - {run['status']}")
# Example 2: Get running runs for a user
params = {
    "user_id": "jdoe",
    "status": "running"
}
response = requests.get(API_URL, headers=headers, params=params)
if response.status_code == 200:
    running_runs = response.json()
    print(f"\n{len(running_runs)} running runs for user 'jdoe'")
# Example 3: Get recent runs with pagination
params = {
    "order_by": "-id",
    "limit": 10,
    "offset": 0
}
response = requests.get(API_URL, headers=headers, params=params)
if response.status_code == 200:
    recent_runs = response.json()
    print("\n10 most recent runs:")
    for run in recent_runs:
        print(f"  {run['id']}: {run['workflow_name']} ({run['status']})")
# Example 4: Filter by compute site and status
params = {
    "compute_site_id": "perlmutter",
    "status": "succeeded",
    "limit": 25
}
response = requests.get(API_URL, headers=headers, params=params)
if response.status_code == 200:
    perlmutter_runs = response.json()
    print(f"\n{len(perlmutter_runs)} succeeded runs on Perlmutter")
Response (200 OK)
Returns an array of run objects:
[
  {
    "id": 12345,
    "user_id": "jdoe",
    "workflow_name": "example_workflow",
    "status": "succeeded",
    "team_id": "my_team",
    "compute_site_id": "dori",
    "submitted": "2026-05-11T14:30:00Z",
    "updated": "2026-05-11T16:00:00Z"
  },
  {
    "id": 12344,
    "user_id": "jdoe",
    "workflow_name": "another_workflow",
    "status": "running",
    "team_id": "my_team",
    "compute_site_id": "dori",
    "submitted": "2026-05-11T13:00:00Z",
    "updated": "2026-05-11T15:45:00Z"
  }
]
Error Responses
404 Not Found - Run does not exist:
{
  "detail": "Run not found"
}
422 Unprocessable Entity - Invalid request parameters:
{
  "detail": [
    {
      "loc": ["body", "data", "max_ram_gb"],
      "msg": "ensure this value is less than or equal to 1024",
      "type": "value_error.number.not_le"
    }
  ]
}
401 Unauthorized - Invalid or missing API key:
{
  "detail": "Invalid authentication credentials"
}
403 Forbidden - Insufficient permissions:
{
  "detail": "Insufficient permissions to access this resource"
}
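In these examples the detail field is a plain string for 401, 403, and 404, but a list of validation entries for 422. A client that prints errors needs to handle both shapes. The sketch below assumes only the two shapes shown above; `describe_error` is an illustrative helper name.

```python
def describe_error(status_code: int, body: dict) -> str:
    """Turn an error response body into a single readable line."""
    detail = body.get("detail", "")
    if isinstance(detail, list):
        # 422 responses carry one entry per failed validation.
        parts = []
        for item in detail:
            location = ".".join(str(p) for p in item.get("loc", []))
            parts.append(f"{location}: {item.get('msg', '')}")
        detail = "; ".join(parts)
    return f"HTTP {status_code}: {detail}"


validation_body = {
    "detail": [
        {
            "loc": ["body", "data", "max_ram_gb"],
            "msg": "ensure this value is less than or equal to 1024",
            "type": "value_error.number.not_le",
        }
    ]
}
print(describe_error(404, {"detail": "Run not found"}))
print(describe_error(422, validation_body))
```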
Best Practices
Run Submission
Stage input data in the appropriate location with correct permissions before submission
Use descriptive tag values to identify runs later
Set max_ram_gb to match the maximum RAM requested in your WDL; the parameter is required so that the API can validate the request quickly at the API level
Set max_ram_gb appropriately to avoid resource waste
Enable caching (default) to leverage Cromwell call caching and speed up reruns by reusing previously computed results
Monitoring
Poll at reasonable intervals (recommended: 30-60 seconds)
Check for terminal states: succeeded, failed, cancelled, done
Searching
Use filters to reduce result set size
Implement pagination for large result sets (max 100 per request)
Sort by -id to get the most recent runs first
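Pagination with offset and limit can be wrapped in a generator so that callers iterate over runs without tracking offsets. This is a sketch under one assumption not stated in this document: that a page shorter than the requested limit marks the end of the results. `fetch_page` is a hypothetical callable taking (offset, limit) and returning a list of run objects; in real use it would wrap a GET request to /runs with those two query parameters.

```python
from typing import Callable, Iterator, List


def iter_runs(fetch_page: Callable[[int, int], List[dict]],
              limit: int = 100) -> Iterator[dict]:
    """Yield runs page by page until a short (or empty) page is returned."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            return
        offset += limit


# Stand-in for the API: 250 fake runs served in slices.
fake_runs = [{"id": i} for i in range(250)]
calls = []


def fake_fetch(offset: int, limit: int) -> List[dict]:
    calls.append((offset, limit))
    return fake_runs[offset:offset + limit]


all_runs = list(iter_runs(fake_fetch))
print(len(all_runs), calls)
```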
Error Handling
Always check HTTP status codes
Retry transient errors (5xx) with exponential backoff
Log submission details for troubleshooting failed runs
Validate metadata before submission to catch errors early
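Retrying 5xx responses with exponential backoff could look like the sketch below. `send` is a hypothetical zero-argument callable that performs one request and returns an object with a status_code attribute (such as a requests response); the attempt count and base delay are arbitrary illustrative defaults. Connection errors raised as exceptions are not handled here. Take care when retrying a POST submission: if the first request actually succeeded, a retry may create a second run, and this document does not say whether the API deduplicates submissions.

```python
import time


def with_retries(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-5xx response or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        # Return on success, on client errors, or when no attempts remain.
        if response.status_code < 500 or attempt == max_attempts - 1:
            return response
        # Delay doubles each time: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)


class FakeResponse:
    """Minimal stand-in for an HTTP response, used only for this demo."""

    def __init__(self, status_code: int):
        self.status_code = status_code


responses = iter([FakeResponse(503), FakeResponse(502), FakeResponse(200)])
delays = []
result = with_retries(lambda: next(responses), sleep=delays.append)
print(result.status_code, delays)
```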
Performance
Cache workflow files when submitting multiple runs
Use the same workflow_name and workflow_tag for deduplication
Close file handles after submission
Limit concurrent submissions to avoid overwhelming the API
Complete Workflow Example
To run the quickstart example, first clone the jaws-tutorial-examples repository:
git clone https://code.jgi.doe.gov/official-jgi-workflows/wdl-specific-repositories/jaws-tutorial-examples.git
cd jaws-tutorial-examples/quickstart
Here’s a complete example demonstrating submission, monitoring, and searching:
import requests
import json
import time
# ============================================================================
# CONFIGURATION - Update these values for your environment
# ============================================================================
API_KEY = "your_api_key_here" # Your JAWS API key
TEAM_ID = "jgi_genomics" # Your team identifier
WORKFLOW_NAME = "my_analysis_workflow" # Descriptive workflow name
COMPUTE_SITE_ID = "dori" # Compute site: dori, perlmutter, jgi, etc.
INPUT_SITE_ID = "dori" # Input site (typically same as compute)
OUTPUT_SITE_ID = "dori"
WDL_FILE_PATH = "align.wdl" # Path to your WDL file
INPUTS_JSON_PATH = "inputs.json" # Path to your inputs JSON
MAX_RAM_GB = 1 # Max RAM in GB (1-1024). Match this with your max in your WDL
RUN_TAG = "api_example_run" # Tag to identify this run
# ============================================================================
BASE_URL = "https://jaws-api.jgi.doe.gov/api/v3"
headers = {"Authorization": f"Bearer {API_KEY}"}
# Step 1: Submit a run
print("Step 1: Submitting run...")
metadata = {
    "compute_site_id": COMPUTE_SITE_ID,
    "input_site_id": INPUT_SITE_ID,
    "output_site_id": OUTPUT_SITE_ID,
    "team_id": TEAM_ID,
    "workflow_name": WORKFLOW_NAME,
    "max_ram_gb": MAX_RAM_GB,
    "tag": RUN_TAG
}
files = {
    'wdl_file': open(WDL_FILE_PATH, 'rb'),
    'inputs_json': open(INPUTS_JSON_PATH, 'rb'),
    'data': (None, json.dumps(metadata), 'application/json')
}
response = requests.post(f"{BASE_URL}/runs", headers=headers, files=files)
# Close file handles (the 'data' entry is a tuple and has no close method)
for f in files.values():
    if hasattr(f, 'close'):
        f.close()
if response.status_code != 201:
    print(f"Failed to submit: {response.json()}")
    raise SystemExit(1)
run_data = response.json()
run_id = run_data['run_id']
print(f"✓ Run {run_id} submitted successfully\n")
# Step 2: Monitor the run
print(f"Step 2: Monitoring run {run_id}...")
while True:
    response = requests.get(f"{BASE_URL}/runs/{run_id}", headers=headers)
    if response.status_code == 200:
        run = response.json()
        status = run['status']
        print(f"  Status: {status}")
        # Check if terminal state
        if status in ['succeeded', 'failed', 'cancelled', 'done']:
            print(f"\n✓ Run completed with status: {status}")
            print(f"  Result: {run.get('result', 'N/A')}")
            print(f"  CPU Hours: {run.get('cpu_hours', 0)}")
            print(f"  Output Dir: {run.get('output_dir', 'N/A')}")
            break
    else:
        # Report the problem and keep polling
        print(f"  Error checking status: {response.status_code}")
    # Wait before next poll (30 seconds recommended)
    time.sleep(30)
# Step 3: Search for related runs
print(f"\nStep 3: Finding other runs with tag {RUN_TAG} ...")
# Note: Searching by tag requires filtering client-side
params = {
    "team_id": TEAM_ID,
    "limit": 100
}
response = requests.get(f"{BASE_URL}/runs", headers=headers, params=params)
if response.status_code == 200:
    all_runs = response.json()
    tagged_runs = [r for r in all_runs if r.get('tag') == RUN_TAG]
    print(f"Found {len(tagged_runs)} runs with this tag:")
    for run in tagged_runs:
        print(f"  Run {run['id']}: {run['status']}")