JAWS API v3 - Runs Endpoint

Warning

The JAWS API v3 is currently experimental and under active development. Endpoints, request/response formats, and behavior may change without notice.

Note

This documentation is for users accessing the JAWS API programmatically. If you’re using the JAWS CLI (jaws command), see JAWS Commands instead.

Overview

The /runs endpoint provides programmatic access to submit, monitor, and search workflow runs in JAWS. All requests require authentication via API key.

Key Features:

  • Submit Runs - Upload workflow files and create new runs

  • Monitor Status - Track run progress and retrieve detailed status

  • Search & Filter - Query runs by team, user, status, site, and more

API Environments

JAWS provides multiple environments:

  • Production: https://jaws-api.jgi.doe.gov/api/v3

  • Staging: https://jaws-api-staging.jgi.doe.gov/api/v3 (for testing with select users)

Note

The examples below use the production environment. To use staging, replace the production base URL with the staging base URL.

Authentication

All requests require an API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Contact your JAWS administrator for an API key.
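
As a minimal sketch, the base URL and the Authorization header can be built in one place and reused across requests. The names BASE_URLS, auth_headers, and endpoint are illustrative helpers and not part of JAWS. The URLs and the header format are taken from the sections above.

```python
# Base URLs as listed under "API Environments".
BASE_URLS = {
    "production": "https://jaws-api.jgi.doe.gov/api/v3",
    "staging": "https://jaws-api-staging.jgi.doe.gov/api/v3",
}


def auth_headers(api_key):
    """Return the headers dict carrying the Bearer token (hypothetical helper)."""
    if not api_key:
        raise ValueError("an API key is required")
    return {"Authorization": f"Bearer {api_key}"}


def endpoint(path, environment="production"):
    """Join a path such as "runs" or "runs/12345" onto the chosen base URL."""
    return f"{BASE_URLS[environment]}/{path.lstrip('/')}"
```

The returned dict can be passed as the headers argument of requests.get or requests.post, as in the examples that follow.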

Submitting a Run

Create a new workflow run by uploading a WDL file and an inputs JSON file.

API Endpoint

POST https://jaws-api.jgi.doe.gov/api/v3/runs

Request Parameters

Multipart Form Data:

  • wdl_file (file, required): WDL workflow file

  • inputs_json (file, required): Inputs JSON file

  • subworkflows (file, optional): Subworkflows ZIP archive

  • data (JSON string, required): Run metadata with the following fields:

Required metadata fields:

  • compute_site_id (string): Target compute site (e.g., “dori”, “jgi”, “perlmutter”)

  • input_site_id (string): Site where input files are located

  • team_id (string): Team identifier

  • workflow_name (string): Semantic workflow name

  • max_ram_gb (integer): Maximum RAM in GB (1-1024). This should match the max RAM specified in your WDL.

Optional metadata fields:

  • output_site_id (string): Where to transfer outputs (defaults to input_site_id)

  • workflow_tag (string): Version or tag for the workflow

  • caching (boolean): Enable Cromwell call caching (default: true)

  • tag (string): User-defined tag for this run

  • manifest (array): List of input file paths to transfer
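
The required fields above can be checked client-side before submission, which catches mistakes without a round trip to the API. This is a sketch only: validate_metadata is a hypothetical helper, and the server's own validation may be stricter or differ in detail.

```python
# Required metadata fields and their types, as listed above.
REQUIRED_FIELDS = {
    "compute_site_id": str,
    "input_site_id": str,
    "team_id": str,
    "workflow_name": str,
    "max_ram_gb": int,
}


def validate_metadata(metadata):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in metadata:
            problems.append(f"missing required field: {field}")
            continue
        value = metadata[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected_type) or isinstance(value, bool):
            problems.append(f"{field} must be of type {expected_type.__name__}")
    ram = metadata.get("max_ram_gb")
    if isinstance(ram, int) and not isinstance(ram, bool) and not 1 <= ram <= 1024:
        problems.append("max_ram_gb must be between 1 and 1024")
    return problems
```

Calling validate_metadata(metadata) before building the multipart request lets a script stop early and print the problems instead of waiting for a 422 response.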

Input Data Staging (v3 Requirement)

Important

In API v3, you are responsible for staging your input data. Unlike previous versions where the JAWS client handled data movement, v3 requires you to:

  1. Move your input files to the staging area on the input site before submission

  2. Set correct file permissions to ensure JAWS can access the files

  3. Reference the staged paths in your inputs JSON file using the full absolute path

Staging Directory Structure:

The staging directory path format is:

/clusterfs/jgi/scratch/dsi/aa/jaws/dori-[ENV]/inputs/[SITE]/

Where:

  • [ENV] = environment (prod or staging)

  • [SITE] = site name (currently dori for MVP)

Examples:

  • Production Dori: /clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/

  • Staging Dori: /clusterfs/jgi/scratch/dsi/aa/jaws/dori-staging/inputs/dori/

How Staging Works:

Your staged file path = staging directory + your data’s absolute path

For example, if your original data is at /home/user/project/data/sample.fastq, you would:

  1. Move it to: /clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq

  2. Reference in inputs.json: /clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq
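
The path rule above can be expressed as a small function. This is a sketch under the directory layout described in this section. staged_path and STAGING_ROOT are illustrative names, not part of JAWS.

```python
from pathlib import PurePosixPath

# Fixed prefix of the staging directory format shown above.
STAGING_ROOT = "/clusterfs/jgi/scratch/dsi/aa/jaws"


def staged_path(original, env="prod", site="dori"):
    """Return the staged location for an absolute path to the original data."""
    original_path = PurePosixPath(original)
    if not original_path.is_absolute():
        raise ValueError("the original data path must be absolute")
    staging_dir = PurePosixPath(STAGING_ROOT) / f"dori-{env}" / "inputs" / site
    # Drop the leading "/" so the original path nests under the staging directory.
    return str(staging_dir / original_path.relative_to("/"))


print(staged_path("/home/user/project/data/sample.fastq"))
```

The printed path is the value to use both as the copy destination and in inputs.json.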

File Permission Requirements:

  • Files must be readable by the JAWS execution user

  • Recommended permissions: chmod 644 for files, chmod 755 for directories

  • Ensure parent directories are executable (chmod +x)
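
A quick pre-submission check of these requirements might look like the sketch below. It inspects the "other" permission bits, which assumes the JAWS execution user is neither the owner nor in the group of the staged files. That assumption is not stated in this documentation, and permission_problems is a hypothetical helper.

```python
import os
import stat


def permission_problems(file_path, staging_dir):
    """Return a list of permission problems for one staged file."""
    problems = []
    file_path = os.path.abspath(file_path)
    staging_dir = os.path.abspath(staging_dir)

    # The file itself must be readable (the read bit granted by chmod 644).
    if not os.stat(file_path).st_mode & stat.S_IROTH:
        problems.append(f"not readable by others: {file_path}")

    # Each directory between the file and the staging directory must be
    # traversable (the execute bit granted by chmod 755).
    directory = os.path.dirname(file_path)
    while directory != staging_dir and directory != os.path.dirname(directory):
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            problems.append(f"not executable by others: {directory}")
        directory = os.path.dirname(directory)
    return problems
```

An empty list means the file and the directories created under the staging directory carry the recommended bits. The staging directory itself and anything above it are not checked.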

Example workflow for Dori (production):

# Assume your data is in /home/jdoe/myproject/inputs/
STAGING_PATH="/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori"
DATA_PATH="/home/jdoe/myproject/inputs"

# 1. Create the directory structure in staging
mkdir -p "$STAGING_PATH/home/jdoe/myproject/inputs"

# 2. Copy your input files to the staging area
cp "$DATA_PATH/sample.fastq" "$STAGING_PATH/home/jdoe/myproject/inputs/"
cp "$DATA_PATH/reference.fasta" "$STAGING_PATH/home/jdoe/myproject/inputs/"

# 3. Set proper permissions
chmod 755 "$STAGING_PATH/home/jdoe/myproject/inputs"
chmod 644 "$STAGING_PATH/home/jdoe/myproject/inputs"/*

Example inputs.json with staged file paths:

{
  "workflow.input_fastq": "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/jdoe/myproject/inputs/sample.fastq",
  "workflow.reference": "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/jdoe/myproject/inputs/reference.fasta"
}

After staging your data and creating your inputs.json with the full staged paths, submit the run via API with input_site_id="dori" and compute_site_id="dori" (see API submission examples below).

Example with cURL

curl -X POST https://jaws-api.jgi.doe.gov/api/v3/runs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "wdl_file=@workflow.wdl" \
  -F "inputs_json=@inputs.json" \
  -F 'data={
    "compute_site_id": "dori",
    "input_site_id": "dori",
    "output_site_id": "dori",
    "team_id": "my_team",
    "workflow_name": "example_workflow",
    "max_ram_gb": 16,
    "workflow_tag": "v1.0",
    "caching": true,
    "tag": "test_run",
    "manifest": [
      "/path/to/input1.fastq",
      "/path/to/input2.fastq"
    ]
  }'

Example with Python

import requests
import json

API_KEY = "your_api_key_here"
API_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"

# Prepare metadata
metadata = {
    "compute_site_id": "dori",
    "input_site_id": "dori",
    "output_site_id": "dori",
    "team_id": "my_team",
    "workflow_name": "example_workflow",
    "max_ram_gb": 16,
    "workflow_tag": "v1.0",
    "caching": True,
    "tag": "test_run",
    "manifest": [
        "/path/to/input1.fastq",
        "/path/to/input2.fastq"
    ]
}

# Open the files in a context manager so they are closed even if the request fails
with open('workflow.wdl', 'rb') as wdl_file, open('inputs.json', 'rb') as inputs_json:
    files = {
        'wdl_file': wdl_file,
        'inputs_json': inputs_json,
        'data': (None, json.dumps(metadata), 'application/json')
    }

    # Optional: include subworkflows (open it in the with statement above so it is closed too)
    # files['subworkflows'] = subworkflows_zip

    headers = {
        "Authorization": f"Bearer {API_KEY}"
    }

    # Submit the run
    response = requests.post(API_URL, headers=headers, files=files)

if response.status_code == 201:
    run_data = response.json()
    print("✓ Run submitted successfully!")
    print(f"  Run ID: {run_data['run_id']}")
    print(f"  Workflow: {run_data['workflow_name']}")
    print(f"  Submission ID: {run_data['submission_id']}")
    print(f"  Submitted at: {run_data['submitted_at']}")
else:
    print(f"✗ Failed to submit run: {response.status_code}")
    # Print the raw body: an error response is not guaranteed to be JSON
    print(response.text)

Response (201 Created)

{
  "run_id": 12345,
  "submission_id": "abc123def456",
  "workflow_id": "wf_789xyz",
  "workflow_name": "example_workflow",
  "submitted_at": "2026-05-11T14:30:00.000000Z",
  "submitted_by": "jdoe"
}

Monitoring a Run

Retrieve detailed status information for a specific run.

API Endpoint

GET https://jaws-api.jgi.doe.gov/api/v3/runs/{run_id}

Example with cURL

curl -H "Authorization: Bearer YOUR_API_KEY" \
     https://jaws-api.jgi.doe.gov/api/v3/runs/12345

Example with Python

import requests

API_KEY = "your_api_key_here"
RUN_ID = 12345
API_URL = f"https://jaws-api.jgi.doe.gov/api/v3/runs/{RUN_ID}"

headers = {"Authorization": f"Bearer {API_KEY}"}
response = requests.get(API_URL, headers=headers)

if response.status_code == 200:
    run = response.json()
    print(f"Run ID: {run['id']}")
    print(f"Status: {run['status']}")
    print(f"Result: {run.get('result', 'N/A')}")
    print(f"Workflow: {run['workflow_name']}")
    print(f"Team: {run['team_id']}")
    print(f"Compute Site: {run['compute_site_id']}")
    print(f"Output Dir: {run.get('output_dir', 'N/A')}")
    print(f"CPU Hours: {run.get('cpu_hours', 0)}")
elif response.status_code == 404:
    print(f"✗ Run {RUN_ID} not found")
else:
    print(f"✗ Error: {response.status_code}")

Response (200 OK)

{
  "id": 12345,
  "user_id": "jdoe",
  "submission_id": "abc123def456",
  "status": "running",
  "result": null,
  "workflow_name": "example_workflow",
  "workflow_id": "wf_789xyz",
  "team_id": "my_team",
  "compute_site_id": "dori",
  "input_site_id": "dori",
  "output_site_id": "dori",
  "output_dir": "/path/to/outputs/12345",
  "max_ram_gb": 16,
  "caching": true,
  "tag": "test_run",
  "cromwell_run_id": "cromwell-abc-123",
  "wdl_file": "workflow.wdl",
  "json_file": "inputs.json",
  "cpu_hours": 2.5,
  "submitted": "2026-05-11T14:30:00Z",
  "updated": "2026-05-11T15:45:00Z",
  "workflow_root": "/cromwell/executions/workflow/cromwell-abc-123",
  "webhook": null
}

Polling for Status Updates

To monitor a run until completion, poll the endpoint at regular intervals:

import requests
import time

API_KEY = "your_api_key_here"
RUN_ID = 12345
API_URL = f"https://jaws-api.jgi.doe.gov/api/v3/runs/{RUN_ID}"

headers = {"Authorization": f"Bearer {API_KEY}"}

print(f"Monitoring run {RUN_ID}...")

while True:
    response = requests.get(API_URL, headers=headers)

    if response.status_code == 200:
        run = response.json()
        status = run['status']
        print(f"Status: {status} (Last updated: {run['updated']})")

        # Check if run reached terminal state
        if status in ['succeeded', 'failed', 'cancelled', 'done']:
            print(f"\n✓ Run completed")
            print(f"  Final Status: {status}")
            print(f"  Result: {run.get('result', 'N/A')}")
            print(f"  CPU Hours: {run.get('cpu_hours', 0)}")
            print(f"  Output Dir: {run.get('output_dir', 'N/A')}")
            break
    elif response.status_code == 404:
        print(f"✗ Run not found (may have been deleted)")
        break
    else:
        print(f"✗ Error checking status: {response.status_code}")

    # Wait before next poll (recommended: 30-60 seconds)
    time.sleep(30)

Run Status Values

Common status values you’ll encounter:

created           - Run accepted, ID assigned
upload queued     - Waiting to transfer inputs to compute site
uploading         - Transferring input files
upload complete   - Inputs transferred successfully
ready             - Run transferred to compute site
submitted         - Submitted to Cromwell
queued            - At least one task is queued
running           - Workflow is executing
succeeded         - Run completed successfully
failed            - Run failed
cancelled         - Run was cancelled
download complete - Outputs transferred to output site
done              - Run fully complete

For complete status descriptions, see JAWS Commands.
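
The polling loop shown earlier runs until a terminal status appears. A bounded variant is sketched below. The terminal statuses are the ones used in the polling example above. wait_for_run, its timeout, and the fetch_status callable are assumptions made for illustration.

```python
import time

# Statuses treated as terminal by the polling example above.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled", "done"}


def wait_for_run(fetch_status, interval=30.0, timeout=3600.0, sleep=time.sleep):
    """Poll fetch_status() until a terminal status is seen or the timeout passes.

    fetch_status is any zero-argument callable returning the current status
    string, for example a function wrapping GET /runs/{run_id}.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout} seconds")
        sleep(interval)
```

Passing the fetch function in as an argument keeps the loop independent of the HTTP client and makes it easy to exercise with a canned sequence of statuses.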

Searching for Runs

Query runs with filtering, pagination, and sorting.

API Endpoint

GET https://jaws-api.jgi.doe.gov/api/v3/runs

Query Parameters (all optional)

  • user_id (string): Filter by user ID

  • team_id (string): Filter by team ID

  • status (string): Filter by status (e.g., “running”, “succeeded”)

  • compute_site_id (string): Filter by compute site

  • input_site_id (string): Filter by input site

  • output_site_id (string): Filter by output site

  • order_by (string): Sort field (prefix with - for descending, e.g., -id, user_id)

  • offset (integer): Pagination offset (default: 0)

  • limit (integer): Max results to return (max: 100, default: 25)
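
Because limit is capped at 100, collecting a larger result set means walking the offset. The sketch below assumes that a page shorter than the requested limit marks the end of the results. That is an assumption, since the response format shown in this document carries no total count. iter_runs and fetch_page are hypothetical names.

```python
def iter_runs(fetch_page, filters=None, page_size=100):
    """Yield run objects one page at a time.

    fetch_page takes a dict of query parameters and returns the decoded JSON
    array for that page, for example:
        lambda params: requests.get(API_URL, headers=headers, params=params).json()
    """
    offset = 0
    while True:
        params = dict(filters or {}, offset=offset, limit=page_size)
        page = fetch_page(params)
        yield from page
        # A short (or empty) page is taken to mean there is nothing left.
        if len(page) < page_size:
            break
        offset += page_size
```

Writing this as a generator lets callers stop early, for instance after finding the first run that matches a client-side condition.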

Example with cURL

# Get all runs for a team
curl -H "Authorization: Bearer YOUR_API_KEY" \
     "https://jaws-api.jgi.doe.gov/api/v3/runs?team_id=my_team&limit=50"

# Get running runs for a specific user
curl -H "Authorization: Bearer YOUR_API_KEY" \
     "https://jaws-api.jgi.doe.gov/api/v3/runs?user_id=jdoe&status=running"

# Get recent runs (sorted by ID descending)
curl -H "Authorization: Bearer YOUR_API_KEY" \
     "https://jaws-api.jgi.doe.gov/api/v3/runs?order_by=-id&limit=10"

Example with Python

import requests

API_KEY = "your_api_key_here"
API_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"

headers = {"Authorization": f"Bearer {API_KEY}"}

# Example 1: Get all runs for a team
params = {
    "team_id": "my_team",
    "limit": 50
}
response = requests.get(API_URL, headers=headers, params=params)

if response.status_code == 200:
    runs = response.json()
    print(f"Found {len(runs)} runs for team 'my_team'")
    for run in runs:
        print(f"  Run {run['id']}: {run['workflow_name']} - {run['status']}")

# Example 2: Get running runs for a user
params = {
    "user_id": "jdoe",
    "status": "running"
}
response = requests.get(API_URL, headers=headers, params=params)

if response.status_code == 200:
    running_runs = response.json()
    print(f"\n{len(running_runs)} running runs for user 'jdoe'")

# Example 3: Get recent runs with pagination
params = {
    "order_by": "-id",
    "limit": 10,
    "offset": 0
}
response = requests.get(API_URL, headers=headers, params=params)

if response.status_code == 200:
    recent_runs = response.json()
    print(f"\n{len(recent_runs)} most recent runs:")
    for run in recent_runs:
        print(f"  {run['id']}: {run['workflow_name']} ({run['status']})")

# Example 4: Filter by compute site and status
params = {
    "compute_site_id": "perlmutter",
    "status": "succeeded",
    "limit": 25
}
response = requests.get(API_URL, headers=headers, params=params)

if response.status_code == 200:
    perlmutter_runs = response.json()
    print(f"\n{len(perlmutter_runs)} succeeded runs on Perlmutter")

Response (200 OK)

Returns an array of run objects:

[
  {
    "id": 12345,
    "user_id": "jdoe",
    "workflow_name": "example_workflow",
    "status": "succeeded",
    "team_id": "my_team",
    "compute_site_id": "dori",
    "submitted": "2026-05-11T14:30:00Z",
    "updated": "2026-05-11T16:00:00Z"
  },
  {
    "id": 12344,
    "user_id": "jdoe",
    "workflow_name": "another_workflow",
    "status": "running",
    "team_id": "my_team",
    "compute_site_id": "dori",
    "submitted": "2026-05-11T13:00:00Z",
    "updated": "2026-05-11T15:45:00Z"
  }
]

Error Responses

404 Not Found - Run does not exist:

{
  "detail": "Run not found"
}

422 Unprocessable Entity - Invalid request parameters:

{
  "detail": [
    {
      "loc": ["body", "data", "max_ram_gb"],
      "msg": "ensure this value is less than or equal to 1024",
      "type": "value_error.number.not_le"
    }
  ]
}

401 Unauthorized - Invalid or missing API key:

{
  "detail": "Invalid authentication credentials"
}

403 Forbidden - Insufficient permissions:

{
  "detail": "Insufficient permissions to access this resource"
}
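
The detail field is a string in some of the responses above and a list of objects in the 422 case, so client code that prints errors needs to handle both shapes. A minimal sketch follows, based only on the example bodies shown here. describe_error is a hypothetical helper.

```python
def describe_error(status_code, body):
    """Turn a decoded error body into a one-line message."""
    detail = body.get("detail") if isinstance(body, dict) else None
    if isinstance(detail, list):
        # 422 shape: one entry per invalid field, with "loc" and "msg".
        parts = []
        for item in detail:
            location = ".".join(str(part) for part in item.get("loc", []))
            parts.append(f"{location}: {item.get('msg', '')}")
        message = "; ".join(parts)
    elif isinstance(detail, str):
        # 401, 403, and 404 shape: a plain string.
        message = detail
    else:
        message = "no detail provided"
    return f"HTTP {status_code}: {message}"
```

For the 422 example above this yields a message naming body.data.max_ram_gb together with the validation text.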

Best Practices

  1. Run Submission

    • Stage input data in the appropriate location with correct permissions before submission

    • Use descriptive tag values to identify runs later

    • Set max_ram_gb to match the maximum RAM requested in your WDL. We require this parameter so that we can validate the request quickly at the API level.

    • Set max_ram_gb appropriately to avoid resource waste.

    • Enable caching (default) to leverage Cromwell call-caching and speed up reruns by reusing previously computed results

  2. Monitoring

    • Poll at reasonable intervals (recommended: 30-60 seconds)

    • Check for terminal states: succeeded, failed, cancelled, done

  3. Searching

    • Use filters to reduce result set size

    • Implement pagination for large result sets (max 100 per request)

    • Sort by -id to get most recent runs first

  4. Error Handling

    • Always check HTTP status codes

    • Retry transient errors (5xx) with exponential backoff

    • Log submission details for troubleshooting failed runs

    • Validate metadata before submission to catch errors early

  5. Performance

    • Cache workflow files when submitting multiple runs

    • Use the same workflow_name and workflow_tag for deduplication

    • Close file handles after submission

    • Limit concurrent submissions to avoid overwhelming the API
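
The retry advice under Error Handling can be sketched as follows. with_retries is a hypothetical helper, and the delays (1, 2, 4 seconds) are illustrative values rather than JAWS recommendations. Retrying is safest for GET requests. Whether a retried POST could create a duplicate run is not covered by this documentation, so treat that case with care.

```python
import time


def with_retries(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff while it returns 5xx.

    send is a zero-argument callable returning an object with a status_code
    attribute, such as: lambda: requests.get(url, headers=headers)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        # Return on success, on client errors, or when attempts are exhausted.
        if response.status_code < 500 or attempt == max_attempts - 1:
            return response
        sleep(base_delay * 2 ** attempt)
```

Client errors such as 401, 404, and 422 are returned immediately, since repeating the same request would not change the outcome.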

Complete Workflow Example

To run the quickstart example, first clone the jaws-tutorial-examples repository:

git clone https://code.jgi.doe.gov/official-jgi-workflows/wdl-specific-repositories/jaws-tutorial-examples.git
cd jaws-tutorial-examples/quickstart

Here’s a complete example demonstrating submission, monitoring, and searching:

import requests
import json
import time

# ============================================================================
# CONFIGURATION - Update these values for your environment
# ============================================================================
API_KEY = "your_api_key_here"              # Your JAWS API key
TEAM_ID = "jgi_genomics"                   # Your team identifier
WORKFLOW_NAME = "my_analysis_workflow"     # Descriptive workflow name
COMPUTE_SITE_ID = "dori"                   # Compute site: dori, perlmutter, jgi, etc.
INPUT_SITE_ID = "dori"                     # Input site (typically same as compute)
OUTPUT_SITE_ID = "dori"
WDL_FILE_PATH = "align.wdl"                # Path to your WDL file
INPUTS_JSON_PATH = "inputs.json"           # Path to your inputs JSON
MAX_RAM_GB = 1                             # Max RAM in GB (1-1024); should match the max RAM in your WDL
RUN_TAG = "api_example_run"                # Tag to identify this run
# ============================================================================

BASE_URL = "https://jaws-api.jgi.doe.gov/api/v3"

headers = {"Authorization": f"Bearer {API_KEY}"}

# Step 1: Submit a run
print("Step 1: Submitting run...")

metadata = {
    "compute_site_id": COMPUTE_SITE_ID,
    "input_site_id": INPUT_SITE_ID,
    "output_site_id": OUTPUT_SITE_ID,
    "team_id": TEAM_ID,
    "workflow_name": WORKFLOW_NAME,
    "max_ram_gb": MAX_RAM_GB,
    "tag": RUN_TAG
}

# Open the files in a context manager so they are closed even if the request fails
with open(WDL_FILE_PATH, 'rb') as wdl_file, open(INPUTS_JSON_PATH, 'rb') as inputs_json:
    files = {
        'wdl_file': wdl_file,
        'inputs_json': inputs_json,
        'data': (None, json.dumps(metadata), 'application/json')
    }
    response = requests.post(f"{BASE_URL}/runs", headers=headers, files=files)

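if response.status_code != 201:
    # Print the raw body: an error response is not guaranteed to be JSON
    print(f"Failed to submit ({response.status_code}): {response.text}")
    raise SystemExit(1)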

run_data = response.json()
run_id = run_data['run_id']
print(f"✓ Run {run_id} submitted successfully\n")

# Step 2: Monitor the run
print(f"Step 2: Monitoring run {run_id}...")

while True:
    response = requests.get(f"{BASE_URL}/runs/{run_id}", headers=headers)

    if response.status_code == 200:
        run = response.json()
        status = run['status']
        print(f"  Status: {status}")

        # Check if terminal state
        if status in ['succeeded', 'failed', 'cancelled', 'done']:
            print(f"\n✓ Run completed with status: {status}")
            print(f"  Result: {run.get('result', 'N/A')}")
            print(f"  CPU Hours: {run.get('cpu_hours', 0)}")
            print(f"  Output Dir: {run.get('output_dir', 'N/A')}")
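            break
    elif response.status_code == 404:
        print(f"✗ Run {run_id} not found")
        break
    else:
        print(f"✗ Error checking status: {response.status_code}")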

    # Wait before next poll (30 seconds recommended)
    time.sleep(30)

# Step 3: Search for related runs
print(f"\nStep 3: Finding other runs with tag {RUN_TAG} ...")

# Note: Searching by tag requires filtering client-side
params = {
    "team_id": TEAM_ID,
    "limit": 100
}

response = requests.get(f"{BASE_URL}/runs", headers=headers, params=params)

if response.status_code == 200:
    all_runs = response.json()
    tagged_runs = [r for r in all_runs if r.get('tag') == RUN_TAG]

    print(f"Found {len(tagged_runs)} runs with this tag:")
    for run in tagged_runs:
        print(f"  Run {run['id']}: {run['status']}")