===========================
JAWS API v3 - Runs Endpoint
===========================

.. role:: bash(code)
   :language: bash

.. warning::

    The JAWS API v3 is currently **experimental** and under **active development**.
    Endpoints, request/response formats, and behavior may change without notice.

.. note::

    This documentation is for users accessing the JAWS API programmatically.
    If you're using the JAWS CLI (``jaws`` command), see :doc:`JAWS Commands ` instead.

Overview
--------

The ``/runs`` endpoint provides programmatic access to submit, monitor, and search workflow runs in JAWS. All requests require authentication via API key.

**Key Features:**

- **Submit Runs** - Upload workflow files and create new runs
- **Monitor Status** - Track run progress and retrieve detailed status
- **Search & Filter** - Query runs by team, user, status, site, and more

API Environments
----------------

JAWS provides multiple environments:

- **Production:** ``https://jaws-api.jgi.doe.gov/api/v3``
- **Staging:** ``https://jaws-api-staging.jgi.doe.gov/api/v3`` (for testing with select users)

.. note::

    The examples below use the production environment. To use staging, replace the base URL.

Authentication
--------------

All requests require an API key in the Authorization header:

.. code-block:: bash

    Authorization: Bearer YOUR_API_KEY

Contact your JAWS administrator for an API key.

Submitting a Run
----------------

Create a new workflow run by uploading a WDL file and an inputs file.

API Endpoint
++++++++++++

.. code-block:: text

    POST https://jaws-api.jgi.doe.gov/api/v3/runs

Request Parameters
++++++++++++++++++

**Multipart Form Data:**

- :bash:`wdl_file` (file, required): WDL workflow file
- :bash:`inputs_json` (file, required): Inputs JSON file
- :bash:`subworkflows` (file, optional): Subworkflows ZIP archive
- :bash:`data` (JSON string, required): Run metadata with the following fields:

**Required metadata fields:**

- ``compute_site_id`` (string): Target compute site (e.g., "dori", "jgi", "perlmutter")
- ``input_site_id`` (string): Site where input files are located
- ``team_id`` (string): Team identifier
- ``workflow_name`` (string): Semantic workflow name
- ``max_ram_gb`` (integer): Maximum RAM in GB (1-1024). This should match the max RAM specified in your WDL.

**Optional metadata fields:**

- ``output_site_id`` (string): Where to transfer outputs (defaults to ``input_site_id``)
- ``workflow_tag`` (string): Version or tag for the workflow
- ``caching`` (boolean): Enable Cromwell call caching (default: true)
- ``tag`` (string): User-defined tag for this run
- ``manifest`` (array): List of input file paths to transfer

Input Data Staging (v3 Requirement)
++++++++++++++++++++++++++++++++++++

.. important::

    **In API v3, you are responsible for staging your input data.**

    Unlike previous versions where the JAWS client handled data movement, v3 requires you to:

    1. **Move your input files** to the staging area on the input site before submission
    2. **Set correct file permissions** to ensure JAWS can access the files
    3. **Reference the staged paths** in your inputs JSON file using the full absolute path

**Staging Directory Structure:**

The staging directory path format is:

.. code-block:: text

    /clusterfs/jgi/scratch/dsi/aa/jaws/dori-[ENV]/inputs/[SITE]/

Where:

- ``[ENV]`` = environment (``prod`` or ``staging``)
- ``[SITE]`` = site name (currently ``dori`` for MVP)

**Examples:**

- Production Dori: ``/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/``
- Staging Dori: ``/clusterfs/jgi/scratch/dsi/aa/jaws/dori-staging/inputs/dori/``

**How Staging Works:**

Your staged file path = staging directory + your data's absolute path

For example, if your original data is at ``/home/user/project/data/sample.fastq``, you would:

1. Move it to: ``/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq``
2. Reference in inputs.json: ``/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq``

**File Permission Requirements:**

- Files must be readable by the JAWS execution user
- Recommended permissions: ``chmod 644`` for files, ``chmod 755`` for directories
- Ensure parent directories are executable (``chmod +x``)

**Example workflow for Dori (production):**

.. code-block:: bash

    # Assume your data is in /home/jdoe/myproject/inputs/
    STAGING_PATH="/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori"
    DATA_PATH="/home/jdoe/myproject/inputs"

    # 1. Create the directory structure in staging
    mkdir -p "$STAGING_PATH/home/jdoe/myproject/inputs"

    # 2. Copy your input files to the staging area
    cp "$DATA_PATH/sample.fastq" "$STAGING_PATH/home/jdoe/myproject/inputs/"
    cp "$DATA_PATH/reference.fasta" "$STAGING_PATH/home/jdoe/myproject/inputs/"

    # 3. Set proper permissions
    chmod 755 "$STAGING_PATH/home/jdoe/myproject/inputs"
    chmod 644 "$STAGING_PATH/home/jdoe/myproject/inputs"/*

**Example inputs.json with staged file paths:**

.. code-block:: json

    {
      "workflow.input_fastq": "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/jdoe/myproject/inputs/sample.fastq",
      "workflow.reference": "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/jdoe/myproject/inputs/reference.fasta"
    }

After staging your data and creating your inputs.json with the full staged paths, submit the run via API with ``input_site_id="dori"`` and ``compute_site_id="dori"`` (see API submission examples below).

Example with cURL
+++++++++++++++++

.. code-block:: bash

    curl -X POST https://jaws-api.jgi.doe.gov/api/v3/runs \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -F "wdl_file=@workflow.wdl" \
      -F "inputs_json=@inputs.json" \
      -F 'data={
        "compute_site_id": "dori",
        "input_site_id": "dori",
        "output_site_id": "dori",
        "team_id": "my_team",
        "workflow_name": "example_workflow",
        "max_ram_gb": 16,
        "workflow_tag": "v1.0",
        "caching": true,
        "tag": "test_run",
        "manifest": [
          "/path/to/input1.fastq",
          "/path/to/input2.fastq"
        ]
      }'

Example with Python
+++++++++++++++++++

.. code-block:: python

    import requests
    import json

    API_KEY = "your_api_key_here"
    API_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"

    # Prepare metadata
    metadata = {
        "compute_site_id": "dori",
        "input_site_id": "dori",
        "output_site_id": "dori",
        "team_id": "my_team",
        "workflow_name": "example_workflow",
        "max_ram_gb": 16,
        "workflow_tag": "v1.0",
        "caching": True,
        "tag": "test_run",
        "manifest": [
            "/path/to/input1.fastq",
            "/path/to/input2.fastq"
        ]
    }

    # Prepare files and data
    files = {
        'wdl_file': open('workflow.wdl', 'rb'),
        'inputs_json': open('inputs.json', 'rb'),
        'data': (None, json.dumps(metadata), 'application/json')
    }

    # Optional: include subworkflows
    # files['subworkflows'] = open('subworkflows.zip', 'rb')

    headers = {
        "Authorization": f"Bearer {API_KEY}"
    }

    try:
        # Submit the run
        response = requests.post(API_URL, headers=headers, files=files)
    finally:
        # Close file handles even if the request raises
        for f in files.values():
            if hasattr(f, 'close'):
                f.close()

    if response.status_code == 201:
        run_data = response.json()
        print("✓ Run submitted successfully!")
        print(f"  Run ID: {run_data['run_id']}")
        print(f"  Workflow: {run_data['workflow_name']}")
        print(f"  Submission ID: {run_data['submission_id']}")
        print(f"  Submitted at: {run_data['submitted_at']}")
    else:
        print(f"✗ Failed to submit run: {response.status_code}")
        # Print the raw body; an error response is not guaranteed to be JSON
        print(response.text)

Response (201 Created)
+++++++++++++++++++++++

.. code-block:: json

    {
      "run_id": 12345,
      "submission_id": "abc123def456",
      "workflow_id": "wf_789xyz",
      "workflow_name": "example_workflow",
      "submitted_at": "2026-05-11T14:30:00.000000Z",
      "submitted_by": "jdoe"
    }

Monitoring a Run
----------------

Retrieve detailed status information for a specific run.

API Endpoint
++++++++++++

.. code-block:: text

    GET https://jaws-api.jgi.doe.gov/api/v3/runs/{run_id}

Example with cURL
+++++++++++++++++

.. code-block:: bash

    curl -H "Authorization: Bearer YOUR_API_KEY" \
      https://jaws-api.jgi.doe.gov/api/v3/runs/12345

Example with Python
+++++++++++++++++++

.. code-block:: python

    import requests

    API_KEY = "your_api_key_here"
    RUN_ID = 12345
    API_URL = f"https://jaws-api.jgi.doe.gov/api/v3/runs/{RUN_ID}"

    headers = {"Authorization": f"Bearer {API_KEY}"}

    response = requests.get(API_URL, headers=headers)

    if response.status_code == 200:
        run = response.json()
        print(f"Run ID: {run['id']}")
        print(f"Status: {run['status']}")
        print(f"Result: {run.get('result', 'N/A')}")
        print(f"Workflow: {run['workflow_name']}")
        print(f"Team: {run['team_id']}")
        print(f"Compute Site: {run['compute_site_id']}")
        print(f"Output Dir: {run.get('output_dir', 'N/A')}")
        print(f"CPU Hours: {run.get('cpu_hours', 0)}")
    elif response.status_code == 404:
        print(f"✗ Run {RUN_ID} not found")
    else:
        print(f"✗ Error: {response.status_code}")

Response (200 OK)
+++++++++++++++++

.. code-block:: json

    {
      "id": 12345,
      "user_id": "jdoe",
      "submission_id": "abc123def456",
      "status": "running",
      "result": null,
      "workflow_name": "example_workflow",
      "workflow_id": "wf_789xyz",
      "team_id": "my_team",
      "compute_site_id": "dori",
      "input_site_id": "dori",
      "output_site_id": "dori",
      "output_dir": "/path/to/outputs/12345",
      "max_ram_gb": 16,
      "caching": true,
      "tag": "test_run",
      "cromwell_run_id": "cromwell-abc-123",
      "wdl_file": "workflow.wdl",
      "json_file": "inputs.json",
      "cpu_hours": 2.5,
      "submitted": "2026-05-11T14:30:00Z",
      "updated": "2026-05-11T15:45:00Z",
      "workflow_root": "/cromwell/executions/workflow/cromwell-abc-123",
      "webhook": null
    }

Polling for Status Updates
+++++++++++++++++++++++++++

To monitor a run until completion, poll the endpoint at regular intervals:

.. code-block:: python

    import requests
    import time

    API_KEY = "your_api_key_here"
    RUN_ID = 12345
    API_URL = f"https://jaws-api.jgi.doe.gov/api/v3/runs/{RUN_ID}"

    headers = {"Authorization": f"Bearer {API_KEY}"}

    print(f"Monitoring run {RUN_ID}...")

    while True:
        response = requests.get(API_URL, headers=headers)

        if response.status_code == 200:
            run = response.json()
            status = run['status']

            print(f"Status: {status} (Last updated: {run['updated']})")

            # Check if run reached terminal state
            if status in ['succeeded', 'failed', 'cancelled', 'done']:
                print("\n✓ Run completed")
                print(f"  Final Status: {status}")
                print(f"  Result: {run.get('result', 'N/A')}")
                print(f"  CPU Hours: {run.get('cpu_hours', 0)}")
                print(f"  Output Dir: {run.get('output_dir', 'N/A')}")
                break
        elif response.status_code == 404:
            print("✗ Run not found (may have been deleted)")
            break
        else:
            # Other errors are treated as transient; keep polling
            print(f"✗ Error checking status: {response.status_code}")

        # Wait before next poll (recommended: 30-60 seconds)
        time.sleep(30)

Run Status Values
+++++++++++++++++

Common status values you'll encounter:

.. code-block:: text

    created           - Run accepted, ID assigned
    upload queued     - Waiting to transfer inputs to compute site
    uploading         - Transferring input files
    upload complete   - Inputs transferred successfully
    ready             - Run transferred to compute site
    submitted         - Submitted to Cromwell
    queued            - At least one task is queued
    running           - Workflow is executing
    succeeded         - Run completed successfully
    failed            - Run failed
    cancelled         - Run was cancelled
    download complete - Outputs transferred to output site
    done              - Run fully complete

For complete status descriptions, see :doc:`JAWS Commands `.

Searching for Runs
------------------

Query runs with filtering, pagination, and sorting.

API Endpoint
++++++++++++

.. code-block:: text

    GET https://jaws-api.jgi.doe.gov/api/v3/runs

Query Parameters (all optional)
++++++++++++++++++++++++++++++++

- :bash:`user_id` (string): Filter by user ID
- :bash:`team_id` (string): Filter by team ID
- :bash:`status` (string): Filter by status (e.g., "running", "succeeded")
- :bash:`compute_site_id` (string): Filter by compute site
- :bash:`input_site_id` (string): Filter by input site
- :bash:`output_site_id` (string): Filter by output site
- :bash:`order_by` (string): Sort field (prefix with ``-`` for descending, e.g., ``-id``, ``user_id``)
- :bash:`offset` (integer): Pagination offset (default: 0)
- :bash:`limit` (integer): Max results to return (max: 100, default: 25)

Example with cURL
+++++++++++++++++

.. code-block:: bash

    # Get all runs for a team
    curl -H "Authorization: Bearer YOUR_API_KEY" \
      "https://jaws-api.jgi.doe.gov/api/v3/runs?team_id=my_team&limit=50"

    # Get running runs for a specific user
    curl -H "Authorization: Bearer YOUR_API_KEY" \
      "https://jaws-api.jgi.doe.gov/api/v3/runs?user_id=jdoe&status=running"

    # Get recent runs (sorted by ID descending)
    curl -H "Authorization: Bearer YOUR_API_KEY" \
      "https://jaws-api.jgi.doe.gov/api/v3/runs?order_by=-id&limit=10"

Example with Python
+++++++++++++++++++

.. code-block:: python

    import requests

    API_KEY = "your_api_key_here"
    API_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"

    headers = {"Authorization": f"Bearer {API_KEY}"}

    # Example 1: Get all runs for a team
    params = {
        "team_id": "my_team",
        "limit": 50
    }

    response = requests.get(API_URL, headers=headers, params=params)

    if response.status_code == 200:
        runs = response.json()
        print(f"Found {len(runs)} runs for team 'my_team'")

        for run in runs:
            print(f"  Run {run['id']}: {run['workflow_name']} - {run['status']}")

    # Example 2: Get running runs for a user
    params = {
        "user_id": "jdoe",
        "status": "running"
    }

    response = requests.get(API_URL, headers=headers, params=params)

    if response.status_code == 200:
        running_runs = response.json()
        print(f"\n{len(running_runs)} running runs for user 'jdoe'")

    # Example 3: Get recent runs with pagination
    params = {
        "order_by": "-id",
        "limit": 10,
        "offset": 0
    }

    response = requests.get(API_URL, headers=headers, params=params)

    if response.status_code == 200:
        recent_runs = response.json()
        print("\n10 most recent runs:")

        for run in recent_runs:
            print(f"  {run['id']}: {run['workflow_name']} ({run['status']})")

    # Example 4: Filter by compute site and status
    params = {
        "compute_site_id": "perlmutter",
        "status": "succeeded",
        "limit": 25
    }

    response = requests.get(API_URL, headers=headers, params=params)

    if response.status_code == 200:
        perlmutter_runs = response.json()
        print(f"\n{len(perlmutter_runs)} succeeded runs on Perlmutter")

Response (200 OK)
+++++++++++++++++

Returns an array of run objects:

.. code-block:: json

    [
      {
        "id": 12345,
        "user_id": "jdoe",
        "workflow_name": "example_workflow",
        "status": "succeeded",
        "team_id": "my_team",
        "compute_site_id": "dori",
        "submitted": "2026-05-11T14:30:00Z",
        "updated": "2026-05-11T16:00:00Z"
      },
      {
        "id": 12344,
        "user_id": "jdoe",
        "workflow_name": "another_workflow",
        "status": "running",
        "team_id": "my_team",
        "compute_site_id": "dori",
        "submitted": "2026-05-11T13:00:00Z",
        "updated": "2026-05-11T15:45:00Z"
      }
    ]

Error Responses
---------------

**404 Not Found** - Run does not exist:

.. code-block:: json

    {
      "detail": "Run not found"
    }

**422 Unprocessable Entity** - Invalid request parameters:

.. code-block:: json

    {
      "detail": [
        {
          "loc": ["body", "data", "max_ram_gb"],
          "msg": "ensure this value is less than or equal to 1024",
          "type": "value_error.number.not_le"
        }
      ]
    }

**401 Unauthorized** - Invalid or missing API key:

.. code-block:: json

    {
      "detail": "Invalid authentication credentials"
    }

**403 Forbidden** - Insufficient permissions:

.. code-block:: json

    {
      "detail": "Insufficient permissions to access this resource"
    }

Best Practices
--------------

1. **Run Submission**

   - **Stage input data** in the appropriate location with correct permissions before submission
   - Use descriptive ``tag`` values to identify runs later
   - The parameter ``max_ram_gb`` should match the max RAM in your WDL. We require this parameter so we can do quick validation at the API level.
   - Set ``max_ram_gb`` appropriately to avoid resource waste.
   - Enable ``caching`` (default) to leverage Cromwell call-caching and speed up reruns by reusing previously computed results

2. **Monitoring**

   - Poll at reasonable intervals (recommended: 30-60 seconds)
   - Check for terminal states: ``succeeded``, ``failed``, ``cancelled``, ``done``

3. **Searching**

   - Use filters to reduce result set size
   - Implement pagination for large result sets (max 100 per request)
   - Sort by ``-id`` to get most recent runs first

4. **Error Handling**

   - Always check HTTP status codes
   - Retry transient errors (5xx) with exponential backoff
   - Log submission details for troubleshooting failed runs
   - Validate metadata before submission to catch errors early

5. **Performance**

   - Cache workflow files when submitting multiple runs
   - Use the same ``workflow_name`` and ``workflow_tag`` for deduplication
   - Close file handles after submission
   - Limit concurrent submissions to avoid overwhelming the API

Complete Workflow Example
--------------------------

To run the quickstart example, first clone the ``jaws-tutorial-examples`` repository.

.. code-block:: bash

    git clone https://code.jgi.doe.gov/official-jgi-workflows/wdl-specific-repositories/jaws-tutorial-examples.git
    cd jaws-tutorial-examples/quickstart

Here's a complete example demonstrating submission, monitoring, and searching:

.. code-block:: python

    import requests
    import json
    import time

    # ============================================================================
    # CONFIGURATION - Update these values for your environment
    # ============================================================================
    API_KEY = "your_api_key_here"           # Your JAWS API key
    TEAM_ID = "jgi_genomics"                # Your team identifier
    WORKFLOW_NAME = "my_analysis_workflow"  # Descriptive workflow name
    COMPUTE_SITE_ID = "dori"                # Compute site: dori, perlmutter, jgi, etc.
    INPUT_SITE_ID = "dori"                  # Input site (typically same as compute)
    OUTPUT_SITE_ID = "dori"
    WDL_FILE_PATH = "align.wdl"             # Path to your WDL file
    INPUTS_JSON_PATH = "inputs.json"        # Path to your inputs JSON
    MAX_RAM_GB = 1                          # Max RAM in GB (1-1024). Must match the max RAM in your WDL
    RUN_TAG = "api_example_run"             # Tag to identify this run
    # ============================================================================

    BASE_URL = "https://jaws-api.jgi.doe.gov/api/v3"
    headers = {"Authorization": f"Bearer {API_KEY}"}

    # Step 1: Submit a run
    print("Step 1: Submitting run...")

    metadata = {
        "compute_site_id": COMPUTE_SITE_ID,
        "input_site_id": INPUT_SITE_ID,
        "output_site_id": OUTPUT_SITE_ID,
        "team_id": TEAM_ID,
        "workflow_name": WORKFLOW_NAME,
        "max_ram_gb": MAX_RAM_GB,
        "tag": RUN_TAG
    }

    files = {
        'wdl_file': open(WDL_FILE_PATH, 'rb'),
        'inputs_json': open(INPUTS_JSON_PATH, 'rb'),
        'data': (None, json.dumps(metadata), 'application/json')
    }

    response = requests.post(f"{BASE_URL}/runs", headers=headers, files=files)

    for f in files.values():
        if hasattr(f, 'close'):
            f.close()

    if response.status_code != 201:
        # Print the raw body; an error response is not guaranteed to be JSON
        print(f"Failed to submit: {response.text}")
        raise SystemExit(1)

    run_data = response.json()
    run_id = run_data['run_id']
    print(f"✓ Run {run_id} submitted successfully\n")

    # Step 2: Monitor the run
    print(f"Step 2: Monitoring run {run_id}...")

    while True:
        response = requests.get(f"{BASE_URL}/runs/{run_id}", headers=headers)

        if response.status_code == 200:
            run = response.json()
            status = run['status']
            print(f"  Status: {status}")

            # Check if terminal state
            if status in ['succeeded', 'failed', 'cancelled', 'done']:
                print(f"\n✓ Run completed with status: {status}")
                print(f"  Result: {run.get('result', 'N/A')}")
                print(f"  CPU Hours: {run.get('cpu_hours', 0)}")
                print(f"  Output Dir: {run.get('output_dir', 'N/A')}")
                break
        else:
            # Report the failure instead of polling silently
            print(f"  Error checking status: {response.status_code}")

        # Wait before next poll (30 seconds recommended)
        time.sleep(30)

    # Step 3: Search for related runs
    print(f"\nStep 3: Finding other runs with tag {RUN_TAG} ...")

    # Note: Searching by tag requires filtering client-side
    params = {
        "team_id": TEAM_ID,
        "limit": 100
    }

    response = requests.get(f"{BASE_URL}/runs", headers=headers, params=params)

    if response.status_code == 200:
        all_runs = response.json()
        tagged_runs = [r for r in all_runs if r.get('tag') == RUN_TAG]

        print(f"Found {len(tagged_runs)} runs with this tag:")
        for run in tagged_runs:
            print(f"  Run {run['id']}: {run['status']}")

Related Documentation
---------------------

See also:

- :doc:`JAWS Commands ` - CLI commands for workflow submission
- :doc:`JAWS Teams ` - Team management and permissions
- :doc:`JAWS Configuration ` - Setting up JAWS client
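As a companion to the error-handling advice under Best Practices (retry transient 5xx errors with exponential backoff), the following is a minimal sketch of one way to do that. The helper name ``call_with_backoff`` and its parameters (``max_attempts``, ``base_delay``, ``max_delay``) are hypothetical and not part of JAWS. Treating every 5xx status as retryable is an assumption as well, so adjust it to the errors you actually observe.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-5xx response or attempts run out.

    send is a zero-argument callable that returns an object with a
    .status_code attribute (for example a requests.Response).
    Hypothetical helper for illustration; not part of the JAWS API.
    """
    response = None
    for attempt in range(1, max_attempts + 1):
        response = send()
        # Assumption: only 5xx responses are transient and worth retrying
        if not 500 <= response.status_code < 600:
            return response
        if attempt < max_attempts:
            # Double the delay on each attempt, capped at max_delay
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 50% random jitter so clients do not retry in lockstep
            sleep(delay + random.uniform(0, delay / 2))
    # All attempts returned 5xx; hand the last response back to the caller
    return response
```

With the ``requests`` library used elsewhere on this page, a status check could be wrapped as ``call_with_backoff(lambda: requests.get(API_URL, headers=headers, timeout=30))``. This sketch does not catch connection errors or timeouts raised by ``requests``; handle those separately if you also want to retry them.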