===========================
JAWS API v3 - Runs Endpoint
===========================

.. role:: bash(code)
   :language: bash

.. warning::
   The JAWS API v3 is currently **experimental** and under **active development**.
   Endpoints, request/response formats, and behavior may change without notice.

.. note::
   This documentation is for users accessing the JAWS API programmatically. If you're using
   the JAWS CLI (``jaws`` command), see :doc:`JAWS Commands <jaws_usage>` instead.

Overview
--------

The ``/runs`` endpoint provides programmatic access to submit, monitor, and search workflow runs in JAWS.
All requests require authentication via API key.

**Key Features:**

- **Submit Runs** - Upload workflow files and create new runs
- **Monitor Status** - Track run progress and retrieve detailed status
- **Search & Filter** - Query runs by team, user, status, site, and more

API Environments
----------------

JAWS provides multiple environments:

- **Production:** ``https://jaws-api.jgi.doe.gov/api/v3``
- **Staging:** ``https://jaws-api-staging.jgi.doe.gov/api/v3`` (for testing with select users)

.. note::
   The examples below use the production environment. To use staging,
   simply replace the base URL.

Authentication
--------------

All requests require an API key in the Authorization header:

.. code-block:: text

    Authorization: Bearer YOUR_API_KEY

Contact your JAWS administrator for an API key.
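
To keep the key out of source files, scripts can read it from the environment. The sketch below
builds the header shown above; the ``JAWS_API_KEY`` variable name and the ``auth_headers`` helper
are assumptions made for this example, not something JAWS defines.

.. code-block:: python

    import os


    def auth_headers(env_var="JAWS_API_KEY"):
        """Build the Authorization header from an environment variable.

        JAWS_API_KEY is a hypothetical variable name chosen for this example.
        """
        api_key = os.environ.get(env_var, "").strip()
        if not api_key:
            raise RuntimeError(f"Set the {env_var} environment variable to your API key")
        return {"Authorization": f"Bearer {api_key}"}

The returned dictionary can be passed as the ``headers`` argument in the Python examples below.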

Submitting a Run
----------------

Create a new workflow run by uploading a WDL file and an inputs JSON file.

API Endpoint
++++++++++++

.. code-block:: text

    POST https://jaws-api.jgi.doe.gov/api/v3/runs

Request Parameters
++++++++++++++++++

**Multipart Form Data:**

- :bash:`wdl_file` (file, required): WDL workflow file
- :bash:`inputs_json` (file, required): Inputs JSON file
- :bash:`subworkflows` (file, optional): Subworkflows ZIP archive
- :bash:`data` (JSON string, required): Run metadata with the following fields:

**Required metadata fields:**

- ``compute_site_id`` (string): Target compute site (e.g., "dori", "jgi", "perlmutter")
- ``input_site_id`` (string): Site where input files are located
- ``team_id`` (string): Team identifier
- ``workflow_name`` (string): Semantic workflow name
- ``max_ram_gb`` (integer): Maximum RAM in GB (1-1024). This should match the maximum RAM specified in your WDL.

**Optional metadata fields:**

- ``output_site_id`` (string): Site to which outputs are transferred (defaults to ``input_site_id``)
- ``workflow_tag`` (string): Version or tag for the workflow
- ``caching`` (boolean): Enable Cromwell call caching (default: true)
- ``tag`` (string): User-defined tag for this run
- ``manifest`` (array): List of input file paths to transfer
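
A quick client-side check of the metadata can catch mistakes before the upload. The following is a
minimal sketch based only on the field list above; ``validate_metadata`` is a hypothetical helper,
and the API performs its own validation regardless.

.. code-block:: python

    # Required fields and their expected Python types, per the list above
    REQUIRED_FIELDS = {
        "compute_site_id": str,
        "input_site_id": str,
        "team_id": str,
        "workflow_name": str,
        "max_ram_gb": int,
    }


    def validate_metadata(metadata):
        """Return a list of problems found in a run metadata dict (empty if none)."""
        problems = []
        for field, expected_type in REQUIRED_FIELDS.items():
            if field not in metadata:
                problems.append(f"missing required field: {field}")
                continue
            value = metadata[field]
            # bool is a subclass of int in Python, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, expected_type):
                problems.append(f"{field} must be of type {expected_type.__name__}")
        ram = metadata.get("max_ram_gb")
        if isinstance(ram, int) and not isinstance(ram, bool) and not 1 <= ram <= 1024:
            problems.append("max_ram_gb must be between 1 and 1024")
        return problems

An empty list means the dictionary passed these basic checks; it says nothing about whether the
site or team identifiers actually exist.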

Input Data Staging (v3 Requirement)
++++++++++++++++++++++++++++++++++++

.. important::
   **In API v3, you are responsible for staging your input data.** Unlike previous versions where
   the JAWS client handled data movement, v3 requires you to:

   1. **Move your input files** to the staging area on the input site before submission
   2. **Set correct file permissions** to ensure JAWS can access the files
   3. **Reference the staged paths** in your inputs JSON file using the full absolute path

**Staging Directory Structure:**

The staging directory path format is:

.. code-block:: text

    /clusterfs/jgi/scratch/dsi/aa/jaws/dori-[ENV]/inputs/[SITE]/

Where:

- ``[ENV]`` is the environment (``prod`` or ``staging``)
- ``[SITE]`` is the site name (currently ``dori`` for MVP)

**Examples:**

- Production Dori: ``/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/``
- Staging Dori: ``/clusterfs/jgi/scratch/dsi/aa/jaws/dori-staging/inputs/dori/``

**How Staging Works:**

Your staged file path = staging directory + your data's absolute path

For example, if your original data is at ``/home/user/project/data/sample.fastq``, you would:

1. Move it to: ``/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq``
2. Reference in inputs.json: ``/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq``
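
The path rule above can be expressed as a small function. This is an illustrative sketch:
``staged_path`` and ``STAGING_ROOT`` are names invented for this example, and the root pattern is
copied from the staging directory format described in this section.

.. code-block:: python

    from pathlib import PurePosixPath

    # Staging root pattern from this section; [ENV] and [SITE] are filled in below
    STAGING_ROOT = "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-{env}/inputs/{site}"


    def staged_path(original_path, env="prod", site="dori"):
        """Return the staged location for an absolute path on the input site."""
        original = PurePosixPath(original_path)
        if not original.is_absolute():
            raise ValueError(f"expected an absolute path, got {original_path!r}")
        root = PurePosixPath(STAGING_ROOT.format(env=env, site=site))
        # Drop the leading "/" so the original path nests under the staging root
        return str(root / original.relative_to("/"))


    print(staged_path("/home/user/project/data/sample.fastq"))
    # /clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/user/project/data/sample.fastq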

**File Permission Requirements:**

- Files must be readable by the JAWS execution user
- Recommended permissions: ``chmod 644`` for files, ``chmod 755`` for directories
- Ensure parent directories are executable (``chmod +x``)
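
Permissions can be checked before submission. The sketch below inspects only the "other"
permission bits, on the assumption that the JAWS execution user is neither the owner nor in the
group of the staged files; that assumption, and the ``permission_problems`` helper itself, are
illustrative rather than part of JAWS.

.. code-block:: python

    import os
    import stat


    def permission_problems(path, staging_root):
        """List permission problems for a staged file and its parents up to staging_root."""
        problems = []
        path = os.path.abspath(path)
        staging_root = os.path.abspath(staging_root)

        # The file itself must be readable by "other" users (e.g. mode 644)
        if not (stat.S_IMODE(os.stat(path).st_mode) & stat.S_IROTH):
            problems.append(f"not world-readable: {path}")

        # Every directory between the file and the staging root must be traversable
        parent = os.path.dirname(path)
        while os.path.commonpath([parent, staging_root]) == staging_root:
            if not (stat.S_IMODE(os.stat(parent).st_mode) & stat.S_IXOTH):
                problems.append(f"not world-executable: {parent}")
            if parent == staging_root:
                break
            parent = os.path.dirname(parent)
        return problems

An empty list means the file and its parent directories (up to and including ``staging_root``)
passed the check.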

**Example workflow for Dori (production):**

.. code-block:: bash

    # Assume your data is in /home/jdoe/myproject/inputs/
    STAGING_PATH="/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori"
    DATA_PATH="/home/jdoe/myproject/inputs"

    # 1. Create the directory structure in staging
    mkdir -p "$STAGING_PATH/home/jdoe/myproject/inputs"

    # 2. Copy your input files to the staging area
    cp "$DATA_PATH/sample.fastq" "$STAGING_PATH/home/jdoe/myproject/inputs/"
    cp "$DATA_PATH/reference.fasta" "$STAGING_PATH/home/jdoe/myproject/inputs/"

    # 3. Set proper permissions
    chmod 755 "$STAGING_PATH/home/jdoe/myproject/inputs"
    chmod 644 "$STAGING_PATH/home/jdoe/myproject/inputs"/*

**Example inputs.json with staged file paths:**

.. code-block:: json

    {
      "workflow.input_fastq": "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/jdoe/myproject/inputs/sample.fastq",
      "workflow.reference": "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-prod/inputs/dori/home/jdoe/myproject/inputs/reference.fasta"
    }

After staging your data and creating your inputs.json with the full staged paths,
submit the run via API with ``input_site_id="dori"`` and ``compute_site_id="dori"``
(see API submission examples below).

Example with cURL
+++++++++++++++++

.. code-block:: bash

    curl -X POST https://jaws-api.jgi.doe.gov/api/v3/runs \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -F "wdl_file=@workflow.wdl" \
      -F "inputs_json=@inputs.json" \
      -F 'data={
        "compute_site_id": "dori",
        "input_site_id": "dori",
        "output_site_id": "dori",
        "team_id": "my_team",
        "workflow_name": "example_workflow",
        "max_ram_gb": 16,
        "workflow_tag": "v1.0",
        "caching": true,
        "tag": "test_run",
        "manifest": [
          "/path/to/input1.fastq",
          "/path/to/input2.fastq"
        ]
      }'

Example with Python
+++++++++++++++++++

.. code-block:: python

    import requests
    import json

    API_KEY = "your_api_key_here"
    API_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"

    # Prepare metadata
    metadata = {
        "compute_site_id": "dori",
        "input_site_id": "dori",
        "output_site_id": "dori",
        "team_id": "my_team",
        "workflow_name": "example_workflow",
        "max_ram_gb": 16,
        "workflow_tag": "v1.0",
        "caching": True,
        "tag": "test_run",
        "manifest": [
            "/path/to/input1.fastq",
            "/path/to/input2.fastq"
        ]
    }

    # Send the Authorization header with the request
    headers = {
        "Authorization": f"Bearer {API_KEY}"
    }

    # Open the workflow files in a with-block so they are closed automatically
    with open('workflow.wdl', 'rb') as wdl_file, open('inputs.json', 'rb') as inputs_file:
        files = {
            'wdl_file': wdl_file,
            'inputs_json': inputs_file,
            # A (None, value, content_type) tuple sends a form field without a filename
            'data': (None, json.dumps(metadata), 'application/json')
        }

        # Optional: include subworkflows (open the ZIP in the with-statement above)
        # files['subworkflows'] = subworkflows_file

        # Submit the run
        response = requests.post(API_URL, headers=headers, files=files)

    if response.status_code == 201:
        run_data = response.json()
        print("✓ Run submitted successfully!")
        print(f"  Run ID: {run_data['run_id']}")
        print(f"  Workflow: {run_data['workflow_name']}")
        print(f"  Submission ID: {run_data['submission_id']}")
        print(f"  Submitted at: {run_data['submitted_at']}")
    else:
        print(f"✗ Failed to submit run: {response.status_code}")
        # Print the raw body; error responses are not guaranteed to be JSON
        print(response.text)

Response (201 Created)
+++++++++++++++++++++++

.. code-block:: json

    {
      "run_id": 12345,
      "submission_id": "abc123def456",
      "workflow_id": "wf_789xyz",
      "workflow_name": "example_workflow",
      "submitted_at": "2026-05-11T14:30:00.000000Z",
      "submitted_by": "jdoe"
    }

Monitoring a Run
----------------

Retrieve detailed status information for a specific run.

API Endpoint
++++++++++++

.. code-block:: text

    GET https://jaws-api.jgi.doe.gov/api/v3/runs/{run_id}

Example with cURL
+++++++++++++++++

.. code-block:: bash

    curl -H "Authorization: Bearer YOUR_API_KEY" \
         https://jaws-api.jgi.doe.gov/api/v3/runs/12345

Example with Python
+++++++++++++++++++

.. code-block:: python

    import requests

    API_KEY = "your_api_key_here"
    RUN_ID = 12345
    API_URL = f"https://jaws-api.jgi.doe.gov/api/v3/runs/{RUN_ID}"

    headers = {"Authorization": f"Bearer {API_KEY}"}
    response = requests.get(API_URL, headers=headers)

    if response.status_code == 200:
        run = response.json()
        print(f"Run ID: {run['id']}")
        print(f"Status: {run['status']}")
        print(f"Result: {run.get('result', 'N/A')}")
        print(f"Workflow: {run['workflow_name']}")
        print(f"Team: {run['team_id']}")
        print(f"Compute Site: {run['compute_site_id']}")
        print(f"Output Dir: {run.get('output_dir', 'N/A')}")
        print(f"CPU Hours: {run.get('cpu_hours', 0)}")
    elif response.status_code == 404:
        print(f"✗ Run {RUN_ID} not found")
    else:
        print(f"✗ Error: {response.status_code}")

Response (200 OK)
+++++++++++++++++

.. code-block:: json

    {
      "id": 12345,
      "user_id": "jdoe",
      "submission_id": "abc123def456",
      "status": "running",
      "result": null,
      "workflow_name": "example_workflow",
      "workflow_id": "wf_789xyz",
      "team_id": "my_team",
      "compute_site_id": "dori",
      "input_site_id": "dori",
      "output_site_id": "dori",
      "output_dir": "/path/to/outputs/12345",
      "max_ram_gb": 16,
      "caching": true,
      "tag": "test_run",
      "cromwell_run_id": "cromwell-abc-123",
      "wdl_file": "workflow.wdl",
      "json_file": "inputs.json",
      "cpu_hours": 2.5,
      "submitted": "2026-05-11T14:30:00Z",
      "updated": "2026-05-11T15:45:00Z",
      "workflow_root": "/cromwell/executions/workflow/cromwell-abc-123",
      "webhook": null
    }
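
The ``submitted`` and ``updated`` timestamps can be compared to see how long a run has been in the
system. The sketch below assumes both fields are ISO 8601 UTC strings with a trailing ``Z``, as in
the sample response; ``parse_timestamp`` is a helper written for this example.

.. code-block:: python

    from datetime import datetime


    def parse_timestamp(value):
        """Parse a UTC timestamp such as "2026-05-11T14:30:00Z"."""
        # Replace the trailing "Z" so fromisoformat() also works before Python 3.11
        return datetime.fromisoformat(value.replace("Z", "+00:00"))


    # Values taken from the sample response above
    run = {"submitted": "2026-05-11T14:30:00Z", "updated": "2026-05-11T15:45:00Z"}
    elapsed = parse_timestamp(run["updated"]) - parse_timestamp(run["submitted"])
    print(f"Time between submission and last update: {elapsed}")
    # Time between submission and last update: 1:15:00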

Polling for Status Updates
+++++++++++++++++++++++++++

To monitor a run until completion, poll the endpoint at regular intervals:

.. code-block:: python

    import requests
    import time

    API_KEY = "your_api_key_here"
    RUN_ID = 12345
    API_URL = f"https://jaws-api.jgi.doe.gov/api/v3/runs/{RUN_ID}"

    headers = {"Authorization": f"Bearer {API_KEY}"}

    print(f"Monitoring run {RUN_ID}...")

    while True:
        response = requests.get(API_URL, headers=headers)

        if response.status_code == 200:
            run = response.json()
            status = run['status']
            print(f"Status: {status} (Last updated: {run['updated']})")

            # Check if run reached terminal state
            if status in ['succeeded', 'failed', 'cancelled', 'done']:
                print(f"\n✓ Run completed")
                print(f"  Final Status: {status}")
                print(f"  Result: {run.get('result', 'N/A')}")
                print(f"  CPU Hours: {run.get('cpu_hours', 0)}")
                print(f"  Output Dir: {run.get('output_dir', 'N/A')}")
                break
        elif response.status_code == 404:
            print(f"✗ Run not found (may have been deleted)")
            break
        else:
            print(f"✗ Error checking status: {response.status_code}")

        # Wait before next poll (recommended: 30-60 seconds)
        time.sleep(30)
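
The loop above runs until the run reaches a terminal state. A variant with an overall time limit
might look like the sketch below. ``wait_for_run`` is a hypothetical helper, and the 24-hour default
is an arbitrary choice; the status lookup is passed in as a function so the polling logic stays
independent of the HTTP client.

.. code-block:: python

    import time

    # Terminal states as used in the polling example above
    TERMINAL_STATES = {"succeeded", "failed", "cancelled", "done"}


    def wait_for_run(fetch_status, interval=30, timeout=24 * 3600, sleep=time.sleep):
        """Poll fetch_status() until a terminal state or until timeout seconds pass.

        fetch_status is any zero-argument callable returning the run's status string.
        """
        deadline = time.monotonic() + timeout
        while True:
            status = fetch_status()
            if status in TERMINAL_STATES:
                return status
            # Give up if the next poll would fall after the deadline
            if time.monotonic() + interval > deadline:
                raise TimeoutError(f"run still '{status}' after {timeout} seconds")
            sleep(interval)

Here ``fetch_status`` could wrap the ``requests.get`` call from the example above and return
``response.json()['status']``.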

Run Status Values
+++++++++++++++++

Common status values you'll encounter:

.. code-block:: text

    created           - Run accepted, ID assigned
    upload queued     - Waiting to transfer inputs to compute site
    uploading         - Transferring input files
    upload complete   - Inputs transferred successfully
    ready             - Run transferred to compute site
    submitted         - Submitted to Cromwell
    queued            - At least one task is queued
    running           - Workflow is executing
    succeeded         - Run completed successfully
    failed            - Run failed
    cancelled         - Run was cancelled
    download complete - Outputs transferred to output site
    done              - Run fully complete

For complete status descriptions, see :doc:`JAWS Commands <jaws_usage>`.

Searching for Runs
------------------

Query runs with filtering, pagination, and sorting.

API Endpoint
++++++++++++

.. code-block:: text

    GET https://jaws-api.jgi.doe.gov/api/v3/runs

Query Parameters (all optional)
++++++++++++++++++++++++++++++++

- :bash:`user_id` (string): Filter by user ID
- :bash:`team_id` (string): Filter by team ID
- :bash:`status` (string): Filter by status (e.g., "running", "succeeded")
- :bash:`compute_site_id` (string): Filter by compute site
- :bash:`input_site_id` (string): Filter by input site
- :bash:`output_site_id` (string): Filter by output site
- :bash:`order_by` (string): Sort field (prefix with ``-`` for descending, e.g., ``-id``, ``user_id``)
- :bash:`offset` (integer): Pagination offset (default: 0)
- :bash:`limit` (integer): Max results to return (max: 100, default: 25)
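
Because ``limit`` is capped at 100, larger result sets have to be fetched page by page with
``offset``. The generator below is a minimal sketch of that pattern. It assumes that a page shorter
than ``limit`` marks the end of the results, since the documented response carries no total count;
``iter_runs`` and ``fetch_page`` are names invented for this example.

.. code-block:: python

    def iter_runs(fetch_page, page_size=100, **filters):
        """Yield every run matching filters, requesting one page at a time.

        fetch_page(params) is a caller-supplied function that performs the GET
        request with the given query parameters and returns the decoded JSON list.
        """
        offset = 0
        while True:
            page = fetch_page({**filters, "offset": offset, "limit": page_size})
            yield from page
            # A short page means there is nothing left to fetch
            if len(page) < page_size:
                return
            offset += page_size

With ``requests``, ``fetch_page`` could be a small function that calls
``requests.get(API_URL, headers=headers, params=params)`` and returns ``response.json()``.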

Example with cURL
+++++++++++++++++

.. code-block:: bash

    # Get all runs for a team
    curl -H "Authorization: Bearer YOUR_API_KEY" \
         "https://jaws-api.jgi.doe.gov/api/v3/runs?team_id=my_team&limit=50"

    # Get running runs for a specific user
    curl -H "Authorization: Bearer YOUR_API_KEY" \
         "https://jaws-api.jgi.doe.gov/api/v3/runs?user_id=jdoe&status=running"

    # Get recent runs (sorted by ID descending)
    curl -H "Authorization: Bearer YOUR_API_KEY" \
         "https://jaws-api.jgi.doe.gov/api/v3/runs?order_by=-id&limit=10"

Example with Python
+++++++++++++++++++

.. code-block:: python

    import requests

    API_KEY = "your_api_key_here"
    API_URL = "https://jaws-api.jgi.doe.gov/api/v3/runs"

    headers = {"Authorization": f"Bearer {API_KEY}"}

    # Example 1: Get all runs for a team
    params = {
        "team_id": "my_team",
        "limit": 50
    }
    response = requests.get(API_URL, headers=headers, params=params)

    if response.status_code == 200:
        runs = response.json()
        print(f"Found {len(runs)} runs for team 'my_team'")
        for run in runs:
            print(f"  Run {run['id']}: {run['workflow_name']} - {run['status']}")

    # Example 2: Get running runs for a user
    params = {
        "user_id": "jdoe",
        "status": "running"
    }
    response = requests.get(API_URL, headers=headers, params=params)

    if response.status_code == 200:
        running_runs = response.json()
        print(f"\n{len(running_runs)} running runs for user 'jdoe'")

    # Example 3: Get recent runs with pagination
    params = {
        "order_by": "-id",
        "limit": 10,
        "offset": 0
    }
    response = requests.get(API_URL, headers=headers, params=params)

    if response.status_code == 200:
        recent_runs = response.json()
        print(f"\n10 Most recent runs:")
        for run in recent_runs:
            print(f"  {run['id']}: {run['workflow_name']} ({run['status']})")

    # Example 4: Filter by compute site and status
    params = {
        "compute_site_id": "perlmutter",
        "status": "succeeded",
        "limit": 25
    }
    response = requests.get(API_URL, headers=headers, params=params)

    if response.status_code == 200:
        perlmutter_runs = response.json()
        print(f"\n{len(perlmutter_runs)} succeeded runs on Perlmutter")

Response (200 OK)
+++++++++++++++++

Returns an array of run objects:

.. code-block:: json

    [
      {
        "id": 12345,
        "user_id": "jdoe",
        "workflow_name": "example_workflow",
        "status": "succeeded",
        "team_id": "my_team",
        "compute_site_id": "dori",
        "submitted": "2026-05-11T14:30:00Z",
        "updated": "2026-05-11T16:00:00Z"
      },
      {
        "id": 12344,
        "user_id": "jdoe",
        "workflow_name": "another_workflow",
        "status": "running",
        "team_id": "my_team",
        "compute_site_id": "dori",
        "submitted": "2026-05-11T13:00:00Z",
        "updated": "2026-05-11T15:45:00Z"
      }
    ]
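
A search result can be summarized locally, for example by grouping run IDs by status. This is a
small illustrative sketch; ``group_by_status`` is not part of the API, and the sample data is
trimmed from the response above.

.. code-block:: python

    from collections import defaultdict


    def group_by_status(runs):
        """Map each status to the list of run IDs currently in that status."""
        groups = defaultdict(list)
        for run in runs:
            groups[run["status"]].append(run["id"])
        return dict(groups)


    runs = [
        {"id": 12345, "status": "succeeded"},
        {"id": 12344, "status": "running"},
    ]
    print(group_by_status(runs))  # {'succeeded': [12345], 'running': [12344]}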

Error Responses
---------------

**404 Not Found** - Run does not exist:

.. code-block:: json

    {
      "detail": "Run not found"
    }

**422 Unprocessable Entity** - Invalid request parameters:

.. code-block:: json

    {
      "detail": [
        {
          "loc": ["body", "data", "max_ram_gb"],
          "msg": "ensure this value is less than or equal to 1024",
          "type": "value_error.number.not_le"
        }
      ]
    }

**401 Unauthorized** - Invalid or missing API key:

.. code-block:: json

    {
      "detail": "Invalid authentication credentials"
    }

**403 Forbidden** - Insufficient permissions:

.. code-block:: json

    {
      "detail": "Insufficient permissions to access this resource"
    }
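
The ``detail`` field is a string for most errors but a list of per-field entries for 422 responses.
The sketch below flattens both shapes into one line for logging. It relies only on the example
bodies shown above, and ``describe_error`` is a hypothetical helper.

.. code-block:: python

    def describe_error(status_code, payload):
        """Turn a decoded error response body into a single human-readable line."""
        detail = payload.get("detail") if isinstance(payload, dict) else None
        if isinstance(detail, list):
            # 422 responses carry one entry per invalid field
            parts = []
            for item in detail:
                location = ".".join(str(part) for part in item.get("loc", []))
                parts.append(f"{location}: {item.get('msg', '')}")
            message = "; ".join(parts)
        elif detail:
            message = str(detail)
        else:
            message = "no detail provided"
        return f"HTTP {status_code}: {message}"

For the 422 example above, this returns
``HTTP 422: body.data.max_ram_gb: ensure this value is less than or equal to 1024``.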

Best Practices
--------------

1. **Run Submission**

   - **Stage input data** in the appropriate location with correct permissions before submission
   - Use descriptive ``tag`` values to identify runs later
   - Set ``max_ram_gb`` to the maximum RAM requested in your WDL. The API requires this parameter so that it can validate the request quickly at the API level, and an accurate value avoids wasting resources.
   - Enable ``caching`` (default) to leverage Cromwell call-caching and speed up reruns by reusing previously computed results

2. **Monitoring**

   - Poll at reasonable intervals (recommended: 30-60 seconds)
   - Check for terminal states: ``succeeded``, ``failed``, ``cancelled``, ``done``

3. **Searching**

   - Use filters to reduce result set size
   - Implement pagination for large result sets (max 100 per request)
   - Sort by ``-id`` to get most recent runs first

4. **Error Handling**

   - Always check HTTP status codes
   - Retry transient errors (5xx) with exponential backoff
   - Log submission details for troubleshooting failed runs
   - Validate metadata before submission to catch errors early

5. **Performance**

   - Cache workflow files when submitting multiple runs
   - Use the same ``workflow_name`` and ``workflow_tag`` for deduplication
   - Close file handles after submission
   - Limit concurrent submissions to avoid overwhelming the API
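
The retry advice under **Error Handling** can be sketched as follows. ``retry_on_server_error`` is
a hypothetical helper, and the delays are arbitrary defaults. Whether it is safe to resend a
``POST`` submission after a 5xx response depends on server behavior that is not documented here, so
this pattern is most clearly suited to ``GET`` requests.

.. code-block:: python

    import random
    import time


    def retry_on_server_error(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
        """Call send() until it returns a non-5xx response or attempts run out.

        send is a zero-argument callable returning an object with a status_code
        attribute, such as a requests.Response.
        """
        for attempt in range(max_attempts):
            response = send()
            # Return on success, on client errors, or when out of attempts
            if response.status_code < 500 or attempt == max_attempts - 1:
                return response
            # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of random jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))

For example, ``send`` could be ``lambda: requests.get(API_URL, headers=headers, timeout=30)``.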

Complete Workflow Example
--------------------------

To run the quickstart example, first clone the ``jaws-tutorial-examples`` repository:

.. code-block:: bash

   git clone https://code.jgi.doe.gov/official-jgi-workflows/wdl-specific-repositories/jaws-tutorial-examples.git
   cd jaws-tutorial-examples/quickstart

The following script demonstrates submission, monitoring, and searching end to end:

.. code-block:: python

    import requests
    import json
    import time

    # ============================================================================
    # CONFIGURATION - Update these values for your environment
    # ============================================================================
    API_KEY = "your_api_key_here"              # Your JAWS API key
    TEAM_ID = "jgi_genomics"                   # Your team identifier
    WORKFLOW_NAME = "my_analysis_workflow"     # Descriptive workflow name
    COMPUTE_SITE_ID = "dori"                   # Compute site: dori, perlmutter, jgi, etc.
    INPUT_SITE_ID = "dori"                     # Input site (typically same as compute)
    OUTPUT_SITE_ID = "dori"
    WDL_FILE_PATH = "align.wdl"                # Path to your WDL file
    INPUTS_JSON_PATH = "inputs.json"           # Path to your inputs JSON
    MAX_RAM_GB = 1                             # Max RAM in GB (1-1024); should match the maximum in your WDL
    RUN_TAG = "api_example_run"                # Tag to identify this run
    # ============================================================================

    BASE_URL = "https://jaws-api.jgi.doe.gov/api/v3"

    headers = {"Authorization": f"Bearer {API_KEY}"}

    # Step 1: Submit a run
    print("Step 1: Submitting run...")

    metadata = {
        "compute_site_id": COMPUTE_SITE_ID,
        "input_site_id": INPUT_SITE_ID,
        "output_site_id": OUTPUT_SITE_ID,
        "team_id": TEAM_ID,
        "workflow_name": WORKFLOW_NAME,
        "max_ram_gb": MAX_RAM_GB,
        "tag": RUN_TAG
    }

    # Open the files in a with-block so they are closed even if the request fails
    with open(WDL_FILE_PATH, 'rb') as wdl_file, open(INPUTS_JSON_PATH, 'rb') as inputs_file:
        files = {
            'wdl_file': wdl_file,
            'inputs_json': inputs_file,
            'data': (None, json.dumps(metadata), 'application/json')
        }
        response = requests.post(f"{BASE_URL}/runs", headers=headers, files=files)

    if response.status_code != 201:
        # SystemExit with a message prints it to stderr and exits with status 1
        raise SystemExit(f"Failed to submit ({response.status_code}): {response.text}")

    run_data = response.json()
    run_id = run_data['run_id']
    print(f"✓ Run {run_id} submitted successfully\n")

    # Step 2: Monitor the run
    print(f"Step 2: Monitoring run {run_id}...")

    while True:
        response = requests.get(f"{BASE_URL}/runs/{run_id}", headers=headers)

        if response.status_code == 200:
            run = response.json()
            status = run['status']
            print(f"  Status: {status}")

            # Check if terminal state
            if status in ['succeeded', 'failed', 'cancelled', 'done']:
                print(f"\n✓ Run completed with status: {status}")
                print(f"  Result: {run.get('result', 'N/A')}")
                print(f"  CPU Hours: {run.get('cpu_hours', 0)}")
                print(f"  Output Dir: {run.get('output_dir', 'N/A')}")
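                break
        elif response.status_code == 404:
            raise SystemExit(f"Run {run_id} not found")
        else:
            # Report other errors and keep polling; they may be transient
            print(f"  Error checking status: {response.status_code}")

        # Wait before next poll (30 seconds recommended)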
        time.sleep(30)

    # Step 3: Search for related runs
    print(f"\nStep 3: Finding other runs with tag {RUN_TAG} ...")

    # Note: Searching by tag requires filtering client-side
    params = {
        "team_id": TEAM_ID,
        "limit": 100
    }

    response = requests.get(f"{BASE_URL}/runs", headers=headers, params=params)

    if response.status_code == 200:
        all_runs = response.json()
        tagged_runs = [r for r in all_runs if r.get('tag') == RUN_TAG]

        print(f"Found {len(tagged_runs)} runs with this tag:")
        for run in tagged_runs:
            print(f"  Run {run['id']}: {run['status']}")

Related Documentation
---------------------

See also:

- :doc:`JAWS Commands <jaws_usage>` - CLI commands for workflow submission
- :doc:`JAWS Teams <jaws_teams>` - Team management and permissions
- :doc:`JAWS Configuration <jaws_config>` - Setting up JAWS client