FAQs

Frequently asked questions about JAWS, WDL development, and workflow execution. Find quick answers, code examples, and troubleshooting tips organized by category.

Setup, Authentication, and Environment

These are the setup questions users ask most often. The authoritative setup walk-through is in Configuring the JAWS Client: A Step-by-Step Guide and JAWS Quickstart; the answers here cover the recurring “but what about…” follow-ups.

How do I get a JAWS token?

Fill out the JAWS Token Request form. The JAWS team will send your token. Place it in ~/jaws.conf under the [USER] section and set the file mode:

vi ~/jaws.conf
[USER]
token = <your token>
default_team = <team ID>

chmod 600 ~/jaws.conf

The chmod 600 step is important: it restricts the file so that only you can read or write it. Your token is effectively a password: anyone who can read ~/jaws.conf can act as you in JAWS. On a shared cluster, default file modes are often world-readable, so without chmod 600 your token may be visible to everyone with an account on the system.
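
The permission requirement above can be checked from the shell. The following is a minimal sketch, not part of JAWS: the helper name conf_is_private is hypothetical, and it assumes a find that accepts -perm with an exact octal mode (GNU and BSD find both do). It runs against a throwaway file so your real ~/jaws.conf is not touched.

```shell
# Hypothetical helper (not a JAWS command): succeeds only if the file's
# permission bits are exactly 600 (owner read/write, nothing for group/other).
conf_is_private() {
    [ -n "$(find "$1" -prune -perm 600 2>/dev/null)" ]
}

# Demo on a temporary file standing in for ~/jaws.conf.
demo_conf=$(mktemp)
chmod 644 "$demo_conf"                      # world-readable, like many defaults
conf_is_private "$demo_conf" && before=private || before=exposed
chmod 600 "$demo_conf"                      # the fix from this section
conf_is_private "$demo_conf" && after=private || after=exposed
echo "before: $before, after: $after"
rm -f "$demo_conf"
```

On your own account, something like conf_is_private ~/jaws.conf || chmod 600 ~/jaws.conf would apply the fix only when it is needed.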

Never post your token in a public channel. If it’s posted by accident, ping the team and they’ll rotate it.

Full setup including the per-site jaws.conf template paths is in Configuring the JAWS Client: A Step-by-Step Guide.

My token isn’t working, I get 401 Unauthorized

The full error looks like:

{'detail': 'Provided token is not valid', 'status': 401, 'title': 'Unauthorized', ...}

Check these in order:

  1. No quotes around the token value in ~/jaws.conf. A literal token = "abc..." is invalid; remove the quotes.

  2. No stray whitespace around = or trailing on the value.

  3. You’re sourcing the right per-site setup. Each site has its own jaws.conf template and its own module use path; see the tabs in Configuring the JAWS Client: A Step-by-Step Guide.

  4. The token may have expired or been rotated. Ask in the #jaws channel for a refresh.
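
The first two checks can be scripted. This is a rough sketch under stated assumptions: check_token_line is a hypothetical helper (not a JAWS command), it only looks at the first token line of a jaws.conf-style file, and it only detects quotes and trailing whitespace. It cannot tell whether a token has expired.

```shell
# Hypothetical helper: print "quoted", "trailing-whitespace", "missing",
# or "ok" for the token line of a jaws.conf-style file.
check_token_line() {
    line=$(grep -E '^[[:space:]]*token[[:space:]]*=' "$1" | head -n 1)
    if [ -z "$line" ]; then
        echo "missing"
        return 0
    fi
    value=${line#*=}                 # everything after the first "="
    case $value in
        *\"*|*\'*)     echo "quoted" ;;
        *[[:space:]])  echo "trailing-whitespace" ;;
        *)             echo "ok" ;;
    esac
}

# Demo with made-up token values in a temporary file.
demo=$(mktemp)
printf '[USER]\ntoken = "abc123"\n' > "$demo"
quoted_result=$(check_token_line "$demo")
printf '[USER]\ntoken = abc123 \n' > "$demo"
trailing_result=$(check_token_line "$demo")
printf '[USER]\ntoken = abc123\n' > "$demo"
clean_result=$(check_token_line "$demo")
echo "$quoted_result / $trailing_result / $clean_result"
rm -f "$demo"
```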

How do I activate the JAWS environment on my site?

Use module load jaws after pointing module use at your site’s modulefiles directory. The per-site tabs in Configuring the JAWS Client: A Step-by-Step Guide show the exact paths for DORI, NERSC (Perlmutter / NMDC / KBase / RNome / SuperBio) and Tahoma (NMDC).

# Example: NERSC Perlmutter (JGI)
module use /global/cfs/cdirs/jaws/modulefiles
module load jaws

If you see Warning: a python module is already loaded, run module unload python before loading JAWS.

You can also use the JAWS container or the JAWS client Python library (see Configuring the JAWS Client: A Step-by-Step Guide for details).

Sites and Site Selection

How do I list available compute sites?

Use the jaws teams list-sites command with your team ID. The list of sites depends on the team you belong to. For example, for the team dsi-aa:

jaws teams list-sites dsi-aa
 [
 "defiant",
 "crux",
 "jgi",
 "nmdc_tahoma",
 "perlmutter",
 "tahoma",
 "dori"
 ]

Use the returned site IDs (dori, perlmutter, tahoma, nmdc, nmdc_tahoma, kbase, defiant, and others as they come online) when you submit:

jaws submit my.wdl my.json perlmutter

See JAWS Quickstart for the current list.

Which compute site should I use?

Check which sites are available to your team (for example, jaws teams list-sites dsi-aa) and the site’s queue limits in Specifying Compute Resources in WDL Tasks. If you have a question or a big batch of data, ask in #jaws.

Where are outages and maintenance windows announced?

Three places, in order of immediacy:

  1. #jaws Slack channel: @channel announcements from the JAWS team for ongoing outages and same-day downtime.

  2. JAWS Outages calendar: planned maintenance windows and known downtime, scheduled in advance. This is the best place to look when you’re planning a run a week or two out. (Office hours, user meetings, and workshops live on the separate JAWS Events calendar.)

  3. jaws health: the current state of each site right now. The calendar and Slack tell you what’s expected; jaws health tells you what’s actually happening.

    jaws health
    

WDL Development

Using AI Assistants to Write or Debug WDL

LLMs (CBorg, ChatGPT, Claude, Copilot, etc.) can be a real productivity boost when writing WDL, but they also produce a class of errors that is worth watching for. This section covers what to know before pasting AI-generated WDL into jaws submit.

Tell the AI which WDL spec to target: always 1.0

This is the single most important thing. Out of the box, most AI assistants will generate WDL 1.1 or later, or fall back on bash-style syntax, because that’s what their training data biases toward. JAWS currently supports WDL 1.0 only, because the Cromwell version JAWS runs doesn’t parse 1.1 features. (We’ll upgrade Cromwell and lift that restriction; see “Can I use WDL 1.1 with JAWS?” below. For now, 1.0 is the answer.)

Always start your prompt with something like:

Write a WDL workflow using the WDL 1.0 specification
(https://github.com/openwdl/wdl/blob/legacy/versions/1.0/SPEC.md).
Do not use WDL 1.1 features such as keys(), values(), None,
Directory type, or enhanced struct features. The first line
of every file must be `version 1.0`.

Without this, the AI will often produce keys(my_map) or None literals that look correct but fail validation when you submit. The first symptom is usually a jaws validate error you can’t immediately explain.
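
Before running jaws validate, a quick text scan can catch the most obvious of these slips. The sketch below is only a heuristic and not a substitute for jaws validate: wdl_precheck is a hypothetical helper, and because it matches plain text it can be fooled by comments or string contents.

```shell
# Hypothetical pre-check: flag a WDL file whose first line is not
# "version 1.0" or that appears to call keys()/values() or use None.
wdl_precheck() {
    problems=""
    first_line=$(head -n 1 "$1")
    [ "$first_line" = "version 1.0" ] || problems="$problems bad-version"
    if grep -Eq '(^|[^A-Za-z0-9_])(keys|values)[[:space:]]*\(' "$1"; then
        problems="$problems map-function"
    fi
    if grep -Eq '(^|[^A-Za-z0-9_"])None([^A-Za-z0-9_"]|$)' "$1"; then
        problems="$problems none-literal"
    fi
    problems=${problems# }           # drop the leading space, if any
    echo "${problems:-ok}"
}

# Demo: an AI-style WDL 1.1 snippet, then a minimal WDL 1.0 file.
demo_wdl=$(mktemp)
cat > "$demo_wdl" <<'EOF'
version 1.1
workflow w {
    input { Map[String, Int] m }
    Array[String] k = keys(m)
}
EOF
precheck_bad=$(wdl_precheck "$demo_wdl")
printf 'version 1.0\nworkflow w {}\n' > "$demo_wdl"
precheck_good=$(wdl_precheck "$demo_wdl")
echo "$precheck_bad | $precheck_good"
rm -f "$demo_wdl"
```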

Generating a new WDL with AI

A workflow that works well:

  1. Prompt the AI with: the WDL 1.0 constraint above, a clear description of the task, the input/output types you expect, and the resources the task will need (CPU, memory, runtime_minutes).

  2. Validate before submitting:

    jaws validate workflow.wdl inputs.json
    

    If validation fails, paste the error back to the AI along with a reminder to target WDL 1.0.

  3. Submit a tiny test run with minimal inputs before committing to a big batch.

Things the AI is good at:

  • Boilerplate (task structure, runtime blocks, scatter patterns)

  • Translating a shell script into a command <<< ... >>> block

  • Suggesting input/output type signatures

  • Writing the matching inputs.json

Things the AI is unreliable at:

  • JAWS-specific behavior. For example, the runtime_minutes attribute in the runtime block is a feature of the JAWS Cromwell engine, and the AI might not know that.

  • Whether a particular Docker image is the right one. Verify the registry, the tag, and that the image actually exists on Docker Hub or library.jgi.doe.gov.

Debugging a failing WDL with AI

When a run fails and you want AI help diagnosing it, paste:

  1. The exact error text. JAWS writes an errors.json file to the run’s output directory; that’s the first place to look. To find the path:

    jaws status <RUN_ID>          # look for "output_dir"
    cat <output_dir>/errors.json
    

    If errors.json doesn’t tell you enough, fall back to the task’s own stderr and stdout in the Cromwell execution directory. For runs that failed before the execution dir was copied back, you may need jaws download <RUN_ID> first to pull the failed task directories. Don’t paraphrase the error text; the AI needs the literal output.

  2. The WDL section that’s failing (just the relevant task is enough; don’t paste a 500-line workflow).

  3. The relevant inputs (with sensitive values redacted).

  4. Tell the AI the JAWS context: WDL 1.0, Cromwell as the executor, Shifter or Apptainer as the container runtime depending on the site.

The AI is most useful for:

  • Suggesting alternate patterns when you hit a WDL 1.0 limitation (e.g. “can’t scatter over a Map”; see the “How do I get keys from a Map?” entry).

  • Explaining what a particular runtime attribute does.

The AI is least reliable for telling you whether something is a JAWS infrastructure issue vs. a workflow bug. When in doubt, ask in #jaws with your run ID before assuming the AI is correct.

A prompt template you can copy

I'm writing a WDL workflow for JAWS (JGI Analysis Workflow
Service). Constraints:

- WDL spec: 1.0 ONLY (https://github.com/openwdl/wdl/blob/legacy/versions/1.0/SPEC.md)
- The first line of every file must be `version 1.0`
- Do not use WDL 1.1 features: no keys(), values(), None,
  Directory type, or enhanced struct features
- Each task must have a `runtime {}` block with at least:
  docker, cpu, memory, runtime_minutes
- Use the heredoc command style: `command <<< ... >>>`
- Use ~{var} for WDL interpolation, ${var} for bash inside
  the command block
- Outputs must live inside the task's working directory
  (no /tmp, no absolute paths outside)

Now: <describe the workflow you want>.

Syntax and Language Features

How do I use bash commands with curly braces in WDL?

Problem: Bash uses curly braces for parameter expansion (e.g., ${VAR:-default}), but WDL also uses ${} for variable interpolation, causing conflicts.

Solution: Use the heredoc-style command block (<<< and >>>) with WDL 1.0+:

version 1.0

task example {
    input {
        String? optional_var
    }

    command <<<
        # Set bash default value
        VAR=${VAR:-25}

        # Strip file extension
        myvar=${somefile%.txt}

        # Use WDL variable with ~{} notation
        echo "WDL variable: ~{optional_var}"
    >>>

    runtime {
        docker: "ubuntu:22.04"
    }
}

Key Points:

  • Use <<< >>> delimiters (heredoc style) instead of { }

  • Use ~{variable} for WDL interpolation inside heredoc

  • Bash ${variable} syntax works normally inside heredoc
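
To see what the two bash expansions in the task above do on their own, they can be run in a plain shell outside WDL. This is only an illustration; the file name reads.txt is made up.

```shell
# Plain bash, outside WDL: the two expansions used in the task above.
unset VAR
VAR=${VAR:-25}              # VAR is unset, so the default 25 is used
somefile="reads.txt"        # illustrative file name
myvar=${somefile%.txt}      # strip the shortest trailing match of ".txt"
echo "VAR=$VAR myvar=$myvar"
```

Inside a command <<< ... >>> block these lines reach bash unchanged, because only ~{...} is treated as WDL interpolation there.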

📖 WDL 1.0 Spec: Command Section

How do I output dynamically named files?

Problem: Bash variables created in the command block aren’t accessible in the output block.

Solution 1: Use glob() to match patterns (recommended):

task align_reads {
    command <<<
        # Generate output with dynamic name
        minimap2 ... > "${SAMPLE_ID}.bam"
    >>>

    output {
        Array[File] bam_files = glob("*.bam")
        File first_bam = glob("*.bam")[0]
    }
}

Solution 2: Write filenames to a manifest:

task process_samples {
    input {
        Array[String] samples
    }

    command <<<
        for sample in ~{sep=' ' samples}; do
            analyze.sh "$sample" > "${sample}.result"
            echo "${sample}.result" >> output_manifest.txt
        done
    >>>

    output {
        Array[File] results = read_lines("output_manifest.txt")
    }
}

Available WDL Functions:

  • glob(pattern) - Match file patterns

  • read_lines(file) - Read file into Array[String]

  • read_tsv(file) - Parse TSV into Array[Array[String]]

  • read_json(file) - Parse JSON into WDL types

📖 WDL Standard Library

How do I use conditionals in workflows?

Use if statements to conditionally execute tasks:

version 1.0

workflow conditional_processing {
    input {
        File input_file
        Boolean filter_data = true
        Int min_quality = 30
    }

    call qc_check {
        input: infile = input_file
    }

    # Conditional task execution
    if (qc_check.pass_qc && filter_data) {
        call filter_reads {
            input:
                infile = input_file,
                min_qual = min_quality
        }
    }

    # Use select_first() to handle optional outputs
    File final_file = select_first([filter_reads.filtered_file, input_file])

    call downstream_analysis {
        input: infile = final_file
    }

    output {
        File result = downstream_analysis.output_file
        File? filtered = filter_reads.filtered_file
        Boolean was_filtered = defined(filter_reads.filtered_file)
    }
}

Important Functions:

  • defined(variable) - Check if optional value is set

  • select_first([opt1, opt2, default]) - Pick first defined value

  • select_all(array) - Remove undefined elements from Array

📖 WDL Conditionals

Data Structures and Iteration

How do I scatter over arrays and maps?

Arrays (simple scatter):

workflow process_samples {
    input {
        Array[File] fastq_files
        File reference
    }

    scatter (fastq in fastq_files) {
        call align {
            input:
                reads = fastq,
                ref = reference
        }
    }

    # Gather scattered outputs into array
    output {
        Array[File] all_bams = align.bam
    }
}

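Maps (iterate over key-value pairs). The WDL 1.0 standard library has no as_pairs(), so pass the mapping as an Array[Pair] and scatter over that:

workflow process_configs {
    input {
        Array[Pair[String, String]] sample_to_barcode = [
            ("sample1", "ATCG"),
            ("sample2", "GCTA")
        ]
    }

    scatter (pair in sample_to_barcode) {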
        String sample_id = pair.left
        String barcode = pair.right

        call demultiplex {
            input:
                sample = sample_id,
                barcode = barcode
        }
    }
}

Cross-product scatter (nested):

scatter (sample in samples) {
    scatter (replicate in replicates) {
        call process {
            input:
                s = sample,
                r = replicate
        }
    }
}

# Flatten nested array: Array[Array[File]] → Array[File]
# (output_file stands in for your task's output name; `output` is a reserved word)
Array[File] all_results = flatten(process.output_file)

📚 Example workflows

How do I create custom data structures?

Use struct to define custom types:

version 1.0

struct SampleMetadata {
    String sample_id
    File fastq_r1
    File fastq_r2
    String sequencing_center
    Int read_length
    Float? quality_score
}

workflow analyze_samples {
    input {
        Array[SampleMetadata] samples
    }

    scatter (sample in samples) {
        call process {
            input:
                id = sample.sample_id,
                r1 = sample.fastq_r1,
                r2 = sample.fastq_r2
        }
    }
}

Input JSON:

{
  "analyze_samples.samples": [
    {
      "sample_id": "S001",
      "fastq_r1": "s001_R1.fastq.gz",
      "fastq_r2": "s001_R2.fastq.gz",
      "sequencing_center": "JGI",
      "read_length": 150,
      "quality_score": 35.2
    }
  ]
}

📖 WDL Structs

How do I get keys from a Map?

Workaround for WDL 1.0 (use Pairs):

version 1.0

workflow extract_keys {
    input {
        Array[Pair[String, Int]] data = [
            ("sample1", 100),
            ("sample2", 200)
        ]
    }

    scatter (pair in data) {
        String key = pair.left
        Int value = pair.right
    }

    output {
        Array[String] all_keys = key
        Array[Int] all_values = value
    }
}

Input JSON:

{
  "extract_keys.data": [
    {"Left": "sample1", "Right": 100},
    {"Left": "sample2", "Right": 200}
  ]
}

Note: WDL 1.1+ has keys(map) and values(map) functions, but JAWS currently supports WDL 1.0 only.

Advanced Patterns

How do I handle optional outputs in scatter-gather?

Problem: Scattered tasks may conditionally produce outputs, creating Array[File?].

Solution: Use select_all() and flatten():

version 1.0

workflow optional_scatter {
    input {
        Array[File] samples
        Boolean run_qc = false
    }

    scatter (sample in samples) {
        if (run_qc) {
            call quality_check {
                input: sample_file = sample
            }
        }
    }

    # quality_check.report is Array[File?]
    # Remove undefined elements
    Array[File] valid_reports = select_all(quality_check.report)

    call summarize {
        input: reports = valid_reports
    }

    output {
        File summary = summarize.output_file
        Int num_reports = length(valid_reports)
    }
}

Complex nested example:

# Nested scatter with conditionals
scatter (batch in batches) {
    scatter (sample in batch.samples) {
        if (sample.needs_processing) {
            call process { input: s = sample }
        }
    }
}

# process.output_file is Array[Array[File?]]
# Flatten first, then remove undefined elements
Array[File] all_outputs = select_all(flatten(process.output_file))

📖 WDL Array Functions

Why does jaws validate fail with string concatenation in runtime?

Problem: jaws validate (using miniwdl) is stricter than womtool:

task example {
    input {
        String? tag = "latest"
    }

    runtime {
        docker: "ubuntu:" + tag  # ❌ Fails validation
    }
}

Error:

Cannot add/concatenate String and String?

Solutions:

  1. Remove optional (if always provided):

    input {
        String tag = "latest"  # Not optional
    }
    runtime {
        docker: "ubuntu:" + tag  # ✅ Works
    }
    
  2. Use select_first() (with fallback):

    input {
        String? tag
    }
    runtime {
        docker: "ubuntu:" + select_first([tag, "latest"])  # ✅ Works
    }
    
  3. Use sub() for complex string building:

    runtime {
        docker: sub("ubuntu:TAG", "TAG", select_first([tag, "latest"]))
    }
    

Why it matters:

  • jaws submit uses womtool (permissive)

  • jaws validate uses miniwdl (strict WDL 1.0 compliance)

  • Use jaws validate before submission to catch issues early

Cromwell and Execution

Call Caching

Does Cromwell support checkpointing?

Answer: Cromwell uses call caching instead of traditional checkpointing.

How it works:

When a task completes successfully, Cromwell stores:

  1. Command template hash

  2. Input file content hashes (via MD5)

  3. Docker image digest

  4. Runtime attributes

If you rerun with identical inputs, Cromwell reuses previous outputs.

Enable/disable caching:

# Default: caching enabled
jaws submit workflow.wdl inputs.json <SITE>

# Disable caching for one run
jaws submit workflow.wdl inputs.json <SITE> --no-cache

Check which tasks were served from cache:

jaws tasks <run_id>

The output table has a CACHED column (True / False) for each task; that’s the authoritative view of what hit cache and what executed fresh.

📖 Cromwell Call Caching

Why didn’t call caching work?

Call caching fails if any of these change:

  1. WDL file contents (even whitespace/comments)

  2. Input values (variable names or values in inputs.json)

  3. Input file contents (different MD5 hash)

  4. Docker image (different digest)

  5. Runtime attributes (memory, CPU, disk)

  6. Task outputs (number or type of outputs)

Common mistakes:

❌ Using String paths instead of File:

# Bad: cache won't work if path string changes
input {
    String bam_path = "/path/to/file.bam"
}

✅ Use File type:

# Good: caching works based on file content hash
input {
    File bam_file = "/path/to/file.bam"
}
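
The difference between the two declarations can be illustrated with plain hashing. This is an analogy only, not Cromwell's actual cache-key computation (which combines several components); it assumes GNU md5sum is available, and the file names are made up.

```shell
# Two copies of the same data under different paths (illustrative names).
work=$(mktemp -d)
printf 'ACGT\n' > "$work/run1.txt"
cp "$work/run1.txt" "$work/run2.txt"

# Hash the contents (what a File input contributes): identical for both.
hash1=$(md5sum < "$work/run1.txt")
hash2=$(md5sum < "$work/run2.txt")

# Hash the path strings (what a String input contributes): different,
# even though the underlying data is the same.
path_hash1=$(printf '%s' "$work/run1.txt" | md5sum)
path_hash2=$(printf '%s' "$work/run2.txt" | md5sum)

[ "$hash1" = "$hash2" ] && echo "content hashes match"
[ "$path_hash1" != "$path_hash2" ] && echo "path hashes differ"
rm -rf "$work"
```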

❌ Dynamic runtime attributes:

runtime {
    memory: if (large_input) then "32GB" else "16GB"
}

Each value creates a different cache key.

Debug caching issues:

Use jaws tasks <run_id> to see the CACHED column per task. For deeper inspection of why a task didn’t cache, JAWS writes a metadata.json supplementary file to the run’s output directory (find the path under output_dir in the output of jaws status <run_id>); Cromwell records the cache decision and the contributing hashes there.

jaws tasks <run_id>
jaws status <run_id>          # find output_dir
cat <output_dir>/metadata.json | jq '.calls[] | .[].callCaching'

How do I handle HTTP URLs as inputs?

Problem: Cromwell downloads HTTP files but strips extensions, breaking tools that rely on filenames.

Solution: Use Pair[File, String] to preserve original name:

version 1.0

workflow http_input_example {
    input {
        Pair[File, String] remote_file
    }

    call process {
        input: file_pair = remote_file
    }
}

task process {
    input {
        Pair[File, String] file_pair
    }

    command <<<
        # Copy to preserve extension
        cp ~{file_pair.left} ~{file_pair.right}

        # Now tool can read correct filename
        analyze.sh ~{file_pair.right}
    >>>

    output {
        File result = "result.txt"
    }
}

Input JSON:

{
  "http_input_example.remote_file": {
    "Left": "https://data.org/sample.fastq.gz",
    "Right": "sample.fastq.gz"
  }
}

📖 Cromwell HTTP Filesystems

Does using a private registry image (e.g. library.jgi.doe.gov) prevent call caching?

Yes. Call caching will not work for images pulled from a private registry. Cromwell can’t perform the SHA256 digest lookup against a private registry it doesn’t have read credentials for, and call caching depends on that lookup to decide whether the container has changed.

Why this matters in practice. Cromwell uses the image digest (SHA256) as part of the cache key. When you reference an image by tag (e.g. mytool:latest), Cromwell resolves the tag to a digest at submission time. If it can’t reach the registry to do that resolution, as is the case with library.jgi.doe.gov and other private registries, it can’t compute the cache key, so every run starts from scratch even when nothing about your inputs or WDL has changed.

A new push to the same tag (different digest) would also miss the cache, but for private-registry images you don’t even get to that point: the lookup itself fails.

What to do.

  • For workflows where call caching matters, use images from Docker Hub, Quay.io, or another public registry. Reference by image@sha256:<digest> for the strongest guarantee.

  • For private-registry images, expect every run to execute fully. That’s a fine trade-off when the image content is sensitive; it’s a problem when the image is large and the workflow is repeated.

  • For images on Docker Hub or other public registries, call caching works as normal; see “Why didn’t call caching work?” above for the other things that can break it.

Will I get cache hits if I resubmit the same workflow on a different site?

No. Each JAWS site runs its own Cromwell instance, and Cromwell has no API for importing cache results between instances. A run on DORI followed by an identical run on Perlmutter will start from scratch, even though the WDL and inputs are byte-identical.

WDL Version Support

Can I use WDL 1.1 with JAWS?

Not yet. JAWS currently supports WDL 1.0 only.

The Cromwell version JAWS runs today doesn’t parse WDL 1.1 features reliably, so workflows declared with version 1.1 will fail validation (often with confusing errors that don’t mention the version mismatch). Use version 1.0 on the first line of every WDL file.

A Cromwell upgrade is on the roadmap. Recent Cromwell releases do support WDL 1.1, and we plan to upgrade the JAWS-bundled Cromwell soon. Until Cromwell is upgraded and fully tested, please continue to use WDL 1.0 only.

Check your version:

version 1.0  # ✅ Supported
# version 1.1  # ❌ Not supported

WDL 1.1 features NOT available:

  • keys() and values() Map functions

  • None type

  • Directory type

  • Enhanced struct features

Workarounds:

  • Use Pairs instead of Maps for key extraction

  • Use File type for directories (pass as paths)

  • Use struct with optional fields instead of None

📖 WDL 1.0 Spec

JAWS-Specific Issues

Container Configuration

Will my container’s ENTRYPOINT script run?

No. JAWS/Cromwell does not execute container ENTRYPOINT scripts.

How Cromwell runs containers:

# Your Dockerfile ENTRYPOINT is ignored
apptainer run <image> /cromwell-executions/.../script

# Cromwell creates script from WDL command block
# Any ENTRYPOINT is bypassed

Solution: Move logic from ENTRYPOINT to command block:

❌ Bad (ENTRYPOINT won’t run):

FROM ubuntu:22.04
COPY setup.sh /
ENTRYPOINT ["/setup.sh"]

✅ Good (explicit in WDL):

command <<<
    # Run setup explicitly
    /setup.sh

    # Then your actual command
    analyze.sh input.txt
>>>

How do I fix timezone offset warnings?

Warning message:

Timezone offset does not match system offset: 0 != -25200

Solution: Set TZ environment variable:

# Add to ~/.bashrc
export TZ="America/Los_Angeles"

# Reload
source ~/.bashrc

For JAWS container users:

export TZ="America/Los_Angeles"
apptainer run docker://doejgi/jaws-client:latest jaws queue
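
The numbers in the warning make more sense once converted to hours. Assuming the two values are UTC offsets in seconds (an inference from the message, not something the warning states), the arithmetic is:

```shell
# Offsets taken from the warning text: 0 versus -25200.
system_offset_seconds=-25200
system_offset_hours=$(( system_offset_seconds / 3600 ))
echo "system offset: ${system_offset_hours} hours from UTC"
```

An offset of -7 hours corresponds to Pacific Daylight Time, which is why setting TZ to America/Los_Angeles brings the two values into agreement on systems in that zone.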

Output Management

Why doesn’t JAWS copy the entire Cromwell execution?

Reason: To avoid transferring terabytes of intermediate files.

What JAWS transfers:

For successful runs:

  • Files declared in workflow output section

  • Final outputs only (not intermediate files)

For failed runs:

  • errors.json with failure details

  • Full execution directory if you run jaws download <run_id>

How to preserve additional files:

task analysis {
    command <<<
        analyze.sh > output.txt 2> stderr.log
    >>>

    output {
        File result = "output.txt"
        File stderr_log = "stderr.log"  # ✅ Explicitly declared
        File stdout_log = stdout()       # ✅ Special function
    }
}

How do I access stderr and stdout logs?

For failed runs:

# Download full execution directory
jaws download <run_id>

# errors.json includes embedded logs
cat errors.json | jq '.failures[0].message'

For successful runs:

Declare logs as outputs:

task example {
    command <<<
        ./pipeline.sh > stdout.log 2> stderr.log
    >>>

    output {
        File stdout_log = "stdout.log"
        File stderr_log = "stderr.log"
    }
}

Use WDL functions:

output {
    String stdout_text = read_string(stdout())
    String stderr_text = read_string(stderr())
}

Run Lifecycle and Monitoring

This section covers the recurring “is something wrong?” questions. The full lifecycle state diagram is in JAWS Commands (under “Understanding the Stages”) and JAWS Troubleshooting RoadMap.

My jaws history returns [] even though I’ve run jobs

The default lookback is short (1 day). Pass --days to extend it:

jaws history --days 30                  # last 30 days, all sites
jaws history --days 30 --site dori      # last 30 days, only Dori
jaws history --days 30 --result failed  # last 30 days, only failed runs

Full flag reference is in JAWS Commands.

What’s the difference between workflow_root and output_dir?

Both appear in jaws status output and confuse a lot of users:

  • workflow_root is the Cromwell execution directory: cromwell-executions/<workflow_name>/<cromwell_id>/. It contains every intermediate file, log, and task working directory. It’s auto-purged on a fixed schedule (currently ~10 days on DORI).

  • output_dir is the JAWS Teams directory where your final declared outputs are copied: <TEAM_PATH>/<user_id>/<run_id>/<cromwell_id>/. It is not auto-purged; your team manages its lifecycle.

When you need an intermediate file, look in workflow_root. When you need a final output, look in output_dir.

How is my output storage managed? Will my files be deleted?

Two different things happen, with two different timelines:

  1. workflow_root (cromwell-executions/<RUN_ID>), the full Cromwell execution directory, is auto-purged after ~10 days on DORI. This is JAWS-managed; you don’t need to clean it.

  2. output_dir (JAWS Teams directory), where the final declared outputs land, is not auto-purged. Each team is responsible for managing its space.

The recommendation for DORI users: periodically review your team’s output directory and remove anything you don’t need.

A compute site went into “degraded” state while my run was active. Are my outputs safe?

Not necessarily, and JAWS won’t warn you. When a compute site (Perlmutter, Tahoma, DORI, etc.) enters a degraded state, the underlying filesystem or compute nodes can corrupt files that are being read or written at that moment. Neither JAWS nor Cromwell detects this kind of silent, mid-run corruption; the run can still finish with a normal “succeeded” status and outputs that look complete but are actually wrong.

How to know a site is degraded:

  • The site’s own status page is the source of truth (for example, NERSC Live Status, or the #dori channel status announcements).

  • The JAWS team posts @channel messages in #jaws when a site they’re aware of enters a degraded state.

What to do if your run overlapped with a degraded window:

  1. Spot-check the outputs that the run produced during the degraded window. Compare file sizes, line counts, or content hashes against a known-good prior run if you have one.

  2. Look at the run timeline with jaws log <RUN_ID> and cross-reference against the site’s status timeline.

  3. When in doubt, resubmit with jaws submit --no-cache to force the workflow to run fresh and produce new outputs.
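
The spot-check in step 1 can be scripted. The following is a minimal sketch, not a JAWS feature: compare_outputs is a hypothetical helper, it assumes sha256sum is available (GNU coreutils), and it only makes sense for outputs that are expected to be byte-identical between runs.

```shell
# Hypothetical helper: compare two files by size, line count, and SHA-256.
# Prints "same" only if all three agree.
compare_outputs() {
    size_a=$(wc -c < "$1"); size_b=$(wc -c < "$2")
    lines_a=$(wc -l < "$1"); lines_b=$(wc -l < "$2")
    sum_a=$(sha256sum < "$1"); sum_b=$(sha256sum < "$2")
    if [ "$size_a" -eq "$size_b" ] && [ "$lines_a" -eq "$lines_b" ] \
        && [ "$sum_a" = "$sum_b" ]; then
        echo "same"
    else
        echo "differs (bytes: $size_a vs $size_b, lines: $lines_a vs $lines_b)"
    fi
}

# Demo: a known-good file versus an identical copy, then a truncated one.
ref=$(mktemp); new=$(mktemp)
printf 'a\nb\nc\n' > "$ref"
printf 'a\nb\nc\n' > "$new"
same_result=$(compare_outputs "$ref" "$new")
printf 'a\nb\n' > "$new"             # simulate a truncated output
trunc_result=$(compare_outputs "$ref" "$new")
echo "$same_result"
echo "$trunc_result"
rm -f "$ref" "$new"
```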

We’ve had previous user reports of incorrect outputs from Perlmutter runs that finished during degraded states, so this is worth checking proactively, not just when something obviously looks off.

How do I clean up old runs on demand?

jaws purge <run_id>

This removes the cromwell-executions/ directory for a specific run early (before the auto-purge would). Your final outputs in output_dir stay where they are.

How do I cancel everything I have queued or running?

jaws cancel-all          # cancel all your active runs
jaws cancel <run_id>     # cancel one specific run

cancel-all is intended for “I made a mistake, kill everything.” See JAWS Commands for the full command list.

Troubleshooting

Common Errors

My job failed with “OutOfMemory”

Cause: Task exceeded requested memory.

Solution: Increase memory allocation:

runtime {
    memory: "64 GB"  # Was 32 GB
    cpu: 8
}

Cannot find rc file after failure

Error message:

Unable to determine that job is alive, and .../rc does not exist

Cause: HTCondor lost connection to compute node before task completed.

Solution:

# Simply resubmit
jaws resubmit <run_id>

This is a transient scheduler issue, not a problem with your workflow. The task usually succeeds on retry.

If it persists: Contact JAWS support with your run ID.

500 Internal Server Error / 408 Request Timeout / 502 from jaws commands

These are usually transient and usually happen in two situations:

  • Right after a JAWS deploy (the team announces these in #jaws). Services need a few minutes to come back up. Wait 5–15 minutes and retry.

  • During an external incident (Cloudflare hiccup, NERSC maintenance, AMQP/RabbitMQ flap). jaws status returning 502, or jaws tasks returning Connection dead, no heartbeat or data received in >= 60s, both fall into this bucket.

Your runs are still progressing at the site level even when central is briefly unreachable; once the API is back, your state catches up.

If you see the same error for more than ~15 minutes and there’s no announcement in #jaws, ping the team with your run ID.

My run says submission failed and jaws tasks returns None... cromwell_run_id

The run never made it to Cromwell. Run jaws log <run_id> to see where the lifecycle stopped. For example:

jaws log <run_id>
#STATUS_FROM       STATUS_TO          TIMESTAMP            COMMENT
created            upload queued      ...
upload queued      upload complete    ...
upload complete    submission failed  ...                   Server timeout: The service is unable to respond at this time

Common causes:

  • Brief central/site outage during submission. The run row exists but never got submitted to Cromwell. The fix is to jaws resubmit <run_id> (or re-jaws submit) once the team confirms services are back.

The status field will read done even for these; JAWS always terminates in done. Check the result field: null means the run never reached Cromwell.

Data Access Issues

How do I access Cromwell execution on a remote site?

Scenario: You submitted from Perlmutter to Tahoma but don’t have Tahoma access.

For successful runs:

# Outputs automatically transferred back to Perlmutter
jaws status <run_id> | jq -r '.output_dir'

For failed runs:

# Transfer failed task execution
jaws download <run_id>

# Check error summary
cat errors.json | jq '.failures'

Why are special characters in filenames causing failures?

Problem: Files with ' (apostrophe), ; (semicolon), or \ (backslash) in their names fail.

Cause: Cromwell doesn’t properly escape special characters in bash commands.

Solution: Avoid special characters in input filenames.

❌ Bad:

sample's_data.fastq
file;with;semicolons.txt

✅ Good:

sample_data.fastq
file_with_underscores.txt

If you must use existing files: Rename them first:

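command <<<
    # Sanitize the basename: replace ' ; \ and " with underscores
    SAFE_NAME=$(basename "~{input_file}" | tr "';\\\\\"" "____")
    # Quote the source path so the shell does not interpret its characters
    cp "~{input_file}" "$SAFE_NAME"
    process.sh "$SAFE_NAME"
>>>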
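
A stricter alternative is to allow only known-safe characters instead of listing the bad ones. This is a sketch of a different technique from the one above, not something JAWS provides: sanitize_name is a hypothetical helper, and the allowed set (letters, digits, dot, underscore, hyphen) is an assumption you may want to adjust.

```shell
# Hypothetical helper: replace every character outside A-Z a-z 0-9 . _ -
# in a file's basename with an underscore.
sanitize_name() {
    base=$(basename -- "$1")
    printf '%s' "$base" | tr -c 'A-Za-z0-9._-' '_'
}

# Demo with the problem names from this section.
safe1=$(sanitize_name "/data/sample's_data.fastq")
safe2=$(sanitize_name "file;with;semicolons.txt")
echo "$safe1 $safe2"
```

The allow-list approach also covers characters this section does not mention, such as spaces, at the cost of renaming more files than strictly necessary.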

Known Limitations

Outputs outside cromwell-executions/execution/ directory

Problem: JAWS only transfers files from cromwell-executions/.../execution/.

Won’t be transferred:

command <<<
    echo "data" > /tmp/output.txt
>>>
output {
    File result = "/tmp/output.txt"  # ❌ Outside execution dir
}

Will be transferred:

command <<<
    echo "data" > output.txt  # ✅ In execution dir
>>>
output {
    File result = "output.txt"
}

Workaround for existing files:

command <<<
    # Process creates output in /tmp
    process.sh --output /tmp/result.txt

    # Copy to execution directory
    cp /tmp/result.txt ./result.txt
>>>
output {
    File result = "result.txt"
}

Additional Resources

External Documentation

📖 WDL 1.0 Specification

Official OpenWDL language spec

🚀 Dockstore WDL Guide

Beginner-friendly WDL tutorial

🔧 Cromwell Documentation

Workflow engine reference

Need More Help?

If you didn’t find your answer:

💬 Contact JAWS Support

#jaws Slack Channel