===============
JAWS Guidelines
===============

.. role:: bash(code)
   :language: bash

This section covers JAWS-specific behavior and best practices for using JAWS.

Temporary Directories
=====================

If your workflow requires the use of the :bash:`/tmp` directory, JAWS is configured to handle it in the following way:

- If the WDL command stanza uses :bash:`$TMPDIR`, it will automatically have access to :bash:`/tmp`.

This ensures that temporary directories are properly cleaned up after use, maintaining system stability and performance.

.. dropdown:: Example
    :color: info
    :animate: fade-in

    .. code-block:: text

        version 1.0

        workflow HelloWorld {
            call Hello
        }

        task Hello {
            command <<<
                echo ${TMPDIR}
            >>>

            runtime {
                docker: "ubuntu:latest"
            }
        }

JAWS Staging Area for Input Files
=================================

When you submit a run, JAWS copies the input files to the "JAWS staging area." This ensures that `GLOBUS` has access to the files, enabling them to be transferred to the compute sites. A key advantage of this approach is that input files can be cached, which reduces the time it takes to submit a new run that reuses the same files.

The JAWS client follows a specific pattern for copying files to the staging area:

.. code-block:: text

    /inputs//

The `JAWS staging area` is mounted to the container executing a Cromwell task, giving Cromwell access to all the input files the task needs. JAWS avoids duplicating these files because Cromwell is configured to use `hard-links`, which reference the same data without copying it. Hard-linking is preferred over copying in Cromwell because it saves storage space, speeds up workflow setup, ensures data consistency, preserves file metadata, and reduces I/O overhead.

File Caching
------------

When you submit a new run using the same input files, JAWS will not recopy them. Instead, it references the existing files in the staging area (based on the path above). You will see a message like this:

.. code-block:: text

    jaws submit --no-cache align_final.wdl inputs.json dori
    Using cached copy of sample.fastq.bz2
    Using cached copy of sample.fasta

Note that the `--no-cache` flag is used by Cromwell and does not relate to input files. This flag determines whether the run outputs should be cached.

Handling File Changes
---------------------

If you modify the content of a file but keep the same filename, the JAWS client will detect the change and report an error:

.. code-block:: text

    jaws submit --no-cache align_final.wdl inputs.json dori
    Error initializing Run: Unable to copy input files: /clusterfs/jgi/groups/dsi/homes/dcassol/jaws/jaws-tutorial-examples/data/sample.fasta is different from its cached version. Submitting with this input file can affect previous runs. Use --overwrite-inputs to force update the cached input files.

If you choose to proceed with the updated input file, use the `--overwrite-inputs` flag to force the update of the cached input files. Be aware that this can affect previous runs that use the same input filename.

Use of Relative Paths for the `inputs.json`
===========================================

Relative paths for inputs are resolved against the location of the `inputs.json` file, not the directory from which the run is submitted. This behavior is different from Cromwell, where the relative path defaults to the submission directory. The JAWS community made this decision to ensure consistency and to keep the `inputs.json` file in the same location as the WDL file.

In this example, the full path to the `inputs.json` file is:

.. code-block:: bash

    cat $HOME/jaws-tutorial-examples/5min_example/inputs.json
    {
        "bbtools.reads": "../data/sample.fastq.bz2",
        "bbtools.ref": "../data/sample.fasta"
    }

Here, the paths provided are relative to the location of the `inputs.json` file. They refer to input files located in the `../data/` directory, relative to the `5min_example` folder where the `inputs.json` is located.

In the data directory, the referenced files are as follows:

.. code-block:: bash

    ls -la $HOME/jaws-tutorial-examples/data
    -rwxrwxr-x  2 dcassol grp-dcassol 2929 Oct  3  2023 sample1.fasta
    -rwxrwxr-x 10 dcassol grp-dcassol  792 Mar 20  2023 sample.fastq.bz2

These relative paths ensure that the input files are correctly referenced from the `inputs.json` file's location. If you have any questions, please contact the JAWS team.

Use of `$JAWS_SITE` Environment Variable
========================================

The `$JAWS_SITE` environment variable is now automatically exported within container environments during task execution. This allows you to use `$JAWS_SITE` within the `command {}` section of your WDL scripts to implement site-specific conditional logic.

.. dropdown:: Example
    :color: info
    :animate: fade-in

    .. code-block:: bash

        task example_task {
            command {
                if [ "$JAWS_SITE" = "DORI" ]; then
                    # Commands specific to the 'dori' site
                    echo "Running on dori site"
                    # Add site-specific commands here
                else
                    # Commands for other sites
                    echo "Running on a different site"
                    # Add alternative commands here
                fi
            }

            runtime {
                docker: "ubuntu:latest"
            }
        }

In this example, the task checks the value of the `$JAWS_SITE` environment variable to determine which set of commands to execute, enabling dynamic behavior based on the execution site.

Note: The value of `$JAWS_SITE` is in uppercase (e.g., `DORI`). Ensure your conditional checks account for this.

This feature is particularly useful for workflows that need to adapt their behavior to the compute environment, which makes them more portable across sites.
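
The site check described above can also be sketched as a small shell helper that upper-cases the value before comparing, so a lowercase site name in a condition is not silently missed. This is a minimal sketch and not part of JAWS: the function name `site_settings` and the echoed messages are hypothetical, and `DORI` is the only site value taken from this page.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not provided by JAWS): choose a message based on
# the site name passed as the first argument.
site_settings() {
    local site
    # Upper-case the argument so "dori" and "DORI" are treated alike.
    site=$(printf '%s' "${1:-}" | tr '[:lower:]' '[:upper:]')

    case "$site" in
        DORI)
            # Site-specific commands would go here.
            echo "dori-specific settings"
            ;;
        "")
            # The variable was empty or unset, e.g. outside a JAWS task.
            echo "JAWS_SITE is not set"
            ;;
        *)
            # Fallback for any other site.
            echo "default settings for $site"
            ;;
    esac
}

# Inside a task, the exported variable would be passed in like this.
site_settings "${JAWS_SITE:-}"
```

If you embed logic like this in a WDL 1.0 task, a `command <<< ... >>>` block leaves `${...}` for the shell to expand, as in the `$TMPDIR` example earlier on this page.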