Usage

The usage can be seen by running processMeerKAT.py -h the output of which is documented below.

Simple usage

To get things working, source setup.sh, which will add to your $PATH and $PYTHONPATH (add this to your ~/.profile, for future use)

source /idia/software/pipelines/master/setup.sh
To build a config file, which the pipeline reads as input for how to process the data, run

processMeerKAT.py -B -C myconfig.txt -M mydata.ms
To run the pipeline, run

processMeerKAT.py -R -C myconfig.txt

This will create submit_pipeline.sh, which you can then run to submit all pipeline jobs to a SLURM queue: ./submit_pipeline.sh
Display a summary of the submitted jobs

./summary.sh
Kill the submitted jobs

./killJobs.sh
If the pipeline crashes, or reports an error, find the error(s) by running (after the pipeline has run)

./findErrors.sh
Once the pipeline has completed, display the start and end times of each job by running

./displayTimes.sh

Detailed usage

Build config file using a custom SLURM configuration (nodes and tasks per node may be overwritten in your config file with something more appropriate by the end of the build step)

processMeerKAT.py -B -C myconfig.txt -M mydata.ms -p Test02 -N 10 -t 8 -D 4 -m 100 -T 06:00:00 -n run1_
Build config file using different MPI wrapper and container

processMeerKAT.py -B -C myconfig.txt -M mydata.ms --mpi_wrapper /path/to/another/mpi/wrapper --container /path/to/another/container
Build config file with different set of (python) scripts

processMeerKAT.py -B -C myconfig.txt -S /absolute/path/to/my/script.py False /absolute/path/to/container.simg -S partition.py True '' -S relative/path/to/my/script.py True relative/path/to/container.simg -S flag_round_1.py True '' -S script_in_bash_PATH.py False container_in_bash_PATH.simg -S setjy.py True ''
Run the pipeline immediately in verbose mode

processMeerKAT.py -R -v -s -C myconfig.txt

NOTE: All other command-line arguments passed into processMeerKAT.py when using option [-R --run] will have no effect, since the arguments are read from the config file at this point. Only options [-s --submit], [-v --verbose] and [-C --config] will have any effect at this point. Similarly, changing the [slurm] section in your config file after using option [-R --run] will have no effect unless you [-R --run] again.

Command-line options

The command line help text is:

usage: /idia/software/pipelines/master/processMeerKAT/processMeerKAT.py [-h] [-M path] [-C path] [-N num] [-t num] [-D num] [-m num] [-p name] [-T time] [-S script threadsafe container] [-b script threadsafe container]
                                                                                    [-a script threadsafe container] [--modules [module [module ...]]] [-w path] [-c path] [-n unique] [-d list] [-e nodes] [-A group] [-r name] [-l] [-s]
                                                                                    [-v] [-q] [-P] [-2] [-I] [-x] [-j] (-B | -R | -V | -L)

Process MeerKAT data via CASA MeasurementSet. Version: 2.0

optional arguments:
  -h, --help            show this help message and exit
  -M path, --MS path    Path to MeasurementSet.
  -C path, --config path
                        Relative (not absolute) path to config file.
  -N num, --nodes num   Use this number of nodes [default: 1; max: 79].
  -t num, --ntasks-per-node num
                        Use this number of tasks (per node) [default: 16; max: 32].
  -D num, --plane num   Distribute tasks of this block size before moving onto next node [default: 1; max: ntasks-per-node].
  -m num, --mem num     Use this many GB of memory (per node) for threadsafe scripts [default: 232; max: 232].
  -p name, --partition name
                        SLURM partition to use [default: 'Main'].
  -T time, --time time  Time limit to use for all jobs, in the form d-hh:mm:ss [default: '12:00:00'].
  -S script threadsafe container, --scripts script threadsafe container
                        Run pipeline with these scripts, in this order, using these containers (3rd value - empty string to default to [-c --container]). Is it threadsafe (2nd value)?
  -b script threadsafe container, --precal_scripts script threadsafe container
                        Same as [-S --scripts], but run before calibration.
  -a script threadsafe container, --postcal_scripts script threadsafe container
                        Same as [-S --scripts], but run after calibration.
  --modules [module [module ...]]
                        Load these modules within each sbatch script.
  -w path, --mpi_wrapper path
                        Use this mpi wrapper when calling threadsafe scripts [default: 'mpirun'].
  -c path, --container path
                        Use this container when calling scripts [default: '/idia/software/containers/casa-6.4.4-modular.simg'].
  -n unique, --name unique
                        Unique name to give this pipeline run (e.g. 'run1_'), appended to the start of all job names. [default: ''].
  -d list, --dependencies list
                        Comma-separated list (without spaces) of SLURM job dependencies (only used when nspw=1). [default: ''].
  -e nodes, --exclude nodes
                        SLURM worker nodes to exclude [default: ''].
  -A group, --account group
                        SLURM accounting group to use (e.g. 'b05-pipelines-ag' - check 'sacctmgr show user $USER cluster=ilifu-slurm20 -s format=account%30') [default: 'b03-idia-ag'].
  -r name, --reservation name
                        SLURM reservation to use. [default: ''].
  -l, --local           Build config file locally (i.e. without calling srun) [default: False].
  -s, --submit          Submit jobs immediately to SLURM queue [default: False].
  -v, --verbose         Verbose output? [default: False].
  -q, --quiet           Activate quiet mode, with suppressed output [default: False].
  -P, --dopol           Perform polarization calibration in the pipeline [default: False].
  -2, --do2GC           Perform (2GC) self-calibration in the pipeline [default: False].
  -I, --science_image   Create a science image [default: False].
  -x, --nofields        Do not read the input MS to extract field IDs [default: False].
  -j, --justrun         Just run the pipeline, don't rebuild each job script if it exists [default: False].
  -B, --build           Build config file using input MS.
  -R, --run             Run pipeline with input config file.
  -V, --version         Display the version of this pipeline and quit.
  -L, --license         Display this program's license and quit.

Selecting MS and field IDs

As previously stated, to build a config file, run

processMeerKAT.py -B -C myconfig.txt -M mydata.ms

This calls CASA and adds a [data] section to your config file, which points to your MS, and a [fields] section, which points to the field IDs you want to process as bandpass, total flux and phase calibrators, and science target(s), as extracted from your input MS via their INTENT labels. Only targets and extra fields may have multiple fields separated by a comma, and all extra calibrator fields are appended as “targets”, to allow for solutions to be applied to them, and quick-look images to be made of them (see v1.0 release notes).

The following is an example of what is appended to the bottom of your config file.

[data]
vis = '/idia/software/pipelines/test_data/mightee_cdfs_1350_1400mhz.ms'

[fields]
bpassfield = 'J1939-6342'
fluxfield = 'J1939-6342'
phasecalfield = 'J0240-2309'
targetfields = 'CDFS16'
extrafields = 'J0521+1638'

You can edit your config file and change the field IDs, as discussed in config files.