Adding pipeline


Adding pipeline on database server

The database server is the server that stores the data. In our case, that is either Merlin (FMF) or ONKO-NIX (OI).

Installation steps

Software installation

  • Connect to the database server over ssh: ssh nixUser@server.address
  • Install analysisModule as tomcat8:
    su nixManager
    sudo su tomcat8
    cd ../labkey/externalModules/analysisModule/
  • Install websocket as nixUser:
    ssh nixUser@server.address
    cd software/src/websocket

Use git to download and install the packages.

Database setup (LabKey)

  • Create a project called Analysis and, inside it, a subfolder of type Study called Run.
  • Import the list archive from analysisModule.
  • The archive contains four lists:
    • runs, which keeps track of jobs that have been performed or are still running.
    • parameterFiles, which lists the JSON files to be applied on the processing server. The list points to files loaded into configuration in Analysis/Run.
    • runStatus, which lists the possible status values of jobs.
    • scripts, which lists the available analysis algorithms. The convention for passing a script is described in a separate section.
  • Create a dataset called Runs, where each row corresponds to a complete instruction. A row is composed of an arbitrary ID, the script to be run, the associated parameter file, and the server location, which is either local or the IP of a server where websocketServer is running. A sample dataset is included in the zipped study that is part of analysisModule.
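To make the structure of a Runs row concrete, here is a minimal sketch in Python. The class name, the field names, and the example values are all hypothetical and chosen only for illustration; the actual column names are defined by the sample dataset shipped with analysisModule.

```python
from dataclasses import dataclass


@dataclass
class RunInstruction:
    """One row of the Runs dataset (field names are hypothetical)."""
    run_id: str          # arbitrary ID
    script: str          # the script to be run
    parameter_file: str  # the associated parameter file
    server: str          # "local" or the IP of a machine running websocketServer

    def is_local(self) -> bool:
        # Assumption: the literal string "local" marks local execution.
        return self.server.strip() == "local"


# Illustrative rows; the script, file name and IP address are made up.
local_run = RunInstruction("1", "_softwareSrc_/exampleProject/run.py", "example.json", "local")
remote_run = RunInstruction("2", "_softwareSrc_/exampleProject/run.py", "example.json", "192.0.2.10")

print(local_run.is_local())
print(remote_run.is_local())
```

Keeping the server location as a single field mirrors the description above: one value decides whether the job stays on the database server or is handed to a remote websocketServer.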

Processing instruction convention

Location of scripts

The actual scripts and associated files should be available on the processing server (see the separate instructions on adding a new processing machine). The replacement string _softwareSrc_ points to the common directory of all installed scripts on the processing server.
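The following Python sketch illustrates how the _softwareSrc_ replacement string might be expanded into a real path on the processing server. The function name expand_software_src and the example directory are assumptions made for illustration; the substitution actually performed by the pipeline may differ.

```python
PLACEHOLDER = "_softwareSrc_"


def expand_software_src(script: str, software_src: str) -> str:
    """Replace the placeholder with the directory that holds the installed scripts."""
    # Drop a trailing slash so the result does not contain a double slash.
    return script.replace(PLACEHOLDER, software_src.rstrip("/"))


# Hypothetical script entry and install directory, for illustration only.
print(expand_software_src("_softwareSrc_/exampleProject/run.py", "/home/nixUser/software/src/"))
```

Because the placeholder is resolved on the processing server, the same script entry can be reused on machines that keep their scripts in different directories.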

Virtual environments

Should a script require a particular environment, the name of the script can be coded as two items separated by a colon. The first item is the name of the virtual environment and the second is the script itself. The path to the virtual environment should be supplied in the local .labkey/setup.json on the processing server as ["venv"][name]. This implies that the actual virtual environment is installed at that location.
How to set up a virtual environment - example with instructions: websocket.
Requirements for our virtual environments - labkey, socket, nnUNet: venv
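The colon convention and the ["venv"][name] lookup can be sketched in Python as follows. The function names, the example environment path, and the assumption that .labkey/setup.json lives in the user's home directory are all illustrative and not taken from the pipeline code.

```python
import json
from pathlib import Path
from typing import Optional, Tuple


def split_script_entry(entry: str) -> Tuple[Optional[str], str]:
    """Split 'venvName:script' into (venvName, script).

    An entry without a colon is treated as a script that needs no
    particular virtual environment.
    """
    if ":" in entry:
        name, script = entry.split(":", 1)
        return name, script
    return None, entry


def load_setup(path: Optional[Path] = None) -> dict:
    """Read setup.json; the default location in the home directory is an assumption."""
    if path is None:
        path = Path.home() / ".labkey" / "setup.json"
    with open(path) as f:
        return json.load(f)


def venv_path(setup: dict, name: str) -> str:
    """Return the path stored as setup["venv"][name]."""
    return setup["venv"][name]


# Inline stand-in for the contents of setup.json; the path is made up.
example_setup = {"venv": {"nnUNet": "/home/nixUser/venv/nnUNet"}}

name, script = split_script_entry("nnUNet:_softwareSrc_/exampleProject/run.py")
print(name, script)
if name is not None:
    print(venv_path(example_setup, name))
```

Splitting on the first colon only keeps any later colons as part of the script item; whether the pipeline handles such entries the same way is not stated in the source.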

Starting the script

A script packed in the Runs dataset is run by ticking its doRun checkbox. This adds an entry to the runs list. Upon completion, the associated log file becomes part of the runs list. The log file shows the completed actions as well as any errors that occurred.
