Table of Contents

guest
2025-04-29
NIXfying code
   Add new processing machine
   Managing nixWorker account

NIXfying code


Making the code NIX compliant

Background

NIX

Network of Imaging Excellence (NIX) is an initiative for a harmonized and standardized infrastructure and environment for multi-center trials that involve imaging data. NIX relies on a combination of a database (currently LabKey driven), analysis code and good clinical practice to maximize the yield of clinical studies involving quantitative imaging biomarkers (QIB).

NIX database layout

NIX relies on a standardized layout of the database for analysis software to run smoothly. However, deviations from the base layout are possible and can be declared in configuration files.

NIXfication of a code

Initial assumptions

We provide a loose set of guidelines that code should follow to be NIXfiable; they encompass general properties of QIB based data analysis.

Programming language

The simplest case is Python based code. If your code is not Python based, a Python wrapper that calls your code is required. Such a wrapper should handle file and outcome management and wrap the database calls.
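
A minimal sketch of such a wrapper, assuming the non-Python code can be started as a command line program. The helper name runExternal and the command itself are hypothetical; here the Python interpreter stands in for the external program so that the sketch runs anywhere.

```python
import subprocess
import sys

def runExternal(command, workDir=None):
    # run the non-Python code as a child process and capture its output
    result = subprocess.run(command, cwd=workDir, capture_output=True, text=True)
    if result.returncode != 0:
        # surface the failure so that the job does not pass silently
        raise RuntimeError('external code failed: ' + result.stderr)
    return result.stdout

def main(parFile):
    # hypothetical stand-in for a compiled or matlab program
    command = [sys.executable, '-c', 'print("fit done")']
    output = runExternal(command)
    # file and outcome management and database calls would follow here
    return output.strip()
```

Raising on a non-zero return code keeps a failed external run from being reported as a finished job.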

Database access

Database access is provided through labkeyInterface based calls. Rather than loading the package directly, we strongly suggest using nixSuite as the single access point for all NIX based software. nixSuite provides a getWrapper() routine, which in turn allows wrapper.loadLibrary() calls that load the latest version of labkeyInterface and similar tools. nixSuite is normally provided in computing environments that are controlled by NIX. To download data, use labkeyDatabaseBrowser.selectRows(), and for files, labkeyFileBrowser.readFileToFile(). More details are available at git0.

Output

We expect the output to come in the form of tables (ntuples, if you want), with multiple variables per data entity. Files can also be generated; each should be attributable to a data item, qualified by an id, a visit id and potentially sub-qualifiers (person doing the analysis, version of the analysis software, etc.).

NIX requirements

Main routine

The code should be packed into a Python script that provides a main routine taking a single parameter, the path to a configuration file. We encourage the file to be in json format, but the code can run with any format.

def main(parFile):
   #parFile is the path to the configuration file
   pass #replace with your analysis

Sometimes it pays to end the file with the following (this requires import sys at the top of the script):

if __name__=="__main__":
        main(sys.argv[1])

which allows the algorithm to be tested locally via:

python script.py parameters.json
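
As an illustration, a main routine that reads a json parameter file could look like the sketch below. The parameter names dataset and participants are hypothetical and only show how the file can direct the analysis.

```python
import json

def main(parFile):
    # load the configuration; json is the encouraged format
    with open(parFile, 'r') as f:
        pars = json.load(f)
    # 'dataset' and 'participants' are hypothetical parameter names
    dataset = pars.get('dataset', 'defaultDataset')
    participants = pars.get('participants', [])
    print('analysing {} participants from {}'.format(len(participants), dataset))
    return {'dataset': dataset, 'nParticipants': len(participants)}
```

Using pars.get() with a default lets the same script run with a sparse parameter file during local tests.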

Server setup

The job can refer to a local configuration file that provides paths to the local copy of nixSuite and other common programs (matlab, etc.). In code, it can be assumed that this configuration is available from os.path.join(os.path.expanduser('~'),'.labkey','setup.json') and has, at the minimum, the following properties:

  • paths
    • nixWrapper: path to the wrapper; add this to sys.path to be able to import nixWrapper
    • softwareSrc: path to expanded software. Normally not needed, but if you combine algorithms from different projects, this is where you would expect to find them
  • venv: dictionary of locations of the different virtual environments that are available. Normally not needed in code, but useful for specifying (or checking) the names that can be used in the scripts specification, see below

Optionally, setup.json can provide additional software:

  • paths
    • matlab: path to matlab executable
    • gzip: path to gzip executable
    • generalCodes: path to generalCodes, a package of matlab codes from UW, see git0
    • nnUNetRunInference: path to installed nnUNet code
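
A job can check the setup before relying on it. The sketch below assumes the layout listed above, with paths and venv as top-level properties; the helper names loadSetup and missingProperties are hypothetical.

```python
import json
import os

# properties listed above as the minimum content of paths in setup.json
REQUIRED_PATHS = ('nixWrapper', 'softwareSrc')

def loadSetup(setupFile=None):
    # default location described in the text
    if setupFile is None:
        setupFile = os.path.join(os.path.expanduser('~'), '.labkey', 'setup.json')
    with open(setupFile, 'r') as f:
        return json.load(f)

def missingProperties(setup):
    # return the names of required properties that are absent
    paths = setup.get('paths', {})
    missing = ['paths.' + key for key in REQUIRED_PATHS if key not in paths]
    if 'venv' not in setup:
        missing.append('venv')
    return missing
```

An empty list from missingProperties(loadSetup()) means that the minimum configuration is in place; optional entries such as matlab or gzip would need their own check in the code that uses them.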

Getting to labkeyInterface from main:

import os
import json
import sys

def main(parFile):
        #local configuration of the machine running the job
        setupFile=os.path.join(os.path.expanduser('~'),'.labkey','setup.json')
        with open(setupFile,'r') as f:
                setup=json.load(f)
        #make nixWrapper importable
        sys.path.append(setup['paths']['nixWrapper'])
        import nixWrapper
        #load the latest version of labkeyInterface
        nixWrapper.loadLibrary('labkeyInterface')

        import labkeyInterface
        #do something with labkeyInterface

Handling output

Output should be transmitted back to the data server. All files should be placed in the project directories, which can be created via labkeyFileBrowser.buildPathURL(), and then copied with labkeyFileBrowser.writeFileToFile(). All tuples should fill logically fitting datasets (for id/visit identifiable data) or lists (to avoid the id/visit requirement, where multiple entries per id/visit are foreseen), using labkeyDatabaseBrowser.modifyRows('insert',...) or labkeyDatabaseBrowser.modifyRows('update',...) to insert or update entries, respectively. While this is not enforced, keep in mind that local computing directories could get deleted.
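
Deciding between insert and update requires knowing which entries already exist on the server. The sketch below only does that bookkeeping and makes no database calls; the function name splitRows and the key columns ParticipantId and SequenceNum are assumptions, so adjust them to the dataset at hand. An update may additionally need a server-side row key, which is not handled here.

```python
def splitRows(newRows, existingRows, keys=('ParticipantId', 'SequenceNum')):
    # index rows already on the server by their identifying columns
    existing = {tuple(r[k] for k in keys) for r in existingRows}
    toInsert, toUpdate = [], []
    for row in newRows:
        rowKey = tuple(row[k] for k in keys)
        # rows with a matching id/visit are updates, the rest are inserts
        (toUpdate if rowKey in existing else toInsert).append(row)
    return toInsert, toUpdate
```

The two lists would then be passed to modifyRows('insert',...) and modifyRows('update',...), respectively.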

Parameter file

The supplied parameter file should direct the operation of the algorithm, including names of relevant datasets, common directory names, filtering of participants for analysis and similar. The actual use is left to the user. A copy of the parameter file will be available in the Analysis project under jobs/ID for later reference.

Distributing the code

A copy of the code

The code should be available at the engine that will perform the analysis. (This is still being implemented.) At NIX managed workstations, the nixWorker account is the target user that will perform the analysis, so the code should be unpacked to its software/src directory. It is up to the user to maintain the code and provide updates. For people without access to the NIX workstations, the installation of the code will be performed by NIX managers.

Server setup (NIX managers only)

The user nixWorker has a configuration file that provides paths to the local copy of nixSuite and other common programs (matlab, etc.). In code, it can be assumed that this configuration is available from os.path.join(os.path.expanduser('~'),'.labkey','setup.json') and has, at the minimum, the properties listed under Server setup above.

Analysis setup

  • Parameter file. In the Analysis project, copy a version of the configuration file to the configuration directory.

  • parameterFiles is a list in the Analysis project; add to it the exact name of the configuration file copied in the previous step.

  • scripts is a list that specifies which script to use. In Path, the script is specified as VENV:PATH, where VENV is the name of the virtual environment needed to run the algorithm and PATH is the path of the script. PATH can include a mock variable _softwareSrc_, which expands to software/src or a similar directory on the operating machine, as set in .labkey/setup.json, see above. A typical setting is:

    PBPK:_softwareSrc_/PBPK/pythonScripts/runSolver.py
    
    
  • runs, where you combine a script and a parameter file. Specify the job as local, which means it is run at the LabKey instance, or give the name/IP of the server (at OIL, it is the IP, at FMF, it is the name), and specify the job as python. If the parameter file is a json on vangogh, one can override variables in the setup file by specifying the parameterOverload variable as a series of semicolon delimited instructions, where each instruction has the form FILENAME:VARIABLE=VALUE; this works for simple types only. One can also specify a CPU range, which is useful for limiting the job to a small number of CPUs, with the drawback that the job will wait for those specific CPUs to become available. See [example][Screenshot at 2023-03-08 11-24-58.png]
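
The two string formats used in scripts and runs can be illustrated with a short parsing sketch. This is not the code used by the analysis engine; the function names are hypothetical, and values are kept as strings because the conversion to simple types is not specified here.

```python
def parseScriptPath(spec):
    # VENV:PATH, split on the first colon only
    venv, path = spec.split(':', 1)
    return venv, path

def parseOverload(overload):
    # FILENAME:VARIABLE=VALUE;FILENAME:VARIABLE=VALUE;...
    instructions = []
    for item in overload.split(';'):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        fileName, assignment = item.split(':', 1)
        variable, value = assignment.split('=', 1)
        instructions.append((fileName, variable, value))
    return instructions
```

For the typical setting above, parseScriptPath returns the virtual environment PBPK and the path _softwareSrc_/PBPK/pythonScripts/runSolver.py.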

Running the code

In Analysis, select the entry in edit mode, mark doRun and Submit.




Add new processing machine


Adding a new GPU server/user

Installation steps:

  • Add a local user on the processing machine via adduser.
  • Install websocket for that user.
  • Install analysisInterface, since websocket will delegate processing to runAnalysis.
  • Install any other software that the scripts in Analysis/Runs need. Such scripts might require a virtual environment; check with the particular code (e.g. irAEMM, nixSuite, SegmentationModels,...)

Processing software

The websocket and analysisInterface code will look in .labkey/setup.json to replace _softwareSrc_ with the locally adjusted ["paths"]["softwareSrc"].
_softwareSrc_ is a shortcut/replacement string coded into the Analysis/Runs scripts on labkey/merlin.
If a script requires _softwareSrc_/A/B.py, there should be a script under ["paths"]["softwareSrc"]/A/B.py on vangogh for the user you created.
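
The replacement can be checked on a new machine before the first job is submitted. A minimal sketch, assuming setup is the dictionary loaded from .labkey/setup.json; the function names are hypothetical and this is not the websocket or analysisInterface code.

```python
import os

def expandScriptPath(path, setup):
    # replace the mock variable with the locally configured directory
    return path.replace('_softwareSrc_', setup['paths']['softwareSrc'])

def scriptAvailable(path, setup):
    # True if the expanded script exists on this machine
    return os.path.isfile(expandScriptPath(path, setup))
```

Running scriptAvailable for each Path entry used in Analysis/Runs shows which packages still need to be installed for the new user.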

In the .labkey folder, there should also be a network.json file, which enables the connection to the database server. This file includes certificates (if needed), username and password.




Managing nixWorker account


NIX worker

NIX worker is a generic name for the user that performs analysis when prompted by a NIX user.

NIX worker setup

Create and scaffold

sudo adduser nixWorker
sudo su nixWorker
cd 
mkdir venv
mkdir -p software/src
mkdir analysis
mkdir logs
mkdir .labkey
cd .labkey
#edit file, add username and password
vi onko-nix.json
#edit paths
vi setup.json
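
After scaffolding, the layout can be verified from Python. The sketch below only checks the directories and the setup.json file named in the commands above; the function name missingScaffold is hypothetical.

```python
import os

# directories created by the scaffold commands, relative to the home directory
EXPECTED_DIRS = ('venv', os.path.join('software', 'src'), 'analysis', 'logs', '.labkey')
# setup.json is edited as the last scaffold step
EXPECTED_FILES = (os.path.join('.labkey', 'setup.json'),)

def missingScaffold(home=None):
    # default to the home directory of the current user
    if home is None:
        home = os.path.expanduser('~')
    missing = [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(home, d))]
    missing += [f for f in EXPECTED_FILES if not os.path.isfile(os.path.join(home, f))]
    return missing
```

Run as nixWorker, missingScaffold() returns an empty list once all steps are complete.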

Virtual env

Get list of venv from git0