Skip to contents

This document is still work in progress

User of OpenMalaria will often have to answer research questions which require multiple simulations with varying input parameters. Thus, for each input combination a separate input XML file needs to be created. openMalariaUtilities (OMU) provides the facilities to make this task easier for the user.

Introduction

This tutorial will outline the requirements and the process of running multiple scenarios. The input is based on OpenMalaria’s example. The code the generate the input list can be found in the Appendix.

First, we set up our experiment as usual.

## Setup dirs
setupDirs(experimentName = "exp_test", replace = TRUE)

## Copy necessary OpenMalaria files.
setupOM()
#> [1] TRUE

Placeholders

OMU provides a special syntax which can be used to define placeholders with varying values. Nearly any value in the input list can replaced by a @placeholder@. We will demonstrate that in the following example:

baseList[["interventions"]] <- list(
  name = "test",
  human = list(
    component = list(
      id = "GVI",
      name = "DDT test",
      GVI = list(
        decay = list(
          L = "0.5",
          "function" = "exponential"
        ),
        anophelesParams = list(
          mosquito = "gambiae_ss", propActive = "1",
          deterrency = list(value = "0.56"),
          preprandialKillingEffect = list(value = "0"),
          postprandialKillingEffect = list(value = "0.24")
        )
      )
    ),
    deployment = list(
      name = "DDT test",
      component = list(id = "GVI"),
      timed = list(
        deploy = list(coverage = "0.9", time = "10y") # <- Coverage
      )
    )
  )
)

The above code specifies the deployment of the intervention “DDT test” which is done after 10 years. For our (trivial) example, we want to vary the coverage parameter. Here, we override the intervention section with a new definition:

baseList[["interventions"]] <- list(
  name = "test",
  human = list(
    component = list(
      id = "GVI",
      name = "DDT test",
      GVI = list(
        decay = list(
          L = "0.5",
          "function" = "exponential"
        ),
        anophelesParams = list(
          mosquito = "gambiae_ss", propActive = "1",
          deterrency = list(value = "0.56"),
          preprandialKillingEffect = list(value = "0"),
          postprandialKillingEffect = list(value = "0.24")
        )
      )
    ),
    deployment = list(
      name = "DDT test",
      component = list(id = "GVI"),
      timed = list(
        deploy = list(coverage = "@coverage@", time = "10y") # <- Coverage
      )
    )
  )
)

Note, that we replaced the numerical value of coverage with a placeholder.

Scenarios’ Data Frame

For each placeholder we defined, we also need to have a table with the corresponding values. This is done in the scenarios data frame, which need to have the following structure:

  • Each column corresponds to a placeholder
  • The column names are the same as the placeholders, without the @ enclosement
  • Each row corresponds to a scenario

Thus, for our example, the scenario data frame should look like this:

scens <- data.frame(coverage = c(0.1, 0.3, 0.5, 0.8, 1))
scens
#>   coverage
#> 1      0.1
#> 2      0.3
#> 3      0.5
#> 4      0.8
#> 5      1.0

The placeholder names are up to the users, however the names should follow good practices of naming R column names, e.g. no spaces or special characters which also R operators like + or -.

OMU will check if the placeholders used in the XML file are present in the scenario data frame and warn the user if not. On the other hand, users can add non-placeholder columns to the scenario data frame which are used for metadata (more on that later).

Before the scenarios can be generated, it is important to run the following steps:

scens <- finalizeScenarios(scens)
storeScenarios(scens, csv = FALSE)
scens
#>   ID coverage           file
#> 1  1      0.1 exp_test_1.xml
#> 2  2      0.3 exp_test_2.xml
#> 3  3      0.5 exp_test_3.xml
#> 4  4      0.8 exp_test_4.xml
#> 5  5      1.0 exp_test_5.xml

The above command will add the two columns ID and file to the scenario data frame which are later used to uniquely identify each individual scenario. Furthermore, the scenarios are stored in the cache and optionally written as a CSV file to disk.

Please note, that the finalizeScenarios() and storeScenarios() should be the last commands you run on the scenarios.

Create XML files

## Create base XML file
createBaseXml(baseList, replace = TRUE)

## Create each scenarios' XML file
setupScenarios(scenarios = scens)

## Validate
validateXML(scenarios = scens)

setupScenarios() will take care of the scenario generation and place the resulting files in the scenarios directory of you experiment.

validateXML() will verify that all values in the XML file, and also the scenario data frame if specified, are valid according to the XSD schema.

Running simulations

The simulations can be run by using:

runSimulations(scenarios = scens, cmd = "openMalaria", ncores = 3)
#> starting worker pid=50911 on localhost:11193 at 10:02:30.825
#> starting worker pid=50909 on localhost:11193 at 10:02:30.831
#> starting worker pid=50910 on localhost:11193 at 10:02:30.836
#> [1] "Running scenario [3/5]"
#> [1] "Running scenario [1/5]"
#> [1] "Running scenario [4/5]"
#> [1] "Running scenario [5/5]"
#> [1] "Running scenario [2/5]"

Here, we pass the scenarios to the corresponding argument. Additionally, we can specify how many CPU cores should be used for the simulations via the ncores argument. Setting it to 3 implies that up to three simulations will be run in parallel, as can be seen from the console output.

OpenMalaria simulations are usually fully independent processes, thus they should scale well with more CPU cores. If you have the computational resources, this can greatly speed up the total run time.

The individual output files will be placed in the outputs folder.

Collecting the results

openMalariaUtilities provides the facilities to collect the simulation results, aggregate and store them in a SQLite database.

This whole process is handled by collectResults(). This function accepts a variety of input arguments which are part of a different vignette.

collectResults() has fore main parts:

  1. Determine which output files are needed
  2. Read the requested files
  3. Run aggregation on the collected data
  4. Store data in database

In this tutorial we will stick mainly to the default options.

We specify the directory where the output folders is stored and assign the name test to the database. By default, the database will be created in the project directory.

collectResults(expDir = getCache("experimentDir"), dbName = "test")

Using the RSQLite and the DBI package, we can now take a look a the database and the stored data.

con <- DBI::dbConnect(RSQLite::SQLite(), "test.sqlite")
DBI::dbListTables(con)
#> [1] "experiments"        "placeholders"       "results"            "scenarios"         
#> [5] "scenarios_metadata" "sqlite_stat1"       "sqlite_stat4"     

The first five tables are relevant for us.

DBI::dbReadTable(con, "experiments")
#>   experiment_id     name
#> 1             1 exp_test

The experiments table maps each experiment in the database to an ID. The scenarios table list all scenarios per experiment.

DBI::dbReadTable(con, "scenarios")
#>   experiment_id scenario_id
#> 1             1           1
#> 2             1           2
#> 3             1           3
#> 4             1           4
#> 5             1           5

DBI::dbReadTable(con, "placeholders")
#>   experiment_id scenario_id placeholder value
#> 1             1           1    coverage   0.1
#> 2             1           2    coverage   0.3
#> 3             1           3    coverage   0.5
#> 4             1           4    coverage   0.8
#> 5             1           5    coverage   1.0

DBI::dbReadTable(con, "scenarios_metadata")
#>   experiment_id scenario_id key_var          value
#> 1             1           1    file exp_test_1.xml
#> 2             1           2    file exp_test_2.xml
#> 3             1           3    file exp_test_3.xml
#> 4             1           4    file exp_test_4.xml
#> 5             1           5    file exp_test_5.xml

All placeholders are stored in the placeholders table and all non-placeholder columns from the scenarios data frame is stored in the scenarios_metadata table.

The results are stored in the results table,

head(DBI::dbReadTable(con, "results"))
#>   experiment_id scenario_id survey_date third_dimension measure value
#> 1             1           1  2000-03-27             0-5   nHost   609
#> 2             1           1  2000-03-27            5-90   nHost  3391
#> 3             1           1  2000-03-27             0-5 nPatent   514
#> 4             1           1  2000-03-27            5-90 nPatent  2423
#> 5             1           1  2000-03-27             0-5 nUncomp   806
#> 6             1           1  2000-03-27            5-90 nUncomp   769
DBI::dbDisconnect(con)

which can be used for further analysis.

Appendix

## Basic skeleton
baseList <- list(
  ## Mandatory
  expName = "example",
  ## Mandatory
  OMVersion = 44L,
  ## Optional
  ## analysisNo = 1L,
  ## Mandatory
  demography = list(),
  monitoring = list(),
  interventions = list(),
  healthSystem = list(),
  entomology = list(),
  ## These are optional for OM
  ## parasiteGenetics = list(),
  ## pharmacology = list(),
  ## diagnostics = list(),
  model = list()
)

demo_data <- data.frame(
  poppercent = c(
    3.474714994, 12.76004028, 14.52151394, 12.75565434, 10.83632374,
    8.393312454, 7.001421452, 5.800587654, 5.102136612, 4.182561874,
    3.339409351, 2.986112356, 2.555766582, 2.332763433, 1.77400255,
    1.008525491, 0.74167341, 0.271863401, 0.161614642
  ),
  upperbound = c(
    1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90
  )
)

baseList <- defineDemography(
  baseList,
  name = "Ifakara",
  popSize = 4000L,
  maximumAgeYrs = 90,
  lowerbound = 0,
  poppercent = demo_data$poppercent,
  upperbound = demo_data$upperbound
)

baseList[["monitoring"]] <- list(
  name = "Quarterly Surveys",
  ## Mandatory, different from OM schema
  startDate = "2000-01-01",
  continuous = monitoringContinuousGen(
    period = 1,
    options = list(
      name = c("input EIR", "simulated EIR", "human infectiousness", "N_v0",
               "immunity h", "immunity Y", "new infections",
               "num transmitting humans", "ITN coverage", "GVI coverage", "alpha",
               "P_B", "P_C*P_D"),
      value = c("true", "true", "true", "true", "true", "true", "true", "true",
                "true", "true", "true", "false", "false")
    )
  ),
  SurveyOptions = monitoringSurveyOptionsGen(
    options = list(
      name = c("nHost", "nPatent", "nUncomp", "nSevere", "nDirDeaths",
               "inputEIR", "simulatedEIR"),
      value = c("true", "true", "true", "true", "false", "true", "true")
    )
  )
)

baseList[["monitoring"]] <- c(
  baseList[["monitoring"]],
  list(
    surveys =  monitoringSurveyTimesGen(
      detectionLimit = 40, startDate = "2000-01-01", endDate = "2020-01-01",
      interval = "quarterly"
    ),
    ageGroup = surveyAgeGroupsGen(
      lowerbound = 0,
      upperbounds = c(5, 90)
    )
  )
)

baseList[["interventions"]] <- list(
  name = "test",
  human = list(
    component = list(
      id = "GVI",
      name = "DDT test",
      GVI = list(
        decay = list(
          L = "0.5",
          "function" = "exponential"
        ),
        anophelesParams = list(
          mosquito = "gambiae_ss", propActive = "1",
          deterrency = list(value = "0.56"),
          preprandialKillingEffect = list(value = "0"),
          postprandialKillingEffect = list(value = "0.24")
        )
      )
    ),
    deployment = list(
      name = "DDT test",
      component = list(id = "GVI"),
      timed = list(
        deploy = list(coverage = "0.9", time = "10y")
      )
    )
  )
)

baseList[["healthSystem"]] <- list(
  ImmediateOutcomes = list(
    name = "Tanzania ACT",
    drugRegimen = list(
      firstLine = "ACT",
      inpatient = "QN",
      secondLine = "ACT"
    ),
    initialACR = list(
      ACT = list(value = 0.85),
      QN = list(value = 0.998),
      selfTreatment = list(value = 0.63)
    ),
    compliance = list(
      ACT = list(value = 0.9),
      selfTreatment = list(value = 0.85)
    ),
    nonCompliersEffective = list(
      ACT = list(value = 0),
      selfTreatment = list(value = 0)
    ),
    treatmentActions = list(
      ACT = list(
        name = "clear blood-stage infections",
        clearInfections = list(
          stage = "blood",
          timesteps = "1"
        )
      ),
      QN = list(
        name = "clear blood-stage infections",
        clearInfections = list(
          stage = "blood",
          timesteps = "1"
        )
      )
    ),
    pSeekOfficialCareUncomplicated1 = list(value = 0.04),
    pSelfTreatUncomplicated = list(value = 0.01),
    pSeekOfficialCareUncomplicated2 = list(value = 0.04),
    pSeekOfficialCareSevere = list(value = 0.48)
  ),
  CFR = list(
    group = list(lowerbound = 0, value = 0.09189),
    group = list(lowerbound = 0.25, value = 0.0810811),
    group = list(lowerbound = 0.75, value = 0.0648649),
    group = list(lowerbound = 1.5, value = 0.0689189),
    group = list(lowerbound = 2.5, value = 0.0675676),
    group = list(lowerbound = 3.5, value = 0.0297297),
    group = list(lowerbound = 4.5, value = 0.0459459),
    group = list(lowerbound = 7.5, value = 0.0945946),
    group = list(lowerbound = 12.5, value = 0.1243243),
    group = list(lowerbound = 15, value = 0.1378378)
  ),
  ## Mandatory
  pSequelaeInpatient = list(
    interpolation = "none",
    group = list(lowerbound = "0.0", value = 0.0132),
    group = list(lowerbound = "5.0", value = 0.005)
  )
)

baseList[["entomology"]] <- list(
  mode = "dynamic", name = "Namawala",
  vector = list(
    anopheles = list(
      mosquito = "gambiae_ss", propInfected = 0.078, propInfectious = "0.021",
      seasonality = list(
        annualEIR = "16", input = "EIR",
        fourierSeries = list(
          EIRRotateAngle = "0",
          coeffic = list(
            a = "0.8968", b = "2.678"
          ),
          coeffic = list(a = "-0.4551", b = "2.599")
        )
      ),
      mosq = list(
        minInfectedThreshold = "0.001",
        mosqRestDuration = list(value = "3"),
        extrinsicIncubationPeriod = list(value = "11"),
        mosqLaidEggsSameDayProportion = list(value = "0.313"),
        mosqSeekingDuration = list(value = "0.33"),
        mosqSurvivalFeedingCycleProbability = list(value = "0.623"),
        availability = list(distr = "const"),
        mosqProbBiting = list(mean = "0.95", variance = "0"),
        mosqProbFindRestSite = list(mean = "0.95", variance = "0"),
        mosqProbResting = list(mean = "0.99", variance = "0"),
        mosqProbOvipositing = list(value = "0.88"),
        mosqHumanBloodIndex = list(value = "0.939")
      ),
      nonHumanHosts = list(
        name = "unprotectedAnimals",
        mosqRelativeEntoAvailability = list(value = "1.0"),
        mosqProbBiting = list(value = "0.95"),
        mosqProbFindRestSite = list(value = "0.95"),
        mosqProbResting = list(value = "0.99")
      )
    ),
    nonHumanHosts = list(name = "unprotectedAnimals", number = "1.0")
  )
)

baseList[["model"]] <- list(
  ModelOptions = list(
    option = list(name = "LOGNORMAL_MASS_ACTION", value = "true"),
    option = list(name = "NO_PRE_ERYTHROCYTIC", value = "true"),
    option = list(name = "INNATE_MAX_DENS", value = "false"),
    option = list(name = "INDIRECT_MORTALITY_FIX", value = "false"),
    option = list(name = "NON_MALARIA_FEVERS", value = "true")
  ),
  clinical = list(
    healthSystemMemory = "6",
    NonMalariaFevers = list(
      incidence = list(
        group = list(lowerbound = "0", value = "0.322769924518357"),
        group = list(lowerbound = "0", value = "0.308520194304172"),
        group = list(lowerbound = "1", value = "0.279441774808493"),
        group = list(lowerbound = "2", value = "0.250431781111273"),
        group = list(lowerbound = "3", value = "0.223285859756841"),
        group = list(lowerbound = "4", value = "0.199298352451799"),
        group = list(lowerbound = "5", value = "0.179376872365614"),
        group = list(lowerbound = "6", value = "0.163623659390782"),
        group = list(lowerbound = "7", value = "0.152227726923469"),
        group = list(lowerbound = "8", value = "0.145022785567758"),
        group = list(lowerbound = "9", value = "0.141493087461765"),
        group = list(lowerbound = "10", value = "0.140473293219353"),
        group = list(lowerbound = "11", value = "0.141109775159515"),
        group = list(lowerbound = "12", value = "0.142644475217328"),
        group = list(lowerbound = "13", value = "0.144335079395766"),
        group = list(lowerbound = "14", value = "0.145964032924869"),
        group = list(lowerbound = "15", value = "0.147708915135714"),
        group = list(lowerbound = "16", value = "0.149731543445568"),
        group = list(lowerbound = "17", value = "0.151887428568276"),
        group = list(lowerbound = "18", value = "0.154060663485195"),
        group = list(lowerbound = "19", value = "0.156179169710494"),
        group = list(lowerbound = "20", value = "0.158135015380583"),
        group = list(lowerbound = "21", value = "0.159704766482219"),
        group = list(lowerbound = "22", value = "0.160807788387655"),
        group = list(lowerbound = "23", value = "0.161427976448279"),
        group = list(lowerbound = "24", value = "0.161620429119137"),
        group = list(lowerbound = "25", value = "0.16144021875986"),
        group = list(lowerbound = "26", value = "0.160943264630612"),
        group = list(lowerbound = "27", value = "0.160217573697398"),
        group = list(lowerbound = "28", value = "0.159422614374451"),
        group = list(lowerbound = "29", value = "0.158542519631641"),
        group = list(lowerbound = "30", value = "0.157501217628248"),
        group = list(lowerbound = "31", value = "0.156175160594841"),
        group = list(lowerbound = "32", value = "0.154402302191411"),
        group = list(lowerbound = "33", value = "0.152102040636481"),
        group = list(lowerbound = "34", value = "0.14921450014676"),
        group = list(lowerbound = "35", value = "0.145714433541659"),
        group = list(lowerbound = "36", value = "0.141800502067518"),
        group = list(lowerbound = "37", value = "0.137916853907569"),
        group = list(lowerbound = "38", value = "0.134503529382102"),
        group = list(lowerbound = "39", value = "0.131746276580642"),
        group = list(lowerbound = "40", value = "0.12969902537497"),
        group = list(lowerbound = "41", value = "0.128398077347679"),
        group = list(lowerbound = "42", value = "0.127864136551891"),
        group = list(lowerbound = "43", value = "0.12804497197004"),
        group = list(lowerbound = "44", value = "0.128894055047661"),
        group = list(lowerbound = "45", value = "0.130350838992718"),
        group = list(lowerbound = "46", value = "0.132286605622701"),
        group = list(lowerbound = "47", value = "0.134599921072495"),
        group = list(lowerbound = "48", value = "0.137212726976988"),
        group = list(lowerbound = "49", value = "0.140035253913284"),
        group = list(lowerbound = "50", value = "0.142934573453621"),
        group = list(lowerbound = "51", value = "0.145830221511879"),
        group = list(lowerbound = "52", value = "0.148674810561069"),
        group = list(lowerbound = "53", value = "0.151497963594518"),
        group = list(lowerbound = "54", value = "0.15438856687865"),
        group = list(lowerbound = "55", value = "0.157403790093505"),
        group = list(lowerbound = "56", value = "0.16059513222516"),
        group = list(lowerbound = "57", value = "0.16402433342886"),
        group = list(lowerbound = "58", value = "0.16770481415944"),
        group = list(lowerbound = "59", value = "0.171626873047865"),
        group = list(lowerbound = "60", value = "0.175748327054247"),
        group = list(lowerbound = "61", value = "0.180030857856799"),
        group = list(lowerbound = "62", value = "0.184411365583771"),
        group = list(lowerbound = "63", value = "0.188816421789366"),
        group = list(lowerbound = "64", value = "0.19316997803338"),
        group = list(lowerbound = "65", value = "0.197435603275487"),
        group = list(lowerbound = "66", value = "0.201578808813379"),
        group = list(lowerbound = "67", value = "0.205556806881398"),
        group = list(lowerbound = "68", value = "0.209307183457343"),
        group = list(lowerbound = "69", value = "0.212783260344084"),
        group = list(lowerbound = "70", value = "0.215944154621391"),
        group = list(lowerbound = "71", value = "0.218749275266548"),
        group = list(lowerbound = "72", value = "0.221187990639016"),
        group = list(lowerbound = "73", value = "0.223361260399378"),
        group = list(lowerbound = "74", value = "0.225363436789592"),
        group = list(lowerbound = "75", value = "0.227254280093211"),
        group = list(lowerbound = "76", value = "0.229084576349576"),
        group = list(lowerbound = "77", value = "0.230891971097789"),
        group = list(lowerbound = "78", value = "0.232690225166173"),
        group = list(lowerbound = "79", value = "0.234484973338876"),
        group = list(lowerbound = "80", value = "0.236276361586796"),
        group = list(lowerbound = "81", value = "0.238064394629696"),
        group = list(lowerbound = "82", value = "0.239849077182917"),
        group = list(lowerbound = "83", value = "0.241630413957381"),
        group = list(lowerbound = "84", value = "0.243408409659591"),
        group = list(lowerbound = "85", value = "0.245183068991633"),
        group = list(lowerbound = "86", value = "0.246954396651183"),
        group = list(lowerbound = "87", value = "0.248722397331501"),
        group = list(lowerbound = "88", value = "0.250487075721441"),
        group = list(lowerbound = "89", value = "0.252248436505447"),
        group = list(lowerbound = "90", value = "0.253127874257909")
      )
    )
  ),
  human = list(
    availabilityToMosquitoes = list(
      group = list(lowerbound = "0", value = "0.7076"),
      group = list(lowerbound = "0", value = "0.8538"),
      group = list(lowerbound = "5", value = "1.0"),
      group = list(lowerbound = "5", value = "1.0")
    )
  ),
  parameters = list(
    interval = "5", iseed = "1", latentp = "3",
    parameter = list(include = "false", name = "'-ln(1-Sinf)'", number = "1", value = "0.050736"),
    parameter = list(include = "false", name = "Estar", number = "2", value = "0.03247"),
    parameter = list(include = "false", name = "Simm", number = "3", value = "0.138161050830301"),
    parameter = list(include = "false", name = "Xstar_p", number = "4", value = "1514.385853233699891"),
    parameter = list(include = "false", name = "gamma_p", number = "5", value = "2.03692533424484"),
    parameter = list(include = "false", name = "sigma2i", number = "6", value = "10.173598698525799"),
    parameter = list(include = "false", name = "CumulativeYstar", number = "7", value = "35158523.31132510304451"),
    parameter = list(include = "false", name = "CumulativeHstar", number = "8", value = "97.334652723897705"),
    parameter = list(include = "false", name = "'-ln(1-alpha_m)'", number = "9", value = "2.33031045876193"),
    parameter = list(include = "false", name = "decay_m", number = "10", value = "2.53106547375805"),
    parameter = list(include = "false", name = "sigma2_0", number = "11", value = "0.655747311168152"),
    parameter = list(include = "false", name = "Xstar_v", number = "12", value = "0.916181104713054"),
    parameter = list(include = "false", name = "Ystar2", number = "13", value = "6502.26335600001039"),
    parameter = list(include = "false", name = "alpha", number = "14", value = "142601.912520000012591"),
    parameter = list(include = "false", name = "Density bias (non Garki)", number = "15", value = "0.177378570987455"),
    parameter = list(include = "false", name = "        sigma2        ", number = "16", value = "0.05"),
    parameter = list(include = "false", name = "log oddsr CF community", number = "17", value = "0.736202"),
    parameter = list(include = "false", name = "Indirect risk cofactor", number = "18", value = "0.018777338"),
    parameter = list(include = "false", name = "Non-malaria infant mortality", number = "19", value = "49.539046599999999"),
    parameter = list(include = "false", name = "Density bias (Garki)", number = "20", value = "4.79610772546704"),
    parameter = list(include = "false", name = "Severe Malaria Threshhold", number = "21", value = "784455.599999999976717"),
    parameter = list(include = "false", name = "Immunity Penalty", number = "22", value = "1"),
    parameter = list(include = "false", name = "Immune effector decay", number = "23", value = "0"),
    parameter = list(include = "false", name = "comorbidity intercept", number = "24", value = "0.0968"),
    parameter = list(include = "false", name = "Ystar half life", number = "25", value = "0.275437402"),
    parameter = list(include = "false", name = "Ystar1", number = "26", value = "0.596539864"),
    parameter = list(include = "false", name = "Asexual immunity decay", number = "27", value = "0"),
    parameter = list(include = "false", name = "Ystar0", number = "28", value = "296.302437899999973"),
    parameter = list(include = "false", name = "Idete multiplier", number = "29", value = "2.797523626"),
    parameter = list(include = "false", name = "critical age for comorbidity", number = "30", value = "0.117383")
  )
)