Collect OpenMalaria results into a database

Usage

collectResults(
  expDir,
  dbName,
  dbDir = NULL,
  replace = FALSE,
  resultsName = "results",
  resultsCols = list(names = c("scenario_id", "survey_date", "third_dimension",
    "measure", "value"), types = c("INTEGER", "TEXT", "", "TEXT", "NUMERIC")),
  indexOn = list(c("results", "scenario_id")),
  ncores = 1,
  ncoresDT = 1,
  strategy = "serial",
  appendResults = FALSE,
  fileFun = NULL,
  fileFunArgs = NULL,
  readFun = NULL,
  readFunArgs = NULL,
  aggrFun = NULL,
  aggrFunArgs = NULL,
  verbose = get("verboseOutput", envir = .pkgenv)
)

collect_results(
  expDir,
  dbName,
  dbDir = NULL,
  replace = FALSE,
  resultsName = "results",
  resultsCols = list(names = c("scenario_id", "survey_date", "third_dimension",
    "measure", "value"), types = c("INTEGER", "TEXT", "", "TEXT", "NUMERIC")),
  indexOn = list(c("results", "scenario_id")),
  ncores = 1,
  ncoresDT = 1,
  strategy = "serial",
  appendResults = FALSE,
  fileFun = NULL,
  fileFunArgs = NULL,
  readFun = NULL,
  readFunArgs = NULL,
  aggrFun = NULL,
  aggrFunArgs = NULL,
  verbose = get("verboseOutput", envir = .pkgenv)
)

Arguments

expDir

Path to the experiment directory.

dbName

Name of the database file without extension.

dbDir

Directory of the database file. Defaults to the root directory.

replace

If TRUE, replace an existing database with the same name as given in dbName. Otherwise, try to append the data to the existing database.

resultsName

Name of the database table to add the results to.

resultsCols

A list containing the column names and the column types of the results table. For example, list(names = c("scenario_id", ...), types = c("INTEGER", ...)). The "experiment_id" column is added automatically. Types must be valid SQLite types.
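As an illustration, a custom column layout could be built as below. This is a minimal sketch: the "age_group" column is a hypothetical replacement for the default "third_dimension" and is not something the package prescribes.

```r
# Hypothetical column layout: "age_group" replaces the default
# "third_dimension" column. Each name needs a matching type.
resultsCols <- list(
  names = c("scenario_id", "survey_date", "age_group", "measure", "value"),
  types = c("INTEGER", "TEXT", "INTEGER", "TEXT", "NUMERIC")
)

# Sanity check before passing the list to collectResults()
stopifnot(length(resultsCols$names) == length(resultsCols$types))
```

Note that "experiment_id" is deliberately left out, since it is added automatically.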

indexOn

Define which indexes to create. Needs to be a list of the form list(c(TABLE, COLUMN), c(TABLE, COLUMN), ...).
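For example, an index on two columns of the default "results" table might be requested as follows. The second entry (on "measure") is an assumption chosen for illustration; only the first pair is the documented default.

```r
# Each element is a character vector of length two: c(TABLE, COLUMN)
indexOn <- list(
  c("results", "scenario_id"),  # documented default
  c("results", "measure")       # hypothetical additional index
)

# Every entry should consist of exactly a table name and a column name
stopifnot(all(vapply(indexOn, length, integer(1)) == 2L))
```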

ncores

Number of CPU cores to use.

ncoresDT

Number of data.table threads to use on each parallel cluster.

strategy

Defines how to process the files. "batch" means that all files are read into a single data frame first, then the aggregation function is applied to that data frame and the result is added to the database. "serial" means that each individual file is processed with the aggregation function and added to the database.

appendResults

If TRUE, do not add metadata to the database and only write results.

fileFun

A function for filtering the input files. Needs to return a vector of the scenario XML file names, without path, as they appear in the file column of the scenario data frame. No default.

fileFunArgs

Arguments for fileFun as a (named) list.
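A filter function and its argument list could look like the sketch below. The scenario table, its "setting" column, and the helper name selectFiles are all hypothetical, and the do.call() line only mimics how the arguments are assumed to reach the function; the package's actual calling convention is not documented here.

```r
# Hypothetical scenario table; in practice it would come from the experiment
scens <- data.frame(
  ID = 1:4,
  file = c("exp_1.xml", "exp_2.xml", "exp_3.xml", "exp_4.xml"),
  setting = c("alpha", "beta", "alpha", "beta"),
  stringsAsFactors = FALSE
)

# Return the XML file names (without path) of scenarios matching one setting
selectFiles <- function(scenarios, setting) {
  scenarios$file[scenarios$setting == setting]
}

# Arguments are supplied as a named list
fileFunArgs <- list(scenarios = scens, setting = "alpha")

# Assumption: the arguments are applied roughly like this
selected <- do.call(selectFiles, fileFunArgs)
```

The pair would then be passed as fileFun = selectFiles and fileFunArgs = fileFunArgs.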

readFun

A function for reading and processing OpenMalaria output files. Needs to return a data frame. The first argument needs to be the file name, and the function needs to accept ... as an argument. Scenario IDs are available by using scenID as an argument. If NULL, defaults to readOutputFile and the scenario IDs are added automatically.

readFunArgs

Arguments for readFun as a (named) list.
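A custom reader following the constraints above (file name first, ... accepted, optional scenID) might be sketched as follows. The function name readSurveyFile and the assumed file layout (four tab-separated columns without a header) are illustrative assumptions, and the temporary file stands in for a real output file.

```r
# Hypothetical reader: file name first, accepts ..., and uses scenID
readSurveyFile <- function(f, ..., scenID = NA_integer_) {
  out <- utils::read.table(
    f, sep = "\t", header = FALSE,
    col.names = c("survey", "third_dimension", "measure", "value")
  )
  # Attach the scenario ID manually, since this is not the default reader
  out$scenario_id <- rep(scenID, nrow(out))
  out
}

# Small stand-in file to try the reader on
tmp <- tempfile(fileext = ".txt")
writeLines(c("1\t1\t0\t100", "1\t2\t0\t150", "2\t1\t0\t98"), tmp)
res <- readSurveyFile(tmp, scenID = 7L)
```

Because a custom readFun replaces readOutputFile, the scenario ID column has to be added by the function itself, as done here.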

aggrFun

A function for aggregating the output of readFun. The first argument needs to be the output data frame of readFun, and the function needs to return a data frame. The data frame should NOT contain an experiment_id column, as this is added automatically. The column names need to match the ones defined in resultsCols.

aggrFunArgs

Arguments for aggrFun as a (named) list.
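An aggregation function that sums values over the third dimension could be sketched as below. The input data, the function name sumOverGroups, and its label argument are hypothetical; the do.call() line only imitates how the data frame and the argument list are assumed to be combined.

```r
# Hypothetical output of readFun for a single scenario
raw <- data.frame(
  scenario_id = c(1L, 1L, 1L, 1L),
  survey_date = c("2020-01-01", "2020-01-01", "2020-02-01", "2020-02-01"),
  third_dimension = c(1L, 2L, 1L, 2L),
  measure = c("nHost", "nHost", "nHost", "nHost"),
  value = c(100, 150, 98, 152),
  stringsAsFactors = FALSE
)

# Sum values over the third dimension; the data frame is the first argument
sumOverGroups <- function(df, label = "all") {
  agg <- stats::aggregate(
    value ~ scenario_id + survey_date + measure,
    data = df, FUN = sum
  )
  agg$third_dimension <- label
  # Return columns in the order of the default resultsCols, no experiment_id
  agg[, c("scenario_id", "survey_date", "third_dimension", "measure", "value")]
}

aggrFunArgs <- list(label = "all")

# Assumption: data frame first, then the named arguments
result <- do.call(sumOverGroups, c(list(raw), aggrFunArgs))
```

The returned columns match the default resultsCols names, which is what the results table expects.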

verbose

Boolean, toggle verbose output.