Collect OpenMalaria results into a database

Usage

collectResults(
  expDir,
  dbName,
  dbDir = NULL,
  replace = FALSE,
  resultsName = "results",
  resultsCols = list(names = c("scenario_id", "survey_date", "third_dimension",
    "measure", "value"), types = c("INTEGER", "TEXT", "", "TEXT", "NUMERIC")),
  indexOn = list(c("results", "scenario_id")),
  ncores = 1,
  ncoresDT = 1,
  strategy = "serial",
  appendResults = FALSE,
  fileFun = NULL,
  fileFunArgs = NULL,
  readFun = NULL,
  readFunArgs = NULL,
  aggrFun = NULL,
  aggrFunArgs = NULL,
  verbose = get("verboseOutput", envir = .pkgenv)
)

collect_results(
  expDir,
  dbName,
  dbDir = NULL,
  replace = FALSE,
  resultsName = "results",
  resultsCols = list(names = c("scenario_id", "survey_date", "third_dimension",
    "measure", "value"), types = c("INTEGER", "TEXT", "", "TEXT", "NUMERIC")),
  indexOn = list(c("results", "scenario_id")),
  ncores = 1,
  ncoresDT = 1,
  strategy = "serial",
  appendResults = FALSE,
  fileFun = NULL,
  fileFunArgs = NULL,
  readFun = NULL,
  readFunArgs = NULL,
  aggrFun = NULL,
  aggrFunArgs = NULL,
  verbose = get("verboseOutput", envir = .pkgenv)
)

Arguments

expDir: Directory of the experiment.
dbName: Name of the database file without extension.
dbDir: Directory of the database file. Defaults to the root directory.
replace: If TRUE, replace an existing database with the same name as in dbName. Otherwise, try to append the data to the existing database.
resultsName: Name of the database table to add the results to.
resultsCols: A list containing the column names and the column types of the results table. For example, list(names = c("scenario_id", ...), types = c("INTEGER", ...)). The "experiment_id" column is added automatically. Types must be column types available in SQLite.
indexOn: Defines which indexes to create. Needs to be a list of the form list(c(TABLE, COLUMN), c(TABLE, COLUMN), ...).
ncores: Number of CPU cores to use.
ncoresDT: Number of data.table threads to use on each parallel cluster.
strategy: Defines how to process the files. "batch" means that all files are read into a single data frame first, then the aggregation function is applied to that data frame and the result is added to the database. "serial" means that each individual file is processed with the aggregation function and added to the database.
appendResults: If TRUE, do not add metadata to the database and only write results.
fileFun: A function for filtering the input files. Needs to return a vector of the scenario XML files without path as in the file column of the scenario data frame. No default.
fileFunArgs: Arguments for fileFun as a (named) list.
readFun: A function for reading and processing OpenMalaria output files. Needs to return a data frame. The first argument needs to be the file name, and the function needs to accept ... as an argument. Scenario IDs are available by using scenID as an argument. If NULL, defaults to readOutputFile and the scenario IDs are added automatically.
readFunArgs: Arguments for readFun as a (named) list.
aggrFun: A function for aggregating the output of readFun. The first argument needs to be the output data frame of readFun, and the function needs to return a data frame. The data frame should NOT contain an experiment_id column, as this is added automatically. The column names need to match the ones defined in resultsCols.
aggrFunArgs: Arguments for aggrFun as a (named) list.
verbose: Boolean, toggle verbose output.
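
Examples

The contracts described for readFun, aggrFun and resultsCols can be sketched as follows. This is a minimal illustration, not the package's own reader: myReadFun, myAggrFun and myCols are hypothetical names, and the input is assumed to be a tab-separated file with four unnamed columns (survey, third dimension, measure, value). The call to collectResults is left commented out because it needs an existing experiment directory; the path shown is a placeholder.

```r
# Hypothetical reader. The first argument is the file name, scenID receives
# the scenario ID, and ... is accepted as required for readFun.
# Assumption: the file is tab-separated with four unnamed columns.
myReadFun <- function(f, scenID, ...) {
  dat <- utils::read.table(
    f, sep = "\t", header = FALSE,
    col.names = c("survey", "third_dimension", "measure", "value")
  )
  dat$scenario_id <- scenID
  dat
}

# Hypothetical aggregator. Sums value over the third dimension and returns
# a data frame without an experiment_id column.
myAggrFun <- function(df, ...) {
  stats::aggregate(value ~ scenario_id + survey + measure, data = df, FUN = sum)
}

# Column definition matching the output of myAggrFun (SQLite types).
myCols <- list(
  names = c("scenario_id", "survey", "measure", "value"),
  types = c("INTEGER", "INTEGER", "INTEGER", "NUMERIC")
)

# Try both functions on a small made-up file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("1\t1\t0\t10", "1\t2\t0\t15", "2\t1\t0\t20"), tmp)
raw <- myReadFun(tmp, scenID = 1L)
agg <- myAggrFun(raw)
print(agg)

# With an existing experiment, the pieces would be passed like this:
# collectResults(
#   expDir = "path/to/experiment",  # placeholder path
#   dbName = "results_db",          # placeholder name
#   resultsCols = myCols,
#   readFun = myReadFun,
#   aggrFun = myAggrFun,
#   strategy = "serial"
# )
```

Checking that names(agg) equals myCols$names before calling collectResults is a cheap way to catch a mismatch between the aggregation output and the table definition.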