Collect OpenMalaria results into a database
Usage
collectResults(
expDir,
dbName,
dbDir = NULL,
replace = FALSE,
resultsName = "results",
resultsCols = list(names = c("scenario_id", "survey_date", "third_dimension",
"measure", "value"), types = c("INTEGER", "TEXT", "", "TEXT", "NUMERIC")),
indexOn = list(c("results", "scenario_id")),
ncores = 1,
ncoresDT = 1,
strategy = "serial",
appendResults = FALSE,
fileFun = NULL,
fileFunArgs = NULL,
readFun = NULL,
readFunArgs = NULL,
aggrFun = NULL,
aggrFunArgs = NULL,
verbose = get("verboseOutput", envir = .pkgenv)
)
collect_results(
expDir,
dbName,
dbDir = NULL,
replace = FALSE,
resultsName = "results",
resultsCols = list(names = c("scenario_id", "survey_date", "third_dimension",
"measure", "value"), types = c("INTEGER", "TEXT", "", "TEXT", "NUMERIC")),
indexOn = list(c("results", "scenario_id")),
ncores = 1,
ncoresDT = 1,
strategy = "serial",
appendResults = FALSE,
fileFun = NULL,
fileFunArgs = NULL,
readFun = NULL,
readFunArgs = NULL,
aggrFun = NULL,
aggrFunArgs = NULL,
verbose = get("verboseOutput", envir = .pkgenv)
)
Arguments
- expDir
Directory of the experiment.
- dbName
Name of the database file without extension.
- dbDir
Directory of the database file. Defaults to the root directory.
- replace
If TRUE, replace an existing database with the same name as dbName. Otherwise, try to append the data to the existing database.
- resultsName
Name of the database table to add the results to.
- resultsCols
A list containing the column names and the column types of the results table. For example, list(names = c("scenario_id", ...), types = c("INTEGER", ...)). The "experiment_id" column is added automatically. Column types are those available in SQLite.
- indexOn
Defines which indexes to create. Needs to be a list of the form list(c(TABLE, COLUMN), c(TABLE, COLUMN), ...).
- ncores
Number of CPU cores to use.
- ncoresDT
Number of data.table threads to use on each parallel cluster.
- strategy
Defines how to process the files. "batch" means that all files are read into a single data frame first, then the aggregation function is applied to that data frame and the result is added to the database. "serial" means that each individual file is processed with the aggregation function and added to the database.
- appendResults
If TRUE, do not add metadata to the database and only write results.
- fileFun
A function for filtering the input files. Needs to return a vector of scenario XML file names, without path, as they appear in the file column of the scenario data frame. No default.
- fileFunArgs
Arguments for fileFun as a (named) list.
- readFun
A function for reading and processing OpenMalaria output files. Needs to return a data frame. The first argument needs to be the file name and the function needs to accept ... as an argument. Scenario IDs are available by using scenID as an argument. If NULL, defaults to readOutputFile and the scenario IDs are added automatically.
- readFunArgs
Arguments for readFun as a (named) list.
- aggrFun
A function for aggregating the output of readFun. The first argument needs to be the output data frame of readFun and the function needs to return a data frame. The data frame should NOT contain an experiment_id column as this is added automatically. The column names need to match the ones defined in resultsCols.
- aggrFunArgs
Arguments for aggrFun as a (named) list.
- verbose
Boolean, toggle verbose output.
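Examples
The contracts described for readFun, aggrFun and resultsCols can be illustrated with a minimal sketch. Everything named below (myReadFun, myAggrFun, myCols, the toy file and its values, the paths in the commented call) is hypothetical and only for illustration. The sketch assumes a tab-separated output file without header and with four columns (survey, third dimension, measure, value); adjust this to the actual files. It uses base R only, so it runs without the package or an OpenMalaria experiment.

```r
# Hypothetical reader. The first argument is the file name, `...` is
# accepted, and `scenID` is requested so the scenario ID can be attached.
myReadFun <- function(f, scenID, ...) {
  # Assumption: tab-separated, no header, four columns.
  out <- utils::read.table(
    f, sep = "\t", header = FALSE,
    col.names = c("survey", "third_dimension", "measure", "value")
  )
  out$scenario_id <- scenID
  out
}

# Hypothetical aggregator. The first argument is the data frame returned by
# the reader. It sums `value` over the third dimension and returns a data
# frame without an experiment_id column.
myAggrFun <- function(df, ...) {
  stats::aggregate(
    value ~ scenario_id + survey + measure,
    data = df, FUN = sum
  )
}

# Column definition matching the columns returned by myAggrFun.
myCols <- list(
  names = c("scenario_id", "survey", "measure", "value"),
  types = c("INTEGER", "INTEGER", "INTEGER", "NUMERIC")
)

# Toy input file with made-up values, used only to exercise the functions.
tmp <- tempfile(fileext = ".txt")
writeLines(
  c("1\t1\t0\t100", "1\t2\t0\t150", "1\t1\t1\t10", "1\t2\t1\t5"),
  tmp
)
res <- myAggrFun(myReadFun(tmp, scenID = 1L))
unlink(tmp)

# The aggregated columns need to match resultsCols$names.
stopifnot(setequal(names(res), myCols$names))

# Not run: requires the package and an existing experiment. The directory
# and database name are placeholders.
# collectResults(
#   expDir = "path/to/experiment",
#   dbName = "my_results",
#   resultsCols = myCols,
#   indexOn = list(c("results", "scenario_id")),
#   strategy = "serial",
#   readFun = myReadFun,
#   aggrFun = myAggrFun
# )
```

With strategy = "serial", an aggregation like this one would be applied to each file separately; with "batch", it would be applied once to the combined data frame, which matters if the aggregation spans several scenarios.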