vivaxmodelr.Rmd
title: “VivaxModelR: simple version” subtitle: “An R package to simulate P. vivax dynamics” author: “Clara Champagne, Maximilian Gerhards, Justin Lana, Bernardo García Espinosa, Christina Bradley, Óscar González, Justin Cohen, Arnaud Le Menach, Michael White, Emilie Pothin” output: rmarkdown::html_vignette bibliography: bibliography.json vignette: > % % % —
This package contains functions to simulate a compartmental model for Plasmodium Vivax and calculate the associated reproduction number for a given epidemiological setting, based on data on reported local and imported cases. The model includes case management and vector control.
It is based on (white2016?), and can be represented by the following diagram:where \(I_L\) is the proportion of individuals with both blood and liver stage parasites, \(I_0\) is the proportion of individuals with blood stage parasites only, \(S_L\) is the proportion of individuals with liver stage parasites only, and \(S_0\) is the proportion of susceptible individuals.
As in (white2016?), it requires 4 biological parameters: \(\lambda\) (transmission rate, called “lambda” in the package), \(r\) (blood parasites clearance rate), \(\gamma_L\) (liver parasites clearance rate, called “gamma”) and \(f\) (relapse frequency). Values from the literature can be found for \(r\), \(\gamma_L\) and \(f\). The parameter \(\lambda\) is highly setting specific and can be calculated based on incidence and importation data using the functions from this package. This also provides estimates of setting specific reproduction numbers: \(R_0\) (in the absence of control intervention) and \(R_c\) (considering current control interventions).
For more information, please refer to the associated paper ((champagne2022?)).
Creating a dummy dataset with reported incidence and the proportion of imported cases (the user should include their real data). Incidence is in cases per 1000 person-year. Each line corresponds to an administrative area or a year of data.
mydata=data.frame(id=c("regionA","regionB"),incidence=c(23,112),prop_import=c(0,0.1))
Convert incidence to cases per person day:
mydata$h=incidence_year2day(mydata$incidence)
Indicate model intervention parameter values for current case management (\(\alpha\) and \(\beta\)) and vector control (\(\omega\)), as well as the observation rate (\(\rho\)).
The parameter \(\alpha\) (called “alpha” in the package) represents the probability to receive effective cure for the blood stage infection and the parameter \(\beta\) (called “beta”) represents the probability that individuals clear their liver stage infection when receiving their treatment. The parameter \(\omega\) (called “omega”) quantifies the intensity of vector control (0 indicating perfect vector control and 1 indicating the absence of vector control). The observation rate \(\rho\) (called “rho”) indicates the proportion of all infections that are effectively observed and reported in the incidence data. The parameter \(\alpha\) should reflect the proportion of individual effectively cured (for example because they present symptoms, seek care, and receive and adhere to an effective treatment, as in (galactionova2015?) for example).
mydata$alpha=c(0.17, 0.12) # proportion of treated cases
mydata$beta=c(0.43,0.42) # proportion of radical cure
mydata$rho=c(0.18, 0.13) # reporting rate (here, we assume that all treated cases are reported)
mydata$omega=c(1,1) # no vector control
Run the model and save the results in a new database.
mydata_withR0RC=calibrate_vivax_equilibrium(df=mydata, f=1/72, gamma=1/223, r=1/60, return.all = TRUE)
#> Warning in calibrate_vivax_equilibrium(df = mydata, f = 1/72, gamma = 1/223, :
#> no rho2 parameter, assumed rho2=1
#> Warning in solve_lambda_vivax(h = my.h, r = r, gamma = gamma, f = f, alpha =
#> my.alpha, : imaginary parts discarded in coercion
Visualise the output: several additional columns were created, including \(R_0\) and \(R_c\) for each region/year.
mydata_withR0RC
#> id incidence prop_import h alpha beta rho omega rho2
#> 1 regionA 23 0.0 0.0000630137 0.17 0.43 0.18 1 1
#> 2 regionB 112 0.1 0.0003068493 0.12 0.42 0.13 1 1
#> lambda Rc R0 delta I
#> 1 0.008913383 1.0266852 1.322669 0.000000000 0.01743379
#> 2 0.007613010 0.9498208 1.129705 0.000269643 0.12462803
The transmission rate lambda has been calibrated to the incidence and import proportion and it can be used to simulate future scenarios. The quantity \(\delta\) (called “delta”) representing the daily importation rate was also calculated based on the observed data.
In this section, we will see how to use the calibrated model to simulate the impact of a future intervention, starting from the model equilibrium.
As a first step, the intervention variables describing the new intervention scenario need to be defined. For each intervention, we create an intervention object containing new parameter values after the intervention. For example, we want to increase the proportion of radical cure \(\beta\) from 0.4 to 0.6 (intervention A) or 0.8 (intervention B).
intervention_object0=list(intervention_name="baseline", "alpha.new"=NA, "beta.new"=NA, "omega.new"=NA, "rho.new"=NA )
intervention_objectA=list(intervention_name="intervA", "alpha.new"=NA, "beta.new"=0.6, "omega.new"=NA, "rho.new"=NA )
intervention_objectB=list(intervention_name="intervB", "alpha.new"=NA, "beta.new"=0.8, "omega.new"=NA , "rho.new"=NA)
Now, we use the previously calibrated model (using the values contained in mydata_withR0RC) to simulate the interventions. Make sure that the data frame contains a variable called “id” that identifies uniquely each row.
Now, we simulate the model with the new parameterisation.
my_intervention_list=list(intervention_object0, intervention_objectA, intervention_objectB)
simulation_model= simulate_vivax_interventions(df=mydata_withR0RC, my_intervention_list)
#> Starting from equilibrium
#> simulating from equilibrium
head(simulation_model)
#> id time Il I0 Sl S0 h hr hl
#> 1 regionA 0 0.01376454 0.003669252 0.01421216 0.968354 0.023 0.0129686 0.023
#> 2 regionA 365 0.01376454 0.003669252 0.01421216 0.968354 0.023 0.0129686 0.023
#> 3 regionA 730 0.01376454 0.003669252 0.01421216 0.968354 0.023 0.0129686 0.023
#> 4 regionA 1095 0.01376454 0.003669252 0.01421216 0.968354 0.023 0.0129686 0.023
#> 5 regionA 1460 0.01376454 0.003669252 0.01421216 0.968354 0.023 0.0129686 0.023
#> 6 regionA 1825 0.01376454 0.003669252 0.01421216 0.968354 0.023 0.0129686 0.023
#> hh hhl I p incidence intervention step
#> 1 0.1277778 0.1277778 0.01743379 0 23 baseline 1
#> 2 0.1277778 0.1277778 0.01743379 0 23 baseline 1
#> 3 0.1277778 0.1277778 0.01743379 0 23 baseline 1
#> 4 0.1277778 0.1277778 0.01743379 0 23 baseline 1
#> 5 0.1277778 0.1277778 0.01743379 0 23 baseline 1
#> 6 0.1277778 0.1277778 0.01743379 0 23 baseline 1
The variable time indicates the number of days since baseline. The other variables are the state variables of the model. The variable \(I\) indicates the proportion of all individuals with blood stage infection (\(I=I_L+I_0\)), i.e. prevalence (including any parasite concentration). The variable incidence indicates the incidence per 1000 person year. All variables are detailed in the following table:
Column name | Description | Unit |
---|---|---|
id | Unique identifier of the area | |
time | Number of days since baseline | days |
Il | \(I_L\), proportion of individuals with both blood and liver stage parasites | dimensionless |
I0 | \(I_0\), proportion of individuals with blood stage parasites only | dimensionless |
Sl | \(S_L\), proportion of individuals with liver stage parasites only | dimensionless |
S0 | \(S_0\), proportion of susceptible individuals | dimensionless |
h | Reported incidence | cases per person year if year=T or per person day if year=F |
hr | Reported incidence due to relapses | cases per person year if year=T or per person day if year=F |
hl | Local reported incidence | cases per person year if year=T or per person day if year=F |
hh | Total incidence | cases per person year if year=T or per person day if year=F |
hhl | Total local incidence | cases per person year if year=T or per person day if year=F |
I | Prevalence (\(I_L+I_0\)) | dimensionless |
p | Proportion of imported cases | dimensionless |
incidence | Reported incidence | cases per 1000 person year |
intervention | Name of the intervention | |
step | Simulation step | 1 corresponds to the first simulation from equilibrium |
The model outputs can be represented graphically, for example using the ggplot2 package.
library(ggplot2)
ggplot(simulation_model)+
geom_line(aes(x=time/365,y=incidence, color=intervention), lwd=1)+
facet_wrap(id ~., scales = "free_y")+xlab("Year since baseline")
simulation_model_day= simulate_vivax_interventions(df=mydata_withR0RC, my_intervention_list, year=F)
#> Starting from equilibrium
#> simulating from equilibrium
intervention_object0=list(intervention_name="baseline", "alpha.new"=NA, "beta.new"=NA, "omega.new"=NA, "rho.new"=NA )
intervention_objectB2=list(intervention_name="intervB", "alpha.new"=NA, "beta.new"=0.9, "omega.new"=NA, "rho.new"=NA )
my_intervention_list2=list(intervention_object0, intervention_objectA, intervention_objectB2)
simulation_model2= simulate_vivax_interventions(df=mydata_withR0RC, my_intervention_list2, previous_simulation = simulation_model_day, year=F)
ggplot(simulation_model2)+
geom_line(aes(x=time/365,y=I, color=intervention, linetype=as.factor(step)), lwd=1)+
facet_wrap(id ~., scales = "free_y")+xlab("Year since baseline")+ylab("Prevalence")
It is possible to include a decay in the vector control parameter \(\omega\). For example, an exponential decay, using the following function:
my_omega=vector_control_exponential_decay(initial_omega = 0.8, half_life = 3, every_x_years = 3, maxtime = 7*365 )
head(my_omega)
#> time value
#> 1 0 0.8000000
#> 2 1 0.8001266
#> 3 2 0.8002530
#> 4 3 0.8003794
#> 5 4 0.8005058
#> 6 5 0.8006320
ggplot(my_omega)+ geom_line(aes(x=time/365,y=value), lwd=1)+labs(x="years", y=expression(omega))
In this example, we implement a vector control intervention every 3 years, with an initial efficacy of \(\omega=0.8\) and a half-life of 3 years. Other patterns can be used, especially derived from the AnophelesModel package (Golumbeanu et al. in prep), as long a the dataset is provided as a dataframe with a “time” and “value” column.
intervention_objectC=list(intervention_name="intervC", "alpha.new"=NA, "beta.new"=NA, "omega.new"=0.8, "rho.new"=NA )
intervention_objectCdecay=list(intervention_name="intervCdecay", "alpha.new"=NA, "beta.new"=NA, "omega.new"=my_omega, "rho.new"=NA )
my_intervention_list_vc=list(intervention_object0, intervention_objectC, intervention_objectCdecay)
simulation_model_vc= simulate_vivax_interventions(df=mydata_withR0RC, my_intervention_list_vc, maxtime = 7*365, year=F)
#> Starting from equilibrium
#> simulating from equilibrium
ggplot(simulation_model_vc)+
geom_line(aes(x=time/365,y=incidence, color=intervention), lwd=1)+
facet_wrap(id ~., scales = "free_y")+xlab("Year since baseline")
#> Warning: Removed 1 row containing missing values (`geom_line()`).
It is possible to modify the importation parameter \(\delta\) over time. For example, we want to
leave it constant for 1 year, and then make it decrease linearly to zero
over the next 2 years (starting from the initial equilibrium
value).
We write the desired pattern in a dataframe:
my_trend=data.frame(time=c(0, # baseline
365, # after 1 year
3*365+1 # after 3 years (+1 to ensure the whole 3 year period is entirely covered)
),
relative_change=c(1, # initial value is constant (*1)
1, # still constant after 1 year (*1)
0 # equalsto zero after 3 years (*0)
))
Secondly, we create a database containing all unique regions, their respective value for \(\delta\) and the desired time points:
full=list()
full$id=mydata_withR0RC$id # extract all unique regions from previous database
full$time=c(0,365,3*365+1) # desired time points
my_delta0=dplyr::left_join(expand.grid( full ), mydata_withR0RC[,c("id","delta")])
#> Joining with `by = join_by(id)`
my_delta0
#> id time delta
#> 1 regionA 0 0.000000000
#> 2 regionB 0 0.000269643
#> 3 regionA 365 0.000000000
#> 4 regionB 365 0.000269643
#> 5 regionA 1096 0.000000000
#> 6 regionB 1096 0.000269643
We now merge this table with the pre-defined trend and build the input database for the package:
my_delta=my_delta0 %>% # create a data.frame with the
dplyr::left_join(my_trend) %>%
dplyr::mutate(value=relative_change*delta) %>%
dplyr::select(id,time,value)
#> Joining with `by = join_by(time)`
my_delta
#> id time value
#> 1 regionA 0 0.000000000
#> 2 regionB 0 0.000269643
#> 3 regionA 365 0.000000000
#> 4 regionB 365 0.000269643
#> 5 regionA 1096 0.000000000
#> 6 regionB 1096 0.000000000
This data.frame can then be used in the creation of the intervention object. Other patterns can be used, as long a the dataset is provided as a ata.frame with an “id”, “time” and “value” column.
intervention_objectD=list(intervention_name="baseline", "alpha.new"=NA, "beta.new"=NA, "omega.new"=NA, "rho.new"=NA)
intervention_objectD_delta=list(intervention_name="A", "alpha.new"=NA, "beta.new"=NA, "omega.new"=NA, "rho.new"=NA, "delta.new"=my_delta)
my_intervention_list_delta=list(intervention_objectD,intervention_objectD_delta)
simulation_model_delta=simulate_vivax_interventions(df=mydata_withR0RC, intervention_list = my_intervention_list_delta, year=F, maxtime = 365*3)
#> Starting from equilibrium
#> simulating from equilibrium
ggplot(simulation_model_delta)+
geom_line(aes(x=time/365,y=incidence, color=intervention), lwd=1)+
facet_wrap(id ~., scales = "free_y")+xlab("Year since baseline")+ylim(0,NA)
In region A, the importation rate was already equal to zero, therefore, the trajectory remains unchanged.
Creating a dummy dataset with reported numbers or total cases and local cases, i.e. as opposed to imported cases, and the population size of the area (the user should include their real data). Each line corresponds to an administrative area or a year of data. We also added the reporting and case management parameters for each setting.
mydata_2=data.frame(id=c("regionA","regionB"),cases=c(114,312),cases_local=c(110,290), population=c(10000,5000))
mydata_2$alpha=c(0.18, 0.13) # proportion of treated cases
mydata_2$beta=c(0.43,0.42) # proportion of radical cure
mydata_2$rho=c(0.17, 0.12) # reporting rate (here, we assume that all treated cases are reported)
mydata_2$omega=c(1,1) # no vector control
Draw 100 samples of the incidence and proportion of imported cases, given the observed total and local number of cases. The uncertainty modelled here is due to population size: for additional information concerning the model, please refer to the associated publication.
mydata_2u=sample_uncertainty_incidence_import(mydata_2, ndraw = 100)
Run the model and save the results in a new database.
This database includes 100 replicates of the \(R_0\) and \(R_c\) calculation, including the sampling uncertainty in the incidence and proportion of imported cases, and it can also be used further for simulations.
#nrow(mydata_withR0RC_u)
The model is a simplified representation of reality, and has therefore some limitations.
Firstly, the model does not include any form of immunity. This simplifcation is acceptable for settings with low to moderate transmission intensity. For high transmission settings (roughly observed annual incidence > 200), other models should be used instead.
Secondly, the model is deterministic and therefore neglects the randomness in very small populations. For very low transmission settings (< 20 reported cases annually), other models should be used instead.
Finally, in some instances, the models cannot provide \(R_0\) and \(R_c\) estimates (NA values, and lambda=-2). This means that the combination of incidence, importation and parameter values does not correspond to any mathematical solution. In this case, please revise the assumptions used.