Agent-Based Markov Modeling for Improved COVID-19 Mitigation Policies

Authors: Roberto Capobianco, Varun Kompella, James Ault, Guni Sharon, Stacy Jong, Spencer Fox, Lauren Meyers, Peter R. Wurman, Peter Stone

JAIR 2021

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we validate that the simulation behaves as expected under controlled conditions, illustrate some of the many analyses it facilitates, and most importantly, demonstrate that it enables optimization via RL. Unless otherwise specified, we consider a community size of 1,000 people and a hospital capacity of 10 people. To enable calibration with real data, we limit government actions to five regulation stages similar to those used by real-world cities (see Section 3.6 for details), and assume the government does not act until at least five people are infected. Figure 5 shows plots of a single simulation run with no government regulations (Stage 0). Figure 5(a) shows the number of people in each infection category per day. Without government intervention, all individuals get infected, with the infection peaking around the 25th day. Figure 5(b) shows the metrics observed by the government through the lens of testing and hospitalizations. This plot illustrates how the government sees information that is both an underestimate of the penetration and delayed in time from the true state. Finally, Figure 5(c) shows that the number of people in critical condition goes well above the maximum hospital capacity (denoted with a yellow line), resulting in many people being more likely to die. The goal of a good reopening policy is to keep the red curve below the yellow line, while keeping as many businesses open as possible. Figure 6 shows plots of our infection metrics averaged over 30 randomly seeded runs. Each row in Figures 6(a-o) shows the results of executing a different (constant) regulation stage (after a short initial S0 phase), where S4 is the most restrictive and S0 is no restrictions. As expected, Figures 6(p-r) show that the infection peaks, critical cases and number of deaths are all lower for more restrictive stages.
One way of explaining the effects of these regulations is that the government restrictions alter the connectivity of the contact graph. For example, in the experiments above, under stage 4 restrictions there are many more connected components in the resulting contact graph than in any of the other four cases. See Section 5.1.1 for details of this analysis. Higher stage restrictions, however, have increased socio-economic costs (computed using the second objective in Eq. 3). Our RL experiments illustrate how these competing objectives can be balanced. A key benefit of PandemicSimulator's agent-based approach is that it enables us to evaluate more dynamic policies than those described above. In the remainder of this section we analyze the model's sensitivity to its parameters, compare a set of hand-constructed policies, examine approximations of two real countries' policies, and study the impact of contact tracing. Finally, we demonstrate the application of RL to construct dynamic policies
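The connectivity explanation above can be illustrated with a small sketch that counts connected components in a contact graph. This is not the paper's analysis code: the function name `count_components` and the two toy contact lists are hypothetical, chosen only to show how removing contacts (as a stricter stage would) splits the graph into more components.

```python
from typing import Iterable, List, Tuple


def count_components(n_people: int, contacts: Iterable[Tuple[int, int]]) -> int:
    """Count connected components of an undirected contact graph (union-find)."""
    parent: List[int] = list(range(n_people))

    def find(x: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n_people  # every person starts as an isolated component
    for a, b in contacts:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components
            components -= 1
    return components


# Hypothetical toy data for six people (not taken from the paper).
open_contacts = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]  # few restrictions
restricted_contacts = [(0, 1), (2, 3)]                    # strict restrictions

print(count_components(6, open_contacts))
print(count_components(6, restricted_contacts))
```

With fewer contacts the same six people fall into more, smaller components, so an infection that starts in one component has fewer paths to reach the rest of the population.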
Researcher Affiliation Collaboration Roberto Capobianco* (EMAIL), Sony AI and Sapienza University of Rome, 00185 Rome, Italy; Varun Kompella* (EMAIL), Sony AI; James Ault (EMAIL) and Guni Sharon (EMAIL), Texas A&M University, College Station, TX 77843, USA; Stacy Jong (EMAIL), Spencer Fox (EMAIL), and Lauren Meyers (EMAIL), The University of Texas at Austin, Austin, TX 78712, USA; Peter R. Wurman (EMAIL), Sony AI; Peter Stone (EMAIL), Sony AI and The University of Texas at Austin, Austin, TX 78712, USA
Pseudocode Yes Algorithm 1: Infection Likelihood Inference
Input: daily contacts, test results, and symptoms report
Result: daily individual infection probabilities, $B$
1 Initialization:
2 Init belief state as a vector of probabilities: $B \in \mathbb{R}^{|S|}$ where, for each $s_i \in S$, $B[i] = |I|/|S|$;
3 Set the decay rate based on the infection median length ($d$): $\gamma = d$
4 Init double decay contact history as a symmetric matrix of probabilities: $C_{\gamma^2} \in \mathbb{R}^{|S| \times |S|}$;
5 Init triple decay contact history as a symmetric matrix of probabilities: $C_{\gamma^3} \in \mathbb{R}^{|S| \times |S|}$;
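As a rough illustration of the initialization step of Algorithm 1, the sketch below builds the uniform belief vector and an empty contact-history matrix. It rests on assumptions not confirmed by the excerpt: that S is the set of all individuals, that I is the set of individuals currently known to be infected, and that the contact-history matrices start at zero. The function names are hypothetical, and the decay rate is not modeled here.

```python
from typing import List, Set


def init_belief(people: List[str], infected: Set[str]) -> List[float]:
    """Uniform prior: every person starts with probability |I| / |S|."""
    prior = len(infected) / len(people)
    return [prior] * len(people)


def init_contact_history(n_people: int) -> List[List[float]]:
    """|S| x |S| matrix; zero initial values are an assumption of this sketch."""
    return [[0.0] * n_people for _ in range(n_people)]


# Hypothetical four-person community with one known infection.
people = ["a", "b", "c", "d"]
belief = init_belief(people, infected={"a"})
double_decay = init_contact_history(len(people))  # stands in for C_{gamma^2}
triple_decay = init_contact_history(len(people))  # stands in for C_{gamma^3}
```

An all-zero matrix is trivially symmetric, which matches the algorithm's description of both histories as symmetric matrices; any later update would need to write both `[i][j]` and `[j][i]` to preserve that property.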
Open Source Code Yes 1. The introduction of PandemicSimulator, a novel open-source [1] agent-based simulator that models the interactions between individuals at specific locations within a community. Developed in collaboration between AI researchers and epidemiologists (the co-authors of this paper), PandemicSimulator models realistic effects such as testing with false positive/negative rates, imperfect public adherence to social distancing measures, contact tracing, and variable spread rates among infected individuals. Crucially, PandemicSimulator models community interactions at a level of detail that allows the spread of the disease to be an emergent property of people's behaviors and the government's policies. An interface with OpenAI Gym (Brockman et al., 2016) is provided to enable support for standard RL libraries; [1] https://github.com/SonyAI/PandemicSimulator
Open Datasets Yes We use real-world data of Sweden as provided by the World Health Organization [2] to calibrate the simulator. Specifically, we use new hospitalizations data because it is least affected by imperfect testing strategies. We chose Sweden as our source of data as it was the nation where the fewest restrictions (Claeson & Hanson, 2021) were applied (at least) during the rise of the first pandemic wave, and thus where the dynamics of the virus are the most "natural", despite several factors influencing them (e.g., population density and mobility). The subsequent data following the peak is usually impacted by several factors that are either unknown to us or are not currently modeled in the simulator, such as fine-grained changes in people's behavior, exact regulations followed, significantly smaller simulated population sizes, etc. In order to match real data (time-to-peak hospitalizations of 10 weeks) we run a Bayesian Optimization algorithm on the spread rate mean in the range [0.005, 0.03] and the social distancing rate mean in the range [0.0, 0.8], resulting in final parameter settings of spread rate = 0.02056 and social distancing = 0.00198 (see Figure 3). [2] https://covid19.who.int/region/euro/country/se
Dataset Splits No The paper describes how the simulator creates a population and initializes the pandemic, as well as running multiple 'randomly seeded runs' (e.g., in Figure 6 and Section 6.4.3). However, it does not specify explicit training/testing/validation dataset splits with percentages or sample counts for any machine learning model evaluation, nor does it reference predefined splits of an external dataset.
Hardware Specification Yes All our experiments were run on a single core, using an Intel i7-7700K CPU @ 4.2GHz with 32GB of RAM.
Software Dependencies No The paper mentions using OpenAI Gym and the Soft Actor-Critic (SAC) algorithm but does not provide specific version numbers for these or any other software libraries, programming languages, or environments.
Experiment Setup Yes Table 4: Learning parameters used in our experiments
Parameter | Value | Comment
RL critic inputs | Global infection summary, stage | Critic is only used during training.
RL actor inputs | Global testing summary, stage | To keep it realistic.
RL actions | [-1, 0, 1] | Stage change
Critic and actor networks | 2 hidden layers of 256 ReLU units each |
Simulator steps per action | 24 | A new action at the start of each day
Learning rates | Critic: 1e-3, actor: 1e-4 |
SAC entropy coefficient α | 0.01 |
Stale network refresh rate | 0.005 |
RL discount factor | 0.99 |
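The Table 4 settings could be collected into a small configuration object, sketched below together with one way the stage-change actions might be applied. Several details are assumptions rather than facts from the paper: the class and field names are hypothetical, clipping the stage to the range S0 to S4 at the boundaries is a guess at how out-of-range changes are handled, and reading the "stale network refresh rate" as a soft (Polyak) target-network update coefficient is an interpretation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LearningParams:
    """Values copied from Table 4; field names are hypothetical."""
    hidden_layers: int = 2
    hidden_units: int = 256
    sim_steps_per_action: int = 24   # one action at the start of each day
    critic_lr: float = 1e-3
    actor_lr: float = 1e-4
    entropy_coef: float = 0.01       # SAC alpha
    target_refresh_rate: float = 0.005
    discount: float = 0.99


MIN_STAGE, MAX_STAGE = 0, 4  # five regulation stages, S0 (none) to S4 (strictest)


def apply_action(stage: int, action: int) -> int:
    """Apply a stage change in {-1, 0, 1}; boundary clipping is an assumption."""
    if action not in (-1, 0, 1):
        raise ValueError(f"action must be -1, 0, or 1, got {action}")
    return max(MIN_STAGE, min(MAX_STAGE, stage + action))


def soft_update(target: List[float], online: List[float], rate: float) -> List[float]:
    """Polyak averaging of target weights (assumed meaning of the refresh rate)."""
    return [(1.0 - rate) * t + rate * o for t, o in zip(target, online)]


params = LearningParams()
stage = 0
for action in (1, 1, 0, -1):  # hypothetical sequence of daily decisions
    stage = apply_action(stage, action)
```

Restricting actions to a change of at most one stage per day means the policy can only ramp restrictions up or down gradually, which keeps learned policies closer to what a real government could plausibly enact.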