### Introduction

With the COVID-19 pandemic raging across India, we have been under lockdown since March 25th, 2020. The lockdown has been widely welcomed by close to 1.3 billion people, even though it has brought their lives to a standstill. The 800-pound gorilla in the room, of course, is the question: **“When should this lockdown be relaxed, and how do we know that we are making progress?”**

In any epidemic, **R_{t}** is the measure known as the *effective reproduction number*. **It is the average number of people who become infected by an infectious person at time t.** The most well-known version of this number is the basic reproduction number, **R_{0}**, which is R_{t} when t = 0. However, **R_{0}** is a single measure that does not adapt to changes in behaviour and restrictions.

As a pandemic evolves, tightening (or relaxing) restrictions changes R_{t}. Knowing the current R_{t} is essential for policy-based decision making. When R_{t} > 1, each infected person infects more than one other person on average, so the number of cases keeps growing; when R_{t} < 1, the outbreak shrinks. The lower R_{t}, the more manageable the situation.

The value of R_{t} helps us in:

- Understanding how effective the non-pharmaceutical interventions have been in controlling the outbreak.
- Deciding whether we should tighten or relax restrictions, given our competing goals of economic prosperity and saving human lives.^{[1]}

Somehow this particular insight has been largely missed by the world. Except for Hong Kong^{[2]}, no one seems to be tracking this number, at least not in real time. The number is also not that useful at the national level. The key is to understand it at the state or district level, where decisions on tightening or relaxing non-pharmaceutical interventions are implemented.

In this post, let’s discuss a framework for doing this for the Indian states of Telangana (where I am based), Maharashtra, and Tamil Nadu, where the number of COVID-19 cases seems to be growing at the fastest rate in India.

As part of future work, we will try to do the same at the district/city level for a better understanding of R_{t} at the ground level.

This post borrows heavily from the work of Bettencourt and Ribeiro^{[3]} and from Kevin’s GitHub repository^{[4]}.

### Approach

We have an estimate of the number of new COVID-19 patients each day, and we can use it to estimate the current value of R_{t}. The value of R_{t} depends on R_{t-1} (yesterday’s value) and, through it, on every earlier value R_{t-n}.

We can use Bayes Rule to update our belief about R_{t}, based on the new infection data that we are seeing each day.

**P(R_{t} | k) = [P(R_{t}) · Likelihood(R_{t} | k)] / P(k)**

The above equation can be interpreted as follows: having seen k cases, the distribution of R_{t} is equal to:

- The prior belief about the value of R_{t}, P(R_{t})
- Times the likelihood of R_{t} given that we have seen k cases
- Divided by the probability of seeing k cases under all hypotheses of R_{t}.

Importantly, P(k) is a constant that does not depend on R_{t}, so the posterior is proportional to the numerator. Since all probabilities must sum to 1.0, we can ignore P(k) and simply normalize the posterior so that it sums to 1.0:

**P(R_{t} | k) ∝ P(R_{t}) · Likelihood(R_{t} | k)**
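As a minimal sketch of this update, assume R_{t} is restricted to a small grid of candidate values. The grid, the uniform prior, and the likelihood numbers below are made up purely for illustration; only the multiply-then-normalize step comes from the equation above.

```python
def bayes_update(prior, likelihood):
    """Return the normalized posterior over a discrete grid of hypotheses."""
    # Numerator of Bayes' rule: prior times likelihood, per candidate value.
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    # The sum plays the role of P(k): it rescales the posterior to sum to 1.
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical grid of candidate R_t values with a uniform prior.
r_grid = [0.5, 1.0, 1.5, 2.0]
prior = [1 / len(r_grid)] * len(r_grid)

# Illustrative likelihood values, one per candidate R_t (not real data).
likelihood = [0.01, 0.04, 0.10, 0.05]

posterior = bayes_update(prior, likelihood)
for r, p in zip(r_grid, posterior):
    print(f"R_t = {r:.1f}: posterior = {p:.3f}")
```

Because the prior here is uniform, the posterior is just the likelihood rescaled to sum to 1.0.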

Of course, this is for one day. Generalizing this across all the previous days we have measurements for, we can write the same as

**P(R_{t} | k) ∝ P(R_{0}) · Likelihood(R_{n} | k_{n}) · Likelihood(R_{n-1} | k_{n-1}) ⋯ Likelihood(R_{1} | k_{1})**

With a uniform prior P(R_{0}), this reduces to:

**P(R_{t} | k_{t}) ∝ ∏ Likelihood(R_{t} | k_{t})**

One potential issue with this Bayesian approach is that the posterior is influenced by events in the distant past as much as by recent ones. In our case, this would mean that if R_{t} > 1 for a long period and has only recently come under control (R_{t} < 1), the posterior will stay stuck at values > 1 for a long time.

**Of course, this would not work for us, because the entire purpose of this exercise is to see when R_{t} has dipped below 1.**

One way to resolve this would be to just use the previous “m” days for calculating the likelihood function, rather than the entire history.
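The effect of the window can be sketched as follows. The function name `windowed_posterior`, the two-point grid, and the daily likelihood values are all hypothetical; the sketch assumes a uniform prior and strictly positive likelihoods, and it sums logarithms rather than multiplying raw values to avoid numerical underflow.

```python
import math


def windowed_posterior(daily_likelihoods, m):
    """Combine the last m days of likelihoods over a shared grid of R_t values.

    daily_likelihoods[d][i] is the likelihood of the i-th candidate R_t on
    day d. Assumes m >= 1, a uniform prior, and likelihoods greater than 0.
    """
    window = daily_likelihoods[-m:]
    n = len(window[0])
    # Product of likelihoods, computed as a sum of logs for stability.
    log_post = [sum(math.log(day[i]) for day in window) for i in range(n)]
    peak = max(log_post)
    unnormalized = [math.exp(v - peak) for v in log_post]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two candidate values only, to keep the illustration readable.
r_grid = [0.5, 1.5]
# Made-up data: three early days favour R_t = 1.5, two recent days favour 0.5.
days = [[0.1, 0.9], [0.1, 0.9], [0.1, 0.9], [0.8, 0.2], [0.8, 0.2]]

full_history = windowed_posterior(days, m=5)
recent_only = windowed_posterior(days, m=2)
print("all 5 days :", [round(p, 3) for p in full_history])
print("last 2 days:", [round(p, 3) for p in recent_only])
```

With the full history, the posterior still favours R_{t} = 1.5 despite the recent improvement; with m = 2 it switches to R_{t} = 0.5, which is the behaviour we want.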

**LIKELIHOOD FUNCTION:**

We will use the Poisson distribution as the likelihood function for this analysis, as it is a commonly used model for the “number of arrivals” in a given time period. Given an average arrival rate of λ new cases per day, the probability of seeing *k* new cases follows the Poisson distribution:

**P(k | λ) = (λ^{k} e^{-λ}) / k!**

**Figure 1: Poisson Distribution**
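For readers who want to evaluate the Poisson formula directly, here is a small sketch using only the Python standard library. The helper name `poisson_pmf` and the example rate of 20 cases per day are illustrative choices, not values from the analysis.

```python
import math


def poisson_pmf(k, lam):
    """Probability of seeing k arrivals when the average rate is lam (> 0)."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Work in log space: k*log(lam) - lam - log(k!) avoids huge factorials.
    log_p = k * math.log(lam) - lam - math.lgamma(k + 1)
    return math.exp(log_p)


# Illustrative rate: 20 new cases per day on average.
lam = 20.0
for k in (10, 20, 30):
    print(f"P(k = {k} | lambda = {lam}) = {poisson_pmf(k, lam):.4f}")
```

The probabilities peak around k ≈ λ and fall away on either side, which is the shape a Poisson plot shows.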

##### DERIVING R_{t} FROM λ

The most important step in this work is connecting R_{t} to λ. The derivation itself is outside the scope of this blog post, but it can be found here.

**λ = k_{t-1} e^{ϒ(R_{t} - 1)}**

Here ϒ is taken to be the reciprocal of the serial interval (5 days for COVID-19), and k_{t-1} is the number of new cases on the previous day.
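To get a feel for this relationship, the sketch below computes λ for a few candidate values of R_{t}. The function name `expected_cases` and the figure of 100 cases yesterday are hypothetical; ϒ is set to 1/5, following the 5-day serial interval quoted above.

```python
import math

GAMMA = 1 / 5  # reciprocal of the 5-day serial interval quoted in the text


def expected_cases(k_prev, r_t, gamma=GAMMA):
    """Expected new cases today (lambda), given yesterday's count and R_t."""
    return k_prev * math.exp(gamma * (r_t - 1))


# Hypothetical example: 100 new cases were reported yesterday.
for r_t in (0.5, 1.0, 2.0):
    print(f"R_t = {r_t}: lambda = {expected_cases(100, r_t):.2f}")
```

When R_{t} = 1 the expected count equals yesterday's count; values above 1 push λ up and values below 1 pull it down.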

The problem can now be written as

**Likelihood(R_{t} | k) = (λ^{k} e^{-λ}) / k!**

As the next step, we just have to perform the Bayesian update using this likelihood function, which in this case we have chosen to be Poisson.
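Putting the pieces together for a single day, a rough sketch might look like the following. The grid of candidate R_{t} values (0.0 to 6.0 in steps of 0.1), the case counts, and the function name `likelihood_over_grid` are assumptions made for illustration, and this is not the exact code used for the analysis. It assumes a uniform prior and a positive case count on the previous day.

```python
import math

GAMMA = 1 / 5  # reciprocal of the 5-day serial interval quoted in the text


def likelihood_over_grid(k_prev, k_today, r_grid, gamma=GAMMA):
    """Poisson likelihood of each candidate R_t, given two consecutive days."""
    values = []
    for r_t in r_grid:
        # Expected cases today if this candidate R_t were the true value.
        lam = k_prev * math.exp(gamma * (r_t - 1))
        # Poisson probability of k_today under that rate, in log space.
        log_p = k_today * math.log(lam) - lam - math.lgamma(k_today + 1)
        values.append(math.exp(log_p))
    return values


# Hypothetical grid of candidates and made-up counts (not real state data).
r_grid = [i / 10 for i in range(0, 61)]
likelihood = likelihood_over_grid(k_prev=100, k_today=122, r_grid=r_grid)

# With a uniform prior, the posterior is the likelihood rescaled to sum to 1.
total = sum(likelihood)
posterior = [v / total for v in likelihood]
best_r = r_grid[posterior.index(max(posterior))]
print(f"Most likely R_t on this grid: {best_r}")
```

A single day of data gives a wide posterior, which is why the likelihoods from several recent days are multiplied together in practice.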

### Just to Summarize

### Data for the Real World

We have used data from the COVID-19 India Tracker website (https://www.covid19india.org/). We have extracted the data for the states of Telangana, Maharashtra, and Tamil Nadu for the period 14^{th} March 2020 to 14^{th} April 2020.

We are in the process of collecting more data, but the present analysis is limited to the above mentioned three states.

### Analysis

The analysis has been conducted for each of the three states of Telangana, Maharashtra, and Tamil Nadu.