Data Science, Algorithms & Analysis

A statistical and deep learning-based daily infected count prediction system for COVID-19

  • From September 2020

      To July 2020

Connecting the dots on a graph may not always reveal a bell; sometimes it reveals a bridge that shows how COVID-19 spreads.

The year 2020 is going to be remembered for the bat virus that shrank us into small data sets. A cough, cold, fever, headache, or breathing discomfort has never been so frightening or life-threatening. Now we are getting sampled, documented, isolated, and analyzed to predict who is going to cough next!

The count of COVID-positive cases per day (actual count/active count) is used by technologists across the globe to forecast the possible active cases in the future (expected count). Such a forecast helps national and local authorities, administrators, and health workers make informed decisions and take timely measures to control the spread of the pandemic. The researchers at Somaiya Vidyavihar University are also playing an active role in this noble cause. Dr. Ninad Dileep Mehendale and his team of students from K J Somaiya College of Engineering have developed a model that could forecast the expected count with 94% accuracy.

Interestingly, this study gives a new perspective on viral infection. Until now, the simple figurative explanation for COVID infection has been a bell-shaped curve: a curve that traces a slow or fast increase in the outbreak until it reaches a peak, reflecting a short period of maximum active cases, and then a sudden decline in infection. Here, the team instead observed a bridge-like pattern: a fast increase in the outbreak until it reaches a peak, then a flat line, followed by a linear decay, based on the value of daily new infected cases in each country.

The researchers collated active counts for 190+ countries, updated from 31st Jan to 10th May 2020, from the 'Coronavirus disease statistics' displayed on Wikipedia. Eight mathematical models were used in this study. The expected count from each model was compared with the actual counts (validated from 10th May to 25th May) to figure out the best mathematical model for prediction. Dr. Ninad lists their key objectives as:


  • To ‘figure out’ the best mathematical model for predicting the daily active COVID-19 count.
  • To estimate the best- and worst-case scenarios of the COVID-19 count and the nature of the curve it follows.
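
The comparison step described above, in which each model's expected counts are scored against the actual counts from the validation window, could look roughly like the sketch below. The article does not say which error measure the team used or how the 94% accuracy figure was computed, so the metric here (mean absolute percentage error), the placeholder model names, and all the numbers are assumptions for illustration only.

```python
# Illustrative sketch only: the metric (MAPE), the model names, and the counts
# below are assumptions, not values or methods taken from the study.

def mape(actual, predicted):
    """Mean absolute percentage error; assumes every actual count is non-zero."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    total = sum(abs(a - p) / a for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def best_model(actual, predictions_by_model):
    """Return (name, error) for the model with the lowest MAPE."""
    errors = {name: mape(actual, pred) for name, pred in predictions_by_model.items()}
    name = min(errors, key=errors.get)
    return name, errors[name]


# Hypothetical actual active counts over a short validation window.
actual_counts = [1000, 1100, 1200, 1300]

# Hypothetical expected counts from two placeholder models.
expected_counts = {
    "model_a": [900, 1000, 1100, 1200],
    "model_b": [990, 1110, 1190, 1310],
}

name, error = best_model(actual_counts, expected_counts)
print(f"best model: {name}, MAPE: {error:.2f}%")
```

With eight models, the same dictionary would simply hold eight entries, one per model.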

Key Findings

  • Among the eight different mathematical models tested, the recursive neural network model predicted the count with the least error.
  • The nature of the curve observed for COVID-19 is a 3-phase curve: an exponential rise (Phase 1), a steady phase (Phase 2), and a linear decay (Phase 3).
  • The phases are divided by two points: EOP1 (marks the end of Phase 1) and EOP2 (marks the end of Phase 2).
  • EOP1 indicates maximum active COVID cases, and this count remains steady in Phase 2 till EOP2, after which the number of active cases decreases.
  • India has still not reached EOP1, whereas most of the countries in the world are in the transition phase from EOP1 to EOP2.
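
The three-phase shape described in the findings above can be written as a simple piecewise function. The sketch below is only a way to visualise the idea: the exact functional form of each phase, the parameter names, and every numeric value are assumptions, not the team's fitted model.

```python
import math

# Illustrative sketch only: parameter names and values are hypothetical.


def bridge_curve(day, eop1, eop2, peak, growth_rate, decay_per_day):
    """Active count on `day` under a three-phase, bridge-like shape.

    Phase 1 (day <= eop1): exponential rise that reaches `peak` at EOP1.
    Phase 2 (eop1 < day <= eop2): the count stays flat at `peak`.
    Phase 3 (day > eop2): linear decay, floored at zero.
    """
    if day <= eop1:
        return peak * math.exp(growth_rate * (day - eop1))
    if day <= eop2:
        return float(peak)
    return max(0.0, peak - decay_per_day * (day - eop2))


# Hypothetical parameters for one imaginary country.
params = dict(eop1=40, eop2=70, peak=10000, growth_rate=0.15, decay_per_day=200)

# Print the active count every ten days to see the rise, plateau, and decay.
for day in range(0, 121, 10):
    print(day, round(bridge_curve(day, **params)))
```

In this picture, a country that has not yet reached EOP1 is still on the exponential part of the curve, while one between EOP1 and EOP2 sits on the flat "deck" of the bridge.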

An unobstructed deep breath is what we all aim for, and technologists are working toward it through deep learning. The language of math and the intelligence of computers are helping humanity figure out the best path forward to control a biological outbreak.

"The project was designed with the social motive to help people predict the outbreak."

Ninad Dileep Mehendale

Coordinator, Research Promotion-SIRAC

Somaiya Vidyavihar University

Published on: 5th June 2020

Principal Investigator


  • Ankita Shelke

    Role: Coding deep learning algorithms and developing new methodology

    SY B.Tech, Computer Engineering

    "Addressing this worldwide outbreak of coronavirus, it was essential to predict the spread and assist the government and medical centres in taking critical decisions for allocating resources. With this project, we could predict the spread count and find the general trend using various methods in deep learning."

  • Jainam Shah

    Role: Mathematical modelling of data set from 190+ countries

    SY B.Tech, Computer Engineering

    "This was a project based on the real crisis happening all over the world right now, and it was a good experience to work with real data and try to make accurate predictions as well as find general trends. Overall it gave a good start on understanding time series and disease simulation."

  • Mamata Parab

    Role: Graphics and data representation

    Researcher at Terna College and Ninad’s Research Lab

  • Vruddhi Shah

    Role: Preparing Technical reports

    SY B.Tech, Computer Engineering

    "I am grateful to our team lead for giving me this opportunity to work for this noble cause. The pandemic has created a lot of disturbances in our lives in various ways. Our lifestyle had to change, and we all had to adapt to the changing environment. Through this project, we aim to benefit society and help the government and medical staff take adequate measures and predict a general trend for the spread of this disease."
