COVID-19’s Geographic Distribution along Major US Interstates and Potential Impact on Trucking Industry

Chirality Research Inc
5 min read · Mar 21, 2020

With the number of active COVID-19 cases in the U.S. increasing at an exponential rate, it is imperative to understand all factors contributing to its spread. The primary factors are well communicated: proximity to someone infected, recent travel to a high-intensity location, and age or underlying health conditions that make one more susceptible to infection. Beyond these primary factors, however, lie some intriguing questions. Does geography impact spread? What is the statistical probability of contracting COVID-19 based solely on geographic coordinates? Are there any geographical patterns evident in the outbreak of COVID-19?

To help answer these questions, open-source public data from Johns Hopkins University, Texas Public Health and Worldometer was used to create a dynamic, county-level map of Texas showing active, confirmed cases, as illustrated in Figure 1. In addition to identifying areas potentially prone to an outbreak, historical trends from other countries may help in preparing for one.

Figure 1. County level breakdown of active, confirmed COVID-19 cases as of March 20, 2020 0600 hours

The number of active cases in Texas jumped from 40 to 160 in a span of four days, as displayed in Figure 2. The majority of these cases are concentrated around major highways.

Figure 2. Number of active, confirmed cases from March 16th to March 20th
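The jump from 40 to 160 cases over four days can be turned into a rough doubling time of about two days. The snippet below is a minimal sketch of that arithmetic, assuming pure exponential growth between the two reported counts; the function name is illustrative and not part of the original analysis.

```python
import math

def doubling_time_days(start_cases, end_cases, elapsed_days):
    """Doubling time implied by two counts, assuming pure exponential growth."""
    if start_cases <= 0 or end_cases <= start_cases or elapsed_days <= 0:
        raise ValueError("need 0 < start_cases < end_cases and elapsed_days > 0")
    # The growth rate r solves: end_cases = start_cases * exp(r * elapsed_days)
    growth_rate = math.log(end_cases / start_cases) / elapsed_days
    return math.log(2) / growth_rate

# Counts reported in the article: 40 -> 160 active cases in four days
print(doubling_time_days(40, 160, 4))
```

A quadrupling in four days is two doublings, which is where the two-day figure comes from. Real case counts rarely follow a clean exponential, so this is only a back-of-the-envelope check.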

The map shows a clear pattern in Texas: outbreaks have been concentrated around major highways such as I-35 and I-10. This is consistent with the general understanding that the spread of COVID-19 increases with the frequency of travelers, especially those arriving from infected areas. I-35 in particular serves as the north-south artery of Texas and is part of a corridor between Mexico and Canada, so it is bound to carry drivers travelling all across North America. According to the American Transportation Research Institute, both highways appear on its list of the 100 most congested highways in the United States. This has serious implications for preventing the spread of COVID-19, both into Texas from other states and from Texas to other states.

A major concern is the U.S. trucking industry, since these highways are also heavily used by freight trucks. According to the Bureau of Transportation Statistics, trucks are the most utilized mode for moving goods between the United States and its neighbors Canada and Mexico, accounting for 64% of that freight: approximately $721 billion of the $1.1 trillion in freight flows. If truck drivers were to get sick, it would take a serious toll on the economy and severely hinder freight transport.
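One way to put a number on "concentrated around major highways" is to measure what share of cases falls in counties close to a highway corridor. The sketch below is an assumption-laden illustration, not the method used for the map: it approximates I-35 by a handful of city coordinates (Dallas, Waco, Austin, San Antonio, Laredo), uses county centroids, and runs on placeholder case counts that are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def share_near_corridor(counties, waypoints, radius_km):
    """Fraction of cases in counties whose centroid is within radius_km of any waypoint."""
    total = sum(c["cases"] for c in counties)
    if total == 0:
        return 0.0
    near = sum(
        c["cases"]
        for c in counties
        if min(haversine_km(c["lat"], c["lon"], wlat, wlon)
               for wlat, wlon in waypoints) <= radius_km
    )
    return near / total

# Approximate city coordinates, used as a coarse stand-in for the I-35 route:
# Dallas, Waco, Austin, San Antonio, Laredo
I35_WAYPOINTS = [(32.78, -96.80), (31.55, -97.15), (30.27, -97.74),
                 (29.42, -98.49), (27.50, -99.51)]

# Placeholder rows: names and case counts are hypothetical, for illustration only
sample_counties = [
    {"name": "near_corridor_example", "lat": 30.30, "lon": -97.70, "cases": 8},
    {"name": "far_from_corridor_example", "lat": 31.76, "lon": -106.49, "cases": 2},
]
print(share_near_corridor(sample_counties, I35_WAYPOINTS, radius_km=50))
```

Distance to a few waypoints is a crude proxy. A fuller analysis would use the actual road geometry and compare the result against population density, since large cities sit on these highways regardless of any outbreak.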

Recommendations for State Agencies and Trucking Companies:

· Place COVID-19 test centers along the highways, ideally near gas stations or frequently used rest stops, to make it convenient for truck drivers to get tested

· Establish contingency plans in the event of spread amongst truck drivers

· Adopt a logistical, data-driven approach to sanitizing rest stops, restrooms and gas station nozzles along these major highways

There are, however, limitations to the data acquired from different agencies, and extracting data from multiple sources may help define a data strategy that is dynamic and detailed. The data sources used in this study and some of their associated limitations are outlined in Table 1. While Johns Hopkins University provided time series data at a global scale and a demographic breakdown was available on Worldometer, the project yielded only a Texas-specific map because county-level granularity was provided by Texas Public Health. To study possible geographical patterns, it is crucial to obtain data with as much granularity as possible. Data at the country level or even the state level was not sufficient, because the geographic region would be too big to identify any significant patterns. The typical drawback of this level of granularity is a compromise in time series data.

Table 1. Data sources as of March 20, 2020, 0600 hours

Further, the data acquisition process poses several challenges, including scraping state-level public websites and infrequent updates from state officials. To combat this, a few script-driven models were developed to detect changes in the numbers on federal websites for state- and county-level data on an hourly basis, ensuring the map remains updated.
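The article does not describe how the change-detection scripts work, so the following is only a minimal sketch of one common approach: fingerprint the page content and compare it with the fingerprint from the previous check. The function names, the example URL and the stand-in fetcher are all hypothetical; in practice `fetch_page` would wrap an HTTP client, and a scheduler such as cron would run the check hourly.

```python
import hashlib

def content_fingerprint(page_text):
    """Stable fingerprint of page content; whitespace is normalised first."""
    normalised = " ".join(page_text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def check_for_update(fetch_page, url, last_fingerprint):
    """Return (changed, new_fingerprint) for the page at url.

    fetch_page is any callable that takes a URL and returns the page text.
    """
    fingerprint = content_fingerprint(fetch_page(url))
    return fingerprint != last_fingerprint, fingerprint

# Usage with a stand-in fetcher, so the example runs without network access
pages = {"https://example.org/county-cases": "County A: 3\nCounty B: 5"}
url = "https://example.org/county-cases"

changed, fingerprint = check_for_update(pages.get, url, None)
print(changed)  # first check: nothing to compare against, so it counts as changed

pages[url] = "County A: 4\nCounty B: 5"  # simulate an update by the publisher
changed, fingerprint = check_for_update(pages.get, url, fingerprint)
print(changed)
```

Hashing the whole page will also flag cosmetic changes such as a new timestamp, so a production version would more likely extract the case table first and fingerprint only that.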

A process is underway to extend this workflow to all states. So far it has been extended successfully to Louisiana and Arkansas; however, county-level data for Arkansas is available only as ranges rather than absolute numbers, so a few assumptions had to be made. Despite the workflow in place, this process has proven difficult because the majority of U.S. states have not open-sourced county-level data. This data should be made available to the data science community so that patterns such as these can be discovered and used to establish logistical contingency plans.
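The assumptions made for the Arkansas ranges are not spelled out here. As a hedged illustration of what such an assumption could look like, the sketch below takes the midpoint of each reported range. Both the midpoint rule and the `low-high` text format are assumptions made for this example.

```python
def range_to_estimate(case_range):
    """Convert a reported range such as '1-4' into a single midpoint estimate.

    A plain number such as '7' is passed through unchanged (as a float).
    """
    text = case_range.strip()
    if "-" not in text:
        return float(text)  # already an absolute number
    low, high = (int(part) for part in text.split("-", 1))
    if low > high:
        raise ValueError(f"invalid range: {case_range!r}")
    return (low + high) / 2

print(range_to_estimate("1-4"))
print(range_to_estimate("7"))
```

Other choices, such as using the lower bound for a conservative map or the upper bound for a worst case, are equally defensible; what matters is applying one rule consistently and stating it alongside the map.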

In parallel with the data-driven map, an anonymous health risk assessment application was developed that quantifies potential risk by incorporating an individual’s geographical and health information. The application, shown in Figure 3, performs a risk assessment based on key questions about an individual’s recent travel history and a high-level summary of pre-existing health concerns. From the answers, the application outputs a Mild, Moderate or Severe assessment reflecting how likely the individual is to have contracted COVID-19 in the past two weeks and whether the symptoms they are experiencing may instead be flu symptoms. Development is ongoing, and the application uses the CDC’s guidelines to assess risk.

Figure 3. Health risk assessment application that assesses the risk of contracting COVID-19 based on recent travel history and pre-existing health conditions
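To make the idea of a questionnaire-driven assessment concrete, here is a minimal rule-based sketch that maps yes/no answers to the three output levels. The questions, weights and cut-offs are illustrative placeholders only: they are not CDC guidance and not the scoring used by the application described above.

```python
def assess_risk(recent_travel_to_affected_area, close_contact_with_case,
                has_symptoms, has_preexisting_condition):
    """Map yes/no answers to 'Mild', 'Moderate' or 'Severe'.

    Weights and thresholds are hypothetical and chosen only to show the
    structure of a rule-based assessment.
    """
    score = 0
    score += 2 if close_contact_with_case else 0
    score += 1 if recent_travel_to_affected_area else 0
    score += 2 if has_symptoms else 0
    score += 1 if has_preexisting_condition else 0
    if score >= 4:
        return "Severe"
    if score >= 2:
        return "Moderate"
    return "Mild"

# Example: recent travel and symptoms, no known contact, no pre-existing condition
print(assess_risk(True, False, True, False))
```

Keeping the scoring in a single pure function makes it easy to review the rules against published guidance and to update the thresholds as that guidance changes.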

Chirality Research Inc. is a data science startup that analyzes structured, unstructured and real-time data to provide actionable solutions that help our clients manage their day-to-day operations or devise a roadmap for their long-term strategies.

Dr. Huzeifa Ismail, Founder, Chirality Research Inc

Dr. Huzeifa Ismail, Founder, has more than 10 years of experience addressing cross-industry challenges in engineering and data science. Ismail holds bachelor’s and master’s degrees from Brandeis University and a PhD in chemical physics from the Massachusetts Institute of Technology, and has authored numerous technical articles and patents.

http://ftp.dot.state.tx.us/pub/txdot/my35/planning/corridor-plan.pdf

https://www.bts.gov/newsroom/2017-north-american-freight-numbers
