How to leverage Unacast data to estimate business potential in a region?

Measuring the number of people visiting a certain venue, a set of venues, or a brand, is probably the most common way of using human mobility data. It is a perfectly valid use-case as long as there exists a list of existing venues and one wants to know how they perform. What if, however, a venue does not exist yet, but we still want to know which regions hold the potential for opening one?

One insight that could get us closer to answering this question is finding out in which areas certain types of services are underrepresented, overrepresented, or saturated just right, especially during the turbulent period that the Covid pandemic has brought us. Here we demonstrate one approach to visualizing and measuring the degree of disruption to the distribution of services as compared to the pre-Covid era.

Using Unacast’s Venue Data Package, we will:

  1. Show that the relative equilibrium of services (restaurants, clothing, healthcare, etc.) has been disrupted during the Covid pandemic;
  2. Quantify the degree of such disproportion; and
  3. Show which services have remained under- or over-represented and in which locations.

Typical Visitation to Services

First of all, let us explain what we mean by typical visitation. For the sake of this article, we pick 13 categories of businesses: Healthcare, Restaurants, General Retail, Beauty & Grooming, Wellness & Fitness, Miscellaneous Goods, Travel & Hospitality, Entertainment & Hobby, Grocery & Food Retail, Clothing & Accessories, Services, Home Goods & Improvement, and Auto Dealerships & Car Rentals.

Then, to represent each of the categories by a single number, we calculate a “typical daily traffic” in a given category of venues simply as the average daily count of unique visitors per venue in that category. Intuitively, we expect the number of customers visiting a hairdresser’s to be lower on average than the number of people coming to a shopping mall. Plotting these category-specific averages side by side provides a relative comparison.
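To make the definition concrete, here is a minimal sketch of how such an average could be computed. The record layout (venue id, category, date, unique visitors) and all numbers are made up for illustration and are not the actual Venue Data Package schema. We also assume that the average is taken over all venue-day observations in a category.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical venue-day records: (venue_id, category, date, unique_visitors).
# The layout and the numbers are assumptions, not the Venue Data Package schema.
visits = [
    ("v1", "Restaurants", "2019-07-01", 120),
    ("v1", "Restaurants", "2019-07-02", 80),
    ("v2", "Restaurants", "2019-07-01", 40),
    ("v3", "Beauty & Grooming", "2019-07-01", 15),
    ("v3", "Beauty & Grooming", "2019-07-02", 25),
]


def typical_daily_traffic(records):
    """Average daily unique visitors over all venue-day rows, per category."""
    counts_by_category = defaultdict(list)
    for _venue_id, category, _date, unique_visitors in records:
        counts_by_category[category].append(unique_visitors)
    return {
        category: mean(counts)
        for category, counts in counts_by_category.items()
    }


traffic = typical_daily_traffic(visits)
print(traffic)
```

Averaging per venue first and then across venues would be an equally plausible reading of the definition. The two differ when venues have unequal numbers of observed days.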


Average Number of Visitors per Venue Category.png


Adding the Dimension of Time

Placing data onto a timeline often brings life to static numbers. The line chart in the figure below displays the same measure as the chart above, that is, the average number of unique visitors per venue category. This time, however, we show the averages in quarterly snapshots. Three observations can be made right away:

  • The typical visitation per venue category is not time-agnostic. Even during 2019 we see fluctuations between the quarters, which can be to a large degree ascribed to the effect of seasonality;
  • The typical visitation per venue category in 2020 is significantly different from 2019. The effects of the Covid pandemic are undeniable; and
  • Not all categories have recovered to the pre-Covid levels. We see that, for instance, General Retail (red line) is still under-performing in 2021-Q2 when compared to 2019-Q2.


Average Number of Visitors per Venue Category over Time.png


Area’s Service Profile

Given what we’ve seen so far, the question that naturally arises is whether location is yet another factor influencing the typical visitation. And if it is, which categories are susceptible, and how susceptible are they? For that purpose, we construct a profile of an area that captures the typical visitation for each category. The radar plot below shows such a profile for Tift County, Georgia, in Q3 of 2019:

profile-tift-2019q3.png
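A profile of this kind can be thought of as one typical-visitation value per category, keyed by area and quarter. The sketch below shows one way to assemble it. The record layout, the county labels, and the numbers are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (county, quarter, category, daily_unique_visitors).
# All values are made up for illustration.
records = [
    ("Tift County, GA", "2019-Q3", "Restaurants", 90),
    ("Tift County, GA", "2019-Q3", "Restaurants", 110),
    ("Tift County, GA", "2019-Q3", "Healthcare", 60),
    ("Tift County, GA", "2019-Q4", "Restaurants", 70),
    ("Other County", "2019-Q3", "Restaurants", 30),
]


def build_profiles(rows):
    """Map each (county, quarter) to a {category: typical daily traffic} profile."""
    grouped = defaultdict(lambda: defaultdict(list))
    for county, quarter, category, visitors in rows:
        grouped[(county, quarter)][category].append(visitors)
    return {
        key: {category: mean(values) for category, values in by_category.items()}
        for key, by_category in grouped.items()
    }


profiles = build_profiles(records)
tift_q3 = profiles[("Tift County, GA", "2019-Q3")]
print(tift_q3)
```

Each entry of a profile then corresponds to one axis of the radar plot.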


When overlaid on top of each other, such profiles make it easy to visually compare two regions or two temporal snapshots of one location (Fig 4b):

spatial comparison.png
temporal comparison.png


Establishing Baseline

On the line plot above, we clearly see that the typical visitation remains rather stable throughout the year 2019 and we attribute the slight variations in some categories to the seasonal effect. This observation can be used to establish a baseline for each combination of category and location, which consequently allows us to compare relative changes between two profiles (rather than the profile values in absolute terms).

We turn the profile into a relative-to-baseline profile simply by dividing the typical traffic by the traffic in the corresponding quarter of 2019. The relative profile of typical visitation to services then contains baseline values, which are normalized to 1.0 for each category (orange “circle”), and multiples of the baseline that indicate the change from the baseline (<1 if there is a decrease from baseline, >1 if there is an increase).
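The division described above can be sketched in a few lines. The function name and the numbers are illustrative assumptions. Skipping categories with a missing or zero baseline is our choice for the sketch, not something prescribed by the data.

```python
def relative_profile(profile, baseline):
    """Divide each category's typical traffic by its same-quarter 2019 baseline."""
    relative = {}
    for category, value in profile.items():
        base = baseline.get(category)
        if not base:  # assumption: skip categories with a missing or zero baseline
            continue
        relative[category] = value / base
    return relative


# Made-up numbers for illustration only.
baseline_2019_q2 = {"Restaurants": 100.0, "General Retail": 200.0, "Healthcare": 50.0}
profile_2020_q2 = {"Restaurants": 40.0, "General Retail": 150.0, "Healthcare": 55.0}

relative = relative_profile(profile_2020_q2, baseline_2019_q2)
print(relative)
```

In this toy example, Restaurants and General Retail fall below 1.0 (a decrease from the baseline), while Healthcare lands slightly above it. Applying the function to the baseline itself yields 1.0 for every category, which is the orange “circle”.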


baseline-explanation.png



Animated snapshots over a longer time range offer a peek into the dynamics of service utilization for a given region (notice the massive deviation from the norm in 2020-Q2).


hawaii-animation.gif


Quantifying the Deviation

The deviation from the baseline can of course be quantified. By using a statistical method for comparing two distributions (the Jensen-Shannon distance), we can measure the difference between two profiles. The distance is 0 if two profiles are exactly the same (i.e., no change) and a positive real number if their distributions differ (the higher, the larger the difference). The plot below indicates a significant disruption of services in Q2 2020 which, with a decreasing tendency, persists to this day.
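For readers who want to see the mechanics, below is a minimal sketch of the Jensen-Shannon distance between two profiles. Two details are assumptions on our side for the purpose of the sketch: each profile is first rescaled so that its values sum to 1, and the logarithm is base 2, which bounds the distance between 0 and 1. The profile values are made up.

```python
import math


def normalize(profile, categories):
    """Turn a profile into a probability distribution over a fixed category order."""
    values = [profile.get(category, 0.0) for category in categories]
    total = sum(values)
    if total <= 0:
        raise ValueError("profile must contain at least one positive value")
    return [value / total for value in values]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (base-2 logarithm)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding


# Made-up profiles for illustration only.
categories = ["Restaurants", "General Retail", "Healthcare"]
baseline = {"Restaurants": 100.0, "General Retail": 200.0, "Healthcare": 50.0}
current = {"Restaurants": 40.0, "General Retail": 150.0, "Healthcare": 55.0}

p = normalize(baseline, categories)
q = normalize(current, categories)
distance = jensen_shannon_distance(p, q)
print(distance)
```

One consequence of rescaling to a distribution is that only the shape of a profile matters: if every category dropped by the same factor, the distance would still be 0. SciPy offers a ready-made `scipy.spatial.distance.jensenshannon` that could replace the hand-written function.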


image.png


Service-profile Deviations over Time and Space

Eager to know which areas are the most and the least affected, we plot the deviation from baseline for selected counties on a map. The graphic below follows the thermometer color scheme, i.e., blue represents areas with low deviation from the baseline (businesses have returned to their pre-Covid patterns), while red marks areas with high deviation.


spatial_distance_over_time.gif
Fig. 8: Spatiotemporal visualization of service distribution deviations.


ny_example.png
Fig. 9: The services in the wider New York area are returning to distributions similar to the pre-Covid period (blue regions). On the other hand, some rural areas are still disrupted (red regions).


2020q1-animation-frames_with_profiles_merged.gif
Fig. 10: A radar-plot profile helps us inspect specific categories.

Summary

We are not drawing any conclusions about specific businesses or regions. The aim of this article was purely to demonstrate one possible way to use mobility data and, at the same time, to show that the space of services has been significantly affected during the pandemic and that people’s visitation behavior has shifted (naturally, as a consequence of restrictions, or both).

While we can observe a return to the pre-Covid normal in some areas, other regions haven’t returned and might not return to the same visitation patterns any time soon. That is where new challenges and opportunities open up for those who know how to obtain the right information at the right time.
