Smarter location decisions through analytics

Even though the online share of retail sales in Norway is about 10% and is expected to increase further to 12.7% in 2018 (see note 1), the lion’s share of retail trade is still realized in physical stores.

Auke Hunneman


Despite the growth in e-commerce and its potential threat to bricks-and-mortar retailers (see, for example, NRK 2014), the physical world that we inhabit continues to dictate our (online and offline) shopping behavior (Bell 2014). An important reason for this is that the “real” world imposes frictions on us, for example through the distance that we need to travel from home to our preferred retail stores. Retailers try to reduce those frictions by finding the right store locations, namely those that minimize the travel distance from their stores to the target group. At the same time, competitive pressures make it difficult to find commercial property at good sites for attractive prices. Hence, the old adage that the three most important factors in retailing are “location, location, and location” still applies to today’s retail environment.

Historically, location decisions were, and sometimes still are, based on the gut feeling of the retailer, who evaluates each location using his knowledge of and experience in the market. However, increased competition and the dynamics of the retail environment make experience a less reliable guide and call for more systematic approaches to location choices. This, in combination with the increased availability of high-quality customer-level data, offers great opportunities for developing and using a statistical model to support location decisions. In this article, we introduce such a model, based on the author’s dissertation (Hunneman 2011), and we discuss the potential benefits of using the model as well as its data requirements.

The decision whether and where to open an additional store depends on that store’s contribution to company performance. This implies that, on the one hand, we need an estimate of the store’s potential turnover, while, on the other hand, we have to identify the costs associated with opening and maintaining a store at a particular location. The costs of, for example, renting commercial property and hiring personnel can be projected relatively easily. However, the potential revenues of a new location are much harder to predict. Hence, a statistical model that helps estimate those figures is a useful tool for evaluating store locations. Based on the predicted revenues for each location and its associated costs, the company’s management can then decide whether it wants to invest in a particular location. The model proposed in this article is thus a decision support tool; it will not give a definite answer to the question of whether to invest. Rather, the retailer can use the model to make better-informed decisions by combining the model estimates with his own judgement. Another application of the model is to evaluate the performance of existing stores. By using statistical models, it is possible to obtain an estimate of a store’s expected sales level, given its attributes, including its location and the competitive and demographic environment it operates in. If actual sales fall short of the predicted figure, one can investigate the reasons more carefully. Did construction work on the main street affect traffic negatively, or was it perhaps something more structural?
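The performance check for existing stores can be sketched in a few lines of Python. This is only an illustration, not the model itself: the store names, the sales figures, and the 10% tolerance are hypothetical, and the predicted sales are assumed to come from whatever statistical model the retailer has estimated.

```python
def underperforming_stores(actual, predicted, tolerance=0.10):
    """Return stores whose actual sales fall short of predicted sales
    by more than `tolerance` (a hypothetical threshold, as a fraction)."""
    flagged = {}
    for store, expected in predicted.items():
        # Relative shortfall: positive when the store sells less than predicted.
        shortfall = (expected - actual[store]) / expected
        if shortfall > tolerance:
            flagged[store] = shortfall
    return flagged


# Hypothetical figures, for illustration only.
predicted_sales = {"Store A": 1000.0, "Store B": 2000.0, "Store C": 1500.0}
actual_sales = {"Store A": 980.0, "Store B": 1500.0, "Store C": 1600.0}

flagged = underperforming_stores(actual_sales, predicted_sales)
for store, shortfall in flagged.items():
    print(f"{store}: {shortfall:.0%} below the predicted sales level")
```

A flagged store is a prompt for further investigation rather than a verdict; the appropriate tolerance will depend on how accurate the retailer’s predictions are.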

We model the revenue potential of (new) store locations by using an advanced regression model that assumes that a store’s performance depends on store, market, and consumer characteristics. Store characteristics may include variables such as the store’s size and the composition of its assortment, while market characteristics can be the locations of competitors and other retail stores. Consumer characteristics are the number of households in the store’s trade area and their sociodemographic make-up. The trade area of a store is identified from transaction data from the chain’s customer loyalty program (see note 2). Thanks to the transaction data collected as part of such programs, many retailers know where their customers live and are able to delineate the trade areas for all their stores. Estimating the model on data for existing stores enables us to determine the relative impact of the drivers of store performance and to predict the potential revenues for new locations.

We estimate the revenue potential of a new location by following a fixed procedure. First, we determine the trade area of the new store, which we can predict based on the spatial distribution of the chain’s current customers; customer locations are usually registered as part of a company’s loyalty program. The next step is to estimate, for all zip codes belonging to the store’s trade area, the following components of the revenue equation:

  • The penetration rate of the loyalty program;
  • The average number of visits per customer;
  • The average expenditures per visit.

If we multiply the above-mentioned components by the number of households in a zip code, we obtain an estimate of the potential revenues in that zip code. If we do so for every zip code in the trade area and sum the resulting estimates, we get an impression of the total revenues that can be generated at a certain location.
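The calculation described here can be sketched as follows. The class and field names, the zip code labels, and all numbers are hypothetical and chosen for illustration; in practice the three components would be the model’s estimates for each zip code.

```python
from dataclasses import dataclass


@dataclass
class ZipCodeEstimate:
    zip_code: str
    households: int             # number of households in the zip code
    penetration_rate: float     # share of households in the loyalty program (0 to 1)
    visits_per_customer: float  # average number of visits per customer
    spend_per_visit: float      # average expenditure per visit


def zip_revenue(z):
    """Households x penetration x visits per customer x spend per visit."""
    return (z.households * z.penetration_rate
            * z.visits_per_customer * z.spend_per_visit)


def location_revenue(trade_area):
    """Sum the revenue estimates over all zip codes in the trade area."""
    return sum(zip_revenue(z) for z in trade_area)


# Hypothetical trade area consisting of two zip codes.
trade_area = [
    ZipCodeEstimate("zip_a", 1200, 0.10, 6.0, 250.0),
    ZipCodeEstimate("zip_b", 800, 0.05, 4.0, 300.0),
]
total = location_revenue(trade_area)
print(f"Estimated revenue potential: {total:,.0f}")
```

Keeping the three components separate, rather than estimating revenue per zip code directly, is what later makes it possible to see which component drives a change in revenues.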

Our modeling approach assumes that the drivers of store performance can have a different impact on the penetration of the loyalty program, the average number of visits per customer, and the average expenditures per visit. By doing so, we allow the retailer to identify the exact reason for a change in revenues. For instance, a drop in store revenues may be the result of either a declining number of customers or a fall in the number of visits per customer. Increasing revenues (again) requires a different approach in each of these cases. In the first case, the retailer may try to increase brand awareness through an advertising campaign, while in the second case rewarding customers for the number of times they visit the store seems a more appropriate solution.
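One simple way to see which component lies behind a change in revenues is to compare the components between two periods. The sketch below is an illustration rather than part of the proposed model: it relies on the fact that, when revenue is the product of the number of customers, the visits per customer, and the spend per visit, the logarithm of the revenue change equals the sum of the logarithmic changes in the components. All names and figures are hypothetical.

```python
import math


def revenue_change_by_component(before, after):
    """Split the log change in revenue into the log changes of its components."""
    return {name: math.log(after[name] / before[name]) for name in before}


# Hypothetical store figures for two periods: only the customer count drops.
before = {"customers": 1000.0, "visits_per_customer": 5.0, "spend_per_visit": 200.0}
after = {"customers": 900.0, "visits_per_customer": 5.0, "spend_per_visit": 200.0}

contributions = revenue_change_by_component(before, after)

# The component with the largest absolute log change is the main driver.
main_driver = max(contributions, key=lambda name: abs(contributions[name]))
print(f"Main driver of the revenue change: {main_driver}")
```

In this example the decline comes entirely from the number of customers, which corresponds to the first case discussed above.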

In order to successfully apply the model, the following data needs to be collected:

  • Transaction data per store at the individual customer level. This data is often routinely collected as part of a company’s customer loyalty program.
  • Attributes of each store including the number of competitors.
  • Sociodemographic data at the zip code level. This data can be obtained from commercial data vendors or from Statistics Norway.
  • Travel distances between all zip codes.
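
If no table of travel distances is available, straight-line distances between zip code centroids can serve as a rough first approximation. The sketch below uses the haversine (great-circle) formula; it is an assumption made for illustration and not the distance measure used in the model, since actual travel distances over the road network will generally be longer. The zip code labels are hypothetical, and the coordinates are roughly those of central Oslo and Bergen.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_table(centroids):
    """Build a symmetric lookup table of distances between all zip codes."""
    table = {(z, z): 0.0 for z in centroids}
    for (zip_a, pos_a), (zip_b, pos_b) in combinations(centroids.items(), 2):
        d = haversine_km(pos_a, pos_b)
        table[(zip_a, zip_b)] = d
        table[(zip_b, zip_a)] = d
    return table


# Hypothetical zip code labels with approximate centroid coordinates.
centroids = {"zip_a": (59.9139, 10.7522), "zip_b": (60.3913, 5.3221)}
distances = distance_table(centroids)
print(f"{distances[('zip_a', 'zip_b')]:.0f} km")
```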

A major advantage of the proposed model is that it can be used for any type of retailer, whether it is an apparel chain or a grocer. For example, the performance of a clothing store may depend largely on the attractiveness of the area in the immediate vicinity of the store, while the revenues of a grocery store may depend more on the availability of free parking space. The researcher can adjust the set of explanatory variables in the model to fit the specifics of the focal retailer.

In sum, this article proposes a model to support location decisions. The proposed model can easily be applied in different settings, and it can be implemented as a decision support system in, for example, Microsoft Excel. The model predicts the revenue potential of candidate sites and shows how these revenues are generated. Hence, it is a useful tool in the search for the best “location, location, and location”.

1 Please visit for more information.

2 Alternatively, we can use the model developed by Hunneman and Van Oest (2012). This model can determine a store’s trade area and revenue potential based on aggregate sales figures for all stores without having access to loyalty program data (please see Magma 2012(3)).


The author thanks Olessia Bankovskaya for proofreading the manuscript.


Bell, D.R. (2014). Location is (still) everything: the surprising influence of the real world on how we search, shop, and sell in the virtual one. Houghton Mifflin Harcourt, Boston.

Hunneman, A. (2010). Advances in methods to support location and design decisions. Doctoral thesis, University of Groningen. SOM Research School, Groningen.

Hunneman, A. and R.D. van Oest (2012). Å estimere handelsområder uten å følge kundene hjem. Magma 2012(3): 35-41.

NRK (2014). Netthandelen øker eksplosivt. [available at].
