Retailers with multiple stores selling commodity products must choose a product assortment that reflects the varying geographic and demographic conditions of the locations they serve. How can such retailers use the sheer volume of available transactional data to their best advantage? In this article, Professors Xue Bai, Sudip Bhattacharjee, Fidan Boylu and Ram Gopal of the University of Connecticut present a data mining and optimisation based decision model for product assortment across multiple stores of a company.
Global and Local Patterns of Sales
Merchandising managers and retail buying executives frequently decide on the assortment of products to be carried in a retail store. Product variety is important for decision making, and so are the consumer demographics served by the retail store as well as meeting the inventory constraints at the location. While consumer products and brand name items can be marketed using pricing, promotions, bundling and other techniques, this is usually not the case for commercially used commodity products such as oil, steel and plastics, where markets are competitive and pricing and product differentiation are not salient sales tools. For retailers selling such products, product assortment and availability are important determinants of sales success, while price is usually closely tied to the cost of the product. Retailers with multiple stores in different geographical locations have the problem of deciding on the most profitable product assortment for each store, and frequently do not have a common metric to compare the profitability of different stores, given the differences in product assortment and the diverse demographics served by the stores (see Figure 1). Measuring such a commodity retail store’s sales effectiveness is usually achieved through metrics such as total revenue, average turnover and operating margin. While useful, these efficiency measures do not usually provide growth goals for managerial decision making and planning.
To achieve the best product assortment and performance for a given store, a first step involves identifying global patterns of sales of associated products through data mining of transaction information of the different stores of a firm (see Table 1). However, without a method to identify demographics around a given retail store and estimate differentiated sales opportunities for existing stores, a centralised retail director may set similar growth goals for all the stores, which we have frequently observed to be the case in industry engagements. It is not uncommon, therefore, to find some stores that easily exceed expectations, while others seem to lag significantly behind the goals. It is possible that the stores that do not meet goals are already performing at peak efficiency. This creates subsequent planning problems as well as personnel related equity and performance disparities.
In this article, we present a data mining and optimisation based product assortment and performance assessment methodology for each store of a firm. Our methodology allows a merchandising manager to glean global knowledge from sales patterns and identify frequently purchased itemsets. We use a dataset from an industry-leading plastics manufacturer and retailer in the United States to demonstrate the utility of our model.
Frequently Purchased and Revenue Generating Itemsets
The complete set of transactions captures the purchase behaviour of client companies. To extract product dependencies, a commonly used approach in data mining is frequent itemset analysis over transactions. However, a concern that arises with frequent itemset analysis is the large number of itemsets that are generated. Moreover, some frequent itemsets can enhance the sales of a product while others can dampen it. Translating this knowledge into a viable decision-making model to further firm objectives has remained an unaddressed challenge. Key among these objectives is the development of efficiency measures for each of the stores and related product assortment selections.
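To make the idea of frequent itemset analysis concrete, the following is a minimal sketch and not the authors' implementation. It assumes each transaction is simply a list of product names, defines support as the share of transactions that contain every item in a set, and enumerates candidate sets by brute force. The function name, the product names and the 0.5 support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support} for itemsets with support >= min_support.

    Support is the fraction of transactions containing every item in the set.
    Brute-force enumeration: adequate for a toy example, not for real volumes.
    """
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore repeated items within one basket
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical transactions; product names are illustrative only.
transactions = [
    ["acrylic_sheet", "adhesive", "pvc_pipe"],
    ["acrylic_sheet", "adhesive"],
    ["pvc_pipe", "fittings"],
    ["acrylic_sheet", "adhesive", "fittings"],
]
result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Even in this toy case the number of candidate sets grows quickly with basket size, which is why some form of pruning becomes necessary on real transaction data.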
We developed a robust mechanism to prune the large number of resulting itemsets and also developed a metric to identify revenue generating items that can subsequently be used to choose beneficial itemsets (see Table 2). Using data from one of the industry’s largest plastics manufacturers and distributors in the United States, we show that when the itemsets are examined for a given industry segment, pruning significantly reduces the initial set of product association rules (on the order of tens of millions) to a very manageable number of itemsets (on the order of hundreds). Our computational results also show that frequently purchased itemsets that are bought by one industry segment significantly differ from those bought by another, with only a small number of overlapping products. This suggests that product offerings in a given store should be carefully calibrated depending on the industry segment potential around the store location.
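One simple way to quantify how much the frequently purchased products of two industry segments overlap is sketched below. This is an illustration under stated assumptions, not the pruning mechanism or metric described above: it assumes the frequent itemsets for each segment have already been computed, and it uses Jaccard similarity as a stand-in overlap measure. The segment names and products are hypothetical.

```python
def products_in(itemsets):
    """Flatten a collection of itemsets into the set of products they mention."""
    return set().union(*itemsets)


def segment_overlap(itemsets_a, itemsets_b):
    """Return (shared products, Jaccard similarity) for two segments."""
    a, b = products_in(itemsets_a), products_in(itemsets_b)
    union = a | b
    jaccard = len(a & b) / len(union) if union else 0.0
    return a & b, jaccard


# Hypothetical frequent itemsets for two industry segments.
signage = [
    frozenset({"acrylic_sheet", "adhesive"}),
    frozenset({"acrylic_sheet", "polycarbonate"}),
]
plumbing = [
    frozenset({"pvc_pipe", "fittings"}),
    frozenset({"pvc_pipe", "adhesive"}),
]

shared, jaccard = segment_overlap(signage, plumbing)
print(shared, jaccard)
```

A low similarity score between the segments that dominate two store locations would be one signal that those stores should not carry the same assortment.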
Efficiency Metric for Performance Evaluation and Growth Projection
We developed a metric to measure and compare store performance using an average, or quartile, or other ranking measure, which helps to provide differentiated growth goals for each store. This metric utilises global knowledge to optimise a store’s product assortment by taking into account the local constraints around a given store. In addition to the value created for existing stores, the methodology can also be extended to determine locations to open new stores based on location demographics.
Our analysis compares current and optimal revenues across 10 stores based on the “average” revenue capture ratio (see Figure 2). For most stores, the optimal revenues are higher than the current revenues, signifying that these stores are currently performing below average and can be targeted for growth to meet the average revenue capture ratio for all the stores.
When the same analysis is performed with a “90th percentile” revenue capture ratio, some stores remain in the 90th percentile while others drop out – suggesting that these stores could improve under the “90th percentile” metric criterion (see Figure 3).
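The benchmarking logic behind the two comparisons can be sketched as follows. This is a simplified illustration, not the optimisation model itself: it assumes the revenue capture ratio is a store's current revenue divided by an estimated local revenue potential, and that a store's target is the benchmark ratio applied to its own potential. Both definitions, the function name and the figures are assumptions made for the example.

```python
from statistics import mean, quantiles


def growth_targets(stores, benchmark="average"):
    """Return (benchmark ratio, {store: target revenue}).

    stores maps a store name to (current_revenue, potential_revenue).
    The capture ratio current / potential is an assumed definition.
    """
    ratios = [cur / pot for cur, pot in stores.values()]
    if benchmark == "average":
        target_ratio = mean(ratios)
    elif benchmark == "p90":
        # Last of the nine decile cut points, i.e. the 90th percentile.
        target_ratio = quantiles(ratios, n=10, method="inclusive")[-1]
    else:
        raise ValueError(f"unknown benchmark: {benchmark}")
    # A store already above the benchmark keeps its current revenue as target.
    targets = {
        name: max(cur, target_ratio * pot)
        for name, (cur, pot) in stores.items()
    }
    return target_ratio, targets


# Hypothetical stores: (current revenue, estimated local potential).
stores = {"A": (50, 100), "B": (30, 100), "C": (80, 100), "D": (40, 100)}

avg_ratio, avg_targets = growth_targets(stores, "average")
p90_ratio, p90_targets = growth_targets(stores, "p90")
for name, (cur, _) in stores.items():
    print(name, cur, avg_targets[name], p90_targets[name])
```

Under a stricter benchmark more stores show a gap between current and target revenue, while a store already at or above the benchmark shows none.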
In industry, it is the usual practice to fix growth goals for all stores at the same rate, only to subsequently find that some stores easily achieve the goals, while others do not. One reason is that some stores are already performing at the highest level and cannot be expected to grow on par with other stores that have more room to grow. We can thus label these high performing stores as “turnips,” because as managers well know, “you cannot squeeze blood out of a turnip.” Our method can identify lower performing stores, and because managerial incentives are frequently tied to growth performance, the firm can set data-driven differential growth projections for stores, as opposed to a “one size fits all” target.