What is Market Basket Analysis? How can market basket analysis help e-commerce business?
Market basket analysis is an important topic for every retail business, online or offline. Let us begin by understanding the basics of market basket analysis.
What is market basket analysis?
Market basket analysis is a modelling method used to identify products that are purchased together. In other words, if a customer buys one product, what is the probability that he/she will buy another product along with it? Market basket analysis is also called affinity analysis.
The objective of market basket analysis is to increase sales by identifying the products bought together by customers. Based on this data or prediction a recommendation can be displayed on the e-commerce website.
Before we move on to the association rules and measures of market basket analysis, let us understand the recommender system. It will give you a complete picture of the concept and objective of market basket analysis.
A recommender system primarily collects data on customers' purchasing behavior and predicts which products are likely to be bought together. Based on its prediction of a buyer's preferences, it recommends a list of products that the customer is more likely to buy. Amazon and Netflix are great examples of recommender system usage.
Types of Recommender System
1. Knowledge-based Recommendations
A knowledge-based system primarily predicts based on the products bought together. Examples: retail stores or supermarkets.
2. Collaborative Filtering Models
Collaborative filtering uses data collected on buyers' preferences and recommends products to similar types of buyers. Example: online stores.
3. Content-based Recommendations
A content-based system predicts based on the historical purchase data of customers. Examples: movies, news, etc.
There are multiple methods to analyze the relationships between products. Association rules are a technique for studying which products occur together. The result of this method is a set of rules of the form "if this, then that".
Types of Association
1. Useful – Example: on Saturdays, HyperCity customers purchase diapers and beer together.
2. Trivial – Example: Croma customers who purchase maintenance contracts also purchase large appliances.
3. Inexplicable – Example: when a new Home Depot store opens in an area, the most commonly sold items are toilet rings.
Three Important Measures
To understand the measures of market basket analysis better, consider the case below.
Assume 100 Customers visit a store
10 of them bought bread
8 of them bought butter, and
6 bought both bread and butter
Rule: bought bread => bought butter
Support = P(Buy both Bread & Butter) = 6/100 = 0.06
Confidence = P(Buy both Bread & Butter)/P(Buy Bread) = 0.06/0.10 = 0.6
This is a conditional probability: what is the probability that a customer who has bought bread will also buy butter?
Lift = Confidence/P(Buy Butter) = 0.6/0.08 = 7.5
Customers who bought bread are 7.5 times as likely to buy butter as customers in general. Lift is the measure used to judge the strength of a rule: a lift above 1 means the two products are bought together more often than would be expected if the purchases were independent.
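To make the arithmetic easy to check, here is a minimal sketch in base R that recomputes the three measures for the rule bread => butter from the counts in the case above. It uses the standard definitions (confidence is P(both) divided by P(bread), and lift is confidence divided by P(butter)); the variable names are only illustrative.

```r
# Counts from the worked example
n_customers <- 100  # customers who visited the store
n_bread     <- 10   # customers who bought bread
n_butter    <- 8    # customers who bought butter
n_both      <- 6    # customers who bought both bread and butter

# Rule: bread => butter
support    <- n_both / n_customers                   # P(bread and butter)
confidence <- support / (n_bread / n_customers)      # P(butter | bread)
lift       <- confidence / (n_butter / n_customers)  # confidence relative to P(butter)

print(c(support = support, confidence = confidence, lift = lift))
```

With these counts, the support works out to 0.06, the confidence to 0.6 and the lift to 7.5.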
Why is Market Basket Analysis Important for E-commerce Business?
Market basket analysis helps to increase sales by recommending appropriate products to the customers who are likely to buy them. It improves the store's credibility and the user experience by providing specific recommendations. It also reduces the time customers spend searching for products.
Market Basket Analysis with R
In the given Groceries data set, we have 9835 transactions. Each line contains the list of products bought together by a customer. Our objective is to use market basket analysis to find actionable and meaningful patterns or associations in the products purchased by consumers. This knowledge can be leveraged for store layout, for boosting sales, and for product recommendations in an online (e-commerce) store. To do the market basket analysis for the given data set, we follow the steps below:
1. Exploratory data analysis
2. Market basket analysis using R
3. Displaying the result to analyse the insight
Exploratory data analysis
There are 9835 transactions covering 169 distinct items. The transactions are stored as an item matrix in sparse format with 9835 rows (elements/item sets/transactions) and 169 columns (items) and a density of 0.02609146.
Most frequently purchased products list
Summary of Groceries data set
Density is the proportion of non-empty cells in the item matrix; in other words, the total number of items purchased divided by the total number of cells (9835 transactions × 169 items).
Among the most frequently purchased items, whole milk is the most purchased, appearing in 2513 transactions. The second most purchased item is other vegetables, with 1903. Adding up the counts for all items gives the total number of purchased items in the given data set, which is 43,367.
The elements section shows the size of the transactions: 2159 transactions have only one item, 1643 transactions have only two items, and so on. Only one transaction has 32 items.
From the five-number summary, we can see that the smallest transaction has one product and the largest has 32, with a median of 3 items per transaction. The distribution is right-skewed. From the first and third quartiles, we can interpret that the middle 50% of the transactions have 2 to 6 products, and only about 25% of transactions have more than 6 products.
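The summary discussed above can be reproduced with a short sketch like the one below. It assumes the data is the Groceries data set bundled with the arules package (its dimensions match the figures quoted here) and that arules is installed; if your data comes from a file instead, the loading step will differ.

```r
# Assumes the arules package is installed: install.packages("arules")
library(arules)

# Load the Groceries transactions bundled with arules
# (an assumption about where the article's data comes from)
data("Groceries")

# Prints the dimensions, density, most frequent items, transaction size
# distribution and the five-number summary of transaction sizes
summary(Groceries)

# Density recomputed by hand: items purchased / (transactions x items)
n_items_bought <- sum(size(Groceries))
density <- n_items_bought / (nrow(Groceries) * ncol(Groceries))
print(density)
```

Recomputing the density by hand is a quick way to confirm what the figure in the summary means.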
Displaying frequently purchased products graph
Analysing the items that have a purchase frequency of 10% or more
From the above graph, we can see the products that have a purchase frequency of 10% and above.
The above graph represents the top 20 most frequently purchased products. This chart helps with decisions on product focus and layout. The top 5 products are whole milk, other vegetables, rolls/buns, soda and yogurt.
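As a rough sketch, the two frequency charts can be drawn with itemFrequencyPlot() from the arules package. This again assumes the Groceries data bundled with arules; the variable names freq and top5 are only illustrative.

```r
library(arules)
data("Groceries")

# Relative purchase frequency (support) of every single item
freq <- itemFrequency(Groceries)

# Names of the five most frequently purchased items
top5 <- names(head(sort(freq, decreasing = TRUE), 5))
print(top5)

# Bar chart of the items bought in at least 10% of transactions
itemFrequencyPlot(Groceries, support = 0.1)

# Bar chart of the 20 most frequently purchased items
itemFrequencyPlot(Groceries, topN = 20)
```

The support argument filters items by minimum frequency, while topN simply keeps the most frequent items regardless of their frequency.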
Market basket analysis using R
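In the rule-setting section, we determine the levels of support and confidence we want for this basket analysis. Here, we create the baskets with a support level of 0.001 and a confidence level of 0.8. With 9835 transactions, a support of 0.001 means that a combination of products must appear in roughly 10 transactions, and a confidence of 0.8 means that the rule must hold in at least 80% of the transactions that contain its left-hand side. This keeps only the baskets that occur often enough to be meaningful and helps us analyze the product associations better.
First we create the rules that group the products into baskets, then we interpret the rules output we have got for the given data set.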
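A minimal sketch of the rule-mining step, assuming the apriori() function from the arules package and the Groceries data bundled with it. The support and confidence thresholds are the ones stated above; the variable name rules is illustrative.

```r
library(arules)
data("Groceries")

# Mine association rules with minimum support 0.001 and minimum confidence 0.8
rules <- apriori(
  Groceries,
  parameter = list(support = 0.001, confidence = 0.8)
)

# Number of rules that satisfy both thresholds
print(length(rules))
```

Lowering either threshold produces more rules, and raising it produces fewer, so these two values are the main knobs to adjust for your own data.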
Displaying the result to analyse the insight
With the given support and confidence parameters, we get 410 rules. From the summary, we can see that there are 29 rules with 3 products, 229 rules with 4 products, 140 rules with 5 products and 12 rules with 6 products.
Now let us inspect the first few rules to study them. The first 20 rules are listed below for reference.
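The rule summary and the listing of the first 20 rules can be sketched as follows, assuming the same arules setup and thresholds as before. The rules are taken in the order apriori() returns them; sorting by lift or confidence first is a common alternative that the article does not specify.

```r
library(arules)
data("Groceries")

# Same thresholds as in the rule-setting section; verbose output is turned off
rules <- apriori(
  Groceries,
  parameter = list(support = 0.001, confidence = 0.8),
  control = list(verbose = FALSE)
)

# Overview of the rule set: rule length distribution and quality measures
summary(rules)

# Number of rules by rule length (products on both sides of the rule)
rule_lengths <- table(size(rules))
print(rule_lengths)

# Show the first 20 rules with their support, confidence and lift
inspect(rules[1:20])
```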
Now let us plot the 20 rules in a graph for better representation.
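One way to draw such a graph is with the arulesViz package, a companion to arules. The sketch below assumes arulesViz is installed and plots the first 20 rules as a network of products and rules; which 20 rules were plotted originally is an assumption.

```r
# Assumes both packages are installed:
# install.packages(c("arules", "arulesViz"))
library(arules)
library(arulesViz)
data("Groceries")

rules <- apriori(
  Groceries,
  parameter = list(support = 0.001, confidence = 0.8),
  control = list(verbose = FALSE)
)

# Take the first 20 rules in the order apriori() returns them
top20 <- rules[1:20]

# Graph-based visualisation: items and rules are nodes, arrows show
# which items lead to which other items
plot(top20, method = "graph")
```

Plotting a small subset keeps the graph readable; with all the rules at once, the network becomes too dense to interpret.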
Benefits of Market Basket Analysis