Market basket analysis is an important topic for any retail business, online or offline. Let us begin by understanding the basics of market basket analysis.
Market basket analysis is a modeling method used to identify products that are purchased together. In other words, if a customer buys one product, what is the probability that they will buy another product along with it? Market basket analysis is also called affinity analysis.
The objective of market basket analysis is to increase sales by identifying the products that customers buy together. Based on this data or prediction, a recommendation can be displayed on the e-commerce website.
Before we move on to the association rules and measures of market basket analysis, let us understand the recommender system. It will give you a complete picture of the market basket analysis concept and objective.
A recommender system collects data on customers' purchasing behavior and predicts which products are likely to be bought together. Based on its prediction of a buyer's preferences, it recommends a list of products that the customer is more likely to buy. Amazon and Netflix are well-known examples of recommender system usage.
A knowledge-based system predicts primarily from the products bought together. Examples: retail stores and supermarkets.
Collaborative filtering uses data collected on buyers' preferences and recommends products to similar buyers. Example: online stores.
A content-based system predicts from customers' historical purchase data. Examples: movies and news.
There are multiple methods to analyze the relationships between products. Association rules are techniques for studying which products tend to occur together. Their result is a set of rules of the form "if this, then that". Such rules fall into three kinds:
1. Useful. Example: on Saturdays, Hyper city customers purchase diapers and beer together.
2. Trivial. Example: Croma customers who purchase maintenance contracts also purchase large appliances.
3. Inexplicable. Example: when a new Home Depot store opens in an area, the most commonly sold items are toilet rings.
To understand the market basket analysis measures better, consider the following case.
Assume 100 customers visit a store:
10 of them bought bread,
8 of them bought butter, and
6 bought both bread and butter.
Rule: bought bread => bought butter
Support = P(Buy both Bread & Butter) = 6/100 = 0.06
Confidence = P(Buy both Bread & Butter)/P(Buy Bread) = 0.06/0.10 = 0.6
Lift = Confidence/P(Buy Butter) = 0.6/0.08 = 7.5
Customers who bought bread are 7.5 times as likely to buy butter as customers in general. Lift measures how much better the rule performs than chance: a lift above 1 means the two products are bought together more often than they would be if the purchases were independent.
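The three measures can be checked with a few lines of Python. This is a minimal sketch that uses only the counts from the bread and butter case; the function name `rule_metrics` and its parameter names are illustrative and do not come from any library. It takes confidence as P(both)/P(antecedent) and lift as confidence/P(consequent).

```python
from fractions import Fraction


def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence and lift for the rule antecedent => consequent."""
    support = Fraction(n_both, n_total)          # P(antecedent and consequent)
    confidence = Fraction(n_both, n_antecedent)  # P(consequent | antecedent)
    lift = confidence / Fraction(n_consequent, n_total)  # confidence / P(consequent)
    return support, confidence, lift


# bread => butter: 100 customers, 10 bought bread, 8 bought butter, 6 bought both
support, confidence, lift = rule_metrics(
    n_total=100, n_antecedent=10, n_consequent=8, n_both=6
)
print(float(support), float(confidence), float(lift))  # 0.06 0.6 7.5
```

Exact fractions are used here only to keep the arithmetic free of rounding noise; plain floats would serve equally well for real data.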
Market basket analysis helps to increase sales by recommending appropriate products to the customers who are likely to buy them. It improves the store's credibility and the user experience by providing specific recommendations. It also reduces the time customers spend searching for products.
In the given Groceries dataset, we have 9835 transactions. Each line contains a list of products bought together by a customer. Our objective is to use market basket analysis to find actionable and meaningful patterns or associations in consumers' purchases. This knowledge can be leveraged for in-store layout, for boosting sales, and for online (e-commerce store) product recommendations. To carry out the market basket analysis on this dataset, we follow these steps:
1. Exploratory data analysis
2. Market basket analysis using R
3. Displaying the results to analyze the insights
The dataset contains 9835 transactions over 169 distinct items. The summary describes it as transactions stored as an itemMatrix in sparse format with 9835 rows (elements/itemsets/transactions) and 169 columns (items) and a density of 0.02609146.
Most frequently purchased products list
Density is the proportion of non-empty cells in the item matrix. In other words, it is the total number of items purchased divided by the total number of cells in the matrix (transactions × items).
Among the most frequently purchased items, whole milk comes first, appearing in 2513 transactions. The second most purchased item is vegetables, in 1903 transactions. Adding up the counts of all items gives the total number of items purchased in the dataset, which is 43,367.
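The density and the item total are two views of the same number, which a short Python check makes concrete. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
n_transactions = 9835
n_items = 169
density = 0.02609146  # as reported in the dataset summary

# Density = non-empty cells / all cells, so the number of non-empty
# cells (items purchased) is density * rows * columns.
total_cells = n_transactions * n_items
items_purchased = round(density * total_cells)
mean_basket_size = items_purchased / n_transactions

print(total_cells)                 # 1662115
print(items_purchased)             # 43367
print(round(mean_basket_size, 2))  # 4.41
```

The last line shows that the same total also gives the average basket size, a little over four items per transaction.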
The element (itemset/transaction) length distribution shows the size of the transactions: 2159 transactions contain only one item, 1643 contain two items, and so on. The largest transaction contains 32 items, and it occurs only once.
From the five-number summary, we can see that the smallest transaction has one product and the largest has 32, with a median of 3 items per transaction. The distribution is right-skewed. From the first and third quartiles, we can see that the middle 50% of transactions contain 2 to 6 products, and only 25% of transactions contain more than 6 products.
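To make the five-number summary and the skew concrete, here is a minimal Python sketch. The basket sizes are made up for illustration and are not taken from the Groceries data; the quartiles use the standard library's "inclusive" method, which may differ slightly from the method used by other tools.

```python
from collections import Counter
from statistics import mean, quantiles

# Made-up basket sizes (items per transaction), for illustration only.
sizes = [1, 1, 2, 2, 3, 3, 4, 6, 7, 12]

# How many transactions have each size (the "element length distribution").
size_counts = Counter(sizes)

# First quartile, median and third quartile.
q1, median, q3 = quantiles(sizes, n=4, method="inclusive")

print(min(sizes), q1, median, q3, max(sizes))  # 1 2.0 3.0 5.5 12
print(mean(sizes) > median)  # True: a mean above the median suggests right skew
```

A few large baskets pull the mean above the median, which is the same pattern the summary shows for the real data.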
Analyzing the items with a purchase frequency of at least 10%
The graph above shows the products with a purchase frequency of 10% or more.
The graph above shows the 20 most frequently purchased products. This chart helps with decisions about product focus and store layout. The top 5 products are whole milk, vegetables, rolls/buns, soda, and yogurt.
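The two views above (items above a frequency threshold, and the top N items) can be sketched in plain Python. The transactions below are made up for illustration, and the threshold of 0.4 is chosen only to suit this tiny sample; it is not the 10% used for the Groceries data.

```python
from collections import Counter

# Made-up transactions, for illustration only.
transactions = [
    {"whole milk", "rolls/buns"},
    {"whole milk", "yogurt", "soda"},
    {"soda", "vegetables"},
    {"whole milk", "vegetables"},
    {"vegetables", "yogurt", "whole milk"},
]

# Number of transactions that contain each item.
counts = Counter(item for basket in transactions for item in basket)

# Relative frequency = share of transactions containing the item.
n = len(transactions)
frequency = {item: count / n for item, count in counts.items()}

# View 1: items at or above a minimum frequency.
min_frequency = 0.4
frequent = sorted(item for item, f in frequency.items() if f >= min_frequency)

# View 2: the N most frequently purchased items.
top_2 = [item for item, _ in counts.most_common(2)]

print(frequent)  # ['soda', 'vegetables', 'whole milk', 'yogurt']
print(top_2)     # ['whole milk', 'vegetables']
```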
In the rule-setting step, we choose the minimum support and confidence we want for this basket analysis. Here, we create the rules with a minimum support of 0.001 and a minimum confidence of 0.8. The aim is to keep only baskets that occur often enough to matter (at least 0.1% of transactions, or about 10 of the 9835) and whose rules hold in at least 80% of the cases where they apply. This helps us analyze the product associations better.
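How the two thresholds filter rules can be seen in a small, self-contained Python sketch. It is a brute-force enumeration over made-up transactions, not the Apriori algorithm with its candidate pruning and not the R workflow used in this article. The function names are hypothetical, rules are restricted to a single item on the right-hand side as a simplifying assumption, and the support threshold of 0.3 is chosen to suit the tiny sample.

```python
from itertools import combinations

# Made-up transactions, for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"milk", "jam"},
    {"bread", "milk"},
    {"butter", "bread", "milk"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= basket for basket in transactions)


def mine_rules(min_support, min_confidence):
    """Enumerate every itemset and keep the rules that pass both thresholds."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            both = count(itemset)
            if both / n < min_support:
                continue  # basket too rare to be of interest
            for rhs in candidate:  # one item on the right-hand side
                lhs = itemset - {rhs}
                confidence = both / count(lhs)
                if confidence >= min_confidence:
                    lift = confidence / (count(frozenset([rhs])) / n)
                    rules.append((tuple(sorted(lhs)), rhs, both / n, confidence, lift))
    return rules


rules = mine_rules(min_support=0.3, min_confidence=0.8)
for lhs, rhs, support, confidence, lift in rules:
    print(lhs, "=>", rhs, round(support, 3), round(confidence, 3), round(lift, 3))
print(len(rules))  # 3
```

Raising either threshold shrinks the rule set, and lowering them grows it quickly, which is why the thresholds are chosen before the rules are inspected.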
Now let us interpret the rules output we have got for the given dataset.
Creating the rules to basket the products
Displaying the result to analyze the insight
With the given support and confidence parameters, we get 410 rules for creating the baskets. From the summary, we can see that there are 29 rules with 3 products, 229 rules with 4 products, 140 rules with 5 products, and 12 rules with 6 products.
Now let us inspect the first few rules to study them. The first 20 rules are listed below for reference.
Now let us plot the 20 rules in a graph for better representation.
An energetic entrepreneur with 12 years of corporate experience in the field of brokerage operations and functions. A Data Science Aspirant - Business Analytics and Business Intelligence postgraduate professional from a reputed B-School in India. His passion for Data Visualization, Web and social media Analytics helped him to become a learner, speaker, and writer in the space of SEO, Digital Marketing, and UI, UX designs.