Association Rules and Their Use Cases in Machine Learning
In this article, I am going to discuss Association Rules and their Use Cases in Machine Learning with Examples. Please read our previous article, where we discussed TF-IDF and Cosine Similarity in Machine Learning and their application to the Vector Space Model with Examples.
Association Rules in Machine Learning
Association rule learning is a rule-based machine learning method that uses pattern recognition to discover relationships between items in a dataset, including items that seem unrelated at first glance.
It has been used in retail since the 1990s to analyze which products customers purchase at the same time. This helps store managers devise better product placement, discounting, and stock management strategies. The technique can also be applied in many other sectors.
Take the case of tea and sugar, which are usually purchased together. You might increase revenue by using this information to:
- Place tea and sugar near each other so that customers who buy one don’t have to walk far to get the other.
- Advertise to customers who purchase either tea or sugar in order to increase the likelihood that they purchase the other product.
- Offer a discount to customers who buy both tea and sugar at the same time.
An association rule consists of two parts:
- An antecedent: an item (or set of items) found in the data.
- A consequent: an item (or set of items) found in combination with the antecedent.
Applying this to our tea and sugar example, we get the following rule: “If tea is purchased, sugar is likely to be purchased as well.” It is written as {tea} → {sugar}, where {tea} is the antecedent and {sugar} is the consequent.
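As a minimal sketch, a rule such as {tea} → {sugar} can be represented in Python as a pair of item sets. The `Rule` type and its `applies_to` and `holds_in` methods are hypothetical names chosen only for illustration.

```python
from typing import FrozenSet, NamedTuple


class Rule(NamedTuple):
    """An association rule: antecedent -> consequent (hypothetical helper type)."""
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]

    def applies_to(self, basket: set) -> bool:
        # The rule is relevant to a basket that contains every antecedent item.
        return self.antecedent <= basket

    def holds_in(self, basket: set) -> bool:
        # The rule holds when the basket contains both antecedent and consequent.
        return (self.antecedent | self.consequent) <= basket


tea_to_sugar = Rule(frozenset({"tea"}), frozenset({"sugar"}))

print(tea_to_sugar.applies_to({"tea", "bread"}))  # True: tea was bought
print(tea_to_sugar.holds_in({"tea", "bread"}))    # False: no sugar in this basket
print(tea_to_sugar.holds_in({"tea", "sugar"}))    # True
```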
Ways to Measure Association in Machine Learning
There are three main ways to measure association: support, confidence, and lift.
Let’s use a coffee and sugar example to illustrate these measures. Consider the following situation:
- The store has recorded 6000 transactions in total.
- 1600 transactions contain coffee (C).
- 1000 transactions contain sugar (S).
- 800 transactions contain both coffee and sugar (C, S).
1. Support –
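This is the relative frequency of an item (or set of items) in the dataset, that is, the fraction of all transactions that contain it. It indicates how popular the item is.
Based on the above-mentioned example:
Support(C) = 1600 / 6000 ≈ 0.267 (26.7%)
Support(S) = 1000 / 6000 ≈ 0.167 (16.7%)
Support(C, S) = 800 / 6000 ≈ 0.133 (13.3%)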
2. Confidence –
This is the probability of seeing the consequent in a transaction, given that the transaction also contains the antecedent. In other words, it indicates the likelihood of one item being purchased if another is purchased.
Based on the above-mentioned example:
Confidence(C → S) = (transactions containing both C and S) / (transactions containing C) = 800 / 1600 = 0.5
This means that there is a 50% chance that a customer who buys coffee also buys sugar.
3. Lift –
This metric measures how much more often the antecedent and consequent occur together than would be expected if they were purchased independently of each other.
- If the Lift score is less than one, S is less likely to be purchased when C is purchased.
- If the Lift score is more than one, C is positively associated with S. To put it another way, if C is bought, it’s more likely that S will be bought as well.
- If the Lift score is 1, there is no association between C and S.
Based on the above-mentioned example:
Lift(C → S) = Confidence(C → S) / Support(S) = (800 / 1600) / (1000 / 6000) = 3
This score is greater than one, which reflects that if coffee is purchased, it is likely that sugar will be purchased as well.
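The three measures can be checked with a few lines of Python. This is a minimal sketch that uses the transaction counts from the example; the function and constant names are assumptions made for illustration.

```python
def support(count: int, total: int) -> float:
    """Fraction of all transactions that contain the item or itemset."""
    return count / total


def confidence(count_both: int, count_antecedent: int) -> float:
    """Share of antecedent transactions that also contain the consequent."""
    return count_both / count_antecedent


def lift(count_both: int, count_antecedent: int, count_consequent: int, total: int) -> float:
    """Confidence of the rule divided by the support of the consequent."""
    return confidence(count_both, count_antecedent) / support(count_consequent, total)


# Counts from the example: C is the antecedent, S is the consequent.
TOTAL, N_C, N_S, N_BOTH = 6000, 1600, 1000, 800

print(f"Support(C, S)    = {support(N_BOTH, TOTAL):.3f}")         # 0.133
print(f"Confidence(C->S) = {confidence(N_BOTH, N_C):.2f}")        # 0.50
print(f"Lift(C->S)       = {lift(N_BOTH, N_C, N_S, TOTAL):.2f}")  # 3.00
```

A lift of 3 means that the two items appear together three times as often as would be expected if they were bought independently.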
Implementing Association Rule Mining in Machine Learning
Association rule mining is one of the most important concepts in data mining and machine learning. It is used to find co-occurrence patterns in large datasets. We derive a set of rules that describe how the presence of certain items relates to the presence of others. Telephone calling patterns, suspicious activity patterns, patterns in disease symptoms, and customer shopping habits are all examples of such patterns. Here, we’ll concentrate on customer shopping habits, an application better known as Market Basket Analysis.
Market Basket Analysis is one of the most widely used methods for deciding on product placement in a store and on offers that increase overall sales. The objective is to group products that tend to be used together. This is likely to increase sales, because placing them together reminds or encourages people to purchase the related item. To do this, we generate candidate association rules between products and determine which ones are the strongest. Now the challenge is: how can we create and test these association rules efficiently? The Apriori Algorithm is the answer.
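To make the idea of testing item combinations concrete, here is a minimal sketch that counts how often each pair of items appears in the same basket. The five baskets are made-up illustration data, not real sales records.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, invented for illustration only.
baskets = [
    {"tea", "sugar", "milk"},
    {"tea", "sugar"},
    {"coffee", "sugar"},
    {"tea", "milk"},
    {"bread", "milk"},
]

pair_counts = Counter()
for basket in baskets:
    # sorted() gives each pair a canonical order, so (a, b) and (b, a) match.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

for pair, count in pair_counts.most_common(3):
    print(pair, f"support = {count / len(baskets):.2f}")
```

Counting every combination this way does not scale: with n distinct items there are 2^n - 1 possible non-empty itemsets, which is why a smarter search strategy is needed for a real store.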
It is called ‘Apriori’ because it uses prior knowledge: the frequent itemsets already found in the existing transactions guide the search for larger ones. Before we go into how it works, let’s have a look at the properties it relies on:
- Every subset of a frequent itemset must also be frequent (the downward closure property).
- Every superset of an infrequent itemset must also be infrequent.
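These properties are what allow candidates to be discarded without counting them in the data. As a minimal sketch (the function name `has_infrequent_subset` and the sample itemsets are assumptions for illustration), a candidate of size k can be rejected as soon as one of its subsets of size k-1 is not known to be frequent:

```python
from itertools import combinations


def has_infrequent_subset(candidate: frozenset, frequent_smaller: set) -> bool:
    """Return True if any (k-1)-item subset of the candidate is not frequent."""
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_smaller
        for subset in combinations(candidate, k - 1)
    )


# Hypothetical frequent 2-itemsets found in an earlier pass.
frequent_pairs = {
    frozenset({"tea", "sugar"}),
    frozenset({"tea", "milk"}),
    frozenset({"sugar", "milk"}),
    frozenset({"tea", "bread"}),
}

# All three pairs inside {tea, sugar, milk} are frequent, so it stays a candidate.
print(has_infrequent_subset(frozenset({"tea", "sugar", "milk"}), frequent_pairs))   # False

# {sugar, bread} is not frequent, so {tea, sugar, bread} can be pruned.
print(has_infrequent_subset(frozenset({"tea", "sugar", "bread"}), frequent_pairs))  # True
```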
We now know how to select strong association rules using these measures, so we are ready to see how the Apriori algorithm works as a whole. Let’s have a look at the flowchart that demonstrates how the Apriori algorithm works –
First, we must choose the support and confidence thresholds. For example:
- Support threshold: 30 percent
- Confidence threshold: 70 percent
These thresholds are the minimum requirements applied during pruning: itemsets below the support threshold are not treated as frequent, and rules below the confidence threshold are not treated as strong.
We must also recognize that these threshold values should be chosen according to the type of goods and the size of the market. In practice, a business carries so many items that a support threshold as high as 30% is rarely workable, because most items are not daily necessities and appear in only a small share of transactions.
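Putting the pieces together, the following is a simplified sketch of the level-wise Apriori search followed by rule generation. It is an illustration under stated assumptions, not a reference implementation: the function names `apriori` and `generate_rules` and the five baskets are hypothetical, and the thresholds are the 30% support and 70% confidence values from the example above.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)

    def support_of(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support_of(frozenset({item}))
        if s >= min_support:
            current[frozenset({item})] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support_of(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence, lift) for strong rules."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                consequent = itemset - antecedent
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, consequent, conf, conf / frequent[consequent]))
    return rules


# Hypothetical baskets, invented for illustration only.
baskets = [
    frozenset({"tea", "sugar", "milk"}),
    frozenset({"tea", "sugar"}),
    frozenset({"coffee", "sugar"}),
    frozenset({"tea", "sugar", "bread"}),
    frozenset({"milk", "bread"}),
]

frequent = apriori(baskets, min_support=0.3)
for antecedent, consequent, conf, lift in generate_rules(frequent, min_confidence=0.7):
    print(set(antecedent), "->", set(consequent), f"confidence={conf:.2f}, lift={lift:.2f}")
```

On this toy data, only {tea, sugar} survives as a frequent pair, so the search stops after the second level. Lowering `min_support` lets more itemsets through, at the cost of more candidates to count.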
In the next article, I am going to discuss the Recommendation Engine and its Working in Machine Learning with Examples. In this article, I tried to explain Association Rules and their Use Cases in Machine Learning with Examples. I hope you enjoyed this article.