GitHub Repository: suyashi29/python-su
Path: blob/master/Machine Learning Unsupervised Methods/ Day1 ARM 2.ipynb
Kernel: Python 3 (ipykernel)

Association Rule Mining (ARM) is a popular unsupervised learning technique used to discover interesting relationships between variables in large datasets. The most common application of ARM is in market basket analysis, where the goal is to find associations between items that frequently co-occur in transactions.

pip install wordcloud --trusted-host pypi.org --trusted-host files.pythonhosted.org mlxtend

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

# Sample dataset: transactions represented as a list of lists
dataset = [
    ['milk', 'bread', 'butter'],
    ['beer', 'bread'],
    ['milk', 'diapers', 'beer', 'bread'],
    ['butter', 'diapers', 'milk', 'beer', 'bread'],
    ['butter', 'diapers', 'milk', 'beer'],
]

# Convert dataset into a one-hot encoded DataFrame
te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)
df = pd.DataFrame(te_ary, columns=te.columns_)
df
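To see what the one-hot table contains without running mlxtend, the same encoding can be sketched in plain Python. This is only an illustration of the idea, not how `TransactionEncoder` is implemented, and the helper name `one_hot_encode` is made up for this example.

```python
# Same sample transactions as in the notebook cell above
dataset = [
    ['milk', 'bread', 'butter'],
    ['beer', 'bread'],
    ['milk', 'diapers', 'beer', 'bread'],
    ['butter', 'diapers', 'milk', 'beer', 'bread'],
    ['butter', 'diapers', 'milk', 'beer'],
]

def one_hot_encode(transactions):
    """Return (columns, rows): sorted item names and one boolean row per transaction."""
    columns = sorted({item for t in transactions for item in t})
    rows = [[item in t for item in columns] for t in transactions]
    return columns, rows

columns, rows = one_hot_encode(dataset)
print(columns)
for row in rows:
    print(row)
```

Each row answers, for every known item, whether that transaction contains it. This boolean layout is the input format `apriori` expects.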
# Generate frequent itemsets with a minimum support of 0.6
frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)
print("Frequent Itemsets:\n", frequent_itemsets)
Frequent Itemsets:
     support               itemsets
0       0.8                 (beer)
1       0.8                (bread)
2       0.6               (butter)
3       0.6              (diapers)
4       0.8                 (milk)
5       0.6          (beer, bread)
6       0.6        (beer, diapers)
7       0.6           (beer, milk)
8       0.6          (bread, milk)
9       0.6         (butter, milk)
10      0.6        (milk, diapers)
11      0.6  (beer, milk, diapers)
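The supports in this output can be cross-checked by brute force on such a small dataset. The sketch below counts every possible itemset directly. It is not the Apriori algorithm, which prunes candidates instead of enumerating them all, and the function name `frequent_itemsets_bruteforce` is hypothetical.

```python
from itertools import combinations

dataset = [
    ['milk', 'bread', 'butter'],
    ['beer', 'bread'],
    ['milk', 'diapers', 'beer', 'bread'],
    ['butter', 'diapers', 'milk', 'beer', 'bread'],
    ['butter', 'diapers', 'milk', 'beer'],
]

def frequent_itemsets_bruteforce(transactions, min_support):
    """Enumerate every candidate itemset and keep those meeting min_support."""
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    n = len(baskets)
    found = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = fraction of transactions containing every item in the candidate
            count = sum(1 for b in baskets if set(candidate) <= b)
            if count / n >= min_support:
                found[frozenset(candidate)] = count / n
    return found

found = frequent_itemsets_bruteforce(dataset, min_support=0.6)
for itemset, support in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

Exhaustive enumeration grows exponentially with the number of distinct items, so it is only practical as a sanity check on toy data like this.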
frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)
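# Toy example over 10 transactions (counts read as: both, pen only, book only)
pen_book = 3  # transactions containing both a pen and a book
pen = 4       # transactions containing a pen but no book
book = 3      # transactions containing a book but no pen

s_b = 6/10  # support(book) = (book + pen_book) / 10
s_p = 7/10  # support(pen)  = (pen + pen_book) / 10
print("support for book", s_b)
print("support for pen", s_p)

print("How probable is it that someone who buys a book will also purchase a pen?")
C_p_b = 3/6  # confidence(book -> pen) = pen_book / (book + pen_book)
C_b_p = 3/7  # confidence(pen -> book) = pen_book / (pen + pen_book)
print(C_p_b)
print(C_b_p)
Lift = C_p_b / s_p  # lift(book -> pen) = confidence(book -> pen) / support(pen)

support for book 0.6
support for pen 0.7
How probable is it that someone who buys a book will also purchase a pen?
0.5
0.42857142857142855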
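The arithmetic in this cell follows the usual definitions: support is the share of transactions containing an itemset, confidence of A -> B is support(A and B) divided by support(A), and lift divides that confidence by support(B). A minimal sketch that derives all three from raw counts is shown below. The function name `rule_metrics` and its parameters are illustrative, and `Fraction` is used only to keep the results exact.

```python
from fractions import Fraction

def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence and lift of antecedent -> consequent from transaction counts."""
    support = Fraction(n_both, n_total)
    confidence = Fraction(n_both, n_antecedent)
    lift = confidence / Fraction(n_consequent, n_total)
    return support, confidence, lift

# book -> pen: 10 transactions, 6 contain a book, 7 contain a pen, 3 contain both
support, confidence, lift = rule_metrics(10, 6, 7, 3)
print("support:", float(support))
print("confidence:", float(confidence))
print("lift:", float(lift))
```

A lift below 1, as here, suggests that buying a book makes a pen purchase slightly less likely than it is across all transactions.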
# Generate association rules with a minimum confidence of 0.7
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)
print("\nAssociation Rules:\n", rules)
Association Rules:
         antecedents      consequents  antecedent support  consequent support  \
0            (beer)          (bread)                 0.8                 0.8
1           (bread)           (beer)                 0.8                 0.8
2            (beer)        (diapers)                 0.8                 0.6
3         (diapers)           (beer)                 0.6                 0.8
4            (beer)           (milk)                 0.8                 0.8
5            (milk)           (beer)                 0.8                 0.8
6           (bread)           (milk)                 0.8                 0.8
7            (milk)          (bread)                 0.8                 0.8
8          (butter)           (milk)                 0.6                 0.8
9            (milk)         (butter)                 0.8                 0.6
10           (milk)        (diapers)                 0.8                 0.6
11        (diapers)           (milk)                 0.6                 0.8
12     (beer, milk)        (diapers)                 0.6                 0.6
13  (beer, diapers)           (milk)                 0.6                 0.8
14  (milk, diapers)           (beer)                 0.6                 0.8
15           (beer)  (milk, diapers)                 0.8                 0.6
16           (milk)  (beer, diapers)                 0.8                 0.6
17        (diapers)     (beer, milk)                 0.6                 0.6

    support  confidence      lift  leverage  conviction  zhangs_metric
0       0.6        0.75  0.937500     -0.04         0.8          -0.25
1       0.6        0.75  0.937500     -0.04         0.8          -0.25
2       0.6        0.75  1.250000      0.12         1.6           1.00
3       0.6        1.00  1.250000      0.12         inf           0.50
4       0.6        0.75  0.937500     -0.04         0.8          -0.25
5       0.6        0.75  0.937500     -0.04         0.8          -0.25
6       0.6        0.75  0.937500     -0.04         0.8          -0.25
7       0.6        0.75  0.937500     -0.04         0.8          -0.25
8       0.6        1.00  1.250000      0.12         inf           0.50
9       0.6        0.75  1.250000      0.12         1.6           1.00
10      0.6        0.75  1.250000      0.12         1.6           1.00
11      0.6        1.00  1.250000      0.12         inf           0.50
12      0.6        1.00  1.666667      0.24         inf           1.00
13      0.6        1.00  1.250000      0.12         inf           0.50
14      0.6        1.00  1.250000      0.12         inf           0.50
15      0.6        0.75  1.250000      0.12         1.6           1.00
16      0.6        0.75  1.250000      0.12         1.6           1.00
17      0.6        1.00  1.666667      0.24         inf           1.00
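To make the metric columns less opaque, the values for a single rule can be recomputed by hand. The sketch below does this for beer -> diapers (row 2) using the standard textbook formulas for confidence, lift, leverage and conviction. It is an independent check rather than mlxtend's code, it omits Zhang's metric, and the helper names `support` and `rule_metrics` are invented for this example.

```python
from fractions import Fraction

dataset = [
    ['milk', 'bread', 'butter'],
    ['beer', 'bread'],
    ['milk', 'diapers', 'beer', 'bread'],
    ['butter', 'diapers', 'milk', 'beer', 'bread'],
    ['butter', 'diapers', 'milk', 'beer'],
]
baskets = [set(t) for t in dataset]

def support(items):
    """Exact fraction of transactions that contain every item in `items`."""
    return Fraction(sum(1 for b in baskets if set(items) <= b), len(baskets))

def rule_metrics(antecedent, consequent):
    s_a = support(antecedent)
    s_c = support(consequent)
    s_ac = support(set(antecedent) | set(consequent))
    confidence = s_ac / s_a
    lift = confidence / s_c
    leverage = s_ac - s_a * s_c
    # Conviction divides by (1 - confidence), so it is infinite when confidence is 1
    conviction = float("inf") if confidence == 1 else (1 - s_c) / (1 - confidence)
    return {"confidence": confidence, "lift": lift,
            "leverage": leverage, "conviction": conviction}

metrics = rule_metrics({"beer"}, {"diapers"})
for name, value in metrics.items():
    print(name, float(value))
```

Calling `rule_metrics({"diapers"}, {"beer"})` gives a confidence of 1 and therefore an infinite conviction, which is why several rows in the table show `inf`.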