Path: blob/master/notebooks/book1/22/matrix_factorization_recommender_surprise_lib.ipynb
Kernel: Python 3
Matrix Factorization for MovieLens Recommendations Using the Surprise Library
In [109]:
Surprise library for collaborative filtering
http://surpriselib.com/ (Surprise stands for Simple Python RecommendatIon System Engine.)
In [110]:
Out[110]:
Requirement already satisfied: surprise in /usr/local/lib/python3.7/dist-packages (0.1)
Requirement already satisfied: scikit-surprise in /usr/local/lib/python3.7/dist-packages (from surprise) (1.1.1)
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.7/dist-packages (from scikit-surprise->surprise) (1.15.0)
Requirement already satisfied: numpy>=1.11.2 in /usr/local/lib/python3.7/dist-packages (from scikit-surprise->surprise) (1.19.5)
Requirement already satisfied: scipy>=1.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-surprise->surprise) (1.4.1)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-surprise->surprise) (1.0.1)
In [111]:
In [112]:
In [113]:
In [114]:
Out[114]:
[6040, 3706, 1000209]
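The three numbers above are presumably the user count, item count, and rating count of the training set (they match the published MovieLens 1M statistics). Under that assumption, a quick sketch of how sparse the ratings matrix is; the variable names are illustrative and not taken from the notebook:

```python
# Assumed meaning of the output above: [n_users, n_items, n_ratings].
n_users, n_items, n_ratings = 6040, 3706, 1000209

n_cells = n_users * n_items      # size of the full user-by-item matrix
density = n_ratings / n_cells    # fraction of cells with an observed rating

print(f"{n_cells} cells, {density:.2%} observed")
```

Fewer than one cell in twenty is observed, which is why the model is fit on the list of known ratings rather than on a dense matrix.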
Setting Up the Ratings Data
We download the data directly from the MovieLens website, since the license does not allow redistribution. We want the metadata (movie titles, genres, etc.) as well, not just the ratings matrix.
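As a rough sketch of what reading the metadata involves: the files in the ml-1m archive use `::` as the field separator, with movies stored as id, title, genres and ratings as user id, movie id, rating, timestamp. The helper `parse_dat_line` and the field-name tuples below are hypothetical names chosen for illustration, and the sample lines only imitate the file format.

```python
def parse_dat_line(line, field_names):
    """Split one '::'-delimited MovieLens 1M record into a dict of strings."""
    values = line.rstrip("\n").split("::")
    return dict(zip(field_names, values))

# Field layouts assumed from the ml-1m README.
MOVIE_FIELDS = ("movie_id", "title", "genres")
RATING_FIELDS = ("user_id", "movie_id", "rating", "timestamp")

# Sample records written in the file format (the timestamp is illustrative).
movie = parse_dat_line("1::Toy Story (1995)::Animation|Children's|Comedy\n", MOVIE_FIELDS)
rating = parse_dat_line("1::1193::5::978300760\n", RATING_FIELDS)

genres = movie["genres"].split("|")   # genres are pipe-separated
print(movie["title"], genres)
print(rating["user_id"], rating["movie_id"], float(rating["rating"]))
```

All fields come back as strings, which is consistent with the string-typed raw ids printed further down. When reading the real files, an explicit encoding such as Latin-1 may be needed, since some titles contain non-ASCII characters.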
In [115]:
Out[115]:
--2021-04-20 14:51:23-- http://files.grouplens.org/datasets/movielens/ml-1m.zip
Resolving files.grouplens.org (files.grouplens.org)... 128.101.65.152
Connecting to files.grouplens.org (files.grouplens.org)|128.101.65.152|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5917549 (5.6M) [application/zip]
Saving to: ‘ml-1m.zip.4’
ml-1m.zip.4 100%[===================>] 5.64M 6.80MB/s in 0.8s
2021-04-20 14:51:24 (6.80 MB/s) - ‘ml-1m.zip.4’ saved [5917549/5917549]
Archive: ml-1m.zip
replace ml-1m/movies.dat? [y]es, [n]o, [A]ll, [N]one, [r]ename: N
ml-1m ml-1m.zip.1 ml-1m.zip.3 sample_data
ml-1m.zip ml-1m.zip.2 ml-1m.zip.4
In [116]:
In [117]:
In [118]:
Out[118]:
In [119]:
Out[119]:
Toy Story (1995)
Schindler's List (1993)
In [120]:
Out[120]:
Animation|Children's|Comedy
Drama|War
In [121]:
Out[121]:
In [122]:
Out[122]:
uid inner 0, raw 1, iid inner 0, raw 1193, rating 5.0
uid inner 0, raw 1, iid inner 1, raw 661, rating 3.0
uid inner 0, raw 1, iid inner 2, raw 914, rating 3.0
uid inner 0, raw 1, iid inner 3, raw 3408, rating 4.0
uid inner 0, raw 1, iid inner 4, raw 2355, rating 5.0
uid inner 0, raw 1, iid inner 5, raw 1197, rating 3.0
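The output above distinguishes raw ids (the strings found in the data file) from inner ids (consecutive integers used internally by the training set). Surprise exposes this mapping through `Trainset.to_inner_uid`, `to_inner_iid`, `to_raw_uid` and `to_raw_iid`. The sketch below imitates the idea in plain Python, assuming inner ids are handed out in order of first appearance; `build_id_maps` is a hypothetical helper, not part of the library.

```python
def build_id_maps(ratings):
    """Assign inner ids 0, 1, 2, ... to raw ids in order of first appearance."""
    raw2inner_uid, raw2inner_iid = {}, {}
    for raw_uid, raw_iid, _ in ratings:
        # setdefault only inserts when the raw id has not been seen before.
        raw2inner_uid.setdefault(raw_uid, len(raw2inner_uid))
        raw2inner_iid.setdefault(raw_iid, len(raw2inner_iid))
    return raw2inner_uid, raw2inner_iid

# The six ratings shown in the output above, as (raw uid, raw iid, rating).
ratings = [("1", "1193", 5.0), ("1", "661", 3.0), ("1", "914", 3.0),
           ("1", "3408", 4.0), ("1", "2355", 5.0), ("1", "1197", 3.0)]

uid_map, iid_map = build_id_maps(ratings)
for raw_uid, raw_iid, r in ratings:
    print(f"uid inner {uid_map[raw_uid]}, raw {raw_uid}, "
          f"iid inner {iid_map[raw_iid]}, raw {raw_iid}, rating {r}")
```

Dense integer ids let the factor matrices be indexed directly by row, while the raw ids are kept for joining back to titles and genres.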
In [123]:
Out[123]:
['1193', '661', '914', '3408', '2355', '1197', '1287', '2804', '594', '919']
<class 'str'>
3706
In [124]:
Out[124]:
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
6040
In [125]:
Out[125]:
69
[(1231, 3.0), (1113, 3.0), (559, 4.0), (97, 3.0), (255, 3.0), (1240, 2.0), (597, 4.0), (631, 1.0), (1448, 3.0), (1195, 3.0), (540, 3.0), (338, 5.0), (54, 4.0), (78, 3.0), (246, 3.0), (985, 3.0), (266, 3.0), (275, 2.0), (1454, 2.0), (159, 3.0), (859, 5.0), (156, 4.0), (1237, 2.0), (244, 3.0), (99, 4.0), (104, 3.0), (151, 3.0), (1456, 3.0), (199, 4.0), (1891, 2.0), (945, 3.0), (88, 4.0), (799, 3.0), (617, 3.0), (1450, 4.0), (213, 5.0), (669, 5.0), (1922, 4.0), (92, 3.0), (1518, 3.0), (1529, 3.0), (829, 2.0), (1468, 3.0), (132, 1.0), (955, 3.0), (1249, 3.0), (882, 3.0), (167, 4.0), (315, 3.0), (41, 4.0), (381, 4.0), (210, 4.0), (2451, 3.0), (135, 3.0), (1511, 3.0), (15, 4.0), (711, 3.0), (215, 2.0), (1475, 4.0), (216, 3.0), (728, 4.0), (1154, 3.0), (67, 4.0), (434, 5.0), (1228, 3.0), (48, 5.0), (546, 4.0), (1009, 2.0), (170, 3.0)]
['1248', '2991', '1252', '589', '6', '1267', '1276', '1292', '905', '910', '1446', '913', '3068', '1610', '1617', '942', '3083', '2289', '955', '3095', '3417', '3418', '3435', '296', '3654', '2858', '457', '3683', '1304', '1177', '1179', '1188', '3101', '2300', '2186', '1387', '858', '3307', '110', '3341', '3504', '3362', '3366', '2571', '164', '2726', '1783', '318', '2599', '1961', '34', '1036', '3720', '3735', '2935', '2791', '2000', '1201', '2944', '2947', '2948', '2949', '1213', '1221', '1222', '2028', '1233', '707', '1244']
3637
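The last number above, 3637, equals 3706 - 69, so it is presumably the number of items this user has not rated, which is the candidate pool for recommendations. A minimal sketch of that computation on toy data (the catalogue size and ratings below are made up for illustration):

```python
n_items = 6  # toy catalogue; inner item ids run from 0 to n_items - 1

# (inner item id, rating) pairs for one user, in the same shape as the list above.
user_ratings = [(4, 3.0), (1, 5.0), (2, 4.0)]

rated = {iid for iid, _ in user_ratings}
unrated = [iid for iid in range(n_items) if iid not in rated]

print(len(user_ratings), unrated, len(unrated))
```

For all users at once, Surprise offers `Trainset.build_anti_testset()`, which builds the list of (user, item) pairs that are absent from the training set.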
Join with metadata
In [126]:
In [127]:
Out[127]:
Fit/predict
In [128]:
Out[128]:
Processing epoch 0
Processing epoch 1
Processing epoch 2
Processing epoch 3
Processing epoch 4
Processing epoch 5
Processing epoch 6
Processing epoch 7
Processing epoch 8
Processing epoch 9
Processing epoch 10
Processing epoch 11
Processing epoch 12
Processing epoch 13
Processing epoch 14
Processing epoch 15
Processing epoch 16
Processing epoch 17
Processing epoch 18
Processing epoch 19
<surprise.prediction_algorithms.matrix_factorization.SVD at 0x7fc9540a6bd0>
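The 20 "Processing epoch" lines are consistent with Surprise's `SVD` run with its default `n_epochs=20` and `verbose=True`. That algorithm predicts a rating as global mean + user bias + item bias + dot product of user and item factors, and fits the parameters by stochastic gradient descent with L2 regularization. The sketch below is a plain-Python illustration of that idea on toy data, not the library's implementation; `fit_biased_mf`, `predict` and `rmse` are hypothetical names, and the toy ratings are invented.

```python
import math
import random

def fit_biased_mf(ratings, n_users, n_items, n_factors=2, n_epochs=20,
                  lr=0.005, reg=0.02, seed=0):
    """Fit r_hat(u, i) = mu + bu[u] + bi[i] + dot(p[u], q[i]) by SGD."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)   # global mean rating
    bu = [0.0] * n_users
    bi = [0.0] * n_items
    p = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(n_epochs):
        for u, i, r in ratings:
            dot = sum(a * b for a, b in zip(p[u], q[i]))
            err = r - (mu + bu[u] + bi[i] + dot)
            # Gradient step on squared error plus an L2 penalty.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]   # use the old values for both updates
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return mu, bu, bi, p, q

def predict(model, u, i):
    mu, bu, bi, p, q = model
    return mu + bu[u] + bi[i] + sum(a * b for a, b in zip(p[u], q[i]))

def rmse(model, ratings):
    sq = [(r - predict(model, u, i)) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(sq) / len(sq))

# Toy (inner uid, inner iid, rating) triples, invented for illustration.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

untrained = fit_biased_mf(ratings, 3, 3, n_epochs=0)   # global mean plus random factors
model = fit_biased_mf(ratings, 3, 3, n_epochs=200, lr=0.01)

rmse_before = rmse(untrained, ratings)
rmse_after = rmse(model, ratings)
print(f"train RMSE before: {rmse_before:.3f}, after: {rmse_after:.3f}")
```

This only reports training error on six ratings; on the real data one would hold out a test set before comparing models.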
In [129]:
Out[129]:
Visualize matrix of predictions
In [130]:
Out[130]:
69
[(1231, 3.0), (1113, 3.0), (559, 4.0), (97, 3.0), (255, 3.0), (1240, 2.0), (597, 4.0), (631, 1.0), (1448, 3.0), (1195, 3.0), (540, 3.0), (338, 5.0), (54, 4.0), (78, 3.0), (246, 3.0), (985, 3.0), (266, 3.0), (275, 2.0), (1454, 2.0), (159, 3.0), (859, 5.0), (156, 4.0), (1237, 2.0), (244, 3.0), (99, 4.0), (104, 3.0), (151, 3.0), (1456, 3.0), (199, 4.0), (1891, 2.0), (945, 3.0), (88, 4.0), (799, 3.0), (617, 3.0), (1450, 4.0), (213, 5.0), (669, 5.0), (1922, 4.0), (92, 3.0), (1518, 3.0), (1529, 3.0), (829, 2.0), (1468, 3.0), (132, 1.0), (955, 3.0), (1249, 3.0), (882, 3.0), (167, 4.0), (315, 3.0), (41, 4.0), (381, 4.0), (210, 4.0), (2451, 3.0), (135, 3.0), (1511, 3.0), (15, 4.0), (711, 3.0), (215, 2.0), (1475, 4.0), (216, 3.0), (728, 4.0), (1154, 3.0), (67, 4.0), (434, 5.0), (1228, 3.0), (48, 5.0), (546, 4.0), (1009, 2.0), (170, 3.0)]
['1248', '2991', '1252', '589', '6', '1267', '1276', '1292', '905', '910', '1446', '913', '3068', '1610', '1617', '942', '3083', '2289', '955', '3095', '3417', '3418', '3435', '296', '3654', '2858', '457', '3683', '1304', '1177', '1179', '1188', '3101', '2300', '2186', '1387', '858', '3307', '110', '3341', '3504', '3362', '3366', '2571', '164', '2726', '1783', '318', '2599', '1961', '34', '1036', '3720', '3735', '2935', '2791', '2000', '1201', '2944', '2947', '2948', '2949', '1213', '1221', '1222', '2028', '1233', '707', '1244']
1248
3.0
In [131]:
Out[131]:
2.0
0
In [132]:
Out[132]:
<matplotlib.colorbar.Colorbar at 0x7fc983f4f110>