Compare SPSS and AgglomerativeClustering accident clusters
CPDaaS: Make sure to first insert a "project token"
Click on the three vertical dots icon in the upper right of the screen, then click on Insert project token.
Once inserted, execute the cell.
A project token is only available if you followed the prerequisite instructions to create one in your project.
Read the cluster files
The columns are renamed so they match for the three sets
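As a rough illustration of the renaming step, the sketch below maps each file's headers onto one shared schema. The column names, the sample values, and the `read_clusters` helper are all hypothetical, since the real cluster files are not shown here. It uses only the standard library; in a notebook, the same mapping could be passed to pandas `DataFrame.rename(columns=...)`.

```python
import csv
import io

# Hypothetical headers and values, for illustration only.
SKLEARN_CSV = "cluster,lat,lon\n0,40.1,-75.2\n1,41.3,-74.0\n"
SPSS_CSV = "ClusterID,Latitude,Longitude\n1,40.2,-75.1\n2,41.2,-74.1\n"

# Map each file's own headers onto one shared set of column names.
RENAMES = {
    "sklearn": {"cluster": "cluster", "lat": "latitude", "lon": "longitude"},
    "spss": {"ClusterID": "cluster", "Latitude": "latitude", "Longitude": "longitude"},
}


def read_clusters(text, renames):
    """Parse CSV text and return rows keyed by the shared column names."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {renames[name]: value for name, value in raw.items()}
        # Coordinates arrive as strings; convert them for later comparison.
        row["latitude"] = float(row["latitude"])
        row["longitude"] = float(row["longitude"])
        rows.append(row)
    return rows


sklearn_rows = read_clusters(SKLEARN_CSV, RENAMES["sklearn"])
spss_rows = read_clusters(SPSS_CSV, RENAMES["spss"])
```

A third set exported from the same tool would presumably reuse the same mapping as its sibling file.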
Display the cluster centers on a map
Comparison conclusion
Creating a list of 5 clusters with SPSS Modeler was much easier than through a notebook. The SPSS clustering was run both on the limited dataset used by sklearn and on the entire dataset. Both cluster sets were created in a few seconds.
Both SPSS cluster sets end up in roughly the same positions. This suggests that limiting the number of input records was a valid choice, although using the complete input set should be more precise.
The resulting cluster sets are relatively similar between sklearn and SPSS. It would take considerable effort to determine which one is better. In this lab, we use the clusters from the complete set of accidents.
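One rough way to put a number on "relatively similar" is to measure, for each cluster center in one set, the distance to the closest center in another set. The sketch below is not part of the original notebook. It assumes the centers are (latitude, longitude) pairs in degrees and uses the haversine great-circle formula with a mean Earth radius of 6371 km. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_center_distances(centers_a, centers_b):
    """For each center in centers_a, distance in km to its closest center in centers_b."""
    return [min(haversine_km(a, b) for b in centers_b) for a in centers_a]
```

Small distances for every center would support the observation that two sets land in roughly the same positions. Note that this matches each center to its nearest neighbor independently, so two centers in one set can pair with the same center in the other; a strict one-to-one comparison would need an assignment step.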
Author
Jacques Roy is a member of the IBM Enablement for Data and AI
Copyright © 2023. This notebook and its source code are released under the terms of the MIT License.