eAxis Analytics: K-Means Clustering Demo

Developed Using Flask and Apache Running on Docker

 
This demo performs K-Means clustering, an unsupervised method that groups similar rows of text without labeled training data. The algorithm is implemented in Python with scikit-learn and served by Flask and Apache in a Docker container. To run the demo, choose the number of clusters (default is 2), select a dataset, and click the Classify button. The output is a zip file containing an Excel spreadsheet with three sheets: one that lists each row of text in the dataset along with its cluster number, one with the top keywords for each cluster, and one that summarizes the results, including the total count for each cluster and a simple bar chart of those counts.
 
Two options are available for selecting a dataset. You can upload your own dataset or use the default dataset of 250 Amazon reviews provided below. If you upload a dataset, it must contain two columns of comma-separated data with the following headers: id and text. The id column consists of sequential integers starting at 1, and the text column consists of sentences (text strings) enclosed in double quotes. An example of the data file format is as follows:
 
 
id,text
1,"This movie was fantastic."
2,"Hated it!"
3,"The best movie I ever saw."
4,"Good acting and directing but the plot was confusing."
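Before uploading, a file can be checked against this format locally. The following is a rough sketch using only Python's standard csv module; the function name validate_dataset and its error messages are hypothetical and are not part of the demo. Note that the csv module also accepts unquoted text, so this check does not enforce the double quotes themselves.

```python
import csv
import io


def validate_dataset(csv_text):
    """Hypothetical check: return a list of (id, text) rows or raise ValueError."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header != ["id", "text"]:
        raise ValueError(f"expected header id,text but found {header}")

    rows = []
    # Skip blank lines; ids are expected to count up from 1 without gaps.
    data_rows = (row for row in reader if row)
    for expected_id, row in enumerate(data_rows, start=1):
        if len(row) != 2:
            raise ValueError(
                f"row {expected_id}: expected 2 columns, found {len(row)}"
            )
        raw_id, text = row
        if raw_id.strip() != str(expected_id):
            raise ValueError(
                f"row {expected_id}: id should be {expected_id}, found {raw_id!r}"
            )
        if not text.strip():
            raise ValueError(f"row {expected_id}: text is empty")
        rows.append((expected_id, text))
    return rows
```

Quoting the text column matters mainly when a sentence contains a comma, since an unquoted comma would be read as a third column.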
 

Perform K-Means Clustering:

 

Process Default Dataset:

Enter Number of Clusters: (min=2, max=10, default=2)
 
 
 
If you are interested in learning more, Email Us or use the form on the Contact Us page.
 
 

 
© Copyright 2019 by Michael F. Suesserman, PhD
All Rights Reserved