Primitive Array Clustering

A Hadoop implementation of Canopy Clustering that uses the Levenshtein distance and other non-mathematical distance measures (such as the Jaccard coefficient).

Difference with Mahout

Clustering algorithms such as K-Means take a mathematical approach to computing cluster centers. Each time a new point is added to a cluster, the Mahout framework recomputes the cluster's center as …
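To make the idea concrete, here is a minimal single-machine sketch of canopy clustering driven by Levenshtein distance. It is not the Hadoop job described in this post: the class name `CanopySketch`, the method names, and the thresholds `t1` (loose) and `t2` (tight) are assumptions chosen for illustration, and the center of each canopy is simply the first point picked, so nothing is ever averaged or recomputed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the post's MapReduce implementation.
final class CanopySketch {

    // Levenshtein (edit) distance via dynamic programming, keeping two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                    // substitute
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Canopy clustering: t1 is the loose threshold, t2 the tight one (t1 >= t2 >= 0).
    // Returns canopy center -> members, in the order the canopies were created.
    static Map<String, List<String>> canopies(List<String> points, int t1, int t2) {
        if (t2 < 0 || t1 < t2) {
            throw new IllegalArgumentException("expected t1 >= t2 >= 0");
        }
        List<String> remaining = new ArrayList<>(points);
        Map<String, List<String>> result = new LinkedHashMap<>();
        while (!remaining.isEmpty()) {
            // The first remaining point becomes the center of a new canopy.
            String center = remaining.remove(0);
            List<String> members = new ArrayList<>();
            members.add(center);
            Iterator<String> it = remaining.iterator();
            while (it.hasNext()) {
                String p = it.next();
                int d = levenshtein(center, p);
                if (d <= t1) {
                    members.add(p); // loosely close: joins this canopy
                }
                if (d <= t2) {
                    it.remove();    // tightly close: can no longer start a canopy
                }
            }
            result.put(center, members);
        }
        return result;
    }
}
```

For example, calling `CanopySketch.canopies(List.of("kitten", "sitten", "sitting", "hadoop", "hadoob"), 3, 1)` groups the words around the centers "kitten", "sitting" and "hadoop". Because the center is always an actual input string, the same loop works with any distance function, including ones such as the Jaccard coefficient where no mean of the points exists.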