Balanced clustering

Balanced clustering is a special case of clustering where, in the strictest sense, each cluster size is constrained to $\lfloor n/k \rfloor$ or $\lceil n/k \rceil$, where $n$ is the number of points and $k$ is the number of clusters.[1] A typical algorithm is balanced k-means, which minimizes mean square error (MSE). Another type of balanced clustering, called balance-driven clustering, has a two-objective cost function that minimizes both the imbalance and the MSE. Typical cost functions are ratio cut[2] and Ncut.[3] Balanced clustering can be used, for example, in scenarios where freight has to be delivered to $n$ locations with $k$ cars. It is then preferred that each car delivers to an equal number of locations.
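The strict size constraint can be illustrated with a minimal Python sketch. It is not taken from any of the cited implementations, and the function names `balanced_sizes` and `is_balanced` are hypothetical, chosen only for this illustration. The sketch assumes cluster labels are the integers 0 to k − 1. It computes the cluster sizes that the constraint permits and checks whether a given labeling satisfies it.

```python
from collections import Counter


def balanced_sizes(n, k):
    """Return the k cluster sizes allowed under the strict balance constraint.

    Hypothetical helper: n mod k clusters receive ceil(n/k) points and the
    remaining clusters receive floor(n/k) points.
    """
    if k <= 0 or n < 0:
        raise ValueError("need k > 0 and n >= 0")
    base, extra = divmod(n, k)  # base = floor(n/k), extra = n mod k
    return [base + 1] * extra + [base] * (k - extra)


def is_balanced(labels, k):
    """Check whether a labeling (assumed labels 0..k-1) is strictly balanced."""
    counts = Counter(labels)
    # Sort sizes in descending order so they can be compared with balanced_sizes.
    sizes = sorted((counts.get(c, 0) for c in range(k)), reverse=True)
    return sizes == balanced_sizes(len(labels), k)


# Freight example: 10 delivery locations shared among 3 cars.
print(balanced_sizes(10, 3))                               # [4, 3, 3]
print(is_balanced([0, 0, 0, 0, 1, 1, 1, 2, 2, 2], 3))      # True
print(is_balanced([0, 0, 0, 0, 0, 0, 1, 1, 2, 2], 3))      # False
```

In the freight example, one car serves four locations and the other two serve three each; a split of 6, 2 and 2 violates the constraint.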

Software

Implementations exist for balanced k-means[4] and Ncut.[5]

References

  1. ^ M. I. Malinen and P. Fränti (August 2014). "Balanced K-Means for Clustering". Structural, Syntactic, and Statistical Pattern Recognition. Lecture Notes in Computer Science. Vol. 8621. pp. 32–41. doi:10.1007/978-3-662-44415-3_4. ISBN 978-3-662-44414-6.
  2. ^ L. Hagen and A. B. Kahng (1992). "New spectral methods for ratio cut partitioning and clustering". IEEE Transactions on Computer-Aided Design. 11 (9): 1074–1085. doi:10.1109/43.159993.
  3. ^ J. Shi and J. Malik (2000). "Normalized cuts and image segmentation". IEEE Transactions on Pattern Analysis and Machine Intelligence. 22 (8): 888–905. doi:10.1109/34.868688.
  4. ^ M. I. Malinen and P. Fränti. "Balanced k-Means implementation". University of Eastern Finland.
  5. ^ T. Cour, S. Yu and J. Shi. "Ncut implementation". University of Pennsylvania.

Levin, M. Sh. (2017). "On Balanced Clustering (Indices, Models, Examples)". Journal of Communications Technology and Electronics. 62 (12): 1506–1515. doi:10.1134/S1064226917120105. S2CID 255277095.