Algorithm: K-Means

Published: 2019-03-04

K-Means????

K-Means ????

K-Means???????????????????????????????????????????????????????????????????????????????????????????

K-Means ?????

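K-Means works through the following steps:

Choose k initial centroids, for example at random.

Assign every data point to the cluster of its nearest centroid, typically measured by Euclidean distance.

Recompute each centroid as the mean of the points assigned to it.

Repeat the assignment and update steps until the centroids stop moving, at which point the clustering has converged.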

K-Means ?????

K-Means???????????????

??????????????????????????????

?????????????????????????????

???????CRM???????????????????????

K-Means ?????

Python Implementation Example

# Import dependencies
from copy import deepcopy
import numpy as np
import matplotlib.pyplot as plt
# Number of clusters
k = 3
# Generate 200 random 2-D points in [0, 10)
X = np.random.random((200, 2)) * 10
# Pick k distinct integer x and y coordinates as the initial centroids
C_x = np.random.choice(range(0, int(np.max(X[:, 0]))), size=k, replace=False)
C_y = np.random.choice(range(0, int(np.max(X[:, 1]))), size=k, replace=False)
C = np.array(list(zip(C_x, C_y)), dtype=np.float32)
# Print the initial centroids
print("Initial centroids:", C)
# Plot the data points (black) and the initial centroids (green stars)
plt.scatter(X[:, 0], X[:, 1], c='#050505', s=7)
plt.scatter(C[:, 0], C[:, 1], marker='*', s=300, c='g')
plt.show()

??????

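# Centroids from the previous iteration, used to detect convergence
C_old = np.zeros(C.shape)
# Cluster index assigned to each point
clusters = np.zeros(len(X))
# Euclidean distance along axis `ax` (norm of the whole array when ax is None)
def dist(a, b, ax=1):
    return np.linalg.norm(a - b, axis=ax)
# Total centroid movement since the previous iteration
error = dist(C, C_old, None)
while error != 0:
    # Assign each point to its nearest centroid
    for i in range(len(X)):
        distances = dist(X[i], C)
        clusters[i] = np.argmin(distances)
    # Remember the current centroids
    C_old = deepcopy(C)
    # Move each centroid to the mean of its assigned points
    for i in range(k):
        points = [X[j] for j in range(len(X)) if clusters[j] == i]
        # An empty cluster keeps its old centroid; np.mean([]) would give NaN
        # and the loop would never terminate
        if points:
            C[i] = np.mean(points, axis=0)
    # Recompute the centroid movement
    error = dist(C, C_old, None)
    # Plot the clustering after this iteration
    colors = ['r', 'g', 'b', 'y', 'c', 'm']
    fig, ax = plt.subplots()
    for i in range(k):
        points = np.array([X[j] for j in range(len(X)) if clusters[j] == i])
        # Skip empty clusters, which cannot be indexed as 2-D arrays
        if len(points):
            ax.scatter(points[:, 0], points[:, 1], s=7, c=colors[i])
    ax.scatter(C[:, 0], C[:, 1], marker='*', s=200, c='#050505')
    plt.show()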
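The loop above assigns points one at a time and stops only when the centroids are exactly unchanged. The following is a minimal vectorized sketch of the same assign-and-update iteration, not a replacement for the code above. The function name `kmeans` and the parameters `max_iter`, `tol`, and `seed` are illustrative assumptions, as is the choice to initialize from randomly picked data points.

```python
import numpy as np

def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    """Plain K-Means iteration; returns (centroids, labels).

    max_iter, tol and seed are illustrative parameters, not from the article.
    """
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points instead of a random grid
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Distances from every point to every centroid, shape (n_points, k)
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:  # stop once the centroids barely move
            break
    # Final assignment against the returned centroids
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, d.argmin(axis=1)

# Usage on two well-separated synthetic blobs
demo_rng = np.random.default_rng(42)
blob_a = demo_rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
blob_b = demo_rng.normal(loc=(10.0, 10.0), scale=0.3, size=(50, 2))
data = np.vstack([blob_a, blob_b])
centroids, labels = kmeans(data, k=2)
```

Comparing the centroid shift against a small tolerance avoids depending on exact floating-point equality, and the `max_iter` cap bounds the run time if the centroids keep oscillating by tiny amounts.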

K-Means ?????

1. ????

K-Means??????????????????????????????????????????????

2. ?????

??K-Means???????????????????????????????????????????????????????????????????????????????

K-Means ????????

1. K-Means vs Hard Clustering

K-Means is a hard clustering method: each data point belongs to exactly one cluster. Soft Clustering methods such as GMM instead give each point a probability of belonging to each cluster.
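To make the contrast with Soft Clustering concrete, the sketch below computes both kinds of assignment for three points and two fixed centers. The soft weights are a simplification: they equal GMM responsibilities only under the assumption of equal mixing weights and unit spherical covariance, whereas a full GMM also estimates those quantities. The arrays `points` and `centers` are made-up illustration data.

```python
import numpy as np

# Illustration data (assumed): three points on a line, two fixed centers
points = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
centers = np.array([[0.0, 0.0], [2.0, 0.0]])

# Squared Euclidean distance from every point to every center, shape (3, 2)
sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

# Hard assignment (K-Means style): exactly one cluster index per point
hard = sq_dist.argmin(axis=1)

# Soft assignment: softmax over -0.5 * squared distance, which matches GMM
# responsibilities when all components share weight and identity covariance
logits = -0.5 * sq_dist
logits -= logits.max(axis=1, keepdims=True)  # subtract row max for stability
soft = np.exp(logits)
soft /= soft.sum(axis=1, keepdims=True)

print("hard:", hard)
print("soft:\n", soft.round(3))
```

The middle point is equidistant from both centers, so its soft weights are 0.5 and 0.5, while the hard assignment has to pick a single cluster (`argmin` breaks the tie in favor of index 0).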

2. K-Means ????

????????K-Means????????????????????????????