Python Language PPT 14.3: K-means and K-medoids Clustering (2).pptx
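K-means fast clustering

Motivation: overcome the difficulties that hierarchical clustering runs into with large samples, and improve clustering efficiency.
Approach: gain efficiency by having the user specify the number of clusters in advance.
As a result, hierarchical clustering can produce a series of clustering solutions for different numbers of clusters, whereas fast clustering produces only a single clustering solution.

K-means is a distance-based clustering algorithm that uses distance as its measure of similarity.

K-means clustering: a worked example

How can the points A, B, ..., J be divided into two clusters?

[Figure: scatter plot of the points A to J, with A(1,4) and B(2,4) labeled.]

Example, Steps 1 and 2. Randomly choose A and B as the centers, with K = 2, and compute the distance from every point to each center:

Point           A     B     C     D     E     F     G     H     I     J
Distance to A   0     1     1     1.41  2.24  3.61  4.47  5.39  4.24  5
Distance to B   1     0     1.41  1     2     2.83  3.61  4.47  3.61  4.24

So we classify A and C as one cluster and B, D, E, F, G, H, I, and J as another cluster.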
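The assignment step above can be sketched in a few lines of Python. Only A(1,4) and B(2,4) are given explicitly in the slides; the other coordinates below are an assumption, inferred from the distance table, and the names `POINTS` and `assign` are illustrative.

```python
import math

# Only A and B are stated in the slides. The remaining coordinates are
# inferred from the distance table and should be treated as assumptions.
POINTS = {
    "A": (1, 4), "B": (2, 4), "C": (1, 5), "D": (2, 5), "E": (2, 6),
    "F": (4, 2), "G": (5, 2), "H": (6, 2), "I": (4, 1), "J": (5, 1),
}

def assign(points, centers):
    """Group point labels by the index of their nearest center (Euclidean)."""
    clusters = [[] for _ in centers]
    for label, p in points.items():
        dists = [math.dist(p, c) for c in centers]
        clusters[dists.index(min(dists))].append(label)
    return clusters

# Steps 1 and 2: A and B are the initial centers, K = 2.
clusters = assign(POINTS, [POINTS["A"], POINTS["B"]])
print(clusters)  # [['A', 'C'], ['B', 'D', 'E', 'F', 'G', 'H', 'I', 'J']]
```

With these assumed coordinates the rounded distances match the table, for example `round(math.dist(POINTS["A"], POINTS["H"]), 2)` gives 5.39.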
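Notation: d_AB means the distance between A and B.

Example, Step 3. With A and B as the initial centers and K = 2, recompute each cluster's center as the mean of its members' coordinates:

$\text{center} = (\bar{x}_i, \bar{y}_j)$

Cluster 1 {A, C}: ((1+1)/2, (4+5)/2) = (1, 4.5)
Cluster 2 {B, D, E, F, G, H, I, J}: (3.75, 2.875)

The new centers of the two clusters are (1, 4.5) and (3.75, 2.875).

[Figure: the points A to J with cluster 1, cluster 2, and the new centers (1, 4.5) and (3.75, 2.875) marked.]

Example, Step 2 again. With (1, 4.5) and (3.75, 2.875) as the centers and K = 2, recompute the distances:

Point                      A     B     C     D     E     F     G     H     I     J
Distance to (1, 4.5)       0.5   1.12  0.5   1.12  1.8   3.91  4.72  5.59  4.61  5.32
Distance to (3.75, 2.875)  2.97  2.08  3.48  2.75  3.58  0.91  1.53  2.41  1.89  2.25

So we classify A, B, C, D, and E as one cluster and F, G, H, I, and J as another cluster.

Example, Step 3 again. Recompute the center of each cluster: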
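The center update in Step 3 is just a coordinate-wise mean. A minimal sketch follows, assuming point coordinates inferred from the distance tables (only A and B are given explicitly in the slides); `centroid` is an illustrative helper name.

```python
# Only A and B are stated in the slides. The remaining coordinates are
# inferred from the distance tables and should be treated as assumptions.
POINTS = {
    "A": (1, 4), "B": (2, 4), "C": (1, 5), "D": (2, 5), "E": (2, 6),
    "F": (4, 2), "G": (5, 2), "H": (6, 2), "I": (4, 1), "J": (5, 1),
}

def centroid(labels):
    """Return the mean x and mean y of the named points."""
    xs = [POINTS[label][0] for label in labels]
    ys = [POINTS[label][1] for label in labels]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Clusters produced by the first assignment step.
center_1 = centroid("AC")
center_2 = centroid("BDEFGHIJ")
print(center_1, center_2)  # (1.0, 4.5) (3.75, 2.875)
```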
$\text{center} = (\bar{x}_i, \bar{y}_j)$

P = center of {A, B, C, D, E} = (1.6, 4.8)
Q = center of {F, G, H, I, J} = (4.8, 1.6)

The new centers of the two clusters are P(1.6, 4.8) and Q(4.8, 1.6).

[Figure: the points A to J with cluster 1, cluster 2, and the new centers P(1.6, 4.8) and Q(4.8, 1.6) marked.]

Example, Step 2 again. With P and Q as the centers and K = 2, recompute the distances:

Point          A     B     C     D     E     F     G     H     I     J
Distance to P  1     0.89  0.63  0.45  1.26  3.69  4.40  5.22  4.49  5.10
Distance to Q  4.49  3.69  5.10  4.4   5.22  0.89  0.45  1.26  1     0.63

So we classify A, B, C, D, and E as one cluster and F, G, H, I, and J as another cluster.

Example, Step 3 again.

M = center of {A, B, C, D, E} = (1.6, 4.8)
N = center of {F, G, H, I, J} = (4.8, 1.6)

The new centers of the two clusters are equal to the original P(1.6, 4.8) and Q(4.8, 1.6), so the assignments no longer change.
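Putting the two steps together, the whole iteration traced above can be sketched as a loop that stops once the assignments stop changing. This is a minimal illustration, not the slides' own code: the coordinates other than A and B are inferred from the distance tables, and `assign`, `centroid`, and `kmeans` are hypothetical helper names.

```python
import math

# Only A and B are stated in the slides. The remaining coordinates are
# inferred from the distance tables and should be treated as assumptions.
POINTS = {
    "A": (1, 4), "B": (2, 4), "C": (1, 5), "D": (2, 5), "E": (2, 6),
    "F": (4, 2), "G": (5, 2), "H": (6, 2), "I": (4, 1), "J": (5, 1),
}

def assign(points, centers):
    """Step 2: group point labels by their nearest center."""
    clusters = [[] for _ in centers]
    for label, p in points.items():
        dists = [math.dist(p, c) for c in centers]
        clusters[dists.index(min(dists))].append(label)
    return clusters

def centroid(points, labels):
    """Step 3: mean x and mean y of a cluster (assumes it is non-empty)."""
    xs = [points[label][0] for label in labels]
    ys = [points[label][1] for label in labels]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def kmeans(points, centers):
    """Alternate steps 2 and 3 until the assignments stop changing."""
    clusters = None
    while True:
        new_clusters = assign(points, centers)
        if new_clusters == clusters:
            return centers, clusters
        clusters = new_clusters
        centers = [centroid(points, c) for c in clusters]

centers, clusters = kmeans(POINTS, [POINTS["A"], POINTS["B"]])
print(centers)   # [(1.6, 4.8), (4.8, 1.6)]
print(clusters)  # [['A', 'B', 'C', 'D', 'E'], ['F', 'G', 'H', 'I', 'J']]
```

Comparing assignments rather than floating-point centers is a design choice that avoids rounding issues in the stopping test. An empty cluster would raise a division error in this sketch; that case does not occur with this data.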
Final

[Figure: the points A to J with the final cluster 1 and cluster 2 marked.]

Clustering finished!

Disadvantages

One of the main disadvantages of k-means is that you must specify the number of clusters (K) as an input to the algorithm. As designed, the algorithm is not capable of determining the appropriate number of clusters and depends upon the user to identify this in advance.

[Figure: clustering results for K = 2 and K = 3.]

K-means clustering: Example

[Figure: scatter plots with both axes running from 0 to 10.]

K = 2. Arbitrarily choose K objects as the initial cluster centers.
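The dependence on a user-supplied K noted under Disadvantages can be seen by running the same procedure with different values of K: the algorithm returns exactly as many clusters as it is asked for and gives no signal about which value suits the data. The sketch below reuses the ten points from the earlier worked example (coordinates other than A and B are inferred, so treat them as assumptions) and, purely for illustration, takes the first K points as the initial centers.

```python
import math

# Only A and B are stated in the slides. The remaining coordinates are
# inferred from the distance tables and should be treated as assumptions.
POINTS = {
    "A": (1, 4), "B": (2, 4), "C": (1, 5), "D": (2, 5), "E": (2, 6),
    "F": (4, 2), "G": (5, 2), "H": (6, 2), "I": (4, 1), "J": (5, 1),
}

def assign(points, centers):
    """Group point labels by their nearest center."""
    clusters = [[] for _ in centers]
    for label, p in points.items():
        dists = [math.dist(p, c) for c in centers]
        clusters[dists.index(min(dists))].append(label)
    return clusters

def centroid(points, labels):
    """Mean x and mean y of a cluster (assumes it is non-empty)."""
    xs = [points[label][0] for label in labels]
    ys = [points[label][1] for label in labels]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def kmeans(points, k):
    """Run k-means with the first k points as initial centers (an arbitrary choice)."""
    centers = list(points.values())[:k]
    clusters = None
    while True:
        new_clusters = assign(points, centers)
        if new_clusters == clusters:
            return clusters
        clusters = new_clusters
        centers = [centroid(points, c) for c in clusters]

# K is an input: the same data yields two clusters or three, as requested.
results = {k: kmeans(POINTS, k) for k in (2, 3)}
for k, clusters in results.items():
    print(f"K={k}: {clusters}")
```

Choosing between the K = 2 and K = 3 results is left to the user, which is the limitation the slide describes.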
