Python聚类算法之基本K均值实例详解
本文实例讲述了Python聚类算法之基本K均值运算技巧。分享给大家供大家参考,具体如下:
基本K均值 :选择 K 个初始质心,其中 K 是用户指定的参数,即所期望的簇的个数。每次循环中,每个点被指派到最近的质心,指派到同一个质心的点集构成一个。然后,根据指派到簇的点,更新每个簇的质心。重复指派和更新操作,直到质心不发生明显的变化。
# scoding=utf-8 import pylab as pl points = [[int(eachpoint.split("#")[0]), int(eachpoint.split("#")[1])] for eachpoint in open("points","r")] # 指定三个初始质心 currentCenter1 = [20,190]; currentCenter2 = [120,90]; currentCenter3 = [170,140] pl.plot([currentCenter1[0]], [currentCenter1[1]],'ok') pl.plot([currentCenter2[0]], [currentCenter2[1]],'ok') pl.plot([currentCenter3[0]], [currentCenter3[1]],'ok') # 记录每次迭代后每个簇的质心的更新轨迹 center1 = [currentCenter1]; center2 = [currentCenter2]; center3 = [currentCenter3] # 三个簇 group1 = []; group2 = []; group3 = [] for runtime in range(50): group1 = []; group2 = []; group3 = [] for eachpoint in points: # 计算每个点到三个质心的距离 distance1 = pow(abs(eachpoint[0]-currentCenter1[0]),2) + pow(abs(eachpoint[1]-currentCenter1[1]),2) distance2 = pow(abs(eachpoint[0]-currentCenter2[0]),2) + pow(abs(eachpoint[1]-currentCenter2[1]),2) distance3 = pow(abs(eachpoint[0]-currentCenter3[0]),2) + pow(abs(eachpoint[1]-currentCenter3[1]),2) # 将该点指派到离它最近的质心所在的簇 mindis = min(distance1,distance2,distance3) if(mindis == distance1): group1.append(eachpoint) elif(mindis == distance2): group2.append(eachpoint) else: group3.append(eachpoint) # 指派完所有的点后,更新每个簇的质心 currentCenter1 = [sum([eachpoint[0] for eachpoint in group1])/len(group1),sum([eachpoint[1] for eachpoint in group1])/len(group1)] currentCenter2 = [sum([eachpoint[0] for eachpoint in group2])/len(group2),sum([eachpoint[1] for eachpoint in group2])/len(group2)] currentCenter3 = [sum([eachpoint[0] for eachpoint in group3])/len(group3),sum([eachpoint[1] for eachpoint in group3])/len(group3)] # 记录该次对质心的更新 center1.append(currentCenter1) center2.append(currentCenter2) center3.append(currentCenter3) # 打印所有的点,用颜色标识该点所属的簇 pl.plot([eachpoint[0] for eachpoint in group1], [eachpoint[1] for eachpoint in group1], 'or') pl.plot([eachpoint[0] for eachpoint in group2], [eachpoint[1] for eachpoint in group2], 'oy') pl.plot([eachpoint[0] for eachpoint in group3], [eachpoint[1] for eachpoint in group3], 'og') # 打印每个簇的质心的更新轨迹 for center in [center1,center2,center3]: pl.plot([eachcenter[0] for eachcenter in center], [eachcenter[1] for eachcenter in center],'k') pl.show()
运行效果截图如下:
希望本文所述对大家Python程序设计有所帮助。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



There is no built-in sum function in C language, so it needs to be written by yourself. Sum can be achieved by traversing the array and accumulating elements: Loop version: Sum is calculated using for loop and array length. Pointer version: Use pointers to point to array elements, and efficient summing is achieved through self-increment pointers. Dynamically allocate array version: Dynamically allocate arrays and manage memory yourself, ensuring that allocated memory is freed to prevent memory leaks.

There is no absolute salary for Python and JavaScript developers, depending on skills and industry needs. 1. Python may be paid more in data science and machine learning. 2. JavaScript has great demand in front-end and full-stack development, and its salary is also considerable. 3. Influencing factors include experience, geographical location, company size and specific skills.

Although distinct and distinct are related to distinction, they are used differently: distinct (adjective) describes the uniqueness of things themselves and is used to emphasize differences between things; distinct (verb) represents the distinction behavior or ability, and is used to describe the discrimination process. In programming, distinct is often used to represent the uniqueness of elements in a collection, such as deduplication operations; distinct is reflected in the design of algorithms or functions, such as distinguishing odd and even numbers. When optimizing, the distinct operation should select the appropriate algorithm and data structure, while the distinct operation should optimize the distinction between logical efficiency and pay attention to writing clear and readable code.

!x Understanding !x is a logical non-operator in C language. It booleans the value of x, that is, true changes to false, false changes to true. But be aware that truth and falsehood in C are represented by numerical values rather than boolean types, non-zero is regarded as true, and only 0 is regarded as false. Therefore, !x deals with negative numbers the same as positive numbers and is considered true.

C language identifiers cannot contain spaces because they can cause confusion and difficulty in maintaining. The specific rules are as follows: they must start with letters or underscores. Can contain letters, numbers, or underscores. Cannot contain illegal characters (such as special symbols).

There is no built-in sum function in C for sum, but it can be implemented by: using a loop to accumulate elements one by one; using a pointer to access and accumulate elements one by one; for large data volumes, consider parallel calculations.

In C language, snake nomenclature is a coding style convention, which uses underscores to connect multiple words to form variable names or function names to enhance readability. Although it won't affect compilation and operation, lengthy naming, IDE support issues, and historical baggage need to be considered.

The H5 page needs to be maintained continuously, because of factors such as code vulnerabilities, browser compatibility, performance optimization, security updates and user experience improvements. Effective maintenance methods include establishing a complete testing system, using version control tools, regularly monitoring page performance, collecting user feedback and formulating maintenance plans.
