Home Backend Development Python Tutorial Python语言实现机器学习的K-近邻算法

Python语言实现机器学习的K-近邻算法

Jun 06, 2016 am 11:18 AM
python

写在前面

额、、、最近开始学习机器学习嘛,网上找到一本关于机器学习的书籍,名字叫做《机器学习实战》。很巧的是,这本书里的算法是用Python语言实现的,刚好之前我学过一些Python基础知识,所以这本书对于我来说,无疑是雪中送炭啊。接下来,我还是给大家讲讲实际的东西吧。

什么是K-近邻算法?

简单的说,K-近邻算法就是采用测量不同特征值之间的距离方法来进行分类。它的工作原理是:存在一个样本数据集合,也称作训练样本集,并且样本集中每个数据都存在标签,即我们知道样本集中每一数据与所属分类的对应关系,输入没有标签的新数据之后,将新数据的每个特征与样本集中数据对应的特征进行比较,然后算法提取出样本集中特征最相似数据的分类标签。一般来说,我们只选择样本数据集中前k个最相似的数据,这就是K-近邻算法名称的由来。

提问:亲,你造K-近邻算法是属于监督学习还是无监督学习呢?

使用Python导入数据

从K-近邻算法的工作原理中我们可以看出,要想实施这个算法来进行数据分类,我们手头上得需要样本数据,没有样本数据怎么建立分类函数呢。所以,我们第一步就是导入样本数据集合。

建立名为kNN.py的模块,写入代码:

 from numpy import *
 import operator
 
 def createDataSet():
   group = array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
   labels = ['A','A','B','B']
   return group, labels

Copy after login

代码中,我们需要导入Python的两个模块:科学计算包NumPy和运算符模块。NumPy函数库是Python开发环境的一个独立模块,大多数Python版本里没有默认安装NumPy函数库,因此这里我们需要单独安装这个模块。

下载地址:http://sourceforge.net/projects/numpy/files/

有很多的版本,这里我选择的是numpy-1.7.0-win32-superpack-python2.7.exe。

实现K-近邻算法

K-近邻算法的具体思想如下:

(1)计算已知类别数据集中的点与当前点之间的距离

(2)按照距离递增次序排序

(3)选取与当前点距离最小的k个点

(4)确定前k个点所在类别的出现频率

(5)返回前k个点中出现频率最高的类别作为当前点的预测分类

Python语言实现K-近邻算法的代码如下:

 # coding : utf-8
 from numpy import *
 import operator 
 import kNN
 group, labels = kNN.createDataSet()
 def classify(inX, dataSet, labels, k):
   dataSetSize = dataSet.shape[0] 
   diffMat = tile(inX, (dataSetSize,1)) - dataSet
   sqDiffMat = diffMat**2
   sqDistances = sqDiffMat.sum(axis=1)
   distances = sqDistances**0.5
   sortedDistances = distances.argsort()
   classCount = {}
   for i in range(k):
     numOflabel = labels[sortedDistances[i]]
     classCount[numOflabel] = classCount.get(numOflabel,0) + 1
   sortedClassCount = sorted(classCount.iteritems(), key=operator.itemgetter(1),reverse=True)
   return sortedClassCount[0][0]
 my = classify([0,0], group, labels, 3)
 print my

Copy after login

运算结果如下:

 输出结果是B:说明我们新的数据([0,0])是属于B类。

代码详解

相信有很多朋友们对上面这个代码有很多不理解的地方,接下来,我重点讲解几个此函数的关键点,以方便读者们和我自己回顾一下这个算法代码。

classify函数的参数:

inX:用于分类的输入向量
dataSet:训练样本集合
labels:标签向量
k:K-近邻算法中的k
shape:是array的属性,描述一个多维数组的维度

tile(inX, (dataSetSize,1)):把inX二维数组化,dataSetSize表示生成数组后的行数,1表示列的倍数。整个这一行代码表示前一个二维数组矩阵的每一个元素减去后一个数组对应的元素值,这样就实现了矩阵之间的减法,简单方便得不让你佩服不行!

axis=1:参数等于1的时候,表示矩阵中行之间的数的求和,等于0的时候表示列之间数的求和。

argsort():对一个数组进行非降序排序

classCount.get(numOflabel,0) + 1:这一行代码不得不说的确很精美啊。get():该方法是访问字典项的方法,即访问下标键为numOflabel的项,如果没有这一项,那么初始值为0。然后把这一项的值加1。所以Python中实现这样的操作就只需要一行代码,实在是很简洁高效。

后话

K-近邻算法(KNN)原理以及代码实现差不多就这样了,接下来的任务就是更加熟悉它,争取达到裸敲的地步。

以上所述上就是本文的全部内容了,希望大家能够喜欢。

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Is the conversion speed fast when converting XML to PDF on mobile phone? Is the conversion speed fast when converting XML to PDF on mobile phone? Apr 02, 2025 pm 10:09 PM

The speed of mobile XML to PDF depends on the following factors: the complexity of XML structure. Mobile hardware configuration conversion method (library, algorithm) code quality optimization methods (select efficient libraries, optimize algorithms, cache data, and utilize multi-threading). Overall, there is no absolute answer and it needs to be optimized according to the specific situation.

How to convert XML files to PDF on your phone? How to convert XML files to PDF on your phone? Apr 02, 2025 pm 10:12 PM

It is impossible to complete XML to PDF conversion directly on your phone with a single application. It is necessary to use cloud services, which can be achieved through two steps: 1. Convert XML to PDF in the cloud, 2. Access or download the converted PDF file on the mobile phone.

What is the function of C language sum? What is the function of C language sum? Apr 03, 2025 pm 02:21 PM

There is no built-in sum function in C language, so it needs to be written by yourself. Sum can be achieved by traversing the array and accumulating elements: Loop version: Sum is calculated using for loop and array length. Pointer version: Use pointers to point to array elements, and efficient summing is achieved through self-increment pointers. Dynamically allocate array version: Dynamically allocate arrays and manage memory yourself, ensuring that allocated memory is freed to prevent memory leaks.

Is there any mobile app that can convert XML into PDF? Is there any mobile app that can convert XML into PDF? Apr 02, 2025 pm 08:54 PM

An application that converts XML directly to PDF cannot be found because they are two fundamentally different formats. XML is used to store data, while PDF is used to display documents. To complete the transformation, you can use programming languages ​​and libraries such as Python and ReportLab to parse XML data and generate PDF documents.

How to convert xml into pictures How to convert xml into pictures Apr 03, 2025 am 07:39 AM

XML can be converted to images by using an XSLT converter or image library. XSLT Converter: Use an XSLT processor and stylesheet to convert XML to images. Image Library: Use libraries such as PIL or ImageMagick to create images from XML data, such as drawing shapes and text.

Recommended XML formatting tool Recommended XML formatting tool Apr 02, 2025 pm 09:03 PM

XML formatting tools can type code according to rules to improve readability and understanding. When selecting a tool, pay attention to customization capabilities, handling of special circumstances, performance and ease of use. Commonly used tool types include online tools, IDE plug-ins, and command-line tools.

What is the process of converting XML into images? What is the process of converting XML into images? Apr 02, 2025 pm 08:24 PM

To convert XML images, you need to determine the XML data structure first, then select a suitable graphical library (such as Python's matplotlib) and method, select a visualization strategy based on the data structure, consider the data volume and image format, perform batch processing or use efficient libraries, and finally save it as PNG, JPEG, or SVG according to the needs.

Is there a mobile app that can convert XML into PDF? Is there a mobile app that can convert XML into PDF? Apr 02, 2025 pm 09:45 PM

There is no APP that can convert all XML files into PDFs because the XML structure is flexible and diverse. The core of XML to PDF is to convert the data structure into a page layout, which requires parsing XML and generating PDF. Common methods include parsing XML using Python libraries such as ElementTree and generating PDFs using ReportLab library. For complex XML, it may be necessary to use XSLT transformation structures. When optimizing performance, consider using multithreaded or multiprocesses and select the appropriate library.

See all articles