What is the decision tree algorithm?
The decision tree is a classic classification method. The data is first preprocessed, an inductive algorithm is used to generate readable rules and a decision tree, and the resulting tree is then used to classify new data. Essentially, a decision tree classifies data through a series of rules.
The decision tree is a supervised learning method, used mainly for classification and regression. The goal of the algorithm is to create a model that predicts the target variable by learning simple decision rules inferred from the data features.
A decision tree is similar to an if-else structure: the result is a tree that can be traversed from the root down to a leaf node through a sequence of tests and choices. The difference is that the if-else conditions here are not set manually but are generated automatically by the algorithm from the training data.
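As a minimal sketch of this idea, a trained tree behaves like nested if-else tests. The feature names and thresholds below are invented for illustration; a real tree learns them from the data:

```python
# Hand-written illustration of what a learned decision tree amounts to.
# The features (petal_length, petal_width) and thresholds are made up;
# in practice the algorithm chooses them automatically from training data.
def classify(petal_length, petal_width):
    if petal_length < 2.5:          # test at the root node
        return "setosa"             # leaf node
    else:
        if petal_width < 1.8:       # test at an internal node
            return "versicolor"
        else:
            return "virginica"

print(classify(1.4, 0.2))  # setosa
print(classify(5.0, 2.0))  # virginica
```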
Decision tree elements
Decision points
A decision point represents a choice among several possible plans; the plan selected at it is the best of those alternatives. In a multi-stage decision, the tree can contain several intermediate decision points, and the decision point at the root corresponds to the final decision plan.
State node
A state node represents the economic effect (expected value) of an alternative. By comparing the expected values of the state nodes under a given decision criterion, the best plan can be selected. The branches leaving a state node are called probability branches; their number equals the number of possible natural states, and the probability of each state is noted on its branch.
Result Node
The profit or loss of each plan under each natural state is marked at the right end of the corresponding result node.
Advantages and disadvantages of decision trees
Advantages of decision tree
Easy to understand, with clear principles; the tree can be visualized
The reasoning process is easy to follow and can be expressed in if-else form
The reasoning process depends entirely on the values of the attribute variables
Attribute variables that contribute nothing to the target variable are ignored automatically, which also provides a reference for judging attribute importance and reducing the number of variables
Disadvantages of decision trees
It is possible to establish overly complex rules, that is, overfitting.
Decision trees are sometimes unstable, because small changes in the data may generate completely different decision trees.
Learning the optimal decision tree is an NP-complete problem, so practical decision tree learning algorithms are based on heuristics, such as greedy algorithms that make the locally optimal choice at each node. Such algorithms cannot guarantee a globally optimal tree. The problem can be alleviated by training multiple trees on randomly selected features and samples.
Some concepts are hard to learn because decision trees cannot express them easily, such as the XOR, parity, or multiplexer problems.
If some classes dominate the data, the resulting tree is biased, so it is recommended to balance the dataset before fitting a decision tree.
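The mitigation mentioned above (training multiple trees on randomly selected features and samples) is what a random forest does. A minimal sketch using scikit-learn, on a hypothetical synthetic dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic toy data, only for illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A lone, fully grown tree memorizes the training set and is sensitive to
# small data changes; a forest averages many trees grown on random samples
# and random feature subsets, which stabilizes the prediction.
single_tree = DecisionTreeClassifier(random_state=0).fit(X, y)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

print(single_tree.score(X, y))  # training accuracy of one tree
print(forest.score(X, y))       # training accuracy of the ensemble
```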
Common algorithms for decision trees
There are many decision tree algorithms, including CART, ID3, C4.5, and C5.0. Among them, ID3, C4.5, and C5.0 are based on information entropy, while CART uses a similar impurity measure as its splitting criterion. After a decision tree is grown, it is usually pruned.
Entropy: The degree of disorder of the system
ID3 algorithm
The ID3 algorithm is a classification decision tree algorithm. It classifies data through a series of rules expressed as a decision tree, using entropy as the basis for classification.
The ID3 algorithm is a classic decision tree learning algorithm proposed by Quinlan. Its basic idea is to use information entropy as the measure for attribute selection at each node: at every step, the most informative attribute is chosen, i.e. the one that reduces entropy the most, so that the tree's entropy descends as fast as possible. At a leaf node the entropy is 0, and all instances in the set corresponding to that leaf belong to the same class.
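As a concrete illustration of the entropy measure ID3 uses, here is a small pure-Python sketch that computes entropy and the information gain of splitting on an attribute. The toy dataset is invented for the example:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list: -sum(p * log2(p))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    """Entropy reduction achieved by splitting the rows on one attribute;
    ID3 picks the attribute with the largest gain at each node."""
    n = len(labels)
    # Partition the labels by the attribute's value.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy weather-like data: [outlook, windy] -> play?
rows = [["sunny", "yes"], ["sunny", "no"], ["rain", "yes"], ["rain", "no"]]
labels = ["no", "no", "yes", "yes"]
print(information_gain(rows, 0, labels))  # 1.0: outlook separates the classes perfectly
print(information_gain(rows, 1, labels))  # 0.0: windy tells us nothing
```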
ID3 can be used, for example, for early-warning analysis of customer churn: identifying the characteristics of churning customers helps telecommunications companies improve customer relationships in a targeted manner and avoid churn.
Data mining with decision trees generally follows these steps: data preprocessing, decision tree mining, and pattern evaluation and application.
C4.5 algorithm
C4.5 is a further extension of ID3 that removes the restriction to categorical features by discretizing continuous attributes. C4.5 converts the trained tree into a set of if-then rules, and the accuracy of each rule determines whether it should be kept; if accuracy improves when a rule is removed, pruning is applied.
The core algorithm of C4.5 is the same as ID3's, but the splitting criterion is different: C4.5 uses the information gain ratio, which overcomes the problem in ID3 that splitting on plain information gain biases attribute selection toward attributes with many values.
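To show how the gain ratio corrects that bias, here is a pure-Python sketch on an invented toy dataset where one attribute behaves like a unique ID:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, attr_index, labels):
    """C4.5's criterion: information gain divided by the split information,
    which penalizes attributes with many distinct values."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    gain = entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())
    # Split information: entropy of the partition sizes themselves.
    split_info = -sum(len(g) / n * math.log2(len(g) / n) for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0

# Column 0 is an ID-like attribute: it splits perfectly but is penalized.
rows = [["a", "x"], ["b", "x"], ["c", "y"], ["d", "y"]]
labels = ["no", "no", "yes", "yes"]
print(gain_ratio(rows, 0, labels))  # 0.5: gain 1.0 / split info 2.0
print(gain_ratio(rows, 1, labels))  # 1.0: gain 1.0 / split info 1.0
```

Both attributes have the same information gain (1.0), so ID3 could not distinguish them, but the gain ratio correctly prefers the two-valued attribute over the ID-like one.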
C5.0 algorithm
C5.0 uses less memory than C4.5, builds smaller rule sets, and is more accurate.
CART algorithm
Classification and Regression Tree (CART) is a very effective non-parametric method for classification and regression that makes predictions by constructing a binary tree. The CART model was first proposed by Breiman et al. and is widely used in statistics and data mining. It builds prediction criteria in a way quite different from traditional statistics, presenting them as a binary tree that is easy to understand, use, and interpret. In many cases the prediction tree built by CART is more accurate than the algebraic prediction criteria built by common statistical methods, and the more complex the data and the more variables there are, the more pronounced this advantage becomes. The key to the model is constructing the prediction criteria accurately. Classification and regression first use known multivariate data to construct prediction criteria, and then predict one variable from the values of the others. In classification, an object is measured in various ways and then assigned to a category according to some classification criterion: for example, given the identifying characteristics of a fossil, predict which family, genus, or even species it belongs to; or predict whether minerals are present in an area from its geological and geophysical information. Regression differs from classification in that it predicts a value for an object rather than a category: for example, given the characteristics of mineral deposits in an area, predict the amount of resources there.
CART is very similar to C4.5, but it supports numerical target variables (regression) and does not generate rule sets. CART builds the tree by choosing, at each node, the feature and threshold that yield the largest impurity decrease.
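For classification, the impurity measure CART uses is the Gini index. A minimal pure-Python sketch on invented toy data, computing the quantity CART minimizes when choosing a split:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2). It is 0 when the node is pure."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_of_split(values, labels, threshold):
    """Weighted Gini impurity of the binary split value <= threshold;
    CART picks the feature/threshold pair that minimizes this."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

values = [1.0, 2.0, 3.0, 4.0]
labels = ["A", "A", "B", "B"]
print(gini(labels))                        # 0.5 for a 50/50 node
print(gini_of_split(values, labels, 2.0))  # 0.0: this split is perfect
print(gini_of_split(values, labels, 1.0))  # ~0.333: a worse split
```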
scikit-learn uses the CART algorithm
Sample code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from sklearn import tree
import numpy as np

# The decision tree algorithm used by scikit-learn is CART
X = [[0, 0], [1, 1]]
Y = ["A", "B"]
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)
data1 = np.array([2., 2.]).reshape(1, -1)
print(clf.predict(data1))        # predicted class
print(clf.predict_proba(data1))  # predicted probability of each class
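Since CART also supports numerical target variables, a companion sketch for regression (toy data invented) uses scikit-learn's DecisionTreeRegressor in the same way:

```python
from sklearn import tree
import numpy as np

# CART also handles regression: the target is numeric, and the tree
# predicts the mean target value of the training samples in each leaf.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 2.0, 3.0]
reg = tree.DecisionTreeRegressor()
reg = reg.fit(X, y)
print(reg.predict(np.array([[1.6]])))  # value of the leaf that 1.6 falls into
```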
Okay, that’s it; I hope it helps you.
The github address of this article:
20170619_Decision Tree Algorithm.md
Additions are welcome.
The above is the detailed content of What is the decision tree algorithm?. For more information, please follow other related articles on the PHP Chinese website!
