The recent breakthroughs in large language models (LLMs) have ignited significant interest in the field of artificial intelligence (AI). This surge in popularity has led many to pursue careers in this rapidly expanding sector. However, a crucial foundational element often overlooked is the artificial neuron, the bedrock of artificial neural networks. A thorough understanding of the artificial neuron is essential for grasping the intricacies of these networks. This tutorial will explain the functionality of an artificial neuron, also known as logistic regression. Despite its simplicity, the artificial neuron proves remarkably effective in solving various classification problems, including spam detection, diabetes prediction, and credit risk assessment.
To fully appreciate this technique, understanding the classification of machine learning models is crucial. Machine learning, a subset of AI, focuses on developing systems capable of automated learning and improvement from data. Machine learning models are broadly categorized into supervised, unsupervised, and reinforcement learning models.
Supervised models learn from labeled examples. In contrast, unsupervised techniques identify patterns in data without prior knowledge of those patterns. Reinforcement learning models learn through trial and error, receiving feedback in the form of rewards.
Logistic regression, as an implementation of the artificial neuron, falls under the category of supervised learning. Supervised models are further divided into classification and regression systems.
Classification models aim to identify the correct class for a given input. For example, a system might analyze a person's financial data to determine loan eligibility. Another example involves classifying animals based on their characteristics (mammal, reptile, bird, etc.).
Regression models, on the other hand, predict a numerical value based on input data. Predicting inflation rates using financial data is a common application in finance.
Despite its name, logistic regression is a classification technique. Classification can be binary (two classes, e.g., yes/no) or multiclass (multiple classes, e.g., parts of speech).
To differentiate logistic regression from linear regression, let's consider a visual representation using two inputs (for simplicity). In linear regression, the goal is to fit a line to a set of points, capturing the overall trend.
Classification of machine learning models.
This line is then used to predict one axis value based on the other (a plane in 3D space, a hyperplane in higher dimensions).
Logistic regression, however, aims to produce a binary decision (yes/no, etc.). A straight line is insufficient for this purpose. Consider determining loan eligibility based on salary. Fitting a line to this data is problematic.
Example of linear regression.
An "S"-shaped curve, however, provides a more effective solution. Points closer to the upper part of the curve indicate "yes," while those closer to the lower part indicate "no." Introducing non-linearity transforms the line into this curve.
Illustration of the inadequacy of linear regression for classification.
The logistic function introduces this non-linearity. Its formula is:

σ(z) = 1 / (1 + e^(−z))
This function exhibits several key properties: its output always lies between 0 and 1, so it can be read as a probability; it produces the characteristic "S"-shaped curve, saturating near 0 for large negative inputs and near 1 for large positive inputs; and it is differentiable at every point.
Demonstration of the suitability of the “S”-shaped curve for classification.
Graphical representation of the logistic function.
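As a quick check of these properties, the short sketch below (the function name `sigmoid` is ours, not part of the article's program) evaluates the logistic function at a few points. Note the output approaching 0 on the left, 1 on the right, and exactly 0.5 at zero:

```python
from math import exp

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1 / (1 + exp(-z))

# The output saturates toward 0 for large negative z and toward 1
# for large positive z, and equals exactly 0.5 at z = 0.
for z in [-6, -2, 0, 2, 6]:
    print(f"sigmoid({z:+d}) = {sigmoid(z):.4f}")
```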
The differentiability allows for calculating the slope at any point on the curve, crucial for adjusting the model during training.
Graphical representation of a tangent line to a point in the logistic function.
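The slope also has a convenient closed form: σ′(z) = σ(z)(1 − σ(z)). The sketch below (helper names are ours) compares that formula with a numerical finite-difference estimate of the slope, which should agree closely:

```python
from math import exp

def sigmoid(z):
    return 1 / (1 + exp(-z))

def analytic_slope(z):
    # Closed-form derivative: sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))
    s = sigmoid(z)
    return s * (1 - s)

def numeric_slope(z, h=1e-6):
    # Central finite-difference approximation of the derivative
    return (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

for z in [-2.0, 0.0, 1.5]:
    print(f"z={z:+.1f}  analytic={analytic_slope(z):.6f}  numeric={numeric_slope(z):.6f}")
```

The agreement between the two columns is what makes gradient-based training possible: the model can always compute a slope to follow.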
Let's illustrate logistic regression with a loan approval dataset. The dataset contains features like salary and loan amount, and a label indicating approval (1) or rejection (0). We'll use a portion for training and another for testing.
The Dataset.
The model calculates a weighted sum of the inputs (salary and loan amount) plus a bias term; the result is called Z. The weights and bias start with random values and are adjusted during training.
Example of calculating the value of Z.
The sigmoid function then transforms Z into a probability between 0 and 1. Predictions of 0.5 or higher are classified as "yes", and those below 0.5 as "no". The error is calculated by comparing each prediction to the actual label.
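Putting these steps together for a single applicant (the input values are borrowed from the dataset above, the label 1 means the loan was approved, and the starting weights are illustrative, not trained):

```python
from math import exp

def sigmoid(z):
    return 1 / (1 + exp(-z))

# One applicant: salary (in thousands) and loan amount
salary, loan = 7.6, 15.5
w = [0.2, 0.1]   # one weight per input (untrained starting values)
b = 0.1          # bias term

z = salary * w[0] + loan * w[1] + b   # weighted sum plus bias
yhat = sigmoid(z)                     # probability between 0 and 1
decision = "yes" if yhat >= 0.5 else "no"
error = yhat - 1                      # actual label for this example is 1 (approved)

print(f"z={z:.3f}  probability={yhat:.3f}  decision={decision}  error={error:.3f}")
```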
This process is analogous to a biological neuron: inputs (dendrites), weighted connections, summation, threshold (sigmoid), and output (axon).
Graphical representation of the logistic regression calculation flow.
Neuron
Formally, given input vector x, weight vector w, and bias b:
Z = wᵀx + b
The sigmoid function then produces the output.
Notational convention.
Vector multiplication.
Application of the sigmoid function to Z.
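This vector form can be sketched with a plain inner product (the `predict` helper is our name, not part of the article's program):

```python
from math import exp

def sigmoid(z):
    return 1 / (1 + exp(-z))

def predict(x, w, b):
    """Compute sigmoid(w . x + b) for input vector x, weight vector w, bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # inner product w^T x, plus bias
    return sigmoid(z)

x = [3.0, 10.0]   # one input example: salary and loan amount
w = [0.2, 0.1]    # weights (illustrative values)
b = 0.1

print(f"prediction: {predict(x, w, b):.4f}")
```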
A Python implementation is shown below, illustrating the calculation and error computation. The training process (weight adjustment) will be covered in a subsequent tutorial.
<code class="language-python">from math import exp

def sigmoide(x):
    return 1 / (1 + exp(-x))

# Inputs: X[j][0] is the wage (in thousands), X[j][1] is the loan amount
X = [[3, 10], [1.5, 11.8], [5.5, 20.0], [3.5, 15.2], [3.1, 14.5],
     [7.6, 15.5], [1.5, 3.5], [6.9, 8.5], [8.6, 2.0], [7.66, 3.5]]
Y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
m = len(X)

w = [0.2, 0.1]  # initial weights
b = 0.1         # initial bias

for j in range(m):
    z = X[j][0] * w[0] + X[j][1] * w[1] + b  # weighted sum plus bias
    yhat = sigmoide(z)                       # predicted probability

    # Calculates the error: prediction minus expected value
    erro = yhat - Y[j]

    print(" Wage:{0:5.2f} Loan:{1:5.2f} Expected value:{2} ".format(
        X[j][0] * 1000, X[j][1], Y[j]))
    print(" z:{0:2.3f} yhat:{1:2.3f} error:{2:2.3f}\n ".format(z, yhat, erro))</code>
Example of calculation in logistic regression.
Output issued by the program.
This concludes this tutorial. The training process will be explained in a future installment.