In machine learning, one of our main goals is to make predictions from data.
One of the simplest yet most powerful concepts for doing this is the linear model. It is like the "Hello World" of machine learning: simple, yet it forms the foundation for understanding more complex models.
Let's build a model to predict home prices. In this example, the output is the expected home_price, and your inputs will be sqft, num_bedrooms, and so on.
def prediction(sqft, num_bedrooms, num_baths):
    weight_1, weight_2, weight_3 = .0, .0, .0
    home_price = weight_1*sqft + weight_2*num_bedrooms + weight_3*num_baths
    return home_price
You will notice that each input has a "weight". These weights are what create the magic behind the prediction. This example is boring because the weights are all zero, so it will always output zero.
So let's look at how we find these weights.
The process of finding the weights is called "training" the model.
data = [
    {"sqft": 1000, "bedrooms": 2, "baths": 1, "price": 200000},
    {"sqft": 1500, "bedrooms": 3, "baths": 2, "price": 300000},
    # ... more data points ...
]
home_price = prediction(1000, 2, 1)  # our weights are currently zero, so this is zero
actual_value = 200000

error = home_price - actual_value  # 0 - 200000, we are way off
# let's square this value so we aren't dealing with negatives
error = error**2
Now that we have a way to measure how far off (the error) a single data point is, we can compute the average error across all of our data points. This is commonly called the mean squared error.
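The helper calculate_mean_squared_error is used in the code below but never shown; a minimal sketch, assuming the weights are passed as a list in the same order as our features, might look like this:

def calculate_mean_squared_error(weights, data):
    # Average of the squared errors over every data point
    total_squared_error = 0
    for point in data:
        predicted = (weights[0] * point["sqft"]
                     + weights[1] * point["bedrooms"]
                     + weights[2] * point["baths"])
        total_squared_error += (predicted - point["price"]) ** 2
    return total_squared_error / len(data)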
Of course, we could pick random numbers for the weights and keep saving the best ones as we go, but that would be inefficient. So let's explore a different approach: gradient descent.
Gradient descent is an optimization algorithm for finding the best weights for our model.
The gradient is a vector that tells us how the error changes when we make a tiny change to each weight.
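In symbols (notation mine, not the article's): if $E(w_1, w_2, w_3)$ is our mean squared error, the gradient is just the vector of partial derivatives

$$\nabla E = \left( \frac{\partial E}{\partial w_1},\ \frac{\partial E}{\partial w_2},\ \frac{\partial E}{\partial w_3} \right)$$

and each component answers: if I nudge this one weight a tiny bit, how quickly does the error change?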
Sidebar: Intuition
Imagine standing on hilly terrain, where your goal is to reach the lowest point (the smallest error). The gradient is like a compass that always points in the direction of steepest ascent. By moving in the direction opposite to the gradient, we move toward that lowest point.
Here is how it works:
How do we calculate the gradient of the error for each weight?
One way to calculate the gradient is to make a small adjustment to a weight, see how that affects our error, and use that to decide which direction to move.
def calculate_gradient(weight, data, feature_index, step_size=1e-5):
    original_error = calculate_mean_squared_error(weight, data)

    # Slightly increase the weight
    weight[feature_index] += step_size
    new_error = calculate_mean_squared_error(weight, data)

    # Calculate the slope
    gradient = (new_error - original_error) / step_size

    # Reset the weight
    weight[feature_index] -= step_size

    return gradient
Step-by-step breakdown
Input parameters: weight (the current list of weights), data (our training data), feature_index (which weight we are measuring: 0 for sqft, 1 for bedrooms, 2 for baths), and step_size (how much to nudge the weight, 1e-5 by default).
Calculate the original error:
original_error = calculate_mean_squared_error(weight, data)
We first calculate the mean squared error with the current weights. This gives us our starting point.
weight[feature_index] += step_size
We increase the weight by a tiny amount (step_size). This lets us see how a small change in the weight affects our error.
new_error = calculate_mean_squared_error(weight, data)
After slightly increasing the weight, we calculate the mean squared error again.
gradient = (new_error - original_error) / step_size
This is the key step. We are asking: "How much did the error change when we slightly increased the weight?"
The sign tells us which way to move: a positive gradient means the error rises as this weight rises, so we should decrease the weight. The magnitude tells us how sensitive the error is to changes in that weight.
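For instance (with made-up numbers): if increasing a weight by step_size = 1e-5 moves the error from 1000.0 down to 999.8, the gradient is (999.8 - 1000.0) / 1e-5 = -20000. The negative sign says the error falls as this weight rises, and subtracting that negative gradient will indeed push the weight upward.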
weight[feature_index] -= step_size
We restore the weight to its original value, since we were only testing what happens when we change it.
return gradient
We return the calculated gradient for that weight.
This is called "numerical gradient calculation" or "finite difference method". We're approximating the gradient instead of calculating it analytically.
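For comparison, because our model is linear, the analytical gradient of the mean squared error is easy to write out by hand. The function below is not from the article; it is just a sketch of what "calculating it analytically" would look like for this model:

def calculate_gradient_analytic(weights, data, feature_index):
    # Exact gradient of the mean squared error for a linear model:
    # d(MSE)/dw_j = (2/N) * sum((prediction - actual) * feature_j)
    feature_names = ["sqft", "bedrooms", "baths"]
    total = 0
    for point in data:
        predicted = (weights[0] * point["sqft"]
                     + weights[1] * point["bedrooms"]
                     + weights[2] * point["baths"])
        total += (predicted - point["price"]) * point[feature_names[feature_index]]
    return 2 * total / len(data)

For small step sizes the numerical and analytical versions should return nearly the same number, but the analytical one is exact and needs only a single pass over the data.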
Now that we have our gradients, we can push each weight in the direction opposite its gradient by subtracting the gradient:
weights[i] -= gradients[i]
If our gradient is too large, we could easily overshoot our minimum by updating our weight too much. To fix this, we can multiply the gradient by some small number:
learning_rate = 0.00001
weights[i] -= learning_rate * gradients[i]
And so here is how we do it for all of the weights:
def gradient_descent(data, learning_rate=0.00001, num_iterations=1000):
    weights = [0, 0, 0]  # Start with zero weights

    for iteration in range(num_iterations):
        gradients = [
            calculate_gradient(weights, data, 0),  # sqft
            calculate_gradient(weights, data, 1),  # bedrooms
            calculate_gradient(weights, data, 2),  # bathrooms
        ]

        # Update each weight
        for i in range(3):
            weights[i] -= learning_rate * gradients[i]

        if iteration % 100 == 0:
            error = calculate_mean_squared_error(weights, data)
            print(f"Iteration {iteration}, Error: {error}, Weights: {weights}")

    return weights
Finally, we have our weights!
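Here is one way you might tie everything together end to end; the helper predict_with and the example house below are just illustrations, not part of the original walkthrough:

# Train on our dataset, then price a new home with the learned weights
trained_weights = gradient_descent(data)

def predict_with(weights, sqft, num_bedrooms, num_baths):
    return weights[0]*sqft + weights[1]*num_bedrooms + weights[2]*num_baths

print(predict_with(trained_weights, 1200, 3, 2))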
Once we have our trained weights, we can use them to interpret our model:
For example, if our trained weights are [100, 10000, 15000], it means: every additional square foot adds about $100 to the predicted price, every additional bedroom adds about $10,000, and every additional bathroom adds about $15,000.
Linear models, despite their simplicity, are powerful tools in machine learning. They provide a foundation for understanding more complex algorithms and offer interpretable insights into real-world problems.