How to Replace NaN Values in a Pandas DataFrame with Column Averages?

Linda Hamilton
Release: 2024-10-30 07:01:28

Pandas DataFrame: Replacing NaN Values with Column Averages

Handling missing data in a pandas DataFrame is crucial for accurate analysis. When data is incomplete, replacing NaN values with a reasonable estimate is often necessary. This article demonstrates how to replace each NaN value with the average of the column it appears in.

Problem

Consider a DataFrame with a mixture of real numbers and NaN values. The goal is to replace the NaN values with the average values of the columns in which they appear.
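
Before filling anything, it can help to see how many values are missing in each column and what the column averages are. The following is a minimal sketch, assuming a small hypothetical DataFrame named sample that is not part of the original article:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only (not from the article).
sample = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [np.nan, np.nan, 6.0]})

# isna() marks missing cells; sum() counts them per column.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# mean() skips NaN values by default (skipna=True),
# so each average is computed from the observed values only.
column_means = sample.mean()
print(column_means)
```

Because the mean ignores NaN values, the averages used for filling are based only on the values that are actually present.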

Solution

Unlike a NumPy array, a pandas DataFrame has a built-in fillna method for this task. Passing it the column means fills every NaN in a single call:

<code class="python">df.fillna(df.mean())</code>

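Here df.mean() returns a Series of column averages indexed by column label, and fillna matches that Series against the columns by label, so each NaN is replaced with the mean of its own column. For example: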

<code class="python">import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [-0.166919, -0.297953, -0.120211, np.nan, np.nan, -0.788073, -0.916080, -0.887858, 1.948430, 0.019698],
                   'B': [0.979728, -0.912674, -0.540679, -2.027325, np.nan, np.nan, -0.612343, 1.033826, 1.025011, -0.795876],
                   'C': [-0.632955, -1.365463, -0.680481, 1.533582, 0.461821, np.nan, np.nan, np.nan, -2.982224, -0.046431]})

# Column means; NaN values are skipped by default.
mean = df.mean()
# fillna returns a new DataFrame; df itself is left unchanged.
print(df.fillna(mean))</code>

Output:

          A         B         C
0 -0.166919  0.979728 -0.632955
1 -0.297953 -0.912674 -1.365463
2 -0.120211 -0.540679 -0.680481
3 -0.151121 -2.027325  1.533582
4 -0.151121 -0.231291  0.461821
5 -0.788073 -0.231291 -0.530307
6 -0.916080 -0.612343 -0.530307
7 -0.887858  1.033826 -0.530307
8  1.948430  1.025011 -2.982224
9  0.019698 -0.795876 -0.046431

The NaN values have been replaced with the average values of their respective columns.
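
Two practical details are worth keeping in mind: fillna returns a new DataFrame rather than modifying the original, and a DataFrame with non-numeric columns generally needs numeric_only=True when computing the means. The sketch below illustrates both under those assumptions; the DataFrame mixed and its column names are hypothetical and do not come from the article.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with one text column and two numeric columns.
mixed = pd.DataFrame({
    "name": ["a", "b", "c"],
    "score": [1.0, np.nan, 3.0],
    "weight": [np.nan, 4.0, 8.0],
})

# Restrict the mean to numeric columns so the text column is ignored.
means = mixed.mean(numeric_only=True)

# fillna returns a new DataFrame; assign it to keep the result.
# Columns without an entry in `means` (here "name") are left as they are.
filled = mixed.fillna(means)

print(filled)
```

To keep the filled values, assign the result to a name, as done with filled here, or back to the original variable.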
