How to Efficiently Convert Pandas DataFrames to NumPy Arrays?

Patricia Arquette
Release: 2024-12-20 06:15:10

Converting a pandas DataFrame to a NumPy array


Why df.to_numpy() is the recommended method


df.to_numpy() is the recommended method because it provides a consistent way to obtain NumPy arrays from pandas objects: it is defined on Index, Series, and DataFrame. With the default copy=False, it guarantees neither a view nor a copy. When all columns share a single NumPy-compatible dtype, the result may be a view of the underlying data, in which case changes made to the array can show up in the pandas object. When the columns have mixed dtypes, pandas has to build a new array. Recent pandas versions with Copy-on-Write enabled may also return a read-only array. If an independent array is needed, pass copy=True.
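As a minimal sketch of the points above, the following calls to_numpy() on a DataFrame, a Series, and an Index, and uses copy=True to get an array that is safe to modify. The small frame and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example data with a single dtype (float64).
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

# to_numpy() is available on DataFrame, Series, and Index alike.
frame_arr = df.to_numpy()         # 2-D array, one row per DataFrame row
series_arr = df["A"].to_numpy()   # 1-D array of the column values
index_arr = df.index.to_numpy()   # 1-D array of the index labels

# copy=True always returns an independent array, so writing to it
# cannot change the DataFrame it came from.
safe = df.to_numpy(copy=True)
safe[0, 0] = 99.0

print(frame_arr.shape, series_arr.shape, index_arr.shape)
print(df.loc[0, "A"])  # the DataFrame still holds the original value
```

Relying on copy=True whenever the array will be written to avoids depending on whether a particular pandas version happens to return a view.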


Note that df.values is not deprecated in current versions of pandas, so existing code keeps working. For new code, however, df.to_numpy() is recommended, and older code is best migrated to the newer API as soon as practical.
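One practical reason to prefer the newer API is that to_numpy() accepts dtype and na_value arguments, which the .values attribute cannot. The sketch below uses made-up data to show that both give the same array by default, and how the extra arguments can be used.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: one integer and one float column.
df = pd.DataFrame({"A": [1, 2, 3], "B": [0.5, 1.5, 2.5]})

legacy = df.values      # older attribute-style access
modern = df.to_numpy()  # recommended method; same result by default

# to_numpy() can also request a specific dtype for the result.
as_float32 = df.to_numpy(dtype="float32")

# For nullable extension dtypes, na_value controls how missing values
# are represented in the resulting NumPy array.
s = pd.Series([1, None, 3], dtype="Int64")
filled = s.to_numpy(dtype="float64", na_value=np.nan)

print(np.array_equal(legacy, modern))
print(as_float32.dtype, filled)
```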


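to_numpy() returns an array with a single dtype, so columns of different types are converted to a common one. To keep each column's dtype, the DataFrame.to_records() method can be used instead. It returns a NumPy record array in which every column (and, by default, the index) is stored as a named field.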


    import pandas as pd
    import numpy as np

    index = [1, 2, 3, 4, 5, 6, 7]
    a = [np.nan, np.nan, np.nan, 0.1, 0.1, 0.1, 0.1]
    b = [0.2, np.nan, 0.2, 0.2, 0.2, np.nan, np.nan]
    c = [np.nan, 0.5, 0.5, np.nan, 0.5, 0.5, np.nan]
    df = pd.DataFrame({'A': a, 'B': b, 'C': c}, index=index)
    df = df.rename_axis('ID')

    # Convert the DataFrame to a NumPy record array with preserved dtypes
    array = df.to_records()

    # Print the repr of the record array, which includes each field's dtype
    print(repr(array))

The output of the code is as follows (line wrapping may vary with the NumPy version):

    rec.array([(1, nan, 0.2, nan), (2, nan, nan, 0.5), (3, nan, 0.2, 0.5),
               (4, 0.1, 0.2, nan), (5, 0.1, 0.2, 0.5), (6, 0.1, nan, 0.5),
               (7, 0.1, nan, nan)],
              dtype=[('ID', '<i8'), ('A', '<f8'), ('B', '<f8'), ('C', '<f8')])


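As the dtype list shows, the record array keeps a separate dtype for each field: the ID index is stored as a 64-bit integer ('<i8'), and columns A, B, and C are stored as 64-bit floats ('<f8').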
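The difference is easier to see with columns of genuinely different types. The following sketch uses a small made-up frame (the column names are hypothetical) to contrast the single object dtype returned by to_numpy() with the per-field dtypes of to_records(); index=False is passed here to leave the index out of the records.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: text, integer, and float columns.
df = pd.DataFrame({
    "name": ["x", "y"],
    "count": [1, 2],
    "score": [0.5, 1.5],
})

# to_numpy() must pick one dtype for the whole 2-D array,
# which for mixed column types is the generic object dtype.
plain = df.to_numpy()

# to_records() keeps one named field per column, each with its own dtype.
records = df.to_records(index=False)

print(plain.dtype)
print(records.dtype.names)

# Fields are accessed by name and behave like ordinary typed arrays.
print(records["count"].sum(), records["score"].mean())
```

Record arrays are one-dimensional (one record per row), so code that expects a 2-D numeric matrix is usually better served by selecting the numeric columns and calling to_numpy() on them.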
