Using df.to_numpy() is the recommended method because it provides a consistent way to obtain NumPy arrays from pandas objects. It is defined on Index, Series, and DataFrame objects. By default (copy=False) it does not force a copy, so the result may be a view of the underlying data, in which case modifications made to the NumPy array are also reflected in the pandas object. This is not guaranteed, however: pandas still copies whenever the data has to be converted, for example when the columns have different dtypes. If an independent copy is needed, pass copy=True.
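A minimal sketch of the call, using a small hypothetical frame that is not part of the original example. Whether the default result shares memory with the DataFrame depends on the dtypes and the pandas version, so the sketch requests an explicit copy where independence matters.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame (illustration only)
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

arr = df.to_numpy()            # 2-D array, one row per DataFrame row
safe = df.to_numpy(copy=True)  # always an independent copy

safe[0, 0] = 99.0              # changes the copy only, not df
print(arr.shape, arr.dtype)
print(df.loc[0, "A"])          # still the original value

# Index and Series expose the same method
idx_arr = df.index.to_numpy()
col_arr = df["A"].to_numpy()
```

Passing copy=True costs extra memory, but it removes any doubt about whether later writes to the array can leak back into the DataFrame.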
Note that df.values is not deprecated in current versions of pandas, but df.to_numpy() is recommended for new code, and existing code should migrate to the newer API when practical.
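An ordinary NumPy array has a single dtype, so to_numpy() converts all columns to one common dtype. To preserve the per-column dtypes when converting a pandas DataFrame, use the DataFrame.to_records() method, which returns a NumPy record array with one named field per column (plus the index):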
    import pandas as pd
    import numpy as np

    index = [1, 2, 3, 4, 5, 6, 7]
    a = [np.nan, np.nan, np.nan, 0.1, 0.1, 0.1, 0.1]
    b = [0.2, np.nan, 0.2, 0.2, 0.2, np.nan, np.nan]
    c = [np.nan, 0.5, 0.5, np.nan, 0.5, 0.5, np.nan]
    df = pd.DataFrame({'A': a, 'B': b, 'C': c}, index=index)
    df = df.rename_axis('ID')

    # Convert the DataFrame to a NumPy array with preserved dtypes
    array = df.to_records()

    # Print the NumPy array
    print(array)
Displaying the array's repr (for example, by evaluating array in an interactive session) gives output similar to the following on a little-endian machine:

    rec.array([(1, nan, 0.2, nan), (2, nan, nan, 0.5), (3, nan, 0.2, 0.5),
               (4, 0.1, 0.2, nan), (5, 0.1, 0.2, 0.5), (6, 0.1, nan, 0.5),
               (7, 0.1, nan, nan)],
              dtype=[('ID', '<i8'), ('A', '<f8'), ('B', '<f8'), ('C', '<f8')])
As you can see, the NumPy array preserves the dtypes of the columns in the DataFrame.
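To make the difference concrete, the following sketch compares to_numpy() and to_records() on a small mixed-dtype frame. The frame and its column names (n, score) are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-dtype frame (illustration only)
df = pd.DataFrame(
    {"n": [1, 2, 3], "score": [0.5, 1.5, 2.5]},
    index=pd.Index([10, 20, 30], name="ID"),
)

plain = df.to_numpy()   # homogeneous: the integer column is upcast to float
rec = df.to_records()   # record array: one dtype per field, index included

print(plain.dtype)
print(rec.dtype.names)
print(rec["n"].dtype)   # the integer column keeps its integer dtype

# Fields can be read by name or, on a record array, as attributes
mean_score = rec.score.mean()

# Drop the index field when it is not needed
no_index = df.to_records(index=False)
```

The record array is one-dimensional, with one record per row, so column-wise numeric operations go through the named fields rather than through 2-D slicing.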