Deleting DataFrame Rows Based on Column Value Efficiently
In pandas, rows can be deleted based on a column's value in several ways. One of the simplest and fastest is boolean indexing, also called logical indexing: build a True/False mask from the column and keep only the rows where the mask is True.
Consider the following DataFrame:
import pandas as pd

df = pd.DataFrame({
    "daysago": [62, 83, 111, 139, 160, 204, 222, 245, 258, 275, 475, 504, 515, 542, 549, 556, 577, 589, 612, 632, 719, 733, 760, 790, 810, 934],
    "line_race": [11, 11, 9, 10, 10, 9, 8, 9, 11, 8, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "rating": [56, 67, 66, 83, 88, 52, 66, 70, 68, 72, 65, 70, 64, 70, 70, -1, -1, -1, -1, -1, 69, -1, -1, -1, -1, -1],
    "rw": [1.000000, 1.000000, 1.000000, 0.880678, 0.793033, 0.636655, 0.581946, 0.518825, 0.486226, 0.446667, 0.164591, 0.142409, 0.134800, 0.117803, 0.113758, 0.109852, 0.098919, 0.093168, 0.083063, 0.075171, 0.048690, 0.045404, 0.039679, 0.034160, 0.030915, 0.016647],
    "wrating": [56.000000, 67.000000, 66.000000, 73.096278, 69.786942, 33.106077, 38.408408, 36.317752, 33.063381, 32.160051, 10.698423, 9.968634, 8.627219, 8.246238, 7.963072, -0.109852, -0.098919, -0.093168, -0.083063, -0.075171, 3.359623, -0.045404, -0.039679, -0.034160, -0.030915, -0.016647],
})
To delete the rows where the line_race column is equal to 0, we can use the following line of code:
df = df[df["line_race"] != 0]
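The comparison df["line_race"] != 0 produces a boolean Series, and indexing df with it returns a new DataFrame containing only the rows where line_race is not 0. For the data above, that keeps the first 11 rows and drops the remaining 15. Note that boolean indexing does copy the selected rows into a new object; it does not modify the original in place. It is efficient because the comparison and the selection are vectorized over the whole column instead of looping over rows in Python. The surviving rows keep their original index labels, so call reset_index(drop=True) afterwards if a contiguous index is needed.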
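As a minimal sketch of the "several ways" mentioned earlier, the snippet below compares boolean indexing with two other standard pandas calls, DataFrame.drop and DataFrame.query. The small frame and the variable names (kept_mask, kept_drop, kept_query, kept_reset) are made up for illustration and are not the article's data.

```python
import pandas as pd

# Small illustrative frame (hypothetical values, not the article's data)
df = pd.DataFrame({
    "line_race": [11, 9, 0, 5, 0],
    "rating": [56, 66, 70, 65, -1],
})

# 1. Boolean indexing: keep rows where the mask is True
kept_mask = df[df["line_race"] != 0]

# 2. drop(): look up the index labels of the unwanted rows and drop them
kept_drop = df.drop(df[df["line_race"] == 0].index)

# 3. query(): express the same condition as a string
kept_query = df.query("line_race != 0")

# All three keep the original index labels of the surviving rows;
# reset_index(drop=True) renumbers them from 0
kept_reset = kept_mask.reset_index(drop=True)

print(kept_mask)
print(kept_reset)
```

All three forms should give the same rows here. Boolean indexing is usually the most direct, while drop() can be convenient when the index labels to remove are already at hand.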