Let's assume we have a simple DataFrame like the following:
    import pandas as pd
    from random import randint

    df = pd.DataFrame({'A': [randint(1, 9) for x in range(10)],
                       'B': [randint(1, 9)*10 for x in range(10)],
                       'C': [randint(1, 9)*100 for x in range(10)]})
Our goal is to select values from column 'A' that meet specific criteria for corresponding values in columns 'B' and 'C'.
To achieve this, we can utilize Boolean indexing. First, we create Boolean Series objects for each criterion:
    df["B"] > 50
    (df["B"] > 50) & (df["C"] != 900)
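These Boolean Series mark the rows that satisfy each criterion. Note the parentheses around each comparison: & binds more tightly than > and !=, so they are required when combining conditions. We can then use these Series as masks to select the desired values: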
    df["A"][df["B"] > 50]
    df["A"][(df["B"] > 50) & (df["C"] != 900)]
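To make the result easier to follow, here is a minimal sketch of the same selection on a small DataFrame with fixed values. The fixed data and the names b_mask, combined_mask, selected_b and selected_both are assumptions introduced for illustration; the example above uses random values, so its output differs on every run.

```python
import pandas as pd

# Fixed values chosen for reproducibility (an assumption; the article uses random data).
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [10, 60, 70, 40, 90],
    "C": [100, 900, 300, 400, 500],
})

# Each comparison yields a Boolean Series aligned with df's index.
b_mask = df["B"] > 50
combined_mask = (df["B"] > 50) & (df["C"] != 900)

# Indexing column "A" with a mask keeps only the rows where the mask is True.
selected_b = df["A"][b_mask]
selected_both = df["A"][combined_mask]

print(b_mask.tolist())         # [False, True, True, False, True]
print(selected_b.tolist())     # [2, 3, 5]
print(selected_both.tolist())  # [3, 5]
```

The row with B = 60 passes the first criterion but is dropped by the second, because its C value is 900.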
We can also use the .loc indexer, which takes the row condition and the column label together in a single indexing operation:
    df.loc[(df["B"] > 50) & (df["C"] != 900), "A"]
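Both forms use Boolean indexing and return the same values. When only reading data, the choice between them is largely a matter of preference and readability. When assigning to the selected values, .loc is the safer choice, because a chained expression such as df["A"][mask] = value may end up modifying a temporary copy instead of the original DataFrame.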
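The comparison between the two forms can be checked directly. The sketch below uses a small DataFrame with fixed values (an assumption made for reproducibility) and the hypothetical names mask, via_chain, via_loc and updated; it also shows an assignment through .loc, made on a copy so the original frame stays untouched.

```python
import pandas as pd

# Fixed values chosen for reproducibility (an assumption; the article uses random data).
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [10, 60, 70, 40, 90],
    "C": [100, 900, 300, 400, 500],
})

mask = (df["B"] > 50) & (df["C"] != 900)

# Reading: chained indexing and .loc give the same Series.
via_chain = df["A"][mask]
via_loc = df.loc[mask, "A"]
same = via_chain.equals(via_loc)
print(same)  # True

# Writing: .loc sets the matching rows in one indexing operation.
updated = df.copy()
updated.loc[mask, "A"] = 0
print(updated["A"].tolist())  # [1, 2, 0, 4, 0]
```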