Modifying Data in Pandas Based on Matching Values
When moving from Stata to Pandas for data manipulation, it helps to know how to change values based on a matching condition. Consider the situation where we want to replace specific values in the "FirstName" and "LastName" columns for the rows whose "ID" value matches a certain number.
In Stata, this is a one-line command: replace FirstName = "Matt" if ID == 103 (Stata string literals take double quotes). To achieve a similar result in Pandas, we can use either loc or chained assignment.
loc Method:
The loc indexer takes a Boolean condition as the row selector and a column label as the column selector, so the matching cells are selected and assigned in a single step:
<code class="python">import pandas as pd

df = pd.read_csv("test.csv")
df.loc[df.ID == 103, 'FirstName'] = "Matt"
df.loc[df.ID == 103, 'LastName'] = "Jones"</code>
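The two assignments above can also be combined into one loc call by passing a list of column labels. The following is a minimal sketch, assuming a small in-memory DataFrame (made-up rows standing in for test.csv, which is not shown in this article) with exactly one row where ID is 103:

```python
import pandas as pd

# Hypothetical sample data standing in for test.csv
df = pd.DataFrame({
    "ID": [101, 102, 103],
    "FirstName": ["Ann", "Bob", "Carl"],
    "LastName": ["Lee", "Ray", "Smith"],
})

# One .loc call: rows where ID is 103, both name columns at once
df.loc[df["ID"] == 103, ["FirstName", "LastName"]] = ["Matt", "Jones"]

print(df)
```

The values on the right-hand side are matched to the listed columns in order, so "Matt" goes to FirstName and "Jones" to LastName.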
Chained Assignment:
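Chained assignment selects the column first and then indexes into it with the condition. It is discouraged in newer Pandas versions, where it can trigger warnings about setting values on a copy, but it still appears in older code: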
<code class="python">import pandas as pd

df = pd.read_csv("test.csv")
df['FirstName'][df.ID == 103] = "Matt"
df['LastName'][df.ID == 103] = "Jones"</code>
In both methods, the expression "df.ID == 103" creates a Boolean mask, where True marks the rows in which ID equals 103. The subsequent assignments then modify the corresponding values in the "FirstName" and "LastName" columns.
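To make the mask concrete, it can be stored in a variable, inspected, and then reused. This is a small sketch with made-up rows (the variable name mask and the sample data are assumptions for illustration, not part of the original example):

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({
    "ID": [101, 103, 104],
    "FirstName": ["Ann", "Bob", "Carl"],
})

# Comparing a column to a value yields a Boolean Series, one entry per row
mask = df["ID"] == 103
print(mask.tolist())  # [False, True, False]

# The same mask can then be passed to .loc as the row selector
df.loc[mask, "FirstName"] = "Matt"
print(df["FirstName"].tolist())  # ['Ann', 'Matt', 'Carl']
```

Storing the mask avoids recomputing the comparison when several columns need to be updated for the same rows.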
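Note: Chained assignment was a common pattern in code written for older Pandas versions. loc is the preferred method in current versions because it selects and assigns in a single operation on the original DataFrame, so the result does not depend on whether an intermediate selection is a view or a copy. Under Copy-on-Write, which pandas 3.0 adopts as the default behavior, chained assignment no longer updates the original DataFrame.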