Row Summation of Given Columns in Pandas DataFrame
In pandas, a common task is to add a column that holds the row-wise sum of a few chosen columns of a DataFrame. Getting this right depends on passing the correct arguments to sum(), in particular axis and numeric_only.
Let's consider the following DataFrame:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4], 'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})
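Our objective is to add a column 'e' that holds the row-wise sum of columns 'a', 'b', and 'd'. Intuitively, one might try something like:
df['e'] = df[['a', 'b', 'd']].map(sum)
but this does not produce the desired output. Depending on the pandas version, DataFrame.map either does not exist or applies the function to each element individually rather than to each row.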
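The correct approach is to call sum() with axis=1, which sums across the columns of each row, and numeric_only=True, which skips non-numeric columns such as 'c':
df['e'] = df.sum(axis=1, numeric_only=True)
Because 'a', 'b', and 'd' are the only numeric columns here, this is exactly the sum we want. Applying this approach yields the following result: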
Output:
   a  b   c  d   e
0  1  2  dd  5   8
1  2  3  ee  9  14
2  3  4  ff  1   8
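Note that numeric_only=True sums every numeric column, so it only matches the goal when no other numeric columns are present. A minimal sketch, assuming the same sample data, that instead selects the three columns by name before summing (the variable name cols_to_sum is illustrative and not from the original):

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4],
                   'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})

# Select the columns by name, then sum across each row (axis=1).
cols_to_sum = ['a', 'b', 'd']  # illustrative name, not from the article
df['e'] = df[cols_to_sum].sum(axis=1)

print(df)
```

Because the selection is explicit, the result should not change if further numeric columns are added to the DataFrame later.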
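Alternatively, to sum only specific columns, build a list of the column names and drop the ones you don't need with remove(). Starting again from the original DataFrame (before 'e' was added), the following excludes 'd'. Passing numeric_only=True keeps the string column 'c' out of the sum, since recent pandas versions (2.0 and later) no longer drop non-numeric columns by default.
col_list = list(df)
col_list.remove('d')
df['e'] = df[col_list].sum(axis=1, numeric_only=True)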
Output:
   a  b   c  d  e
0  1  2  dd  5  3
1  2  3  ee  9  5
2  3  4  ff  1  7
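The same exclusion can also be written without building a list by hand. The following is one possible variation, not taken from the original answer, assuming the same sample data: it drops 'd' with DataFrame.drop, keeps only numeric columns with select_dtypes, and then sums each row.

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4],
                   'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})

# Drop 'd', keep only the numeric columns, then sum across each row.
row_sum = df.drop(columns=['d']).select_dtypes(include='number').sum(axis=1)
df['e'] = row_sum

print(df)
```

Neither drop nor select_dtypes modifies df in place here, so the original columns stay intact and only 'e' is added.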
With these operations, you can sum rows over exactly the columns you choose: use numeric_only=True to skip non-numeric columns, or narrow the column list before calling sum(axis=1).