How to get the sum of a specific column of a dataframe in Pandas Python?
Summing a single column is a common task when working with a Pandas DataFrame. Selecting a column returns a Series, and calling sum() on that Series adds up its values.
The column can be selected either by name or by integer position. The sections below cover both approaches, as well as summing several columns at once.
Creating a Sample DataFrame
First, let's create a DataFrame with sample data:
import pandas as pd
my_data = {
    'Name': pd.Series(['Tom', 'Jane', 'Vin', 'Eve', 'Will']),
    'Age': pd.Series([45, 67, 89, 12, 23]),
    'Value': pd.Series([8.79, 23.24, 31.98, 78.56, 90.20])
}
my_df = pd.DataFrame(my_data)
print("The dataframe is:")
print(my_df)
The dataframe is:
   Name  Age  Value
0   Tom   45   8.79
1  Jane   67  23.24
2   Vin   89  31.98
3   Eve   12  78.56
4  Will   23  90.20
Method 1: Using Column Name
The most common approach is to select the column by name and call sum() on the result:
import pandas as pd
my_data = {
    'Name': pd.Series(['Tom', 'Jane', 'Vin', 'Eve', 'Will']),
    'Age': pd.Series([45, 67, 89, 12, 23]),
    'Value': pd.Series([8.79, 23.24, 31.98, 78.56, 90.20])
}
my_df = pd.DataFrame(my_data)
print("The sum of 'Age' column is:")
print(my_df['Age'].sum())
print("The sum of 'Value' column is:")
print(my_df['Value'].sum())
The sum of 'Age' column is:
236
The sum of 'Value' column is:
232.77
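Real data often has gaps. As a minimal sketch, assuming a hypothetical 'Score' column (not part of the sample data above) with one missing entry, the following shows how sum() treats missing values: by default they are skipped, while skipna=False makes the result NaN.

```python
import pandas as pd

# Hypothetical column with one missing value (illustration only)
scores_df = pd.DataFrame({'Score': [10.0, None, 5.5]})

# Default behaviour (skipna=True): missing values are ignored
total = scores_df['Score'].sum()

# skipna=False: any missing value makes the result NaN
strict_total = scores_df['Score'].sum(skipna=False)

print(total)         # 15.5
print(strict_total)  # nan
```

Passing skipna=False can be useful when a silent partial total would be misleading.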
Method 2: Using Column Index
You can also select a column by its zero-based integer position using iloc:
import pandas as pd
my_data = {
    'Name': pd.Series(['Tom', 'Jane', 'Vin', 'Eve', 'Will']),
    'Age': pd.Series([45, 67, 89, 12, 23]),
    'Value': pd.Series([8.79, 23.24, 31.98, 78.56, 90.20])
}
my_df = pd.DataFrame(my_data)
# Age column is at index 1
print("Sum using column index 1 (Age):")
print(my_df.iloc[:, 1].sum())
# Value column is at index 2
print("Sum using column index 2 (Value):")
print(my_df.iloc[:, 2].sum())
Sum using column index 1 (Age):
236
Sum using column index 2 (Value):
232.77
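Hard-coded positions break if the column order changes. One possible way around this, sketched below with the same sample data, is to look up the position from the column name with columns.get_loc() and pass that to iloc. The variable names age_pos and age_sum are chosen here for illustration.

```python
import pandas as pd

my_df = pd.DataFrame({
    'Name': ['Tom', 'Jane', 'Vin', 'Eve', 'Will'],
    'Age': [45, 67, 89, 12, 23],
    'Value': [8.79, 23.24, 31.98, 78.56, 90.20],
})

# Find the integer position of the 'Age' column (column names are unique here)
age_pos = my_df.columns.get_loc('Age')

# Use that position with iloc instead of a hard-coded index
age_sum = my_df.iloc[:, age_pos].sum()

print(age_pos)  # 1
print(age_sum)  # 236
```

Note that get_loc() returns a plain integer only when the column name is unique, as it is in this DataFrame.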
Method 3: Multiple Column Sums
To sum several columns at once, select them with a list of column names; sum() then returns a Series with one total per column:
import pandas as pd
my_data = {
    'Name': pd.Series(['Tom', 'Jane', 'Vin', 'Eve', 'Will']),
    'Age': pd.Series([45, 67, 89, 12, 23]),
    'Value': pd.Series([8.79, 23.24, 31.98, 78.56, 90.20])
}
my_df = pd.DataFrame(my_data)
print("Sum of numeric columns:")
print(my_df[['Age', 'Value']].sum())
Sum of numeric columns:
Age      236.00
Value    232.77
dtype: float64
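When the numeric columns are not known in advance, listing them by hand is not practical. As a rough sketch using the same sample data, DataFrame.sum() accepts numeric_only=True, which restricts the calculation to numeric columns and leaves out the text column 'Name'. The variable name numeric_sums is chosen here for illustration.

```python
import pandas as pd

my_df = pd.DataFrame({
    'Name': ['Tom', 'Jane', 'Vin', 'Eve', 'Will'],
    'Age': [45, 67, 89, 12, 23],
    'Value': [8.79, 23.24, 31.98, 78.56, 90.20],
})

# Sum every numeric column; the non-numeric 'Name' column is excluded
numeric_sums = my_df.sum(numeric_only=True)

print(numeric_sums)

# Individual totals can be read from the result by column name
print(numeric_sums['Age'])
```

The result is a Series indexed by column name, so it has the same shape as the output of the list-based selection above.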
Conclusion
Use df['column_name'].sum() to get the sum of a specific column by name. To sum several columns in one call, select them with a list of column names and call sum() on the result, which returns one total per column. When only a column's position is known, df.iloc[:, position].sum() gives the same result by integer position.
