How to Concatenate Column Values in a Pandas DataFrame?
Pandas is a powerful library for data manipulation and analysis in Python. Concatenating column values involves combining the values of two or more columns into a single column, which is useful for creating new variables, merging data from different sources, or formatting data for analysis.
There are several ways to concatenate column values in a Pandas DataFrame. In this tutorial, we'll explore three common approaches: the str.cat() method, apply() with a lambda function, and the + operator.
Using the str.cat() Method
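The str.cat() method is designed specifically for concatenating string values in pandas Series. It provides clean syntax, accepts a separator through sep, and lets you control how missing values appear in the result through its na_rep parameter. Non-string columns such as Age must be converted with astype(str) first: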
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({
    'Name': ['John', 'Jane', 'Bob'],
    'Age': [25, 30, 35],
    'Country': ['USA', 'Canada', 'Mexico']
})
print("Original DataFrame:")
print(df)
print()
# Concatenate columns using str.cat()
df['Name_Age_Country'] = df['Name'].str.cat([df['Age'].astype(str), df['Country']], sep='|')
print("DataFrame with concatenated column:")
print(df[['Name', 'Age', 'Country', 'Name_Age_Country']])
Original DataFrame:
   Name  Age Country
0  John   25     USA
1  Jane   30  Canada
2   Bob   35  Mexico

DataFrame with concatenated column:
   Name  Age Country Name_Age_Country
0  John   25     USA      John|25|USA
1  Jane   30  Canada   Jane|30|Canada
2   Bob   35  Mexico    Bob|35|Mexico
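How str.cat() treats missing values is worth seeing in isolation. The following is a minimal sketch, assuming hypothetical sample data (df_missing) with one missing Country. By default, a missing value in any input makes the whole concatenated value missing; passing na_rep substitutes a placeholder instead.

```python
import pandas as pd

# Hypothetical sample data: Jane's country is missing
df_missing = pd.DataFrame({
    'Name': ['John', 'Jane', 'Bob'],
    'Country': ['USA', None, 'Mexico']
})

# Default (na_rep=None): the row with a missing input becomes missing in the result
default_result = df_missing['Name'].str.cat(df_missing['Country'], sep='|')

# With na_rep, the missing value is replaced by the given placeholder text
filled_result = df_missing['Name'].str.cat(
    df_missing['Country'], sep='|', na_rep='Unknown'
)

print(default_result.tolist())
print(filled_result.tolist())  # ['John|USA', 'Jane|Unknown', 'Bob|Mexico']
```

The placeholder 'Unknown' is only an illustration; an empty string works too if you prefer to leave the gap blank.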
Using String Concatenation with apply()
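You can also use the apply() method with a lambda function when the concatenation needs custom formatting. With axis=1, the function is called once per row, so it can mix columns of any type inside an f-string: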
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({
    'First_Name': ['John', 'Jane', 'Bob'],
    'Last_Name': ['Doe', 'Smith', 'Johnson'],
    'Age': [25, 30, 35]
})
print("Original DataFrame:")
print(df)
print()
# Concatenate with custom formatting using apply()
df['Full_Info'] = df.apply(lambda row: f"{row['First_Name']} {row['Last_Name']} (Age: {row['Age']})", axis=1)
print("DataFrame with formatted concatenation:")
print(df[['First_Name', 'Last_Name', 'Age', 'Full_Info']])
Original DataFrame:
  First_Name Last_Name  Age
0       John       Doe   25
1       Jane     Smith   30
2        Bob   Johnson   35

DataFrame with formatted concatenation:
  First_Name Last_Name  Age              Full_Info
0       John       Doe   25     John Doe (Age: 25)
1       Jane     Smith   30   Jane Smith (Age: 30)
2        Bob   Johnson   35  Bob Johnson (Age: 35)
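Because apply() runs ordinary Python for each row, it can also carry conditional logic that the other approaches cannot express directly. The sketch below assumes a hypothetical helper, format_person, and sample data in which one Last_Name is missing; it is one possible way to skip missing parts, not the only one.

```python
import pandas as pd

# Hypothetical sample data: Jane has no last name recorded
people = pd.DataFrame({
    'First_Name': ['John', 'Jane', 'Bob'],
    'Last_Name': ['Doe', None, 'Johnson'],
    'Age': [25, 30, 35]
})

def format_person(row):
    """Hypothetical helper: build the label, skipping a missing last name."""
    parts = [row['First_Name']]
    if pd.notna(row['Last_Name']):
        parts.append(row['Last_Name'])
    return f"{' '.join(parts)} (Age: {row['Age']})"

# axis=1 passes each row to the helper as a Series
people['Full_Info'] = people.apply(format_person, axis=1)
print(people['Full_Info'].tolist())
```

A named function is used here instead of a lambda only because the logic spans several lines.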
Using Simple String Concatenation
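For basic concatenation, you can use the + operator directly on string columns. Every column involved must already hold strings, which is why Zip is stored as text in this example: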
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({
    'City': ['New York', 'Los Angeles', 'Chicago'],
    'State': ['NY', 'CA', 'IL'],
    'Zip': ['10001', '90210', '60601']
})
print("Original DataFrame:")
print(df)
print()
# Simple concatenation with + operator
df['Address'] = df['City'] + ', ' + df['State'] + ' ' + df['Zip']
print("DataFrame with concatenated address:")
print(df[['City', 'State', 'Zip', 'Address']])
Original DataFrame:
          City State    Zip
0     New York    NY  10001
1  Los Angeles    CA  90210
2      Chicago    IL  60601

DataFrame with concatenated address:
          City State    Zip                Address
0     New York    NY  10001     New York, NY 10001
1  Los Angeles    CA  90210  Los Angeles, CA 90210
2      Chicago    IL  60601      Chicago, IL 60601
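Two caveats of the + operator are easy to miss: numeric columns need an explicit astype(str), and a missing value in any operand makes the whole result missing. The following sketch illustrates both, assuming hypothetical sample data (places) with an integer Zip column and one missing State. The '??' placeholder is an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical sample data: Zip is numeric and one State is missing
places = pd.DataFrame({
    'City': ['New York', 'Los Angeles', 'Chicago'],
    'State': ['NY', None, 'IL'],
    'Zip': [10001, 90210, 60601]
})

# Convert the integer column to strings before concatenating;
# adding a string column to an integer column directly would raise an error
zip_text = places['Zip'].astype(str)

# The missing State propagates, so the second address becomes missing
raw = places['City'] + ', ' + places['State'] + ' ' + zip_text

# Filling the gap first keeps every row
filled = places['City'] + ', ' + places['State'].fillna('??') + ' ' + zip_text

print(raw.tolist())
print(filled.tolist())
```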
Comparison of Methods
| Method | Best For | Handles NaN | Performance |
|---|---|---|---|
| str.cat() | Multiple columns with separator | Yes, through na_rep | Fast |
| apply() with lambda | Complex formatting logic | Depends on logic | Slower |
| + operator | Simple string concatenation | No, NaN propagates | Fast |
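The performance column can be checked informally with timeit. This is a rough sketch rather than a benchmark: the data (bench), the row count, and the function names are all assumptions made for illustration, and absolute timings will vary with the machine and the pandas version. The general expectation is that apply() with axis=1 is the slowest of the three because it calls a Python function once per row.

```python
import timeit

import pandas as pd

# Hypothetical benchmark data: 10,000 identical rows
n = 10_000
bench = pd.DataFrame({
    'City': ['Chicago'] * n,
    'State': ['IL'] * n,
})

def with_str_cat():
    return bench['City'].str.cat(bench['State'], sep=', ')

def with_apply():
    return bench.apply(lambda row: f"{row['City']}, {row['State']}", axis=1)

def with_plus():
    return bench['City'] + ', ' + bench['State']

# Time three runs of each approach and report the total
for func in (with_str_cat, with_apply, with_plus):
    seconds = timeit.timeit(func, number=3)
    print(f"{func.__name__}: {seconds:.4f} s for 3 runs")
```

All three functions produce the same values on this data, so the comparison isolates the cost of the method itself.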
Conclusion
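Use str.cat() for clean concatenation with separators and control over missing values through na_rep. Use apply() with a lambda function when the formatting logic is too complex for the other two, accepting that it is slower on large DataFrames. For simple joining of string columns with no missing values, the + operator is the most straightforward approach.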
