
Python Pandas - Working with Text Data
Pandas provides powerful tools for working with text data through the .str accessor. The accessor applies string operations element-wise to Series and Index objects, so text stored in a DataFrame column can be manipulated without writing explicit loops.
The .str accessor provides a variety of string methods for operations such as transformation, concatenation, searching, and many others. The methods are grouped below by functionality:
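As a minimal sketch of the accessor in use (the sample names and labels are made up for illustration), the following applies one string method to a Series and another to an Index. Missing values are passed through as missing rather than raising an error.

```python
import pandas as pd

# Hypothetical sample data, including a missing value
names = pd.Series(["alice", "BOB", None, "Carol"])

# .str applies the method to every element; the missing entry stays missing
upper_names = names.str.upper()

# The same accessor is available on Index objects
labels = pd.Index(["  a", "b  ", " c "])
clean_labels = labels.str.strip()

print(upper_names)
print(clean_labels)
```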
String Transformation
This category includes methods that transform the strings in some way, such as changing the case, formatting, or modifying specific characters.
Sr.No. | Methods & Description |
---|---|
1 | Series.str.capitalize() Transforms the first character of each string in the Series or Index to uppercase and the rest to lowercase. |
2 | Series.str.casefold() Converts each string to lowercase in a more aggressive manner suitable for case-insensitive comparisons. |
3 | Series.str.lower() Converts all characters in each string of the Series or Index to lowercase. |
4 | Series.str.upper() Converts all characters in each string of the Series or Index to uppercase. |
5 | Series.str.title() Converts each string to titlecase, where the first character of each word is capitalized. |
6 | Series.str.swapcase() Swaps the case of each character, converting uppercase to lowercase and vice versa. |
7 | Series.str.replace() Replaces occurrences of a pattern or regular expression in each string with another string. |
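The transformation methods can be sketched as follows. The sample strings are invented for illustration, and `regex=False` is passed to `replace()` so that the pattern is treated as a literal string.

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series(["hello world", "PANDAS rocks"])

capitalized = s.str.capitalize()  # first character upper, rest lower
titled = s.str.title()            # first character of each word upper
swapped = s.str.swapcase()        # upper <-> lower
folded = s.str.casefold()         # lowercase form for caseless comparison

# Literal (non-regex) replacement of "o" with "0"
replaced = s.str.replace("o", "0", regex=False)

print(pd.DataFrame({
    "capitalize": capitalized,
    "title": titled,
    "swapcase": swapped,
    "casefold": folded,
    "replace": replaced,
}))
```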
String Trimming
This category includes methods that trim leading or trailing characters, or remove a specified prefix or suffix, from each string.
Sr.No. | Methods & Description |
---|---|
1 | Series.str.lstrip() Removes leading characters (by default, whitespace) from each string. |
2 | Series.str.strip() Removes leading and trailing characters (by default, whitespace) from each string. |
3 | Series.str.rstrip() Removes trailing characters (by default, whitespace) from each string. |
4 | Series.str.removeprefix(prefix) Removes the specified prefix from each string in the Series or Index, if it exists. |
5 | Series.str.removesuffix(suffix) Removes the specified suffix from each string in the Series or Index, if it exists. |
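A short sketch of the trimming methods, using made-up file names. It assumes a pandas version that provides `removeprefix()` and `removesuffix()` (1.4 or later).

```python
import pandas as pd

# Hypothetical sample data with stray whitespace
raw = pd.Series(["  id_001.csv ", "id_002.csv", "\tid_003.csv\n"])

# Remove leading and trailing whitespace
stripped = raw.str.strip()

# Remove a fixed prefix and suffix where present
no_prefix = stripped.str.removeprefix("id_")
numbers = no_prefix.str.removesuffix(".csv")

# lstrip() can also remove a chosen set of characters instead of whitespace
dashed = pd.Series(["--a--", "-b"]).str.lstrip("-")

print(numbers)
print(dashed)
```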
String Concatenation and Joining Methods
These methods allow you to combine multiple strings into one or join elements within strings using specified separators.
Sr.No. | Methods & Description |
---|---|
1 | Series.str.cat() Concatenates strings in the Series or Index with an optional separator. |
2 | Series.str.join() Joins the elements of the list contained in each entry of the Series or Index using the specified separator. |
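The difference between the two methods can be sketched like this, with invented sample data: `cat()` combines strings across Series (or collapses one Series into a single string), while `join()` works on entries that are themselves lists.

```python
import pandas as pd

# Hypothetical sample data
first = pd.Series(["Ada", "Alan"])
last = pd.Series(["Lovelace", "Turing"])

# Element-wise concatenation with another Series
full = first.str.cat(last, sep=" ")

# Without another Series, cat() collapses all values into one string
all_first = first.str.cat(sep=", ")

# join() expects each entry to be a list of strings
tags = pd.Series([["a", "b"], ["c"]])
joined = tags.str.join("-")

print(full)
print(all_first)
print(joined)
```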
String Padding Methods
This category includes methods to pad strings to a specific length or align them within a specified width.
Sr.No. | Methods & Description |
---|---|
1 | Series.str.center() Centers each string in the Series or Index within a specified width, padding both sides with a specified fill character. |
2 | Series.str.pad() Pads each string in the Series or Index to a specified width, with an option to pad from the left, right, or both sides. |
3 | Series.str.ljust() Pads the right side of each string in the Series or Index with a specified character to reach the specified width. |
4 | Series.str.rjust() Pads the left side of each string in the Series or Index with a specified character to reach the specified width. |
5 | Series.str.zfill() Pads each string in the Series or Index with zeros on the left, up to the specified width. |
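The padding methods can be compared side by side in a small sketch. The sample codes and the fill characters are arbitrary choices for illustration.

```python
import pandas as pd

# Hypothetical sample data
codes = pd.Series(["7", "42", "123"])

zero_padded = codes.str.zfill(5)                          # zeros on the left
right_aligned = codes.str.rjust(5, fillchar=".")          # pad on the left
left_aligned = codes.str.ljust(5, fillchar=".")           # pad on the right
centered = codes.str.center(5, fillchar="*")              # pad on both sides
padded = codes.str.pad(5, side="both", fillchar="*")      # general form

print(pd.DataFrame({
    "zfill": zero_padded,
    "rjust": right_aligned,
    "ljust": left_aligned,
    "center": centered,
    "pad": padded,
}))
```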
String Searching Methods
These methods help you locate substrings, count occurrences, or check for patterns within the text.
Sr.No. | Methods & Description |
---|---|
1 | Series.str.contains() Checks whether each string in the Series or Index contains a specified pattern. |
2 | Series.str.count() Counts occurrences of a pattern or regular expression in each string of the Series or Index. |
3 | Series.str.find() Finds the lowest index of a substring in each string of the Series or Index, returning -1 if the substring is not found. |
4 | Series.str.rfind() Finds the highest index of a substring in each string of the Series or Index, returning -1 if the substring is not found. |
5 | Series.str.index() Similar to find(), but raises an exception if the substring is not found. |
6 | Series.str.rindex() Similar to rfind(), but raises an exception if the substring is not found. |
7 | Series.str.match() Checks for a match only at the beginning of each string. |
8 | Series.str.fullmatch() Checks for a match across the entire string. |
9 | Series.str.extract() Extracts the capture groups of the first match in each string using regular expressions. |
10 | Series.str.extractall() Extracts the capture groups of all matches in each string using regular expressions. |
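A sketch of several searching methods on made-up email-like strings. The regular expressions and the group names `user` and `domain` are illustrative choices, not part of the pandas API.

```python
import pandas as pd

# Hypothetical sample data
emails = pd.Series(["ann@example.com", "bob@test.org", "not-an-email"])

has_at = emails.str.contains("@", regex=False)        # literal substring test
n_dots = emails.str.count(r"\.")                      # regex: count literal dots
at_pos = emails.str.find("@")                         # -1 where "@" is absent
starts_ok = emails.str.match(r"[a-z]+@")              # anchored at the start only
whole_ok = emails.str.fullmatch(r"[a-z]+@[a-z.]+")    # must match the whole string

# extract() returns one column per capture group, with NaN where nothing matches
parts = emails.str.extract(r"(?P<user>[^@]+)@(?P<domain>.+)")

# extractall() returns one row per match, indexed by (original index, match number)
all_numbers = pd.Series(["a1b22", "c3"]).str.extractall(r"(\d+)")

print(parts)
print(all_numbers)
```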
String Splitting Methods
Splitting methods divide strings based on a delimiter or pattern, which is useful for parsing text data into separate components.
Sr.No. | Methods & Description |
---|---|
1 | Series.str.split() Splits each string in the Series or Index by the specified delimiter or regular expression, and returns a list of strings. |
2 | Series.str.rsplit() Splits each string in the Series or Index by the specified delimiter, starting from the right side, and returns a list of strings. |
3 | Series.str.partition() Splits each string at the first occurrence of the delimiter into three parts: the part before the delimiter, the delimiter itself, and the part after the delimiter. The parts are returned as DataFrame columns by default, or as tuples with expand=False. |
4 | Series.str.rpartition() Splits each string at the last occurrence of the delimiter into three parts: the part before the delimiter, the delimiter itself, and the part after the delimiter. The parts are returned as DataFrame columns by default, or as tuples with expand=False. |
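The splitting methods might be used as in the following sketch, which works on invented path-like strings. It also shows the `expand=True` option, which spreads the pieces across DataFrame columns instead of returning lists.

```python
import pandas as pd

# Hypothetical sample data
paths = pd.Series(["usr/local/bin", "etc/hosts"])

# One list of pieces per element
pieces = paths.str.split("/")

# At most one split, expanded into separate columns
head_rest = paths.str.split("/", n=1, expand=True)

# Split once from the right and keep the last piece
last_part = paths.str.rsplit("/", n=1).str[-1]

# partition() yields three columns by default: before, separator, after
three_parts = paths.str.partition("/")

print(pieces)
print(head_rest)
print(last_part)
print(three_parts)
```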
String Filtering Methods
These methods check the character content of each string, extract elements, or measure length, which is useful for filtering and cleaning text data.
Sr.No. | Methods & Description |
---|---|
1 | Returns elements for which a provided function evaluates to true. |
2 | Series.str.get() Extracts element from each component at specified position. |
3 | Series.str.get_dummies() Splits each string in the Series by the specified delimiter and returns a DataFrame of dummy/indicator variables. |
4 | Series.str.isalpha() Checks whether each string consists only of alphabetic characters. |
5 | Series.str.isdigit() Checks whether each string consists only of digits. |
6 | Series.str.isnumeric() Checks whether each string consists only of numeric characters. |
7 | Series.str.isspace() Checks whether each string consists only of whitespace. |
8 | Series.str.isupper() Checks whether all characters in each string are uppercase. |
9 | Series.str.islower() Checks whether all characters in each string are lowercase. |
10 | Series.str.isalnum() Checks whether all characters in each string are alphanumeric (letters and digits). |
11 | Series.str.istitle() Checks whether each string in the Series or Index is in title case, where each word starts with a capital letter. |
12 | Series.str.isdecimal() Checks whether all characters in each string are decimal characters. |
13 | Series.str.len() Computes the length of each string in the Series or Index. |
14 | Series.str.findall() Finds all occurrences of a pattern or regular expression in each string. |
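As a rough sketch with invented tokens, the character checks return boolean Series that can be used directly as masks, while `get()`, `len()`, `findall()`, and `get_dummies()` return values derived from each string.

```python
import pandas as pd

# Hypothetical sample data
tokens = pd.Series(["abc", "123", "a1", " ", "Hello World"])

is_alpha = tokens.str.isalpha()
is_alnum = tokens.str.isalnum()
is_space = tokens.str.isspace()
is_title = tokens.str.istitle()

# A boolean result can be used as a mask to keep matching rows
digits_only = tokens[tokens.str.isdigit()]

lengths = tokens.str.len()          # number of characters per string
second_char = tokens.str.get(1)     # missing where the string is too short
digits_found = tokens.str.findall(r"\d")  # list of matches per string

# One indicator column per distinct label after splitting on "|"
dummies = pd.Series(["a|b", "b", "a|c"]).str.get_dummies(sep="|")

print(digits_only)
print(lengths)
print(dummies)
```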
Miscellaneous Methods
This category includes methods that perform a variety of other operations on strings, such as encoding, decoding, and checking for the presence of certain characters.
Sr.No. | Methods & Description |
---|---|
1 | Series.str.encode() Encodes each string using the specified encoding. |
2 | Series.str.decode() Decodes each string using the specified encoding. |
3 | Expands tab characters ('\t') into spaces. |
4 | Series.str.repeat() Repeats each string in the Series or Index by the specified number of times. |
5 | Series.str.slice_replace() Replaces a slice in each string with a passed replacement. |
6 | Series.str.translate() Maps each character in the string through a translation table. |
7 | Series.str.slice() Slices each string in the Series or Index by a passed argument. |
8 | Series.str.startswith() Checks whether each string in the Series or Index starts with a specified pattern. |
9 | Series.str.endswith() Checks whether each string in the Series or Index ends with a specified pattern. |
10 | Series.str.normalize() Normalizes the Unicode representation of each string in the Series or Index to the specified normalization form. |
11 | Series.str.wrap() Wraps each string in the Series or Index to the specified line width, breaking lines as needed. |
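A brief sketch of some of the miscellaneous methods on made-up strings. The translation table built with Python's `str.maketrans` and the wrap width of 10 are arbitrary choices for illustration.

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series(["pandas", "numpy"])

repeated = s.str.repeat(2)                  # each string twice
first_three = s.str.slice(0, 3)             # characters 0..2
masked = s.str.slice_replace(1, 3, "**")    # replace characters 1..2

starts_p = s.str.startswith("p")
ends_y = s.str.endswith("y")

# Round trip through bytes
encoded = s.str.encode("utf-8")
decoded = encoded.str.decode("utf-8")

# Map every "a" to "4" using a standard translation table
table = str.maketrans({"a": "4"})
translated = s.str.translate(table)

# Break a long string into lines of at most 10 characters
wrapped = pd.Series(["a long line of text"]).str.wrap(10)

print(masked)
print(translated)
print(wrapped[0])
```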