Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
How to get values of all rows in a particular column in openpyxl in Python?
In this article, we will show you how to get all the row values of a particular column in an excel file using the openpyxl library in Python. This is useful when you need to extract specific column data from Excel files for analysis or processing.
Assume we have an Excel file named sampleTutorialsPoint.xlsx containing cricket player data. We will extract all row values from a specific column.
Sample Data: sampleTutorialsPoint.xlsx
| Player Name | Age | Type | Country | Team | Runs | Wickets |
|---|---|---|---|---|---|---|
| Virat Kohli | 33 | Batsman | India | Royal Challengers Bangalore | 6300 | 20 |
| Bhuvaneshwar Kumar | 34 | Batsman | India | Sun Risers Hyderabad | 333 | 140 |
| Mahendra Singh Dhoni | 39 | Batsman | India | Chennai Super Kings | 4500 | 0 |
| Rashid Khan | 28 | Bowler | Afghanistan | Gujarat Titans | 500 | 130 |
| Hardik Pandya | 29 | All rounder | India | Gujarat Titans | 2400 | 85 |
Installation
First, install the openpyxl library using pip ?
pip install openpyxl
Method 1: Using Column Letter
Extract all values from a column using its letter reference (A, B, C, etc.) ?
import openpyxl
# Load the Excel workbook
workbook = openpyxl.load_workbook("sampleTutorialsPoint.xlsx")
# Access the first worksheet
worksheet = workbook["Sheet1"]
# Get all values from column E (Team column)
teams = []
for cell in worksheet['E']:
teams.append(cell.value)
# Print all team values
for team in teams:
print(team)
Team Royal Challengers Bangalore Sun Risers Hyderabad Chennai Super Kings Gujarat Titans Gujarat Titans
Method 2: Using Column Index
Extract values using numeric column index (1-based indexing) ?
import openpyxl
# Load the Excel workbook
workbook = openpyxl.load_workbook("sampleTutorialsPoint.xlsx")
worksheet = workbook["Sheet1"]
# Get all values from column 2 (Age column)
ages = []
for row in worksheet.iter_rows(min_col=2, max_col=2):
for cell in row:
ages.append(cell.value)
# Print first 5 age values
for i, age in enumerate(ages[:5]):
print(f"Row {i+1}: {age}")
Row 1: Age Row 2: 33 Row 3: 34 Row 4: 39 Row 5: 28
Method 3: Excluding Header Row
Extract only data values, skipping the header row ?
import openpyxl
# Load the Excel workbook
workbook = openpyxl.load_workbook("sampleTutorialsPoint.xlsx")
worksheet = workbook["Sheet1"]
# Get player names (column A) excluding header
player_names = []
for row in range(2, worksheet.max_row + 1): # Start from row 2
cell_value = worksheet.cell(row=row, column=1).value
if cell_value: # Skip empty cells
player_names.append(cell_value)
# Print first 3 player names
for i, name in enumerate(player_names[:3]):
print(f"Player {i+1}: {name}")
Player 1: Virat Kohli Player 2: Bhuvaneshwar Kumar Player 3: Mahendra Singh Dhoni
Comparison
| Method | Syntax | Best For |
|---|---|---|
| Column Letter | worksheet['A'] |
Fixed column references |
| Column Index | iter_rows(min_col=1, max_col=1) |
Dynamic column selection |
| Cell Reference | worksheet.cell(row=1, column=1) |
Precise control over ranges |
Conclusion
Use worksheet['column_letter'] for simple column extraction. For more control, use iter_rows() or cell() methods to skip headers or handle specific ranges. The openpyxl library provides flexible ways to access Excel column data efficiently.
