How to use Boto3 to paginate through all databases present in AWS Glue
In this article, we will see how to paginate through all databases present in AWS Glue using the boto3 library in Python.
Problem Statement
Use the boto3 library in Python to paginate through all databases in the AWS Glue Data Catalog that were created in your account.
Pagination Parameters
The pagination function uses three important parameters:
max_items − denotes the total number of records to return. If the number of available records is greater than max_items, then a NextToken will be provided in the response to resume pagination.
page_size − denotes the size of each page.
starting_token − helps to paginate using NextToken from a previous response.
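As a minimal sketch of how these three parameters fit together, the helpers below fetch databases in batches and resume from the returned NextToken. The function names fetch_database_batch and fetch_all_database_names are hypothetical and not part of boto3. The Glue client is passed in as an argument, so the helpers assume a client that has already been created with boto3.

```python
def fetch_database_batch(glue_client, max_items, page_size=None, starting_token=None):
    """Return (database_names, next_token) for one batch of at most max_items."""
    # Only include the keys that were actually set.
    config = {"MaxItems": max_items}
    if page_size is not None:
        config["PageSize"] = page_size
    if starting_token is not None:
        config["StartingToken"] = starting_token

    paginator = glue_client.get_paginator("get_databases")
    # build_full_result() merges the fetched pages into one dict and adds a
    # "NextToken" key when more records remain beyond max_items.
    result = paginator.paginate(PaginationConfig=config).build_full_result()
    names = [db["Name"] for db in result.get("DatabaseList", [])]
    return names, result.get("NextToken")


def fetch_all_database_names(glue_client, batch_size=2):
    """Collect every database name by resuming until no NextToken is returned."""
    names, token = [], None
    while True:
        batch, token = fetch_database_batch(
            glue_client, batch_size, starting_token=token
        )
        names.extend(batch)
        if token is None:
            return names


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   print(fetch_all_database_names(boto3.client("glue"), batch_size=2))
```

In this sketch, max_items caps each batch, and the token from one batch is passed as starting_token to the next, so the loop stops once the response no longer carries a NextToken.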
Algorithm
Follow these steps to paginate through AWS Glue databases:
Step 1: Import boto3 and botocore exceptions to handle exceptions.
Step 2: Create an AWS session using boto3 library. Make sure region_name is mentioned in the default profile.
Step 3: Create an AWS client for Glue service.
Step 4: Create a paginator object using get_databases operation.
Step 5: Call the paginate function with pagination configuration.
Step 6: Handle exceptions appropriately.
Example
Use the following code to paginate through all databases created in your AWS account −
import boto3
from botocore.exceptions import ClientError


def paginate_through_databases(max_items=None, page_size=None, starting_token=None):
    # region_name is read from the default profile
    session = boto3.session.Session()
    glue_client = session.client('glue')
    try:
        paginator = glue_client.get_paginator('get_databases')
        # paginate() is lazy: requests are sent only while the pages are iterated
        response = paginator.paginate(
            PaginationConfig={
                'MaxItems': max_items,
                'PageSize': page_size,
                'StartingToken': starting_token
            }
        )
        return response
    except ClientError as e:
        raise Exception("boto3 client error in paginate_through_databases: " + str(e)) from e
    except Exception as e:
        raise Exception("Unexpected error in paginate_through_databases: " + str(e)) from e


# Example usage
paginator_response = paginate_through_databases(max_items=2, page_size=5)

# Iterate through paginated results
try:
    for page in paginator_response:
        print("Page content:")
        for database in page['DatabaseList']:
            print(f"Database Name: {database['Name']}")
            print(f"Create Time: {database['CreateTime']}")

        # Check if there's a next token
        if 'NextToken' in page:
            print(f"Next Token: {page['NextToken'][:50]}...")
except ClientError as e:
    # API errors surface here, because the requests happen during iteration
    raise Exception("boto3 client error while iterating pages: " + str(e)) from e
Output
Page content:
Database Name: aurora_glue_catalog
Create Time: 2020-11-18 14:24:46+00:00
Database Name: custdb
Create Time: 2020-08-31 20:30:09+00:00
Next Token: eyJsYXN0RXZhbHVhdGVkS2V5Ijp7IkhBU0hfS0VZIjp7InMi...
Key Points
The paginator returns an iterator that yields pages of results.
Each page contains a DatabaseList with database information.
The NextToken is used to continue pagination from where the previous request left off.
Proper exception handling ensures robust error management.
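To illustrate the point that the paginator yields pages, here is a small sketch of a generator that flattens those pages into a stream of database names. The name iter_database_names is hypothetical, and the Glue client is assumed to be created elsewhere and passed in. Because the pages are fetched lazily, any ClientError raised by the service would appear while the caller loops over the generator, so that loop is where exception handling belongs.

```python
def iter_database_names(glue_client, page_size=None):
    """Yield database names one at a time, fetching pages only as needed."""
    # Leave the config empty when no page size is requested.
    config = {} if page_size is None else {"PageSize": page_size}
    paginator = glue_client.get_paginator("get_databases")
    for page in paginator.paginate(PaginationConfig=config):
        # Each page is a dict holding a DatabaseList of database descriptions.
        for database in page.get("DatabaseList", []):
            yield database["Name"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   for name in iter_database_names(boto3.client("glue"), page_size=5):
#       print(name)
```

Only one page is held in memory at a time, which is the main reason to iterate pages rather than collect the full result first.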
Conclusion
Using boto3's paginator for AWS Glue databases provides an efficient way to list a Data Catalog that contains many databases. The pagination parameters let you cap the total number of results and set how many are fetched per request, which helps keep memory usage under control.
