How to use Boto3 to reset the bookmark of an AWS Glue job
In this article, we will see how to reset the bookmark of an AWS Glue job using the boto3 Python library. Job bookmarks help AWS Glue track the data that has already been processed during previous job runs.
What are Job Bookmarks?
AWS Glue job bookmarks prevent reprocessing of old data by keeping track of data that has already been processed. Resetting a bookmark allows the job to reprocess all data from the beginning.
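The idea can be illustrated with a toy model. This is only a sketch of the concept, not how AWS Glue stores or evaluates bookmarks; the ToyBookmark class and the file names are hypothetical.

```python
class ToyBookmark:
    """Toy stand-in for a job bookmark: remembers which items were processed."""

    def __init__(self):
        self.processed = set()

    def run(self, items):
        # Only items not seen in earlier runs are handed to the job
        new_items = [item for item in items if item not in self.processed]
        self.processed.update(new_items)
        return new_items

    def reset(self):
        # Forget everything, so the next run starts from the beginning
        self.processed.clear()


bookmark = ToyBookmark()
first_run = bookmark.run(["a.csv", "b.csv"])
second_run = bookmark.run(["a.csv", "b.csv", "c.csv"])
bookmark.reset()
run_after_reset = bookmark.run(["a.csv", "b.csv", "c.csv"])
```

In this model the second run only sees c.csv, while the run after the reset sees all three files again.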
Approach to Reset Job Bookmark
Step 1: Import boto3 and botocore exceptions to handle errors.
Step 2: Create an AWS session with proper region configuration.
Step 3: Create an AWS Glue client using the session.
Step 4: Use the reset_job_bookmark() method with the job name.
Step 5: Handle exceptions appropriately.
Example
The following code demonstrates how to reset the bookmark of an AWS Glue job:
import boto3
from botocore.exceptions import ClientError


def reset_bookmark_of_a_job(job_name):
    # Create a session from the default credentials and region configuration
    session = boto3.session.Session()
    glue_client = session.client('glue')
    try:
        # Reset the bookmark so the next run starts from the beginning
        response = glue_client.reset_job_bookmark(JobName=job_name)
        return response
    except ClientError as e:
        raise Exception("boto3 client error in reset_bookmark_of_a_job: " + str(e)) from e
    except Exception as e:
        raise Exception("Unexpected error in reset_bookmark_of_a_job: " + str(e)) from e


# Example usage (requires actual AWS Glue job)
# print(reset_bookmark_of_a_job("test-job"))
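A possible variant makes the region explicit and reports a missing job separately from other failures. This is a sketch under assumptions: reset_bookmark, JobNotFoundError, and the region_name and glue_client parameters are hypothetical names introduced here. It relies on the exception classes that boto3 clients expose through client.exceptions, and it imports boto3 lazily so that the function can be tried with a stub client.

```python
class JobNotFoundError(LookupError):
    """Hypothetical error raised when Glue reports that the job does not exist."""


def reset_bookmark(job_name, glue_client=None, region_name=None):
    """Reset the bookmark of job_name and return its JobBookmarkEntry dict."""
    if glue_client is None:
        # Imported lazily so the function can also be exercised with a stub client
        import boto3
        session = boto3.session.Session(region_name=region_name)
        glue_client = session.client("glue")
    try:
        response = glue_client.reset_job_bookmark(JobName=job_name)
    except glue_client.exceptions.EntityNotFoundException as exc:
        raise JobNotFoundError(f"No Glue job or bookmark found for {job_name!r}") from exc
    return response["JobBookmarkEntry"]


# Example usage (requires an actual AWS Glue job; the region is only an example)
# entry = reset_bookmark("test-job", region_name="us-east-1")
```

Passing the client as an argument keeps the function easy to test and lets callers reuse one client across several jobs.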
Expected Output
When the function executes successfully, it returns a dictionary containing job bookmark details:
{
    'JobBookmarkEntry': {
        'JobName': 'test-job',
        'Version': 3,
        'Run': 3,
        'Attempt': 0,
        'JobBookmark': ''
    },
    'ResponseMetadata': {
        'RequestId': '03d40d90-******************f',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'date': 'Sat, 27 Mar 2021 10:14:58 GMT',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '104'
        }
    }
}
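A response of this shape can be reduced to the few fields a caller usually checks. The helper below is a minimal sketch; summarize_reset is a hypothetical name, and treating an empty JobBookmark string as "cleared" is an assumption based only on the sample output above.

```python
def summarize_reset(response):
    """Pick the main fields out of a reset_job_bookmark response dict."""
    entry = response["JobBookmarkEntry"]
    status = response["ResponseMetadata"]["HTTPStatusCode"]
    return {
        "job": entry["JobName"],
        "run": entry["Run"],
        # Assumption: an empty bookmark string means no state is stored
        "bookmark_cleared": entry.get("JobBookmark", "") == "",
        "ok": status == 200,
    }


sample_response = {
    "JobBookmarkEntry": {
        "JobName": "test-job",
        "Version": 3,
        "Run": 3,
        "Attempt": 0,
        "JobBookmark": "",
    },
    "ResponseMetadata": {"HTTPStatusCode": 200},
}
summary = summarize_reset(sample_response)
```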
Key Points
Ensure your AWS credentials are properly configured
The job must exist in AWS Glue in the account and region your session uses
Resetting bookmarks will cause the next job run to reprocess all data
Use this feature carefully to avoid duplicate data processing
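Because a reset discards the stored state, one cautious pattern is to record the current bookmark before resetting it. The sketch below assumes the Glue client's get_job_bookmark() call, which returns the same JobBookmarkEntry structure; reset_with_snapshot is a hypothetical helper name, and the function expects an already created Glue client.

```python
def reset_with_snapshot(glue_client, job_name):
    """Return (before, after) JobBookmarkEntry dicts around a bookmark reset."""
    # Read the current state first so it can be logged or compared later
    before = glue_client.get_job_bookmark(JobName=job_name)["JobBookmarkEntry"]
    after = glue_client.reset_job_bookmark(JobName=job_name)["JobBookmarkEntry"]
    return before, after


# Example usage (requires an actual AWS Glue job)
# import boto3
# before, after = reset_with_snapshot(boto3.client("glue"), "test-job")
```

Keeping the "before" entry in a log gives a record of when a full reprocess was triggered.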
Conclusion
The reset_job_bookmark() method in boto3 provides a straightforward way to reset AWS Glue job bookmarks. This is useful when you need to reprocess historical data or troubleshoot job processing issues.
