How to use Boto3 to get the details of a job that is bookmarked in AWS Glue Data Catalog?

AWS Glue uses job bookmarks to record state from previous job runs, so that data which has already been processed is not processed again. With boto3, you can retrieve the bookmark entry for a job by calling the Glue client's get_job_bookmark() method.

Prerequisites

Before retrieving job bookmark details, ensure:

  • The job exists and has been bookmarked in AWS Glue
  • You have proper AWS credentials configured
  • The job name is correct (case-sensitive)
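The first prerequisite can be checked before asking for a bookmark. The following is a minimal sketch, assuming a Glue client is passed in; it relies on the client's get_job call and its modeled EntityNotFoundException, and the helper name job_exists is hypothetical:

```python
def job_exists(glue_client, job_name):
    """Return True if Glue has a job with this exact (case-sensitive) name."""
    try:
        glue_client.get_job(JobName=job_name)
        return True
    except glue_client.exceptions.EntityNotFoundException:
        # Raised when no job with that name exists in the account and region.
        return False
```

For example, job_exists(boto3.client('glue'), 'book-job'). Note that an existing job does not guarantee an existing bookmark, so get_job_bookmark() can still fail for a job that has never run with bookmarks enabled.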

Approach

Step 1: Import boto3 and botocore exceptions to handle errors.

Step 2: Define the bookmarked job name parameter (must be an existing bookmarked job).

Step 3: Create an AWS session. The region is taken from your default AWS configuration unless you pass region_name.

Step 4: Create a Glue client using the session.

Step 5: Call get_job_bookmark() with the JobName parameter.

Step 6: Handle exceptions for non-existent jobs and other errors.
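Steps 3 and 4 can be made explicit about the region. The following is only a sketch: the helper names session_kwargs and make_glue_client are hypothetical, and it assumes you want to fall back to boto3's default configuration when no region or profile is given:

```python
def session_kwargs(region_name=None, profile_name=None):
    """Build keyword arguments for boto3.session.Session, skipping unset values."""
    candidates = {"region_name": region_name, "profile_name": profile_name}
    return {key: value for key, value in candidates.items() if value is not None}


def make_glue_client(region_name=None, profile_name=None):
    """Create a Glue client; boto3 uses its default config when both are None."""
    import boto3  # imported here so session_kwargs can be used without boto3

    session = boto3.session.Session(**session_kwargs(region_name, profile_name))
    return session.client("glue")
```

For example, make_glue_client(region_name="us-east-1") would create a client for that region. Glue jobs are regional, so a job that exists in one region is reported as not found in another.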

Example

Here's how to retrieve details of a bookmarked job in AWS Glue Data Catalog:

import boto3
from botocore.exceptions import ClientError


def get_bookmarked_job_details(bookmarked_job_name):
    # Region and credentials come from the default boto3 configuration
    session = boto3.session.Session()
    glue_client = session.client('glue')

    try:
        # The response is a dict with 'JobBookmarkEntry' and 'ResponseMetadata'
        return glue_client.get_job_bookmark(JobName=bookmarked_job_name)
    except ClientError as e:
        if e.response['Error']['Code'] == 'EntityNotFoundException':
            print(f"Job '{bookmarked_job_name}' not found or not bookmarked")
            return None
        # Re-raise other AWS errors unchanged so the caller keeps the error code
        raise


# Retrieve bookmark details for 'book-job'
result = get_bookmarked_job_details("book-job")
if result:
    print(result)

Output

{
    'JobBookmarkEntry': {
        'JobName': 'book-job',
        'Version': 8,
        'Run': 2,
        'Attempt': 2,
        'PreviousRunId': 'jr_dee547c2f78422e34136aa12c85de010b823787833eee04fbf34bc9b8cb4f7b9',
        'RunId': 'jr_a035fe15daa31e9a751f02876c26e5d11a829f2689803a9e9643bd61f70273e4',
        'JobBookmark': '{"gdf":{"jsonClass":"HadoopDataSourceJobBookmarkState","timestamps":{"RUN":"1","HIGH_BAND":"900000","CURR_LATEST_PARTITION":"0"}}}'
    },
    'ResponseMetadata': {
        'RequestId': 'bacf1497-***************996f05b3c1',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {...},
        'RetryAttempts': 0
    }
}

Key Response Fields

The response contains important bookmark information:

  • JobName: Name of the bookmarked job
  • Version: Job version number
  • Run: Current run number
  • Attempt: Attempt number
  • PreviousRunId: Unique identifier of the previous job run
  • RunId: Unique identifier for the current run
  • JobBookmark: JSON string containing processing state details
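Because JobBookmark arrives as a JSON string rather than a nested dictionary, it has to be decoded before its contents can be inspected. The following is a minimal sketch using the sample response shown above; the helper name parse_job_bookmark is hypothetical, and the keys inside the bookmark (such as "gdf") come from that sample and will differ from job to job:

```python
import json


def parse_job_bookmark(response):
    """Decode the JSON string in JobBookmarkEntry['JobBookmark'] into a dict."""
    entry = response.get("JobBookmarkEntry", {})
    raw = entry.get("JobBookmark")
    # A missing or empty bookmark string is treated as "no state recorded".
    return json.loads(raw) if raw else {}


# Trimmed copy of the sample response from the Output section
sample = {
    "JobBookmarkEntry": {
        "JobName": "book-job",
        "JobBookmark": '{"gdf":{"jsonClass":"HadoopDataSourceJobBookmarkState",'
                       '"timestamps":{"RUN":"1","HIGH_BAND":"900000",'
                       '"CURR_LATEST_PARTITION":"0"}}}',
    }
}

state = parse_job_bookmark(sample)
print(state["gdf"]["timestamps"]["HIGH_BAND"])  # prints 900000
```

The values inside the decoded state are themselves strings in this sample, so convert them with int() if you need to compare them numerically.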

Error Handling

Common exceptions include:

  • EntityNotFoundException: Job doesn't exist or isn't bookmarked
  • AccessDeniedException: Insufficient permissions
  • InvalidInputException: Invalid job name format
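One way to act on these codes is to translate the error block of a ClientError into a short hint. The following is a sketch, not a complete error policy: the helper name describe_glue_error is hypothetical, and it assumes it is given the dictionary found in ClientError.response:

```python
def describe_glue_error(error_response):
    """Map the 'Error' block of a ClientError.response to (code, hint)."""
    code = error_response.get("Error", {}).get("Code", "Unknown")
    hints = {
        "EntityNotFoundException": "job does not exist or has no bookmark",
        "AccessDeniedException": "insufficient permissions",
        "InvalidInputException": "invalid job name format",
    }
    # Codes not listed above fall through to a generic hint.
    return code, hints.get(code, "unexpected error; inspect the full response")
```

Inside an except ClientError as e: block, this could be called as code, hint = describe_glue_error(e.response), after which the caller decides whether to log the hint or re-raise.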

Conclusion

Use get_job_bookmark() to retrieve AWS Glue job bookmark details. Always handle EntityNotFoundException for non-existent or unbookmarked jobs. The response reports the job's version, run and attempt numbers, the current and previous run IDs, and the serialized bookmark state.

Updated on: 2026-03-25T18:18:52+05:30
