How to Stop a Crawler in the AWS Glue Data Catalog Using Boto3

This article shows how to stop a running crawler in the AWS Glue Data Catalog.

Example

Problem statement: Use the boto3 library in Python to stop a crawler.

Approach/algorithm to solve this problem

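Step 1: Import boto3 and the botocore exceptions needed for error handling.
Step 2: crawler_name is the parameter of the function.
Step 3: Create an AWS session using the boto3 library. Make sure region_name is set in the default profile. If it is not, pass region_name explicitly when creating the session.
Step 4: Create an AWS client for glue.
Step 5: Call the stop_crawler function, passing crawler_name as the Name parameter.
Step 6: The call stops the running crawler and returns the response metadata. If the crawler is not running, it raises CrawlerNotRunningException.
Step 7: Handle the generic exception in case anything else goes wrong while stopping the crawler.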
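
Steps 3 and 6 can be sketched in a little more detail. The snippet below is a minimal illustration, not part of the original example: the helper names make_glue_client and stop_crawler_if_running are hypothetical, and any region you pass in (for example "us-east-1") is a placeholder. It assumes the modeled exception is reachable through the client's exceptions attribute, which is how boto3 clients normally expose service errors.

```python
def make_glue_client(region_name):
    """Create a Glue client with an explicit region (step 3).

    Hypothetical helper: boto3 is imported locally so that the function
    below can also be exercised with a stub client, without AWS access.
    """
    import boto3

    session = boto3.session.Session(region_name=region_name)
    return session.client("glue")


def stop_crawler_if_running(glue_client, crawler_name):
    """Stop a crawler; return False instead of raising if it was idle (step 6)."""
    try:
        glue_client.stop_crawler(Name=crawler_name)
        return True
    except glue_client.exceptions.CrawlerNotRunningException:
        # The crawler exists but is not running, so there is nothing to stop.
        return False
```

A caller could then write stop_crawler_if_running(make_glue_client("us-east-1"), "Data Dimension") and treat a False result as "already stopped" rather than as a failure. Other errors, such as a crawler name that does not exist, still propagate to the caller.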

Example code

The following code stops the crawler:

import boto3
from botocore.exceptions import ClientError


def stop_a_crawler(crawler_name):
    # Create a session from the default profile and its configured region.
    session = boto3.session.Session()
    glue_client = session.client('glue')
    try:
        # Ask Glue to stop the crawler; the response contains only metadata.
        response = glue_client.stop_crawler(Name=crawler_name)
        return response
    except ClientError as e:
        # Errors reported by AWS, e.g. CrawlerNotRunningException.
        raise Exception("boto3 client error in stop_a_crawler: " + str(e)) from e
    except Exception as e:
        raise Exception("Unexpected error in stop_a_crawler: " + str(e)) from e


print(stop_a_crawler("Data Dimension"))

Output

{'ResponseMetadata': {'RequestId': '73e50130-*****************8e', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Sun, 28 Mar 2021 07:26:55 GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length': '2', 'connection': 'keep-alive', 'x-amzn-requestid': '73e50130-***************8e'}, 'RetryAttempts': 0}}
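
Because the response carries only ResponseMetadata, a caller that wants a simple yes/no result could inspect HTTPStatusCode. The helper below is a hypothetical sketch (the name stop_request_accepted is not part of boto3), and the sample dictionary is a trimmed copy of the output above. Since boto3 normally raises ClientError for error responses, a check like this is mostly a sanity guard.

```python
def stop_request_accepted(response):
    """Return True if the response metadata reports HTTP 200."""
    metadata = response.get("ResponseMetadata", {})
    return metadata.get("HTTPStatusCode") == 200


# Trimmed copy of the sample output, used here only for illustration.
sample = {"ResponseMetadata": {"HTTPStatusCode": 200, "RetryAttempts": 0}}
print(stop_request_accepted(sample))  # prints True
```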