
Data Engineering with AWS

In the hands-on portion of this chapter, we're going to configure an S3 bucket to automatically trigger a Lambda function whenever a new file is written to the bucket. In the Lambda function, we're going to use an open-source Python library called AWS Data Wrangler, created by AWS Professional Services to simplify common ETL tasks when working in an AWS environment. We'll use the AWS Data Wrangler library to convert a CSV file into Parquet format, and then update the AWS Glue Data Catalog.
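As a rough preview of where this exercise is heading, the following sketch shows the general shape such a Lambda function might take. The bucket name, database name, and helper function names are hypothetical placeholders rather than values from this chapter. The `wr.s3.read_csv` and `wr.s3.to_parquet` calls are part of the AWS Data Wrangler library, but the arguments shown here are assumptions that you would adapt to your own environment.

```python
import urllib.parse

# Hypothetical placeholders; replace with names from your own account.
OUTPUT_BUCKET = "example-clean-zone-bucket"
GLUE_DATABASE = "example_db"


def parse_s3_event(event):
    """Return (bucket, key) pairs for each record in an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (for example, spaces become '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def table_name_for(key):
    """Derive a table name from the file name: 'in/My File.csv' gives 'my_file'."""
    file_name = key.rsplit("/", 1)[-1]
    stem = file_name.rsplit(".", 1)[0]
    return stem.lower().replace(" ", "_").replace("-", "_")


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without the layer attached.
    import awswrangler as wr

    for bucket, key in parse_s3_event(event):
        table = table_name_for(key)
        df = wr.s3.read_csv(f"s3://{bucket}/{key}")
        # dataset=True together with database/table asks the library to register
        # the table in the Glue Data Catalog. The database is assumed to exist.
        wr.s3.to_parquet(
            df=df,
            path=f"s3://{OUTPUT_BUCKET}/{GLUE_DATABASE}/{table}",
            dataset=True,
            database=GLUE_DATABASE,
            table=table,
            mode="append",
        )
    return {"processed": len(event.get("Records", []))}
```

Keeping the event parsing in small helper functions means that part of the logic can be checked locally, without AWS credentials or the library installed.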
Lambda layers allow your Lambda function to bring in additional code, packaged as a .zip file. In our use case, the Lambda layer is going to contain the AWS Data Wrangler Python library, which we can then attach to any Lambda function where we want to use the library.
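To make the packaging idea concrete, here is a small sketch of how a Python layer archive is laid out. For Python runtimes, Lambda picks up layer code placed under a top-level `python/` directory inside the .zip file. The function name `build_layer_zip` and the paths are hypothetical. This illustrates the archive layout only and is not a substitute for the steps in this section.

```python
import zipfile
from pathlib import Path


def build_layer_zip(package_dir, zip_path):
    """Zip every file in package_dir under a top-level 'python/' prefix."""
    package_dir = Path(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted(package_dir.rglob("*")):
            if file_path.is_file():
                relative = file_path.relative_to(package_dir).as_posix()
                # The 'python/' prefix is where the Python runtime looks for layer code.
                archive.write(file_path, f"python/{relative}")
    return zip_path
```

A call such as `build_layer_zip("build/site-packages", "layer.zip")` (both paths are assumptions) would produce an archive whose entries all begin with `python/`.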
To create a Lambda layer, do the following...