What is the main purpose of Bird’s real-time event webhooks?

To push event data directly to a sender’s endpoint in real time, enabling immediate automation without polling or rate-limited API calls.

Why are webhooks better than pulling data via API for large senders?

Because API pulls require time-window management, risk data gaps or duplicates, and may hit rate limits—webhooks eliminate all of that by pushing data continuously.

What AWS services are used in the recommended webhook ingestion pipeline?

Amazon S3 (storage), AWS Lambda (processing), an Application Load Balancer (HTTP listener), and optionally Route 53 (DNS).

How does Bird batch event data?

Bird sends multiple events together in a payload, each assigned a unique batch ID (x-messagesystems-batch-id) for tracking, retries, and deduplication.

What triggers the consumer Lambda function?

The ALB forwards incoming webhook POST requests directly to the Lambda, which extracts the payload and writes it to S3.

Why store the raw webhook batch in S3 before processing it?

To ensure fast ingestion (<10 seconds) so the connection doesn’t timeout, and to offload heavier processing to a separate async pipeline.

What does the second Lambda function do?

It is triggered by new S3 objects, validates the JSON, converts it to CSV, and writes the processed output to a separate S3 bucket.

Why use a separate S3 bucket for processed CSV files?

To avoid recursive triggers (writing a new processed file back into the same bucket would re-trigger the Lambda endlessly).

What permissions do the Lambda functions require?

The consumer Lambda needs S3 PutObject permissions; the processing Lambda needs GetObject for the source bucket and PutObject for the destination bucket.

Why is an AWS ALB recommended over API Gateway?

ALBs are cheaper, simpler, and ideal for straightforward HTTPS POST forwarding; API Gateway may alter payload format and increase complexity.

How do senders configure the webhook in Bird?

By providing the HTTPS endpoint (the DNS record for the ALB), selecting a subaccount, choosing events, and saving the webhook configuration.

What downstream options exist for storing or analyzing the processed data?

Load into databases (PostgreSQL, DynamoDB, RDS), feed into ETL systems, or query directly with tools like Athena.

Création et consommation de Webhooks pour les oiseaux

Q: What does the second Lambda function do?

It is triggered by new S3 objects, validates the JSON, converts it to CSV, and writes the processed output to a separate S3 bucket.

Q: Why use a separate S3 bucket for processed CSV files?

To avoid recursive triggers (writing a new processed file back into the same bucket would re-trigger the Lambda endlessly).

Q: What permissions do the Lambda functions require?

The consumer Lambda needs S3 PutObject permissions; the processing Lambda needs GetObject for the source bucket and PutObject for the destination bucket.

Q: Why is an AWS ALB recommended over API Gateway?

ALBs are cheaper, simpler, and ideal for straightforward HTTPS POST forwarding; API Gateway may alter payload format and increase complexity.

Oiseau

27 janv. 2022

1 min read

Création et consommation de Webhooks pour les oiseaux

Points Clés

- Bird’s real-time event webhooks let senders receive event data instantly—no polling, no cron jobs, and no rate-limit headaches.
- Webhooks eliminate the complexity of managing time windows, preventing missed events and handling duplicate records.
- Ideal for downstream automation: updating lists, starting journeys, enriching dashboards, or syncing internal systems.
- The tutorial walks senders through building a full AWS ingestion pipeline: S3 storage, Lambda processing, and an application load balancer.
- S3 serves as the central storage layer for webhook payloads, with each batch written as a flat JSON file.
- Lambda functions handle both ingestion (storing raw batches) and transformation (converting JSON to CSV).
- Bird batches events—each webhook batch includes a unique x-messagesystems-batch-id, allowing replay checks and deduplication.
- The consumer Lambda must remain efficient since Bird retries batches if the endpoint does not respond within ~10 seconds.
- AWS ALB is recommended (vs. API Gateway) for simplicity, cost-effectiveness, and direct handling of HTTP POST requests.
- DNS (Route 53 or external provider) is configured to map a friendly hostname to the ALB endpoint.
- After setup, Bird pushes event data directly and reliably to your AWS pipeline for asynchronous processing.
- The guide also covers best practices: least-privilege IAM permissions, temp storage constraints, avoiding recursive triggers, and organizing multi-lambda workflows.

Points forts des Q&A

What is the main purpose of Bird’s real-time event webhooks?
To push event data directly to a sender’s endpoint in real time, enabling immediate automation without polling or rate-limited API calls.
Why are webhooks better than pulling data via API for large senders?
Because API pulls require time-window management, risk data gaps or duplicates, and may hit rate limits—webhooks eliminate all of that by pushing data continuously.
What AWS services are used in the recommended webhook ingestion pipeline?
Amazon S3 (storage), AWS Lambda (processing), an Application Load Balancer (HTTP listener), and optionally Route 53 (DNS).
How does Bird batch event data?
Bird sends multiple events together in a payload, each assigned a unique batch ID (x-messagesystems-batch-id) for tracking, retries, and deduplication.
What triggers the consumer Lambda function?
The ALB forwards incoming webhook POST requests directly to the Lambda, which extracts the payload and writes it to S3.
Why store the raw webhook batch in S3 before processing it?
To ensure fast ingestion (<10 seconds) so the connection doesn’t timeout, and to offload heavier processing to a separate async pipeline.
What does the second Lambda function do?
It is triggered by new S3 objects, validates the JSON, converts it to CSV, and writes the processed output to a separate S3 bucket.
Why use a separate S3 bucket for processed CSV files?
To avoid recursive triggers (writing a new processed file back into the same bucket would re-trigger the Lambda endlessly).
What permissions do the Lambda functions require?
The consumer Lambda needs S3 PutObject permissions; the processing Lambda needs GetObject for the source bucket and PutObject for the destination bucket.
Why is an AWS ALB recommended over API Gateway?
ALBs are cheaper, simpler, and ideal for straightforward HTTPS POST forwarding; API Gateway may alter payload format and increase complexity.
How do senders configure the webhook in Bird?
By providing the HTTPS endpoint (the DNS record for the ALB), selecting a subaccount, choosing events, and saving the webhook configuration.
What downstream options exist for storing or analyzing the processed data?
Load into databases (PostgreSQL, DynamoDB, RDS), feed into ETL systems, or query directly with tools like Athena.

Le guide suivant est une simple aide pour permettre aux expéditeurs de se sentir à l'aise lors de la création d'un webhook d'événements Bird et de l'utilisation des données via l'infrastructure d'AWS.

Les webhooks d'événements en temps réel de Bird sont un outil incroyablement précieux pour les expéditeurs afin de faire pousser automatiquement les données vers leurs systèmes. Cela peut entraîner des automatisations en aval telles que la mise à jour des listes de diffusion, le déclenchement de parcours de courrier électronique automatisés ou la population de tableaux de bord internes. Bien que les mêmes données d'événement puissent être consultées via l'interface utilisateur de Bird en utilisant Event Search, ou programmatiquement en utilisant l'API Events API de Bird, les limites placées sur le nombre de dossiers renvoyés dans une seule requête ou les limites de débit placées sur le point d'accès API peuvent rendre ces deux méthodes restrictives pour les expéditeurs importants et sophistiqués.

Les webhooks d'événements en temps réel permettent à un expéditeur de configurer un point d'accès où Bird transmet les données, et les données peuvent être consommées sans avoir à programmer des tâches cron qui récupèrent les données. Il y a aussi des compromis logistiques lorsque vous récupérez les données au lieu de les faire pousser vers vous, comme l'identification de la période de temps et des paramètres à utiliser pour chaque requête API. Si les périodes de temps ne sont pas parfaitement alignées, vous risquez de manquer des données, et si les périodes de temps se chevauchent, vous devez gérer les enregistrements de données en double. Avec les webhooks en temps réel, les données d'événement sont simplement poussées vers votre point d'accès au fur et à mesure qu'elles deviennent disponibles dans Bird.

Bien que les avantages de recevoir des données d'événement en temps réel pour faciliter les processus d'automatisation en aval puissent être immédiatement compris par de nombreux expéditeurs, le processus réel d'implémentation et de consommation des webhooks peut être intimidant. Cela peut être particulièrement vrai si vous n'êtes pas familiarisé avec les composants techniques de la création d'un point d'accès et de la gestion des données de manière programmatique. Il existe des services disponibles qui consommeront les données du webhook Bird et les intégreront automatiquement dans votre base de données – un exemple serait StitchData, dont nous avons parlé dans notre blog par le passé. Cependant, si vous souhaitez plus de contrôle sur le processus, vous pouvez facilement construire les composants vous-même. Voici un guide simple pour aider les expéditeurs à se sentir à l'aise lors de la création d'un webhook d'événements Bird et la consommation des données en utilisant l'infrastructure au sein d'AWS.

Configuration de l'Endpoint de Cible Webhook

When a Bird event is created, we want that event data to be streamed in real-time to an endpoint in AWS so that we can consume and use that data programmatically. The data will be sent from Bird to a target endpoint, which will forward the payload to a lambda function that will process and store the data in an S3 bucket. A high-level diagram of the described data flow can be seen below:

Flowchart for a system with components, showing a process starting with 'MessageBird', continuing through an 'Application Load Balancer', leading to a 'Lambda Script (Consumer)', connecting to an 'S3' storage, and optionally processed by another 'Lambda Script (Process)'.

To implement this workflow, let’s actually build them in the reverse order beginning with creating an S3 bucket where we will store our event data and then work backwards – adding each component that feeds into what we’ve built.

Create an S3 Bucket to Store the Webhook Data

Before creating our load balancer to accept the data, or our lambda function to store the data, we need to first create our S3 bucket where the data will be stored. While S3 provides excellent storage for webhook data, organizations also using PostgreSQL databases for event processing should implement proper backup and restore procedures to protect their structured data alongside their S3 storage strategy. To do this, navigate to the S3 service within AWS and press “Create Bucket.” You will be prompted to assign a name to your bucket and set the region – make sure to use the same region as your ALB and lambda function. When your S3 bucket is created, it will be empty – if you would like to organize the data inside of a folder, you can either create the intended directory now, or the directory will be created when your lambda function stores the file. In this example, we named our S3 bucket “bird-webhooks” and created a folder named “B Event Data” to store our event data – you will see these names referenced in our lambda function below.

Create a Lambda Function to Consume the Data

The actual processing and storage of the data will be performed by a lambda function that is invoked by our application load balancer (ALB).

The first step is to create your lambda function by navigating to the Lambda service within AWS and clicking “Create Function.” You will be prompted to assign a name to your lambda function and select which programming language to write your function in. For this example, we use Python as the runtime language.

Now we need to develop our lambda function. For a moment, let’s assume that our application load balancer has been configured and is forwarding the webhook payload to our lambda function – the lambda will receive a payload including the full headers and body. The payload is passed into our lambda function using the object “event” as a dictionary. You can reference the headers and the body of the payload independently by accessing the “headers” and “body” objects within the payload. In this example we are simply going to read the “x-messagesystems-batch-id” header, where the batch ID is a unique value created by Bird for the webhook batch, and use it as the filename when storing the body as a flat-file in S3; however, you may want to add additional functionality such as authentication checks or error handling, as needed.

When storing the payload to a flat-file on S3, we will need to define the name of the S3 bucket, location, and filename of the file where the payload data will be stored. In our sample lambda function, we do this within the “store_batch” function. In this example, we are going to store the entire batch as a single file, which helps to ensure that the data is collected and stored before the HTTP connection between Bird and your endpoint times out. While you could adjust the connection timeout settings on your load balancer, there are no guarantees that the connection won’t timeout on the transmission side (in this case Bird) or that the connection won’t be terminated before your lambda function finishes executing. It is a best practice to keep your consumer function as efficient as possible and reserve data processing activities for downstream processes where possible – such as converting the batched JSON-formatted payload into a CSV file, or loading the event data into a database.

It is important to note that you may need to update the permissions for your lambda function. Your execution role will need PutObject and GetObject permissions for S3. It is a best practice to enforce the principle of least privilege, so I recommend setting these permissions for only the S3 bucket where the webhook payloads will be stored.

A sample of our consumer lambda function can be found here.

As a quick note on the batch ID: Bird will batch events into a single payload, where each batch may contain 1 to 350 or more event records. The batch will be given a unique batch ID, which can be used to view the batch status by leveraging the Event Webhooks API or within your Bird account by clicking on a webhook stream and selecting “Batch Status.” In the event that a webhook payload could not be delivered, such as during a connection timeout, Bird will automatically retry the batch using the same batch ID. This can happen when your lambda function is running close to the max round-trip time of 10 seconds and is a reason for optimizing the consumer function to reduce execution time.

To handle all data processing activities, I recommend creating a separate lambda function that executes whenever a new file is created on the S3 bucket – this way, the data processing is performed asynchronously to the transmission of the data, and there is no risk of losing data due to a terminated connection. I discuss the processing lambda function in a later section.

Create an Application Load Balancer

In order to receive a webhook payload, we need to provide an endpoint to send the payloads to. We do this by creating an application load balancer within AWS by navigating to EC2 > Load Balancers and clicking “Create Load Balancer.” You will be prompted to choose which type of load balancer you want to create – for this, we want to create an application load balancer. We need to use an application load balancer (ALB) to build our consumer because the event webhooks will be sent as an HTTP request, and ALBs are used for routing HTTP requests within AWS. We could implement an HTTP Gateway as an alternative; however, we use an ALB for this project because it is more lightweight and cost-effective than HTTP Gateway. It’s important to note that if you choose to use an HTTP Gateway, the event format may be different than with an ALB, and therefore your lambda function will need to handle the request object accordingly.

Once your ALB has been created, you will be prompted to assign a name to your ALB and configure the scheme and access/security settings – since we plan to receive event data from an external source (Bird), we will want our ALB to be internet-facing. Under “Listeners and routing,” the ALB should listen to HTTPS on port 443, and we want to create a Target group that points to our lambda function so that our ALB will forward inbound requests to the consume lambda function we created above. You also need to ensure that the security group has permission to accept traffic through port 443.

Create a DNS Record for the Load Balancer

To make it easier for us to use our ALB as an endpoint, we will create an A record in DNS that points to our ALB. For this, we can use the AWS Route 53 service (or your current DNS provider) and create an A record for the hostname that you want to use for your endpoint (e.g. spevents.<your_domain>). When working with DNS at scale on AWS, be aware that there are undocumented DNS limits that can affect high-volume applications, especially those handling large amounts of outbound traffic like email delivery systems. The A record should be configured to point to the ALB we created. If you are using Route 53 to manage the DNS records, you can reference the ALB instance directly by enabling “Alias” and selecting the ALB; otherwise, if you are using an external DNS provider, you should point the A record to the public IP address of the ALB instance.

I recommend using a tool such as Postman to test that everything has been configured correctly before enabling your Bird webhook. You can make a POST request to your endpoint and confirm that a response is received. If your POST request does not return a response, you may need to double-check that your ALB is listening to the correct port.

Create a Bird Webhook

Now we are ready to create the webhook in Bird and use the hostname defined by the A record above as our target endpoint. To create the webhook, navigate to the Webhooks section within your Bird account and click “Create Webhook.” You will be prompted to assign a name for your webhook and provide a target URL – the target should be the hostname of the A record you created previously. Note that the target URL may require “HTTPS://” be included in the URL.

Once complete, verify the correct subaccount and events are selected, and press “Create Webhook” to save your configuration. The event data for all selected event types will now stream to our target URL and be consumed by our ALB for downstream processing.

Processing Webhook Event Data

Depending on the intended purpose for storing the Bird event data, your requirements may be satisfied by simply storing the JSON payload as a flat-file. You may also have a downstream ETL process already established that is capable of consuming and loading data in a JSON format. In both of these cases, you may be able to use the flat-file created by our processing lambda that we created above as-is.

Alternatively, you may need to transform the data – such as to convert from a JSON to a CSV format – or load the data directly into a database. In this example, we will create a simple lambda function that will convert the webhook data from the original JSON format into a CSV file that could be loaded into a database.

Create a Lambda to Process the Data

As with the lambda function to consume the webhook data, we need to create a new lambda function by navigating to the Lambda service within AWS and pressing “Create Function.” This new lambda function will get triggered when a new file is created on our S3 bucket – it will read the data and convert it into a new csv file.

The lambda function accepts the file information as an event. In the sample lambda function, you will see that we first have a series of validation checks to ensure that the data is complete and formatted as expected. Next, we convert the JSON payload into a CSV file by using the “csv” library and writing to a temp file. Lambda functions are only able to write local files to the “/tmp” directory, so we create a temporary csv file and name it with the convention <batch_id>.csv. The reason we use the batch_id here is simply to ensure that any parallel processes that are running as a result of receiving multiple webhook payloads will not interfere with each other, as each webhook batch will have a unique batch_id.

Once the data has been fully converted to CSV, we read the CSV data as a byte stream, delete the temp file, and save the CSV data as a new file on S3. It is important to note that a different S3 bucket is needed for the output, otherwise, we risk creating a recursive loop that can result in increased lambda usage and increased costs. We will need to identify in which S3 bucket and location we want our CSV file to be stored within our lambda function. Follow the same procedure as above to create a new S3 bucket to store our CSV file.

Note that the tmp directory is limited to 512 MB of space, so it is important that the temp file is deleted afterward to ensure sufficient space for future executions. The reason we use a temp file, as opposed to writing directly to S3, is to simplify the connection to S3 by having a single request.

Just as with the consume lambda function, you may need to update the permissions for your process lambda function. This lambda function requires the execution role to have GetObject permissions for the input S3 bucket, and both PutObject and GetObject for the output S3 bucket.

A sample of our processing lambda function can be found here.

Configure a Lambda to Execute When New Data is Stored on S3

Now that our lambda function to convert the file from JSON to CSV format has been created, we need to configure it to trigger when a new file is created on our S3 bucket. To do this, we need to add a trigger to our lambda function by opening our lambda function and clicking “Add Trigger” at the top of the page. Select “S3” and provide the name of the S3 bucket where the raw webhook payloads are stored. You also have the option to specify file prefix and/or suffix to filter on. Once the settings have been configured, you can add the trigger by clicking “Add” at the bottom of the page. Now your processing lambda function will execute whenever a new file is added to your S3 bucket.

Loading the Data into a Database

In this example, I will not cover loading the data into a database in detail, but if you have been following along with this example you have a couple of options:

Load the data directly into your database within your processing lambda function
Consume your CSV file using an established ETL process

Whether you’re using an AWS database service, such as RDS or DynamoDB, or you have your own PostgreSQL database (or similar), you can connect to your database service directly from your process lambda function. For instance, in the same way, that we called the S3 service using “boto3” in our lambda function, you could also use “boto3” to call RDS or DynamoDB. The AWS Athena service could also be used to read the data files directly from the flat-files, and access the data using a query language similar to SQL. I recommend referring to the respective documentation for the service you are using for more information on how best to accomplish this within your environment.

Similarly, there are many services available that can help consume CSV files and load the data into a database. You may already have an established ETL process that you can leverage.

We hope you found this guide to be helpful – happy sending!

Autres news

Lire la suite de cette catégorie

15 Meilleures Pratiques de Livraison d'Email pour Gmail

Annonces

SparkPost est désormais Bird Email

Commencer avec les notifications par e-mail

15 Meilleures Pratiques de Livraison d'Email pour Gmail

Annonces

SparkPost est désormais Bird Email

15 Meilleures Pratiques de Livraison d'Email pour Gmail

Annonces

SparkPost est désormais Bird Email

A person is standing at a desk while typing on a laptop.

La plateforme native AI complète qui évolue avec votre business.

Contactez les ventes

Commencez gratuitement

Contactez le Support

La plateforme native AI complète qui évolue avec votre business.

Contactez les ventes

Commencez gratuitement

Contactez le Support

La plateforme native AI complète qui évolue avec votre business.

Contactez les ventes

Commencez gratuitement

Contactez le Support

Création et consommation de Webhooks pour les oiseaux

Points Clés

Points forts des Q&A

Le guide suivant est une simple aide pour permettre aux expéditeurs de se sentir à l'aise lors de la création d'un webhook d'événements Bird et de l'utilisation des données via l'infrastructure d'AWS.

Configuration de l'Endpoint de Cible Webhook

Create an S3 Bucket to Store the Webhook Data

Create a Lambda Function to Consume the Data

Create an Application Load Balancer

Create a DNS Record for the Load Balancer

Create a Bird Webhook

Processing Webhook Event Data

Create a Lambda to Process the Data

Configure a Lambda to Execute When New Data is Stored on S3

Loading the Data into a Database

Autres news

Lire la suite de cette catégorie

15 Meilleures Pratiques de Livraison d'Email pour Gmail

SparkPost est désormais Bird Email

Commencer avec les notifications par e-mail

15 Meilleures Pratiques de Livraison d'Email pour Gmail

SparkPost est désormais Bird Email

15 Meilleures Pratiques de Livraison d'Email pour Gmail

SparkPost est désormais Bird Email

La plateforme native AI complète qui évolue avec votre business.

Newsletter

La plateforme native AI complète qui évolue avec votre business.

Product

Solutions

Resources

Company

À venir bientôt

Social

La plateforme native AI complète qui évolue avec votre business.

Product

Solutions

Resources

Company

À venir bientôt

Social