IT Science Case Study: Solving High-Volume Email Issues

How SendGrid updated its platform to become cloud-native on Amazon Web Services in an effort to pass along savings and efficiencies to its customers.

This is the latest article in a new feature series in eWEEK called IT Science, in which we look at what really happens at the intersection of new-gen IT and legacy systems.

Unless they are brand new and right off the assembly line, the servers, storage and networking inside every IT system can be considered “legacy.” This is because the iteration of both hardware and software products is speeding up all the time. It’s not unusual for an app-maker, for example, to update or patch an application for security purposes a few times a month, or even a few times a week. Some apps are updated daily! Hardware moves a little slower, but manufacturing cycles are also speeding up.

These articles will describe industry solutions only and won’t focus on any single product. The idea is to look at real-world examples of how new-gen IT products and services are making a difference in production each day. Most of them will be success stories, but there will also be others about projects that blew up. We’ll have IT integrators, system consultants, analysts and other experts helping us with these as needed.

Today’s Topic: Engaging AWS to Handle High-Volume, Multi-Tenant Email

This article is about enterprise email platform provider SendGrid and how it updated its platform to become cloud-native on Amazon Web Services in an effort to pass along savings and efficiencies to its customers. This industry information was provided to eWEEK by J.R. Jasperson, Chief Architect at SendGrid.

Name the problem to be solved: Over the years, SendGrid discovered unique challenges associated with running an extremely high-volume, multi-tenant SaaS email delivery platform; the company has processed more than 1 trillion emails since its inception. The complete re-architecture of its mail-processing pipeline to be cloud-native lays the groundwork for continued growth while taking advantage of the elastic capabilities and managed services of Amazon Web Services.

Describe the strategy that went into finding the solution: SendGrid wanted to pass along the benefits of AWS to its customers. SendGrid selected AWS to use Amazon’s immense global reach to create additional Points of Presence (PoPs) around the globe to quickly respond to API and SMTP requests, reducing latency and increasing throughput for its customers.

Additionally, running a high-scale cloud MTA (mail transfer agent) requires near-real-time analysis of a massive flow of data behind the scenes, a challenge that calls for stream processing and machine learning technologies. Thus, it was important to select a provider that enables this. With Amazon, these capabilities include Amazon Kinesis Streams, Amazon EMR and Amazon Machine Learning. It was also important that the cloud provider offer advanced analytics capabilities, and through Amazon, SendGrid gained access to Amazon Athena, Amazon Redshift and AWS Data Pipeline.
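To make the idea of near-real-time analysis more concrete, here is a minimal Python sketch of one common stream-processing technique: a sliding-window aggregate computed as events arrive. It is an illustration only, not SendGrid's implementation. The `DeliveryEvent` shape, the `SlidingBounceRate` class and the choice of bounce rate as the metric are all assumptions made for the example, and it assumes events for each sender arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryEvent:
    # Hypothetical event shape for illustration; not SendGrid's schema.
    sender_id: str
    timestamp: float  # seconds since some fixed starting point
    bounced: bool


class SlidingBounceRate:
    """Tracks each sender's bounce rate over the last `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # sender_id -> recent events

    def observe(self, event: DeliveryEvent) -> float:
        """Record one event and return the sender's current bounce rate."""
        window = self._events[event.sender_id]
        window.append(event)
        # Drop events that have aged out of the window.
        cutoff = event.timestamp - self.window_seconds
        while window and window[0].timestamp <= cutoff:
            window.popleft()
        bounced = sum(1 for e in window if e.bounced)
        return bounced / len(window)


tracker = SlidingBounceRate(window_seconds=60)
rate_1 = tracker.observe(DeliveryEvent("sender-a", 0, bounced=False))
rate_2 = tracker.observe(DeliveryEvent("sender-a", 10, bounced=True))
rate_3 = tracker.observe(DeliveryEvent("sender-a", 65, bounced=True))
```

In a production system, a managed streaming service would typically feed events to many such consumers in parallel; the sketch shows only the per-event windowing logic.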

List the key components in the solution: Access to AWS Regions across the globe with built-in local redundancy via Availability Zones, autoscaling compute (Amazon EC2), Amazon EMR (Elastic MapReduce) and a complex, proprietary and horizontally scalable scheduling solution are expected to be the linchpins of this solution.

Describe how the deployment went, perhaps how long it took, and if it came off as planned: For more than a year, SendGrid has been quietly working to re-architect its infrastructure to run a high-scale, multi-tenant cloud architecture on AWS. These changes will continue to roll out through the remainder of 2017. Because this effort involves carefully re-architecting a system that continues to deliver a massive volume of email, SendGrid is executing the deployment in a measured, isolated way, layering in the new architecture to ensure it maintains the high quality of service and deliverability that its customers expect.

Describe the result, new efficiencies gained, and what was learned from the project: As SendGrid re-architected its platform for AWS, its email delivery platform was better able to:

  • dynamically route traffic based on flexible criteria to streamline mail flow through the most efficient path;
  • eradicate state from CPU-bound components to optimize for AWS’ auto-scaling capabilities;
  • distribute stateful concerns such as eventing, signal aggregation, processing and orchestration, in preparation for building on the primitives exposed by AWS’ managed services; and
  • decouple the mail processing and transfer agents from the underpinning network components to ensure strong deliverability for customers.
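The first two points, criteria-based routing and stateless components, can be illustrated with a small Python sketch. It is a hypothetical example, not SendGrid's routing logic: the `Message` fields, the rule names and the pool names are all invented for illustration. The point is that the routing decision is a pure function of the message and a rule list, so any worker instance can make it and instances can be added or removed freely by an autoscaler.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Message:
    # Hypothetical fields for illustration only.
    tenant: str
    recipient_domain: str
    size_bytes: int


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Message], bool]
    route: str


def choose_route(message: Message, rules: Sequence[Rule],
                 default: str = "general-pool") -> str:
    """Return the route of the first matching rule, or the default.

    The function keeps no state between calls, so identical workers
    behind a load balancer all reach the same decision.
    """
    for rule in rules:
        if rule.matches(message):
            return rule.route
    return default


# Rules are plain data, so the criteria can change without touching workers.
RULES = (
    Rule("large-message", lambda m: m.size_bytes > 1_000_000, "bulk-pool"),
    Rule("example-domain", lambda m: m.recipient_domain == "example.com",
         "example-pool"),
)

route_large = choose_route(Message("t1", "example.com", 2_000_000), RULES)
route_domain = choose_route(Message("t1", "example.com", 100), RULES)
route_default = choose_route(Message("t1", "other.org", 100), RULES)
```

Here the first matching rule wins, which is only one possible policy; a real system might weigh several criteria at once.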

Describe ROI, carbon footprint savings, and staff time savings, if any: Expected ROI benefits include:

  • reduced latency at many customer touchpoints, made possible by AWS’ global presence; and
  • faster development of differentiating features to delight customers, made possible by AWS’ managed services.

Chris J. Preimesberger

Chris J. Preimesberger is Editor-in-Chief of eWEEK and responsible for all the publication's coverage. In his 15 years and more than 4,000 articles at eWEEK, he has distinguished himself in reporting...