International IoT Deployment Pipeline

"Robin" stations, or Robotic Induct stations, are critical to Amazon's PrimeAir one-day delivery guarantee.

Each Robin station consists of a robotic arm and a gantry of sensors. 

After workers fill packages with items, the packages travel down a crowded conveyor to a Robin station, which autonomously identifies a single package to pick up and scan.

Once the package ID and final destination have been determined, the arm places the package onto a separate wheeled robot, which takes it to the loading area for its destination city. [1]

From One to Many: Challenges of Scaling

When I joined the Robin team in 2020, they had already built three initial prototypes.

But the job wasn't over.


We were already racing to meet ambitious program goals: we needed to build over 300 stations across multiple sites in the US and EU, and keep them operational remotely. If a station crashed, we risked breaking the one-day delivery promise: thousands of packages would miss their flights and cause cascading delivery delays.


Our existing update process required at least one engineer to copy each software image to each station.

A Single Pipeline Emerges

Using TypeScript and the AWS Cloud Development Kit (CDK), I created a software pipeline that promoted software whenever commits were pushed to Robin's main packages. The pipeline could trigger AWS Lambda-based integration tests and make promotion dependent on time-based and test-based conditions.
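To make the idea of time-based and test-based promotion conditions concrete, here is a minimal sketch of a promotion gate in plain TypeScript. It is not the Robin pipeline's code: the names (PromotionGate, evaluatePromotion), the minimum "bake time" rule, and the allowed-hours window are all assumptions chosen for illustration, and a real pipeline would wire a check like this into its CDK stage definitions.

```typescript
// Hypothetical shapes for illustration only; none of these names come from the Robin codebase.
interface TestResult {
  name: string;
  passed: boolean;
}

interface PromotionGate {
  // Assumed rule: a build must sit in its current stage this long before moving on.
  minBakeTimeMs: number;
  // Assumed rule: promotion is only allowed during UTC hours in [start, end).
  allowedUtcHours: { start: number; end: number };
}

interface PromotionDecision {
  promote: boolean;
  reasons: string[]; // Empty when promotion is allowed.
}

function evaluatePromotion(
  gate: PromotionGate,
  deployedAt: Date,
  now: Date,
  tests: TestResult[],
): PromotionDecision {
  const reasons: string[] = [];

  // Test-based condition: at least one result, and every test passed.
  if (tests.length === 0) {
    reasons.push("no integration test results");
  }
  for (const t of tests) {
    if (!t.passed) {
      reasons.push(`integration test failed: ${t.name}`);
    }
  }

  // Time-based condition 1: the build has baked long enough in this stage.
  if (now.getTime() - deployedAt.getTime() < gate.minBakeTimeMs) {
    reasons.push("bake time not elapsed");
  }

  // Time-based condition 2: we are inside the allowed promotion window.
  const hour = now.getUTCHours();
  if (hour < gate.allowedUtcHours.start || hour >= gate.allowedUtcHours.end) {
    reasons.push("outside promotion window");
  }

  return { promote: reasons.length === 0, reasons };
}
```

Returning the list of reasons rather than a bare boolean keeps a blocked promotion explainable, which matters when several teams share the same pipeline.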

The pipeline took advantage of the existing in-house code repository management services, including security linters and dependency resolution. Other teams in the Robin project forked it for use with their own packages, and production-stage code could be easily shared with and ingested by dependent pipelines.

Each code promotion also triggered a Lambda task that logged a record in an Amazon DynamoDB table with the version(s) of software pushed, the date, and metrics on any deployment and integration tests. This let us track the flow of changes through the pipeline, which was key to debugging several major production issues.
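As a rough illustration of what such a promotion record might contain, the sketch below builds a flat item from a promotion event and shows one way to trace when a given version first reached a stage. The key layout (pk, sk), field names, and helper names are hypothetical, and the actual write to DynamoDB is left out so the example stays self-contained.

```typescript
// Hypothetical record layout; the real table schema is not described in this article.
interface TestSuiteMetrics {
  suite: string;
  passed: number;
  failed: number;
  durationMs: number;
}

interface PromotionEvent {
  pipeline: string;
  stage: string;
  versions: Record<string, string>; // package name -> version pushed
  promotedAt: Date;
  tests: TestSuiteMetrics[];
}

interface PromotionRecord {
  pk: string; // assumed partition key: "<pipeline>#<stage>"
  sk: string; // assumed sort key: ISO 8601 timestamp, so records sort by time
  versions: Record<string, string>;
  testsPassed: number;
  testsFailed: number;
  totalTestDurationMs: number;
}

// Flatten one promotion event into a single item suitable for a key-value table.
function toPromotionRecord(event: PromotionEvent): PromotionRecord {
  let testsPassed = 0;
  let testsFailed = 0;
  let totalTestDurationMs = 0;
  for (const t of event.tests) {
    testsPassed += t.passed;
    testsFailed += t.failed;
    totalTestDurationMs += t.durationMs;
  }
  return {
    pk: `${event.pipeline}#${event.stage}`,
    sk: event.promotedAt.toISOString(),
    versions: event.versions,
    testsPassed,
    testsFailed,
    totalTestDurationMs,
  };
}

// Find the earliest record in which a package appeared at a given version.
function firstSeen(
  records: PromotionRecord[],
  packageName: string,
  version: string,
): PromotionRecord | undefined {
  const matches = records.filter((r) => r.versions[packageName] === version);
  matches.sort((a, b) => (a.sk < b.sk ? -1 : a.sk > b.sk ? 1 : 0));
  return matches[0];
}
```

Using an ISO timestamp as the sort key means a plain string comparison orders records chronologically, which is what makes the "when did this version first land here?" question cheap to answer.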

When is a Robot Not a Robot?

IoT is a concept that describes groups of "smart" objects with sensors and processing power, which can transfer data over a network without the assistance of humans. [2]

Using AWS IoT, I designated each station as a "Thing," or device. [3] This allowed us to promote code through the AWS IoT services, which provide conditional software rollback and group deployment functionality.

I attached an IoT deployment step to each stage in the pipeline, using AWS Lambda to dynamically determine which Thing group and build artifact versions to use. If a deployment failed, AWS IoT automatically rolled back to the prior Robin software version and logged the error in our Amazon CloudWatch logs.
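The selection logic inside such a Lambda could look roughly like the sketch below, which maps a pipeline stage to a Thing group and a list of artifact versions. Everything here is assumed for illustration: the StageConfig and DeploymentPlan shapes, the planDeployment function, and the group and package names in the usage example. The calls to AWS IoT that would actually start the deployment are deliberately omitted.

```typescript
// Hypothetical configuration shapes; not taken from the Robin codebase.
interface StageConfig {
  stage: string;      // pipeline stage name, e.g. "beta"
  thingGroup: string; // AWS IoT Thing group that this stage deploys to
}

interface ArtifactVersion {
  packageName: string;
  version: string;
}

interface DeploymentPlan {
  thingGroup: string;
  artifacts: ArtifactVersion[];
}

// Decide which Thing group and which artifact versions a stage should deploy.
function planDeployment(
  stage: string,
  stages: StageConfig[],
  manifest: Record<string, string>, // package name -> built version
): DeploymentPlan {
  const config = stages.find((s) => s.stage === stage);
  if (config === undefined) {
    throw new Error(`no Thing group configured for stage "${stage}"`);
  }

  const packageNames = Object.keys(manifest).sort();
  if (packageNames.length === 0) {
    throw new Error("build manifest is empty; nothing to deploy");
  }

  // Sorted by package name so the same inputs always produce the same plan.
  const artifacts = packageNames.map((packageName) => ({
    packageName,
    version: manifest[packageName],
  }));

  return { thingGroup: config.thingGroup, artifacts };
}

// Usage with made-up stage and package names:
const examplePlan = planDeployment(
  "beta",
  [
    { stage: "beta", thingGroup: "robin-beta-stations" },
    { stage: "prod-eu", thingGroup: "robin-prod-eu-stations" },
  ],
  { "robin-perception": "1.4.2", "robin-arm": "2.0.1" },
);
```

Failing loudly on an unknown stage or an empty manifest keeps a misconfigured pipeline from silently deploying nothing, or deploying to the wrong group of stations.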

I integrated the deployment step with a site map app, which let users see at a glance which stations were nonfunctional, along with their error codes.


As a result, deployment time dropped from several hours per station to less than 5 minutes for 50+ stations!

Errors caused by stations running the wrong software version virtually disappeared.

Additionally, onsite managers with no technical background could monitor and recover from failed deployments.

Public Project Artifacts

YouTube: Robin Technology Spotlight

YouTube: Robin in action

References

[1] Brown, Alan S. “Amazon's Robot Arms Break Ground in Safety and Technology.” Amazon Science, Amazon Science, 15 Nov. 2022, https://www.amazon.science/latest-news/amazon-robotics-see-robin-robot-arms-in-action

[2] Gillis, Alexander. “What is internet of things (IoT)?” IoT Agenda, 2021. Retrieved 4 Feb. 2023.

[3] https://aws.amazon.com/iot/