Amazon Kinesis

Amazon Kinesis is a family of fully managed AWS services for collecting, processing, and analyzing streaming data in real time.

Amazon Kinesis Operational Stages

A typical real-time Kinesis data processing pipeline moves through four sequential stages:

  1. Data Ingestion: Collects live data streams from sources such as clickstreams, device telemetry, or server logs, in formats ranging from JSON to raw binary.
  2. Sharding and Scaling: Distributes incoming records across shards, the stream's units of throughput capacity, according to each record's partition key, so the stream can scale horizontally and be processed in parallel.
  3. Processing and Buffering: Filters, aggregates, and transforms the streaming records to prepare them for downstream database indexing.
  4. Data Accessibility: Exposes processed stream records to analytical consumers through native APIs, serverless functions, or SQL engines.
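
The sharding stage can be illustrated with a short Python sketch. Kinesis Data Streams maps each record's partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash key range contains that value. The helper names below (hash_key, even_shard_ranges, shard_for) are hypothetical, the evenly split ranges are an assumption about how the stream was created, and reading the digest as a big-endian integer is an assumption made for illustration. It is a local model of the routing rule, not a call into the service.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 digests are 128 bits wide


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer via MD5 (big-endian read is assumed)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def even_shard_ranges(shard_count: int) -> list[tuple[int, int]]:
    """Split the hash key space into contiguous, inclusive ranges.

    Assumes the stream's shards divide the key space evenly; a resharded
    stream can have uneven ranges.
    """
    step = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * step
        # The last range absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key: str, ranges: list[tuple[int, int]]) -> int:
    """Return the index of the shard whose range contains the key's hash."""
    value = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= value <= end:
            return index
    raise ValueError("hash key falls outside every shard range")


if __name__ == "__main__":
    ranges = even_shard_ranges(4)
    for key in ("device-1", "device-2", "device-3"):
        print(key, "-> shard", shard_for(key, ranges))
```

Because the mapping depends only on the partition key, all records sharing a key land on the same shard, which preserves their relative order but can create a hot shard if one key dominates the traffic.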

Detailed Breakdown

The Amazon Kinesis platform comprises four specialized services, each addressing a distinct requirement within the real-time data streaming lifecycle:

Amazon Kinesis Data Streams (KDS): KDS is a highly scalable, durable streaming service that can ingest gigabytes of data per second from thousands of source applications and buffer the records so that multiple consumers can read them in real time.
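
As a rough sketch of how a producer might write to KDS from Python, the example below prepares entries for the boto3 put_records call and splits them into batches of at most 500, the per-request record limit of the PutRecords API. The helper names (to_entries, chunked, send), the stream name, and the device_id field are hypothetical. The send function assumes boto3 is installed and AWS credentials are configured, and it imports boto3 lazily so the batching helpers can be run without AWS access.

```python
import json

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_entries(events, key_field):
    """Convert dict events into the entry shape put_records expects."""
    return [
        {
            "Data": json.dumps(event).encode("utf-8"),  # payload as bytes
            "PartitionKey": str(event[key_field]),      # decides the target shard
        }
        for event in events
    ]


def chunked(entries, size=MAX_RECORDS_PER_CALL):
    """Yield successive slices of at most `size` entries."""
    for start in range(0, len(entries), size):
        yield entries[start:start + size]


def send(stream_name, events, key_field="device_id"):
    """Send events in batches and return the number of records that failed.

    Assumes boto3 is installed and credentials are configured.
    """
    import boto3  # imported lazily so the helpers above work offline

    client = boto3.client("kinesis")
    failed = 0
    for batch in chunked(to_entries(events, key_field)):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response["FailedRecordCount"]
    return failed


if __name__ == "__main__":
    sample = [{"device_id": i % 3, "temp": 20 + i} for i in range(1201)]
    print([len(batch) for batch in chunked(to_entries(sample, "device_id"))])
```

A put_records call can succeed as a whole while individual records fail, so a production producer would typically inspect the per-record results and retry the failed entries rather than only counting them.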

Amazon Data Firehose (ADF): Formerly known as Kinesis Data Firehose, ADF is a fully managed, serverless delivery service designed to load real-time streaming data directly into destinations such as Amazon S3, Amazon Redshift, and Amazon OpenSearch Service.

Amazon Managed Service for Apache Flink (AMF): Formerly known as Kinesis Data Analytics, this fully managed service enables developers to process, aggregate, and analyze streaming data continuously by running Apache Flink applications, including queries written in Flink SQL.
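
To make "aggregate continuously" concrete, the sketch below counts events per key in fixed, non-overlapping (tumbling) time windows, one of the common aggregations a stream processor performs. It is plain Python that emulates the idea over an in-memory list and does not use the Apache Flink API. The function name tumbling_window_counts and the (timestamp, key) event shape are assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping time windows.

    events: iterable of (timestamp_seconds, key) pairs with integer timestamps.
    Returns a dict mapping (window_start, key) to the number of events.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align the timestamp to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


if __name__ == "__main__":
    clicks = [(0, "a"), (3, "a"), (9, "b"), (10, "a"), (19, "b"), (20, "b")]
    for (window_start, key), count in sorted(tumbling_window_counts(clicks, 10).items()):
        print(f"window starting at {window_start}s, key {key}: {count}")
```

A real streaming engine computes the same result incrementally as records arrive and must also decide when a window is complete and how to treat late records, which this batch-style sketch deliberately ignores.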

Amazon Kinesis Video Streams (AKVS): A secure, fully managed ingestion service built to stream live video, audio, and other time-encoded data such as depth maps from connected devices into AWS.