Amazon MSK is a fully managed service that makes it easy for us to build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, we can use native Apache Kafka APIs to populate data lakes, stream changes to and from databases, and power machine learning and analytics applications.
Apache Kafka clusters are challenging to set up, scale, and manage in production. When we run Apache Kafka on our own, we need to provision servers, configure Apache Kafka manually, replace servers when they fail, orchestrate server patches and upgrades, architect the cluster for high availability, ensure data is durably stored and secured, set up monitoring and alarms, and carefully plan scaling events to support load changes. Amazon MSK makes it easy for us to build and run production applications on Apache Kafka without needing Apache Kafka infrastructure management expertise. That means we spend less time managing infrastructure and more time building applications.
With a few clicks in the Amazon MSK console, we can create highly available Apache Kafka clusters with settings and configuration based on Apache Kafka’s deployment best practices. Amazon MSK automatically provisions and runs our Apache Kafka clusters. Amazon MSK continuously monitors cluster health and automatically replaces unhealthy nodes with no downtime to our application. In addition, Amazon MSK secures our Apache Kafka cluster by encrypting data at rest.
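The same cluster can also be requested programmatically. The following is a minimal sketch, assuming we use the `create_cluster` operation of the boto3 `kafka` client. The helper names `build_create_cluster_request` and `create_cluster`, the default instance type and volume size, and the rule of one subnet per Availability Zone are illustrative assumptions rather than recommended settings.

```python
"""Sketch: assemble a request for the MSK CreateCluster API (boto3 `kafka` client)."""


def build_create_cluster_request(
    cluster_name,
    kafka_version,
    client_subnets,
    security_groups,
    brokers_per_az=1,
    instance_type="kafka.m5.large",  # assumed default, pick what fits the workload
    volume_size_gib=100,             # assumed default EBS volume size per broker
):
    """Return keyword arguments for the boto3 kafka client's create_cluster call.

    Assumption: each entry in client_subnets lives in a different Availability
    Zone, so the broker count is a multiple of the number of subnets.
    """
    if len(client_subnets) < 2:
        raise ValueError("provide subnets in at least two Availability Zones")
    if brokers_per_az < 1:
        raise ValueError("brokers_per_az must be at least 1")

    return {
        "ClusterName": cluster_name,
        "KafkaVersion": kafka_version,
        # Brokers are spread evenly across the subnets (and therefore the AZs).
        "NumberOfBrokerNodes": brokers_per_az * len(client_subnets),
        "BrokerNodeGroupInfo": {
            "InstanceType": instance_type,
            "ClientSubnets": list(client_subnets),
            "SecurityGroups": list(security_groups),
            "StorageInfo": {"EbsStorageInfo": {"VolumeSize": volume_size_gib}},
        },
        # Require TLS between clients and brokers and between brokers.
        # EncryptionAtRest is omitted here; our understanding is that the
        # service then falls back to an AWS managed key.
        "EncryptionInfo": {
            "EncryptionInTransit": {"ClientBroker": "TLS", "InCluster": True},
        },
    }


def create_cluster(request, region_name):
    """Send the request to Amazon MSK and return the new cluster's ARN."""
    import boto3  # imported lazily so the builder above runs without the AWS SDK

    client = boto3.client("kafka", region_name=region_name)
    return client.create_cluster(**request)["ClusterArn"]
```

For example, `build_create_cluster_request("demo", "3.6.0", ["subnet-a", "subnet-b", "subnet-c"], ["sg-1"], brokers_per_az=2)` yields a request for six brokers spread over three subnets. The subnet IDs, security group ID, and Kafka version shown here are placeholders.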
Amazon MSK runs and manages Apache Kafka for us. This makes it easy for us to migrate and run our existing Apache Kafka applications on AWS without changes to the application code. By using Amazon MSK, we maintain open-source compatibility and can continue to use familiar custom and community-built tools such as MirrorMaker, Apache Flink, and Prometheus.
Amazon MSK lets us focus on creating our streaming applications without having to worry about the operational overhead of managing our Apache Kafka environment. Amazon MSK manages the provisioning, configuration, and maintenance of Apache Kafka clusters and Apache ZooKeeper nodes for us. Amazon MSK also shows key Apache Kafka performance metrics in the AWS console.
Elastic stream processing
Apache Flink is a powerful, open-source stream processing framework for stateful computations of streaming data. We can run fully managed Apache Flink applications written in SQL, Java, or Scala that elastically scale to process data streams within Amazon MSK.
Amazon MSK creates an Apache Kafka cluster and offers multi-AZ replication within an AWS Region. Amazon MSK continuously monitors cluster health, and if a component fails, Amazon MSK automatically replaces it.
Amazon MSK provides multiple levels of security for our Apache Kafka clusters, including VPC network isolation, AWS IAM for control-plane API authorization, encryption at rest, TLS encryption in transit, TLS-based certificate authentication, and SASL/SCRAM authentication secured by AWS Secrets Manager. It also supports Apache Kafka Access Control Lists (ACLs) for data-plane authorization.
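To make the SASL/SCRAM option concrete, here is a minimal sketch of the client-side settings a Kafka client would typically need when authenticating with SCRAM over TLS. It uses the standard Apache Kafka client property names. The helper `scram_client_properties` is hypothetical, and the choice of `SCRAM-SHA-512` reflects our understanding of the mechanism Amazon MSK uses. In practice we would read the username and password from the secret stored in AWS Secrets Manager rather than passing literals.

```python
def scram_client_properties(username, password, truststore_path=None):
    """Render Kafka client properties for SASL/SCRAM authentication over TLS.

    Hypothetical helper: it only formats text and does not contact a cluster.
    """
    for value in (username, password):
        # Keep the sketch simple: refuse characters that would need escaping
        # inside the quoted JAAS string or the properties file.
        if '"' in value or "\\" in value or "\n" in value:
            raise ValueError("credentials must not contain quotes, backslashes, or newlines")

    props = {
        # SASL authentication carried over a TLS-encrypted connection.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "SCRAM-SHA-512",
        "sasl.jaas.config": (
            "org.apache.kafka.common.security.scram.ScramLoginModule required "
            f'username="{username}" password="{password}";'
        ),
    }
    if truststore_path is not None:
        # Optional: point Java clients at a truststore containing the CA certificates.
        props["ssl.truststore.location"] = truststore_path

    return "".join(f"{key}={value}\n" for key, value in props.items())


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    print(scram_client_properties("alice", "example-password"), end="")
```

The resulting text can be saved as a `client.properties` file and passed to the Kafka command-line tools or loaded by a Java client. What the authenticated user may then read or write is governed separately by the Apache Kafka ACLs.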