Amazon Managed Streaming for Apache Kafka FAQs
General
What is Amazon Managed Streaming for Apache Kafka (Amazon MSK)?
Amazon MSK is an AWS streaming data service that manages Apache Kafka infrastructure and operations, making it easy for developers and DevOps managers to run Apache Kafka applications and Kafka Connect connectors on AWS, without the need to become experts in operating Apache Kafka. Amazon MSK operates, maintains, and scales Apache Kafka clusters, provides enterprise-grade security features out of the box, and has built-in AWS integrations that accelerate development of streaming data applications.
To get started, you can migrate existing Apache Kafka workloads and Kafka Connect connectors into Amazon MSK or build new ones from scratch in only a few steps. There are no data transfer charges for in-cluster traffic used for replication and no commitments or upfront payments required. You only pay for the resources that you use.
What is Apache Kafka?
Apache Kafka is an open source, high-performance, fault-tolerant, and scalable platform for building real-time streaming data pipelines and applications. Apache Kafka is a streaming data store that decouples applications producing streaming data (producers) into its data store from applications consuming streaming data (consumers) from its data store. Organizations use Apache Kafka as a data source for applications that continually analyze and react to streaming data.
What is streaming data?
Streaming data is a continuous stream of small records or events (a record or event is typically a few kilobytes) generated by thousands of machines, devices, websites, and applications. Streaming data includes a wide variety of data, such as log files generated by customers using your mobile or web applications, ecommerce purchases, in-game player activity, information from social networks, financial trading floors, geospatial services, security logs, metrics, and telemetry from connected devices or instrumentation in data centers. Streaming data services such as Amazon MSK and Amazon Kinesis Data Streams make it easy for you to continually collect, process, and deliver streaming data.
What is Kafka Connect?
Kafka Connect, an open source component of Apache Kafka, is a framework for connecting Apache Kafka with external systems, such as databases, key-value stores, search indexes, and file systems.
What are Apache Kafka’s primary capabilities?
The three key capabilities of Apache Kafka are as follows:
- Apache Kafka stores streaming data in a fault-tolerant way, providing a buffer between producers and consumers. It stores events as a continuous series of records and preserves the order in which the records were produced.
- Apache Kafka allows many data producers—such as websites, IoT devices, and Amazon Elastic Compute Cloud (Amazon EC2) instances—to continually publish streaming data and categorize it using Apache Kafka topics. Multiple data consumers (such as machine learning applications, AWS Lambda functions, and microservices) read from these topics at their own rate, similar to a message queue or enterprise messaging system.
- Data consumers can process data from Apache Kafka topics on a first-in-first-out basis, preserving the order in which the data was produced.
What are the key concepts of Apache Kafka?
Apache Kafka stores records in topics. Data producers write records to topics and consumers read records from topics. Each record in Apache Kafka consists of a key, a value, a timestamp, and sometimes header metadata. Apache Kafka partitions topics and replicates these partitions across multiple nodes called brokers. Apache Kafka runs as a cluster on one or more brokers, and brokers are located in multiple AWS Availability Zones to create a highly available cluster. Apache Kafka relies on Apache ZooKeeper or Apache Kafka Raft (KRaft) to maintain cluster metadata.
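As a rough illustration of these concepts, the sketch below models a record and shows how records that share a key can be routed to the same partition. The names `Record` and `choose_partition` are hypothetical, and CRC32 is only a stand-in for the hash that real Kafka clients use, so actual partition numbers will differ.

```python
import time
import zlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Record:
    """Illustrative stand-in for a Kafka record (not a client library class)."""
    key: Optional[bytes]
    value: bytes
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))
    headers: Dict[str, bytes] = field(default_factory=dict)


def choose_partition(record: Record, num_partitions: int) -> int:
    """Map a keyed record to a partition so equal keys share a partition.

    Assumption: CRC32 is used here only for illustration; real clients
    use a different hash and handle keyless records differently.
    """
    if record.key is None:
        return 0  # simplification for keyless records
    return zlib.crc32(record.key) % num_partitions


orders = [Record(key=b"customer-42", value=b"order-1"),
          Record(key=b"customer-42", value=b"order-2")]
partitions = {choose_partition(r, num_partitions=6) for r in orders}
print(partitions)  # a single-element set: both records share one partition
```

Because both records carry the same key, they land in one partition, which is what lets a consumer read them in the order they were produced.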
When should I use Apache Kafka?
Apache Kafka supports real-time applications that transform, deliver, and react to streaming data, and can be used to build real-time streaming data pipelines that reliably send data between multiple systems or applications.
Does Amazon MSK support schema registration?
Yes, Apache Kafka clients can use the AWS Glue Schema Registry, a serverless feature of AWS Glue, at no additional charge. Visit the AWS Glue Schema Registry User Guide to get started and learn more.
What does Amazon MSK do?
Amazon MSK makes it easy to get started with and run Apache Kafka on AWS with high availability and security. Amazon MSK also offers integrations with AWS services without the operational overhead of running an Apache Kafka cluster. Amazon MSK allows you to use open source versions of Apache Kafka while the service manages the setup, provisioning, AWS integrations, and ongoing maintenance of Apache Kafka clusters.
In only a few steps in the console, you can create an Amazon MSK cluster. From there, Amazon MSK replaces unhealthy brokers, automatically replicates data for high availability, manages metadata nodes, automatically deploys hardware patches as needed, manages the integrations with AWS services, makes important metrics visible through the console, and supports Apache Kafka version upgrades so you can take advantage of improvements to the open source version of Apache Kafka.
Resources
How do I create my first MSK cluster?
You can follow the Amazon MSK Getting Started Guide for setting up a cluster and producing and consuming from it. In summary, you can create your first cluster in a few steps in the AWS Management Console or by using the AWS SDKs. First, in the Amazon MSK console, select an AWS Region to create an MSK cluster in. Choose a name for your cluster, the virtual private cloud (VPC) you want to run the cluster with, and the subnets for each Availability Zone. If you are creating a provisioned cluster, you can also pick a broker instance type and the number of brokers per Availability Zone.
What resources are within a cluster?
Provisioned clusters contain broker instances and abstracted metadata nodes. Serverless clusters are a resource in and of themselves, which abstract away all underlying resources.
What are brokers?
In Apache Kafka, brokers are the individual servers that make up the Apache Kafka cluster. They are responsible for storing and replicating the data published to Kafka topics, managing the partitions within those topics, handling client requests (producing and consuming messages), and coordinating with each other to maintain the overall state of the Kafka deployment. Brokers are the core components that enable Kafka's distributed, scalable, and fault-tolerant architecture.
What broker instance sizes can I provision on an MSK cluster?
For provisioned clusters, you can choose EC2 T3.small instances or instances within the EC2 M7g and M5 instance families. For serverless clusters, brokers are completely abstracted. MSK also offers Standard and Express Broker types.
Do I need to provision and pay for broker boot volumes?
No, each broker you provision includes boot volume storage managed by the Amazon MSK service.
When I create an Apache Kafka cluster, do the underlying resources (such as Amazon EC2 instances) show up in my Amazon EC2 console?
Some resources, such as elastic network interfaces (ENIs), will show up in your Amazon EC2 account. Other Amazon MSK resources will not show up in your Amazon EC2 account because they are managed by the Amazon MSK service.
What do I need to provision within an MSK cluster?
For provisioned clusters, you need to provision broker instances with every cluster you create. On Standard brokers, you will provision storage and optionally enable provisioned storage throughput for storage volumes, which can be used to scale I/O without having to provision additional brokers. With Express brokers, you do not need to provision or manage storage. For all cluster types, you do not need to provision metadata nodes such as Apache ZooKeeper or KRaft nodes because these resources are included at no additional charge with each cluster you create. For serverless clusters, you just create a cluster as a resource.
How does data replication work in Amazon MSK?
Amazon MSK uses Apache Kafka’s leader-follower replication to replicate data between brokers. Amazon MSK makes it easy to deploy clusters with Multi-AZ replication. With Standard brokers, you have the option to use a custom replication strategy by topic. Express brokers guarantee higher availability by always replicating your data across three Availability Zones. Leader and follower brokers are deployed and isolated according to the broker type and replication strategy you specify. For example, if you select Standard brokers with a three-Availability-Zone replication strategy and one broker per Availability Zone, Amazon MSK creates a cluster of three brokers (one broker in each of three Availability Zones in a Region). Unless you override it, the default topic replication factor is also three. To learn more about what happens during client failover, see our client failover documentation.
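The arithmetic in that example can be sketched as follows. The function name `cluster_layout` is hypothetical, and the rule that the default replication factor equals the number of Availability Zones is an assumption drawn only from the three-AZ example above, not a documented formula.

```python
def cluster_layout(availability_zones: int, brokers_per_az: int) -> dict:
    """Return the broker count and an assumed default replication factor.

    Assumption: the default topic replication factor equals the number of
    Availability Zones, which matches the three-AZ example in the text.
    """
    if availability_zones < 1 or brokers_per_az < 1:
        raise ValueError("both arguments must be positive")
    return {
        "total_brokers": availability_zones * brokers_per_az,
        "default_replication_factor": availability_zones,
    }


layout = cluster_layout(availability_zones=3, brokers_per_az=1)
print(layout)  # {'total_brokers': 3, 'default_replication_factor': 3}
```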
Can I change the default broker configurations or upload a cluster configuration to Amazon MSK?
Yes, Amazon MSK allows you to create custom configurations and apply them to new and existing clusters. Express brokers protect more of the configurations from suboptimal values that may affect availability and durability. Express brokers also offer a simpler experience by abstracting configurations that have to do with storage since Amazon MSK fully manages the storage layer. For more information on custom configurations, see the configuration documentation.
How do I create topics?
Once your Apache Kafka cluster has been created, you can create topics using the Apache Kafka APIs. All topic and partition-level actions and configurations are performed using Apache Kafka APIs. The following command is an example of creating a topic using Apache Kafka APIs and the configuration details available for your cluster:
<path-to-your-kafka-installation>/bin/kafka-topics.sh --create --bootstrap-server <BootstrapBrokerString> --replication-factor 3 --partitions 1 --topic TopicName
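If you script topic creation, the same command can be assembled programmatically. The following is a minimal sketch that uses only the flags shown above; the helper name `build_create_topic_command`, the installation path, and the broker address are placeholders rather than values from your cluster.

```python
import shlex
from typing import List


def build_create_topic_command(kafka_home: str, bootstrap_servers: str,
                               topic: str, partitions: int = 1,
                               replication_factor: int = 3) -> List[str]:
    """Assemble the kafka-topics.sh argument list for creating a topic."""
    return [
        f"{kafka_home}/bin/kafka-topics.sh",
        "--create",
        "--bootstrap-server", bootstrap_servers,
        "--replication-factor", str(replication_factor),
        "--partitions", str(partitions),
        "--topic", topic,
    ]


# Placeholder values; substitute your installation path and broker string.
command = build_create_topic_command(
    kafka_home="/opt/kafka",
    bootstrap_servers="broker-1.example.com:9092",
    topic="TopicName",
)
print(shlex.join(command))
```

Building the command as a list avoids shell quoting problems; the list can be passed directly to `subprocess.run(command, check=True)` on a machine that has the Apache Kafka tools installed and network access to the cluster.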
What are the deployment options of Amazon MSK?
Amazon MSK offers two deployment options for Apache Kafka clusters: Amazon MSK Provisioned and Amazon MSK Serverless. MSK Provisioned gives you varying levels of control over your cluster while removing most of the operational overhead that comes with managing Apache Kafka clusters. With MSK Provisioned, you scale your cluster in units of brokers. You can choose from various broker types, including Standard and Express brokers. In contrast, MSK Serverless is a cluster type that fully abstracts cluster scaling and management. With MSK Serverless, you can run your applications without having to provision, configure, or optimize clusters, and you pay for the data volume you stream and retain. Amazon MSK also offers multiple options to simplify connecting to your MSK clusters. These options include Amazon MSK Connect, Amazon MSK Replicator, and other native AWS integrations. See subsequent sections for more details.
Amazon MSK Provisioned
What is MSK Provisioned?
MSK Provisioned is an MSK cluster deployment option that allows you to manually configure and scale your Apache Kafka clusters. This provides you with varying levels of control over the infrastructure powering your Apache Kafka environment.
With MSK Provisioned, you can choose the instance types, storage volumes on Standard broker type, and number of broker nodes that make up your Kafka clusters. You can also scale your cluster by adding or removing brokers as your data processing needs evolve. This flexibility enables you to optimize the clusters for your specific workload requirements, whether that's maximizing throughput, retention capacity, or other performance characteristics.
In addition to the infrastructure configuration options, MSK Provisioned provides enterprise-grade security, monitoring, and operational benefits. This includes features such as Apache Kafka version upgrades, built-in security through encryption and access control, and integration with other AWS services such as Amazon CloudWatch for monitoring. MSK Provisioned offers two main broker types—Standard and Express.
Standard brokers give you the most flexibility to configure your clusters, while Express brokers offer more elasticity, throughput, resilience, and ease-of-use for running high performance streaming applications. See the sub-sections below for more details on each offering. The table below also highlights the key feature comparisons between Standard and Express brokers.
Feature | Standard | Express |
Storage Management | Customer managed (features include EBS storage, tiered storage, provisioned storage throughput, auto-scaling, storage capacity alerts) | Fully MSK managed |
Supported instances | T3, M5, M7g | M7g |
Sizing and scaling considerations | Throughput, connections, partitions, storage | Throughput, connections, partitions |
Broker Scaling | Vertical and horizontal scaling | Vertical and horizontal scaling |
Kafka versions | See documentation | Starts at version 3.6 |
Apache Kafka Configuration | More configurable | Mostly MSK Managed for higher resilience |
Security | Encryption, Private/Public access, Authentication & Authorization - IAM, SASL/SCRAM, mTLS, plaintext, Kafka ACLs | Encryption, Private/Public access, Authentication & Authorization - IAM, SASL/SCRAM, mTLS, plaintext, Kafka ACLs |
Monitoring | CloudWatch, Open Monitoring | CloudWatch, Open Monitoring |
Does Amazon MSK support M7g clusters?
Yes, Amazon MSK supports AWS Graviton3-based M7g instances from .large through .16xlarge sizes to run all Apache Kafka workloads. Graviton instances come with the same availability and durability benefits of Amazon MSK, with up to 24% lower costs compared to corresponding M5 instances. Graviton instances provide up to 29% higher throughput per instance compared to Amazon MSK M5 instances, which allow customers to run MSK clusters with fewer brokers or smaller-sized instances.
Standard brokers
What are Standard brokers?
Standard brokers for MSK Provisioned offer the most flexibility to configure your cluster's performance. You can choose from a wide range of cluster configurations to achieve the availability, durability, throughput, and latency characteristics required for your applications. You can also provision storage capacity and increase it as needed. Amazon MSK handles the hardware maintenance of Standard brokers and attached storage resources, automatically repairing hardware issues that may arise.
Express brokers
What are Express brokers?
Express brokers for MSK Provisioned make Apache Kafka simpler to manage, more cost-effective to run at scale, and more elastic with the low latency you expect. Brokers include pay-as-you-go storage that scales automatically and requires no sizing, provisioning, or proactive monitoring. Depending on the instance size selected, each broker node can provide up to 3x more throughput per broker, scale up to 20x faster, and recover 90% quicker compared to standard Apache Kafka brokers. Express brokers come pre-configured with Amazon MSK’s best practice defaults and enforce client throughput quotas to minimize resource contention between clients and Kafka’s background operations.
What are the key benefits of Express brokers?
- No storage management: Express brokers eliminate the need to provision or manage any storage resources. You get elastic, virtually unlimited, pay-as-you-go, and fully managed storage. For high throughput use cases, you do not need to reason about the interactions between compute instances and storage volumes and the associated throughput bottlenecks. These capabilities simplify cluster management and eliminate storage management operational overhead.
- Faster scaling: Express brokers allow you to scale your cluster and move partitions faster than on Standard brokers. This capability is crucial when you need to scale out your cluster to handle upcoming load spikes or scale in your cluster to reduce cost. See the sections on expanding your cluster, removing brokers, reassigning partitions, and setting up LinkedIn’s Cruise Control for rebalancing for more details on scaling your cluster.
- Higher throughput: Express brokers offer up to 3x more throughput per broker than Standard brokers. For example, you can safely write data at up to 500 MBps with each m7g.16xlarge sized Express broker compared to 153.8 MBps on the equivalent Standard broker (both numbers assume sufficient bandwidth allocation towards background operations, such as replication and rebalancing).
- Configured for high resilience: Express brokers automatically offer various best practices pre-configured to improve your cluster’s resilience. These include guardrails on critical Apache Kafka configurations, throughput quotas, and capacity reservation for background operations and unplanned repairs. These capabilities make it safer and easier to run large scale Apache Kafka applications. See the sections on Express broker configurations and Amazon MSK Express broker quota for more details.
- No maintenance windows: There are no maintenance windows for Express brokers. Amazon MSK automatically updates your cluster hardware on an ongoing basis. See Amazon MSK Express brokers for more details.
How can I optimize my cost with Express brokers?
Express brokers provide more throughput per broker, so you can create clusters with fewer brokers for the same workload. Additionally, once your cluster is up and running, you can monitor the use of your cluster resources and right-size capacity faster than with Standard brokers. You can, therefore, provision resources that are fit for the capacity you need and scale faster to meet any changes in demand.
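As a back-of-the-envelope illustration, the sketch below estimates how many brokers a write workload needs, using the m7g.16xlarge figures quoted in this FAQ (500 MBps for Express, 153.8 MBps for Standard). The function name and the 1,000 MBps target are hypothetical, and the estimate ignores factors such as consumer traffic, headroom, and spreading brokers evenly across Availability Zones.

```python
import math

# Per-broker write throughput for m7g.16xlarge, as quoted in this FAQ.
EXPRESS_MBPS = 500.0
STANDARD_MBPS = 153.8


def brokers_needed(target_write_mbps: float, per_broker_mbps: float) -> int:
    """Smallest broker count whose combined throughput covers the target."""
    if target_write_mbps <= 0 or per_broker_mbps <= 0:
        raise ValueError("throughput values must be positive")
    return math.ceil(target_write_mbps / per_broker_mbps)


target = 1000.0  # hypothetical workload, in MBps
print(brokers_needed(target, EXPRESS_MBPS))   # 2
print(brokers_needed(target, STANDARD_MBPS))  # 7
```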
Which Apache Kafka APIs and tools can I use with Express brokers?
Clusters with Express brokers work with Apache Kafka APIs and tools that use the standard Apache Kafka client.
Which Kafka configurations do I need to customize for Express brokers?
Express brokers come preconfigured with Amazon MSK best practice defaults that optimize for availability and durability. You may customize some of these configurations to further fine-tune the performance of your clusters. Read more about Express broker configurations in the Amazon MSK Developer Guide.
Which encryption options are available with Express brokers?
Just as for Standard brokers, Amazon MSK integrates with AWS Key Management Service (AWS KMS) to offer transparent server-side encryption for the storage in Express brokers. When you create an MSK cluster with Express brokers, you can specify the AWS KMS key that you want Amazon MSK to use to encrypt your data at rest. If you don't specify a KMS key, Amazon MSK creates an AWS managed key for you and uses it on your behalf. Amazon MSK also uses TLS to encrypt data in transit for Express brokers, as it does for Standard brokers.
What are the Amazon MSK feature differences between Standard and Express brokers?
Most MSK Provisioned features and capabilities that work on Standard brokers also work with clusters that use Express brokers. Some differences include storage management, instance type availability, and supported versions. The table comparing Standard and Express brokers under MSK Provisioned highlights some key similarities and differences.
Can I move my existing Kafka workload to Express brokers?
Yes, you can migrate the data in your Kafka cluster to a cluster composed of Express brokers using MirrorMaker 2 or Amazon MSK Replicator, which copies both the data and the metadata of your cluster to a new cluster. You can learn more about using MirrorMaker 2 and MSK Replicator in the Amazon MSK Developer Guide.
How should I choose between Standard and Express MSK Provisioned broker types?
Express brokers increase your price performance, provide higher resiliency, and lower operational overhead, making them the ideal choice for all Apache Kafka workloads on MSK Provisioned. However, you can choose Standard brokers if you want to control more of your brokers’ configurations and settings. With Standard brokers, you can customize a wider set of Kafka configurations, including replication factor, size of log files, and leader election policies, which gives you more flexibility over your cluster settings.
Amazon MSK Serverless
What is MSK Serverless?
MSK Serverless is a cluster type for Amazon MSK that makes it easy for you to run Apache Kafka clusters without having to manage compute and storage capacity. With MSK Serverless, you can run your applications without having to provision, configure, or optimize clusters, and you pay for the data volume you stream and retain.
Does MSK Serverless automatically balance partitions within a cluster?
Yes, MSK Serverless fully manages partitions, including monitoring and moving them to even load across a cluster.
How much data throughput capacity does MSK Serverless support?
MSK Serverless provides up to 200 MBps of write capacity and 400 MBps of read capacity per cluster. Additionally, to ensure sufficient throughput availability for all partitions in a cluster, MSK Serverless allocates up to 5 MBps of instant write capacity and 10 MBps of instant read capacity per partition.
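To see how these limits interact, the sketch below estimates the fewest partitions a topic would need for a given workload, based only on the per-partition and per-cluster figures quoted above. The function name and the sample workload are hypothetical, and the result is a lower bound rather than sizing guidance.

```python
import math

# Limits quoted in this FAQ for MSK Serverless, in MBps.
CLUSTER_WRITE_MBPS = 200
CLUSTER_READ_MBPS = 400
PARTITION_WRITE_MBPS = 5
PARTITION_READ_MBPS = 10


def min_partitions(write_mbps: float, read_mbps: float) -> int:
    """Fewest partitions whose combined per-partition capacity covers the load."""
    if write_mbps > CLUSTER_WRITE_MBPS or read_mbps > CLUSTER_READ_MBPS:
        raise ValueError("workload exceeds the per-cluster capacity")
    by_write = math.ceil(write_mbps / PARTITION_WRITE_MBPS)
    by_read = math.ceil(read_mbps / PARTITION_READ_MBPS)
    return max(by_write, by_read, 1)


print(min_partitions(write_mbps=50, read_mbps=120))  # 12
```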
What security features does MSK Serverless offer?
MSK Serverless encrypts all traffic in transit and all data at rest using service-managed keys issued through AWS KMS. Clients connect to MSK Serverless over a private connection using AWS PrivateLink without exposing your traffic to the public internet. Additionally, MSK Serverless offers AWS Identity and Access Management (IAM) access control, which you can use to manage client authentication and client authorization to Apache Kafka resources such as topics.
How can producers and consumers access my MSK Serverless clusters?
When you create an MSK Serverless cluster, you provide subnets of one or more Amazon Virtual Private Clouds (Amazon VPCs) that host the clients of the cluster. Clients hosted in any of these Amazon VPCs can connect to the MSK Serverless cluster using its bootstrap broker string.
Which Regions is MSK Serverless available in?
For up-to-date Regional availability, refer to the Amazon MSK pricing page.
Which authentication types does MSK Serverless support?
MSK Serverless currently supports AWS Identity and Access Management (IAM) for client authentication and authorization. Your clients can assume an IAM role for authentication, and you can enforce access control using an associated IAM policy.
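For a Java-based Kafka client, IAM authentication is typically enabled through a handful of client properties. The sketch below renders such a properties file; it assumes the open source aws-msk-iam-auth library is on the client's classpath, and the helper name `iam_client_properties` is hypothetical. Check the Amazon MSK Developer Guide for the settings that apply to your client and language.

```python
def iam_client_properties() -> str:
    """Render Kafka client properties for IAM authentication.

    Assumption: the class names come from the open source
    aws-msk-iam-auth library, which must be on the client's classpath.
    """
    settings = {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "AWS_MSK_IAM",
        "sasl.jaas.config":
            "software.amazon.msk.auth.iam.IAMLoginModule required;",
        "sasl.client.callback.handler.class":
            "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


print(iam_client_properties())
```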
How do I process data in my MSK Serverless cluster?
You can use any Apache Kafka compatible tools to process data in your MSK Serverless cluster topics. MSK Serverless integrates with Amazon Managed Service for Apache Flink for stateful stream processing and AWS Lambda for event processing. You can also use Apache Kafka Connect sink connectors to send data to any desired destination.
How does MSK Serverless ensure high availability?
When you create a partition, MSK Serverless creates two replicas of it and places them in different Availability Zones. Additionally, MSK Serverless automatically detects and recovers failed backend resources to maintain high availability.
Migrating to Amazon MSK
Can I migrate data within my existing Apache Kafka cluster to Amazon MSK?
Yes, you can use third-party or open source tools such as MirrorMaker 2, supported by Apache Kafka, to replicate data from existing clusters into an MSK cluster. Check out this Amazon MSK migration lab to help you plan for your migration.
Supported versions
Are Apache Kafka version upgrades supported?
Yes, Amazon MSK supports fully managed, in-place Apache Kafka version upgrades for provisioned clusters. To learn more about upgrading your Apache Kafka version and high-availability best practices, see the version upgrades documentation.
What versions of Apache Kafka are supported?
All Apache Kafka versions are supported until they reach their end of support date. For more details on the end of support policy and dates, see our version support documentation.
Networking
Does Amazon MSK run in an Amazon VPC?
Yes, Amazon MSK always runs within an Amazon VPC managed by the Amazon MSK service. Amazon MSK resources will be available to your own Amazon VPC, subnet, and security group that you select when the cluster is set up. IP addresses from your VPC are attached to your Amazon MSK resources through ENIs, and all network traffic stays within the AWS network and is not accessible to the internet by default.
How will the brokers in my Amazon MSK cluster be made accessible to clients within my VPC?
The brokers in your cluster will be made accessible to clients in your VPC through ENIs appearing in your account. The security groups on the ENIs will dictate the source and type of ingress and egress traffic allowed on your brokers.
Is it possible to connect to my cluster over the public internet?
Yes, Amazon MSK offers an option to securely connect to the brokers of Amazon MSK clusters running Apache Kafka 2.6.0 or later versions over the internet. By enabling public access, authorized clients external to a private Amazon VPC can stream encrypted data in and out of specific Amazon MSK clusters. You can enable public access for MSK clusters after a cluster has been created at no additional cost, but standard AWS data transfer costs for cluster ingress and egress apply. To learn more about turning on public access, see the public access documentation.
Is the connection between my clients and an Amazon MSK cluster private?
By default, the only way data can be produced and consumed from an Amazon MSK cluster is over a private connection between your clients in your VPC and the Amazon MSK cluster. However, if you turn on public access for your MSK cluster and connect to your MSK cluster using the public bootstrap broker string, the connection—though authenticated, authorized, and encrypted—is no longer considered private. We recommend that you configure the cluster's security groups to have inbound TCP rules that allow public access from your trusted IP address and make these rules as restrictive as possible if you turn on public access.
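One way to keep such rules restrictive is to review the source ranges before you apply them. The following sketch flags inbound rules whose source CIDR covers more addresses than a chosen threshold. The rule shape, the threshold of 256 addresses, and the port number are all hypothetical choices for illustration; the addresses come from a documentation-reserved range.

```python
import ipaddress
from typing import Iterable, List, Tuple

Rule = Tuple[str, int]  # (source CIDR, TCP port); a hypothetical rule shape


def overly_broad_rules(rules: Iterable[Rule],
                       max_addresses: int = 256) -> List[Rule]:
    """Return inbound rules whose source range covers more than max_addresses."""
    flagged = []
    for cidr, port in rules:
        network = ipaddress.ip_network(cidr, strict=False)
        if network.num_addresses > max_addresses:
            flagged.append((cidr, port))
    return flagged


# The port value is a placeholder, not a statement about MSK listener ports.
inbound = [("203.0.113.10/32", 9198), ("0.0.0.0/0", 9198)]
print(overly_broad_rules(inbound))  # [('0.0.0.0/0', 9198)]
```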
How do I connect to my Amazon MSK cluster from inside the AWS network but outside the cluster’s Amazon VPC?
You can connect to your MSK cluster from any VPC or AWS account other than your MSK cluster’s Amazon VPC by turning on multi-VPC private connectivity for MSK clusters running Apache Kafka 2.7.1 or later versions. You can only turn on private connectivity after cluster creation for any of the supported authentication schemes (IAM authentication, SASL/SCRAM, and mTLS authentication). You should configure your clients to connect privately to the cluster using Amazon MSK managed VPC connections, which use PrivateLink technology to enable private connectivity. To learn more about setting up private connectivity, see the access from within AWS documentation.
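The minimum versions mentioned in this section (2.6.0 for public access and 2.7.1 for multi-VPC private connectivity) can be checked with a simple comparison. This is a minimal sketch: the helper names are hypothetical, and the parser reads only the leading numeric components of a version string, which is an assumption about how version labels are formatted.

```python
from typing import Tuple

MIN_PUBLIC_ACCESS = (2, 6, 0)
MIN_MULTI_VPC = (2, 7, 1)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a dotted version string."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(version: str, minimum: Tuple[int, ...]) -> bool:
    """True if the parsed version is at least the given minimum."""
    return parse_version(version) >= minimum


print(meets_minimum("2.6.0", MIN_PUBLIC_ACCESS))  # True
print(meets_minimum("2.6.0", MIN_MULTI_VPC))      # False
print(meets_minimum("2.8.1", MIN_MULTI_VPC))      # True
```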
Encryption
Can I encrypt data in my MSK cluster?
Yes, Amazon MSK uses Amazon Elastic Block Store (Amazon EBS) server-side encryption and AWS KMS keys to encrypt storage volumes.
Is data encrypted in transit between brokers within an MSK cluster?
Yes, by default, new clusters have in-transit encryption enabled through TLS for inter-broker communication. For provisioned clusters, you can opt out of using in-transit encryption when a cluster is created.
Is data encrypted in transit between my Apache Kafka clients and Amazon MSK?
Yes, by default, in-transit encryption is set to TLS only for clusters created from the AWS CLI or AWS Management Console. Additional configuration is required for clients to communicate with clusters using TLS encryption. For provisioned clusters, you can change the default encryption setting by selecting the TLS/plaintext or plaintext settings. Read more about Amazon MSK encryption.
Is data encrypted in transit as it moves between brokers and metadata nodes in an MSK cluster?
Yes, MSK clusters support TLS in-transit encryption between Kafka brokers and metadata nodes.
Access Management
How do I control cluster authentication and Apache Kafka API authorization?
For serverless clusters, you can use IAM access control for both authentication and authorization. For provisioned clusters, you have the following options:
- IAM access control for both AuthN/AuthZ (recommended)
- TLS certificate authentication for AuthN and access control lists for AuthZ
- SASL/SCRAM for AuthN and access control lists for AuthZ
Amazon MSK recommends using IAM access control. It is the easiest to use and, because it defaults to least privilege access, the most secure option.
How does authorization work in Amazon MSK?
If you are using IAM access control, Amazon MSK uses the policies you write and its own authorizer to authorize actions. If you are using TLS certificate authentication or SASL/SCRAM, Apache Kafka uses access control lists (ACLs) for authorization. To enable ACLs, you must enable client authentication using either TLS certificates or SASL/SCRAM.
How can I authenticate and authorize a client at the same time?
If you are using IAM access control, Amazon MSK will authenticate and authorize for you without any additional setup. If you are using TLS authentication, you can use the distinguished name (DN) of the client's TLS certificate as the principal of the ACL to authorize client requests. If you are using SASL/SCRAM, you can use the username as the principal of the ACL to authorize client requests.
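Apache Kafka ACL entries identify a client with a principal string of the form `User:<name>`. The sketch below shows how that string might be derived for the two ACL-based options; the function name and the sample identities are hypothetical.

```python
def acl_principal(auth_type: str, identity: str) -> str:
    """Build the principal string used in a Kafka ACL entry.

    auth_type: "tls" (identity is the certificate's distinguished name)
               or "scram" (identity is the SASL/SCRAM username).
    """
    if auth_type not in ("tls", "scram"):
        raise ValueError("auth_type must be 'tls' or 'scram'")
    if not identity:
        raise ValueError("identity must not be empty")
    return f"User:{identity}"


print(acl_principal("tls", "CN=client.example.com"))  # User:CN=client.example.com
print(acl_principal("scram", "alice"))                # User:alice
```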
How do I control service API actions?
You can control service API actions using IAM.
Can I enable IAM access control for an existing cluster?
Yes, you can enable IAM access control for an existing cluster from the AWS Management Console or by using the UpdateSecurity API.
Can I use IAM access control outside of Amazon MSK?
No, IAM access control is only available for MSK clusters.
How do I provide cross-account access permissions to a Kafka client in an AWS account different from my Amazon MSK account to connect privately to my MSK cluster?
You can attach a cluster policy to your Amazon MSK cluster to provide your cross-account Kafka client permissions to set up private connectivity to your Amazon MSK cluster. When using IAM client authentication, you can also use the cluster policy to granularly define the Kafka data plane permissions for the connecting client. To learn more about cluster policies, see the cluster policy documentation.
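A cluster policy is a JSON document in the usual IAM policy shape. The sketch below renders one that grants another account permission to set up private connectivity. The action list is an assumption for illustration and should be confirmed against the cluster policy documentation; the function name, cluster ARN, and account ID are placeholders.

```python
import json


def cross_account_cluster_policy(cluster_arn: str,
                                 client_account_id: str) -> str:
    """Render a cluster policy that lets another account set up private connectivity.

    Assumption: the action list is illustrative; confirm the exact actions
    against the cluster policy documentation.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": [client_account_id]},
            "Action": [
                "kafka:CreateVpcConnection",
                "kafka:GetBootstrapBrokers",
                "kafka:DescribeCluster",
                "kafka:DescribeClusterV2",
            ],
            "Resource": cluster_arn,
        }],
    }
    return json.dumps(policy, indent=2)


# Placeholder identifiers, not real resources.
print(cross_account_cluster_policy(
    "arn:aws:kafka:us-east-1:111122223333:cluster/example/abcd1234",
    "444455556666",
))
```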
Monitoring, metrics, logging, and tagging
How do I monitor the performance of my clusters or topics?
You can monitor the performance of your clusters using the Amazon MSK console, the Amazon CloudWatch console, or through JMX and host metrics using Open Monitoring with Prometheus, an open source monitoring solution.
What is the cost for the different CloudWatch monitoring levels?
The cost of monitoring your cluster using CloudWatch is dependent on the monitoring level and the size of your Apache Kafka cluster. CloudWatch charges per metric per month and includes an AWS Free Tier. See Amazon CloudWatch pricing for more information. For details on the number of metrics exposed for each monitoring level, see the Amazon MSK monitoring documentation.
What monitoring tools are compatible with Open Monitoring with Prometheus?
Tools that are designed to read from Prometheus exporters are compatible with Open Monitoring, such as Datadog, Lenses, New Relic, Sumo Logic, or a Prometheus server. For details on Open Monitoring, see the Amazon MSK Open Monitoring documentation.
How do I monitor the health and performance of clients?
You can use any client-side monitoring supported by the Apache Kafka version you are using.
Can I tag Amazon MSK resources?
Yes, you can tag Amazon MSK clusters from the AWS CLI or AWS Management Console.
How do I monitor consumer lag?
Topic-level consumer lag metrics are available as part of the default set of metrics that Amazon MSK publishes to CloudWatch for all clusters. No additional setup is required to get these metrics.
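For example, the following sketch prepares a CloudWatch query for topic-level consumer lag. The metric name (SumOffsetLag) and the dimension names are assumptions based on our understanding of the Amazon MSK metrics; the cluster, consumer group, and topic names are placeholders. Check the Amazon MSK monitoring documentation for the exact names.

```python
from datetime import datetime, timedelta, timezone


def consumer_lag_query(cluster_name, consumer_group, topic, minutes=15):
    """Build keyword arguments for CloudWatch GetMetricStatistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kafka",
        "MetricName": "SumOffsetLag",  # assumed topic-level lag metric name
        "Dimensions": [  # assumed dimension names
            {"Name": "Cluster Name", "Value": cluster_name},
            {"Name": "Consumer Group", "Value": consumer_group},
            {"Name": "Topic", "Value": topic},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,
        "Statistics": ["Maximum"],
    }


params = consumer_lag_query("example-cluster", "example-group", "orders")

# With AWS credentials configured, the query could be sent like this:
# import boto3
# datapoints = boto3.client("cloudwatch").get_metric_statistics(**params)["Datapoints"]
```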
How much does it cost to publish the consumer lag metric to CloudWatch?
Topic-level metrics are included in the default set of Amazon MSK metrics, which are free of charge. Partition-level metrics are charged according to Amazon CloudWatch pricing.
How do I access Apache Kafka broker logs?
You can enable broker log delivery for provisioned clusters. You can deliver broker logs to Amazon CloudWatch Logs, Amazon Simple Storage Service (Amazon S3), and Amazon Data Firehose. Firehose supports Amazon OpenSearch Service among other destinations. To learn how to enable this feature, see the Amazon MSK logging documentation. To learn about pricing, refer to Amazon CloudWatch Logs and Amazon Data Firehose pricing pages.
What is the logging level for broker logs?
Amazon MSK provides INFO-level logs for all brokers within a provisioned cluster.
Can I log the use of Apache Kafka resource APIs, such as create topic?
Yes, if you use IAM access control, the use of Apache Kafka resource APIs is logged to AWS CloudTrail.
Metadata Management
What is Apache ZooKeeper?
From https://zookeeper.apache.org: Apache ZooKeeper is “a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications,” including Apache Kafka.
Does Amazon MSK use Apache ZooKeeper?
Yes, Amazon MSK uses Apache ZooKeeper for metadata management. Additionally, starting from Apache Kafka version 3.7, you can create clusters in either ZooKeeper mode or KRaft mode. A cluster created with KRaft mode uses KRaft controllers for metadata management instead of ZooKeeper nodes.
What is Apache KRaft?
Apache KRaft is the consensus protocol that shifts metadata management in Kafka clusters from external Apache ZooKeeper nodes to a group of controllers within Kafka. This change allows metadata to be stored and replicated as topics within Kafka brokers, resulting in faster propagation of metadata. To learn more, refer to our Apache KRaft documentation.
Are there any API changes required to use KRaft mode on Amazon MSK compared to ZooKeeper mode?
There are no API changes required to use KRaft mode on Amazon MSK. However, if your clients still use the --zookeeper connection string today, you should update your clients to use the --bootstrap-server connection string to connect to your cluster and perform admin actions. The --zookeeper flag is deprecated in Apache Kafka version 2.5 and is removed starting with Kafka 3.0. Therefore, we recommend you use recent Apache Kafka client versions and the --bootstrap-server connection string.
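If you have scripts that still pass --zookeeper to the Kafka command line tools, the change is mechanical. The sketch below rewrites an argument list accordingly. The helper name and the host names are hypothetical placeholders.

```python
def to_bootstrap_args(args, bootstrap_servers):
    """Replace a legacy '--zookeeper <host:port>' pair with '--bootstrap-server <brokers>'."""
    result = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This is the old ZooKeeper address; drop it.
            skip_next = False
            continue
        if arg == "--zookeeper":
            result += ["--bootstrap-server", bootstrap_servers]
            skip_next = True
        else:
            result.append(arg)
    return result


legacy = ["kafka-topics.sh", "--zookeeper", "zk.example.internal:2181", "--list"]
updated = to_bootstrap_args(legacy, "b-1.example.internal:9092")
print(" ".join(updated))
```

Clusters that require TLS or client authentication typically also need a client properties file passed with --command-config.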
I have tools that connect to ZooKeeper; how will these work for KRaft clusters without ZooKeeper?
You should check that any tools you use are capable of using Kafka Admin APIs without ZooKeeper connections. See our updated documentation on using Cruise Control for KRaft mode clusters. Cruise Control has also published steps to follow to run Kafka without ZooKeeper connection.
Can I host more partitions per broker on KRaft-based clusters than ZooKeeper-based clusters?
The number of partitions per broker is the same on KRaft- and ZooKeeper-based clusters. However, KRaft allows you to host more partitions per cluster by provisioning more brokers in a cluster.
Integrations
What AWS services does Amazon MSK integrate with?
Amazon MSK integrates with the following AWS services:
- Amazon S3 using Firehose for delivering data to Amazon S3 from Amazon MSK in a no-code manner
- Amazon VPC for network isolation and security
- Amazon CloudWatch for metrics
- AWS KMS for storage volume encryption
- IAM for authentication and authorization of Apache Kafka and service APIs
- AWS Lambda for Amazon MSK event sourcing
- AWS IoT Core for IoT event sourcing
- AWS Glue Schema Registry for controlling the evolution of schemas used by Apache Kafka applications
- AWS CloudTrail for AWS API logs
- AWS Certificate Manager for private CAs used for client TLS authentication
- AWS CloudFormation for describing and provisioning Amazon MSK clusters using code
- Amazon Managed Service for Apache Flink for fully managed Apache Flink applications that process streaming data
- Amazon Managed Service for Apache Flink Studio for interactively streaming SQL on Apache Kafka
- AWS Secrets Manager for client credentials used for SASL/SCRAM authentication
Amazon MSK Serverless integrates with the following AWS services:
- Amazon S3 using Firehose for delivering data to Amazon S3 from MSK in a no-code manner
- Amazon VPC for network isolation and security
- Amazon CloudWatch for metrics
- IAM for authentication and authorization of Apache Kafka and service APIs
- AWS Glue Schema Registry for controlling the evolution of schemas used by Apache Kafka applications
- AWS CloudTrail for AWS API logs
- AWS PrivateLink for private connectivity
Replication
What is Amazon MSK Replicator?
Amazon MSK Replicator is a feature of Amazon MSK that helps customers reliably replicate data across MSK clusters in different AWS Regions (cross-Region replication) or within the same AWS Region (same-Region replication), without writing code or managing infrastructure. You can use cross-Region replication to build highly available and fault-tolerant multi-Region streaming applications for increased resiliency. You can also use cross-Region replication to provide lower latency access to consumers in different geographic regions. You can use same-Region replication to distribute data from one cluster to many clusters for sharing data with your partners and teams. You can also use same-Region replication to aggregate data from multiple clusters into one for analytics.
How do I use MSK Replicator?
To set up replication between a pair of source and target MSK clusters, you need to create a Replicator in the destination Region. To create a Replicator, you specify details that include the Amazon Resource Name (ARN) of the source and target MSK clusters and an IAM role that MSK Replicator can use to access the clusters. You will need to create the target MSK cluster if it does not already exist.
Which type of Kafka clusters are supported by MSK Replicator?
MSK Replicator supports replication across MSK clusters only. Both Provisioned and Serverless types of MSK clusters are supported. You can also use MSK Replicator to move from Provisioned to Serverless or the other way around. Other Kafka clusters are not supported.
Can I specify which topics I want to replicate?
Yes, you can specify which topics you want to replicate using allow and deny lists while creating the Replicator.
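To illustrate how allow and deny lists combine, here is a minimal sketch. It assumes the list entries are regular expressions matched against the full topic name and that the deny list takes precedence; confirm the exact matching rules in the MSK Replicator documentation. The topic names are made up.

```python
import re


def select_topics(topics, allow_patterns, deny_patterns):
    """Keep topics that match an allow pattern and no deny pattern."""
    def matches(name, patterns):
        return any(re.fullmatch(pattern, name) for pattern in patterns)

    return [
        topic for topic in topics
        if matches(topic, allow_patterns) and not matches(topic, deny_patterns)
    ]


topics = ["orders", "orders.dlq", "payments", "internal.metrics"]
selected = select_topics(
    topics,
    allow_patterns=[r"orders.*", r"payments"],
    deny_patterns=[r".*\.dlq"],
)
print(selected)  # ['orders', 'payments']
```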
Does MSK Replicator replicate topic settings and consumer group offsets?
Yes, MSK Replicator automatically replicates the necessary Kafka metadata, such as topic configuration, ACLs, and consumer group offsets so that consuming applications can resume processing seamlessly after failover. You can choose to turn off one or more of these settings if you only want to replicate the data. You can also specify which consumer groups you want to replicate using allow or deny lists while creating the Replicator.
Do I need to scale the replication when my ingress throughput changes?
No, MSK Replicator automatically deploys, provisions, and scales the underlying replication infrastructure to support changes in your ingress throughput.
Can I replicate data across MSK clusters in different AWS accounts?
No, MSK Replicator only supports replication across MSK clusters in the same AWS account.
How can I monitor the replication?
You can use CloudWatch in the destination Region to view metrics for ReplicationLatency, MessageLag, and ReplicatorThroughput at a topic and aggregate level for each Replicator at no additional charge. Metrics are visible under ReplicatorName in the AWS/Kafka namespace. You can also see the ReplicatorFailure, AuthError, and ThrottleTime metrics to check if your Replicator is running into any issues.
How can I use replication to increase the resiliency of my streaming application across Regions?
You can use MSK Replicator to set up active-active or active-passive cluster topologies to increase resiliency of your Kafka application across Regions. In an active-active setup, both MSK clusters are actively serving reads and writes. Comparatively, in an active-passive setup, only one MSK cluster at a time is actively serving streaming data while the other cluster is on standby.
Can I use MSK Replicator to replicate data from one cluster to multiple clusters or replicate data from many clusters to one?
Yes. By creating a different Replicator for each source and target cluster pair, you can replicate data from one cluster to multiple clusters or replicate data from many clusters to one.
How does MSK Replicator connect to the source and target MSK clusters?
MSK Replicator uses IAM access control to connect to your source and target clusters. You need to turn on your source and target MSK clusters for IAM access control to create a Replicator. You can continue to use other authentication methods including SASL/SCRAM and mTLS at the same time for your clients since Amazon MSK supports multiple authentication methods simultaneously.
How much replication latency should I expect with MSK Replicator?
MSK Replicator replicates data asynchronously. Replication latency varies based on many factors, including the network distance between the Regions of your MSK clusters, your source and target clusters’ throughput capacity, and the number of partitions on your source and target clusters.
Can I keep topic names the same with MSK Replicator?
No, MSK Replicator creates new topics in the target cluster with an auto-generated prefix added to the topic name. For instance, MSK Replicator will replicate data in a topic named "topic" on the source cluster to a new topic in the target cluster called <sourceKafkaClusterAlias>.topic. MSK Replicator does this to distinguish topics that contain data replicated from the source cluster from other topics in the target cluster and to avoid data being circularly replicated between the clusters. You can find the prefix that will be added to the topic names in the target cluster under the sourceKafkaClusterAlias field using the DescribeReplicator API or the Replicator details page on the Amazon MSK console.
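The naming rule can be sketched as follows. The alias value is a made-up placeholder, and the helper that detects already-replicated topics is only our illustration of why a prefix helps avoid circular replication, not a description of the service's internals.

```python
def replicated_topic_name(source_cluster_alias: str, topic: str) -> str:
    """Name of the topic that is created in the target cluster."""
    return f"{source_cluster_alias}.{topic}"


def is_replicated_from(topic: str, source_cluster_alias: str) -> bool:
    """True if the topic name carries the given source cluster's prefix."""
    return topic.startswith(source_cluster_alias + ".")


alias = "sourceA"  # placeholder for the sourceKafkaClusterAlias value
target_topic = replicated_topic_name(alias, "orders")
print(target_topic)  # sourceA.orders
```

A replication flow in the opposite direction could use a check like is_replicated_from to skip topics that originated on the other cluster.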
Can I replicate existing data on the source cluster?
Yes. By default, when you create a new Replicator, it starts replicating data from the tip of the stream (latest offset) on the source cluster. Alternatively, if you want to replicate existing data, you can configure a new Replicator to start replicating data from the earliest offset in the source cluster topic partitions.
Can replication result in throttling consumers on the source cluster?
Since MSK Replicator acts as a consumer for your source cluster, it is possible that replication causes other consumers to be throttled on your source cluster. This depends on how much read capacity you have on your source cluster and throughput of the data you are replicating. We recommend that you provision identical capacity for your source and target clusters and account for the replication throughput while calculating how much capacity you need. You can also set Kafka quotas for the Replicator on your source and target clusters to control how much capacity the Replicator can use.
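As a back-of-the-envelope check, you can add the replicated throughput to the read load that your existing consumers generate. The sketch below assumes the Replicator reads each replicated byte once from the source cluster; all numbers are hypothetical and the function names are ours.

```python
def required_read_mbps(consumer_read_mbps: int, replicated_ingress_mbps: int) -> int:
    """Total read load on the source cluster once replication is running."""
    # Assumption: the Replicator reads every replicated byte exactly once.
    return consumer_read_mbps + replicated_ingress_mbps


def read_headroom_mbps(read_capacity_mbps: int, consumer_read_mbps: int,
                       replicated_ingress_mbps: int) -> int:
    """Remaining read capacity; a negative value suggests consumers may be throttled."""
    return read_capacity_mbps - required_read_mbps(consumer_read_mbps, replicated_ingress_mbps)


# Hypothetical cluster: 200 MB/s read capacity, 120 MB/s consumer reads, 50 MB/s replicated.
headroom = read_headroom_mbps(200, 120, 50)
print(headroom)  # 30
```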
Can I compress data before writing to the target cluster?
Yes, you can specify your choice of compression codec while creating the Replicator amongst None, GZIP, Snappy, LZ4, and ZSTD.
Scaling
How can I scale up storage in my cluster?
You can scale up storage in your provisioned cluster running on Standard brokers using the AWS Management Console or AWS CLI. You can also create an auto scaling storage policy using the AWS Management Console or by creating an AWS Application Auto Scaling policy using the AWS CLI or APIs. Tiered storage on Standard brokers allows you to store virtually unlimited data on your cluster without having to add brokers for storage. With Express brokers, you do not need to provision or manage storage, and you also have access to virtually unlimited storage. In serverless clusters, storage is scaled seamlessly based on your usage.
How does tiered storage work?
Apache Kafka stores data in files called log segments. As each segment is complete, based on the size configured at cluster or topic level, it is copied to the low-cost storage tier. Data is held in performance-optimized storage for a specified retention time or size, and then it’s deleted. There is a separate time and size limit setting for the low-cost storage, which is longer than the primary storage tier. If clients request data from segments stored in the low-cost tier, the broker will read the data from it and serve the data in the same way as if it’s being served from the primary storage.
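The lifecycle described above can be modeled roughly as follows. This is a simplified, time-only sketch (size limits are ignored), and the class, function, and tier names are ours rather than Kafka or Amazon MSK identifiers.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    age_ms: int   # time since the segment's newest record was written
    closed: bool  # True once the segment is complete (rolled)


def tiers_holding(segment: Segment, local_retention_ms: int, total_retention_ms: int) -> set:
    """Return the storage tiers that currently hold the segment."""
    if not segment.closed:
        return {"primary"}  # the active segment has not been copied yet
    if segment.age_ms >= total_retention_ms:
        return set()  # expired from the low-cost tier as well
    tiers = {"low-cost"}  # complete segments are copied to the low-cost tier
    if segment.age_ms < local_retention_ms:
        tiers.add("primary")  # still within the primary retention window
    return tiers


HOUR = 3_600_000
print(tiers_holding(Segment(age_ms=2 * HOUR, closed=True), 6 * HOUR, 720 * HOUR))
print(tiers_holding(Segment(age_ms=48 * HOUR, closed=True), 6 * HOUR, 720 * HOUR))
```

In this model a recently completed segment exists in both tiers for a while, and reads of older data are served from the low-cost tier only.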
Can I scale the number of brokers in an existing cluster?
Yes, you can choose to increase or decrease the number of brokers for provisioned MSK clusters.
Can I scale the broker size in an existing cluster?
Yes, you can choose to scale to a smaller or larger broker type on your provisioned MSK clusters.
How do I balance partitions across brokers?
You can use Cruise Control to automatically rebalance partitions and manage I/O heat. See the Cruise Control documentation for more information. Alternatively, you can use the Apache Kafka command line tool kafka-reassign-partitions.sh to reassign partitions across brokers. In serverless clusters, Amazon MSK automatically balances partitions.
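If you reassign partitions manually, kafka-reassign-partitions.sh reads the desired placement from a JSON file. The sketch below generates such a file with a naive round-robin placement. It is an illustration only, not the algorithm Cruise Control uses, and the topic name and broker IDs are placeholders.

```python
import json


def round_robin_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Spread partition replicas across brokers in simple round-robin order."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    partitions = []
    for partition in range(num_partitions):
        replicas = [
            broker_ids[(partition + offset) % len(broker_ids)]
            for offset in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": partition, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


plan = round_robin_reassignment("orders", num_partitions=6, broker_ids=[1, 2, 3],
                                replication_factor=2)
print(json.dumps(plan, indent=2))
```

The resulting JSON can be saved to a file and passed to the tool with --reassignment-json-file together with --execute.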
Pricing and availability
How does Amazon MSK pricing work?
Pricing depends on the resources you create. You can learn more by visiting Amazon MSK pricing.
Do I pay for data transfer as a result of data replication?
No, in-cluster data transfer is included with the service at no additional charge.
In what Regions is Amazon MSK available?
For information about the Regions in which Amazon MSK is available, visit the AWS Regions table.
How does data transfer pricing work?
With provisioned clusters, you will pay standard AWS data transfer charges for data transferred in and out of an MSK cluster. You will not be charged for data transfer within the cluster in a Region, including data transfer between brokers and data transfer between brokers and metadata management nodes.
With serverless clusters, you will pay standard AWS data transfer charges for data transferred to or from another Region and for data transferred out to the public internet.
Does Amazon MSK offer Reserved Instance pricing?
No, not at this time.
Compliance
What compliance programs are in scope for Amazon MSK?
Amazon MSK is compliant with or eligible for the following programs:
- HIPAA eligible
- PCI
- ISO
- SOC 1, 2, and 3
For a complete list of AWS services and compliance programs, see AWS Services in Scope by Compliance Program.
Service Level Agreement
What does the Amazon MSK SLA guarantee?
Our Amazon MSK SLA guarantees a Monthly Uptime Percentage of at least 99.9% for Amazon MSK (including MSK Serverless and MSK Connect).
How do I know if I qualify for an SLA Service Credit?
If Multi-AZ deployments on Amazon MSK have a Monthly Uptime Percentage of less than 99.9% during any monthly billing cycle, you are eligible for an SLA credit for Amazon MSK under the Amazon MSK SLA.
For full details on all of the terms and conditions of the SLA as well as details on how to submit a claim, see the Amazon MSK SLA page.
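To get a feel for what 99.9% means in practice, the sketch below converts an uptime percentage into minutes per month. It is plain percentage arithmetic over an assumed 30-day month; the SLA page defines how Monthly Uptime Percentage is actually measured, so treat this only as an approximation.

```python
def allowed_downtime_minutes(days_in_month: int, uptime_basis_points: int) -> float:
    """Minutes of downtime per month that still meet the uptime target.

    The uptime is given in basis points (99.9% == 9990) to keep the arithmetic exact.
    """
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (10_000 - uptime_basis_points) / 10_000


minutes = allowed_downtime_minutes(days_in_month=30, uptime_basis_points=9990)
print(minutes)  # 43.2
```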