Security¶
In production environments, security features such as authentication, authorization and transport encryption are often desired. The Fast Data CSD supports all Kafka authentication scenarios as well as transport encryption. Authorization isn’t fully tested yet.
The CSD integrates deeply with Cloudera Manager and takes advantage of the built-in facilities to configure Kerberos and SSL with as few steps as possible.
Important
In order to utilize the security features successfully, we strongly advise cluster administrators to familiarize themselves with this document. We automate most of the configuration, but an administrator needs to know of certain workarounds and pitfalls in order to get the most out of the CSD.
Contents
Security Types and Status¶
Authentication¶
Kerberos is the main authentication mechanism supported by both Kafka Brokers and Zookeeper [1]. Kafka Brokers also permit authentication via SSL client certificates.
Kafka REST Proxy and Schema Registry support authentication only via SSL client certificates. Kafka Connect does not support client authentication yet.
It is important to understand that as these components interact with each other, they also are clients themselves. Whilst their server part may support authentication, their client part doesn’t always follow [2]. In later sections every component’s setup will be explained in detail.
[1] | As Kafka moves away from Zookeeper for storing its consumer offsets, it becomes much less taxing to the Zookeeper service. Thus we chose to utilize Cloudera’s Zookeeper for Kafka’s coordination and service discovery tasks. An important difference when it comes to security is that Cloudera’s Zookeeper as of CDH 5.9 does not support SSL certificates for authentication but only Kerberos. SSL authentication was only added in the current, under-development iteration of the software (Zookeeper 3.5), which is still in alpha state and, as is only logical, should be avoided for use in production. |
[2] | We will see later on that Kafka REST Proxy supports neither authentication nor transport encryption when interacting with the brokers. |
Authorization¶
Authorization is handled by ACLs in ZooKeeper and Kafka Topics, enforced by the brokers. Although you will usually deal with ACLs in the brokers, Kafka topics’ settings are stored inside ZooKeeper, thus knowledge of and occasionally intervention in ZooKeeper ACLs is a prerequisite. Authorization requires authentication to the brokers and Zookeeper.
The rest of the stack has limited support for ACLs as clients to ZooKeeper and the Brokers.
Although we include options for ACLs and perform some ACL management, we do not support ACLs at the moment due to their complex nature. Once we test them adequately we will provide a guide on how to safely use them and administer your services. Probably some manual intervention —documented here— will be needed to run an ACL setup.
Transport Security¶
Transport security is handled by TLS/SSL listeners on the Brokers, the Schema Registry and the Kafka REST Proxy. Kafka Connect does not offer SSL for its server part but can use SSL to connect to the brokers and the Schema Registry.
As of Confluent Platform 3.1.2 some workarounds and plaintext listeners are still needed, especially for the Schema Registry and Kafka Brokers (if Kafka REST is used). They are documented in the schema registry and broker sections below.
Authentication Support Matrix¶
These are the possible authentication scenarios Confluent Platform supports. All can be set up and managed within the Fast Data CSD. Please note that TLS always provides transport encryption and optionally authentication via client SSL certificates.
Client ▼ \ Server ► | Zookeeper  | Broker          | Schema Registry | Kafka Connect      | REST Proxy
--------------------+------------+-----------------+-----------------+--------------------+---------------
Broker              | NONE, SASL | NONE, SASL, TLS | not applicable  | not applicable     | not applicable
Schema Registry     | NONE, SASL | NONE, SASL, TLS | NONE [3]        | not applicable     | not applicable
Connect             | NONE, SASL | NONE, SASL, TLS | NONE, TLS       | not applicable [4] | not applicable
REST Proxy          | NONE, SASL | NONE [5]        | NONE, TLS       | not applicable     | not applicable
Clients             | NONE, SASL | NONE, SASL, TLS | NONE, TLS       | NONE               | NONE, TLS
[3] | A shortcoming of the current Schema Registry: it cannot exchange data between its instances via TLS. |
[4] | Kafka Connect workers on distributed mode do not exchange data, only configuration via zookeeper and kafka, so they don’t speak directly to each other. |
[5] | Kafka REST still uses the old consumer. This means that its consumer part does not support any type of security. Its producer part and its logic part support all types of security. |
Kerberos (system-wide)¶
Kerberos (SASL/GSSAPI) can be used to authenticate to the brokers and Zookeeper; it is well supported in the Fast Data CSD. You can enable it through the service-wide setting kerberos.auth.enable.
This setting by itself instructs Cloudera Manager to issue keytabs to the Fast Data service roles and load them. Keytabs are files that contain Kerberos principals and encrypted keys. They are used by client or server applications to authenticate, or to provide authentication, via a Kerberos authentication service.
It also makes the Brokers require SASL authentication. Since by default no Fast Data role is configured to authenticate to the Brokers —not even the brokers themselves—, just enabling Kerberos will make your service fail to start. In the next sections we will show how to set up role-specific authentication. Whilst this behaviour may seem counterintuitive at first, it was designed to ease the Fast Data CSD usage and facilitate setup scenarios where not all components of the stack use authentication. Only in the current iteration of Confluent Platform is authentication supported by all roles, but there are still some valid reasons to permit unauthenticated access to trusted clients.
Lastly, the provided credentials can be used to authenticate to Zookeeper, which, as you may notice in the figure above, is the default behaviour for the CSD once Kerberos is enabled.
Brokers’ Security¶
The brokers support both SASL (Kerberos) and TLS/SSL client certificates for authentication, as well as TLS/SSL for transport encryption. As of Confluent Platform 3.1.2 all components of the stack except Kafka REST Proxy can interact with secured brokers. The workaround provided is to add unsecured listeners that you should only make available to the REST Proxy service via network configuration.
Authentication and Transport Security¶
Kerberos Authentication¶
Once you enable Kerberos (system-wide) for your Fast Data service, your brokers are automatically set to require authentication via SASL. In Kafka terminology this translates to either a SASL_PLAINTEXT or a SASL_SSL listener [6], set automatically by the CSD.
Every other component, even the brokers themselves, is set to use PLAINTEXT for communicating with the brokers. There are two possible approaches from here: you can either enable a plaintext listener in your brokers, which the rest of the components will pick up automatically, or set them up to use Kerberos authentication (except Kafka REST, which does not support authentication to the brokers). The latter is shown in the Inter Broker Communication, Schema Registry Client, Kafka REST Proxy Client and Kafka Connect Client subsections. The former is set via the service-wide setting force.plaintext.listener. You can read more about it at Force Plaintext Listener. If you want to test your setup, see Verify Kerberos.
[6] | Please refer to Kafka Listeners for more information about the listeners in Kafka. All our roles support listeners’ override via the safety valve to complement advanced deployment scenarios. |
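As a sketch of what a Kafka client outside the CSD would use to connect to a SASL listener, the properties below follow the standard Kafka client configuration (the service name shown is the usual default, but verify it against your deployment):

```properties
# Hypothetical client.properties for connecting to a SASL_PLAINTEXT listener.
security.protocol=SASL_PLAINTEXT
# Kerberos service principal name the brokers run under; "kafka" is the common default.
sasl.kerberos.service.name=kafka
```

The client also needs a JAAS file with its Kerberos credentials, passed for example via KAFKA_OPTS=-Djava.security.auth.login.config=/path/to/jaas.conf (see Verify Kerberos for a sample JAAS file).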
TLS/SSL for Transport Encryption¶
As TLS/SSL for transport encryption is a prerequisite for SSL client authentication, we will present it first. To enable TLS for the brokers, you should use the broker-specific settings ssl_enabled, ssl.keystore.location, ssl.keystore.password and ssl.key.password. These settings are the same for every Cloudera-managed, JVM-based role that can act as a TLS server, and you are probably already familiar with them.
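In plain Kafka terms, these CSD settings map to the standard broker SSL properties; a sketch with placeholder paths and passwords:

```properties
# Broker-side TLS settings (placeholders; set via the CSD fields above).
ssl.keystore.location=/path/to/kafka.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
```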
Once TLS/SSL is setup, the brokers will use the TLS/SSL port instead of the default TCP Port and, depending on whether Kerberos is enabled, may require authentication via GSSAPI/Kerberos.
As with Kerberos, you have to set your roles to use the appropriate protocol to connect to the brokers —SSL or SASL_SSL in this case. You also have to provide them with a truststore that contains your Certificate Authorities’ certificates in order to accept the certificate provided by the brokers. Please see Inter Broker Communication, Schema Registry Client, Kafka REST Proxy Client, and Kafka Connect Client.
Important
TLS/SSL can have a substantial impact on the performance of your Kafka cluster, possibly 30% or even more. You may want to see Force Plaintext Listener if there are roles or clients in your setup that can skip transport encryption.
SSL Client Authentication¶
Once you have enabled TLS/SSL for transport encryption, you can enable client authentication as well. The necessary configuration options are ssl.client.auth, which sets the brokers to require authentication via SSL, and ssl.truststore.location and ssl.truststore.password, which provide the brokers with a list of Certificate Authorities that are trusted to sign client certificates.
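Sketched in standard broker properties (placeholder paths and passwords):

```properties
# Require SSL client certificates on the brokers.
ssl.client.auth=required
# CAs trusted to sign client certificates.
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=changeit
```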
You have to set up the rest of the roles to use client certificates. Please refer to Schema Registry Client, Kafka REST Proxy Client, Kafka Connect Client subsections for more information. If you want to test your setup, see Verify Brokers SSL Authentication.
Force Plaintext Listener¶
The Fast Data CSD offers the option for an extra plaintext listener on the brokers; that is a listener with no security whatsoever. Such a setup may be justified when the unsecured listener is kept secured within your cluster via a firewall or other network configuration, so that only the other Fast Data roles or trusted clients have access to it.
Important
If you enable the force plaintext listener port without having set SASL or SSL for your brokers, your brokers will fail to start due to being set with two plaintext listeners.
The plaintext listener is mandatory if you want to use Kafka REST Proxy with secured brokers. It is especially useful when you enable transport encryption (TLS), which can have a significant impact on performance. The default listener still works on its default port and can be used by clients —or certain roles— to access the cluster in a secure way.
Other uses include cases where some trusted clients —e.g. your applications— do not support the authentication mechanism you have enabled in your brokers, or where it is difficult to implement support for the authentication in your CI/CD system.
Inter Broker Communication¶
You can set the protocol the brokers will use to communicate with each other via
security.inter.broker.protocol
.
The brokers communicate via the same listeners that are available to clients. Due to replication, they may exchange large quantities of data at high throughput. It is often advised to avoid TLS/SSL for inter-broker communication due to the performance impact. If you have followed the Force Plaintext Listener section, you can safely choose PLAINTEXT in the setting above and the brokers will use the correct port.
Should you choose to utilize SSL between the brokers, you should also set a SSL truststore file, so the brokers can accept each other’s certificate.
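As a sketch, the two options look like this in standard broker properties (the truststore lines are placeholders and are needed only for the SSL case):

```properties
# Option 1: plaintext inter-broker traffic (see Force Plaintext Listener).
security.inter.broker.protocol=PLAINTEXT

# Option 2: SSL inter-broker traffic; the brokers must trust each other's CA.
#security.inter.broker.protocol=SSL
#ssl.truststore.location=/path/to/truststore.jks
#ssl.truststore.password=changeit
```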
If you want to test your setup, see Verify Kerberos or Verify Brokers SSL Authentication.
Schema Registry Client¶
For the Schema Registry you should choose the appropriate protocol for connecting to the brokers via the kafkastore.security.protocol setting. GSSAPI/Kerberos authentication does not require any setting other than this.
In order to be able to authenticate via client certificates, you have to enable SSL for the Schema Registry and provide it with a keystore that contains the certificate to use for authentication. Also in any SSL scenario (SSL, SASL_SSL), you should set up a truststore, so the Schema Registry can verify the brokers’ certificates.
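A sketch of a Schema Registry configuration authenticating to the brokers with a client certificate over SSL (property names follow the Confluent Schema Registry kafkastore.* convention; paths and passwords are placeholders):

```properties
kafkastore.security.protocol=SSL
# Client certificate used to authenticate to the brokers.
kafkastore.ssl.keystore.location=/path/to/registry.keystore.jks
kafkastore.ssl.keystore.password=changeit
kafkastore.ssl.key.password=changeit
# CA certificates used to verify the brokers' certificates.
kafkastore.ssl.truststore.location=/path/to/truststore.jks
kafkastore.ssl.truststore.password=changeit
```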
Important
An uncommon but possible SSL setup is to add your organization’s CA certificates directly to your servers’ default truststore. Usually in this case you can omit setting the SSL truststore explicitly for each role. Schema Registry is an exception: you explicitly need to set the truststore and enable SSL in order for it to be loaded. Otherwise, it will fail to start.
Kafka REST Proxy Client¶
Kafka REST Proxy still uses the old consumer, so it supports neither authentication nor transport security to the brokers. Please see Force Plaintext Listener about enabling a plaintext listener to use with REST Proxy. We follow the development of Confluent Platform closely and will release an update once this issue is addressed.
Important
You must always set Kafka REST Proxy’s security.protocol
to PLAINTEXT. The rest of the options are unsupported for now
unless you only plan to produce messages with it.
Kafka Connect Client¶
For Kafka Connect workers you should choose the appropriate protocol for connecting to the brokers via the security.protocol setting. GSSAPI/Kerberos authentication does not require any setting other than this.
In order to be able to authenticate via client certificates, you have to enable SSL for Kafka Connect workers and provide them with a keystore that contains the certificate to use for authentication. Also in any SSL scenario (SSL, SASL_SSL), you should set up a truststore, so the workers can verify the brokers’ certificates.
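A sketch of the corresponding Connect worker properties (placeholders; in some Confluent Platform versions the same settings may also need to be repeated under the producer. and consumer. prefixes for the connectors’ internal clients):

```properties
security.protocol=SSL
# Client certificate used to authenticate to the brokers.
ssl.keystore.location=/path/to/connect.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
# CA certificates used to verify the brokers' certificates.
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=changeit
```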
Schema Registry¶
Schema Registry provides support for TLS/SSL to its server component, either just for server validation and transport encryption, or additionally as an authentication mechanism via client certificates.
TLS/SSL for Transport Encryption¶
To set up Schema Registry to serve its endpoint via TLS/SSL, enable SSL and provide an SSL keystore. Then set the ssl_listener option.
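In plain Schema Registry terms this results in a configuration along these lines (port, paths and passwords are placeholders):

```properties
listeners=https://0.0.0.0:8081
ssl.keystore.location=/path/to/registry.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
```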
Secondary HTTP Listener¶
An important issue even with the latest iteration of the Schema Registry is that it does not support https for communication between its instances [GHI_1]. To work around this shortcoming, by default the Fast Data CSD enables a second, plaintext listener at port 18081. You may change this behaviour using the ssl_add_http_listener and ssl_secondary_http_listener_port options. As you will probably run multiple Schema Registry instances, you will need this option enabled, so please make sure the plain http port is secured via a firewall or other network configuration and accessible only by trusted servers, especially the ones that run Schema Registry instances.
[GHI_1] | Github Issue (confluentinc/schema-registry/issues/386) |
Client Authentication¶
Once TLS is set up, client authentication only needs the ssl.client.auth option and an SSL truststore, so the server knows which certificates it can trust. Client authentication is not yet supported by Kafka REST Proxy, Kafka Connect, or the Schema Registry itself (communication between Registry instances).
If you want to test your setup, see Verify HTTPS Endpoints Authentication.
Note
If you also use the Fast Data Tools CSD, please note that Schema Registry UI does not yet support authentication via client certificate to the Registry. You can use the secondary http listener for it instead or use the Schema Registry’s safety valve to override the listeners’ string and add a listener only for the UI.
Kafka REST Proxy Client¶
REST Proxy uses the Schema Registry. It is automatically set up with the primary listener of each Registry instance. If you enable SSL for the Schema Registry server, then REST Proxy will use the https port to communicate with it.
You need to set a SSL truststore for Kafka REST Proxy so that it can trust the certificate provided by the Registry.
Kafka REST Proxy does not support authentication to the registry. If you enable the Schema Registry’s secondary http listener, it will autodetect it and use that. Please see Secondary HTTP Listener for more information.
Kafka Connect Client¶
Kafka Connect workers use the Schema Registry. They are automatically set up with the primary listener of each Registry instance. If you enable SSL for the Schema Registry server part, then Connect workers will use the https port to communicate with it —ignoring the possible http secondary listener.
You must set a SSL truststore for Connect workers to be able to verify the certificate provided by the Schema Registry.
Kafka Connect does not support authentication to the Registry. If you enable the Schema Registry’s secondary http listener, it will autodetect it and use that. Please see Secondary HTTP Listener for more information.
Kafka REST Proxy¶
Kafka REST Proxy in its latest iteration supports TLS/SSL for both transport encryption/server validation and client authentication.
TLS/SSL Transport Encryption¶
To enable https for the REST Proxy, enable SSL and provide the role with a keystore that contains the key-certificate pair to use. Then enable the ssl_listener option.
Client Authentication¶
Once transport encryption is enabled, you can set up client authentication via the ssl.client.auth option and provide an SSL truststore with the certificates of the CAs you trust to sign client certificates.
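A sketch of the resulting REST Proxy properties (placeholders; note that, unlike the brokers, the REST Proxy’s ssl.client.auth option is a boolean):

```properties
listeners=https://0.0.0.0:8082
ssl.keystore.location=/path/to/restproxy.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
# Require client certificates signed by a trusted CA.
ssl.client.auth=true
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=changeit
```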
The Proxy isn’t used by other roles, so you don’t have to setup anything else within Fast Data. If you use the Fast Data Tools, please read the note below. If you want to test your setup, see Verify HTTPS Endpoints Authentication.
Note
If you also use the Fast Data Tools CSD, please note that Kafka Topics UI does not yet support authentication via client certificate to the REST Proxy. You can use Kafka REST Proxy’s safety valve to add a second, http listener, overriding the auto-creation of the listeners string. For example, to keep your default https listener at port 8082 and add a second, http listener at port 31443, add to the Proxy’s safety valve:
listeners=https://0.0.0.0:8082,http://0.0.0.0:31443
You can use the http listener for Kafka Topics UI. Please make sure it isn’t accessible from untrusted resources.
Kafka Connect¶
Kafka Connect workers do not yet support security for their REST endpoint. We follow Kafka’s development closely and will adopt security once it is implemented in Kafka Connect.
Authorization¶
To be published.
Additional Topics¶
Kafka Listeners¶
Kafka Brokers use the listeners string to determine on which address and port the service should listen for client requests, as well as the protocol and possible authentication it will require. The protocol and the authentication are set by one string that may take one of four values:
PLAINTEXT
- No transport encryption, nor authentication is required.
SASL_PLAINTEXT
- No transport encryption. Authentication via SASL (only GSSAPI/Kerberos method supported) is required.
SASL_SSL
- Transport encryption via TLS/SSL. Authentication via SASL (only GSSAPI/Kerberos method supported) is required.
SSL
- Transport encryption via TLS/SSL. Authentication optionally set via client certificates.
A typical listener string to instruct the brokers to listen on all addresses at port 9092 without auth or encryption, would look like:
listeners=PLAINTEXT://:9092
More than one listener may be set, if the need arises:
listeners=SASL_PLAINTEXT://:9092,PLAINTEXT://10.10.0.1:19092,SSL://:9093
In the CSD we set the listeners automatically, given your choices for kerberos.auth.enable, ssl_enabled and force.plaintext.listener. Should you need a more advanced scenario, you may set the listeners’ string in the brokers’ safety valve, which overrides the automatic settings.
As of Confluent Platform 3.1.1 the listeners string is supported by all roles, albeit with different protocol support. Also the manual override via the safety valve is supported in all roles.
Verify Kerberos¶
To verify that Kerberos authentication works, from a node of your cluster you may obtain a ticket-granting ticket from Kerberos and run a short performance test.
First create a JAAS configuration file that contains your Kerberos principal:
KafkaClient {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true
serviceName="kafka"
principal="user@DOMAIN";
};
Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true
principal="user@DOMAIN";
};
Then proceed to obtain your TGT via kinit [7], create a test topic [8] and run the performance test:
$ kinit user@DOMAIN
$ kafka-topics \
--zookeeper zk:2181/fastdata \
--create \
--topic test-krb \
--partitions 9 \
--replication-factor 3
$ KAFKA_OPTS=-Djava.security.auth.login.config=/path/to/jaas.conf kafka-producer-perf-test \
--topic test-krb \
--throughput 1000 \
--record-size 1000 \
--num-records 5000 \
--producer-props \
bootstrap.servers=broker1:9092,broker2:9092,broker3:9092 \
security.protocol=SASL_PLAINTEXT
$ kafka-topics --zookeeper zk:2181/fastdata --delete --topic test-krb
[7] | The procedure to obtain a Kerberos TGT is outside of the scope of this
document. Usually the kinit command is enough. Consult your
cluster administrator in case it doesn’t work. |
[8] | You may notice that instead of exporting the KAFKA_OPTS variable
we set it explicitly only for kafka-producer-perf-test. Be
careful: if you set it for your shell, then kafka-topics
will use Kerberos to log in to Zookeeper and will write SASL-protected
topics that your brokers can’t access. |
Verify Brokers SSL Authentication¶
As with Kerberos, you can use an SSL keystore and truststore and run a small performance test to verify your brokers work as expected.
Once you obtain your keystore and truststore, create a topic and run the test:
$ kafka-topics \
--zookeeper zk:2181/fastdata \
--create \
--topic test-tls \
--partitions 9 \
--replication-factor 3
$ kafka-producer-perf-test \
--topic test-tls \
--throughput 1000 \
--record-size 1000 \
--num-records 5000 \
--producer-props \
bootstrap.servers=broker1:9093,broker2:9093,broker3:9093 \
security.protocol=SSL \
ssl.keystore.location=/path/to/keystore.jks \
ssl.keystore.password=changeit \
ssl.key.password=changeit \
ssl.truststore.location=/path/to/truststore.jks \
ssl.truststore.password=changeit
$ kafka-topics --zookeeper zk:2181/fastdata --delete --topic test-tls
Verify HTTPS Endpoints Authentication¶
To verify your https listeners from Schema Registry or Kafka REST Proxy you will need a key-certificate pair in PEM format [9] and curl.
To verify Schema Registry once you obtain your PEM files, first verify that it doesn’t permit connections without authentication:
$ curl https://schema.registry.url:8081
Then use your PEM keys to verify that authentication works:
$ curl --cert certificate.pem --key key.pem https://schema.registry.url:8081
The same procedure applies to Kafka REST Proxy:
$ curl https://kafka.rest.url:8082
$ curl --cert certificate.pem --key key.pem https://kafka.rest.url:8082
[9] | It is out of the scope of this document to explain the procedure to create PEM credentials. Please consult with your cluster administrator. |