Here you can find information about installing multiple instances of Fast Data on the same cluster, as well as how to uninstall the service and clean up any remaining files and Zookeeper znodes.
We took special care to enable you to run more than one Fast Data instance on your Cloudera cluster. Whilst it may seem counterintuitive at first, there are legitimate reasons for such a setup, for example security and resource separation. Whilst Kafka supports ACLs, for simple tasks such as separating two teams it may be much easier to set up and manage a second Kafka cluster. Resource separation could mean using different nodes or different disks per cluster.
In order to add a second (or… nth) instance you have to be very careful with four configuration options, because they can damage your existing cluster if shared between different Kafka clusters. The last one (ports) won’t damage your cluster, but will prevent the new instance from starting until corrected.
Kafka Brokers’ znode. This indicates the znode under which the brokers keep the cluster information and should be unique for each cluster. The default is /fastdata.
Logs ID. A unique string that is appended to the log filenames.
Kafka Brokers’ data directories (log.dirs), the directories where the brokers store their data. Change them to something unique amongst the instances.
Schema Registry’s znode. As with the brokers’ znode, change it to something different from the default, /schema_registry.
Ports. You have to set different ports for the new instance. In Cloudera Manager visit the Fast Data service page, go to Configuration, select Ports and change at least the ones you use (depending on the roles in use). Remember to adjust the JMX ports as well.
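As an illustration, the settings for a second instance might end up looking like the fragment below. All keys and values here are placeholders, not the actual configuration names (the real options are set through the service’s Configuration page in Cloudera Manager); the point is that each value must differ from the first instance’s:

```properties
# Hypothetical values for a second Fast Data instance -- adjust to your setup.
# Brokers' znode (default is /fastdata), must be unique per cluster:
kafka.zookeeper.znode=/fastdata2
# Logs ID appended to the log filenames:
logs.id=fastdata2
# Broker data directories (log.dirs), unique per instance:
log.dirs=/data2/fastdata-kafka
# Schema Registry znode (default is /schema_registry):
schema.registry.znode=/schema_registry2
# Port overrides -- change every port the roles use, JMX ports included:
broker.port=9093
broker.jmx.port=9394
```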
If the Cloudera cluster doesn’t have internet access, an alternative installation method should be used. Such methods are documented in detail on Cloudera’s website.
For these methods, the cluster administrator has to manually download our parcels and make them available to the cluster.
Our parcels can be found at our archive. The version downloaded and installed should match the version of the CSD. As an example, if the 3.2 CSD is used, the latest 3.2.x KAFKA_CONFLUENT parcel should be used. This holds true for our other CSDs (Tools and Monitoring) as well. Special care is needed for STREAM_REACTOR, whose version should match the KAFKA_CONFLUENT version exactly (e.g., for Kafka 3.2.2, a Stream Reactor build for 3.2.2 should be used).
Whether you want to completely remove Fast Data or re-install the service, some manual clean-up may be needed. The CSD’s role instances store data on disk and/or in Zookeeper. Unfortunately we cannot automate this for you.
If you only want to re-install the service, you must follow the Clean Zookeeper Znodes paragraph or adjust your Zookeeper znodes with new names. It is also advised to follow the Clean Data Directories paragraph to free up space.
To uninstall the CSD, you should first stop any running Fast Data services and then delete them from the cluster.
Next visit the CM’s Parcels tab and deactivate, remove from hosts and uninstall our parcel.
Next, stop the Cloudera Manager service, delete the CSD file from your Cloudera CSD directory (usually /opt/cloudera/csd/) and restart the Cloudera Manager for the change to take effect.
In our experiments, on one occasion Cloudera Manager wasn’t able to start again after the CSD removal, due to missing metric sources. The solution was to copy the CSD back, start CM with it, stop it, and remove the CSD again.
The only service roles that store data are the Kafka Brokers. Each broker has a unique ID and stores its data inside its configured data directories (log.dirs). If you remove a Fast Data service, you may want to clean these directories to free up space and make sure any future installation won’t find old data. The default log.dirs location is at
Logs for your roles are also stored, inside /var/log/fastdata, and may be removed freely.
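As a sketch, the clean-up could look like the following. The data path here is a placeholder (an assumption, not the real default); substitute the log.dirs value your service actually used:

```shell
# Remove the broker data directories. The path below is a placeholder;
# use the log.dirs value(s) configured for your (removed) Fast Data service.
rm -rf /data/fastdata-kafka

# Remove the role logs; these may be removed freely.
rm -rf /var/log/fastdata
```

Run this on every host that carried a Kafka Broker role; log.dirs may list several directories per broker, so remove each of them.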
Kafka Brokers and Schema Registry use Zookeeper to store data. The default znode for the brokers is /fastdata and all topic data are stored below it. To delete it you may use the Zookeeper client:
zookeeper-client -server zookeeper.server:2181 rmr /fastdata
For the Schema Registry the default znode is /schema_registry and you may delete it by running:
zookeeper-client -server zookeeper.server:2181 rmr /schema_registry
If you have enabled Kerberos authentication and the Zookeeper ACLs (the zookeeper.set.acl option), Kafka’s znode is secured and only writeable by the kafka Kerberos principal, hence you can’t delete it.
The easiest way to mitigate this is to restart the brokers with zookeeper.set.acl disabled. The CSD will then unsecure the znode and you will be able to delete it with the zookeeper-client as shown above.
In case you don’t have access to the service anymore, you should use the kafka Kerberos principal with the zookeeper-client to be able to delete it. Directions for this procedure are outside the scope of this document.
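For reference, a minimal sketch of that procedure, assuming the broker’s keytab is still available (the keytab path, principal and realm below are placeholders, not values from your cluster): make the Zookeeper client log in as the kafka principal via a JAAS file such as:

```
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/path/to/kafka.keytab"
  principal="kafka/broker-host@EXAMPLE.COM";
};
```

Then point the client’s JVM at it (e.g. export JVMFLAGS="-Djava.security.auth.login.config=/path/to/zk-jaas.conf") and run the same zookeeper-client … rmr /fastdata command as above.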