Schema Registry
In a data pipeline, a schema defines the message metadata: the structure and type of the messages exchanged between systems. A schema registry acts as a central repository for this metadata, allowing applications to discover and decode messages. The registry can also provide interfaces to serialize and deserialize messages.
A schema includes metadata such as:
name - Unique name of the schema.
description - Description of the schema.
type - The type of schema, e.g. Avro or JSON.
compatibility - Compatibility between different versions of the schema.
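As a rough illustration of how these fields fit together, the sketch below models a schema record and a toy in-memory registry in Python. It is not the bundled registry's API: the class names, the `definition` field, the `"BACKWARD"` policy label, and the version numbering are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SchemaMetadata:
    """One schema version, carrying the metadata fields listed above."""
    name: str            # unique name of the schema
    description: str     # human-readable description
    type: str            # schema format, e.g. "avro" or "json"
    compatibility: str   # policy label (hypothetical value, e.g. "BACKWARD")
    definition: str      # the schema text itself (hypothetical extra field)


class InMemoryRegistry:
    """Toy registry: keeps every version of a schema under its unique name."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[SchemaMetadata]] = {}

    def register(self, schema: SchemaMetadata) -> int:
        """Store a new version and return its 1-based version number."""
        versions = self._versions.setdefault(schema.name, [])
        versions.append(schema)
        return len(versions)

    def latest(self, name: str) -> SchemaMetadata:
        """Return the most recently registered version of a schema."""
        return self._versions[name][-1]


registry = InMemoryRegistry()
v1 = registry.register(SchemaMetadata(
    name="orders",
    description="Order events",
    type="avro",
    compatibility="BACKWARD",
    definition='{"type": "record", "name": "Order", "fields": []}',
))
print(v1, registry.latest("orders").type)
```

A real registry would typically persist the versions and enforce the compatibility policy when a new version is registered; the sketch only shows how the metadata is keyed by the unique schema name.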
Run Schema Registry
To launch the schema registry in the background, use the command below. If you want to run multiple instances on the same server, change the "env type" parameter to match your environment.
To start the schema registry in the foreground, use the command:
To stop the schema registry, use the command:
Kafka Consumer Integration
To use the registry with a Kafka consumer, set the configuration as follows:
A console consumer is provided to extract the Avro messages in a topic. To start the consumer with the deserializer, use:
You can use the bundled schema registry or a registry from Confluent or Hortonworks. If you use an external registry, modify the schemaregistry URL to point to it.
You can also plug in custom serializers and deserializers, for example if you wish to use the SerDes provided by Confluent. In that case, copy the appropriate client JARs to the classpath.
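When messages are produced with Confluent's serializers, each payload starts with a 5-byte header: a magic byte of 0 followed by a 4-byte big-endian schema ID, which a deserializer uses to look up the schema in the registry. The Python sketch below splits that header from the Avro payload. It assumes Confluent framing only; the framing used by the bundled registry is not described here and may differ. The function name and sample bytes are made up for illustration.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # first byte of every Confluent-framed message


def split_confluent_frame(message: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, avro_payload) from a Confluent-framed message."""
    if len(message) < 5:
        raise ValueError("message too short for a 5-byte header")
    # ">bI" = big-endian, 1 signed byte (magic) + 4-byte unsigned int (schema ID)
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]


# Made-up sample bytes: schema ID 42 followed by a placeholder payload.
sample = struct.pack(">bI", 0, 42) + b"\x02\x06abc"
schema_id, payload = split_confluent_frame(sample)
print(schema_id, len(payload))
```

A custom deserializer would pass the extracted schema ID to the registry client to fetch the writer schema before decoding the remaining bytes.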