Installation

HDFS Pipeline for Event Publication Messages

Extract the product tar file on a Linux machine:

cd /opt/invariant 
tar -xvzf invariant-hdfs-ppl.tgz

Move the extracted folder to the correct environment subfolder (for example, dev):

mv invariant-hdfs-ppl dev/invariant-hdfs-ppl

The extracted folder contains the libraries, scripts, and configuration files required to run the pipeline:

+invariant-hdfs-ppl
    - bin
    - config
    - libs
    - logs
    - mapping
    - working

The bin folder contains the scripts to start the service, and the config folder contains the YAML and properties files required to configure the pipeline.
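For example, starting the pipeline might look like the following. This is a sketch only: the script name, the flag, and the dev path are assumptions for illustration, so check the bin folder for the actual script shipped with your release.

cd /opt/invariant/dev/invariant-hdfs-ppl
# Hypothetical start script name; the real script is in bin.
bin/start-hdfs-ppl.sh --config config/pipeline.yaml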

The mapping folder contains the generated XML files, which include the configuration data for mapping source records to the target as well as details about handling insert, update, and delete events. These XML files are generated based on the tables defined in the configuration file.
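As a minimal sketch, assuming the table list lives in a YAML file under config (the file name and keys below are illustrative, not the product's actual schema), the relationship between the configuration and the generated mappings might look like this:

# Illustrative excerpt of a pipeline YAML, e.g. config/pipeline.yaml:
#   tables:
#     - source: SCHEMA.ORDERS          # source table to capture
#       target: /data/events/orders    # target location on HDFS
# One mapping XML file is then expected per configured table:
ls mapping/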

HDFS Pipeline for Change Data Capture (CDC) Messages

Extract the product tar file on a Linux machine:

cd /opt/invariant 
tar -xvzf invariant-cdc-ppl.tgz

Move the extracted folder to the correct environment subfolder (for example, dev):

mv invariant-cdc-ppl dev/invariant-cdc-ppl

The mapping folder contains the XML files, which include the configuration data for mapping source records to the target as well as details about handling insert, update, and delete events.

+invariant-cdc-ppl
    - bin
    - config
    - libs
    - logs
    - mapping
    - working

DB Event Listener

The DB event listener processes messages from a JMS message queue and writes them out to Kafka topics.

Extract the product tar file on a Linux machine:

cd /opt/invariant 
tar -xvzf invariant-dbevent-mdp.tgz

Move the extracted folder to the correct environment subfolder (for example, dev):

mv invariant-dbevent-mdp dev/invariant-dbevent-mdp

The bin folder contains the script to start the DB event listener as a service. The config folder contains the configuration files that define the source queues and target topics.
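As a minimal sketch, assuming the queue-to-topic wiring lives in a properties file under config (the property names and start script below are assumptions for illustration, not the listener's documented interface):

cd /opt/invariant/dev/invariant-dbevent-mdp
# Illustrative properties, e.g. config/listener.properties:
#   source.queue=DB.EVENT.QUEUE     # JMS queue to consume from
#   target.topic=db-events          # Kafka topic to publish to
# Hypothetical start script name; check bin for the actual script.
bin/start-dbevent-listener.sh --config config/listener.properties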

To configure the pipeline, refer to the HDFS adapter configuration section.

For event publication data, you need the DB event listener, which reads data off MQ and writes it out to Kafka. The DB event listener is not needed for CDC data, as the CDC programs can write directly to Kafka.
