Generating Artifacts

Inventory Manager generates various run-time artifacts that are used by the data pipeline services and the merge/reconciliation processes.

 ./generateArtifacts.sh [ generatePPLXML         | 
                          generateDDL            | 
                          generateSqoopImport    |
                          generateAutoReconHQL   | 
                          generateStreamMergeHQL | 
                          generateAll 
                        ]

The options are:

  • generatePPLXML - Streaming Pipeline mapping XML

  • generateDDL - DDL for stream, recon and current view tables

  • generateSqoopImport - Column list and predicate for the Sqoop import command

  • generateAutoReconHQL - HQL to reconcile data in stream using recon tables

  • generateStreamMergeHQL - HQL to merge stream data into current view

  • generateAll - Generate all the above artifacts

The user can also pass the flag "-p" to indicate whether the table is streamed using event publication (EP) or change data capture (CDC). The default is CDC.

Another optional flag, "-f", indicates whether the cached schema version should be updated. The valid values are Y and N; the default is Y. Note: it is more efficient to use the cached values when there are no structural changes in the source tables.
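For example, an option keyword and the flags can be combined in a single call. This is a sketch only: the placement of the flags after the option keyword is an assumption, and the stub function below stands in for the real generateArtifacts.sh so the example is self-contained.

```shell
#!/bin/sh
# Stub that echoes the command that would run; in practice, invoke the real
# generateArtifacts.sh from the Inventory Manager installation directory.
generateArtifacts() { echo "./generateArtifacts.sh $*"; }

# Regenerate only the DDL, reusing the cached schema version (-f N):
generateArtifacts generateDDL -f N

# Regenerate all artifacts for a table streamed via event publication (-p):
generateArtifacts generateAll -p
```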

Please note that from version 1.0.9 onwards, the generated DDL for CDC tables places the audit timestamp column (inv_updtd_dtm) as the first field. The generated merge HQL accounts for this and orders the columns accordingly. EP tables may have this audit timestamp column at the end, and the HQL is generated accordingly. In case of a mismatch, developers can adjust the generated HQL and check it into source control.


Last updated 4 years ago