Running daily reconciliation and merge

Inventory Manager can use daily Sqoop extracts to reconcile data against the stream tables. The reconciled data can then be merged into the final current view tables.

Use the run-invmgr-recon.sh script to extract, reconcile, and merge the records into the target schema:

./run-invmgr-recon.sh [--all | --onlyMerge | --onlyRecon | --onlySqoop]   <- Action param (required)
                      [-e dev|test|prod]                                  <- Environment param
                      [-s bpm|ods]                                        <- Source DB param
                      [-d yyyy-mm-dd]                                     <- Optional processing date
                      [-n num]                                            <- Optional number of days
                      [-t "table1 table2"]                                <- Optional table list

The action parameter is required and specifies which steps will be executed:

  • --onlyMerge - merge stream tables into the current view tables

  • --onlyRecon - reconcile stream and recon records

  • --onlySqoop - sqoop data into the recon schema

  • --all - run all steps in sequence (sqoop, recon, and merge)

The script uses the configuration in the "recon.properties" file in the config folder. By default, the run is performed for the previous day. This can be overridden by specifying a specific processing date, or a number of days to go back, when running the action.
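
For example, the full sequence can be run for a specific date, or only the merge can be repeated for the last few days, with invocations along these lines (the environment, source, and date shown are illustrative):

# Run sqoop, recon and merge for the ods source for a specific date in test
./run-invmgr-recon.sh --all -e test -s ods -d 2021-06-30

# Merge only, going back 3 days from today
./run-invmgr-recon.sh --onlyMerge -e test -s ods -n 3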

The list of tables is defined in recon.properties; however, this can be overridden on the command line by passing a specific table or list of tables. The list should be enclosed in double quotes and separated by spaces. Alternatively, you can specify the list in the configuration using the source DB name with a table_list suffix, for example bpm_table_list=work history.
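
For example, to reconcile only the work and history tables from the bpm source (table names are taken from the sample configuration below; the environment is illustrative):

# Override the configured table list on the command line
./run-invmgr-recon.sh --onlyRecon -e dev -s bpm -t "work history"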

# Recon script config

# queue name
queue_name=pipeline
tez_java_opts=-Xmx1640m
container_size=2048

# ods source db url
ods_source_url=jdbc:db2://localhost:51101/DBODSX11
ods_source_user_name=invapp
ods_source_user_alias=db2.invapp.password.alias
ods_source_user_jceks=//hdfs/user/invapp/db2.jceks

# bpm source db url
bpm_source_url=jdbc:db2://localhost:51102/DBBPMX11
bpm_source_user_name=invapp
bpm_source_user_alias=db2.invapp.password.alias
bpm_source_user_jceks=//hdfs/user/invapp/db2.jceks

# target store URL
target_url="jdbc:hive2://local_1:2181,local_2:2181/scratch;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
target_sqoop_schema=recon
beeline_password_file=<location>

# schema
recon_schema=recon
stream_schema=stream
ods_schema=ods
bpm_schema=bpm

# list of tables to be processed
bpm_table_list=work history
ods_table_list=activity document

# target schema is table specific - since 1.0.9
default_target_schema=ods
activity_target_schema=dev
history_target_schema=dev
document_target_schema=dev
work_target_schema=dev

# user to be notified
mail_to=test@inv.com
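
The source database passwords are not kept in the properties file; they are expected to be resolved from the JCEKS keystore referenced by the *_user_alias and *_user_jceks properties. If the alias does not already exist, it is typically created with the Hadoop credential CLI. The provider path below follows the standard jceks://hdfs form and is an assumption based on the sample configuration:

# Create the DB2 password alias in the keystore referenced above (path shown is illustrative)
hadoop credential create db2.invapp.password.alias -provider jceks://hdfs/user/invapp/db2.jceks

# Verify the alias is present
hadoop credential list -provider jceks://hdfs/user/invapp/db2.jceks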

The target schema setting is new in 1.0.9 and is required for the merge step to work.

The target schema for each table should be defined in recon.properties. The syntax is:

{table_name}_target_schema={schema name}

For example, if the activity table is to be merged into the activity table defined in the dev schema, use:

activity_target_schema=dev
