Data Storage
Process event data is ingested and stored in near real time as it is generated. The storage layer uses HDFS as the long-term data store, which provides scalable, reliable storage across distributed nodes. This distributed store lets Process Insight scale horizontally and handle large numbers of events and queries spread across the cluster. The data is also mapped and loaded into Postgres-based reporting marts that serve real-time dashboards.
The process data is primarily stored in two tables:
Case Data - This table holds summary data for the different types of cases, which are typically modeled in the case management system to hold the state of a business process.
Task Data - This table contains data for the individual tasks within the lifecycle of a case.
Each table requires timestamps for key events, such as the creation date and time and the resolution date and time. In addition, the tables contain fields that can be used as filters to subset and slice the data.
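To make the two-table layout concrete, the following is a minimal sketch in Python, not the actual Process Insight schema. All class, field, and function names (CaseRecord, TaskRecord, case_type, task_name, resolution_time, tasks_for_case) are hypothetical and chosen only for illustration. It shows each record carrying a creation and a resolution timestamp, plus one field that could act as a filter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Union


@dataclass
class CaseRecord:
    """One row of a hypothetical Case Data table (summary of a case)."""
    case_id: str
    case_type: str                          # hypothetical filter field
    created_at: datetime                    # creation timestamp
    resolved_at: Optional[datetime] = None  # None while the case is open


@dataclass
class TaskRecord:
    """One row of a hypothetical Task Data table (a task within a case)."""
    task_id: str
    case_id: str                            # links the task to its case
    task_name: str                          # hypothetical filter field
    created_at: datetime
    resolved_at: Optional[datetime] = None  # None while the task is open


def resolution_time(record: Union[CaseRecord, TaskRecord]) -> Optional[timedelta]:
    """Return the time from creation to resolution, or None if unresolved."""
    if record.resolved_at is None:
        return None
    return record.resolved_at - record.created_at


def tasks_for_case(tasks: List[TaskRecord], case_id: str) -> List[TaskRecord]:
    """Subset the task rows down to those belonging to one case."""
    return [t for t in tasks if t.case_id == case_id]


# Illustrative sample data only.
case = CaseRecord("C-1", "claim",
                  datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 3, 9, 0))
tasks = [
    TaskRecord("T-1", "C-1", "review",
               datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 17, 0)),
    TaskRecord("T-2", "C-1", "approve", datetime(2024, 1, 2, 9, 0)),
    TaskRecord("T-3", "C-2", "review", datetime(2024, 1, 2, 10, 0)),
]

case_duration = resolution_time(case)
case_tasks = tasks_for_case(tasks, case.case_id)
```

In this sketch, the timestamps are what make duration measures such as time to resolution computable, and the filter fields are what allow the data to be sliced, for example by case type or task name.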