Replicating raw transaction data to the cloud is one of the most prevalent practices in organizations that have embarked on a data-driven journey. The replicated data is transformed and made ready for analytical users so that meaningful insights, Business Intelligence reports, or AI/ML models can be generated.
When a transactional ERP system generates data because of an event in a business process, there is a great deal of context that was prevalent or relevant at that moment: business, business-process, market, regulatory, policy, social, and technological constraints all surround that transactional data.
Suppose you go to an ATM and withdraw 100 dollars; a transaction is conducted.
The bank records who withdrew the money, when, through which payment process, at which ATM, in which country, state, district, city, or town, whether the debit card was active at that time, the debit card number, and so on. This extra information is called metadata, and it gives the context behind the 100-dollar transaction event. However, ATMs are not yet intelligent enough to capture why the customer withdrew the 100 dollars: was he in a panic, or did he take it out for shopping or medical needs? Of course, if the customer swiped the card at a hospital, the hospital information would be captured as metadata.
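To make the ATM example concrete, here is a minimal sketch of what such a transaction record with its metadata might look like. The field names and values are illustrative assumptions, not a real bank schema:

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical record for an ATM withdrawal: the core fact (who, how much)
# plus the contextual metadata the bank captures alongside it.
@dataclass
class AtmWithdrawal:
    account_number: str    # who withdrew the money
    amount: float          # how much (the 100-dollar fact)
    timestamp: datetime    # when
    card_number: str       # which debit card
    payment_process: str   # which payment rail handled it
    atm_id: str            # which ATM
    country: str           # demographic / location context
    state: str
    city: str

event = AtmWithdrawal(
    account_number="ACC-001", amount=100.0,
    timestamp=datetime(2024, 1, 15, 10, 30),
    card_number="XXXX-1234", payment_process="VISA",
    atm_id="ATM-42", country="US", state="NY", city="New York",
)
```

Every field after `amount` is metadata: it does not change the 100-dollar fact, but it carries the context in which that fact occurred.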
Are these metadata stored in the same database as the transaction, or in a different one?
So much data is captured for a 100-dollar transaction; how much of it reaches the cloud?
Now, when we replicate the data to the cloud, often only a subset reaches it: who withdrew the money and how much was withdrawn (100 dollars).
What about the metadata? Only the timestamp, account number, dollar amount, and card details make it across; the demographic details are missing from the data pushed by the source team to the cloud.
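The gap described above can be surfaced mechanically by comparing the columns the source system captures against the columns the replication job actually delivers. The column names below are illustrative, matching the ATM example rather than any real pipeline:

```python
# Hypothetical column inventories: what the source ERP captures vs what
# the replication job pushes to the cloud.
source_columns = {
    "account_number", "amount", "timestamp", "card_number",
    "payment_process", "atm_id", "country", "state", "city",
}
replicated_columns = {"account_number", "amount", "timestamp", "card_number"}

# Set difference reveals the metadata that never reached the cloud.
missing_metadata = source_columns - replicated_columns
print(sorted(missing_metadata))
```

A check like this can run as part of the replication pipeline, failing loudly whenever the cloud copy drifts from the source schema.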
Imagine the insights, Business Intelligence reports, and AI systems created in the cloud on top of such insufficient replicated metadata.
How do we test the metadata quality of data replicated to the cloud?
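One simple answer to that question is a completeness check: for each metadata column we expect in the cloud copy, verify that the column is present at all and measure the share of non-null values. This is a minimal sketch assuming rows arrive as dictionaries; the column names are again illustrative:

```python
def metadata_quality(rows, expected_columns):
    """For each expected column, report whether it appears in any row
    (presence) and the fraction of rows with a non-null value (completeness)."""
    total = len(rows)
    report = {}
    for col in expected_columns:
        non_null = sum(1 for r in rows if r.get(col) is not None)
        report[col] = {
            "present": any(col in r for r in rows),
            "completeness": non_null / total if total else 0.0,
        }
    return report

# Two replicated rows: demographic metadata is null or absent entirely.
rows = [
    {"account_number": "A1", "amount": 100.0,
     "timestamp": "2024-01-15T10:30", "country": None},
    {"account_number": "A2", "amount": 50.0,
     "timestamp": "2024-01-15T11:00"},
]
report = metadata_quality(rows, ["account_number", "amount", "country", "atm_id"])
```

Here `country` is present but 0% complete, and `atm_id` was never replicated at all; either finding should fail the pipeline's metadata-quality gate before any BI report or ML model is built on the data.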