Incremental computations, reproducible research, and operational data validation: all made possible by PIT Engine
Have you ever asked yourself these questions?
- Why is my data missing or changed, what was it before, and when did it change?
- How do I know if the data revision I received is a correction or yet another error?
- Why is my production cycle so slow, and why is it impacting my model scoring process?
- How can I use the most recently available market data in my models?
- What happens when compliance regulators ask why my model made a trade and I can’t reproduce it?
- What happens when my reports differ from my clients’ reports and I can’t reproduce them?
We see five potential uses for point-in-time (PIT) data:
- Incremental computations: detect which data changed between successive runs of a data-intensive process, and recompute only the parts affected by the change
- Reproducible research: get consistent results whenever you re-run a model, using data that looks exactly as it did when the model was first run; perform controlled experiments on how data revisions affect your model's output; study how changes in model parameters affect results while holding the input data fixed
- Operational data validation: identify data quality issues that originate with vendors; detect failures in delivery by vendors or market data services (e.g., loss of coverage, or a sudden drop or spike in the number of changes delivered over a given period)
- Reproducible reporting: re-create reports for key stakeholders, customers, and compliance auditors, on demand at any point in the future, exactly as they looked when first generated
- Compliance and auditing: keep track of data revisions and corrections as they are delivered by the data vendor; know exactly which data points were used in your investment decisions
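
To make the first two uses more concrete, here is a minimal Python sketch of the underlying idea: store every revision together with the time it became known, then query the data "as of" a given moment. This is an illustration under our own assumptions, not PIT Engine's API; the names `Revision`, `as_of`, and `changed_keys`, and the sample keys and values, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Revision:
    """One delivered version of a data point (hypothetical structure)."""
    key: str             # identifier of the data point, e.g. "ACME.eps"
    value: float         # the value delivered in this revision
    known_at: datetime   # when this revision was delivered to us


def as_of(history: Iterable[Revision], key: str, when: datetime) -> Optional[float]:
    """Return the value of `key` as it was known at `when`, or None if unknown then."""
    candidates = [r for r in history if r.key == key and r.known_at <= when]
    if not candidates:
        return None
    # The latest revision delivered on or before `when` is the one that was visible.
    return max(candidates, key=lambda r: r.known_at).value


def changed_keys(history: Iterable[Revision], previous_run: datetime,
                 current_run: datetime) -> Set[str]:
    """Return keys that received a revision after previous_run, up to current_run."""
    return {r.key for r in history if previous_run < r.known_at <= current_run}


# Made-up sample data: ACME is revised between the two runs, BETA is not.
history = [
    Revision("ACME.eps", 1.10, datetime(2024, 1, 10)),
    Revision("ACME.eps", 1.25, datetime(2024, 2, 5)),
    Revision("BETA.eps", 0.80, datetime(2024, 1, 12)),
]
first_run = datetime(2024, 1, 15)
second_run = datetime(2024, 2, 10)

# Reproducible research: the first run can always be replayed with its original input.
value_seen_by_first_run = as_of(history, "ACME.eps", first_run)

# Incremental computation: only keys revised since the last run need recomputing.
to_recompute = changed_keys(history, first_run, second_run)
```

In this sketch, replaying the first run returns the original ACME value even though a revision arrived later, and the second run only needs to recompute results that depend on `ACME.eps`. A production system would index revisions rather than scan a list, but the query semantics are the same.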