Measuring Digital Twin Performance and Maturity with the Confusion Matrix

By Ralph Rio


Digital twins have a high rate of adoption, with excellent reported benefits. They are typically managed as an engineering project with a scope and timetable for implementation. Unfortunately, little effort goes into tools for managing the twin after implementation, when the engineering team disbands. The good results are often lost as processes drift or change, people become fed up with false alerts, and the twin goes without the support it needs.

Peter Drucker, management consultant and author, wrote, "You can't manage what you can't measure." This also applies to digital twins. Consider adopting the confusion matrix[1] for monitoring and managing the performance of your digital twins.

Measuring Digital Twin Performance

Digital twins currently lack a key performance indicator (KPI) for managing their effectiveness. A review of industry writings on digital twins found no KPI or other measurement of the effectiveness of the twin itself. Each digital twin affects the KPIs for what it models, but these are secondary. This report addresses KPIs for the performance of the digital twin.

Confusion Matrix Applied to Digital Twins

A confusion matrix provides a specific table layout for visualizing the performance of a digital twin algorithm. This includes twins using first-principles math, machine learning, or both. It measures the alerts generated by the twin so that all stakeholders can easily interpret the twin’s truthfulness and respond appropriately.

The four-cell matrix contains true positives and negatives, and false positives and negatives. This matrix provides a dashboard for measuring false positives and false negatives (errors) by the digital twin. Performance is managed by monitoring the false readings and continuously improving the twin by reducing them.

Performance Digital Twin Confusion Matrix

The most common twin is a performance digital twin for predictive maintenance (PdM) in the operate and maintain portion of an asset’s lifecycle. PdM is used for the examples in this Insight. The concepts can be applied to other types of twins since they involve simulation for predictions that can be compared with actual conditions.

  • True Positive (TP) occurs when an alert generated by the model is confirmed by a maintenance planner or technician to be valid.
  • True Negative (TN) does not have alerts, since the PdM twin does not generate alerts when there is no indication of a problem.[2] 
  • False Positive (FP) applies to the alerts where a problem was not found.
  • False Negative (FN) records the failures without a corresponding alert.
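The four definitions above can be sketched as a small classification routine. This is a minimal illustration, not the report's method: the `Outcome` enum, the `classify` function, and the sample observations are hypothetical names and data chosen for the example.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    TP = "true positive"
    FP = "false positive"
    FN = "false negative"
    TN = "true negative"


def classify(alert_raised: bool, problem_found: bool) -> Outcome:
    """Map one observation onto a confusion matrix cell."""
    if alert_raised:
        # An alert is a TP when the planner or technician confirms it.
        return Outcome.TP if problem_found else Outcome.FP
    # No alert: a failure that still occurred is a FN.
    return Outcome.FN if problem_found else Outcome.TN


# Hypothetical observations as (alert_raised, problem_found) pairs.
# No (False, False) rows appear, because a quiet twin leaves no record.
observations = [(True, True), (True, True), (True, False), (False, True)]

matrix = Counter(classify(alert, problem) for alert, problem in observations)
for outcome in Outcome:
    print(f"{outcome.name}: {matrix[outcome]}")
```

In this sketch the TN count stays at zero simply because nothing is recorded when the twin is silent, which matches the note that the true negative cell may remain blank.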

Integration with EAM System for Data Gathering and Reporting

For end users with in-house maintenance staff, these metrics can be obtained seamlessly. Automate the data collection by integrating the digital twin with the enterprise asset management (EAM) system where the technician’s work orders are processed and managed. First, alerts generated by the PdM twin are transferred to the EAM system. The workflow includes automatically creating a maintenance work order for the maintenance planner to review, triage, approve, and schedule.[3] Modern EAM systems have an application programming interface (API) for this function.
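The alert-to-work-order handoff might look something like the following. EAM APIs differ by vendor, so this sketch only builds the request body and makes no network call. Every name here is an assumption made for illustration: the `TwinAlert` class, the `build_work_order_request` function, and all field names and status values are hypothetical and do not come from any particular EAM product.

```python
import json
from dataclasses import dataclass


@dataclass
class TwinAlert:
    """Hypothetical alert record emitted by a PdM twin."""
    twin_id: str
    asset_id: str
    message: str
    raised_at: str  # ISO 8601 timestamp


def build_work_order_request(alert: TwinAlert) -> dict:
    """Build a work order request body for the planner's review queue.

    Field names are illustrative; a real EAM system defines its own schema.
    """
    return {
        "asset_id": alert.asset_id,
        "description": alert.message,
        "status": "AWAITING_REVIEW",      # planner reviews, triages, approves
        "source": "PDM_TWIN",             # marks the order as twin-generated
        "source_twin_id": alert.twin_id,  # links the outcome back to the twin
        "alert_raised_at": alert.raised_at,
        "false_positive": None,           # filled in later by planner or technician
    }


alert = TwinAlert(
    twin_id="pump-twin-01",
    asset_id="P-101",
    message="Bearing vibration trending above baseline",
    raised_at="2024-01-15T08:30:00Z",
)
payload = json.dumps(build_work_order_request(alert), indent=2)
print(payload)
```

Carrying the twin identifier on the work order is the design choice that matters: it lets each closed work order be attributed to the twin that raised it.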

A dashboard can be created to display the confusion matrix for each digital twin, preferably associated with the EAM system where asset management KPIs are monitored. One approach is to add a checkbox field to the work order to obtain the needed data from the planner or technician. For those with just a few digital twins, this checkbox is likely best completed by the planner.

As the quantity and maturity of the digital twins grow, this role may be transferred to technicians. With mobile devices, technicians can process work orders, including a checkbox for a false positive, while doing their work.

Confusion Matrix

Modern EAM systems have a means to add fields in work orders to collect the needed data from the technicians and/or planner. These newer systems also allow for the creation of custom reports or dashboards for tracking the “confusion matrix” for a digital twin.
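A custom report of this kind could be assembled from exported work orders roughly as follows. This is a sketch under stated assumptions: the record layout (`twin_id`, `source`, `false_positive`), the `PDM_TWIN` and `FAILURE` source values, and the `summarize` and `render` functions are hypothetical, and it assumes that a failure work order with no twin alert has already been attributed to the twin covering that asset.

```python
from collections import defaultdict

# Hypothetical export of closed work orders from the EAM system.
work_orders = [
    {"twin_id": "pump-twin-01", "source": "PDM_TWIN", "false_positive": False},
    {"twin_id": "pump-twin-01", "source": "PDM_TWIN", "false_positive": True},
    {"twin_id": "pump-twin-01", "source": "FAILURE", "false_positive": False},
    {"twin_id": "fan-twin-02", "source": "PDM_TWIN", "false_positive": False},
]


def summarize(orders):
    """Count TP, FP, and FN per twin from work order records."""
    counts = defaultdict(lambda: {"TP": 0, "FP": 0, "FN": 0})
    for order in orders:
        if order["source"] == "PDM_TWIN":
            # Twin-generated order: the checkbox decides TP versus FP.
            cell = "FP" if order["false_positive"] else "TP"
        else:
            # Failure recorded without a corresponding alert.
            cell = "FN"
        counts[order["twin_id"]][cell] += 1
    return dict(counts)


def render(summary):
    """Format one dashboard row per twin; TN is left blank."""
    lines = [f"{'Twin':<14}{'TP':>4}{'FP':>4}{'FN':>4}{'TN':>4}"]
    for twin_id in sorted(summary):
        c = summary[twin_id]
        lines.append(
            f"{twin_id:<14}{c['TP']:>4}{c['FP']:>4}{c['FN']:>4}{'-':>4}"
        )
    return "\n".join(lines)


summary = summarize(work_orders)
report = render(summary)
print(report)
```

Tracking these rows over successive reporting periods would show whether the false readings for each twin are shrinking.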



Keywords: Performance Digital Twin, KPI, Confusion Matrix, Predictive Maintenance, ARC Advisory Group.


[2] The true negative cell may need to remain blank. Alternatively, consider modifying the confusion matrix by substituting the failure rate (FR) prior to the twin, i.e., operating time divided by the previous mean time between failures (MTBF). Comparing TP to the FR then gives an indication of the deterioration of the asset, i.e., TP greater than FR is a bad trend. Please contact me if you have comments or a recommendation.

[3] Intersection of IIoT and Asset Management, ARC Strategy Report, Ralph Rio, page 6, July 2016
