Reducing Unplanned Downtime and Helping Future-proof Automation System Assets

By Craig Resnick

Category:
Industry Trends

Unplanned downtime is the number one issue for automation systems today. A significant percentage of today’s global installed base of automation systems is at least 20 years old and becoming increasingly difficult and costly to maintain properly. The average impact of unplanned downtime in the process industries alone is $20 billion, or almost 5 percent of annual production. This makes minimizing unplanned downtime attributable to automation one of the best ways for industrial organizations to improve their return on (automation) assets (ROA).

Business Consequences of Unplanned Downtime

Unplanned downtime results in major business consequences. For example, between 2 and 5 percent of all lost production in petrochemical plants is attributable to unplanned downtime. Reliability experts estimate that unplanned downtime costs 10 times as much as planned downtime for maintenance in the process industries. Unplanned downtime also causes ripple effects throughout the organization, such as an estimated 5 to 10 percent increase in inventories and labor costs and delayed delivery of finished goods, all of which reduce profitability.

Unplanned Downtime Plagues Productivity

Causes of unexpected stoppages in production include equipment failure, operator error, and nuisance trips. The direct impact of this unplanned downtime can include equipment damage, lower key performance indicators (KPIs), environmental harm, and, most importantly, worker endangerment. Lower KPIs include reduced overall equipment effectiveness (OEE), decreased efficiency, and reduced profitability. However, manufacturers are often not aware of the magnitude of unplanned downtime in their own plants.

Minimizing the Multiple Causes of Unplanned Downtime

Minimizing the multiple causes of unplanned downtime requires a focus on both automation assets and human behavior.
Automation-related influences include increasing the overall availability of system assets and implementing fault-tolerant systems. Human influences include training to eliminate human errors and improving coordination between processes at multiple layers. However, one common denominator that will positively affect both automation assets and human behavior is the implementation of standards-based technologies, which can help simplify hardware, software, infrastructure, systems, and training. Standards-based technologies provide flexibility, ease upgrades, and speed integration between third-party products. They can also make it easier for users to understand and monetize hardware lifecycle upgrades and how those upgrades impact ROA over time.

Virtualization Addresses Unplanned Downtime and OEE

One strategy technology users can deploy is to implement virtualization in “Layer 3” infrastructure, such as HMI/SCADA and historian software. This platform consolidation can reduce complexity and network infrastructure. It also simplifies system management and maintenance, and eases migration for Windows-type applications. Virtualization enables non-disruptive, scalable upgrades and application additions.

Fault-Tolerant Solutions Support Virtualization

Fault-tolerant solutions enhance virtualization. With virtualization, platform reliability is especially critical, since any downtime impacts all applications running on that server or system. However, it is equally critical that owner-operators don’t swap their current complex systems for equally complex, high-availability virtualization solutions, so fault-tolerant solutions must be based on open standards. They must also be designed to eliminate any data losses.
This information can be used to help eliminate unplanned downtime by predicting a problem based on changes in an asset’s parameters, such as temperature or vibration, and making plans to have the asset repaired or replaced before it fails.

Disaster Recovery Provides Additional Protection

To supplement high-availability solutions for minimizing unplanned downtime, plants should implement disaster recovery solutions where geographic redundancy makes sense. This ensures that plant data is backed up on off-site servers in public or private clouds that would not be impacted by a physical disaster at the plant. Control room, MES, ERP, and asset management applications would all be appropriate for disaster recovery solutions, which should be used to complement, but not replace, fault tolerance to protect data that plants cannot afford to lose. Disaster recovery applications will help restore critical operations after catastrophic failures, such as power outages, fires, and natural disasters. However, recovery generally involves some potential loss of data, and recovery times can range from many minutes to hours.

OT/IT Convergence Addresses Unplanned Downtime

OT/IT convergence is a hot topic these days. It has led to a rapid learning curve for both IT and OT groups. IT folks often have to learn what terms such as “real time,” “non-stop,” and “deterministic” mean in the operations context, and OT folks are rapidly discovering the advantages of leveraging the latest IT-based approaches. This convergence is helping plants address unplanned downtime, as 30-year-old control systems (DCSs and PLCs) need to be upgraded. Real-time or near-real-time data are essential for any business to compete today, and those data must be available 24/7/365. This convergence trend increases the demand for tighter integration and more information and analytics. It also contributes to the adoption of cloud computing and Big Data applications, which in turn drive the need for high-availability systems to help eliminate unplanned downtime.
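
As a rough illustration of the predictive approach mentioned earlier (spotting a problem from changes in an asset parameter such as temperature or vibration), the sketch below flags a sustained drift away from a healthy baseline. It is a minimal example under stated assumptions, not a description of any vendor's product: the function name `find_drift`, the baseline size, the 3-sigma limit, and the sample vibration values are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def find_drift(readings, baseline_size=20, z_limit=3.0, consecutive=3):
    """Return the index at which a sustained drift is confirmed, or None.

    Assumptions (illustrative only): the first `baseline_size` readings
    represent healthy operation, and a drift is "confirmed" once
    `consecutive` readings in a row deviate from the baseline mean by
    more than `z_limit` baseline standard deviations.
    """
    if len(readings) <= baseline_size:
        return None  # not enough data beyond the baseline

    baseline = readings[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline) or 1e-9  # guard against a perfectly flat baseline

    run = 0  # length of the current streak of out-of-limit readings
    for i in range(baseline_size, len(readings)):
        deviation = abs(readings[i] - mu) / sigma
        run = run + 1 if deviation > z_limit else 0
        if run >= consecutive:
            return i
    return None


# Hypothetical vibration readings (mm/s): a steady baseline, then a rise.
healthy = [2.0, 2.2] * 10
rising = healthy + [2.1, 2.2, 2.6, 2.7, 2.9]
single_spike = healthy + [2.1, 3.5, 2.1, 2.0]

drift_index = find_drift(rising)        # sustained rise is flagged
spike_index = find_drift(single_spike)  # isolated spike is ignored
```

Requiring several consecutive out-of-limit readings is one simple way to avoid acting on a single noisy sample, which matters when a false alarm could itself trigger an unnecessary stoppage. A real deployment would tune these thresholds per asset and would typically draw the readings from a historian.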
