Take anyone’s prediction of the most important technologies for this new decade and artificial intelligence, or AI, will be high on the list. Until now, however, most AI has been applied to predicting human behavior, interactions, and sensibilities. For example, your keyboard predicts the next word you will most likely type, and your TV predicts what movies or shows you will enjoy watching next.
Nonetheless, a relative white space in the world of AI is the prediction of machine behavior and interactions. You see, while there are a lot of humans in this world, there are many more machines that serve them. The goal of every industrial OEM is to transform its products into seemingly sentient machines that can adapt themselves to the use case, to their environment, and to other machines that rely on their operation. This white space is the realm of the Digital Twin.
What Is a Digital Twin?
For clarity, Digital Twins are virtual models of key processes that define the performance of devices. This virtual model is fed with data from the device using an IoT platform, and the model can then be used to optimize the device’s performance, predict future device states, or provide business or usage recommendations that enhance the device’s performance and value. See Figure 1.
Digital Twin 1.0
The first wave of Digital Twins, let’s call it Digital Twin 1.0, has focused on the interface between the device and its user. These twins do things like identifying who the user is, classifying the device’s use case, and adapting the device’s user interface to the user’s preferences. An example is the smart thermostat for air conditioning. By observing how the user manipulates the thermostat, the device learns the user’s patterns and preferences, which allows better control of when the air conditioner should turn on or off and what temperature settings are most appropriate. However, the air conditioner itself can remain a (pardon me) “stupid” device. In other words, it may lack knowledge of itself and its internal operations, and it is unable to predict the future states of its components and itself. This lack of self-awareness limits the kinds of operational performance improvements that can be considered, such as maintenance recommendations, self-healing, operational life extension, diagnostics, and adaptive controls.
Self-awareness is what we’ll call Digital Twin 2.0. And the business opportunities associated with Digital Twin 2.0 are transformative.
Digital Twin 2.0
Obviously, the realm of apparently sentient machines and optimized systems of machines will be dominated by Digital Twin 2.0. This type of Digital Twin needs to evaluate, quickly and accurately, the state of key internal processes and components inside the device.
But how do we get there? What are the principal barriers to the deployment of next-gen Digital Twin 2.0 devices?
Through my work with Front End Analytics I have helped pioneer machine learning methods and techniques for Digital Twin 2.0.
A Question of Data
Since Digital Twins need to run fast and cannot be computationally intensive, data-driven AI is especially well suited for model generation. It is a common theme among data scientists that, given enough data, anything can be predicted. Thus, it’s no wonder that Digital Twin 1.0 applications have been first to market: the interface between the device and its user is exactly where loads of data are generated.
However, Digital Twin 2.0 systems require engineering rigor to determine which data will be used to train models, where this data will come from, and whether it is extensive and significant enough to be a predictor of a machine state or process.
Illustrative Example – Motion of the Planets
To illustrate these difficult questions and some of the fundamental ideas we have pioneered at Front End Analytics, I will use a classic example: creating a Digital Twin that would enable us to predict the motion of the planets.
First, a Purely Data-Driven Approach:
If we know nothing and have only data on where the planets are with respect to us, gathered via some measurement device (a telescope, say), we could start mapping the position of each planet. Each observation is one data point. As we gather more data, the resulting map of the planets’ movements starts to look a lot like the mandala-like pictures in Figure 2. In fact, this was the precise method used by early astronomers. These early astronomers were acting as an AI system would today: just observing data and trying to draw a pattern or function that enables prediction of future planetary states.
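As a rough sketch of this purely data-driven process, the snippet below simulates daily “telescope” observations of Venus as seen from Earth. It assumes idealized circular, coplanar orbits with rounded radii and periods, so it is an illustration of the data-gathering loop, not an astronomical model:

```python
import math

# Simplified, assumed orbital parameters (circular, coplanar orbits;
# real orbits are elliptical and inclined).
EARTH_AU, EARTH_YEARS = 1.0, 1.0
VENUS_AU, VENUS_YEARS = 0.723, 0.615

def heliocentric(radius_au, period_years, t_years):
    """Position of a planet on an idealized circular orbit at time t."""
    angle = 2 * math.pi * t_years / period_years
    return radius_au * math.cos(angle), radius_au * math.sin(angle)

def geocentric_venus(t_years):
    """Venus's position relative to Earth: what the telescope records."""
    ex, ey = heliocentric(EARTH_AU, EARTH_YEARS, t_years)
    vx, vy = heliocentric(VENUS_AU, VENUS_YEARS, t_years)
    return vx - ex, vy - ey

# Each observation is one data point; only after thousands of them does
# the mandala-like pattern of Figure 2 begin to emerge.
observations = [geocentric_venus(day / 365.25) for day in range(8 * 365)]
```

Eight years of daily points is roughly what it takes for the Earth–Venus pattern to close on itself, which is exactly the data appetite the next section worries about.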
But to get the full mandala of planetary motion would take enormous amounts of data (observations). This raises a fundamental question: how do we collect the data? In the product development process, data on product performance is generated either from computer simulation, which typically takes the form of finite element modeling, or from testing. In both cases, the extent of the data is critical to making highly predictive data-driven machine learning algorithms. Take Figure 3, which shows the motion of Venus. (1) Data could be narrowly collected, which does not provide a good picture of the entire operating space (left side of the figure). Or (2) data could be scattered but insufficient to formulate a complete picture of the event being analyzed (right side of the figure). Both of these conditions are typical in engineering.
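The failure mode of case (1) can be shown with a toy example: a model trained on a narrow slice of the operating space fits well locally but badly mispredicts outside it. The quadratic “device response” below is hypothetical, chosen only to make the point:

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = a*x + b (closed-form solution)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def true_response(x):
    """Hypothetical nonlinear device behavior (unknown to the model)."""
    return x ** 2

# Narrow collection: training data covers only a small operating window.
xs = [i / 10 for i in range(11)]                 # 0.0 .. 1.0 only
a, b = linear_fit(xs, [true_response(x) for x in xs])

# Error inside the sampled window vs. far outside it.
inside = abs((a * 0.5 + b) - true_response(0.5))   # small
outside = abs((a * 5.0 + b) - true_response(5.0))  # very large
```

Inside the narrow window the straight-line model looks deceptively accurate; five units out, it is off by a factor of five, because the data never revealed the curvature of the operating space.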
Simulation data is often limited to particular operating events because, traditionally, simulation has been used to evaluate and compare design options, and the entire operational sweep is not necessary when two different machines are being compared. Testing, meanwhile, tends to play the double role of validating product performance and validating simulation results, so it often follows the same narrow space as simulation. In vehicle design, this could be a standard driving cycle, for example.
When widespread operational sweeps are made, they are often inconclusive and confusing, because the fundamental relationships between the points may not be obvious or easy to interpret.
The Solution – Physics Informed Machine Learning (PIML™)
Just like the early astronomers, purely data-driven approaches are hampered by not being informed that the earth is not at the center of the planets’ motion; instead, the planets revolve around the sun. Missing this simple fact makes a purely data-driven approach prohibitively expensive to develop.
What if, instead, we integrate knowledge of Kepler’s laws of planetary motion into our machine learning techniques (Figure 4)? The number of data points required to make a highly predictive model would be reduced to two (the planet’s distance from the sun at two different points in time). From these we can fit the model parameters associated with the planet’s mass and the orbit’s ellipse, which yields the formula for the entire orbit.
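A minimal sketch of this idea, assuming Kepler’s laws are given: from just two measured distances (the closest and farthest approach to the sun), the whole ellipse follows, and Kepler’s third law then yields the orbital period as well. The Mars distances used are approximate published values, for illustration only:

```python
import math

def orbit_from_two_points(r_perihelion, r_aphelion):
    """Given only two distances, closest and farthest approach to the sun,
    Kepler's first law (elliptical orbit, sun at a focus) fixes the ellipse."""
    a = (r_perihelion + r_aphelion) / 2                          # semi-major axis
    e = (r_aphelion - r_perihelion) / (r_aphelion + r_perihelion)  # eccentricity

    def r(theta):
        # Polar equation of the ellipse with the sun at one focus.
        return a * (1 - e ** 2) / (1 + e * math.cos(theta))

    return a, e, r

# Mars, distances in AU (approximate values for illustration).
a, e, r = orbit_from_two_points(1.381, 1.666)

# Kepler's third law in solar units (T^2 = a^3) gives the period for free:
period_years = a ** 1.5
```

Two numbers in, an entire orbit plus a period out: that is the data reduction that physics buys, compared with the years of daily observations a purely data-driven mandala requires.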
Physical insight has other benefits. Note that the developed model can be applied to all planets: the Digital Twin can be extrapolated to a wide range of situations and can predict outside the range of the data used to inform it. Finally, incorporating physics into the machine learning problem informs the instrumentation and sampling design for the Digital Twin, making it more effective and less expensive. For example, we now know that the best instrumentation would measure both our distance from the sun and the planet’s distance from us, in order to infer the location of the planet with respect to the sun.
PIML™ Based Digital Twin Development at Front End Analytics
At Front End Analytics we have developed unique machine learning techniques that couple physics with data-driven methods to produce highly predictive next-gen Digital Twins that, as in the illustration, drastically reduce the amount of data required. These techniques have been applied successfully to predicting fatigue failure in gear systems, the thermomechanical fatigue failure point of engine manifolds, and the performance of catalytic fuel processors.