Deep learning predicts path-dependent plasticity

Significance We show that material plasticity can be precisely and efficiently predicted by deep-learning methods. This approach is fundamentally different from the century-old theory of continuum plasticity because it does not iteratively trace the yield surface, nor does it require the notion of effective strain or stress at the macroscopic level. Instead, we use representative computer simulations of materials, including microstructure and constituents, load them along different deformation paths, and then learn the reversible, irreversible, and history-dependent phenomena directly from data. We demonstrate that complex phenomena such as distortional hardening can be predicted within 0.5% error. The generality of the methodology and the widespread importance of plasticity in designing structures and materials make it useful to a myriad of fields.


Design of experiments
For case 2 considered in the main text, we reconstruct the RVE corresponding to the $i$th DOE point (i.e., given $[v_i, r_i, c_i]$) as follows. First, we randomly place $n_i = v_i L^2 / (100 \pi r_i^2)$ fibers of radius $r_i$ in a square RVE of side length $L = 200\,\mu\mathrm{m}$, where the factor of 100 converts the volume fraction $v_i$ from a percentage to a fraction. Then, we iteratively perturb the fiber locations until their spatial distribution satisfies $c_i$. Note that some combinations of $[v, r, c]$ may correspond to infeasible RVEs, or our iterative perturbation may terminate before $c_i$ is satisfied. The 3D input space of $[v, r, c]$ along with the feasible DOE points is visualized in Fig. S1, where it can be observed that some regions of the $[v, r, c]$ space do not correspond to realizable RVEs. Fig. S2 also shows four sample RVEs for easier interpretation of the microstructural differences. Note that the triplet $[v, r, c]$ cannot uniquely characterize a microstructure with randomly dispersed, equally sized fibers (1). Hence, we post-process the reconstructed RVEs to extract four more morphological features that quantify the spatial distribution of fibers: the minimum, maximum, mean, and standard deviation of the nearest-neighbor distances across the fibers, $nn = [nn_1, nn_2, nn_3, nn_4]$. That is, for the $i$th RVE, we find the nearest neighbor of each fiber (center-to-center distance) and then compute the above-mentioned statistics. These seven non-temporal features (along with the deformation path) are employed as inputs in our deep learning task. The deformation paths are sampled as described in the main text, such that the stress response is learned by training the RNN on a wide range of RVEs and deformation scenarios.
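The fiber count and the nearest-neighbor statistics described above can be sketched in a few lines of NumPy (a minimal illustration with our own function names and a brute-force distance computation; it is not the implementation used in this work):

```python
import numpy as np

def fiber_count(v_pct, r, L=200.0):
    """Number of fibers n_i = v_i * L^2 / (100 * pi * r_i^2),
    for a volume fraction v_pct given in percent."""
    return int(round(v_pct * L**2 / (100.0 * np.pi * r**2)))

def nn_features(centers):
    """[min, max, mean, std] of center-to-center nearest-neighbor distances."""
    c = np.asarray(centers, dtype=float)
    d = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # a fiber is not its own neighbor
    nn = d.min(axis=1)                   # nearest-neighbor distance per fiber
    return np.array([nn.min(), nn.max(), nn.mean(), nn.std()])
```

For example, a 10% volume fraction with 5 µm fibers in the 200 µm RVE gives `fiber_count(10, 5) == 51` fibers.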
In summary, for case 2, Fig. S1 shows 5,000 sampling points obtained with a Sobol sequence within the 3D space defined by the three microstructural descriptors of the RVEs: volume fraction v (in percentage), fiber radius r (in µm), and mean distance between fibers c (in µm). Fig. S2 shows four different sample RVEs with different microstructure descriptors (four of the 5,000 points in Fig. S1).
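A Sobol design of this kind can be generated with SciPy's quasi-Monte Carlo module; the descriptor bounds below are placeholders, since the actual ranges used for the 5,000-point design are not stated in this section:

```python
import numpy as np
from scipy.stats import qmc

# Hypothetical bounds for [v (%), r (um), c (um)] -- illustration only.
lower, upper = [10.0, 2.0, 5.0], [60.0, 10.0, 40.0]

sampler = qmc.Sobol(d=3, scramble=False)
unit = sampler.random_base2(m=3)      # 2^3 = 8 points in the unit cube
doe = qmc.scale(unit, lower, upper)   # map to the descriptor ranges
```

For a larger design one could, e.g., draw `random_base2(m=13)` (8,192 points) and keep only those that correspond to feasible RVEs.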

RNN architecture analysis
The three RNN architectures introduced in Fig. 2 to combine temporal and non-temporal features are extensively tested, and their training results after 500 epochs on the case 2 database are presented in Fig. S3 and Table S2. Fig. 2A considers an architecture where non-temporal features are merged with temporal RNN outputs through fully connected neural network (FCNN) layers, forming a hybrid deep learning architecture. While this approach is plausible for applications with a fixed output length at the final time step (i.e., deformation increment), its structure is not a natural fit for constitutive law discovery of material systems, as it restricts the temporal prediction of the model to a fixed length and offers limited coupling between temporal and non-temporal features. Consequently, the hybrid architecture, which combines temporal GRU outputs with non-temporal FCNN features (Fig. 2A), cannot achieve accurate predictions on the training set and suffers from overfitting.
The second architecture, shown in Fig. 2B, integrates the non-temporal features into the RNN formulation as the initial value of the hidden states. As the dimensionalities of the non-temporal inputs and the hidden states are often different, a dense network can be used to perform this mapping. Although this approach has shown promising results for image processing tasks (2, 3), it is not the most effective architecture for constitutive laws because all information in the hidden states is subject to change as it passes through the GRU cells. That is, the non-temporal inputs can become entangled with other hidden features, which makes it excessively difficult for the GRU cells to access them at downstream time steps. Hence, this architecture with hidden state initialization performs only moderately well.
Finally, observing Fig. S3 and Table S2, we conclude that the proposed architecture with a secondary hidden state ( Fig. 2C) achieves significantly better accuracy consistently across different epochs and metrics.
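The advantage of the secondary hidden state can be seen in a minimal NumPy sketch of a GRU cell in which the non-temporal encoding `s` is appended, unchanged, to the gate inputs at every time step, so the recurrence can read it but never overwrite it (the dimensions, weight shapes, and initialization below are illustrative assumptions, not the exact formulation of this work):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

d_in, d_h, d_s, T = 3, 8, 4, 5   # temporal input, hidden, secondary state, steps
d_cat = d_in + d_h + d_s
Wz, Wr, Wh = (rng.normal(0.0, 0.1, (d_cat, d_h)) for _ in range(3))
bz, br, bh = np.zeros(d_h), np.zeros(d_h), np.zeros(d_h)

def gru_secondary(x_seq, s):
    """GRU forward pass; s holds the (fixed) non-temporal features."""
    h, out = np.zeros(d_h), []
    for x in x_seq:
        cat = np.concatenate([x, h, s])          # s enters every gate intact
        z = sigmoid(cat @ Wz + bz)               # update gate
        r = sigmoid(cat @ Wr + br)               # reset gate
        cat_r = np.concatenate([x, r * h, s])
        h_tilde = np.tanh(cat_r @ Wh + bh)       # candidate state
        h = (1.0 - z) * h + z * h_tilde
        out.append(h)
    return np.stack(out)                         # shape (T, d_h)
```

Because `s` bypasses the state update, downstream time steps retain direct access to the microstructural descriptors, which is the access that plain hidden-state initialization tends to lose.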

RNN hyperparameter tests
The hyperparameters and configurations of the presented RNN models are studied and optimized in this work. This analysis includes, but is not limited to, activation functions, optimization algorithms, cost functions, dropout layers, the normalization process, and the addition of time-series dense layers. Fig. S4A depicts the results achieved by varying the number of neurons in each GRU cell. It can be seen that 100 neurons cannot provide enough computational capacity for the model. While the model with 1000 neurons achieves a lower SAME on the training set than the model with 500 neurons, the two models perform comparably on the test set. Considering that the model with 1000 neurons requires more computational resources and training time and overfits the training set, RNNs with 500 neurons are used in this work. Similarly, Fig. S4B suggests that a model with 3 stacked GRU layers achieves the best result compared to models with 1 or 5 layers.
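A back-of-the-envelope parameter count makes the overfitting trend plausible: each of the three GRU gates carries an input weight, a recurrent weight, and a bias, so a layer holds roughly 3[(d_in + d_h)d_h + d_h] parameters, and doubling the width from 500 to 1000 roughly quadruples the model size (our own counting convention; exact totals vary slightly between frameworks):

```python
def gru_layer_params(d_in, d_h):
    # update, reset, and candidate gates: input + recurrent weights + bias
    return 3 * ((d_in + d_h) * d_h + d_h)

def stacked_gru_params(d_in, d_h, n_layers):
    # first layer reads the inputs; deeper layers read the layer below
    return gru_layer_params(d_in, d_h) + (n_layers - 1) * gru_layer_params(d_h, d_h)
```

With a small input dimension (e.g., d_in = 10), 3 layers of 500 units hold about 3.8 million parameters versus about 15 million for 1000 units.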

Performance of proposed RNN architecture
We analyze the performance of the model with different training set sizes to study the database required for achieving certain error metrics, as demonstrated in Fig. S5. As we increase the size of the training set, the model with 3 layers of 500 neurons performs better on both the training and test sets, an expected improvement for larger databases. Ultimately, the required database size is dictated by the complexity of the behavior of the RVE and the required accuracy. In this work, we demonstrate that one can achieve predictive deep learning models for advanced plasticity behavior with databases that are computationally (or experimentally) built in a feasible time frame.
Note that once trained, our data-driven constitutive model performs far faster than the finite element method. As an example, the developed data-driven model predicts the behavior of one RVE in the second case study in 0.108 seconds on an NVIDIA Titan Black GPU, while the finite element simulation takes 7.48 minutes on four cores of an Intel Xeon E5-2687 CPU. While the exact numbers depend heavily on the hardware and the simulated physics, it can be confidently stated that the data-driven approach offers orders-of-magnitude faster evaluation. This has important implications for multi-scale simulations, where the constitutive law at each point of the macro-scale material can be given by RNN models instead of expensive RVE analyses. Furthermore, we note that the two approaches scale differently, given the type of hardware they require and the application. For instance, calculating the response of 100 different RVE cases via the data-driven approach on the same hardware takes only 0.547 seconds, owing to the batch processing capability of GPUs. Finite element methods, on the other hand, scale by distributing sub-domains over multiple CPUs to obtain performance gains through parallel computing; these gains often saturate due to the communication overhead between processing units.

Yield surface construction and microstructural influence
The yield surfaces presented in Fig. 5 are constructed by applying 40 linear strain paths, uniformly distributed in strain space, starting from the initial strain condition. To construct the original yield surface (Fig. S6A), the strain paths start from the unloaded condition and experience elastic and plastic deformation in different directions. We record the stress state at which each linear path exceeds a plastic energy threshold of 1 mJ; these stress states constitute the yield surface. The plastic energy is defined as the integral of the stress times the plastic strain rate over the volume and over the deformation path: $\int_0^{\tau} \int_V \boldsymbol{\sigma} \cdot \dot{\boldsymbol{\varepsilon}}^p \, dV \, d\tau$. The yield surface of an RVE after it undergoes a certain loading (Fig. S6B) is constructed by first applying the main load (blue solid line in Fig. S6B) for all 40 linear strain paths and then loading the RVE in different directions until we detect the stress state where each path reaches the plastic energy threshold. Note that although all the loadings applied for yield surface construction are linear and uniform in strain space, the stress responses are neither linear nor uniform, owing to the plasticity of the RVE.
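The yield-point detection along one linear strain path can be sketched as a running plastic-work integral checked against the 1 mJ threshold (a sketch under assumed shapes and units, using volume-averaged fields and trapezoidal quadrature in place of the full finite element post-processing):

```python
import numpy as np

def yield_point(stress, eps_p, volume, w_thresh=1e-3):
    """Return the stress state at which the plastic work first exceeds
    w_thresh (in J), or None if it never does.

    stress, eps_p: (T, k) histories of volume-averaged stress and plastic
    strain components along one linear strain path.
    """
    d_eps = np.diff(eps_p, axis=0)               # plastic strain increments
    mid_sig = 0.5 * (stress[1:] + stress[:-1])   # trapezoidal rule
    w = volume * np.cumsum(np.sum(mid_sig * d_eps, axis=1))
    idx = int(np.argmax(w >= w_thresh))
    if w[idx] < w_thresh:
        return None                              # threshold never reached
    return stress[idx + 1]
```

Running this detector over the 40 paths, each from its own pre-load state, traces out one yield surface.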
In addition, we illustrate the influence of the microstructure on the response of the material, as shown in Fig. S7, by considering the same loading condition applied to two different RVEs (C and D in Fig. S2). This clarifies the non-trivial relationship between microstructure and plastic response.