# The geometric blueprint of perovskites

Edited by Roberto Car, Princeton University, Princeton, NJ, and approved April 13, 2018 (received for review November 3, 2017)

## Significance

Perovskites constitute one of the most versatile and chemically diverse families of crystals. In perovskites the same structural template supports a staggering variety of properties, from metallic, insulating, and semiconducting behavior to superconducting, ferroelectric, ferromagnetic, and multiferroic phases. Approximately 2,000 perovskites are currently known. In this work we revisit the century-old model proposed by Goldschmidt to predict the formability of perovskites in the key of modern inferential statistics and internet data mining. We demonstrate that the nonrattling rule postulated by Goldschmidt can predict the stability of perovskites with a success rate of 80%. Using this tool we predict the existence of 90,000 hitherto unknown perovskites.

## Abstract

Perovskite minerals form an essential component of the Earth’s mantle, and synthetic crystals are ubiquitous in electronics, photonics, and energy technology. The extraordinary chemical diversity of these crystals raises the question of how many and which perovskites are yet to be discovered. Here we show that the “no-rattling” principle postulated by Goldschmidt in 1926, describing the geometric conditions under which a perovskite can form, is much more effective than previously thought and allows us to predict perovskites with a fidelity of 80%. By supplementing this principle with inferential statistics and internet data mining we establish that currently known perovskites are only the tip of the iceberg, and we enumerate 90,000 hitherto-unknown compounds awaiting study. Our results suggest that geometric blueprints may enable the systematic screening of millions of compounds and offer untapped opportunities in structure prediction and materials design.

Crystals of the perovskite family rank among the most common ternary and quaternary compounds and are central to many areas of current research (1). For example, silicate perovskites constitute the most abundant minerals on Earth (2), and synthetic oxide perovskites find applications as ferroelectrics (3), ferromagnets (4), multiferroics (5), high-temperature superconductors (6), magnetoresistive sensors (7), spin filters (8), superionic conductors (9), and catalysts (10). Halide perovskites are promising for high-efficiency solar cells, light-emitting diodes, and lasers (11–13); their double perovskite counterparts are efficient scintillators for radiation detection (14). The unique versatility of the perovskite crystal structure stems from its unusual ability to accommodate a staggering variety of elemental combinations. This unparalleled diversity raises the questions of how many new perovskites are yet to be discovered and which ones will exhibit improved or novel functionalities. In an attempt to answer these questions, we here begin by mapping the entire compositional landscape of these crystals.

Fig. 1*A* shows the structure of a cubic ABX_{3} perovskite. In this structure the A and B elements are cations and X is an anion. B-site cations are sixfold coordinated by anions to form BX_{6} octahedra. The octahedra are arranged in a 3D corner-sharing network, and each cavity of this network is occupied by one A-site cation (15). All perovskites share the same network topology, but can differ in the degree of tilting and distortions of the octahedra (15–19). The quaternary counterpart of the perovskite crystal is the double perovskite A_{2}BB′X_{6}. When the B and B′ cations alternate in a rock-salt sublattice, the crystal is called elpasolite (14) (*SI Appendix*, Fig. S1*A*). In the following we use the term “perovskite” to indicate both ternary and quaternary compounds.

## Results

How many perovskites are currently known? If we search for the keyword “perovskite” in the Inorganic Crystal Structure Database (ICSD; www2.fiz-karlsruhe.de), we find 8,866 entries, but after removing duplicates the headcount decreases to 335 distinct ABX_{3} compounds. Similarly, the keyword “elpasolite” yields 224 distinct A_{2}BB′X_{6} compounds. By also including the extensive compilations of refs. 20–26, we obtain a grand total of 1,622 distinct crystals that are reliably identified as perovskites (Dataset S1).

How many perovskites are left to discover? Direct inspection of the elemental composition in Dataset S1 indicates that, with the exception of hydrogen, boron, carbon, phosphorus, and some radioactive elements, these crystals can host every atom in the Periodic Table. Therefore an upper bound for the number of possible perovskites is given by all of the combinations of three cations and one anion. By considering only ions with known ionic radii (27) we count 3,658,527 hypothetical compounds. Our goal is to establish which of these compounds can form perovskite crystals. Ideally we should resort to ab initio computational screening (28), but these techniques are not yet scalable to millions of compounds. For example, by making the optimistic assumption that calculating a phase diagram required only 1 h of supercomputing time per compound, it would take about 420 y to complete this task.
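The time estimate above is plain arithmetic and can be checked in a few lines. The sketch below assumes the calculations run one after another and uses a 365.25-day year; the function name is illustrative and not part of the original work.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumption: serial execution, Julian year


def screening_years(n_compounds: int, hours_per_compound: float = 1.0) -> float:
    """Years needed to screen n_compounds one after another."""
    return n_compounds * hours_per_compound / HOURS_PER_YEAR


# 3,658,527 hypothetical compounds at 1 h each (figures quoted in the text)
years = screening_years(3_658_527)
print(f"{years:.0f} years")
```

Under these assumptions the serial cost comes out at a little over four centuries.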

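An empirical approach to investigate the formability of ABX_{3} perovskites was proposed by Goldschmidt almost a century ago (29). In this approach the perovskite structure is described as a collection of rigid spheres, with sizes given by the ionic radii $r_A$, $r_B$, and $r_X$. These radii are combined into two dimensionless descriptors, the tolerance factor $t = (r_A + r_X)/[\sqrt{2}\,(r_B + r_X)]$ and the octahedral factor $\mu = r_B/r_X$. Goldschmidt’s no-rattling principle sets geometric conditions on these descriptors under which a perovskite can form.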

Goldschmidt’s principle was recently tested on larger datasets than those available in 1926. By analyzing a few hundred ternary oxides and halides, it was found that in a 2D t vs. μ map perovskites and nonperovskites tend to cluster in distinct regions (22, 23, 30). Based on this observation, much work has been dedicated to identifying a “stability range,” either by postulating boundaries for t and μ (22, 23, 31) or by using machine learning to draw t vs. μ curves enclosing the data points (30). The merit of these efforts is that they addressed the predictive power of the no-rattling principle in quantitative terms. However, these approaches suffer from relying too heavily on empirical ionic radii. It is well known that the definition of ionic radii is nonunique and that even within the same definition there are variations reflecting the coordination and local chemistry (1, 15, 27, 29, 32). As these uncertainties transfer to the octahedral and tolerance factors, the stability ranges proposed so far are descriptive rather than predictive. To overcome these limitations, instead of defining a map starting from empirical data, our strategy is to construct a stability range from first principles, by relying uniquely on Goldschmidt’s hypothesis. This choice allows us to also derive a structure map for quaternary compounds, a step that has thus far remained elusive.

We describe our strategy starting from ternary perovskites, and then we generalize our findings to quaternary perovskites. Fig. 1*B* shows that, for the A cation to fit in the cavity, the radii must satisfy the condition $r_A + r_X \le \sqrt{2}\,(r_B + r_X)$, that is, a tolerance factor $t \le 1$ (the “stretching” limit; *SI Appendix*, section 3). Similarly, Fig. 1*C* shows that the octahedral coordination of the B cation by six X anions is not possible when $r_B/r_X < \sqrt{2} - 1$, that is, when the octahedral factor $\mu$ falls below the “octahedral” limit (Fig. 1*E*). When these conditions are not fulfilled, the lattice tends to distort toward a layered geometry, with edge-sharing or face-sharing octahedra or lower B-site coordination (15, 31). These two bounds are well known (15, 29) but are insufficient for quantitative structure prediction.
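The two classical bounds can be evaluated directly from ionic radii. The following is a minimal sketch that uses the standard Goldschmidt definitions, $t = (r_A + r_X)/[\sqrt{2}(r_B + r_X)]$ and $\mu = r_B/r_X$; the function names and the CaTiO_{3} radii are illustrative assumptions, not values taken from this work.

```python
import math

# Smallest octahedral factor that still allows sixfold coordination of B by X
OCTAHEDRAL_LIMIT = math.sqrt(2) - 1


def tolerance_factor(r_a: float, r_b: float, r_x: float) -> float:
    """Goldschmidt tolerance factor t for an ABX3 composition."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def octahedral_factor(r_b: float, r_x: float) -> float:
    """Octahedral factor mu = r_B / r_X."""
    return r_b / r_x


def within_classical_bounds(r_a: float, r_b: float, r_x: float) -> bool:
    """True if both the stretching (t <= 1) and octahedral limits hold."""
    t = tolerance_factor(r_a, r_b, r_x)
    mu = octahedral_factor(r_b, r_x)
    return t <= 1.0 and mu >= OCTAHEDRAL_LIMIT


# Illustrative radii in angstrom for Ca, Ti, O (assumed, Shannon-type values)
r_ca, r_ti, r_o = 1.34, 0.605, 1.40
t = tolerance_factor(r_ca, r_ti, r_o)
mu = octahedral_factor(r_ti, r_o)
print(f"t = {t:.3f}, mu = {mu:.3f}, ok = {within_classical_bounds(r_ca, r_ti, r_o)}")
```

These two checks reproduce only the well-known bounds; the tilt and chemical limits discussed next are not included.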

When *t* is smaller than 1, the corner-sharing octahedra exhibit an increased degree of tilting, and the A-site cation is displaced from the central position in the cuboctahedral cavity, as shown in Fig. 1*D* and *SI Appendix*, Fig. S2 (17, 18). Previous work has shown that the displacement of the A-site cation can be determined by optimizing Coulomb interactions between the perovskite ions, taking into account the bond-valence configuration of each ion (19). Here we explore this scenario using a purely geometric approach, by identifying the ionic positions which achieve the tightest packing of ions in a tilted perovskite configuration.

Fig. 1*D* shows that the octahedra can tilt only until two anions belonging to adjacent octahedra come into contact. In this extremal configuration, to satisfy the no-rattling principle the A-site cation must exceed a critical size. Using the geometric construction described in *SI Appendix*, section 3 and shown in *SI Appendix*, Fig. S2, this condition translates into a lower limit for the tolerance factor (*SI Appendix*, Fig. S3). We refer to this condition as the “tilt” limit (Fig. 1*E*). When this criterion is not met, the perovskite network tends to collapse into structures with edge-sharing or face-sharing octahedra. Along the same lines we must consider the limit of two neighboring A-site cations coming into contact (Fig. 1*E* and *SI Appendix*, Fig. S4*A*) and of the A and B cations touching (*SI Appendix*, Fig. S4*B*). Besides these geometric constraints, we also should take into account that the rigid spheres of the model represent chemical elements, and therefore we have additional bounds on the size of the ionic radii: The largest tolerance factor corresponds to the combination of Cs and F, and the oxidation number of A cannot exceed that of B according to Pauling’s valency rule (32). By considering all of these bounds together we obtain the perovskite region shown in Fig. 1*E*.

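These considerations are readily generalized to double perovskites. In this case we have two different cations B and B′ (*SI Appendix*, Fig. S1*A*), and therefore we must consider two octahedral parameters: the average octahedral factor, $\bar{\mu} = (r_B + r_{B'})/2r_X$, and the octahedral mismatch, $\Delta\mu = |r_B - r_{B'}|/2r_X$. In *SI Appendix*, section 4 we derive the generalized tolerance factor, which takes the form $t = (r_A/r_X + 1)/[2(\bar{\mu} + 1)^2 + \Delta\mu^2]^{1/2}$. In the space of these three descriptors the perovskite region becomes a closed volume (Fig. 2*A*). If we slice this volume through the plane $\Delta\mu = 0$ we recover the region obtained for ternary perovskites. We emphasize that our present results derive exclusively from Goldschmidt’s principle (barring the chemical limits which are not essential) and make no reference to the definition and values of the ionic radii. The bounds of the perovskite regions are easy to evaluate for any structure and are described by six inequalities for $\bar{\mu}$, $t$, and $\Delta\mu$, given in *SI Appendix*, Table S1.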
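The descriptors for a double perovskite can be sketched as follows. The expressions used here are an assumption: they follow from the rigid-sphere geometry of a rock-salt-ordered A_{2}BB′X_{6} lattice (cube edge $r_B + 2r_X + r_{B'}$, with the anion shifted toward the smaller B cation) and are intended to correspond to the derivation in *SI Appendix*, section 4, which is not reproduced in this text. The function name is hypothetical.

```python
import math


def double_perovskite_descriptors(r_a: float, r_b: float, r_b2: float, r_x: float):
    """Return (mu_bar, delta_mu, t) for an A2BB'X6 composition.

    Assumed rigid-sphere forms (not quoted from the source):
      mu_bar   = (r_B + r_B') / (2 r_X)
      delta_mu = |r_B - r_B'| / (2 r_X)
      t        = (r_A / r_X + 1) / sqrt(2 (mu_bar + 1)^2 + delta_mu^2)
    """
    mu_bar = (r_b + r_b2) / (2 * r_x)
    delta_mu = abs(r_b - r_b2) / (2 * r_x)
    t = (r_a / r_x + 1) / math.sqrt(2 * (mu_bar + 1) ** 2 + delta_mu ** 2)
    return mu_bar, delta_mu, t


# With identical B and B' cations the ternary tolerance factor is recovered.
mu_bar, delta_mu, t = double_perovskite_descriptors(1.34, 0.605, 0.605, 1.40)
t_ternary = (1.34 + 1.40) / (math.sqrt(2) * (0.605 + 1.40))
print(abs(t - t_ternary) < 1e-12)
```

The check at the end illustrates the design intent: setting the mismatch to zero reduces the three-descriptor picture to the ternary $t$ vs. $\mu$ map.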

Can the inequalities in *SI Appendix*, Table S1 be used for structure prediction? To answer this question we analyzed a record of 2,291 ternary and quaternary compounds that we collected from the ICSD and from refs. 20–26, as described in *SI Appendix*, section 2. Datasets S1–S3 include 1,622 perovskites (Dataset S1), 592 nonperovskites (Dataset S2), and 77 compounds which can crystallize either as a perovskite or as another structure (Dataset S3). Fig. 2*A* shows the distribution of all these compounds in the $(\bar{\mu}, t, \Delta\mu)$ space. A more detailed view is provided in Fig. 2 *B*–*E*, where we show slices of the perovskite volumes at fixed octahedral mismatch, and in *SI Appendix*, Fig. S5, where data for perovskites and nonperovskites are presented separately. A representative example of the compounds in Dataset S3 is BaTiO_{3}: While this compound is mostly known as a ferroelectric perovskite, it is also stable in a hexagonal structure under the same pressure and temperature conditions (33).

We now assess the predictive power of the model on quantitative grounds. The simplest way to proceed would be to classify compounds based on whether the corresponding $(\bar{\mu}, t, \Delta\mu)$ point falls inside or outside the perovskite region. However, to account for the uncertainty in the ionic radii, we represent each compound by a cuboid centered on this point, as described in *SI Appendix*, section 5. With this choice we define the “formability” as the fraction of the cuboid volume falling within the perovskite region, and we classify the compound as a perovskite if this fraction exceeds a critical value (*SI Appendix*, Fig. S6*A*). To quantify the accuracy of this classification procedure and the associated uncertainty, we determine the classification of large subsets of compounds, randomly selected from Datasets S1 and S2, and we repeat this operation several thousand times. By the central limit theorem, the average success rates tend to a normal distribution (*SI Appendix*, Fig. S6*B*); the center of this distribution gives the most probable success rate, and the standard deviation yields the statistical uncertainty (*SI Appendix*, Fig. S6*C*).
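The classification and resampling procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the region test uses only the stretching and octahedral limits as a stand-in for the six inequalities of *SI Appendix*, Table S1; the box is two-dimensional rather than a cuboid; and the box half-widths, threshold, and function names are hypothetical.

```python
import math
import random
import statistics


def in_region(mu: float, t: float) -> bool:
    """Stand-in perovskite region (assumption): stretching and octahedral limits only."""
    return t <= 1.0 and mu >= math.sqrt(2) - 1


def formability(mu, t, d_mu=0.05, d_t=0.05, n=21):
    """Fraction of a box centred on (mu, t) that lies inside the region (n x n grid)."""
    hits = 0
    for i in range(n):
        for j in range(n):
            m = mu - d_mu + 2 * d_mu * i / (n - 1)
            tt = t - d_t + 2 * d_t * j / (n - 1)
            hits += in_region(m, tt)
    return hits / (n * n)


def classify(mu, t, threshold=0.5):
    """Label a compound as a perovskite if its formability exceeds the threshold."""
    return formability(mu, t) > threshold


def success_rate_stats(labelled, sample_size, n_repeats, seed=0):
    """Mean and standard deviation of the success rate over random subsets.

    labelled is a list of ((mu, t), is_perovskite) pairs.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_repeats):
        subset = rng.sample(labelled, sample_size)
        correct = sum(classify(mu, t) == label for (mu, t), label in subset)
        rates.append(correct / sample_size)
    return statistics.mean(rates), statistics.stdev(rates)


# Toy data (hypothetical), only to show the call pattern
toy = [((0.6, 0.9), True), ((0.2, 0.9), False), ((0.6, 1.2), False), ((0.5, 0.9), True)]
mean_rate, std_rate = success_rate_stats(toy, sample_size=3, n_repeats=5)
print(mean_rate, std_rate)
```

With real data the mean of the sampled rates would play the role of the most probable success rate and the standard deviation that of the statistical uncertainty.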

Our main result is that, for sample sizes of 100 compounds or more, the geometric model correctly classifies 79.7 ± 4.0% of all compounds with a 95% confidence level. This predictive power is unprecedented among structure prediction algorithms.

For completeness we also assess how our model compares with previous models. To this end, we calculate how many of the known compounds in Datasets S1 and S2 are correctly classified within the original model of Goldschmidt (which considers only the stretching and octahedral limits); within three other empirical models reported in refs. 22, 31, and 34; and within our present model (*SI Appendix*, Fig. S7). *SI Appendix*, Fig. S7*A* shows that the stretching and octahedral limits correctly categorize nearly all perovskites in Dataset S1, but fail to discriminate against more than half of nonperovskites in Dataset S2. The empirical regions in *SI Appendix*, Fig. S7 *B–D* clearly demonstrate that by setting a lower bound to the tolerance factor, the accuracy of the model improves significantly for both perovskites and nonperovskites. However, up to now, this bound has been described via empirical data fitting. Our model predicts the tilt limit from first principles, while retaining a very good accuracy in distinguishing perovskites from nonperovskites. By comparison, the perovskite regions reported in refs. 22, 31, and 34 can be understood as zeroth- and first-order approximations to our bounds, respectively.

By applying our classification algorithm to all possible 3,658,527 quaternary combinations, we generated a library of 94,232 hitherto-unknown perovskites and double perovskites that are expected to form with a probability of 80% (*SI Appendix*, Fig. S8). The complete library of predicted perovskites is provided in Datasets S4–S6. Our library of future perovskites dwarfs the set of all perovskites currently known and is comparable in size to the ICSD database of all known inorganic crystals, which contains approximately 193,000 structures (www2.fiz-karlsruhe.de).

How many of our predicted perovskites are genuinely unknown compounds, i.e., have never been synthesized? To answer this question we performed a large-scale web data extraction operation by querying an internet search engine about each and every one of the nearly 100,000 compounds in Datasets S4–S6 (*SI Appendix*). This procedure revealed that the overwhelming majority of these compounds have never been reported or mentioned before (Dataset S4) and that fewer than 1% of the structures were already known, namely 786 of 94,232 compounds (Datasets S5 and S6).

The 786 previously known compounds reported in Datasets S5 and S6 were not included in our initial Datasets S1–S3 of known crystals. We use this additional set of known compounds to perform a second blind test of our predictions. According to our inferential analysis we expect 626 compounds of Datasets S5 and S6 (79.7% of 786) to be perovskites. By carrying out a manual literature search we confirmed that 555 crystals are indeed perovskites (Dataset S5). This result is remarkably consistent with our prediction. This blind test replaces a validation based on resource-intensive experimental synthesis of hundreds of new compounds with faster and inexpensive data analytics. The success of the blind test clearly demonstrates that, despite its simplicity, Goldschmidt’s principle has a considerable predictive power. Naturally, by combining our structure map with experiments and ab initio calculations on selected subfamilies, the predictive accuracy of the model is bound to improve even further.
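The numbers in this blind test can be reproduced with simple arithmetic. The sketch below only restates the figures quoted in the text; the variable names are illustrative.

```python
predicted_rate = 0.797  # success rate of the model quoted in the text
n_known = 786           # previously known compounds in Datasets S5 and S6
n_confirmed = 555       # compounds confirmed as perovskites (Dataset S5)

expected = predicted_rate * n_known    # expected number of perovskites
observed_rate = n_confirmed / n_known  # fraction actually confirmed

print(f"expected perovskites: {expected:.0f}")
print(f"observed fraction:    {observed_rate:.1%}")
```

The observed fraction can then be set against the predicted success rate of 79.7 ± 4.0%.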

What is the topography of our geometric structure map? Fig. 3*A* and *SI Appendix*, Fig. S9*B* show that the majority of predicted perovskites tend to cluster toward the region with the lowest octahedral, tolerance, and mismatch factors. This high density of compounds stems from the occurrence of a large number of lanthanide oxide and actinide oxide perovskites, which tend to have similar geometric descriptors due to the lanthanide contraction (35). We also note that the concentration of compounds near the bottom of the map shows that the geometric tilt limit derived in this work (Fig. 1*D*) is essential to accurately predict the formability of perovskites.

Fig. 3*B* shows the relative abundances of predicted perovskites. The majority of compounds are oxides (68%), followed by chalcogenides (16%), halides (12%), and nitrides (4%). Why is the perovskite landscape dominated by oxides, while nitrides are so rare? To answer this question we observe that the −2 oxidation state of O admits as many as 10 inequivalent charge-neutral combinations of the oxidation states of the cations (*SI Appendix*, Table S2). Furthermore, the oxygen anion has a small radius (1.3 Å), which is compatible with most transition metals, lanthanides, and actinides; and these elements form the most numerous groups in the Periodic Table. A similar argument could be made for chalcogens, which share the same oxidation state as oxygen. However, the chalcogen radii are too large (1.8–2.2 Å) to accommodate most transition metals and actinides, and hence chalcogenide perovskites constitute a much smaller family. Our finding is consistent with recent ab initio calculations (37). Halide perovskites are even less numerous than chalcogenides, mostly owing to their more restrictive −1 oxidation number: In fact, this oxidation state admits only +1 A-site cations and +1/+3 or +2/+2 B-site cations (*SI Appendix*, Table S2). Nitrides constitute an interesting exception to these trends. Indeed, while the ionic radius of N in the −3 oxidation state (1.5 Å) is very similar to that of O and its oxidation number admits as many as seven inequivalent combinations of oxidation states for the cations, in all such combinations at least one B-site cation must have the unusually high +5 oxidation state (*SI Appendix*, Table S2). As the ionic radii tend to decrease with increasing oxidation number (32), most B-site cations turn out to be too small to be coordinated by six nitrogen anions in an octahedral environment. As a result, if we exclude radioactive elements, we find fewer than 80 nitride perovskites across the entire Periodic Table.
We note that, owing to the lower electronegativity of nitrogen, the ionic character of the chemical bonds in nitride perovskites is reduced, and our geometric model is reaching its limits of applicability (31). However, the scarcity of nitride perovskites predicted by our model is fully consistent with recent ab initio calculations (38) and with experimental observations (only one ternary nitride perovskite can be found in ICSD), indicating that the rigid sphere approximation can still provide meaningful predictions for nitrides. Among our predicted compounds we also identified many unexpected binary compounds of the type A_{2}X_{3}. One such example is iron oxide, Fe_{2}O_{3}. While this oxide is primarily known in the form of hematite (corundum structure), it was recently found that the crystal undergoes a phase transition to a perovskite at high pressure and temperature (39). The stabilization of Fe_{2}O_{3} as a perovskite under high pressure is in agreement with our ab initio calculations (discussion in *SI Appendix*, section 8 and *SI Appendix*, Fig. S10) and can be associated with the well-known phase transition of ilmenites (ternary ordered corundum) into perovskites, observed, for example, for FeTiO_{3} (34). This finding suggests that several other binary compounds may hide a perovskite phase in their phase diagram, an intriguing possibility that is open to investigation.

To demonstrate the applicability of our model, we take the example of ternary oxide perovskites. Fig. 4 shows a comparison between the combinatorial screening of ternary oxides performed using our model, ab initio calculations reported in ref. 36, and experimental data collected from Datasets S1, S3, and S5. The compositions classified as perovskites by our model include 92% of the experimentally observed oxide perovskites. In particular, when A is a lanthanide and B is a first-row transition metal, our model predicts that most compositions can form as a perovskite, in excellent agreement with DFT predictions and experiment. This can be explained by the similar ionic sizes within the transition metals and within the rare earths. The same prediction is made for the case when both A and B are rare earths; however, fewer perovskites are found from DFT and experiment. The reason for this discrepancy is that the nonrattling principle effectively probes the dynamical stability of a given chemical composition in the perovskite structure. However, it does not contain information on its stability against decomposition (thermodynamic stability). Of the two criteria, the thermodynamic stability requirement is more stringent, and this explains why geometric blueprints generally tend to predict more perovskites than have actually been made or than are predicted from ab initio calculations. Therefore, further theoretical and experimental studies are required to ascertain whether these proposed compositions are also thermodynamically stable. Despite this limitation, Fig. 4 demonstrates that the Goldschmidt principle can be used as an efficient and reliable prescreening tool for the high-throughput combinatorial design of perovskites. In fact, in Fig. 4 we show that our model can reduce the number of calculations by more than 70% in the combinatorial screening of ternary oxides.
Importantly, the Goldschmidt no-rattling principle becomes increasingly useful in the context of screening all possible perovskites beyond oxides, reducing the total number of 3.6 million possible compositions by 97%, to fewer than 100,000 candidates. Therefore, by leveraging the complementary strengths of Goldschmidt’s empirical no-rattling principle and ab initio computational modeling it will be possible to explore the complete chemical landscape of all possible perovskites.
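The size of this reduction follows directly from the two counts reported earlier. A minimal check, using only numbers quoted in the text and illustrative variable names:

```python
n_total = 3_658_527   # hypothetical compositions with known ionic radii
n_predicted = 94_232  # compositions classified as perovskites by the model

reduction = 1 - n_predicted / n_total  # fraction of compositions screened out
print(f"reduction: {reduction:.1%}")
```

In a combined workflow, only the remaining candidates would need to be passed on to ab initio calculations.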

## Conclusion

We charted the complete landscape of all existing and future perovskites. By combining inferential statistics with large-scale web data extraction, we validated Goldschmidt’s no-rattling principle on quantitative grounds and developed a structure map to predict the stability of perovskites with a fidelity of 80%. Our model completes the general theory that Goldschmidt proposed almost a century ago and formalizes the nonrattling hypothesis into a mathematically rigorous set of criteria that can be used in the design and discovery of perovskites. As an outcome of our study, we were able to generate a library of almost 100,000 hitherto-unknown perovskites awaiting discovery (Dataset S4). By releasing this library in full, we hope that this work will stimulate much future experimental and computational research on these fascinating crystals. More generally, our findings suggest that geometric blueprints could serve as a powerful tool to help tackle the exponential complexity of combinatorial materials design.

## Materials and Methods

A full description of the methods, data provenance, and statistical analysis used in this paper can be found in *SI Appendix*.

## Acknowledgments

This work was supported by the Leverhulme Trust (Grant RL-2012-001), the Graphene Flagship (Horizon 2020 Grant 696656-GrapheneCore1), and the UK Engineering and Physical Sciences Research Council (Grant EP/M020517/1).

## Footnotes

^{1}M.R.F. and F.G. contributed equally to this work.

^{2}To whom correspondence should be addressed. Email: feliciano.giustino{at}materials.ox.ac.uk.

Author contributions: M.R.F. and F.G. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1719179115/-/DCSupplemental.

Published under the PNAS license.

## References

- ↵
- Muller O,
- Roy R

- ↵
- ↵
- ↵
- ↵
- Mundy JA, et al.

- ↵
- Bednorz JG,
- Müller KA

_{c}superconductivity in the Ba-La-Cu-O system. Z für Phys B Condens Matter 64:189–193. - ↵
- ↵
- Park JH, et al.

- ↵
- Navrotsky A

- ↵
- Suntivich J,
- May KJ,
- Gasteiger HA,
- Goodenough JB,
- Shao-Horn Y

- ↵
- ↵
- Tan ZK, et al.

- ↵
- ↵
- Giustino F,
- Snaith HJ

- ↵
- Megaw H

- ↵
- ↵
- ↵
- Woodward PM

- ↵
- ↵
- ↵
- ↵
- Li C,
- Soh KCK,
- Wu P

_{3}perovskites. J Alloy Compd 372:40–48. - ↵
- ↵
- Vasala S,
- Karppinen M

_{2}B′B′O_{6}perovskites: A review. Prog Solid State Chem 43:1–36. - ↵
- Galasso FS

- ↵
- Flerov IN,
- Gorev MV

- ↵
- ↵
- ↵
- ↵
- ↵
- Giaquinta DM,
- zur Loye HC

_{3}phase diagram. Chem Mater 6:365–372. - ↵
- Pauling L

- ↵
- Akimoto J,
- Gotoh Y,
- Oosawa Y

_{3}. Acta Crystallogr C 50:160–161. - ↵
- ↵
- ↵
- Emery AA,
- Saal JE,
- Kirklin S,
- Hedge VI,
- Wolverton C

- ↵
- Körbel S,
- Marques MAL,
- Botti S

- ↵
- Sarmiento-Peréz R,
- Cerqueira TFT,
- Körbel S,
- Botti S,
- Marquez MAL

- ↵

## Article Classifications

- Physical Sciences
- Physics