# Quantifying the information impact of future searches for exoplanetary biosignatures


Edited by Neta A. Bahcall, Princeton University, Princeton, NJ, and approved July 17, 2020 (received for review April 20, 2020)

## Significance

The search for life on extrasolar worlds by way of spectroscopic biosignature detection is among the most compelling scientific endeavors of the next decades. This article explores the implications of either discovering or ruling out the presence of detectable biosignatures on planets within a few tens of light years from Earth, a distance within reach of future searches. Using a Bayesian methodology, we show that not detecting biosignatures in such a sample volume would bring no added information about the galactic population of life-hosting exoplanets. Conversely, if life arose independently on other planets, even a single detection would imply that exobiospheres are more abundant than pulsars. Putative interstellar transfer of life through the panspermia mechanism may, however, significantly lower this estimate.

## Abstract

One of the major goals for astronomy in the next decades is the remote search for biosignatures (i.e., the spectroscopic evidence of biological activity) in exoplanets. Here we adopt a Bayesian statistical framework to discuss the implications of such future searches, both in the case when life is detected and when no definite evidence is found. We show that even a single detection of biosignatures in the vicinity of our stellar system, in a survey of similar size to what will be obtainable in the next 2 decades, would significantly affect our prior belief about the frequency of life in the universe, even starting from a neutral or pessimistic stance. In particular, after such a discovery, an initially agnostic observer would be led to conclude that there are more than

Over the past 2 decades, astronomical observations have detected thousands of planets orbiting other stars in our galaxy, allowing us to draw robust statistical conclusions on the populations of such planets (1). Generally speaking, it is now believed that every star in our galaxy should have at least one planet (2) and that many such planets have physical features that may be conducive to the presence of life (3–5).

With the focus of current research rapidly shifting from the detection of exoplanets to their characterization (in particular, to the study of their atmospheric composition), we are getting closer to the goal of looking for spectroscopic signatures of biological activity on other worlds (6–9). In the near term, the Transiting Exoplanet Survey Satellite (TESS) (10), CHaracterising ExoPlanet Satellite (CHEOPS) (11), and PLAnetary Transits and Oscillations of stars (PLATO) (12) space missions will refine the sample of potentially habitable nearby planets most suitable for follow-up observations. Over the next couple of decades, there will be realistic opportunities for attempting the detection of biosignatures on the most promising targets, both from the ground (e.g., with the European Extremely Large Telescope*) and with dedicated space observatories [such as the James Webb Space Telescope (JWST) (13) or the Atmospheric Remote-sensing Exoplanet Large-survey (ARIEL) (14)]. On a longer time scale, envisioned missions such as the Habitable Exoplanet Observatory (HabEx^{†}), the Large UV/Optical/IR Surveyor^{‡}, and the Origins Space Telescope^{§} might attempt biosignature detection through the direct imaging of habitable rocky exoplanets.

Since technological limitations will initially restrict the search for biosignatures to the immediate vicinity of our stellar system (i.e., within a few tens of light years), a rigorous statistical treatment will be necessary in order to draw conclusions on the possible distribution of inhabited planets in the entire galaxy from a survey of limited spatial extent. This will be true both in the case of a positive detection of life on one or more exoplanets in a given volume and in the case where no evidence is found.

Here we suggest an approach to this problem based on the adoption of a Bayesian perspective, showing how existing knowledge or credence on the presence of life beyond Earth will be updated as new evidence is collected from future missions. A notable previous application of the Bayesian methodology in the context of life emerging in the universe was the attempt to quantify the rate of abiogenesis conditioned on a single datum, i.e., the early appearance of life on Earth, combined with the evidence that it took

Our study tackles the issue of how frequent life is in the universe from a different perspective. We bypass the question of the time scales involved in abiogenesis and focus instead on the present abundance of inhabited planets in the galaxy. In particular, we are interested in assessing the impact of new data (those that could possibly be collected in the next 2 decades) in terms of information gain with respect to existing credence on the probability of life on other planets. We suggest a way to disentangle this unknown probability from others that can in principle be estimated independently, in particular, those pertaining to the probability that a specific survey can in fact observe habitable planets. A related question addressed by our study is how hypotheses that assign a lower or higher credence to the presence of life outside Earth (i.e., a pessimistic, neutral, or optimistic attitude toward extraterrestrial life) are weighed and compared in light of new, sparse evidence. Finally, we consider how our results are altered when accounting for the possibility that the distribution of life is correlated over some characteristic distance, such as in panspermia scenarios.

## Methods

### Main Assumptions.

Our statistical model assumes that there are N potentially habitable planets in the Milky Way (i.e., rocky planets orbiting within the habitable zone of their host star) and that a survey has looked for spectroscopic biosignatures within a radius R centered around Earth. Statistical estimates based on available data suggest that the percentage of Sun-like (GK-type) and M dwarf stars in our galaxy hosting rocky planets in the habitable zone is about 10 to 20% and 24%, respectively (3–5), resulting in a number of potentially habitable planets of order

The probability p is a shorthand for the various factors that concur to make the presence of detectable biosignatures possible. In our Bayesian analysis we distinguish the factors ascribed to the selection effects of a specific survey from those that are truly inherent to the presence of biosignatures. To this end, we adopt a formalism similar to the one first suggested in ref. 18: this is akin to the Drake equation (19) used in the context of the search for extraterrestrial intelligence but adapted to the search for biosignatures. In our notation, this reduces to writing down the probability p as the product of independent probabilities:

The other probability factors in Eq. **3**,

Finally,

In the present work, we leave unaddressed the possibility of both false negatives (biosignatures that are present but go undetected) and false positives (gases of abiotic origin that are mistakenly interpreted as products of life): however, we note that in principle, both can be incorporated in our formalism through another probability factor, following, for example, the Bayesian framework outlined in refs. 20 and 21 (*SI Appendix*, section II). Our procedure could also be easily specialized for technosignatures, incorporating the appropriate probabilistic factors, along the lines of ref. 22.

In modeling **4** assumes that the density profile of habitable exoplanets is proportional to that of stars in the galaxy, other factors such as the metallicity gradient may affect the overall radial dependence of *SI Appendix*, section III).

Throughout this work, we take an observational radius of *SI Appendix*, section I).

The probability that a survey searching for biosignatures within R finds remotely detectable biosignatures on exactly **1** as
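The expression referred to above did not survive in this copy of the text. As a minimal sketch of the kind of counting probability involved, and not a reconstruction of the authors' equation, the snippet below assumes that `n` planets inside the survey radius each carry detectable biosignatures independently with the same probability `p`, which gives a binomial distribution. The names `n`, `p`, `prob_k_detections`, and `prob_at_least_one` are illustrative assumptions.

```python
from math import comb


def prob_k_detections(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k detections among n surveyed planets.

    Assumption (not from the paper): planets are independent and share
    the same per-planet probability p of showing detectable biosignatures.
    """
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_at_least_one(n: int, p: float) -> float:
    """Probability of one or more detections, i.e. 1 - P(no detection)."""
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Hypothetical survey of 50 planets with a per-planet probability of 1%.
    print(prob_k_detections(0, 50, 0.01))
    print(prob_at_least_one(50, 0.01))
```

The independence assumption is exactly what the panspermia discussion later relaxes, so this sketch applies only to the uncorrelated case.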

### Bayesian Analysis.

By isolating the probability factor in p that pertains to astrophysical and observational constraints, **6** gives the likelihood of

We consider three different models of the prior defined in the interval

Finally, we consider here only two events resulting from the survey: nondetection, *SI Appendix*, section I).
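The Bayesian update described in this section can be illustrated numerically. The following is a sketch under stated assumptions rather than the authors' calculation: it takes a binomial likelihood for `k` detections among `n` surveyed planets, a log-uniform prior on the per-planet probability between a hypothetical lower bound `p_min` and 1, and evaluates the posterior on a log-spaced grid. All function names, `p_min`, `n`, and the grid size are illustrative choices.

```python
import math


def log_grid(p_min: float, p_max: float = 1.0, m: int = 2001) -> list:
    """Grid of m probabilities evenly spaced in log10 between p_min and p_max."""
    a, b = math.log10(p_min), math.log10(p_max)
    return [min(1.0, 10 ** (a + (b - a) * i / (m - 1))) for i in range(m)]


def likelihood(p: float, n: int, k: int) -> float:
    """Binomial likelihood of k detections among n planets (assumed model)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def posterior_weights(grid: list, n: int, k: int) -> list:
    """Posterior mass per grid point under a log-uniform prior.

    A log-uniform prior gives equal prior mass to every point of a grid
    that is evenly spaced in log p, so the posterior mass is simply the
    normalized likelihood evaluated on that grid.
    """
    w = [likelihood(p, n, k) for p in grid]
    total = sum(w)
    return [x / total for x in w]


def ccdf(grid: list, weights: list, p0: float) -> float:
    """Probability mass assigned to values of p larger than p0."""
    return sum(w for p, w in zip(grid, weights) if p > p0)


if __name__ == "__main__":
    grid = log_grid(1e-5)                      # hypothetical prior range
    prior = [1.0 / len(grid)] * len(grid)      # log-uniform prior mass
    no_detection = posterior_weights(grid, n=50, k=0)
    one_detection = posterior_weights(grid, n=50, k=1)
    for label, w in [("prior", prior), ("k=0", no_detection), ("k=1", one_detection)]:
        print(label, ccdf(grid, w, 1e-3))
```

Working on a grid that is uniform in log p avoids carrying the 1/p prior density explicitly, which keeps the sketch short; a finer grid or an analytic integral would be preferable for quantitative work.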

## Results and Discussion

### Noninformative Prior.

We start by assuming no prior knowledge on even the scale of

Fig. 1*A* compares the impact of observing or not observing biosignatures within 100 ly. In the case of nondetection, the posterior PDF of *D*) is somewhat smaller than the corresponding prior CCDF, the main deviation being an upper cutoff for *SI Appendix*, Fig. S8). In particular, reducing

By contrast, the discovery of biosignatures on even a single planet within the entire survey volume (*A* and *D* (dotted lines) (*SI Appendix*, section I). We further note that although changing *SI Appendix*, Fig. S8).

To provide a more complete analysis of the noninformed case, we have considered also the log–log-uniform prior, which has been designed to reflect total ignorance about the number of conditions conducive to life (24). Although the log–log-uniform PDF slightly favors large values of *A* and *D* (*SI Appendix*, section IV).

### Informative Priors.

A log-uniform PDF is probably the best prior reflecting the lack of information on

Fig. 1 *B* and *C* show the posterior PDFs and CCDFs resulting from detection or nondetection starting from a pessimistic hypothesis about *C* and *F* the prior strongly constrains the posteriors resulting from the events of both detection and nondetection. In particular, the smallness of *F*), not justifying thus a substantial revision of the initial optimistic stance.

### Model Comparison.

By adopting impartial judgement about the probability of

Model comparison through the Bayes factor (Fig. 2) shows that if no detection is made, a pessimistic credence with regard to extraterrestrial life would strongly increase its likelihood with respect to an optimistic one, with a Bayes factor above 10, only if
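To make the notion of a Bayes factor between a pessimistic and an optimistic credence concrete, here is a hedged sketch. It does not use the priors of this paper, which are not recoverable from this copy of the text; instead it assumes Beta priors as stand-ins, because the marginal likelihood of a binomial outcome under a Beta prior has a closed form. The prior parameters and the function names are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def marginal_likelihood(k: int, n: int, a: float, b: float) -> float:
    """Probability of k detections among n planets, averaged over a Beta(a, b) prior.

    Uses the beta-binomial identity
    P(k | n, a, b) = C(n, k) * B(k + a, n - k + b) / B(a, b).
    """
    return comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))


def bayes_factor(k: int, n: int, prior_1: tuple, prior_2: tuple) -> float:
    """Ratio of marginal likelihoods of model 1 to model 2 for the same data."""
    return marginal_likelihood(k, n, *prior_1) / marginal_likelihood(k, n, *prior_2)


if __name__ == "__main__":
    pessimistic = (1.0, 20.0)   # hypothetical prior concentrated near p = 0
    optimistic = (20.0, 1.0)    # hypothetical prior concentrated near p = 1
    n_planets = 5               # hypothetical number of surveyed planets
    print(bayes_factor(0, n_planets, pessimistic, optimistic))
    print(bayes_factor(1, n_planets, pessimistic, optimistic))
```

A Bayes factor above 10 in favor of one model is the threshold the text treats as strong support; how quickly that threshold is reached depends heavily on the priors chosen, which is why the stand-in values here should not be read as results.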

### Impact of Panspermia Scenarios.

So far, we have assumed that any given planet has some probability of harboring life independently of whether or not other planets harbor life as well. However, in general, this may not be the case. For example, according to the hypothetical panspermia scenario, life might be transferred among planets, within the same stellar system, in stellar clusters, or over interstellar distances (25–28). If conditions favor the flourishing of a biosphere within a relatively short time scale after the transfer, this would result in an enhanced probability that a planet is inhabited if a nearby planet is inhabited as well (29). In this way, if panspermia can occur, the probability that two planets simultaneously produce biosignatures will depend on their relative distance and on a typical length scale that we denote ξ, defined by the ability of life to survive transfer and establish a biosphere.

We took this possibility into account by modeling the statistical correlation of biosignatures and rewriting

In principle, different models of correlation could be linked to specific panspermia mechanisms, and various scenarios might even be distinguished observationally from an independent abiogenesis (29). This could be an interesting subject for future studies. However, here we are only interested in how the presence of generic correlations would impact the statistical significance of biosignature detection. We adopt a simple model for the pair distribution function, by assuming that it depends on the relative distance **13** reduces to

As shown in Fig. 3*A*, the average number of biosignatures within R, *SI Appendix*, section IIIB). For much larger values of ξ, panspermia would distribute life homogeneously through the entire galaxy, and

Fig. 3*B* shows that if possible correlations in the biosignatures are taken into account, the probability that the number of life-bearing planets in the galaxy is larger than a given value (taken as *C*) of the optimistic scenario with respect to the noninformative one. Depending on ξ and χ, there may be no gain in knowledge when life is detected elsewhere, and for

## Data Availability.

All data are included in the manuscript and *SI Appendix*.

## Acknowledgments

A.B. acknowledges support by the Italian Space Agency (DC-VUM-2017-034, Grant 2019-3 U.O Life in Space) and by Grants FQXi-MGA-1801 and FQXi-MGB-1924 from the Foundational Questions Institute and Fetzer Franklin Fund, a donor-advised fund of Silicon Valley Community Foundation.

## Footnotes

^{1}To whom correspondence may be addressed. Email: amedeo.balbi{at}roma2.infn.it or claudio.grimaldi{at}epfl.ch.

Author contributions: A.B. and C.G. designed research, performed research, and wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2007560117/-/DCSupplemental.

Published under the PNAS license.

## References

- ↵
- N. M. Batalha

- ↵
- ↵
- E. A. Petigura,
- A. W. Howard,
- G. W. Marcy

- ↵
- J. K. Zink,
- B. M. S. Hansen

- ↵
- C. D. Dressing,
- D. Charbonneau

- ↵
- S. Seager

- ↵
- J. L. Grenfell

- ↵
- Y. Fujii,
- D. Angerhausen,
- R. Deitrick

- ↵
- N. Madhusudhan

- ↵
- G. R. Ricker et al.

- ↵
- C. Broeg,
- W. Benz,
- A. Fortier

- ↵
- H. Rauer et al.

- ↵
- ↵
- G. Tinetti et al.

- ↵
- D. S. Spiegel,
- E. L. Turner

- ↵
- J. Chen,
- D. Kipping

- ↵
- J. Haqq-Misra,
- R. K. Kopparapu,
- E. T. Wolf

- ↵
- S. Seager

- ↵
- F. D. Drake

- ↵
- S. I. Walker et al.

- ↵
- D. C. Catling et al.

- ↵
- M. Lingam,
- A. Loeb

- ↵
- A. Misiriotis,
- E. Xilouris,
- J. Papamastorakis,
- P. Boumis,
- C. Goudis

- ↵
- B. C. Lacki

- ↵
- ↵
- ↵
- I. Ginsburg,
- M. Lingam,
- A. Loeb

- ↵
- H. W. Lin,
- A. Loeb

- ↵

## Article Classifications

- Physical Sciences
- Astronomy