Research Article

Choosing experiments to accelerate collective discovery

Andrey Rzhetsky, Jacob G. Foster, Ian T. Foster, and James A. Evans
PNAS first published November 9, 2015; https://doi.org/10.1073/pnas.1509757112
Andrey Rzhetsky
aDepartments of Medicine and Human Genetics, University of Chicago, Chicago, IL 60637;
bComputation Institute, University of Chicago and Argonne National Laboratory, Chicago, IL 60637;
cInstitute of Genomic and Systems Biology, University of Chicago, Chicago, IL 60637;
  • For correspondence: arzhetsky@uchicago.edu jevans@uchicago.edu
Jacob G. Foster
dDepartment of Sociology, University of California, Los Angeles, CA 90095;
Ian T. Foster
bComputation Institute, University of Chicago and Argonne National Laboratory, Chicago, IL 60637;
eMathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60637;
James A. Evans
bComputation Institute, University of Chicago and Argonne National Laboratory, Chicago, IL 60637;
fDepartment of Sociology, University of Chicago, Chicago, IL 60637
  • For correspondence: arzhetsky@uchicago.edu jevans@uchicago.edu
  1. Edited by Yu Xie, University of Michigan, Ann Arbor, MI, and approved September 8, 2015 (received for review May 18, 2015)


Significance

Scientists perform a tiny subset of all possible experiments. What characterizes the experiments they choose? And what are the consequences of those choices for the pace of scientific discovery? We model scientific knowledge as a network and science as a sequence of experiments designed to gradually uncover it. By analyzing millions of biomedical articles published over 30 years, we find that biomedical scientists pursue conservative research strategies exploring the local neighborhood of central, important molecules. Although such strategies probably serve scientific careers, we show that they slow scientific advance, especially in mature fields, where more risk and less redundant experimentation would accelerate discovery of the network. We also consider institutional arrangements that could help science pursue these more efficient strategies.

Abstract

A scientist’s choice of research problem affects his or her personal career trajectory. Scientists’ combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity’s importance corresponds to its degree centrality, and a problem’s difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies.
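The mapping described in the abstract (importance as degree centrality, difficulty as network distance) can be illustrated with a minimal sketch. This is not the authors' code or their generative model: the toy edge list, the node names, and the helper functions `degree`, `distance`, and `describe_problem` are all hypothetical, chosen only to show how a candidate research problem (a pair of entities) could be scored on the two network properties.

```python
from collections import deque

def degree(graph, node):
    """Number of known links for an entity (a proxy for its importance)."""
    return len(graph.get(node, ()))

def distance(graph, source, target):
    """Shortest-path length between two entities via breadth-first search.

    Returns None when the entities lie in different connected components.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

def describe_problem(graph, u, v):
    """Summarize a candidate problem (a link u-v) by degrees and distance."""
    return {
        "degrees": (degree(graph, u), degree(graph, v)),
        "distance": distance(graph, u, v),
    }

# Hypothetical toy knowledge network: nodes are entities, edges are
# published relationships. Assumed data, for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E"), ("F", "G")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

near = describe_problem(graph, "B", "C")  # two neighbors of the hub A
far = describe_problem(graph, "B", "E")   # a longer path through A and D
cross = describe_problem(graph, "A", "F") # entities in separate components
```

In this toy network, linking B and C spans a distance of 2 around the high-degree node A, which is the kind of conservative choice the abstract describes, whereas linking A and F would bridge two disconnected components.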

  • complex networks
  • computational biology
  • science of science
  • innovation
  • sociology of science

Footnotes

  • ¹To whom correspondence may be addressed. Email: arzhetsky@uchicago.edu or jevans@uchicago.edu.
  • Author contributions: A.R., J.G.F., and J.A.E. designed research; A.R., J.G.F., and J.A.E. analyzed data; and A.R., J.G.F., I.T.F., and J.A.E. wrote the paper.

  • The authors declare no conflict of interest.

  • This article is a PNAS Direct Submission.

  • *The notion that linking distant literatures is hard but potentially fruitful underwrites Swanson’s work on literature-based discovery (31).

  • †Scientists often study several entities in combination. This complicates the modeling, so we approximate the discovery process with dyadic strategies.

  • ‡Some values of αμ and αι describe a mechanism analogous to preferential attachment (21, 33), in which researchers choose concepts in proportion to the product of their degrees. Our model encodes many types of preferential attachment, e.g., versions that are superlinear in the degrees. We find that such preferential attachment strategies can be much more efficient for discovery.

  • §Observed behavior is generated by the interaction between preferences and the evolving set of opportunities. This makes interpretation subtle. For example, when considering chemicals in different connected components, a specific opportunity to combine them would be preferred (i.e., has a higher probability than an opportunity to connect similar chemicals at finite distance). Over time, however, more nodes enter the giant component. Hence, fewer opportunities exist to connect nodes in different components, leading to their small absolute number (Figs. S3B and S6).

  • ¶We assume that published research reflects the underlying distribution of research effort in a relatively undistorted way. Recent survey data on scientific choice are consistent with this assumption (41). Although we interpret Fig. 1 and Fig. S6D to imply that scientists pursue less risky projects over time, it is possible that scientists pursue such projects with the same intensity, but that fewer succeed and are published in later periods. We cannot tackle this issue directly, but consider how effort is screened by experimental failure, publication bias, etc., to produce the distribution of published choices. Our interpretation assumes that although a priori “risky” strategies (like combining two distant, low-degree chemicals) may fail more often than conservative alternatives, the risk is not so high that the published record no longer reflects the underlying distribution of effort. It also requires that risky strategies do not become much riskier over time. If the selection process has these plausible properties—i.e., it is well behaved and near stationary—then changes in the observed distribution and inferred parameters will track changes in the unobserved distribution of research effort and scientific choice.

  • This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1509757112/-/DCSupplemental.

Freely available online through the PNAS open access option.

Copyright © 2021 National Academy of Sciences. Online ISSN 1091-6490