# Defining genetic interaction

Mani |

## Supporting Information

#### Files in this Data Supplement:

SI Tables 2-8; SI Figures 3A, 3B, and 4-12; SI Text

**Fig. 3.** Coefficients of variation for Study J data. Plot *a* shows the coefficient of variation versus the mean for each single and double mutant studied in Study J. Plot *b* shows a region of plot *a* with a high density of points, superimposed with a running median calculated with a window size of 30.

**Fig. 4.** Comparing distributions of |ε_{Min}|, |ε_{Product}|, |ε_{Log}|, and |ε_{Additive}| using data from Study J. (*a*) Pairs from Study J for which the single and double mutants exhibited slower than wild-type growth. (*b*) The subset of pairs from *a* for which the phenotypes of both single mutants were also >0.9. (*c*) The subset of pairs from *a* for which the phenotypes of both single mutants were <0.9 but >0.75. (*d*) The subset of pairs for which the phenotype of at least one of the single mutants was <0.75, and the phenotype of the other was <0.9.

**Fig. 5.** Comparing distributions of ε_{Min}, ε_{Product}, ε_{Log}, and ε_{Additive} using data from Study S in the absence of MMS. (*a*) All pairs from Study S (MMS-). (*b*) The subset of pairs from *a* for which the phenotypes of both single mutants were also >0.9. (*c*) The subset of pairs from *a* for which the phenotypes of both single mutants were <0.9.

**Fig. 6.** Comparing distributions of ε_{Min}, ε_{Product}, ε_{Log}, and ε_{Additive} using data from Study S in the presence of MMS. (*a*) All pairs from Study S. (*b*) The subset of pairs from *a* for which the phenotypes of both single mutants were <0.9 but >0.75. (*c*) The subset of pairs from *a* for which the phenotype of at least one of the single mutants was <0.75, and the phenotype of the other was <0.9.

**Fig. 7.** Comparing distributions of |ε_{Min}|, |ε_{Product}|, |ε_{Log}|, and |ε_{Additive}| using data from Study S in the absence of MMS. (*a*) All pairs from Study S (MMS-). (*b*) The subset of pairs from *a* for which the phenotypes of both single mutants were also >0.9. (*c*) The subset of pairs from *a* for which the phenotypes of both single mutants were <0.9.

**Fig. 8.** Comparing distributions of |ε_{Min}|, |ε_{Product}|, |ε_{Log}|, and |ε_{Additive}| using data from Study S in the presence of MMS. (*a*) All pairs from Study S. (*b*) The subset of pairs from *a* for which the phenotypes of both single mutants were <0.9 but >0.75. (*c*) The subset of pairs from *a* for which the phenotype of at least one of the single mutants was <0.75, and the phenotype of the other was <0.9.

**Fig. 9.** Differences in ε between Product, Min, and Additive definitions. The six plots show differences in ε between each pair of definitions for various single-mutant fitness values *W*_{x} and *W*_{y}: (*a*) Min and Product definitions, (*b*) Min and Log definitions, (*c*) Min and Additive definitions, (*d*) Product and Log definitions, (*e*) Product and Additive definitions, and (*f*) Log and Additive definitions. Dark plot regions indicate that the pair of definitions agree closely on the double-mutant fitness predicted for a non-interacting gene pair. These plots show that all four definitions produce identical results when at least one of the single mutants has wild-type fitness. They also show that Product and Log definitions yield practically equivalent results under all circumstances.

**Fig. 10.** Comparison of Product and Log definitions. This plot shows Fig. 9*d* with more resolution on the contours.

**Fig. 11.** Distributions of ε for interactions reported by Study T and Study P. Plots show how interactions identified in Study T (plot *a*) and Study P (plots *b* and *c*) map onto the corresponding ε calculated for the Product definition in Study S. Plot *b* shows only Study P_{severe} interactions. With this restriction, both Study T and Study P_{severe} (which use the Min definition) generally find more severe synthetic interactions that are also identified by the Product definition. Plot *c* shows only Study P_{slight} interactions. Those interactions reported by Study P_{slight} (which used the Min definition) and not by Study S_{Product} are near ε = 0, suggesting that the differences between these studies arise from the definition of genetic interaction.

**Fig. 12.** Comparison of number of synthetic interactions reported in Study P and Study T. Results either include (*Upper*) or exclude (*Lower*) the SF and SF-slight interaction classes in Study P that were most affected by interaction definition choice, and consider only gene pairs tested in both studies (see Table 1).

**SI Text**

**Computing distributions of ε**

For subsets of gene pairs, medians and median absolute deviations (defined as the median of the absolute deviations of the samples from the sample median) were computed using the "median" and "mad" functions in MATLAB (MathWorks). For subsets of pairs, the standard error of the median was computed using 10,000 bootstrapped samples obtained using the "bootstrp" function in MATLAB. Distributions of ε were plotted by computing histograms with a bin width of 0.04.
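The summary statistics described above can be sketched in Python. This is an illustrative stand-in for the MATLAB functions named in the text, not the code used in the study: the function names, the random seed, and the alignment of histogram bin edges to multiples of the bin width are all assumptions.

```python
import math
import random
import statistics


def median_abs_deviation(values):
    """Median of the absolute deviations of the samples from the sample median."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def bootstrap_se_of_median(values, n_boot=10_000, seed=0):
    """Standard error of the median, estimated as the standard deviation of
    medians over n_boot resamples drawn with replacement (seed is arbitrary)."""
    rng = random.Random(seed)
    n = len(values)
    medians = [statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)]
    return statistics.stdev(medians)


def histogram(values, bin_width=0.04):
    """Counts per bin, keyed by the left edge of each bin.
    Assumption: bin edges sit at integer multiples of bin_width."""
    counts = {}
    for v in values:
        index = math.floor(v / bin_width)
        counts[index] = counts.get(index, 0) + 1
    return {round(i * bin_width, 10): c for i, c in sorted(counts.items())}


# Hypothetical epsilon values, for illustration only.
eps = [-0.10, -0.02, 0.0, 0.01, 0.03, 0.05, 0.30]
print(statistics.median(eps), median_abs_deviation(eps))
print(bootstrap_se_of_median(eps, n_boot=1000))
print(histogram(eps))
```

The median and the median absolute deviation are both insensitive to a few extreme ε values, which is presumably why they were preferred over the mean and standard deviation here.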

**Measuring confidence intervals**

Confidence intervals reported for confirmation and rejection rates were computed using the Clopper and Pearson exact method as implemented by the "binofit" program in MATLAB.
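As a minimal sketch of the Clopper and Pearson exact method, the interval bounds can be found by bisection on the binomial tail probabilities. This is not the "binofit" implementation; the function names and the bisection approach are assumptions chosen so that the example needs only the Python standard library.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_increasing(func, lo=0.0, hi=1.0, iterations=100):
    """Root of an increasing function on [lo, hi], found by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial
    proportion, given k successes out of n trials."""
    half = alpha / 2
    # Lower bound: the p at which P(X >= k) equals alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else _bisect_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - half)
    # Upper bound: the p at which P(X <= k) equals alpha/2 (1 when k == n).
    upper = 1.0 if k == n else _bisect_increasing(
        lambda p: half - binom_cdf(k, n, p))
    return lower, upper


# Hypothetical counts: 5 confirmed interactions out of 10 tested.
print(clopper_pearson(5, 10))
```

For small numbers of tested pairs the exact interval is noticeably wider than a normal-approximation interval, which matters when confirmation rates are based on only a few dozen pairs.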

**Defining shared function or functional links**

A set of additional studies used the methodology described in the main manuscript for defining functional links between genes (1-3).

**Deriving the Product definition**

The Product definition is obtained by applying a multiplicative neutrality function to the fitness measure $W$, defined for a strain as $W = m/m_{wt}$, where $m$ is the exponential growth rate of that strain and $m_{wt}$ is the exponential growth rate of the wild-type strain (4). For a pair of genes ($x$, $y$), fitness values are $W_x = m_x/m_{wt}$, $W_y = m_y/m_{wt}$, and $W_{xy} = m_{xy}/m_{wt}$ for the two single mutants and the double mutant, respectively. Here, $m_x$, $m_y$, and $m_{xy}$ are the corresponding exponential growth rates. If $x$ and $y$ are noninteracting genes, then the multiplicative neutrality function used in these studies predicts that $E(W_{xy}) = W_x W_y$, where $E(W_{xy})$ is the expected double-mutant fitness. Thus the Product definition is $\varepsilon_{Product} = W_{xy} - W_x W_y$.
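A short sketch of the Product calculation, assuming fitness is the growth rate relative to wild type ($W = m/m_{wt}$) and that ε is the observed double-mutant fitness minus the product of the single-mutant fitnesses. The growth rates are hypothetical numbers chosen for illustration.

```python
def relative_fitness(m, m_wt):
    """Fitness W as the exponential growth rate relative to wild type."""
    return m / m_wt


def epsilon_product(w_x, w_y, w_xy):
    """Deviation of the observed double-mutant fitness from the
    multiplicative expectation W_x * W_y."""
    return w_xy - w_x * w_y


# Hypothetical exponential growth rates (arbitrary units).
m_wt, m_x, m_y, m_xy = 0.40, 0.36, 0.32, 0.24

w_x = relative_fitness(m_x, m_wt)
w_y = relative_fitness(m_y, m_wt)
w_xy = relative_fitness(m_xy, m_wt)

# A negative value means the double mutant grows more slowly than expected.
print(epsilon_product(w_x, w_y, w_xy))
```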

**Deriving the Additive definition**

The Additive definition is obtained by applying a multiplicative neutrality function to the fitness measure $F_{exp}$, defined for a strain as $F_{exp} = e^{m}$, where $m$ is the exponential growth rate of that strain (5, 6). For a pair of genes ($x$, $y$), fitness is $F_x = e^{m_x}$, $F_y = e^{m_y}$, and $F_{xy} = e^{m_{xy}}$, respectively, for the two single mutants and the double mutant. Here $m_x$, $m_y$, and $m_{xy}$ are the corresponding exponential growth rates. If $x$ and $y$ are noninteracting genes, then the multiplicative neutrality function used in these studies predicts that $F_{xy}/F_{wt} = (F_x/F_{wt})(F_y/F_{wt})$, where $F_{wt} = e^{m_{wt}}$ is the fitness of the wild-type strain. In terms of exponential growth rate $m$, the multiplicative expectation above becomes $m_{xy} - m_{wt} = (m_x - m_{wt}) + (m_y - m_{wt})$ (6).

Now, applying the relative-growth-rate fitness definition, $W$, fitness values for strains $x$, $y$, and $xy$ are $W_x = m_x/m_{wt}$, $W_y = m_y/m_{wt}$, and $W_{xy} = m_{xy}/m_{wt}$. The expectation may be expressed as $E(W_{xy}) = W_x + W_y - 1$. This equation forms the basis for the Additive definition. The corresponding quantitative measure of genetic interaction for this definition is $\varepsilon_{Additive} = W_{xy} - (W_x + W_y - 1)$.
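The link between the two fitness scales can be checked numerically. The sketch below assumes the exponential fitness measure is $F_{exp} = e^{m}$ and that the Additive expectation is $W_x + W_y - 1$; both forms and the growth rates are assumptions made for illustration.

```python
import math


def epsilon_additive(w_x, w_y, w_xy):
    """Deviation of the observed double-mutant fitness from W_x + W_y - 1."""
    return w_xy - (w_x + w_y - 1.0)


def neutral_double_mutant_rate(m_x, m_y, m_wt):
    """Growth rate of a noninteracting double mutant under a multiplicative
    neutrality function applied to the assumed fitness measure F = exp(m)."""
    f_wt = math.exp(m_wt)
    # F_xy / F_wt = (F_x / F_wt) * (F_y / F_wt)
    f_xy = f_wt * (math.exp(m_x) / f_wt) * (math.exp(m_y) / f_wt)
    return math.log(f_xy)


# Hypothetical exponential growth rates (arbitrary units).
m_wt, m_x, m_y = 0.40, 0.36, 0.32
m_xy = neutral_double_mutant_rate(m_x, m_y, m_wt)

# Multiplicative neutrality on exp(m) is additive neutrality on W = m / m_wt,
# so epsilon is zero up to floating-point error.
print(epsilon_additive(m_x / m_wt, m_y / m_wt, m_xy / m_wt))
```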

**Deriving the Log definition**

The Log definition is obtained by applying a multiplicative neutrality function to the fitness measure $2^{W} - 1$ (7). For a pair of genes ($x$, $y$), fitness measures are $2^{W_x} - 1$, $2^{W_y} - 1$, and $2^{W_{xy}} - 1$, respectively, for the two single mutants and the double mutant. Here, $W_x$, $W_y$, and $W_{xy}$ are the relative-growth-rate measures of fitness corresponding to the mutant strains. If $x$ and $y$ are noninteracting genes, then the multiplicative neutrality function predicts $2^{W_{xy}} - 1 = (2^{W_x} - 1)(2^{W_y} - 1)$, which in turn can be represented as $E(W_{xy}) = \log_2[(2^{W_x} - 1)(2^{W_y} - 1) + 1]$. The corresponding measure of genetic interaction for the Log definition is then $\varepsilon_{Log} = W_{xy} - \log_2[(2^{W_x} - 1)(2^{W_y} - 1) + 1]$.
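A minimal sketch of the Log calculation, assuming the expected double-mutant fitness takes the form $\log_2[(2^{W_x} - 1)(2^{W_y} - 1) + 1]$. That form is an assumption here, chosen because it reproduces the maximum Product-Log difference of 0.02155 quoted later in this text; the fitness values are hypothetical.

```python
import math


def expected_log(w_x, w_y):
    """Expected double-mutant fitness under the assumed Log definition."""
    return math.log2((2**w_x - 1) * (2**w_y - 1) + 1)


def epsilon_log(w_x, w_y, w_xy):
    """Deviation of the observed double-mutant fitness from the Log expectation."""
    return w_xy - expected_log(w_x, w_y)


# When one single mutant has wild-type fitness (W = 1), the expectation
# reduces to the fitness of the other single mutant.
print(expected_log(1.0, 0.7))

# Hypothetical pair with two deleterious single mutants.
print(epsilon_log(0.9, 0.8, 0.6))
```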

**Genetic interaction definitions used previously but not considered here**

Some other fitness measures used in conjunction with the multiplicative neutrality function (8-11) have not been considered here because we were unable to replicate the fitness scoring steps carried out in the corresponding studies across the quantitative datasets we wanted to compare in our work.

**Bias shift for pairs involving extreme single-mutant fitness defects**

As described in the main text, we observed a positive shift in the distribution of ε under each definition for gene pairs
involving an extremely deleterious mutation, relative to the distribution (for the same definition) for pairs involving less
severely deleterious mutations. We cannot rule out the possibility that definitions of genetic interaction truly have a shifted
bias in the context of more deleterious mutations, as has been recently suggested (12). Another explanation is the preferential
existence of compensatory mutations in extremely slow-growing mutant strains. This is supported by the fact that the positive
shift in ε bias we observed in Study J was more severe than that observed in Study S. Specifically, the more extreme positive
shift in ε for Study J (relative to Study S) under every genetic interaction definition may be explained by compensatory mutations
that have arisen more frequently in Study J strains. Strains carrying compensatory mutations (e.g., aneuploidy) may overtake
a population over time and affect measured growth rates, and this effect will be more pronounced for populations of doubly
mutant strains with more severe fitness defects. Study S was less prone to compensatory mutations occurring either before,
during, or after the meiosis combining the deletions. All of the parental strains with the nourseothricin resistance marker
gene (Nat^{r}) were freshly generated from wild type and would have had less opportunity for aneuploidy or other compensatory mutation
than strains from the preexisting library containing strains with the kanamycin resistance marker gene (Kan^{r}). Furthermore, growth rates for Kan^{r} and Nat^{r} strains corresponding to the same deleted gene were compared and where differences were observed, the Kan^{r} and/or Nat^{r} strain was rederived from wild type. If this did not correct the difference in growth rates, the corresponding gene was eliminated
from further analyses. Thus, Study S was less prone to compensatory mutations occurring before the meiotic combination of
two deletion alleles. Furthermore, Study S strains were generated in the absence of MMS. Because no strains in Study
S had severe growth defects in the absence of MMS, there was less selection for compensatory mutations both before and after
the meiosis combining the two deletion alleles. In Study J, it was argued that compensatory mutations present before the meiosis
would not lead to any bias since the compensatory mutations would randomly segregate to wild-type, single- and double-mutant
strains. Accepting this argument, the positive shift in ε observed for extreme relative to moderate mutations could still
be due to compensatory mutations occurring after the meiosis. Alternatively, mutations that suppress the growth defect of
a mutation with which they were coselected may not impact growth similarly in the absence of that mutation. Thus, the fact that
the study with the greater positive bias in ε was also the study more prone to compensatory mutation is consistent with (but
by no means proves) the idea that compensatory mutations are causing the bias.

**Data**

**Difference between Product and Log definitions**

For a given pair of deleterious single mutations, the predictions for double-mutant phenotype from the Product and Log definitions presented herein are practically indistinguishable. In SI Fig. 9*d*, we demonstrate this fact using a heatmap of the absolute difference between the double-mutant fitness predictions for the two definitions over all deleterious fitness values of $W_x$ and $W_y$. In SI Fig. 10, we show the same plot but with the scale modified. It is clear from this plot that the numerical difference between the two definitions has a peak at $W_x = W_y = 0.5$. The maximum difference in ε between Product and Log definitions is 0.02155. Note that the difference between Log and Product definitions becomes significantly larger for advantageous mutations, so that Product and Log models are only practically equivalent for deleterious mutations.
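The peak described above can be checked with a simple grid scan over deleterious fitness values. The sketch assumes the Product expectation is $W_x W_y$ and the Log expectation is $\log_2[(2^{W_x} - 1)(2^{W_y} - 1) + 1]$; the grid resolution is an arbitrary choice.

```python
import math


def expected_product(w_x, w_y):
    """Expected double-mutant fitness under the Product definition."""
    return w_x * w_y


def expected_log(w_x, w_y):
    """Expected double-mutant fitness under the assumed Log definition."""
    return math.log2((2**w_x - 1) * (2**w_y - 1) + 1)


def max_abs_difference(steps=100):
    """Largest |Product - Log| expectation on a grid over [0, 1] x [0, 1].
    Returns (difference, w_x, w_y) at the grid point where it occurs."""
    best = (0.0, 0.0, 0.0)
    for i in range(steps + 1):
        for j in range(steps + 1):
            w_x, w_y = i / steps, j / steps
            diff = abs(expected_product(w_x, w_y) - expected_log(w_x, w_y))
            if diff > best[0]:
                best = (diff, w_x, w_y)
    return best


diff, w_x, w_y = max_abs_difference()
print(f"max difference {diff:.5f} at W_x = {w_x}, W_y = {w_y}")
```

Because ε is the observed fitness minus the expectation, a difference in expectations translates directly into the same difference in ε. On this grid the scan lands on $W_x = W_y = 0.5$ with a difference of about 0.0216, in line with the value quoted in the text.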

**Overlap of Study J with Study T and Study P**

Among the 45 pairs tested by both Study T and Study J is the pair *NUP60*-*CTF18*, which is labeled as synthetic by Study T. This pair has negative ε for all four definitions. Among the 56 pairs tested by both Study J and Study P are two interactions labeled as synthetic by Study P: *VAC14*-*CCR4* (interaction type SL/SF) and *RAD52*-*RML2* (interaction type SF). The value of ε in Study J is negative for all four definitions for *VAC14*-*CCR4*, but for *RAD52*-*RML2* only the Min definition has a negative ε.

1. Lee I, Date SV, Adai AT, Marcotte EM (2004) A probabilistic functional network of yeast genes. *Science* 306:1555-1558.

2. Wong SL, Zhang LV, Roth FP (2005) Discovering functional relationships: biochemistry versus genetics. *Trends Genet* 21:424-427.

3. Rual JF, *et al.* (2005) Towards a proteome-scale map of the human protein-protein interaction network. *Nature* 437:1173-1178.

4. St. Onge RP, *et al.* (2007) Systematic pathway analysis using high-resolution fitness profiling of combinatorial gene deletions. *Nat Genet* 39:199-206.

5. Szafraniec K, Wloch DM, Sliwa P, Borts RH, Korona R (2003) Small fitness effects and weak genetic interactions between deleterious mutations in heterozygous loci of the yeast *Saccharomyces cerevisiae*. *Genet Res* 82:19-31.

6. Jasnos L, Korona R (2007) Epistatic buffering of fitness loss in yeast double deletion strains. *Nat Genet* 39:550-554.

7. Sanjuan R, Elena SF (2006) Epistasis correlates to genomic complexity. *Proc Natl Acad Sci USA* 103:14402-14405.

8. Hartman JL, Tippery NP (2004) Systematic quantification of gene interactions by phenotypic array analysis. *Genome Biol* 5:R49.

9. Schuldiner M, *et al.* (2005) Exploration of the function and organization of the yeast early secretory pathway through an epistatic miniarray profile. *Cell* 123:507-519.

10. Collins SR, Schuldiner M, Krogan NJ, Weissman JS (2006) A strategy for extracting and analyzing large-scale quantitative epistatic interaction data. *Genome Biol* 7:R63.

11. Collins SR, *et al.* (2007) Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map. *Nature* 446:806-810.

12. Beerenwinkel N, *et al.* (2007) Analysis of epistatic interactions and fitness landscapes using a new geometric approach. *BMC Evol Biol* 7:60.