Chloroplast competition is controlled by lipid biosynthesis in evening primroses

In most eukaryotes, organellar genomes are transmitted preferentially by the mother, but molecular mechanisms and evolutionary forces underlying this fundamental biological principle are far from understood. It is believed that biparental inheritance promotes competition between the cytoplasmic organelles and allows the spread of so-called selfish cytoplasmic elements. Those can be, for example, fast replicating or aggressive chloroplasts (plastids) that are incompatible with the hybrid nuclear genome and therefore maladaptive. Here we show that the ability of plastids to compete against each other is a metabolic phenotype determined by extremely rapidly evolving genes in the plastid genome of the evening primrose Oenothera. Repeats in the regulatory region of accD (the plastid-encoded subunit of the acetyl-CoA carboxylase, which catalyzes the first and rate-limiting step of lipid biosynthesis), as well as in ycf2 (a giant reading frame of still unknown function), are responsible for the differences in competitive behavior of plastid genotypes. Polymorphisms in these genes influence lipid synthesis and most likely profiles of the plastid envelope membrane. These in turn determine plastid division and/or turn-over rates and hence competitiveness. This work uncovers cytoplasmic drive loci controlling the outcome of biparental chloroplast transmission. Here, they define the mode of chloroplast inheritance, since plastid competitiveness can result in uniparental inheritance (through elimination of the “weak” plastid) or biparental inheritance (when two similarly “strong” plastids are transmitted).

Significance statement
Plastids and mitochondria are usually uniparentally inherited, typically maternally. When the DNA-containing organelles are transmitted to the progeny by both parents, evolutionary theory predicts that the maternal and paternal organelles will compete in the hybrid. As their genomes do not undergo sexual recombination, one organelle will “try” to outcompete the other, thus favoring the evolution and spread of aggressive cytoplasms. The investigations described here in the evening primrose, a model species for biparental plastid transmission, have discovered that chloroplast competition is a metabolic phenotype. It is conferred by rapidly evolving genes that are encoded on the chloroplast genome and control lipid biosynthesis. Due to their high mutation rate these loci can evolve and become fixed in a population very quickly.

Author Contributions
J.S. performed the main experimental work. P.G., A.F., J.K., D.W., M.A.S., H.G., T.O., T.P., B.B.S., and S.G. provided supportive data. A.F. and S.G. developed the correlation mapping approach, J.K. implemented PGLS. All authors analyzed and discussed the data. B.B.S. and S.G. designed the study. S.G. and J.S. wrote the manuscript. P.G., A.F., D.W., M.A.S., H.G., R.B. and B.B.S. participated in writing.

to be associated with inheritance strength (Figs. S1, S2, S6-S10; Supplementary Text). Similar to the natural variation observed in the wild types, mutations detected in the coding regions of these genes were in-frame indels (Figs. S1 and S2, Data S2). This analysis did not support an involvement of oriB or ycf1 in the inheritance phenotype (Supplementary Text).
Interestingly, in the mutants, the strong sequence variation of the oriB region is not associated with inheritance strength. Furthermore, the second replication origin (oriA) was found to be nearly identical within all sequenced wild type or mutant plastomes. This finding argues against plastid DNA (ptDNA) replication per se being responsible for differences in chloroplast competitiveness. This conclusion is in line with previous analyses of the Oenothera replication origins, which had suggested that their variability does not correlate with the competitive strength of the plastids [17,18] (also see Supplementary Text). We further confirmed this by determining the relative ptDNA amounts of chloroplasts with different inheritance strengths in a constant nuclear background. No significant variation of ptDNA amounts was observed over a developmental time course in these lines, thus excluding ptDNA stability and/or turnover as a potential mechanism (Fig. S11 and Supplementary Text). Moreover, no significant differences in nucleoid numbers per chloroplast or in nucleoid morphology were observed, as judged from DAPI staining (Figs. S12 and S13, Supplementary Text).
Next, we conducted a more detailed analysis of accD and ycf2. In a constant nuclear background, the weak wild type plastome IV appeared to be an accD overexpressor when compared to the strong wild type plastome I, as judged from northern blot analyses. However, this overexpression could not be detected in the plastome I variants, which differ in their competitive ability. Similar results were obtained for ycf2. Here, a band of about 7 kb, reflecting the predicted size of the full-length transcript, is observable (Fig. S14 and Supplementary Text). Interestingly, lower bands, probably reflecting transcript processing and/or degradation intermediates, differ between the strong and the weak wild types I and IV. They can be further correlated with competitive ability in the plastome I variants.
Since these analyses did not allow conclusions about the functionality of AccD or Ycf2 in our lines, we decided to determine the acetyl-CoA carboxylase (ACCase) activity in isolated chloroplasts. It appeared that mutations/polymorphisms in the reading frame of accD influence ACCase enzymatic activity. Surprisingly, mutations/polymorphisms in ycf2 also have an influence on ACCase activity, as revealed by lines that are not affected by mutations in accD (Fig. 3A). The molecular nature of this functional connection between Ycf2 and ACCase activity is currently unclear. A simple correlation of ACCase activity with the competitive ability of plastids is not present, but alterations in the earliest step of fatty acid biosynthesis can conceivably result in various changes in lipid metabolism (see also Supplementary Text).

the accD gene, including promoter, 5'-UTR and coding region (CDS). The accD CDS, starting from +1, is highlighted in green. Regions marked by "interval 1" and "interval 2" display nearly absolute correlation to inheritance strength (see Supplementary Text and Data S1 for details). Note that these sequence intervals span the promoter region, the 5'-UTR and the 5'-end of accD. All three are considered to play a regulatory role [15,26-29]. Lower panel: Amino acid sequence of the AccD N-terminus and correlation to inheritance strength. Most variation in the sequence is conferred by repeats encoding glutamic acid-rich domains (also see Supplementary Text). Fisher's exact test (*** p < 0.0001, ** p < 0.001, * p < 0.01).

(B) Crosses of I-johSt and variants with I-hookdV as male and, (D) with IV-atroSt as female parent. Box-plots represent the transmission frequencies of the paternal plastomes measured by MassARRAY®. To test for significant differences to I-johSt, Kruskal-Wallis one-way ANOVA on ranks was performed (* p < 0.05).

accD OE = accD overexpressor; AccD N-ter = AccD N-terminus; Ycf2 site 2 and Ycf2 site 3; see Supplementary Text. Compared to I-johSt: - = not affected, + = mildly affected, ++ = intermediately affected, and +++ = strongly affected, n/d = not determined (cf. Figs. S1, S2, S8-S10 and S14). Note the influence of mutations in ycf2 on AccD activity in a nonmutated accD background (yellow box), the strong correlation of mutations in the AccD N-terminus with ACCase activity (green box), and the correlation of mutations in site 3 of Ycf2 with inheritance strengths (blue box). Significance of difference compared to I-johSt was calculated using paired two-

Plant material
Throughout this work, the terms Oenothera or evening primrose refer to subsection Oenothera (genus Oenothera section Oenothera) [30]. Plant material used here is derived from the Oenothera germplasm collection harboured at the Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany [31]. Part of this collection is the so-called Renner Assortment, a collection of lines thoroughly characterized by the genetic school of Otto Renner [12,32]. Therefore, the original material of Franz Schötz in which he determined the distinct classes of chloroplast replication speeds [6,33] was available for correlation mapping. For all other genetic or physiological work presented here, the nuclear genetic background of O. elata ssp. hookeri strain johansen Standard [34] was used. The chloroplast genomes employed are either native or were introgressed into that race by Wilfried Stubbe or Stephan Greiner. The wild type chloroplast genomes (I-johSt, I-hookdV, II-suavG, and IV-atroSt) are compatible with and hence green when combined with the johansen Standard nucleus.
The chloroplast genome III-lamS confers a reversible bleaching, so-called virescent, phenotype in this genetic background [30,35] (Fig. S15). The white chloroplast mutants I-chi and IV-delta (Fig. S15) are part of the huge collection of spontaneous plastome mutants compiled by Wilfried Stubbe and coworkers [23,36,37]. Both mutants harbour a similar single locus mutation in the psaA gene (encoding a core subunit of photosystem I) and derive from the strong and weak wild type plastomes I-hookdV and IV-atroSt, respectively [23]. The plastome mutator line is a descendant of the original isolate E-15-7 of Melvin D. Epp. The nuclear pm allele was identified after an ethyl methanesulfonate mutagenesis [38] in johansen Standard. From descendants of this line the variant chloroplast genomes (V1a, V1b, V2a, etc., VC1, III-V1 and III-V2) with altered inheritance strengths were generated from the strong chloroplast genomes I-johSt and III-lamS, respectively [14] (see below). A summary of all strains and origins of the chloroplast genomes is given in Tables S4-S9.

Germination, plant cultivation and cross pollination
Fresh seeds were germinated on wet filter paper at 27°C and 100-150 µE m⁻² s⁻¹. With this method essentially 100% germination was achieved after 1-3 days. If desired, seedlings were then cultivated to the appropriate developmental stage in a glasshouse. Crosses with flowering plants were performed as published earlier [23,31].

Plastome mutator mutagenesis
Chloroplast genome mutagenesis was conducted as previously described [16]. In brief, johansen Standard plants newly restored to homozygosity for the nuclear plastome mutator allele (pm/pm) were employed to mutagenize the chloroplast genome (I-johSt). When homozygous, the plastome mutator causes a 200-1000x higher rate of chloroplast mutants compared to the spontaneous frequency. The underlying mutations mostly represent indels originating from replication slippage events [16,38-41].
Homozygous pm plants were identified when new mutant chlorotic sectors were observed on them. On those plants, flowers on green shoots were backcrossed to the wild type PM allele as pollen donor. In the resulting pm/PM populations the chloroplast mutations were stabilized. This led (after repeated backcrosses with the PM allele and selection with appropriate markers against the paternal chloroplast) to homoplasmic green variants of the strong chloroplast genome I-johSt that differed by certain indels or combination of indels. The material was designated V1a, V1b, V2a, etc., where "V" stands for variant, the Arabic number for the number of the backcrossed plant in the experiment, and the small Latin letter for the shoot of a given plant. An additional line, named VC1, derived from a similar plastome mutator mutagenesis of I-johSt, but the mutagenesis was conducted over several generations. Due to this fact VC1, which is also a green variant, carries a much higher mutational load than do variants V1a, V1b, V2a, etc. (Table S8). The two variant chloroplast genomes III-V1 and III-V2 (Table S9) have a derivation similar to VC1. They are derived from the strong wild type chloroplast genome III-lamS, which displays a reversible bleaching (virescent phenotype) in the johansen Standard nuclear genetic background. To mutagenize this chloroplast genome, it was introgressed into the pm/pm background of johansen Standard by Wilfried Stubbe and selfed for a number of generations.
When stabilized with the PM allele, it still displayed a virescent phenotype that is comparable to the original wild type plastome III-lamS (Fig. S15).

Determination of plastid inheritance strength
In the evening primroses biparental transmission of plastids shows maternal dominance, i.e. F1 plants are either homoplasmic for the maternal chloroplast or heteroplasmic for the paternal and maternal chloroplasts. If in such crosses one of the chloroplasts is marked by a mutation, resulting in a white phenotype, the proportion of variegated (green/white; i.e. heteroplasmic) seedlings can be used to determine chloroplast inheritance strength (as percentage of biparental inheritance). Moreover, if in such crosses one of the crossing partners is kept constant, the inheritance strength of all tested chloroplasts with respect to the constant one is determined [6,7,14,33]. For example, in the I-chi crosses (where the strong white plastid is donated by the father; see below), when more variegated seedlings are found in the F1, it indicates that more paternal (white) chloroplasts were able to out-compete the dominating maternal green chloroplasts. Hence, in this crossing direction small biparental percentage values indicate strong (assertive) plastomes and high biparental values indicate weak variants. The situation is reversed in the reciprocal cross where the white chloroplast is donated by the mother, as is the case in the IV-delta crosses. Here, the weak white chloroplast is maternal and strong green variants contributed by the pollen give high fractions of variegated plants in the F1, whereas low percentages of biparental progeny result when weak green variants are carried by the pollen donor.
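The scoring logic described above can be illustrated with a minimal sketch. It assumes that the percentage of biparental inheritance is the share of variegated seedlings among all rated F1 seedlings; the function name and the counts are hypothetical and are not taken from the study.

```python
def biparental_percentage(green, variegated, white):
    """Share of variegated (heteroplasmic) seedlings among all rated F1 seedlings, in percent."""
    total = green + variegated + white
    if total == 0:
        raise ValueError("no seedlings counted")
    return 100.0 * variegated / total


# Hypothetical counts for two crosses with the white I-chi plastid as pollen donor.
# In this crossing direction a LOW value points to a strong maternal green plastome.
strong_green = biparental_percentage(green=190, variegated=10, white=0)
weak_green = biparental_percentage(green=120, variegated=80, white=0)
print(strong_green, weak_green)
```

In the reciprocal (IV-delta) direction the same number is read the other way round: a high percentage then points to a strong paternal green plastome.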

Crossing studies
All crossing studies between chloroplast genomes were performed in the constant nuclear background of the highly homozygous johansen Standard strain (see above). Germination efficiency in all populations was 100% (see above). Transmission efficiencies of the green plastome I variants (V1a, V1b, V2a, etc.) were determined using the white chloroplast I-chi (strong inheritance strength) and IV-delta (weak inheritance strength) as crossing partners, respectively. This allows the determination of the inheritance strength of a given green chloroplast relative to a white one based on counting the number of green, variegated (green/white), or white seedlings in the F1 [6,14,33] (also see above). In the I-chi crosses, the green plastome I variants, as well as the wild type chloroplast genomes I-johSt (strong inheritance strength; native in the genetic background of johansen Standard and the original wild type chloroplast genome used for mutagenesis), II-suavG (intermediate inheritance strength) and IV-atroSt (weak inheritance strength) were crossed as female parent to I-chi in three consecutive seasons (2013, 2014, and 2015). In the IV-delta crosses green variants and I-johSt were crossed as male parent to IV-delta, again in three independent seasons (2013, 2014, and 2015).
From each cross of each season randomized populations of 100-300 plants were grown twice independently, followed by rating of green, variegated and white seedlings/plantlets 14-21 days after sowing (DAS; I-chi crosses) or 7-14 DAS (IV-delta crosses). Based on this counting, percentage of variegated progeny was calculated for each individual cross. To determine statistically significant differences between the transmission efficiencies of the plastome I variants and I-johSt, counting results from all three seasons were summed for a particular cross and a Fisher's exact test was employed.
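The comparison of a variant with I-johSt can be sketched as a two-sided Fisher's exact test on a 2 x 2 table of summed counts. The sketch assumes the table is laid out as variegated versus non-variegated seedlings for the two lines; the function name and the counts are hypothetical, and the software actually used for the test is not stated in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    The p-value is the sum of the probabilities of all tables with the same
    margins that are at most as probable as the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts summed over three seasons:
# rows = variant line, I-johSt; columns = variegated, not variegated.
p_value = fisher_exact_two_sided(120, 480, 30, 570)
print(p_value)
```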
A very similar experiment was performed to determine the inheritance strength of the two green variants III-V1 and III-V2, which derive from the strong chloroplast genomes III-lamS. Here in two independent seasons (2015 and 2016) the wild type I-johSt was used as pollen donor to induce variegation between the maternal plastome III (giving rise to a virescent phenotype) and the green plastome I (native in the background of johansen Standard; see above).
To determine transmission efficiencies independent of white chloroplast mutants or other bleached material, the plastome I variants (including their wild type I-johSt) were crossed to the green wild type plastomes IV-atroSt (weak inheritance strength) as female parent and to I-hookdV (strong inheritance strength) as male one in two independent seasons (2013 and 2014). F1 progeny was harvested 6 DAS by pooling 60-80 randomized seedlings and the ratios of the plastome types in the pool were analysed via MassARRAY® (Agena Bioscience, Hamburg, Germany) as described below.
MassARRAY®: multiplexed genotyping analysis using iPlex Gold
SNP genotyping to distinguish plastome I-johSt and I-hookdV/I-chi or I-johSt and IV-atroSt/IV-delta and subsequent quantification of their plastome ratios in appropriate F1s was carried out with the MassARRAY® system (Agena Bioscience, Hamburg, Germany). The system was used to analyse chloroplast transmission efficiencies in different crosses. For this, total DNA was prepared from 60-80 randomized pooled plantlets 6 DAS. Then, 10 SNPs distinguishing the plastomes I-johSt and I-hookdV/I-chi (I/I assay) and 15 SNPs between I-johSt and IV-atroSt/IV-delta (I/IV assay) were selected.
Two appropriate primers flanking the SNP and one unextended primer (UEP; binding an adjacent sequence to the SNP) were designed using MassARRAY® Assay Design v4.0 (Agena Bioscience, Hamburg, Germany). Primer sequences, SNPs and their position in I-johSt are listed in Table S10.
Plastome regions were amplified in a 5 µl PCR reaction containing PCR buffer (2 mM MgCl2, 500 µM dNTP mix, 1 U HotStartTaq; Agena Bioscience, Hamburg, Germany), 10 ng DNA and 10 (I/I assay) or 15 (I/IV assay) PCR primer pairs, respectively, at concentrations ranging from 0.5-2.0 µM. The reaction mix was incubated for 2 min at 95°C in 96-well plates, followed by 45 cycles of 30 sec at 95°C, 30 sec at 56°C and 60 sec at 72°C, and a final elongation for 5 min at 72°C. Excess nucleotides were removed by adding 0.5 U Shrimp alkaline phosphatase (SAP enzyme) and SAP buffer (Agena Bioscience, Hamburg, Germany), followed by an incubation for 40 min at 37°C and 5 min at 85°C. For the primer extension reaction the iPLEX reaction mixture (containing Buffer Plus, Thermo Sequenase and termination mix 96; Agena Bioscience, Hamburg, Germany) and, depending on the primer, the extension primers at a concentration of 7-28 µM were added. Sequence-specific hybridization and sequence-dependent termination were carried out for 30 sec at 94°C, followed by 40 cycles of 5 sec at 94°C plus five internal cycles of 5 sec at 52°C and 5 sec at 80°C, and finally 3 min at 72°C. After desalting with CLEAN resin (Agena Bioscience, Hamburg, Germany) the samples were spotted on 96-pad silicon chips preloaded with proprietary matrix (SpectroCHIP; Agena Bioscience, Hamburg, Germany) by using the Nanodispenser RS1000 (Agena Bioscience, Hamburg, Germany). Subsequently, data were acquired with the MALDI-TOF mass spectrometer MassARRAY® Analyzer 4 (Agena Bioscience, Hamburg, Germany) and analyzed with the supplied software. To identify significant differences in the frequencies of paternal ptDNA, Kruskal-Wallis one-way analysis of variance (ANOVA) on ranks was performed.
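For readers unfamiliar with the rank-based test named above, the following is a minimal sketch of the Kruskal-Wallis statistic with tie correction and its chi-square approximation. The function names and the example frequencies are hypothetical; the text does not say which software was used for this test.

```python
from math import exp, lgamma, log


def chi2_sf(x, df):
    """Upper tail of the chi-square distribution (series for the lower incomplete gamma)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * exp(-z + a * log(z) - lgamma(a))
    return max(0.0, 1.0 - lower)


def kruskal_wallis(groups):
    """Return (H, p) for a list of samples, using average ranks for ties."""
    values = [v for g in groups for v in g]
    n_total = len(values)
    order = sorted(range(n_total), key=lambda i: values[i])
    ranks, tie_term, i = [0.0] * n_total, 0, 0
    while i < n_total:
        j = i
        while j + 1 < n_total and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank of the tie block
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)
    h /= 1 - tie_term / (n_total ** 3 - n_total)
    return h, chi2_sf(h, len(groups) - 1)


# Hypothetical paternal ptDNA frequencies (%) for I-johSt and two variants.
h_stat, p_value = kruskal_wallis([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
print(h_stat, p_value)
```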

k-means clustering to classify inheritance strength
For the wild type chloroplasts, inheritance strength was classified using the paternal transmission frequencies (percentage of variegated plants in F1, see above) of the chloroplasts "biennis white" and "blandina white" according to Schötz [6]. Both crossing series included the same 25 wild type chloroplasts, 14 of which had fully sequenced genomes and were employed for correlation mapping (Table S5 and below). For original data, see Schötz (1968) [6], summaries in Cleland (1972, p. 180) [12], or Table S11. Based on the two transmission frequencies the tested wild type plastomes were clustered using the k-means algorithm with Euclidean distance as distance measure. The optimal number of centers was calculated with the pamk function of the fpc package, as implemented in R v.3.2.1 [42]. Strikingly, essentially the same three classes (strong, intermediate, and weak) were obtained that had been previously determined by Schötz [7,12]. For the variants, we used the transmission frequencies from I-chi and IV-delta crosses obtained from this work (Table S8 and Supplementary Text). Since the data-driven determination of the optimal number of clusters (k = 2; see above) does not reflect the biological situation, upon repeated k-means runs, we chose the number of centres with the best trade-off between lowest swapping rate of the samples between the clusters and the biological interpretability. This approach resulted in four classes (see Supplementary Text and Fig. S16 for details).
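The clustering step can be sketched with a plain Lloyd-type k-means on pairs of transmission frequencies. This is an illustration only: the study used R (including pamk from the fpc package to choose the number of centres), whereas the sketch below is written in Python, takes the number of clusters and the starting centres as given, and uses invented data points.

```python
from math import dist


def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm with Euclidean distance; returns (labels, centers)."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_labels = [
            min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical plastomes described by two transmission frequencies (%),
# one from each of two crossing series.
points = [(5, 10), (8, 12), (30, 35), (33, 40), (70, 80), (75, 78)]
labels, centers = kmeans(points, centers=[points[0], points[2], points[4]])
print(labels)
```

Because k-means depends on its starting centres, repeated runs with different starts, as described above for the variants, are a common way to judge how stable the class assignment is.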

DNA isolation
Total DNA for Illumina sequencing was isolated as described before [23]. For PCR analyses, quantitative real-time PCR and MassARRAY® experiments, total DNA was isolated with the DNeasy Plant Mini Kit (Qiagen, Hilden, Germany) [43].

DNA sequencing
Illumina sequencing was performed at the Max Planck-Genome-Centre in Cologne (Germany), as described previously [23,44]. Exclusively "PCR-free" paired-end libraries (375 bp insert size) generated from total DNA isolations were employed. 100 bp Illumina paired-end reads were generated to determine the sequences of the green plastome I variants, whereas 150 bp Illumina paired-end reads were used to assemble the wild type plastomes. Sanger sequences were obtained from Eurofins MWG Operon (Ebersberg, Germany).

Plastome assembly, sequence annotation and repeat analysis
Complete chloroplast genomes were assembled with SeqMan NGen v12.1.0 or v13.0.0 (DNASTAR, Madison, WI, USA). Wild type genomes were assembled de novo. The sequences of the green plastome I variants derive from reference-guided assemblies. For this, their original wild type chloroplast genome I-johSt of O. elata ssp. hookeri strain johansen Standard (AJ271079.4) was used as reference [23]. The notoriously repetitive regions in the Oenothera plastome, where individual repetitive elements can span more than an Illumina read length and/or the insert size of the employed paired-end libraries, namely upstream and/or within accD, ycf1 (tic214), ycf2, and the rrn16-trnI-GAU spacer (oriB), were determined and/or confirmed by Sanger sequencing in all chloroplast genomes discussed in this work.
Finalized sequences were annotated by GeSeq v0.9 [45] and the wild type chloroplast genomes were prepared for NCBI submission using GenBank 2 Sequin v1.3 [46]. Repeat structures were analysed based on an inspection by eye and/or employing the EMBOSS suite [47] as previously described [24].

Correlation mapping
For correlation mapping in the wild types, 14  In both sequence sets, divergence at a given alignment window was correlated to the experimentally determined inheritance strengths of a chloroplast genome. For the wild types, inheritance strength was measured using the paternal transmission frequencies (percentage of variegated plants in the F1) of the chloroplasts "biennis white" or "blandina white" according to Schötz [6], or k-means classes combining the two datasets by clustering (Table S11, Supplementary Text and see above). For the variants, we used the transmission frequencies from the I-chi and IV-delta crosses determined in this work, as well as the obtained k-means classes thereof (Table S8, Supplementary Text and above).
For correlation of these transmission frequencies to loci on the chloroplast genome, the redundant inverted repeat A (IRA) was removed from all sequences. Then, plastomes were aligned with ClustalW [48] and the alignments were curated manually (Data S2). Subsequently, using a script in R v3.2.1 [42] (Data S3), nucleotide changes (SNPs, insertions and deletions) relative to a chosen reference plastome [I-hookdV (KT881170.1) for the wild type set and I-johSt (AJ271079.4) for the variant set] were counted window-wise by two approaches: (i) segmenting the reference sequence in overlapping windows using a sliding window approach with a window size of 1 kb and a step size of 10 bp, yielding a matrix of 13,912 x 13 (wild type set) and 13,668 x 18 (variants), respectively, or (ii) defining regions of interest with correspondingly chosen window sizes. Then, Pearson's and Spearman's correlation coefficients were calculated between (i) the total count of nucleotide changes for every plastome in the aligned sequence window compared to the reference (total sequence divergence) and (ii) the determined inheritance strength of the plastomes (Fig. S17, Data S1). For the sliding window approach, p-values were adjusted for multiple testing using
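The window-wise counting and correlation can be sketched as follows. This is not the authors' R script (Data S3); it is a Python illustration that assumes every differing alignment column (mismatch or gap) counts as one change, uses a toy alignment with a window size of 4 and a step size of 4 instead of 1 kb and 10 bp, and computes only Pearson's coefficient. All names and sequences are hypothetical.

```python
from math import isnan, sqrt


def window_divergence(ref, seq, size, step):
    """Count differing alignment columns between seq and ref in each sliding window."""
    diffs = [int(a != b) for a, b in zip(ref, seq)]
    return [sum(diffs[s:s + size]) for s in range(0, len(ref) - size + 1, step)]


def pearson(x, y):
    """Pearson correlation; NaN when one variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


# Toy alignment: reference plus four plastomes with hypothetical inheritance strengths.
reference = "ACGTACGTACGT"
plastomes = ["ACGTACGTACGT", "ACGTAC-TACGT", "ACGTA--TACGT", "ACGTA---ACGT"]
strength = [1, 2, 3, 4]

per_plastome = [window_divergence(reference, s, size=4, step=4) for s in plastomes]
n_windows = len(per_plastome[0])
correlations = [
    pearson([row[w] for row in per_plastome], strength) for w in range(n_windows)
]
print(correlations)
```

Windows without any variation yield no defined correlation (NaN here), which is why only variable windows can show an association with inheritance strength.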

Chloroplast isolation
For isolation of chloroplasts, Oenothera leaf tissue was harvested 7-8 weeks after sowing and processed as described previously [23]. However, minor modifications were applied to allow a rapid isolation of chloroplasts from six plant lines in parallel: 35 g of leaf material was homogenized in 500 ml BoutHomX buffer. The pellet from the first centrifugation step was re-suspended in 100 ml ChloroWash and, after one filtration, the volume was adjusted to 150 ml before the second filtration.
After the second centrifugation step re-suspended chloroplasts were loaded on two Percoll step gradients (each: 7 ml 85% Percoll, 14 ml 45% Percoll). Subsequent to gradient centrifugation the recovered chloroplasts were washed with 30 ml ChloroWash, followed by three additional washing steps with smaller volumes and a final re-suspension of the chloroplasts in 300-500 µl ChloroWash for the following ACCase activity measurements.

ACCase activity assay
ACCase activity was measured in isolated chloroplasts [51,52]. Suspensions of isolated chloroplasts were diluted to 400 µg chlorophyll/ml, determined as described above. To validate equilibration to chlorophyll, protein concentration using a Bradford assay (Quick Start™ Bradford 1x Dye Reagent; Bio-Rad, Hercules, CA, USA; with BSA solutions of known concentrations as standards) and chloroplast counts per ml suspension were determined for the same samples. For chloroplast counting, the suspension obtained above was further diluted 1:10, with 15 µl subsequently loaded on a CELLOMETER™ Disposable Cell Counting Chamber (Electron Microscopy Sciences, Hatfield, PA, USA) and analysed under a Zeiss Axioskop 2 (Zeiss, Oberkochen, Germany). For each sample six "B squares" were counted and chloroplast concentration was calculated as chloroplasts/ml = 10 × (average count per "B square") / (4 × 10⁻⁶). All three equilibration methods gave comparable results. After addition of 3 ml scintillation cocktail (Rotiszint® eco plus, Carl Roth, Karlsruhe, Germany), the acid-stable radioactivity from incorporation of H¹⁴CO₃⁻ (¹⁴C dpm) was detected with a liquid scintillation counter (LS6500, Beckman Coulter, Brea, CA). ACCase activity is represented as the ¹⁴C incorporation rate into the acid-stable fraction (dpm min⁻¹), calculated by dividing the total fixed radioactivity by 20 min.
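The two calculations in this paragraph can be written out as a short sketch. It assumes that the factor 10 is the 1:10 dilution and that 4 × 10⁻⁶ is the volume of one "B square" in ml, which is how the formula reads but is not stated explicitly; function names and example numbers are hypothetical.

```python
def chloroplasts_per_ml(b_square_counts, dilution=10, square_volume_ml=4e-6):
    """Chloroplast concentration from counts in several 'B squares'.

    Assumption: 'dilution' is the 1:10 dilution factor and
    'square_volume_ml' is the chamber volume above one B square.
    """
    average = sum(b_square_counts) / len(b_square_counts)
    return dilution * average / square_volume_ml


def incorporation_rate(total_dpm, minutes=20):
    """14C incorporation rate into the acid-stable fraction in dpm per minute."""
    return total_dpm / minutes


# Hypothetical counts from six B squares and a hypothetical scintillation reading.
concentration = chloroplasts_per_ml([38, 41, 40, 42, 39, 40])
rate = incorporation_rate(6000)
print(concentration, rate)
```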
The rates in three replicated reactions were averaged, corresponding values from negative control samples were subtracted, and the result was normalized by the number of chloroplasts to obtain the ACCase activity of individual samples. The average rates were calculated for each line. To combine all measurements, relative ACCase activities were calculated for each experiment as [fold I-johSt], and significant differences between each line and the wild type were identified using a two-tailed paired t-test, followed by p-value adjustment using the Benjamini-Hochberg procedure.
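The normalization and the multiple-testing correction can be sketched as below. The paired t-test itself is omitted; the sketch starts from given p-values. Function names, rates and p-values are hypothetical, and the Benjamini-Hochberg routine is a generic implementation rather than the one used by the authors.

```python
def normalized_activity(rates, negative_rates, n_chloroplasts):
    """Mean rate minus mean negative-control rate, per chloroplast."""
    mean_rate = sum(rates) / len(rates)
    mean_negative = sum(negative_rates) / len(negative_rates)
    return (mean_rate - mean_negative) / n_chloroplasts


def fold_of_wild_type(line_activity, wild_type_activity):
    """Relative activity expressed as fold of the wild type measured in the same experiment."""
    return line_activity / wild_type_activity


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical rates (dpm/min) for one variant and the wild type in one experiment.
variant = normalized_activity([310, 300, 290], [10, 10, 10], n_chloroplasts=1e6)
wild_type = normalized_activity([160, 150, 155], [10, 10, 10], n_chloroplasts=1e6)
fold = fold_of_wild_type(variant, wild_type)

# Hypothetical raw p-values from paired t-tests of four lines against the wild type.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(fold, adjusted)
```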

Lipid extraction, mass spectrometry sample preparation and measurements
Metabolites were extracted according to published protocols [53] from 50 mg Oenothera seedlings harvested 6 DAS. In brief, frozen tissue was homogenized by a ball mixer mill and transferred to cooled 2.0 ml round bottom microcentrifuge tubes. Subsequently, each sample was re-suspended in 1.0 ml of a -20°C methanol:methyl-tert butyl-ether [1:3 (v/v)] mixture, containing 0.5 μg of 1,2-diheptadecanoyl-sn-glycero-3-phosphocholine (Avanti Polar Lipids, Alabaster, AL, USA) as an internal standard. Samples were then immediately vortexed before incubation for 10 min at 4°C on an orbital shaker. This step was followed by ultra-sonication in an ice-cooled bath-type sonicator for an additional 10 min. To separate the organic from the aqueous phase, 650 μl of a H2O:methanol mix [3:1 (v/v)] was added to the homogenate, which was briefly vortexed before being centrifuged for 5 min at 14,000 g. Finally, 500 μl of the upper methyl-tert butyl-ether phase, containing the hydrophobic (lipid) compounds, was placed in a fresh 1.5 ml microcentrifuge tube. This aliquot was either stored at -20°C for up to several weeks or immediately concentrated to complete dryness in a speed vacuum concentrator at room temperature. Prior to analysis the dried pellets were re-suspended in 400 μl acetonitrile:isopropanol [7:3 (v:v)], ultra-sonicated and centrifuged for 5 min at 14,000 g. The cleared supernatant was transferred to fresh glass vials and 2 μl of each sample was injected onto a C8 reverse phase column (100 mm x 2.1 mm x 1.7 μm particles) using an Acquity UPLC system (Waters, Manchester, UK). In addition to the individual samples, we prepared pooled samples, in which 10 µl of each sample from the whole sample collection was mixed. These pooled samples were measured after every 20th sample, to provide information on system performance including sensitivity, retention time consistency, sample reproducibility and compound stability.
The mobile phase for our chromatographic separation consisted of Buffer A (1% 1 M NH4-acetate and 0.1% acetic acid in UPLC MS grade water), while Buffer B contained 1% 1 M NH4-acetate and 0.1% acetic acid in acetonitrile/isopropanol [7:3 (v:v)] (BioSolve, Valkenswaard, Netherlands). The flow rate of the UPLC system was set to 400 μl/min with a Buffer A/Buffer B gradient of 1 min isocratic flow at 45% Buffer A (55% Buffer B), 3 min linear gradient from 45% to 25% Buffer A (55% to 75% Buffer B), 8 min linear gradient from 25% to 11% Buffer A (75% to 89% Buffer B), and 3 min linear gradient from 11% to 1% Buffer A (89% to 99% Buffer B). After cleaning the column for 4.5 min at 1% Buffer A/99% Buffer B, the solution was set back to 45% Buffer A/55% Buffer B and the column was re-equilibrated for 4.5 min, resulting in a final run time of 24 min per sample.
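The gradient program can be written down as a small table, which also makes it easy to check that the segments add up to the stated 24 min. The sketch assumes that the return from 1% to 45% Buffer A is an instantaneous step, as "set back" suggests; the data structure and function names are illustrative only.

```python
# Each segment: (duration in min, % Buffer A at start, % Buffer A at end).
GRADIENT = [
    (1.0, 45.0, 45.0),   # isocratic
    (3.0, 45.0, 25.0),   # linear gradient
    (8.0, 25.0, 11.0),   # linear gradient
    (3.0, 11.0, 1.0),    # linear gradient
    (4.5, 1.0, 1.0),     # column cleaning
    (4.5, 45.0, 45.0),   # re-equilibration (assumed step change back to 45% A)
]


def total_run_time(gradient):
    """Sum of all segment durations in minutes."""
    return sum(duration for duration, _, _ in gradient)


def percent_a_at(t, gradient):
    """Percentage of Buffer A at time t (min), by linear interpolation within a segment."""
    if t < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, a_start, a_end in gradient:
        if t <= start + duration:
            return a_start + (a_end - a_start) * (t - start) / duration
        start += duration
    raise ValueError("time exceeds the run time")


print(total_run_time(GRADIENT))      # total minutes per sample
print(percent_a_at(2.5, GRADIENT))   # % Buffer A; Buffer B is 100 minus this value
```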
Mass spectra were acquired with an Orbitrap-type mass spectrometer (Exactive; Thermo-Fisher, Bremen, Germany) and recorded in the full scan mode, covering a mass range from 100-1,500 m/z. The resolution was set to 60,000 with 2 scans per second, restricting the maximum loading time to 100 ms. Samples were injected using the heated electrospray ionization source (HESI) at a capillary voltage of 3.5 kV in positive and negative ionization mode. A sheath gas flow value of 40 was used, with an auxiliary gas flow value of 20 and a capillary temperature of 200°C, while the drying gas temperature in the heated electrospray source was 350°C. The skimmer voltage was set to 20 V with a tube lens value of 140 V. The spectra were recorded from 0 to 20 min of the UPLC gradients.

Data processing and normalization of lipid data
Data analysis of the raw files (.raw) was performed using QI for metabolomics v2.3 (Nonlinear Dynamics, Newcastle upon Tyne, UK) according to the vendor description. Data were normalized to the internal standard (1,2-diheptadecanoyl-sn-glycero-3-phosphocholine) and the exact fresh weight of each sample. Lipid annotation was performed manually as described [54]. Statistical data analysis was performed using Excel 2013 (Microsoft, Redmond, WA, USA), R v3.2.1 [42] and SIMCA-P v13.0 (Umetrics, Umea, Sweden).

Predictability of inheritance strength based on lipid-level data as explanatory variables
Lipidomics data from Oenothera seedlings of the strain johansen Standard, harboring chloroplast genomes with different assertiveness rates, were analyzed jointly to test for predictability of inheritance strength based on lipid levels. For this, 33 probes representing 16 genotypes whose chloroplast genomes ranged from inheritance strength class 1 to 5 (see Supplementary Text) were measured in five replicates in three independent experimental series (Table S1; Supplementary Text).
In this dataset, a total of 184 different lipids/molecules could be annotated (Data S4; see above). The data from each series were then log-transformed and median-centered based on genotypes with inheritance strength = 1, i.e. for every lipid/molecule, its median level across all genotypes of inheritance strength 1 was determined and subtracted from the levels of all genotypes tested in the respective experimental series. Inheritance strength 1 thus served as a common reference across all three experimental series. Subsequently, the three experimental series were combined into a single set. Only those lipids/molecules for which level data were available across all three datasets were considered further, leaving 102 lipids/molecules for analysis (Data S4).
LASSO regression model: Inheritance strength was predicted based on the median-centered lipid level data using LASSO, a regularized linear regression approach [55], as implemented in the "glmnet" R software package (R v3.2.1) [42]. Glmnet was invoked with the parameter α set to 1 to perform LASSO regression (Data S4). The penalty parameter λ was determined by the built-in cross-validation applied to the training set data (i.e. all but two randomly selected genotypes), assuming a Gaussian response type, and was set to the one-standard-error estimate, i.e. the largest λ whose cross-validated error lies within one standard error of the optimal (minimal error) value. All other parameters were left at their default values.
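The two preprocessing steps described above, centering on the reference class and the one-standard-error choice of λ, can be sketched as follows. This Python code is an illustration only, not the R/glmnet code used for the analysis. The function names, the data layout (nested dictionaries), and the use of the natural logarithm are assumptions; the base of the log transformation is not specified here.

```python
import math
from statistics import median


def center_on_reference(levels, strengths, reference_strength=1):
    """Log-transform one experimental series and centre it on the reference class.

    levels:    {sample: {lipid: raw level > 0}} for a single series (hypothetical layout)
    strengths: {sample: inheritance strength class}
    Returns {sample: {lipid: log level minus median log level of reference samples}}.
    """
    logged = {s: {lip: math.log(v) for lip, v in lipids.items()}
              for s, lipids in levels.items()}
    ref_samples = [s for s in logged if strengths[s] == reference_strength]
    if not ref_samples:
        raise ValueError("series contains no reference genotype")
    # Keep only lipids measured in every sample of the series.
    shared = set.intersection(*(set(lipids) for lipids in logged.values()))
    ref_median = {lip: median(logged[s][lip] for s in ref_samples) for lip in shared}
    return {s: {lip: logged[s][lip] - ref_median[lip] for lip in shared}
            for s in logged}


def one_se_lambda(lambdas, cv_mean, cv_se):
    """Largest lambda whose mean CV error is within one SE of the minimal error."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    return max(lam for lam, err in zip(lambdas, cv_mean) if err <= threshold)


if __name__ == "__main__":
    levels = {"ref_a": {"PC": 2.0}, "ref_b": {"PC": 8.0}, "test": {"PC": 16.0}}
    strengths = {"ref_a": 1, "ref_b": 1, "test": 5}
    print(center_on_reference(levels, strengths))
    print(one_se_lambda([0.01, 0.1, 1.0], [1.0, 1.1, 2.0], [0.2, 0.2, 0.2]))
```

Choosing the largest admissible λ favours the sparsest model whose cross-validated error is statistically indistinguishable from the best one.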

Determination of photosynthetic parameters
Gas exchange measurements were performed with a GFS-3000 open gas exchange system equipped with the LED array unit 3055-FL as actinic light source for simultaneous chlorophyll a fluorescence measurements (Heinz Walz GmbH, Effeltrich, Germany). Light response curves of CO2 assimilation were measured at 22°C cuvette temperature with 17,500 ppm humidity and a saturating CO2 concentration of 2,000 ppm, to fully repress photorespiration. Plants were dark-adapted for a minimum of 30 min. Then, the maximum quantum efficiency of photosystem II in the dark-adapted state (FV/FM) and leaf respiration were determined. Afterwards, the actinic light intensity was first set to the growth light intensity of 200 µE m⁻² s⁻¹, followed by measurements at 500, 1,000, and finally 1,500 µE m⁻² s⁻¹. At each light intensity, gas exchange was recorded until the steady state of transpiration and leaf assimilation was reached. Maximum leaf assimilation was corrected for the respiration measured in darkness. After the end of the gas exchange measurements, the chlorophyll content and chlorophyll a/b ratio of the measured leaf section were determined in 80% (v/v) acetone according to [56]. Leaf absorptance was calculated from leaf transmittance and reflectance spectra as 100% minus transmittance (%) minus reflectance (%). Spectra were measured between 400 and 700 nm wavelength using an integrating sphere attached to a photometer (V650, Jasco Inc., Groß-Umstadt, Germany). The spectral bandwidth was set to 1 nm, and the scanning speed was 200 nm min⁻¹.
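The absorptance calculation stated above can be illustrated with a short Python sketch. It is not the original analysis code; the function names are hypothetical, and averaging the per-wavelength values across the measured range is an assumption made here for illustration.

```python
def leaf_absorptance(transmittance_pct, reflectance_pct):
    """Per-wavelength absorptance (%) = 100 - transmittance (%) - reflectance (%)."""
    if len(transmittance_pct) != len(reflectance_pct):
        raise ValueError("spectra must cover the same wavelengths")
    return [100.0 - t - r for t, r in zip(transmittance_pct, reflectance_pct)]


def mean_absorptance(transmittance_pct, reflectance_pct):
    """Unweighted mean absorptance across the spectrum (assumed summary statistic)."""
    values = leaf_absorptance(transmittance_pct, reflectance_pct)
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up transmittance and reflectance values at two wavelengths.
    print(leaf_absorptance([5.0, 10.0], [5.0, 10.0]))
    print(mean_absorptance([5.0, 10.0], [5.0, 10.0]))
```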

Fluorescence microscopy and differential interference contrast to analyse ptDNA nucleoids, chloroplast volume and number per cell
We investigated leaf material from the central laminal region of the first true leaf 25 DAS. For this, four pieces of 5 mm² were excised from five individual plants per line and fixed. DAPI (4',6-diamidino-2-phenylindole) staining of ptDNA nucleoids and fluorescence microscopy were conducted as previously described [57,58] with minor modifications: In brief, excised leaf fragments were fixed with 3% glutaraldehyde in 50 mM phosphate buffer (pH 7.2), washed in 1x PBS buffer (phosphate-buffered saline, 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, pH 7.2) and macerated in 1% [...] DAPI as fluorochrome is considered to be sensitive enough to detect DNA of a single plastid genome copy [59]. The preparations were sealed with Fixogum rubber cement (Marabu, Tamm, Germany) and examined with a Nikon Eclipse Ni-U upright epifluorescence microscope equipped with a cooled monochrome camera (Nikon, Chiyoda, Japan) under a 100x UV objective. For each investigated cell, five to seven picture frames were digitally captured, each at a different focal plane. The frames were [...] (Table S12). Significance of differences between all lines was tested by one-way ANOVA. In addition, to test differences between the strong plastome I-johSt and the weak plastomes V3g, VC1, and IV-atroSt, respectively, a two-tailed homoscedastic t-test was calculated, followed by p-value adjustment according to Benjamini-Hochberg.
For differential interference contrast (DIC) microscopy, explants excised as described above were transferred to 10% formalin in phosphate buffer (Tissue-Prep Buffered 10% Formalin; Electron Microscopy Sciences, Hatfield, PA, USA). Samples were then evaporated for at least 1 h and incubated at 4°C overnight. After washing with sterile water, leaf fragments were incubated under rotation in 0.1 M EDTA for 2 h at room temperature, followed by incubation at 4°C overnight. Directly before analysis, samples were incubated for 3 h at 60°C while shaking (500 rpm). Leaf pieces were mounted in water on a slide and cells were released by softly tapping on the top of the cover slip. Analysis was performed on a motorized epifluorescence microscope Olympus BX61 under a 40x objective [...] [60]. For statistical analysis of each experiment, one-way ANOVA was performed to compare all lines. In addition, differences between the weak plastome I variants or IV-atroSt and the strong wild type I-johSt were tested using a two-tailed homoscedastic t-test followed by adjustment of p-values according to Benjamini-Hochberg (Tables S13 and S14).
[...] Table S15. ptDNA copy numbers were quantified for the plastid genes rbcL, psbB and ndhI, which are topographically well separated on the plastid genome, and normalized to three nuclear loci M02, M19, and pgiC [43,61]. The nuclear loci are only present once in the nuclear genome of johansen Standard, as judged from coverage analysis of Illumina libraries. Additionally, three markers for mitochondrial DNA (mtM03, mtM04, mtM06) [62] were included in the calculation. Data were analysed with the LightCycler® 480 software v1.5.0 SP4 (Roche Diagnostics GmbH, Mannheim, Germany) employing the "Advanced Relative Quantification" method that incorporates primer efficiencies.
Target/Reference values calculated by the software were used to determine the proportion of ptDNA per total DNA (including ptDNA, mtDNA and nuclear genome) by employing the approximate size of the nuclear (1C about 1 Gb) [44], mitochondrial (about 400 kb) [62,63] and chloroplast (about 160 kb) [24] [...] [bp]))/(value 5 DAS). Significance of the differences between I-johSt and IV-atroSt/plastome I variants for each developmental stage was calculated with a one-sample t-test followed by multiple-testing correction of p-values according to Benjamini-Hochberg.
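The Benjamini-Hochberg adjustment applied to the t-test p-values throughout these analyses can be sketched in a few lines of Python. This is a generic illustration of the step-up procedure, not the code used in this work; the function name is hypothetical and the example p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and
    monotonicity is enforced by taking the running minimum from the largest
    rank downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values from pairwise comparisons with the wild type.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```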

Detection of RNA via RNA gel blot analyses
Total RNA was isolated as described previously [23].

Generation of green variants with altered inheritance strength
For functional validation of loci predicted by correlation mapping in the wild types (see Main Text and below), mutagenesis of the strong chloroplast genome I-johSt was conducted using the Oenothera plastome mutator (see Materials and Methods for details). Inheritance strengths of the obtained green variants were determined in crosses to the white chloroplast mutants I-chi or IV-delta as pollen or seed parent, respectively (see Materials and Methods for details; Fig. 2, Table S8, and Main Text).
The progeny of three seasons were analysed for the two crossing series. From these experiments it appeared that the lines VC1 and V3g (together with the weak wild type IV-atroSt) have very low assertiveness rates in the F1. As already judged by eye, they form a distinct class from all other variants in both crossing directions (Fig. 2; Table S8). For the reciprocal cross, no significant differences to the strong wild type I-johSt were found for the variants V1c, V2f, and V3e. Interestingly, these variants had the same transmission efficiency as the wild type, although they underwent mutagenesis and carry a mutational load. This makes them particularly valuable material to identify plastome mutator-induced mutations that do not affect chloroplast inheritance. All other variant plastids showed a significantly decreased competitive ability from at least one parent when compared to the wild type chloroplast genome I-johSt. In general, the plastome I variants cannot be [...]
The results of the classical experimental set-up, in which bleached chloroplast mutants are used, could be confirmed in a MassARRAY® approach employed for the crosses of the plastome I variants with the strong wild type plastome I-hookdV as male parent or the weak wild type IV-atroSt as female parent (see Materials and Methods for details). Due to the detection threshold of the method (5-10%), when I-hookdV was transmitted through the pollen, most variants showed the same or a slightly decreased transmission efficiency compared to their wild type I-johSt, with the progeny having increased amounts of paternal plastid DNA (ptDNA). However, only for VC1 and V3g is the difference in the ratio of paternal to maternal ptDNA in the pool large enough to result in the detection of a significantly lower assertiveness rate (Fig. 2B). The lines behave similarly in the other crossing direction, where most variants seem to be of wild type competitive ability.
Again, VC1 and V3g can clearly be confirmed as weak lines, while V3c and V3f (which appear as weak to intermediate when contributed by the female) show a higher transmission efficiency than I-johSt. This is the same reciprocal difference that is also observed in the classical experimental set-up (Fig. 2). Altogether, especially due to the detection limit, the classical approach using bleached chloroplast mutants gives more reliable results and allows a much finer discrimination of transmission efficiencies. Moreover, there is no qualitative difference in the assertiveness rates between the green wild types I-hook/IV-atroSt and their corresponding bleached mutants I-chi/IV-delta. This is in agreement with the classical literature [7] and is investigated in more detail below.

The green plastome I variants do not display impaired growth, altered chloroplast morphology, or a photosynthetic phenotype
To rule out the possibility that the observed differences in chloroplast inheritance strength result from secondary effects in the green variants, we performed several controls: First, we monitored growth behaviour of plants with the green chloroplasts of different inheritance strength in the common nuclear background of johansen Standard. Second, to assess the physiological status of the material, we measured photosynthesis parameters. Third and last, we performed detailed microscopy to investigate chloroplast size, number per cell and morphology.

No growth, germination or macroscopic phenotypes are present in the plastome I variants
To ensure that the green variants are not impaired in development, cultures of johansen Standard plants harbouring various variant chloroplasts were compared side-by-side to plants with their strong wild type chloroplast genome I-johSt and the weak one IV-atroSt. It appeared that seeds from all plant lines germinated at 100% within 3 days after sowing (DAS). After transfer to soil, no differences in growth were observed during whole plant development under standard greenhouse conditions. Also no macroscopic phenotype such as altered leaf coloration was observed (Fig. S3).

Photosynthetic parameters are unaltered in the plastome I variants
To gain insights into the physiological status of our materials, we determined several photosynthetic parameters and plotted them against competitive ability (Fig. S4). From these analyses it became clear that differences in photosynthetic capability, if present at all, cannot be interpreted as a function of inheritance strength: We could not detect significant differences between plants, or any dependency on inheritance strength, for chlorophyll content per leaf area or for the chlorophyll a/b ratio. The latter reflects the ratio of the photosynthetic reaction centres (exclusively binding chlorophyll a) to the antenna proteins (which bind both chlorophyll a and b). Also FV/FM, the maximum quantum efficiency of photosystem II (PSII) in the dark-adapted state, did not show any changes with inheritance strength. All measured values were above 0.8, indicating that PSII was intact and that its antenna proteins were efficiently coupled to the reaction centre. There was a minor tendency towards a decrease of leaf respiration in darkness with higher assertiveness rates. However, neither for leaf assimilation rates measured at the growth light intensity of 200 µE m⁻² s⁻¹, nor for assimilation capacity measured under light-saturated conditions, were changes dependent on competitive ability observed.
Similarly, for other photosynthetic parameters tested, including leaf absorptance, the chlorophyll a fluorescence parameters qN (non-photochemical quenching, a measure for the thermal dissipation of excess excitation energy in the antenna bed of PSII) and qL (a measure for the redox state of the PSII acceptor side), no clear differences dependent on competitive ability were found.

Chloroplast sizes, numbers or volumes per cell are unchanged in the plastome I variants
To test if differences in competitive ability are a side effect of a putative chloroplast division phenotype, our strong wild type I-johSt was compared to three lines with weak transmission efficiency. For chloroplast number per cell, one-way analysis of variance (ANOVA) yielded p = 0.93 among all four lines (Fig. S5). Adjusted p-values obtained by t-test and multiple testing correction in the comparison of each single line with I-johSt again did not uncover significant differences (Table S13). Very similar results were obtained for the chloroplast volume, for which one-way ANOVA gave p = 0.51 in the comparison of all four lines. Comparing I-johSt with the weak plastomes also did not uncover significant differences, as judged from multiple t-testing (Table S14).
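For readers unfamiliar with the test, the F statistic underlying a one-way ANOVA across several lines can be computed as in the following Python sketch. This is a generic textbook illustration with made-up numbers, not the analysis code used here; the function name is hypothetical. Converting F into a p-value additionally requires the F distribution, which is available in statistics packages and is not reimplemented below.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical chloroplast counts per cell for three lines.
    print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```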

Correlation mapping
As described above, the green variants do not display any phenotype other than an altered inheritance of the chloroplast in crosses. Together with the wild type chloroplasts of different inheritance strengths, this makes them valuable material to pinpoint molecular loci for chloroplast transmission encoded on the plastome. In contrast to algae or fungi, however, organelle genomes of higher plants or animals are not amenable to linkage mapping [1]. Consequently, in these materials, identification of functionally relevant loci can only be based on correlation of a polymorphism within a given sequence interval to a phenotype in a mapping panel. To the best of our knowledge, this has been done only manually so far [64,65], which somewhat limits these analyses to a manageable number of organelle sequences, as well as to simple phenotypes, such as the presence or absence of sterility [66]. We therefore developed a novel mapping approach that fills this methodological gap. Conceivably, this approach could be applied to map loci conferring cytoplasmic male sterility [66], mitochondrial diseases [67], cytonuclear incompatibility or to analyse adaptive cytoplasms [21,68,69].
The method is based on Spearman's rank and/or Pearson's correlation (Materials and Methods), with the latter capturing linear dependencies more directly. Since (i) presence or absence of linear dependencies in our data structure is a matter of speculation, and (ii) as a rank-based correlation metric, Spearman correlation yields more statistically robust results that are less influenced by outliers, we have used both approaches. For this, we calculated sequence divergence (total count of nucleotide changes, i.e. SNPs, insertions and deletions) with respect to a reference sequence for every sequence in an alignment at a given alignment window. The value thus obtained is then correlated with a phenotype. In our case, this is a class of inheritance strength or a percentage value expressing transmission efficiency of a given chloroplast genome (see above and Materials and Methods for details). For example, if the reference sequence represents a strong chloroplast genome and, relative to it, certain weak plastomes contain polymorphisms in the same alignment window, this window is identified as highly relevant for inheritance strength (Fig. 1A; Fig. S17). Subsequently, individual polymorphisms or regions within this window are analysed separately (Fig. 1B). Since full organelle genomes are analysed, more than one relevant site can be identified. However, as for any other association mapping approach, the presence of two or more genetically independent loci that confer the same phenotype can complicate the conclusion. Perfect correlation coefficients of 1 or -1 might not be achievable at a single site. On the other hand, in a non-recombining system, such as an organelle genome, even absolute correlation at a single site may be due to genetic hitchhiking via linkage disequilibrium, and not necessarily due to functional relevance. Hence, experimental verification of predicted loci by independent methods is necessary.
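The principle of the approach can be illustrated with a minimal Python sketch. It is a simplified re-implementation for illustration, not the code used in this work. All names are hypothetical, and the toy alignment and phenotype values are invented. As a simplifying assumption, divergence is counted per differing alignment column, whereas the actual analysis counts nucleotide changes including insertions and deletions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r; returns None if either variable shows no variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with ties receiving the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho computed as Pearson's r on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def window_divergence(seq, ref, start, size):
    """Number of alignment columns in the window that differ from the reference."""
    return sum(1 for a, b in zip(seq[start:start + size], ref[start:start + size]) if a != b)


def correlation_map(alignment, reference_name, phenotype, window, step):
    """Correlate per-window divergence from the reference with a phenotype.

    Returns a list of (window start, Pearson's r, Spearman's rho).
    """
    ref = alignment[reference_name]
    names = sorted(alignment)
    y = [phenotype[name] for name in names]
    results = []
    for start in range(0, len(ref) - window + 1, step):
        x = [window_divergence(alignment[name], ref, start, window) for name in names]
        results.append((start, pearson(x, y), spearman(x, y)))
    return results


if __name__ == "__main__":
    toy_alignment = {"strong": "AAAAAAAA", "mid": "AAAAAATA", "weak": "AAAAAATT"}
    toy_phenotype = {"strong": 1, "mid": 3, "weak": 5}  # invented class values
    for row in correlation_map(toy_alignment, "strong", toy_phenotype, window=4, step=4):
        print(row)
```

In a genome-wide scan, each window yields one test, so the resulting p-values need to be adjusted for multiple testing before windows are called significant.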

Division of wild type and mutant plastomes into classes of inheritance strength
The datasets that measure inheritance strength of wild type chloroplasts or of the green variants were either deduced from the literature or produced in this work. They represent percentage values of heteroplasmic seedlings in an F1 generation that reflect inheritance strength of a given chloroplast genome (Tables S8 and S11; see above). The numbers can be directly applied to Spearman's/Pearson's correlation. If datasets of more than one crossing series are to be combined, clustering of the crossing data into classes is necessary.
For the wild type plastomes, we used the original data of Franz Schötz, where two sets of crosses "biennis white" and "blandina white" are available [6,33] (Table S11; Materials and Methods); inheritance strength of 25 wild type chloroplasts was determined using these previously described tester lines. Clustering of the two datasets with the k-means algorithm using the optimal number of centres (k = 3) confirmed the original classifications suggested by Schötz, with the exception of the I-bauriSt and II-corSt plastomes (Fig. S16A; for details see Materials and Methods). These plastomes were borderline genotypes in Schötz's classification system, and according to our data, they might be reassigned. Besides these minor discrepancies, clustering conclusively supports the presence of the three distinct classes of inheritance strength (strong, medium and weak) in the wild type plastomes of Oenothera, as previously described.
The clustering of the green variants is less clear. When data from the I-chi and IV-delta crossing experiments are combined, the pamk function identified k = 2 as the optimal number of clusters, clearly separating the weak from the stronger materials (Fig. S16B). However, finer clustering of the stronger variants leads to ambiguous class membership. This is likely due to the higher variation in the IV-delta crosses compared to the I-chi crosses (Fig. 2 and above). This seemingly weakens the combination of the two datasets.
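The idea of grouping plastomes into classes from two crossing series can be illustrated with a plain k-means (Lloyd's algorithm) sketch in Python. This is not the implementation used here, and it does not reproduce the pamk procedure for choosing k. The function names, the deterministic initialisation (evenly spaced points of the sorted data) and the example values are assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Cluster points into k groups; returns (labels, centres).

    points: list of tuples, e.g. (percentage in series 1, percentage in series 2).
    Initial centres are k evenly spaced points of the sorted data (assumption).
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    ordered = sorted(points)
    indices = [0] if k == 1 else [(i * (n - 1)) // (k - 1) for i in range(k)]
    centres = [tuple(float(v) for v in ordered[i]) for i in indices]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centres[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return labels, centres


if __name__ == "__main__":
    # Invented percentage values for seven plastomes in two crossing series.
    data = [(1, 2), (2, 1), (1, 1), (50, 52), (51, 50), (95, 96), (96, 94)]
    labels, centres = kmeans(data, k=3)
    print(labels)
    print(centres)
```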

Correlation mapping in the wild type plastomes
Pearson's correlation generally identified more windows than Spearman's, but both predict essentially the same regions relevant for inheritance strengths. Interestingly, there was no notable difference between the methods if either k-means classes or the "biennis/blandina white" crossing data were used for correlation (Fig. 1A, Fig. S6, and Data S1). Largely based on theoretical considerations (presence of three clearly ranked classes in the wild types and stronger experimental base if the "biennis white" and "blandina white" crossing experiments are combined; see above), we discuss here Spearman's rank correlation to k-means classes in more detail. According to the latter, sequence windows in the ycf1 and ycf2 genes (between alignment positions 99011-100000 and 134641-135640) show nearly absolute correlation to inheritance strengths (rho = -0.99, p < 0.0005; Data S1). In both genes, the correlation oscillates from rho = 0.86 to -0.99 (p < 0.0005), and the positive and negative correlation should be interpreted as equally important. Another nearly absolute correlation (rho = 0.98, p < 0.0005) was measured in alignment windows containing the promoter, 5'-UTR and 5'-end of accD (positions 63501-64760). A window further upstream containing the same features (positions 63391-64490) also correlates with rho = 0.96, p < 0.0005 (Fig. 1B). However, highly significant correlations of 0.96 were also found in intergenic regions of photosynthesis genes and/or tRNA genes, for example between ycf3 and psaA (encoding a photosystem I assembly factor and core subunit, respectively) [70]. In addition, significant correlations were measured from the spacers of the photosystem II and cytochrome b6f subunit genes psbE and petL, and in a sequence interval containing trnR-UCU and trnG-UCC. In contrast, no significant correlation was observed for oriA. For oriB, three sequence windows (partially) containing the oriB correlate with 0.90, 0.88 (both p < 0.0005) and 0.81 (p < 0.005).
If Pearson's correlation to k-means classes is applied to the wild type data, the described pattern can be reproduced but more windows with significant correlation are identified (Fig. S6 and see above). The highest observed Pearson correlation in the wild type dataset is r = 0.96 (p < 0.0005) in a sequence window again containing the promoter, 5'-UTR and the 5'-end of accD (Fig. 1B; Data S1).

Correlation mapping in the green variants
When correlation mapping results are compared between the wild type and the green variants, the most striking difference in the variants is the loss of significance after p-value adjustment for Spearman's but not for Pearson's correlation. With the latter, windows with significant correlations were obtained (cf. Fig. S6 vs. Fig. S7; Data S1). This is probably because the rank-based Spearman correlation is less influenced by the VC1 and V3g data points. These two single genotypes, however, form the weak and, therefore, most predictive class, whereas the other variants do not differ noticeably from their wild type progenitor (cf. Fig. 2 and Fig. S16B; also discussed above). This leads to relatively weak correlations to inheritance strengths, which appear to be an underestimation and a consequence of the multiple testing correction (> 13,000 tests). The weak correlations also contradict the observations, which clearly indicate that the plastomes of the green variants must contain mutated loci for inheritance strength. A similar argument applies to correlation of the k-means classes in the variants. As discussed above, definition of these classes is less clear than in the wild type, which weakens their predictive power. We therefore think that the Pearson correlation of the I-chi crosses (which yield a better resolution than the reciprocal IV-delta crosses; Fig. 2 and above) represents the best approach to identify the relevant loci that alter inheritance strength in this material. Notably, this approach yields the most significant correlations, but all approaches (including Spearman) identify the same regions in the plastome with the highest correlation values (Fig. S7; Data S1).
In the variants, Pearson's correlation of the I-chi crosses predicts a sequence window in the 5'-end of accD as significantly correlated to inheritance strengths (r = 0.78 and p < 0.005). The strongest correlation for this dataset is observed for the 5'-UTR of ycf2 (r = 0.91, p < 0.0005). In addition, a highly repetitive region in the coding region of the same gene also shows good correlation values (r = 0.71; p < 0.05; Fig. S7; Data S1). Two insertions/deletions (indels) in ycf1 are also found significant (r = 0.61; p < 0.05). They represent a single insertion and a deletion in the weak variant VC1, located relatively close to each other (see below). The functional relevance of the two mutations for inheritance strength can be questioned, however. Since the second weakest variant of the dataset displays a wild type ycf1 sequence (Fig. S10A), the above described mutations are likely a result of the higher mutation load present in VC1 (see Materials and Methods for details). For the same reason, a contribution to the phenotype by the oriB can be excluded (r = 0.41, p = 0.25) in the variants. Taken together, our results narrow down the regions identified in the wild types to the two genes accD and ycf2. All other mutated loci in the variants seem to be of minor importance.

Correlation analysis at selected loci
When correlation mapping is applied to selected loci within the above identified alignment windows, the general observation is that correlation values drop to some extent (cf. Data S1). This is probably best explained by looking at the highly correlating sequence intervals spanning the promoter, 5'-UTR and 5'-end of accD in the wild types (Fig. 1B). When analysed as functional units (promoter/5'-UTR region and protein N-terminus; Fig. S8), correlation of the individual segments (promoter/5'-UTR region: r = 0.80 or rho = 0.74; p < 0.005 for both; full N-terminus: r = 0.78 or rho = 0.60; p < 0.005 or p < 0.05), is much lower than for the original sequence intervals (r = 0.94 or rho = 0.96 and r = 0.96 or rho = 0.98 with p < 0.005 for all), which led to the identification of these regions. As discussed below, experimental evidence is available that promoter/5'-UTR and N-terminus interact to affect the inheritance phenotype, while the individual regions display weaker correlations.
In spite of these complications, to get an impression of how well certain coding or promoter/5'-UTR regions, as well as segments of oriB, correlate with inheritance strengths, we calculated correlation values for accD, ycf1, ycf2, and oriB for polymorphisms that are present in both wild type and the variants (Figs. S8-S10; Data S1). We also included two prominent sites in the ycf2 gene present in the wild type, for which we found no mutation in the variants. Please note that at three sites (AccD N-terminus, the AccD site 2 and the ycf2 promoter/5'-UTR region), in addition to Pearson's correlation, the Spearman's correlation analysis yields significant correlations in the variants. This is in contrast to the whole plastome approach described above, where p-value corrections due to multiple testing were applied.
The best correlating region in both sequence sets (wild type and variants) is site 2 of the AccD N-terminus (Fig. S8B). Its prediction is extremely robust in that significant Pearson's and Spearman's correlations were obtained for all crossing series and k-means classes (Fig. S8; Data S1). Less clear is the contribution of ycf2. In the wild types, the ycf2 site 1 and site 2, but not site 3, can be associated with inheritance strength, but in the green variants site 3 exerts the most influence on the competitive ability of chloroplasts (Fig. S9B and below).
In summary, the refined analyses at selected loci clearly confirm the contribution of accD to inheritance strengths and might have even identified the most important region. They also show that ycf2 may contribute to the phenotype. Without chloroplast transformation in the evening primrose, a technology currently not available, the influence of the individual sites remains speculative.

Repeat structure, sequence evolution and divergence of accD, ycf1, ycf2, and the oriB
The four genes or loci partially span rapidly evolving regions of the Oenothera plastome that are characterized by large repetitive regions. Those can be of up to 1 kb in size, as is the case for site 3 in the ycf2 gene. They are comprised mostly of tandem or direct repeats (and, less pronounced, of palindromic or inverted repeats) as described earlier [24]. Due to their repetitive nature these regions are very prone to replication slippage [23,71], and sequence divergence at these regions substantially contributes to the overall sequence variation of the Oenothera chloroplast DNA (Greiner et al. 2008, Fig. 3 therein) [24]. The presence of repeats also makes them a preferred target of the plastome mutator allele [16,39]. Sequence evolution is extremely fast at these repeats. In the case of the repetitive regions of accD and ycf1, phenotypically neutral spontaneous mutations were isolated repeatedly at very similar sites [23]. Moreover, the oriB (which is essentially located in the rrn16-trnI-GAU spacer) is used as a hypervariable marker allele that allows discrimination among a huge variety of Oenothera strains [43].
The repeat structure of the oriB region was analysed earlier and is comprised of 7 direct repeat classes that can be divided into various subtypes [16,17] (and below). In the accD gene, mostly tandem or direct repeats span the promoter/5'-UTR and N-terminal region (Fig. 1); all three are considered to contribute to the regulation of the gene [15,26-29]. In fact, sequence variation induced by these repeats is so high that upstream of the accD start codon a window of about 1.4 kb cannot be aligned between the weak plastome IV and the stronger plastomes I-III (Data S2). This is to some extent also observed for site 2 in the N-terminus of the wild type AccD. In plastome IV, major portions of this site are missing and about half of the remaining sequence is polymorphic (Fig. S1A). In the ycf2 coding sequence, the most prominent repeats are present in sites 2 and 3. At the first site, the number of PEKRKEKK tandem repeats can be correlated with inheritance strengths in the wild types, but not in the variants. The situation is reversed for site 3, in which tandem repeats exist as two subtypes, 5'-GAGGAAGtAGAAGGGACAGAA-3' and 5'-GAGGAAGgAGAAGGGACAGAA-3', associated with a GAT linker, and correlate with inheritance strengths in the variants, but not in the wild types (Fig. S2).

Variation at the chloroplast origins of DNA replication is not causative for chloroplast competition
As elaborated above, our correlation mapping already points to a connection of lipid biosynthesis and chloroplast competition. However, one might still argue that a priori differences in the origins of replication are the simplest mechanistic explanation for organelle competition. At least some evidence supporting this claim is available for yeast and Drosophila [9,72,73]. The location and repetitive nature of the oriB in evening primroses (see above) are reminiscent of the non-coding displacement loop (D-loop) of metazoan mitochondrial DNA (mtDNA). In many animal taxa the D-loop is the most variable sequence of mtDNA and is in the proximity of tRNA or rRNA genes [9,74,75].

Sequence variation in the oriB cannot explain differences in inheritance strength
Previous work in the evening primrose did not support an involvement of the origins of replication in chloroplast competition. First, the number of D-loop initiation sites (i.e. oris) does not differ between weak and strong plastomes and their locations in the chloroplast genome are identical [18]. Second, in a previous association mapping study that investigated the hypervariable repeat region of oriB, a short repeat series was identified as the sole determinant that could explain the difference between the strong and intermediate plastomes I, III and II on one side, and the weak plastome IV on the other side [17]. The sequence (5'-ACGACACGACGATTAGATTAGCTCATTGGTAGGACGACGATTAGCTCATTGGTAGGACGACG-3') is 62 bp in size and is capable of forming weak hairpins. Our study, analysing a greater number of plastome sequences, confirms the absence of this sequence in the weak plastome IV. However, in none of the green plastome I variants with altered inheritance strength is the sequence partially or fully deleted (Data S2). We therefore do not think that a genetic determinant within the oriB of Oenothera is able to explain the observed huge differences in competitive behaviour.
To substantiate this view, also on the level of DNA, we investigated the dynamics by which ptDNA increases during plant development in more detail. In addition, we analysed chloroplast nucleoid structure and number per cell.

Changes in plastid DNA amounts during development do not correlate with inheritance strength
To investigate whether potential differences in ptDNA increase during development and/or changed ratios of plastid/nuclear DNA are able to explain chloroplast competition, we performed quantitative real-time PCR. In general, plastid DNA amounts are not static during ontogeny [58,76]. They increase as leaves grow, starting from 0.4% in meristematic tissue to more than 20% in mature leaves [57]. If differences in DNA abundance during development were observed between Oenothera lines harbouring chloroplasts with different inheritance strengths, this might hint towards DNA replication or replication speed as an underlying mechanism for plastid competition. To monitor this process we analysed total DNA of the johansen Standard strain equipped with the strong and the weak wild type chloroplast I-johSt and IV-atroSt, respectively. In addition, we included selected lines of our plastome Depending on the plastome target region, at 5 DAS a small increase of ptDNA amounts was observed in the lines V1c, V3e, V3c, VC1 and V3g. These differences, however, are not significant. At 21 DAS the weaker variants V3c, V3g and VC1 showed an increase in relative ptDNA amounts compared to wild type I-johSt, but only for the target ndhI, and this increase was again not significant. In general, from 5 to 21 DAS only a minor or no increase of plastid DNA amount was observed for each particular line and each plastid target, while in the young rosette at 32 DAS the ptDNA amount doubled (Fig. S11). These results echo previous work in Arabidopsis and sugar beet [57,76]. Since the same results were obtained for plant lines carrying strong and weak plastids, no developmental differences in ptDNA copy numbers correlate with differential transmission efficiencies.
In summary, none of the minor increases in ptDNA amount is significant, and none correlates with transmission efficiencies or with the DNA variations described previously (Fig. S10B). Moreover, no differences can be detected in IV-atroSt compared to wild type I-johSt, although plastome IV is the weakest of all genotypes tested. Therefore, the ptDNA amounts in vegetative tissues do not indicate different replication speeds, suggesting that replication per se is not the underlying mechanism for different transmission efficiencies.

Nucleoid number and structure are identical in lines with different inheritance strength
Under the premise that ptDNA amounts are constant, there is still the possibility that strong and faster replicating plastomes have altered numbers of nucleoids, which could impact their ability to divide.
To exclude this possibility we quantified nucleoids in the central laminal region of the first true leaf 25 DAS. After staining with DAPI, nucleoids were clearly visible as small dots with their fluorescence sharply delimiting them from the dark cellular background, even when forming tight associations like clumps or threads (Figs. S12 and S13). One strong (I-johSt) and three weak lines (V3g, VC1, and IV-atroSt) were investigated. The mean number of nucleoids per chloroplast ranges between 17.7 and 18.1 with no significant differences between the lines. One-way ANOVA gave p = 0.53; multiple t-tests comparing I-johSt with each of the weaker plastomes did not point to significant differences either (Fig. S14 and Table S12). Moreover, we did not observe any difference in nucleoid morphology between the lines.
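As an illustration of the kind of comparison reported for the nucleoid counts, the following is a minimal sketch of the F statistic underlying a one-way ANOVA. The function name `one_way_anova_f` and the toy counts are hypothetical and are not the measured data; converting F into a p-value additionally requires the F distribution, which is not part of Python's standard library and is therefore omitted.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    k = len(groups)                       # number of lines compared
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = mean(v for g in groups for v in g)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical nucleoid counts per chloroplast for three lines (toy values)
toy_counts = [[17, 18, 19], [18, 18, 18.5], [17.5, 18, 19]]
f_stat, df_b, df_w = one_way_anova_f(toy_counts)
```

An F value close to or below 1 means that the differences between group means are no larger than the scatter within groups, which is the situation a non-significant ANOVA result describes.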

Expression and transcript maturation of accD and ycf2
Since the polymorphisms in oriB cannot explain differences in competitive ability, we investigated the accumulation of accD and ycf2 transcripts. were present. Moreover, the mature transcript clearly over-accumulated in this weak, but phylogenetically more distant plastome. This transcript over-accumulation appears to be a result of the high sequence variation observed in the accD promoter/5'-UTR that strongly correlates with inheritance strength (see Fig. 1, Fig. S8 and above). A similar analysis was conducted for ycf2, where again a probe specific for the C-terminal part of the gene detected the mature transcript at the expected size of about 9 kb. The very small differences in size between the lines perfectly mirror the occurrence of in-frame deletions in the lines IV-atroSt, V3c, V3g, and VC1 (Data S2). In IV-atroSt as well as in the plastome I variants, no difference in accumulation of the mature transcript compared to I-johSt was found. However, transcript stability/processing seems to vary between the strong plastome I-johSt and the weak IV-atroSt. Interestingly, whereas the strong variants V1c, V3e and, to some extent, V2g showed exactly the same transcript pattern as the wild type, the weak variants showed a pattern more similar to IV-atroSt. This indicates a correlation between transmission efficiency and mutations in site 3 of ycf2 (cf. Fig. S2) that might result from altered mRNA degradation and/or processing.

ACCase activity in lines harbouring chloroplasts of different inheritance strength
As the analysis described above indicates an involvement of accD and/or ycf2 in the inheritance phenotype, we decided to determine ACCase activity in our lines. From these measurements, it appeared that the strong variants (V1c, V3e, V2a, and V2g) display a similar or even lower ACCase activity than their wild type I-johSt. The same holds true for the strong to intermediate or intermediate genotypes (V3c and V3d). In the weak materials, however, a 2-3 fold increase of ACCase activity is observed for VC1 and IV-atroSt, although V3g shows wild type enzyme activity (Fig. 3A). Although there is no simple linear correlation between inheritance strengths and ACCase activity, the strong increase in VC1 and IV-atroSt is hard to ignore. In fact, both inheritance strength and ACCase activity seem to depend on the particular mutation pattern: (i) Mutations in ycf2 seem to influence ACCase activity, as judged from the variant V3e that is wild type for the accD segments but is mutated in ycf2 (Fig. 3A, yellow box). (ii) Larger mutations in the AccD N-terminus are associated with higher ACCase activity, whereas the presence of a more diminutive AccD N-terminus correlates with lower activity (Fig. 3A, cf. blue boxes vs. the remaining pattern). (iii) There must be an influence of ycf2 on inheritance strength (cf. Fig. 3A, green boxes associated with the weaker materials). Hence, if ACCase and/or Ycf2 result in changed lipid levels, one would expect lipid composition to be predictive of inheritance strengths.

Predictability of inheritance strength based on lipid levels
To test for predictability of inheritance strength from lipid level data, we analyzed 16 chloroplast genotypes of different inheritance strength in a LASSO regression model (Table S1; Materials and Methods). Since chloroplast inheritance strength is independent of photosynthetic competence (see below), we included pale lines. The aim was to enrich the lipid signal responsible for inheritance strength, i.e. to deplete for the structural lipids of the thylakoid membrane [77]. Namely, we used our bleached psaA mutants I-chi and IV-delta impaired in photosystem I assembly, as well as the pale green virescent genotypes III-lamS, III-V1 and III-V2 (Tables S1 and S7; Fig. S15). For such mutants, perturbed thylakoid membrane formation was shown previously [78,79,80,81]. Moreover, as elaborated in the following chapters, we could confirm with these plastomes that inheritance strength is independent of a pale phenotype.

Chloroplast inheritance strength is independent of bleaching
Intuitively one might expect that bleached chloroplast mutants would be less successful in crosses than their corresponding green wild types. However, previous analyses in evening primroses showed that differences in chloroplast inheritance strength are largely independent of the chloroplast mutant used for the analyses [14,82]. At least for Oenothera, it is therefore generally accepted that mutations in a chloroplast genome that result in bleaching essentially do not affect chloroplast assertiveness rates [7] (also see Fig. 1B,D and above). Due to technical limitations, however, this hypothesis was never tested directly. Since closure of this gap is of general relevance for this work, and to provide further evidence that chloroplast inheritance strength is largely independent of the photosynthetic status of the chloroplast, we directly compared the wild type chloroplast I-hookdV and its bleached derivative I-chi, as well as IV-atroSt and the corresponding mutant IV-delta. For this, we investigated appropriate F1 populations crossed to the chloroplast genomes I-johSt, VC1, and V3g with the MassARRAY® system (Fig. S18; see Materials and Methods for details on the material). As expected, transmission efficiencies were found to be in the same range for nearly all six pairs of crosses under investigation. Only in one cross with VC1 as a mother did the bleached mutant I-chi actually behave even stronger than its corresponding green wild type.
Taken together, we could confirm that chloroplast assertiveness rates are independent of photosynthetic capability. Moreover, these results make it very unlikely that the differences in inheritance strengths observed for the mutated plastome I variants (all sharing a green phenotype) are due to a secondary effect.

The very weak variants III-V1 and III-V2
While the plastome I variants and their wild type I-johSt are native in and compatible with the nuclear background of the johansen Standard race, III-lamS and its plastome mutator variants III-V1 and III-V2 are foreign and incompatible in this background, meaning that tissues carrying them do not develop a normal green colour (Materials and Methods, Fig. S15, Table S9). The wild type III-lamS plastome appears to be strong, as judged from crosses to I-johSt as pollen donor, whereas its derivative variants III-V1 and III-V2 are weak (cf. Fig. 2A vs. Fig. S19; cf. Tables S8, S9 and S11) [14]. Although the fraction of plants showing biparental inheritance in the crosses of III-V1 and III-V2 to I-johSt as pollen donor is somewhat low (37.3% and 38.0%, respectively) for a combination of a weak and a strong plastome (Fig. 2, Table S8) [7], the striking difference of this cross to all other crosses described is that some seedlings inherit only paternal chloroplasts (Fig. S19; Table S9). As mentioned previously, biparental inheritance in the evening primrose shows maternal dominance, in which progeny are either homoplasmic for the maternal chloroplast or heteroplasmic for the maternal and the paternal chloroplasts, but they are never homoplasmic for the paternal one. The appearance of homoplasmic offspring having the paternal chloroplast in the III-V1/III-V2 crosses to I-johSt is the only reported case in the evening primrose where an exception to maternal dominance occurs. This justifies the definition of a new inheritance class for these plastomes.

Classes of inheritance strength employed in the LASSO regression model
To predict chloroplast inheritance strength from lipid-level data, the genotypes of the plants needed to be ranked according to their inheritance strengths (Materials and Methods; Table S1). For the green variants (V1c, V2a, V2g, V3e, V3c, V3d, VC1, and V3g) and the wild types I-johSt and IV-atroSt, we used the existing k-means classes 1-4 already employed in our association mapping approach (see above). The remaining plastomes (I-chi, IV-delta, I-hookdV, III-lamS, III-V1 and III-V2) were rendered consistent with this framework based on the classification of Schötz and our own data. This adds the plastome I-chi, its wild type I-hookdV and III-lamS to the strong class 1. The mutant IV-delta was placed into the weak class 4. As a result of the exceptions to maternal dominance when III-V1 and III-V2 were seed parents, these plastomes are placed in a new class 5 (very weak). Taken together, the 16 genotypes are classified into 5 classes of descending inheritance strengths (strong = 1, strong to intermediate = 2, intermediate = 3, weak = 4, and very weak = 5). Material in class 1 and class 4/5 is over-represented, since (as for the inclusion of the bleached material; see above) we expect to enhance the signal for predictive lipids (Table S1).

Predictability of inheritance strength based on lipid-level data as explanatory variables
The rationale of the predictive approach is as follows: To test for predictability of inheritance strength based on lipid-level data, a linear model (LASSO) was trained and its performance tested in a cross-validation setting on two randomly selected genotypes with differing inheritance strength (see Materials and Methods). If the proposed regression model has predictive power, the actual inheritance strength values associated with the two test genotypes should be positively correlated with their predicted ones. Note that for each genotype repeated measurements were available. Thus, regression was performed over more than two points and correlation coefficients could assume absolute values differing from 1. Testing was done in a cross-validation setting, i.e. the two test genotypes were not included in the model training. This procedure was repeated 100 times, with each run corresponding to two new randomly selected genotypes of differing inheritance strength and all others used for model training.
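The hold-out scheme described above can be sketched as follows. This is a minimal illustration of the leave-two-genotypes-out loop only, not the implementation used in this study: the model is passed in as a generic `fit` callable (a LASSO fit would be plugged in there), and the names `leave_two_out_runs`, `pearson_r`, and the data layout are assumptions made for the example.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def leave_two_out_runs(samples, fit, n_runs=100, seed=0):
    """samples: list of (genotype, strength_class, feature_vector), with
    repeated measurements per genotype. fit(X, y) must return a function
    mapping a feature vector to a predicted strength value.
    Assumes at least two genotypes with differing strength classes."""
    rng = random.Random(seed)
    strength = {g: s for g, s, _ in samples}
    genotypes = sorted(strength)
    correlations = []
    while len(correlations) < n_runs:
        g1, g2 = rng.sample(genotypes, 2)
        if strength[g1] == strength[g2]:
            continue  # test genotypes must differ in inheritance strength
        held_out = {g1, g2}
        train = [(x, s) for g, s, x in samples if g not in held_out]
        test = [(x, s) for g, s, x in samples if g in held_out]
        predict = fit([x for x, _ in train], [s for _, s in train])
        actual = [s for _, s in test]
        predicted = [predict(x) for x, _ in test]
        correlations.append(pearson_r(actual, predicted))
    return correlations
```

Because the two test genotypes are excluded before fitting, each correlation coefficient reflects prediction on unseen genotypes rather than a fit to the training data.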
If, indeed, inheritance strengths can be predicted based on lipid levels, on average, a positive correlation (Pearson correlation coefficient, r) between actual and predicted inheritance strength values of the two test set genotypes should be obtained. To test for this outcome, the 100 Pearson correlation coefficients were classified as positive (success) or negative (failure). The number of successes was then compared to the null hypothesis of no predictive value, corresponding to a 50% chance of obtaining a positive correlation, and significant deviations from this expected probability were tested by performing a binomial test.
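The success/failure test described above can be sketched as an exact binomial tail probability. This is a minimal sketch under stated assumptions: the function name `binomial_tail_p` is hypothetical, and a one-sided test (probability of at least the observed number of positive correlations) is assumed, since the text does not specify whether the test was one- or two-sided. The value it returns therefore need not reproduce a reported p-value exactly.

```python
from math import comb


def binomial_tail_p(successes, trials, p0=0.5):
    """One-sided exact binomial test: P(X >= successes) for X ~ Binomial(trials, p0)."""
    return sum(
        comb(trials, k) * p0 ** k * (1 - p0) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Example: number of positive correlations among 100 cross-validation runs,
# tested against the null hypothesis of a 50% chance of a positive correlation.
p_value = binomial_tail_p(82, 100)
```

The further the number of positive correlations lies above half of the runs, the smaller this tail probability becomes.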

The lipid classes DGDG, PG, PC, and PE are enriched for predictive lipids
From the 100 cross-validation runs, using the combined dataset from three independent experimental series (Table S1), a median Pearson correlation coefficient (cvR) between actual and predicted inheritance strength values of cvRmedian = 0.7 was obtained (Fig. 3A), with 82 runs being positive, i.e. successful predictions. This corresponds to pbinomial = 2.17 × 10^-9 vs. the null hypothesis of 0.5 (no predictive value). Thus, lipid levels indeed proved predictive of inheritance strength.
Individual lipids were ranked with regard to their predictive value based on the coefficients by which they entered the regression model (Fig. 3C; Table S2). Averaged over all 100 cross-validation runs, 20 lipids/molecules were identified as predictive as judged by their average absolute weight.
They were considered predictive if their absolute average weight was greater than one standard deviation (SD = 0.7) of the average weights of all 102 lipids/molecules. Among those, lipid/molecule classes DGDG, PG, PC, and PE were found enriched (odds ratio > 1), although statistical significance could not be established (Table S3).
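The selection criterion described above can be sketched as a simple threshold on averaged model weights. This is a minimal sketch: the function name `predictive_lipids`, the placeholder lipid names, and the toy weights are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def predictive_lipids(weights_by_lipid):
    """weights_by_lipid maps a lipid name to its regression weights, one per
    cross-validation run. Returns (ranked predictive names, SD threshold)."""
    # Average weight of each lipid over all runs
    avg = {name: mean(ws) for name, ws in weights_by_lipid.items()}
    # One standard deviation of the average weights across all lipids
    # (sample SD assumed)
    threshold = stdev(list(avg.values()))
    # Rank by absolute average weight, largest first
    ranked = sorted(avg, key=lambda name: abs(avg[name]), reverse=True)
    return [name for name in ranked if abs(avg[name]) > threshold], threshold


# Toy weights for four placeholder lipids over two runs (not measured data)
toy_weights = {
    "lipid_a": [2.0, 2.2],
    "lipid_b": [-1.9, -2.1],
    "lipid_c": [0.1, -0.1],
    "lipid_d": [0.2, 0.0],
}
selected, sd = predictive_lipids(toy_weights)
```

Taking the absolute value before thresholding keeps lipids whose levels are negatively associated with the inheritance class as well as those that are positively associated.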

A model for the predictability of inheritance strength based on lipid levels
Taking into account the data of Fig. 3, we propose the following model to explain how certain changes in lipid abundance influence inheritance strengths: Increased activity of acetyl-CoA carboxylase in the chloroplast (Fig. 3A) increases fatty acid concentrations and subsequently fatty acid export to the endoplasmic reticulum (ER). The combination of increased fatty acid synthesis and export leads to an upregulation of the eukaryotic lipid biosynthesis pathway in the ER [83]. This is seen in increased amounts or shifted proportions of diverse lipid classes, including phospholipids or storage lipids. Since PC is the dominant lipid class in the chloroplast outer envelope [19], those changes affect the structural and physiological properties of the envelope. This in turn affects chloroplast division and/or stability processes, thus ultimately determining inheritance strengths (see Main Text). Conceivably, the Ycf2 protein, which is located in the envelope [84], participates directly in the transport of fatty acids, or its function may be responsive to changes in the lipid composition of the envelope, thus influencing other transport processes and/or growth and division of the chloroplast. The observed shift in the proportions of storage lipids or other changes in the lipidome (cf. DGDGs or TAGs in Fig. 3C) might occur in response to altered fatty acid pools, although the storage lipids are likely not relevant for the variation in chloroplast inheritance strengths.
Variants. Sequence variation in both sequence sets is conferred by large tandem or direct repeats.
Note that sites 1 and 2, but not site 3, correlate with inheritance strengths in the wild types, whereas multiple deletions in site 3 are associated with the weaker inheritance phenotype of the variants.
Comparison between the lines. Note lack of statistically significant differences (Tables S13 and S14; Supplementary Text for details). Scale bar = 10 µm.
x I-chi) was tested with Kruskal-Wallis one-way ANOVA on ranks (* p < 0.05).
5) According to Renner [98] this line was originally "received from Amsterdam" by N. v. Gescher in 1907. The material is quite likely identical to that collected by D. T. MacDouglas in 1902/1903 as described in [85]. Also see [100].
In Oenothera, five genetically distinguishable plastome types (I-V) can be recognized based on their compatibility with three nuclear genomes (A, B, C) in either homozygous (AA, BB, CC) or stable heterozygous (AB, AC, BC) states. The basic plastome genotype is accompanied by a given inheritance strength (strong, intermediate and weak). Basic plastome and nuclear genome type are an important factor of species definition in Oenothera. For details see, e.g. [6,7,11,12,14,30,35].
3) Percentage of variegated seedlings, heteroplasmic due to the paternal transmission of the bleached chloroplast mutants "biennis white" or "blandina white" and the maternal transmission of the green wild type chloroplast of the seed parent. Data according to Schötz [6]. For details see therein, reviews in [7,12] and Supplementary Text.
4) Class of inheritance strength as determined by F. Schötz. For reviews see [7,10,12].
5) For details on the definition of these classes see Supplementary Text.
6) The chloroplast genomes of the two suaveolens strains are identical.
6) For details on the donor strains, see Table S4.
Table S6 for details on the wild type plastomes.
2) See Table S7 for details on the chloroplast mutants.
3) All crosses were performed in the nuclear genetic background of O. elata ssp. hookeri strain johansen Standard. For details on the crosses see Materials and Methods, on the line see Table S4.
Table S6 for details on the wild type plastomes.
2) All crosses were performed in the constant nuclear genetic background of O. elata ssp. hookeri strain johansen Standard. For details on the crosses see Materials and Methods, on the line see Table S4.
2) For details on the wild type or variant plastomes see Table S8.
3) Means ± standard deviations are given.
4) p-values obtained with two-tailed homoscedastic t-test followed by multiple testing correction according to Benjamini-Hochberg.
5) Chloroplast volume index calculated according to [60].