Drosophila intestinal stem and progenitor cells are major sources and regulators of homeostatic niche signals

Significance
Most epithelia are turned over throughout adult life as cells are lost from the surface and replaced by the proliferation of stem cells. Precise regulation of stem cells by signals from the local microenvironment, or niche, is important to maintain epithelial homeostasis. Here, using intestinal stem cells of the Drosophila midgut as a model system, we apply transcriptome profiling to identify genes expressed specifically in stem and progenitor cells and not in their differentiated daughters. We find that stem and progenitor cells express ligands of major developmental signaling pathways, both contributing to the niche and regulating the production of niche signals by other cell types.


Cloning and transgenic fly generation
Full-length Sox21a was amplified from cDNA with Phusion polymerase using the forward primer attagcggccgctgatgacgagcatctcggcc and the reverse primer attactcgagtcaaatgatgtttggcggact, and cloned into pUAST-LT3-NDam and pUAST-attB using the NotI and XhoI restriction sites (NEB). The Dam construct was injected into yw;;attP2 (Bestgene) and the overexpression construct into yv;;attP40 (lab stock).
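As a quick sanity check on the cloning strategy, the two primers can be scanned for the restriction sites used for cloning. The sketch below is illustrative only: the primer strings are copied from the text, the recognition sequences are the standard ones for NotI (GCGGCCGC) and XhoI (CTCGAG), and the function name `find_sites` is a hypothetical helper, not part of any published pipeline.

```python
# Primer sequences copied from the methods text.
FORWARD = "attagcggccgctgatgacgagcatctcggcc"
REVERSE = "attactcgagtcaaatgatgtttggcggact"

# Standard recognition sequences for the two enzymes named in the text.
SITES = {"NotI": "GCGGCCGC", "XhoI": "CTCGAG"}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based position} for each recognition site present in the primer."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


print(find_sites(FORWARD))  # {'NotI': 4}
print(find_sites(REVERSE))  # {'XhoI': 4}
```

In both primers the site sits after a four-base 5' overhang (atta), with NotI in the forward primer and XhoI in the reverse primer.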

Staining and imaging
Whole midguts were dissected in PBS on ice and fixed in 4% PFA in PBS at RT for 30 minutes. Fixed guts were washed 3x in PBS, blocked in 1% BSA, 0.5% Triton, 5% NDS in PBS for 30 min at RT and incubated with primary antibodies overnight at 4°C in PBS with 0.5% Triton. Guts were then washed 3x 10 min in PBS, incubated with appropriate secondary antibodies for 2 hours at RT, incubated with DAPI (1/2000 of a 1 mg/ml stock) in PBS with 0.5% Triton, washed 3x 10 min in PBS and mounted in Vectashield. Primary antibodies used were: anti-GFP (Rb 1/2000, Molecular Probes A6455; Mse 1/500, Molecular Probes A11120), anti-GFP (Ckn 1/1000, Abcam ab13970), anti-βgal (Rb 1/10000, Cappel), anti-Pros (Mse 1/100, DSHB MR1A). Secondary antibodies were anti-rabbit, -mouse, -rat or -chicken raised in donkey and conjugated to Alexa Fluor 488 or 555 (Molecular Probes). Z-stack images were acquired from the posterior midgut on a Zeiss LSM 780 or Leica SP5 confocal microscope with a 40x oil immersion lens, using identical acquisition conditions for all samples from a given replicate of a given experiment.

qRT-PCR
15 midguts per sample were dissected in PBS on ice, and RNA was extracted using TRIzol reagent and purified with an RNeasy kit. cDNA was generated with the iScript reverse transcriptase kit using 1 µg of RNA as template. qRT-PCR was performed using SYBR Green on a Bio-Rad CFX96 system. Analysis was performed using Bio-Rad CFX software and the ΔΔCq method, using GAPDH as a control. Primers used for qRT-PCR were:
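As a note on the ΔΔCq analysis named in this section, the calculation can be sketched as follows. This is a minimal illustration rather than the Bio-Rad CFX implementation: it assumes 100% amplification efficiency (a doubling per cycle), the function name `relative_expression` is hypothetical, and the Cq values in the usage lines are made-up numbers.

```python
def relative_expression(cq_target_sample: float, cq_ref_sample: float,
                        cq_target_control: float, cq_ref_control: float) -> float:
    """Fold change of a target gene by the ΔΔCq method.

    Assumes perfect (2-fold per cycle) amplification efficiency.
    The reference gene would be GAPDH in the experiments described here.
    """
    # Normalise the target to the reference gene within each condition.
    delta_sample = cq_target_sample - cq_ref_sample
    delta_control = cq_target_control - cq_ref_control
    # Compare the experimental sample with the control condition.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Cq values: the target crosses threshold 2 cycles earlier
# (relative to the reference gene) in the sample than in the control.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)  # 4.0
```

A value above 1 indicates higher relative expression in the sample than in the control, and a value below 1 indicates lower expression.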

Targeted DamID Experiments
esg-GAL4 ts and Myo1A-GAL4 ts flies were crossed to UAST-LT3-NDam and UAST-LT3-NDam-PolII flies at 18°C. Progeny were collected in 48-hour batches and aged for a further 3 days at 18°C before transfer to 29°C for 24 hours to induce Dam protein expression. 40 midguts per condition were dissected in cold PBS and stored at -80°C. Methylated fragments were isolated and next-generation sequencing libraries were prepared as described previously (3,8) and below.

DamID Library Preparation, Sequencing and Data Analysis
Genomic DNA was isolated using a DNeasy Blood and Tissue kit (Qiagen) and digested overnight with DpnI. Adaptors were then ligated using T4 DNA ligase (Roche) for 2 hours at 16°C, and ligated products were digested with DpnII. Ligated fragments were enriched by 17 cycles of PCR amplification using Advantage DNA polymerase (Clontech). DamID adaptors were then removed by digestion with AlwI. For PolII and control experiments, sample shearing (Covaris), Illumina library preparation (PrepX ILM Kit on an Apollo 324) and sequencing (Illumina HiSeq 2500, 50 bp single reads, v3 chemistry) were performed at the Bauer Core Facility, FAS Division of Science, Harvard University. For Sox21a and associated control library preparation, samples were sonicated in a Bioruptor Plus (Diagenode) to reduce the average DNA fragment size to 300 bp, and DamID adaptors were removed via overnight AlwI digestion. The resulting DNA was purified via magnetic bead clean-up using Seramag beads (42) in 20% (w/v) PEG-8000, and 500 ng of DNA was processed for Illumina sequencing. 50 bp single-end reads were obtained on a HiSeq 1500 (Illumina) as previously described (8,9). Libraries were multiplexed so as to yield at least 20 million mapped reads per sample. Sequencing files are available from GEO (GSE101814).
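Because DpnI cuts only at methylated GATC sites and DpnII only at unmethylated ones, DamID signal is naturally reported per GATC fragment, that is, per stretch of genome between neighbouring GATC motifs. The sketch below illustrates how such fragments can be enumerated in a sequence. It is a toy example under stated assumptions: `gatc_fragments` is a hypothetical helper, the input sequence is invented, and the real analysis uses precomputed genome-wide GATC files rather than this function.

```python
import re


def gatc_fragments(seq: str) -> list:
    """Return (start, end) pairs, 0-based, spanning consecutive GATC motifs.

    Each pair runs from the start of one GATC motif to the start of the next.
    GATC is palindromic, so scanning one strand finds every site.
    """
    positions = [m.start() for m in re.finditer("GATC", seq.upper())]
    return list(zip(positions[:-1], positions[1:]))


# Invented example sequence with three GATC motifs.
example = "AAGATCTTTTGATCCCGATCAA"
print(gatc_fragments(example))  # [(2, 10), (10, 16)]
```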
Analysis of DamID data was performed as previously described (8,9). FASTQ files for Cic DamID were obtained from GSE74188 (10). R scripts for the complete data analysis pipeline are available at https://github.com/AHBrand-Lab. NGS reads in FASTQ format were aligned and processed using damidseq_pipeline 1.4 with default parameters. The resulting gatc.gff ratio files were averaged for each dataset. RNA Pol II (RPII215) datasets were processed with the polii.gene.call script (available at https://github.com/AHBrand-Lab) to call genes with significantly enriched RNA Pol II occupancy (FDR < 0.01). For Sox21a and Cic binding data, peaks were called from averaged GFF ratio files using find_peaks with -gene_pad=0; a gene was associated with a peak if the peak overlapped its gene body. Processed RNAseq data were obtained from GSE61361. All other analyses were performed using R (www.r-project.org).
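The peak-to-gene assignment rule described above (a gene is associated with a peak if the peak overlaps the gene body, with no padding) can be sketched as a simple interval test. This is not the find_peaks implementation; the `Interval` type, the function names and the coordinates are hypothetical, and the `pad` argument only mimics the idea of extending the gene body on both sides.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (GFF-style coordinates)
    end: int    # 1-based, inclusive
    name: str


def overlaps(peak: Interval, gene: Interval, pad: int = 0) -> bool:
    """True if the peak overlaps the gene body extended by `pad` bases on each side."""
    return (peak.chrom == gene.chrom
            and peak.start <= gene.end + pad
            and gene.start - pad <= peak.end)


def genes_with_peaks(genes, peaks, pad: int = 0) -> list:
    """Names of genes overlapped by at least one peak (naive all-against-all scan)."""
    return [g.name for g in genes if any(overlaps(p, g, pad) for p in peaks)]


# Invented coordinates for illustration only.
genes = [Interval("chr2L", 100, 500, "geneA"),
         Interval("chr2L", 1000, 2000, "geneB"),
         Interval("chr3R", 100, 500, "geneC")]
peaks = [Interval("chr2L", 450, 700, "peak1"),
         Interval("chr2L", 2500, 2600, "peak2")]

print(genes_with_peaks(genes, peaks))           # ['geneA']
print(genes_with_peaks(genes, peaks, pad=600))  # ['geneA', 'geneB']
```

With zero padding only peaks that touch the gene body itself count, which is the stricter of the two behaviours shown. The all-against-all scan is adequate for a sketch; genome-scale data would normally use sorted intervals or an interval tree.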

Data Analysis and Figure Preparation
Gene ontology PANTHER over-representation analysis was performed using GO Ontology database release 2016-11-30 at geneontology.org (11,12). Ortholog identification was performed using DIOPT (13), with the highest-scoring orthologs included in Figure 2A. Venn diagrams were initially plotted using BioVenn (14). Figures were assembled in Adobe Illustrator. Microscope images were processed in Fiji (15). Unprocessed Z stacks were maximum-projected for quantification and figures. Background signal was subtracted using the remove outliers function, and brightness was adjusted (equally across comparable images) where necessary for figure panel clarity. Plots were generated using RStudio, with the exception of Figure 1D. Statistical analysis was done in Microsoft Excel unless otherwise stated.
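For readers unfamiliar with the term, a maximum projection collapses a Z stack into a single image by keeping, at each pixel position, the brightest value found in any slice. The sketch below shows the operation on nested lists; it is an illustration only (the projections in this work were made in Fiji), `max_projection` is a hypothetical name, and the pixel values are invented.

```python
def max_projection(stack: list) -> list:
    """Collapse a Z stack (list of equally sized 2D slices) into one 2D image.

    Each output pixel is the maximum intensity at that (row, column)
    position across all slices.
    """
    return [
        [max(pixel_values) for pixel_values in zip(*rows)]
        for rows in zip(*stack)  # iterate over matching rows from every slice
    ]


# Invented 3-slice stack of 2x2 images.
stack = [
    [[1, 5], [0, 2]],
    [[3, 4], [7, 1]],
    [[2, 9], [6, 0]],
]
print(max_projection(stack))  # [[3, 9], [7, 2]]
```

With NumPy, the equivalent for an array of shape (z, y, x) is `stack.max(axis=0)`.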

Supplementary Dataset Legends
Dataset S1 - Full list of expressed genes at FDR < 0.01 in each population
Dataset S2 - Expression of positive control genes (values given only for those with significant expression).

Figure S1 (related to Fig 1) Comparison to Previous Genome-wide Datasets
A. Significantly enriched gene ontology terms in ISC/EB specific genes.
B. Significantly enriched gene ontology terms in EC specific genes.
C and D. Correlation between POLII DamIDseq and RNAseq data for ISC/EB and EC respectively.
E and F. Correlation between POLII DamIDseq and RNAseq in ISC/EBs highlighting known ISC/EB specific genes (E) and transcription factors (F).
G. Comparison of DamIDseq expression in ISC/EB to hits from a genome-wide screen of ISC/EB regulators.

-Expression of Stem/Progenitor Transcription Factors
A-C. Maximum projection Z stacks showing expression of conserved TFs in the midgut. Background signal was subtracted using the remove outliers function (Fiji) and brightness/contrast was increased for clarity. DAPI is blue and the TF is green as indicated; red is Dl-lacZ (A), Su(H)-lacZ (B) or prospero (C). Scale bars are 20 µm. Quantification is shown in Figure 2F-H.
D. Zfh2-GAL4>UAS-EGFP shows expression (green) mostly in small ISC/EB cells that are negative for the EE marker prospero (red), and in occasional EEs and ECs. Blue is DAPI. Scale bar is 20 µm.
E. Zfh2 antibody (Rat 1/500, from Chris Doe) staining shows expression (red) in small cells and not in large ECs. Scale bar is 20 µm.

[Figure panels E and F: axis label "RNAi Target"; conditions Control and Ilp6_2]