Template and primer requirements for DNA Pol θ-mediated end joining


DNA Pol θ-mediated end joining (TMEJ) is a microhomology-based pathway for repairing double-strand breaks in eukaryotes. TMEJ is also a pathway for nonspecific integration of foreign DNAs into host genomes. DNA Pol θ shares structural homology with the high-fidelity replicases, and its polymerase domain (Polθ) has been shown to extend ssDNA without an apparent template. Using oligonucleotides with distinct sequences, we find that with Mg2+ and physiological salt concentrations, human Polθ has no terminal transferase activity and requires a minimum of 2 bp and optimally 4 bp between a template/primer pair for DNA synthesis. Polθ can tolerate a mismatched base pair at the primer end but loses >90% activity when the mismatch is 2 bp upstream from the active site. Polθ is severely inhibited when the template strand has a 3′ overhang within 3-4 bp from the active site. In line with its TMEJ function, Polθ has limited strand-displacement activity, and the efficiency and extent of primer extension are similar with or without a downstream duplex.
Double-strand breaks (DSBs) may form endogenously by necessity or exogenously due to ionizing radiation, CRISPR-mediated gene targeting, or DNA damage that interferes with replication. In eukaryotes, DSBs are repaired by homologous recombination (HR), nonhomologous end joining (NHEJ), and Pol θ-mediated end joining (TMEJ) (1-3). HR occurs mainly in the S and G2 phases, when sister chromatids are available for repair. NHEJ takes place predominantly in the G1 phase and depends on Ku protein and DNA-PKcs for end sensing and protection. An alternative end-joining pathway for DSB repair, which depends on limited sequence complementarity between 3′ overhangs (also known as microhomology), was discovered two decades ago (4,5). Although the Pol θ homolog in Drosophila, Mus308, was found to play a role in repair of DNA breaks induced by a cross-linking agent as early as 1996 (6), only in recent years has Pol θ been shown to be required for the alternative end-joining pathway in Caenorhabditis elegans, plants, and mice, a pathway denoted microhomology-mediated end joining (MMEJ) or alt-NHEJ (7-9). However, both the NHEJ and alt-NHEJ pathways depend on complementary bases between 3′ overhangs (10,11), and the unique player in alt-NHEJ is Pol θ. We thus choose to use the name TMEJ as proposed by Tijsterman and coworkers (12). In addition to repairing DSBs and reducing the sensitivity to ionizing radiation, Pol θ is involved in integration of exogenous DNAs at nonspecific targets and is implicated in CRISPR-dependent gene targeting (13-15).
Pol θ is a single-subunit A-family DNA polymerase and is homologous to many replicative polymerases, including those of bacteria and bacteriophages and mitochondrial DNA polymerase γ (1,16). However, it lacks the proofreading function that is characteristic of polymerases with high fidelity and a low error rate. The polymerase activity of Pol θ is essential in TMEJ, from annealing the 3′ overhangs to filling the gaps between broken DNA ends (10). In addition to the polymerase domain located at the C terminus, Pol θ contains an ATP-dependent helicase domain at the N terminus, which may form a homodimer (17). The Pol θ helicase domain has been shown to contribute to end joining in Drosophila and has been suggested to facilitate homology search during TMEJ (10,18). The crystal structure of the Pol θ polymerase domain (abbreviated as Polθ) complexed with DNA and an incoming dNTP (19) reveals that Polθ binds a DNA-duplex substrate as do many of its homologs involved in DNA replication or repair.
In contrast to the in vivo observation of microhomology dependence of TMEJ and the crystal structure of DNA-duplex binding by Polθ (10,19), in vitro characterizations of human Polθ have reported a template-independent terminal transferase (terminal deoxynucleotide transferase, TdT) activity (20) in addition to the template-dependent DNA synthesis (21,22). To detect the TdT activity, DNA synthesis assays were conducted with human Polθ expressed in Escherichia coli, at a 10:1 molar ratio of polymerase to ssDNA, in a buffer without any salt at 42°C for 2 h (20). Whether and why Polθ extends ssDNA without a template remains an unsettled question, and how long a microhomology Polθ requires to bridge DNA ends with 3′ overhangs remains to be defined under physiological conditions.
To avoid intrastrand self-annealing ("snap-back") or random strand pairing, we used short and simple purine-only or pyrimidine-only DNA oligonucleotides to characterize the template and primer requirements for DNA synthesis by human Polθ expressed in mammalian cell culture. Under physiological conditions, we find that Polθ has no terminal transferase activity and requires a minimum of two and optimally four Watson-Crick base pairs between a template and primer for efficient DNA synthesis. Furthermore, we find that Polθ has limited strand-displacement activity.

Significance
Using oligonucleotides composed solely of purines or pyrimidines, we find that human Polθ is a bona fide template-dependent DNA polymerase and has no terminal transferase activity under physiological conditions. Polθ requires a minimum of 2 bp and optimally 4 bp between a template/primer pair for efficient and processive DNA synthesis. Polθ polymerase activity is inhibited by a 3′ overhang on the template strand. Consistent with in vivo observations of DNA deletion during TMEJ, 3′-exonucleases or 3′-flap endonucleases are likely recruited to remove such hindrances when microhomology is internal to the 3′ ends. In contrast to a previous report, we observed limited strand displacement activity by Polθ. Because Pol θ lacks a proofreading exonuclease activity and is error prone, minimal strand displacement can only be beneficial for TMEJ.

Results
Polθ Is Template-Dependent in the Presence of Mg2+. Using recombinant human Pol θ polymerase domain purified from HEK293 cells (amino acids 1,822-2,590, 86 kDa; Fig. 1A), referred to as "Polθ" in this paper, we confirmed that Polθ robustly extended ssDNAs of mixed purine and pyrimidine sequences (Fig. 1B). At 37°C in the presence of Mg2+, if the ssDNA (13 nt in length) contained an internal dA (T13-1 and T13-2) that could base pair with the dT at the 3′ end, Polθ extended the primer up to 50 nt. But in the absence of a dA (T13-3) to pair with the 3′ dT, Polθ appeared not to extend it at all. With the mixed purine and pyrimidine sequences, these ssDNAs likely functioned as both template and primer and formed heterologous duplexes with at least one Watson-Crick base pair and some mismatches to allow template-dependent DNA extension. Because of the short length and unstable nature of these heteroduplexes, a template and primer pair might dissociate after a few rounds of nucleotide incorporation, or upon reaching the end of a template, and form other heteroduplexes for additional strand extension, resembling multiple cycles of PCR.
In the presence of Mn2+, which is known for its less stringent coordination geometry, Polθ had reduced dNTP selectivity and was able to extend a small fraction of T13-3 oligos without an apparent Watson-Crick base pair (Fig. 1B). With all three ssDNA variations, Polθ made more and longer products with Mn2+ than with Mg2+ (>80 nt, compared with ∼50 nt with Mg2+). However, the nature of the heteroduplexes that supported DNA synthesis by Polθ was unclear.
To eliminate self-annealed inter- and intrastrand heteroduplex formation, we designed ssDNAs containing either purines (A and G) or pyrimidines (T and C) only and found that Polθ failed to extend these ssDNAs in the presence of either Mg2+ or Mn2+ (Fig. 1C). When the ssDNAs composed of polypurines and polypyrimidines were mixed to form distinct 1, 2, or 3 bp at the 3′ ends between them, Polθ extended the primer that formed 3 bp with the template in the presence of Mg2+ or Mn2+ (Fig. 1D, lanes 2-5) and extended the primer with 2 bp in the presence of Mn2+ (Fig. 1D, lanes 6-9) but failed to extend the primer with 1 bp (Fig. 1D, lanes 10-13). As observed with the mixed purine and pyrimidine ssDNA, with Mn2+ Polθ became more promiscuous and made more and longer products than with Mg2+ (SI Appendix, Fig. S1). With 3 bp between the primer and template, extension of the 32P-labeled primer was much more efficient in the presence of all four dNTPs than with the correct one (dTTP) alone (Fig. 1D, lanes 3-4). We suspect that with all four dNTPs, Polθ alternately extended the 3′ ends of both strands (adding dA to the template or adding dT to the primer), thus increasing the effective length of the duplex for efficient extension of the labeled primer. Polθ appeared to be much more active at 37°C than at 25°C (Fig. 1D, lanes 2 and 3), probably because Polθ switched between the two DNA 3′ ends more frequently and effectively at 37°C than at 25°C. Higher temperature also increased DNA melting and the "PCR" effect and thus lengthened products well beyond the original template length. With Mg2+ at 25°C, the PCR effect was minimal, and Polθ stopped primer extension at the end of the designated template (Fig. 1D, lane 2). Therefore, we chose purine templates, pyrimidine primers, 5 mM Mg2+, and 25°C for the characterizations of Polθ below.
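The rationale for the purine-only and pyrimidine-only design can be illustrated with a minimal sketch. It is a composition-level check only: it asks whether any base in one strand has a Watson-Crick partner in another strand, and it makes no attempt to model duplex stability or register. The function names and the example sequences are hypothetical and are not the oligonucleotides used in this study.

```python
# Illustrative sketch only. Sequences and function names are hypothetical;
# they are not the oligonucleotides or any software used in the study.

PURINES = set("AG")
PYRIMIDINES = set("TC")
WC_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_purine_only(seq: str) -> bool:
    """True if the (non-empty) sequence contains only A and G."""
    return bool(seq) and set(seq.upper()) <= PURINES


def is_pyrimidine_only(seq: str) -> bool:
    """True if the (non-empty) sequence contains only T and C."""
    return bool(seq) and set(seq.upper()) <= PYRIMIDINES


def can_pair(strand_a: str, strand_b: str) -> bool:
    """True if at least one base in strand_a has a Watson-Crick partner
    somewhere in strand_b. This is a necessary condition for annealing,
    not a prediction that a stable duplex forms."""
    bases_b = set(strand_b.upper())
    return any(WC_PARTNER[base] in bases_b for base in set(strand_a.upper()))


def can_self_pair(seq: str) -> bool:
    """True if the strand could pair with itself or with another copy."""
    return can_pair(seq, seq)


if __name__ == "__main__":
    purine_template = "AAAAAAGGA"      # hypothetical purine-only strand
    pyrimidine_primer = "TTTTTTTCC"    # hypothetical pyrimidine-only strand
    mixed = "TGCATGCTAGCTT"            # hypothetical mixed-sequence strand

    print(can_self_pair(purine_template))                # purine-only: no partner
    print(can_self_pair(pyrimidine_primer))              # pyrimidine-only: no partner
    print(can_pair(pyrimidine_primer, purine_template))  # the two together can pair
    print(can_self_pair(mixed))                          # mixed sequence can pair
```

Because the Watson-Crick partner of every purine is a pyrimidine, a strand drawn from only one class cannot base pair with itself or with another copy of itself, whereas a mixed-sequence strand can.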
After ascertaining the template dependence, we examined the nucleotide selection by Polθ. When a single kind of dNTP (G, A, T, or C) was present with a template/primer pair sharing 3 bp and with a 5′ overhang on each strand, Polθ incorporated the correct dNTP according to the template sequence (Fig. 1E). When the 5′ overhang contained six dAs, six to seven dTTPs were incorporated, and if the 5′ overhang contained six dCs, six to seven dGTPs were added to the primer strand. We suspect that when the 3′ overhangs of two ssDNAs formed terminal base pairs, both strands could be extended by Polθ. Pol θ dimerization via the N-terminal helicase domain as suggested [Protein Data Bank (PDB) ID code 5A9J, between A and C or B and D subunits, which includes both hydrophobic and hydrophilic residues] (17) may be advantageous for extending both 3′ ends of two annealed broken DNAs.
Polθ Requires 4 bp for Efficient and Processive Primer Extension. We next asked how many base pairs between a template and primer are needed for Polθ to efficiently synthesize DNA. In the crystal structure of the Polθ-DNA-dNTP ternary complex (PDB ID code 4X0Q) (19), Polθ interacts with 6 bp upstream from the nascent base pair (SI Appendix, Fig. S2). Therefore, we examined 1-6 bp between template and primer by pairing a 9-nt 32P-5′-labeled primer with templates of 7-12 nt in length, whose 3′ ends were lengthened 1 nt at a time (Fig. 2A). All templates contained a 5′ overhang of six dAs. With Polθ and DNA at 100 nM each, the DNA synthesis reactions took place with 100 μM dTTP in the standard buffer (20 mM Tris, pH 8.0, 100 mM NaCl, and 5 mM Mg2+) at 25°C. Polθ extended the primer when the template/primer pair contained 2 bp but not 1 bp. However, with 2 bp, only 5% of the primers were extended after 2.5 min, and few full-length (+6) products were made after 10 min (Fig. 2 B and C).
With 3-4 bp between the template and primer, 40-80% of the primers were extended in 2.5 min, and with the 5-bp substrate, the reaction was complete in 2.5 min. The processivity of Polθ improved noticeably when the number of base pairs increased from 3 to 4, and the increase of full-length product (+6) reached a plateau after 4 bp (Fig. 2). As noted previously (21-23), Polθ has a tendency to extend primers beyond the end of the template strand by 1 nt, which is termed terminal addition activity, and +7 products were observed in addition to +6 products. The +7 products were more abundant with the longer template/primer pairs (5 and 6 bp) (Fig. 2). The slight accumulation of extension intermediates (+3 to +5) with the 5- and 6-bp DNA substrates might be due to the increasing number of Gs in the template strand, which are known to promote DNA secondary structures. The above data suggest that Polθ requires 3 bp between a template and primer for efficient primer usage and 4 bp for optimal processivity.
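The substrate geometry in this series can be made concrete with a short sketch, under the assumption that the 3′-terminal nucleotides of the primer and template anneal antiparallel with no template 3′ overhang. It counts the terminal base pairs for a given primer/template pair and maps the count onto the qualitative outcomes reported above (no extension at 1 bp, inefficient extension at 2 bp, efficient primer usage at 3 bp, efficient and processive synthesis at 4 bp or more). The sequences and function names are hypothetical and chosen only to mirror the described design of a 9-nt pyrimidine primer and purine templates with a 5′ overhang of six dAs.

```python
# Illustrative sketch only. Sequences and function names are hypothetical;
# the categories restate the qualitative results in the text and are not a
# quantitative model of Pol theta activity.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def terminal_base_pairs(primer: str, template: str) -> int:
    """Largest k for which the 3'-terminal k nt of the primer and of the
    template (both written 5'->3') are fully Watson-Crick complementary in
    antiparallel orientation, i.e. a k-bp duplex with no 3' overhang."""
    primer, template = primer.upper(), template.upper()
    best = 0
    for k in range(1, min(len(primer), len(template)) + 1):
        if reverse_complement(primer[-k:]) == template[-k:]:
            best = k
    return best


def expected_behavior(n_bp: int) -> str:
    """Qualitative outcome reported in the text for n_bp terminal base pairs."""
    if n_bp < 2:
        return "no extension"
    if n_bp == 2:
        return "inefficient extension"
    if n_bp == 3:
        return "efficient primer usage"
    return "efficient and processive extension"


if __name__ == "__main__":
    primer = "TTTTTTTCC"  # hypothetical 9-nt pyrimidine-only primer
    # Hypothetical purine-only templates: six dAs as the 5' overhang, with
    # the 3' end lengthened 1 nt at a time.
    for template in ("AAAAAAG", "AAAAAAGG", "AAAAAAGGA", "AAAAAAGGAA"):
        n_bp = terminal_base_pairs(primer, template)
        templated = len(template) - n_bp  # nucleotides available to copy
        print(template, n_bp, templated, expected_behavior(n_bp))
```

In this sketch the number of templated nucleotides stays at six for every template, matching the +6 full-length product described in the text; the +7 product from terminal addition is not modeled.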
The Overhangs of Primer and Template Influence Polθ Activity. After determining the base-pair requirement, we set out to test whether the primer length makes a difference. We maintained the terminal 3 bp and changed the 5′ overhang on the primer strand from 0 to 6 nt (Fig. 3A). The amount of primer extended by Polθ increased steadily with lengthening of the primer 5′ overhang, and the complete product (+6 and +7) increased significantly when the overhang was increased from 1 to 2 or 3 nt (Fig. 3 B and C), which agrees well with the structural observation that Polθ binds 6 nt of the primer strand (3 bp plus 3 nt here; SI Appendix, Fig. S2). In contrast, a 3′ overhang on the template strand beyond the 3 bp (Fig. 3D) is detrimental to DNA synthesis by Polθ. Polθ was most efficient when there was no 3′ overhang. Even with a 1-nt overhang, which is equivalent to a single mismatched base pair, the amount of primer extended and the full product (+6) formed were reduced by 1.6- and 2.4-fold, respectively, suggesting that a mismatch at the −4 position from the active site is poorly tolerated (Fig. 3 E and F). Lengthening the 3′ overhang on the template from 1 to 6 nt continued to diminish the Polθ activity. Because we used dA to lengthen the template 3′ overhang, the additional three to six dAs could anneal with the primer extension products (which carry six to seven added dTs), thus resulting in template/primer pairing without any 3′ overhang and traces of long products (Fig. 3E and SI Appendix, Fig. S3).
Effects of Mismatch on Primer Extension by Polθ. We next examined whether Polθ can tolerate a mismatched base pair in a template/primer duplex. Using primers and templates of 9-10 nt in length and forming 3-4 bp, a mismatched base pair was placed at 4, 3, 2, or 1 bp upstream (−4, −3, −2, or −1) from the 3′ primer end in four designs (MM-4 to MM-1) (Fig. 4A). However, these oligos potentially formed four additional alternative heteroduplexes that resulted in full primer extensions of +2, +5, or +6 nucleotides (Fig. 4A). We relied on product lengths to discern which template/primer configuration was the substrate for DNA synthesis. Primer extension assays were carried out and compared with the normal template/primer pairs sharing 3 or 4 bp without any mismatch (Fig. 4 A and B).
A mismatch distal to the primer 3′ end (MM-4) was best tolerated, although primer usage and product formation still suffered a threefold reduction compared with the normal 3-bp substrate (Fig. 4 B and C). As the mismatched base pair moved closer to the primer 3′ end (MM-3), the reduction of primer extension was drastic. The complete product (+6) of MM-3 was reduced by more than fivefold compared with MM-4, and the complete product of MM-2 was undetectable. Interestingly, the amounts of +5 and +6 products were nearly equal for MM-3, indicating that by alternative template/primer realignment (Alt3-1, +5 product), Polθ extended a mismatched primer end after three normal base pairs as efficiently as the template/primer pair containing a mismatch at the −3 position (MM-3). With MM-2, the alternative primer/template annealing to form 5 bp and one mismatch at the primer 3′ end (Alt5-1, +2 product) appeared to be the preferred substrate for Polθ, and the predominant primer-extension products were +2 and some +3 (due to the terminal addition) instead of the +6 product from the internal mismatch configuration.
The design of the MM-1 substrate placed a mismatch at the primer 3′ end after four normal base pairs (Fig. 4A). In addition, the template/primer pair allowed a dC to slip out (Δ1) from the primer strand between the −1 and −2 or −2 and −3 positions or a mismatch at the −3 position (MM-3′) (Fig. 4A). Both the Δ1 and MM-3′ configurations would lead to +6 products, while MM-1 would produce +5 products. As expected, the efficiency of MM-1 primer usage was reduced compared with the normal 4-bp control (Fig. 4 B and C), and the products were a mixture of +5, +6, and +7, with +5 predominant. This suggested that the MM-1 configuration with a terminal mismatch appended to 4 bp was preferred over an internal mismatch (MM-3′) or a slipout (Δ1) surrounded by 4 bp. Compared with the MM-3 design, the +6 and +7 products of MM-1 were more abundant (Fig. 4 B and C). We interpret these results to mean that the Δ1 configuration might be preferred by Polθ over the MM-3′ configuration.
Polθ Has Limited Strand Displacement Activity. To assess the strand displacement activity of Polθ, we compared DNA synthesis on a template/primer pair with a single-stranded overhang or on a gapped substrate with single- and double-stranded DNA downstream to mimic annealing of 3′ overhangs between two broken DNA ends (Fig. 5A). Previously, it was reported that Polθ has robust strand displacement activity (21) and that the strand displacement activity is increased by a 5′ phosphate on the displaced strand. We thus synthesized the downstream complementary strand with or without a 5′ phosphate. As a control, we compared templates with different overhang lengths, 6 or 23 nt (T-6s and T-23s), and found that lengthening the template greatly diminished Polθ activity (Fig. 5 B and C). The inhibition is unlikely to be due to low processivity of Polθ, because with T-23s the primer was extended by only 1-2 nt rather than by 6-7 nt as with T-6s. The severely reduced primer extension may be due to an entangled template strand, because the Polθ activity was restored by the addition of a complementary strand and formation of a gapped substrate with a 17-bp duplex downstream (Fig. 6). However, the presence or absence of the 5′ phosphate made little difference. The template immediately after the 3′ primer end consisted of six dAs, which were followed by two more dAs base paired with the complementary strand. When dTTP alone was included in the synthesis assay, primer extension stopped after incorporation of six dTs, and only a small amount of terminal addition product (+7) was observed with the T-6s template. The primers were not extended further into the duplex with the gapped substrate (6s-17d or 6s-p17d) (Fig. 5 B and C). The results suggest that Polθ has little strand displacement activity. When all four dNTPs were added in the primer extension assay, both the terminal addition and strand displacement activities of Polθ were increased compared with dTTP only.
Small amounts of full-length product (+23 nt) were observed for both the simple and gapped substrates, while the majority of products were +8 or +9 nt. Thus, the apparent strand displacement activity was attenuated by the consecutive C/G base pairs. This is in agreement with the role of Polθ in TMEJ, during which strand displacement is unnecessary and potentially detrimental to DNA integrity by introducing unwanted mutations.

Discussion
Using mammalian cell-expressed human DNA Polθ, we found that the polymerase activity is 10-100 times higher than previously reported for bacteria-expressed Polθ (20-22) (based on the 10-fold lower molar ratio of Polθ to DNA substrate and the 6- to 12-fold shorter reaction time in our assays). With such active DNA Polθ and distinctly base-paired templates and primers, the difference between preferred and nonpreferred DNA substrates becomes obvious. Using semiquantitative analyses, we find that DNA synthesis by Polθ is template-dependent and that a minimum of 2 bp is necessary for primer extension by Polθ (Figs. 1D and 2 B and C). With 4 bp between a template and primer pair, Polθ attains both efficient and processive DNA synthesis in the presence of physiologically abundant Mg2+ (Fig. 2). Our findings are in agreement with the in vivo results (10). The primer length required by Polθ for DNA synthesis, counting nucleotides both unpaired and paired with a template, is 5-6 nt (Fig. 3), which agrees well with the protein-DNA interface demonstrated by the crystal structure (SI Appendix, Fig. S2).
Terminal transferase or template-independent activities of Polθ have been reported for DNA oligos of mixed purine and pyrimidine sequences, particularly in the presence of Mn2+ (20). With these DNA sequences, it is difficult to define where a heteroduplex may form and which portion of the ssDNA is used as a template. As shown in Fig. 4, even with 9-10 nt ssDNA oligos, which were designed to have unique base pairs, the alternative ways that Polθ can anneal them for primer extension are numerous. If Mn2+ instead of Mg2+ is used to reduce polymerase fidelity or elevated temperature is used to allow PCR-like reactions, the varieties of primer extension products would be unlimited. In our hands, even with mixed purines and pyrimidines (Fig. 1B), one Watson-Crick base pair is essential for Polθ to extend the DNA in the presence of Mg2+, but in the presence of Mn2+, no base pair appeared to be required for DNA synthesis.
Fig. 6. Diagram of the TMEJ process. Pol θ helps to anneal two broken DNA ends with 3′ overhangs by searching for microhomology of 3-4 bp. If microhomology is located internally in either overhang, a 3′-5′ exonuclease or flap endonuclease will be needed to trim off the overhangs for efficient extension by Pol θ. After removal of 3′ overhangs, Pol θ extends the microduplex by template-dependent DNA synthesis using both ends as primers either alternately (if the microduplex is short) or simultaneously (if the duplex is long enough to accommodate a Pol θ dimer).
The apparent terminal transferase activity observed on the PolyC sequence was detected at 42°C with Mn2+ in a buffer containing little salt (20), but we found no polymerase activity on oligonucleotides consisting of purines or pyrimidines alone in the presence of Mg2+ or Mn2+ (Fig. 1C). The crystal structure of Polθ (19) and its homology to all A-family DNA polymerases (1) provide no evidence to support the template-independent terminal transferase activity (20). Even the bona fide TdT of the X-family depends on a non-base-paired template for primer extension (24,25). For Polθ, whatever heteroduplex it may assemble, including those containing a terminal mismatched base pair (Fig. 4), a template base is required to select correct dNTPs for incorporation. These data confirm unambiguously that Polθ is a template-dependent DNA polymerase.
In in vivo studies, deletions of DNA overhangs are detected when microhomology is internal (10). We have found that the reason for overhang deletion is that Polθ does not work well with a 3′ overhang on the template strand when microhomology is internal. The implication for TMEJ is that 3′ exonucleases or 3′ flap endonucleases must work with Pol θ to remove such hindrances, as the in vivo analysis reveals that internal microhomology is efficiently utilized in TMEJ (10). Such 3′ trimming nucleases are in addition to the nucleases required in DNA-end resection to expose 3′ overhangs for the initial homology search. After trimming, both 3′ ends can be extended by Pol θ, either alternately or simultaneously if the duplex is long enough to accommodate a Pol θ dimer (Fig. 6).
Despite the reported strand displacement activity and its dependence on the 5′ phosphate of the downstream duplex (21), with the highly active human Polθ we observed limited strand displacement and no impact of the presence or absence of a 5′ phosphate (Fig. 5). Extended DNA synthesis by error-prone Pol θ could only introduce mutations, and minimal strand displacement is thus beneficial for the gap-filling and end-joining functions of TMEJ.

Materials and Methods
Details of the materials and methods are provided in SI Appendix, including protein and nucleic acid preparations (26,27) and assays for DNA synthesis.