Maize Genes Are Controlled by Far-Off Regulatory Regions
The regulation of gene expression is central to cellular differentiation, tissue development, and the response to environmental stimuli. Cis-regulatory elements (CREs), stretches of DNA bound by transcription factors (TFs) that in turn modulate gene transcription, play crucial roles in this regulation. The TFs bound at CREs modulate transcription by interacting with genes' basal transcriptional machinery (Ma et al., 2018) or by altering genes' chromatin environments (Krogan et al., 2009). Because CREs function via sequence-specific TF binding, DNA sequence variation within CREs can perturb TF binding and, consequently, alter CRE behavior. Such sequence variation in CREs (and the resulting variation in gene expression) is a common source of phenotypic variation in crops (Haudry et al., 2013; Wallace et al., 2014; Rodríguez-Leal et al., 2017; Wang et al., 2018; Kremling et al., 2018). In fact, multiple crop domestication loci have been mapped to CREs rather than to the protein-coding sequences of their target genes (Salvi et al., 2007; Studer et al., 2011; Zheng et al., 2015; Huang et al., 2018); selection on gene expression variation seems to be a theme in crop domestication. The historical importance of CREs in crop domestication suggests that leveraging CREs will be an important component of future crop improvement. Therefore, understanding how CREs work mechanistically (e.g., identifying the specific TFs that bind to CREs and the effects they have on transcription) will be a boon for future genome engineering and crop improvement.
In small-genome plants such as Arabidopsis thaliana, CREs tend to cluster near the genes that they target (Yu et al., 2016). This does not hold true for large-genome plants, such as Zea mays (maize) and other cereal crops, which contain large tracts of intergenic space separating genes. Genetic evidence suggests that the maize genome abounds with thousands of loci located deep within the intergenic space (dozens of kilobases from any gene) that play important roles in regulating agronomic traits (Wallace et al., 2014). However, it has not been established whether these loci are CREs and, if they are, which genes they target, how they reach those genes over large genomic distances, and how they modulate their target genes' transcription. The fact that these loci influence agronomic traits indicates that they are functionally important and, given sufficient understanding of how they work, they could be leveraged for future crop improvement. In this issue of Nature Plants, Ricci and colleagues (Ricci et al., 2019) demonstrate that many of these loci are indeed CREs and that they interact with genes over large genomic distances via the formation of chromatin loops. These findings provide the first demonstration of widespread long-range cis-regulation in a plant genome.
Ricci et al. first used ATAC-seq (Buenrostro et al., 2013) to generate a catalog of accessible chromatin regions (ACRs) across the maize genome. The rationale was that TFs typically bind to DNA within accessible chromatin, so CREs were expected to fall within ACRs. The authors identified more than 10,000 ACRs that were farther than 2 kilobases from any annotated gene (hereafter called 'distal ACRs'). This large number suggested a substantial potential for discovery. The distal ACRs displayed the telltale genetic signs of containing CREs: (1) Distal ACRs displayed sequence constraint, suggesting evolutionary conservation. (2) Distal ACRs were enriched for TF binding (via DAP-seq and motif enrichment analysis). (3) SNPs that were genetically linked to agronomic phenotypic variation and gene expression variation were most often found within the distal ACRs. Collectively, these results suggested that the distal ACRs contained CREs, many of which contributed to agronomic phenotypes.
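The distal/proximal distinction above can be illustrated with a minimal Python sketch. It assumes that ACRs and genes on one chromosome are given as closed (start, end) intervals and measures the gap to the nearest annotated gene; the function names and the toy coordinates in the usage note are hypothetical, and the only value taken from the text is the 2-kilobase cutoff. It is not the authors' pipeline.

```python
DISTAL_CUTOFF_BP = 2_000  # cutoff stated in the text: farther than 2 kb from any gene


def distance_to_nearest_gene(acr, genes):
    """Return the gap (bp) between an ACR and the nearest gene, or None if no genes.

    `acr` and every entry of `genes` are (start, end) tuples on the same
    chromosome with start <= end. Overlapping intervals have a gap of 0.
    """
    acr_start, acr_end = acr
    best = None
    for gene_start, gene_end in genes:
        if gene_end < acr_start:      # gene lies entirely upstream of the ACR
            gap = acr_start - gene_end
        elif gene_start > acr_end:    # gene lies entirely downstream of the ACR
            gap = gene_start - acr_end
        else:                         # intervals overlap
            gap = 0
        if best is None or gap < best:
            best = gap
    return best


def is_distal(acr, genes, cutoff=DISTAL_CUTOFF_BP):
    """An ACR is 'distal' when it is strictly farther than `cutoff` from every gene."""
    gap = distance_to_nearest_gene(acr, genes)
    return gap is None or gap > cutoff
```

With two hypothetical genes at (10000, 12000) and (50000, 53000), an ACR at (30000, 30500) would be classified as distal, whereas one at (13000, 13400), only 1 kilobase from the first gene, would not. A linear scan is used for clarity; a genome-scale analysis would more plausibly rely on sorted intervals or a dedicated interval tool.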
Next, the authors used ChIP-seq to characterize the covalent histone modifications immediately surrounding the distal ACRs. Based on several select histone modifications, the distal ACRs could be clustered into four distinct groups. One cluster was enriched primarily for acetylated histones. This cluster appeared most similar to the canonical enhancers that have been observed in mammalian genomes (Shlyueva et al., 2014). A second cluster was enriched for H3K27me3, the histone modification associated with transcriptionally silent chromatin. A third cluster contained a suite of histone modifications that appeared very similar to those of expressed genes. This class probably corresponded to unannotated transcription units. A fourth cluster lacked significant enrichment for any of the measured histone marks. The distinct clusters were noteworthy because the divergent histone marks suggested divergent regulatory functions of the distal ACRs. For example, the acetylated distal ACRs probably contained gene-activating CREs, the H3K27me3-marked distal ACRs probably contained gene-silencing CREs, and the transcribed group could simply have been unannotated genes.
Naturally, the next question was whether the distal ACRs actually interacted with genes in cis. Presumably, such interactions should take place via the formation of chromatin loops, which bring distant loci into close proximity. The authors tested this by performing Hi-C (Rao et al., 2014) and HiChIP (Mumbach et al., 2016), modern high-throughput variations on the classic chromatin conformation capture experiment. These experiments revealed that several thousand distal ACRs looped with genes that were dozens of kilobases away on the linear chromosomes. In support of the biological relevance of the identified chromatin loops, the distal CRE-gene loops could recapitulate genetic interactions identified by traditional mapping experiments. For example, the classical gene teosinte branched1 was found to loop with a region 65 kilobases away, which had previously been identified as a CRE targeting teosinte branched1 (Studer et al., 2011) (Figure 1). Furthermore, the stability of the chromatin loops (which can be thought of as the proportion of cells within a population that contain a given chromatin loop) appeared to have predictive power in determining their functional significance. Therefore, the stability of chromatin loops could be used to tease out the important functional loops from non-functional background loops, thus increasing the confidence of assigning distal CREs to their correct target genes.
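The idea of using loop stability to separate functional loops from background can be sketched in Python as follows. This is only an illustration under stated assumptions: loops are represented as (ACR, gene, strength) tuples, and a single hypothetical cutoff, `min_strength`, decides which loops count as background. The function name, the identifiers, and the cutoff are invented for the example and are not taken from Ricci et al.

```python
from collections import defaultdict


def assign_targets(loops, min_strength):
    """Map each distal ACR to its candidate target genes, strongest loop first.

    `loops` is an iterable of (acr_id, gene_id, strength) tuples. Loops whose
    strength falls below `min_strength` (a hypothetical cutoff) are treated as
    background and dropped; ACRs with no surviving loop are omitted.
    """
    by_acr = defaultdict(list)
    for acr_id, gene_id, strength in loops:
        if strength >= min_strength:
            by_acr[acr_id].append((strength, gene_id))
    # Rank the surviving loops of each ACR from most to least stable.
    return {
        acr_id: [gene_id for _, gene_id in sorted(hits, reverse=True)]
        for acr_id, hits in by_acr.items()
    }
```

For instance, given hypothetical loops from one ACR to three genes with strengths 0.9, 0.2, and 0.6 and a cutoff of 0.5, the function would keep the first and third genes, in that order, and discard the weak loop.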
Finally, the authors performed STARR-seq (Arnold et al., 2013), a high-throughput enhancer reporter assay, to obtain quantitative measurements of the transcriptional regulatory capacity of all ACRs in the maize genome. This assay demonstrated that, globally, ACRs contained DNA elements that were sufficient to activate transcription. These results provided independent evidence that distal ACRs contained CREs.
The authors provided compelling evidence that long-range cis-regulation, previously believed to be a rarity in plants, is a common phenomenon in the maize genome. Importantly, these findings show that the regulatory space in maize is not restricted to the close proximity of genes. This will be important for future genome engineering efforts, since searching for CREs only within the gene-proximal space will be insufficient to fully recreate gene expression patterns. A companion study published in the same issue of Nature Plants (Lu et al., 2019) demonstrated the generalizability of these results to a wide variety of angiosperms.
Figure 1. A chromatin loop connects a distant cis-regulatory element to the teosinte branched1 gene. A CRE (Studer et al., 2011) was found to genetically control the teosinte branched1 gene. Here, we show that the genetically identified CRE (shaded gray region) contains accessible chromatin (indicated with the blue arrow) and that it forms a chromatin loop with teosinte branched1. The edges of the chromatin loops are represented by black and red boxes. The red boxes indicate the loop which connects the CRE to the gene. The 'loop strength' is a measure of chromatin loop stability. Here, the chromatin loop connecting the CRE to the gene shows significantly greater stability than the other local loops.
Arnold, C. D., Gerlach, D., Stelzer, C., Boryn, L. M., Rath, M., & Stark, A. (2013). Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science, 339(6123), 1074-1077.
Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y., & Greenleaf, W. J. (2013). Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods, 10(12), 1213-1218.
Haudry, A., Platts, A. E., Vello, E., Hoen, D. R., Leclercq, M., Williamson, R. J., . . . Blanchette, M. (2013). An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions. Nat Genet, 45(8), 891-898.
Huang, C., Sun, H., Xu, D., Chen, Q., Liang, Y., Wang, X., . . . Tian, F. (2018). ZmCCT9 enhances maize adaptation to higher latitudes. Proc Natl Acad Sci U S A, 115(2), E334-E341.
Kremling, K. A. G., Chen, S. Y., Su, M. H., Lepak, N. K., Romay, M. C., Swarts, K. L., . . . Buckler, E. S. (2018). Dysregulation of expression correlates with rare-allele burden and fitness loss in maize. Nature, 555(7697), 520-523.
Krogan, N. T., & Long, J. A. (2009). Why so repressed? Turning off transcription during plant growth and development. Curr Opin Plant Biol, 12(5), 628-636.
Lu, Z., Marand, A. P., Ricci, W. A., Ethridge, C. L., Zhang, X., & Schmitz, R. J. (2019). The prevalence, evolution and chromatin signatures of plant regulatory elements. Nat Plants, 5(12), 1250-1259.
Ma, Y., Gil, S., Grasser, K. D., & Mas, P. (2018). Targeted Recruitment of the Basal Transcriptional Machinery by LNK Clock Components Controls the Circadian Rhythms of Nascent RNAs in Arabidopsis. Plant Cell, 30(4), 907-924.
Mumbach, M. R., Rubin, A. J., Flynn, R. A., Dai, C., Khavari, P. A., Greenleaf, W. J., & Chang, H. Y. (2016). HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat Methods, 13(11), 919-922.
Rao, S. S., Huntley, M. H., Durand, N. C., Stamenova, E. K., Bochkov, I. D., Robinson, J. T., Sanborn, A. L., Machol, I., Omer, A. D., Lander, E. S., & Aiden, E. L. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell, 159(7), 1665-1680.
Ricci, W. A., Lu, Z., Ji, L., Marand, A. P., Ethridge, C. L., Murphy, N. G., Noshay, J. M., Galli, M., Mejia-Guerra, M. K., Colome-Tatche, M., Johannes, F., Rowley, M. J., Corces, V. G., Zhai, J., Scanlon, M. J., Buckler, E. S., Gallavotti, A., Springer, N. M., Schmitz, R. J., & Zhang, X. (2019). Widespread long-range cis-regulatory elements in the maize genome. Nat Plants, 5(12), 1237-1249.
Rodríguez-Leal, D., Lemmon, Z. H., Man, J., Bartlett, M. E., & Lippman, Z. B. (2017). Engineering Quantitative Trait Variation for Crop Improvement by Genome Editing. Cell, 171(2), 470-480.e478.
Salvi, S., Sponza, G., Morgante, M., Tomes, D., Niu, X., Fengler, K. A., . . . Tuberosa, R. (2007). Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize. Proc Natl Acad Sci U S A, 104(27), 11376-11381.
Shlyueva, D., Stampfel, G., & Stark, A. (2014). Transcriptional enhancers: from properties to genome-wide predictions. Nat Rev Genet, 15(4), 272-286.
Studer, A., Zhao, Q., Ross-Ibarra, J., & Doebley, J. (2011). Identification of a functional transposon insertion in the maize domestication gene tb1. Nat Genet, 43(11), 1160-1163.
Wallace, J. G., Bradbury, P. J., Zhang, N., Gibon, Y., Stitt, M., & Buckler, E. S. (2014). Association mapping across numerous traits reveals patterns of functional variation in maize. PLoS Genet, 10(12), e1004845.
Wang, X., Chen, Q., Wu, Y., Lemmon, Z. H., Xu, G., Huang, C., . . . Tian, F. (2018). Genome-wide Analysis of Transcriptional Variability in a Large Maize-Teosinte Population. Mol Plant, 11(3), 443-459.
Yu, C. P., Lin, J. J., & Li, W. H. (2016). Positional distribution of transcription factor binding sites in Arabidopsis thaliana. Sci Rep, 6, 25164.
Zheng, L., McMullen, M. D., Bauer, E., Schon, C. C., Gierl, A., & Frey, M. (2015). Prolonged expression of the BX1 signature enzyme is associated with a recombination hotspot in the benzoxazinoid gene cluster in Zea mays. J Exp Bot, 66(13), 3917-3930.