Highest Resolution Study of Genome Replication Surprises Scientists with Unexpected Timing in S-phase
by Stacey Ryder
University of Virginia's Anindya Dutta and UC Berkeley's Michael Botchan discuss the use of whole-genome tiling microarrays to study DNA replication at various times during S-phase
Scientists led by Dr. Anindya Dutta of the University of Virginia (UVA) discovered that DNA replication initiates during all quarters of S phase in some chromosomal segments in cancer cells, providing critical information about how the cancer genome replicates and how it might be restored to healthy growth. The team made their findings through the most detailed whole-genome scan of origins of replication ever performed, using high-resolution tiling microarrays that measured every 35th base of chromosomes 21 and 22.
Dutta recently published this work in the May 2005 issue of the Proceedings of the National Academy of Sciences, USA. "The pan S phase pattern was a complete surprise," said Dutta, professor of biochemistry and molecular genetics at UVA. "I wanted to ignore it, but it sort of jumped out at us. The result is hard to believe given the general consensus that everything replicates in a temporally specific manner."
Dr. Michael Botchan, a professor of biochemistry and molecular biology at the University of California at Berkeley, is using a similar approach to study DNA replication in Drosophila. Botchan is using a microarray tiled with the Drosophila genome to explore how the mechanisms and regulation of DNA replication are coupled to the cell cycle.
Botchan recently spoke with Dutta about the challenges they both face in exploring the genome for origins of replication using these new tiling microarrays. Topics covered:
- The discovery of pan-S phase replication in HeLa cells and ways to validate those tiling microarray results
- The role of chromatin structure and associated proteins regulating the timing of replication in early or late S-phase
- The methodologies used to validate and create standards for ChIP-on-chip data
Discovery and data validation of pan S-phase replication
Botchan: I'd like to know more about the surprising observation that you made with Tom Gingeras' group at Affymetrix, that in HeLa cells, at least 60 percent of chromosomes 21 and 22 is replicated pan-S phase: it might replicate early, it might replicate late, it might replicate in the middle of the S phase. What are your thoughts about that 60 percent of the chromosome?
Dutta: That's a great question. The pan-S phase pattern was a surprise. It was hard for us to believe. So we have reanalyzed the data in a different way. Instead of looking at just the p value of replication in a given time interval, we have done TR50 calculations of the time at which a given segment is 50 percent replicated. Using that, we can decrease the pan-S phase area to 30 to 40 percent. So it's probably not 60 percent.
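The TR50 idea Dutta describes, the time at which a segment is 50 percent replicated, can be illustrated with a minimal sketch. The interview does not spell out how the calculation was actually done, so the linear interpolation, the function name `tr50`, and the sample time points below are all assumptions made for illustration only.

```python
from typing import Sequence


def tr50(times_h: Sequence[float], cum_fraction: Sequence[float]) -> float:
    """Estimate the time at which a segment is 50 percent replicated.

    times_h      -- sampling times after release into S phase (hypothetical units: hours)
    cum_fraction -- cumulative fraction of the segment replicated at each time (0.0 to 1.0)

    Assumption: linear interpolation between the two samples that bracket 0.5.
    """
    if len(times_h) != len(cum_fraction) or len(times_h) < 2:
        raise ValueError("need at least two matching time/fraction samples")

    for i, fraction in enumerate(cum_fraction):
        if fraction >= 0.5:
            if i == 0:
                # Already at or past 50 percent at the first sample.
                return float(times_h[0])
            t0, t1 = times_h[i - 1], times_h[i]
            f0, f1 = cum_fraction[i - 1], fraction
            # f0 < 0.5 <= f1 here, so f1 - f0 is strictly positive.
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)

    raise ValueError("segment never reaches 50 percent replication")


# Hypothetical segments sampled at 0, 2, 4 and 6 hours.
early_segment = tr50([0, 2, 4, 6], [0.0, 0.8, 1.0, 1.0])
late_segment = tr50([0, 2, 4, 6], [0.0, 0.1, 0.2, 1.0])
```

A segment whose copies replicate at very different times would show a slow, drawn-out rise in cumulative fraction rather than a sharp step, which is one way a single summary time can behave differently from per-interval p values.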
Botchan: I still have an open mind that origin selection may be random for some unknown fraction of the genome, depending on the cell type. We know that in early embryos that is the case, and even for the hard-wired phage, the T-phage, for example, you can delete an origin and you can still get by. So it may be that we don't understand why there is specific origin selection or when it occurs. The default may be that when there is no reason to replicate at a specific time, timing is essentially random.
Dutta: Right. Let me tell you about the experiment we have done. We saw 30 to 40 percent pan-S replication. If you look at the area that the ENCODE Project works on, which is a completely different segment, the number is 20 to 30 percent.
One thing we were concerned about was whether our results were due to the microarray we used. So, we confirmed our result by interphase FISH experiments, where you can count the number of dots produced by a given chromosomal segment. So you are sure that you're not seeing cross-hybridization. And the pan-S phase pattern was not an artifact of the microarrays.
We were also worried about whether our results were due to the type of block we were imposing, because we did thymidine-aphidicolin block and release. So we repeated the experiment by using a nocodazole block and release, and we still saw the pan-S phase pattern of replication in HeLa cells.
Our last concern is whether this is something unique to HeLa cells. These are cancer cells and maybe they have been around for such a long time that they are completely messed up. So we are in the process of repeating the experiment in different cell lines to see if this pan-S phase pattern holds up.
Botchan: Even though HeLa cells are terribly aneuploid and there are lots of rearrangements, the domain of control may be reasonably local. That domain would divide a chromosome into a hundred different segments at least. And I don't think there are that many rearrangements in the HeLa cell nucleus. Of course, as the number goes down from 60 percent to 30 percent, it becomes more reasonable to think that it could be a result particular to that cancer cell.
Regulation of DNA replication timing
Botchan: In your paper you discuss that the early origins are in euchromatin and late origins are in heterochromatin. Where do you envision the rest of the origins of the 30 to 40 percent that replicate randomly? Are they sort of schizophrenic? Do they have a hard time deciding whether they're heterochromatin or euchromatin?
Dutta: When we did the interphase FISH studies, we could determine whether the pan-S phase replication was due to intercellular variation or interallelic variation.
It turned out to be due to interallelic variation. That makes us think that the pan-S phase areas are schizophrenic in the sense that the two alleles are in different chromatin environments. Some of them are in the open chromatin, so they replicate early and other alleles are in the heterochromatin, so they replicate late. When we look at the global replication signature, we see replication in multiple quarters of S-phase.
Botchan: Your observation that one allele has one phenotype and the other allele has another phenotype would help to explain the data, but another way to look at it is that the origins chosen for a particular time have particular types of proteins associated with them and those proteins specify whether they will replicate early or late. If an origin doesn't have those proteins, then the replication is random.
That would give you some kind of differentiation beyond euchromatin and heterochromatin, because there are data that show there are things that replicate very early in S phase and things that replicate a little bit later and things that replicate even later. That might require specialized proteins.
Botchan: What do you think is the reason for having early versus late replication?
Dutta: One possibility is that early replication is a by-product of the fact that some segments of the chromosomes have very active transcription. So it's not that early replication is necessarily required, but the active transcription allows these parts of the chromosomes to be more open and attract replication factors earlier. So, the early replication is just a correlative feature rather than something required for the cell's survival.
Botchan: So you don't hold to the possibility that replication origins are at least sometimes important cis-acting genetic elements? What I'm thinking of is our own work with chorion amplification. And there, we know that replication can be uncoupled from transcription because the amplification occurs before the transcription.
Dutta: I don't believe that replication origins contain important cis-acting elements that interact with replication factors in a temporally specific manner regardless of where the origin is located in the chromosome. I'm slightly discouraged that the data is tracking almost entirely with chromatin structure, gene density, and RNA levels. If there is a cis-acting genetic element that dictates an origin, it's possible that it's pretty heterogeneous, like promoters are. When people first looked at promoters, they thought they were very defined: there was a TATA box and a CCAAT box and all sorts of things like that. And gradually that definition is becoming weaker and weaker. Now it appears that different combinations of factors can bind to make a promoter.
Botchan: Yes. But the details are important in understanding the heterogeneity and selection mechanisms. I think it is important to keep the possibility open that there really is a mechanism that determines at least a few of the origins.
So going back to some old ideas, in yeast it certainly appears as if you can get by with very few origins. I believe that there was a suppressor of CDC7, an MCM mutant, that had almost no phenotype except that late origins were not fired and the cells became very sensitive to radiation. So maybe one reason for having late origins and early origins is to restart the genome.
In prokaryotes, you have primosomes and a very elaborate restart mechanism so that if a replication fork is stalled, it can get going again. Now maybe in eukaryotes, you have that kind of restart mechanism, but you also have late origins.
Dutta: Yes. That's a possibility. So, there would be very strong selection to specify at least a few origins that were discretely late. That way, when the early origins get stuck, and they inevitably do, you can finish off the chromosome. You probably wouldn't see the phenotype until after many divisions.
Botchan: The other possibility, of course, is the old idea that you replicate a region to clear all the proteins off the DNA so that you can reprogram and get new factors on. The assembly of those factors is coupled to the growing fork.
As you say in your paper, we now have hundreds of examples we can use to understand the rules that determine the specificity. And once we start getting these details, we'll be able to make more sense out of what the mechanisms are and why they are there in the first place.
Validating and creating standards for ChIP-on-chip data
Dutta: Do you worry about the fact that there is a lot of noise in the data? Why is it so jumpy and why do you need the smoothing-out algorithms to get a reasonable data set?
Botchan: It's an issue, but I don't worry about it too much, because I think in the end, the important issue is what we have learned from the pattern and to determine whether there is anything we can do to allow us to go further. So, for example, I look at the smoothed data in your paper and I see a few regions that look like well-defined early origins and well-defined late origins. Then we can use that to go further and define what's special about those DNA sequences. The data becomes useful if important conclusions can be followed up.
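The smoothing Dutta asks about can be pictured with a small sketch. The interview does not say which algorithm was applied to the tiling array data, so the sliding-window median below, along with the name `sliding_median` and the window size, is an assumption chosen only to show how a jumpy probe-by-probe signal can be evened out.

```python
from statistics import median
from typing import List, Sequence


def sliding_median(signal: Sequence[float], window: int = 5) -> List[float]:
    """Smooth a per-probe signal with a centered sliding-window median.

    signal -- probe intensities ordered by chromosomal position (hypothetical data)
    window -- odd number of neighboring probes to pool; windows are
              truncated at the two ends of the signal
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")

    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        # The median resists single-probe spikes better than the mean does.
        smoothed.append(median(signal[lo:hi]))
    return smoothed


# A single spiking probe (9.0) is suppressed by its neighbors.
noisy = [1.0, 1.0, 9.0, 1.0, 1.0]
clean = sliding_median(noisy, window=3)
```

A wider window gives a steadier profile but blurs narrow features, which is the trade-off behind reading only broad early and late regions from smoothed data.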
Dutta: I have some interesting results from the recent ENCODE work we've done. The ENCODE Project is looking at the same DNA segments, but they are using different techniques. We find that our early-replicating areas are co-segregating with areas rich in histone modifications. And the late-replicating areas are relatively depleted of them.
Botchan: Right, so you have a completely independent mechanism of experimentation that allows you to validate your conclusions.
Dutta: We are still looking at the temporal profile, but once you come to the narrow 10 to 30 kb area where we think the origins are, how much of that has been validated? For example, in Drosophila, have you gone back and looked at the 60 origins that David MacAlpine has mapped and seen whether they are origins by any other criteria?
Botchan: No. I think that's really where we are now. Even in very discrete amplification zones, if you look very carefully at where replication initiates and where the origins are, the way we used to do with viruses, it turns out that it's not a discrete point. So, I think that we are looking at something that we don't yet understand. Even the deposition of ORC (origin recognition complex) may not independently control where the first nucleotides are going to be laid down in a given cell.
Dutta: One thing I learned during these experiments is that ChIP-on-chip studies, at least in the mammalian cells, give widely differing results from platform to platform. How do you think we can best validate data from ChIP-on-chip experiments?
Botchan: We're going back and very carefully taking the ChIP-on-chip data and validating it with quantitative PCR. We're trying to get just classes we want to study. So, the ChIP-on-chip data tells us that in repression, sometimes Myb is a silent partner and E2F2 is doing the targeting. We have other examples where E2F is the silent partner. E2F is not required, but Myb is. We couldn't have found those without the ChIP-on-chip data, but then we go back and validate it with quantitative PCR. Using MacAlpine's data, we're getting close to 100 percent validation.
Dutta: In mammalian cells we're not getting that kind of validation rate.
Botchan: A lot of it may depend on reagents and how the thing is done. I would guess that the more you're trying to do, the more errors are going to be made. Let me give you a simple example. If you try to do ChIP-chip with ORC, at some sites, it turns out that it's very hard to see ORC1, and at other places, you can see ORC1, but it's very hard to see ORC5.
We have the same thing with the Myb complex. When we were first getting ChIP-chip data with the Myb complex, where Myb is a silent partner, Myb itself doesn't give nearly as good a signal as E2F2. But when we go to places where E2F2 is a silent partner, Myb gives us a much better signal than E2F2. So why is that?
Our working hypothesis is that it has to do with the conformation of the complex at these different sites. So at site 1, where Myb is a silent partner, Myb is not terribly exposed and E2F2 is, and it changes at the other site.
Dutta: I'd think that if it's just biological variation, then E2F in Berkeley and E2F at MIT would produce the same patterns.
Botchan: Yes. That depends on whether they're using the same kind of polyclonal antibodies with the same affinities and using the same cut-offs with the same p values.
Dutta: One of the ways that I'd like to follow up on the replication experiment is to do the ORC ChIP-on-chip, but I've been sort of put off by the fact that groups in the ENCODE project have compared data on the same region in mammalian cells and they're finding lots of differences between the data sets.
Botchan: Have people exchanged reagents?
Dutta: They are doing that now. Some experiments have been done with the exchanged reagents, but it's still early to know whether the issues will be resolved that way.
FOR MORE INFORMATION
Contact Information
Anindya Dutta, M.D., Ph.D.
Harry F. Byrd Professor
Department of Biochemistry & Molecular Genetics
University of Virginia Health System
Jordan Hall, Box 800733
Charlottesville, VA 22908
ad8q@virginia.edu
Michael Botchan, Ph.D.
Professor
Department of Molecular & Cell Biology
University of California, Berkeley
16 Barker Hall
Berkeley, CA 94720
mbotchan@berkeley.edu
Companies
Affymetrix Inc.
Organizations
University of Virginia
University of California, Berkeley
The ENCODE Project
Further Reading
Jeon Y, Bekiranov S, Karnani N, Kapranov P, Ghosh S, MacAlpine D, Lee C, Hwang DS, Gingeras TR, Dutta A. Temporal profile of replication of human chromosomes. Proc Natl Acad Sci USA. 2005; 102(18):6419-24.
Machida YJ, Teer JK, Dutta A. Acute reduction of an origin recognition complex (ORC) subunit in human cells reveals a requirement of ORC for Cdk2 activation. J Biol Chem. 2005; 280(30):27624-30.
Bell SP, Dutta A. Initiation of DNA replication in eukaryotic cells. Annu Rev Biochem. 2002; 71:333-74.
Bandura JL, Beall EL, Bell M, Silver HR, Botchan MR, Calvi BR. humpty dumpty is required for developmental DNA amplification and cell proliferation in Drosophila. Curr Biol. 2005; 15(8):755-9.
Remus D, Beall EL, Botchan MR. DNA topology, not DNA sequence, is a critical determinant for Drosophila ORC-DNA binding. EMBO J. 2004; 23(4):897-907.
MacAlpine DM, Bell SP. A genomic view of eukaryotic DNA replication. Chromosome Res. 2005; 13(3):309-26.
People
David MacAlpine