Human Chromosome 7 Mapping and Sequencing

Genome Technology Branch




 

Discussion


  STS Order

  Theoretical Considerations

  Comparison to Other Chromosome 7 Physical Maps

  Comparison to Physical Maps of Other Human Chromosomes

  Problems Encountered

  Utility of Map


STS Order

As part of the ongoing Human Genome Project, we have constructed a highly integrated and annotated physical map of human chromosome 7 that provides an average STS spacing of ~79 kb. YAC-based STS-content mapping was used to determine the local order of STSs within individual contigs, while integration with the genetic and RH maps provided a global scaffold for establishing the long-range order and orientation of contigs. While the integrated physical map likely provides a generally accurate order of the STSs, it is worth cautioning that occasional minor errors in the local STS order have inevitably occurred. In the process of assembling BAC contigs for some chromosomal regions, we have already encountered some minor local reordering of STSs, although in no instance has the position of an STS been found to differ substantially from that deduced by YAC-based STS-content mapping. Thus, as with all DNA maps, the accuracy of our STS-based physical map decreases at higher levels of resolution.
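As a rough illustration of how STS-content data link clones into contigs, the sketch below groups YACs that share at least one STS. It is a minimal sketch and not the software used for this map: the function name `build_contigs` and all YAC and STS identifiers are hypothetical, and the code neither orders STSs within a contig nor detects chimeric clones.

```python
def build_contigs(yac_to_sts):
    """Group YACs into contigs: two YACs join when they share an STS.

    yac_to_sts maps a (hypothetical) YAC name to the set of STSs it
    tested positive for. Returns a sorted list of sorted YAC-name lists.
    """
    parent = {yac: yac for yac in yac_to_sts}

    def find(yac):
        # Follow parent links to the contig representative,
        # shortening the path along the way.
        while parent[yac] != yac:
            parent[yac] = parent[parent[yac]]
            yac = parent[yac]
        return yac

    first_yac_for_sts = {}
    for yac, sts_set in yac_to_sts.items():
        for sts in sts_set:
            if sts in first_yac_for_sts:
                # Shared STS: merge the two contigs.
                parent[find(yac)] = find(first_yac_for_sts[sts])
            else:
                first_yac_for_sts[sts] = yac

    contigs = {}
    for yac in yac_to_sts:
        contigs.setdefault(find(yac), set()).add(yac)
    return sorted(sorted(members) for members in contigs.values())


# Hypothetical STS-content data: yA, yB, and yC overlap; yD stands alone.
hits = {
    "yA": {"s1", "s2"},
    "yB": {"s2", "s3"},
    "yC": {"s3", "s4"},
    "yD": {"s7"},
}
contigs = build_contigs(hits)
```

Under this simple rule a chimeric YAC would join unrelated regions into one contig, which is one reason an independent scaffold from genetic and RH maps is useful for long-range order.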

The independent analysis of overlapping subsets of STSs by genetic-, RH-, and YAC-based mapping methods (see Figure 1) provides the ability to compare the STS order deduced by each approach. For example, the Genethon genetic and RH maps are concordant with the YAC-based physical map for 93% and 95% of the commonly mapped STSs, respectively. Admittedly, these numbers should be viewed cautiously, since these other maps were used to help guide the construction of the YAC contig map. In contrast, the physical mapping of chromosome 7 CHLC genetic markers (Buetow et al. 1994; Murray et al. 1994; Sheffield et al. 1995) was performed such that their positions on the genetic map were not examined until after assembly of the YAC contigs. Nonetheless, the physical and genetic orders deduced for the CHLC genetic markers are highly consistent. This is best illustrated with two CHLC chromosome 7 genetic maps: (1) the version 4.0 recombination-minimization extended Weber V6 screening set map, which consists almost exclusively of CHLC genetic markers; and (2) the version 4.0 recombination-minimization integrated map, which mainly consists of CHLC and Genethon genetic markers (see http://www.chlc.org:80/ChlcMaps.html). Of the 31 markers in the first map, 24 are also localized on the YAC-based physical map, with the identical order established for 23 (96%) of these markers. Of the 65 markers in the second map, 59 are also localized on the YAC-based physical map, with the identical order established for 57 (97%) of these markers.
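The percentages quoted above follow directly from the marker counts. The sketch below reproduces that arithmetic and shows one possible way to count markers that share the same order on two maps, by taking the largest set of shared markers whose relative order agrees. That counting rule is an assumption made for illustration, since the text does not state how concordant markers were tallied. The function names and marker names are hypothetical, and the code assumes each marker appears once per map and that both maps are listed in the same orientation.

```python
from bisect import bisect_left


def concordant_count(map_a, map_b):
    """Return (shared, concordant) for two ordered marker lists.

    shared     = number of markers present on both maps
    concordant = size of the largest subset of shared markers that
                 appears in the same relative order on both maps
    """
    rank_b = {marker: i for i, marker in enumerate(map_b)}
    ranks = [rank_b[m] for m in map_a if m in rank_b]

    # Longest strictly increasing subsequence (patience-sorting method).
    tails = []
    for r in ranks:
        pos = bisect_left(tails, r)
        if pos == len(tails):
            tails.append(r)
        else:
            tails[pos] = r
    return len(ranks), len(tails)


def percent(n, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * n / total)


# Hypothetical maps: m2 and m3 are swapped; m5 and m6 are not shared.
genetic = ["m1", "m2", "m3", "m4", "m5"]
physical = ["m1", "m3", "m2", "m4", "m6"]
shared, concordant = concordant_count(genetic, physical)

# Counts reported in the text for the two CHLC maps.
first_map_pct = percent(23, 24)
second_map_pct = percent(57, 59)
```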

Theoretical Considerations

A number of theoretical studies have attempted to predict the outcome of physical mapping projects employing strategies similar to our approach for mapping chromosome 7 (Arratia et al. 1991; Barillot et al. 1991; Green and Green 1991; Palazzolo et al. 1991; Zhang and Marr 1993; Nelson and Speed 1994; Port et al. 1995). Many of these included simulations that provided insight about the likely map characteristics based on various assumptions relating to the specific methods, reagents, and other factors. While it would be of interest to directly compare our chromosome 7 physical map with the predicted results derived from these simulations, such an analysis is actually quite difficult to perform. Specifically, the heterogeneous nature of many aspects of our project (e.g., YACs from various libraries with markedly different insert sizes and chimerism rates, a mixture of STSs obtained at random and from strategically selected YACs, the use of alternate maps to help guide contig assembly) was not properly accounted for in these studies, including the one performed at the initiation of our project (Green and Green 1991). It is worth noting that most of these studies predicted that our mapping effort should have yielded slightly better continuity with fewer total contigs. This difference almost certainly reflects specific features of the human genome that were not readily considered in the simulations, such as highly complex chromosomal structures as well as GC-rich regions and other DNA segments that are less readily recovered in YACs.

Comparison to Other Chromosome 7 Physical Maps

Several other large-scale efforts have aimed to construct YAC-based physical maps of chromosome 7. These have included both genome-wide and chromosome 7-specific mapping projects. In the latter cases, the mapping strategies have emphasized hybridization-based analyses with various probes rather than STS-content mapping (Scherer et al. 1993; Kunz et al. 1994; e.g., see http://www.genet.sickkids.on.ca/chromosome7). In addition, a separate YAC library constructed from a chromosome 7-containing human-rodent hybrid cell line was used (Scherer et al. 1992). Thus, our map has little in common with respect to either the markers or YACs, precluding rigorous map-to-map comparisons.

The CEPH-Genethon group utilized several experimental approaches for building a YAC-based physical map of the human genome (Bellanne-Chantelot et al. 1992; Cohen et al. 1993; Chumakov et al. 1995; see http://www.cephb.fr/bio/ceph-genethon-map.html). These included hybridization-based fingerprint analysis of a total genomic YAC library (Bellanne-Chantelot et al. 1992), cross-hybridization of Alu-PCR products from individual YACs to arrayed Alu-PCR products from the entire YAC library, assignment of YACs to individual chromosomes by hybridization with Alu-PCR probes generated from human-rodent hybrid cell lines, PCR-based assignment of Genethon genetic markers to YACs, and FISH-based assignment of YACs to individual chromosomes (Bray-Ward et al. 1996). Using a suite of computer programs (e.g., Quickmap), the resulting data can be analyzed to yield contig maps in the form of YAC paths that cover the majority of the genome, including most of chromosome 7. Importantly, the CEPH-Genethon physical map is essentially a clone-based map, as opposed to a landmark-based map (such as a YAC-based STS-content map). Direct comparison of the CEPH-Genethon map with our chromosome 7 map is difficult, since only a subset of the YACs and an even smaller proportion of the markers are in common. Nonetheless, there appears to be general consistency between the maps, especially with respect to the identification of CEPH YACs containing Genethon genetic markers. Of note, where it is possible to align the two maps, the CEPH-Genethon map does not seem to either provide clone coverage across any of our gaps or significantly extend any of our YAC contigs.

In contrast to these studies, the genome-wide physical mapping effort by the Whitehead/MIT Genome Center was more analogous to our project (Hudson et al. 1995). In this case, large numbers of STSs were developed and mapped to CEPH YACs and/or whole-genome RH cell lines. The local order of STSs was determined by YAC-based STS-content mapping, while the longer-range order and orientation of assembled contigs was established based on the RH mapping data and the positions of Genethon genetic markers. The major similarities between our chromosome 7 physical map and the Whitehead/MIT map are the general strategic approach (STS-content mapping of YACs coupled with genetic and RH mapping), the use of sequences from various sources for generating STSs, and the apparent coverage of >95% of chromosome 7 by the resulting map. However, a number of important differences are also evident. For example, the Whitehead/MIT map relies heavily on the STS order established by RH mapping; in fact, a large fraction of the STSs were never mapped to YACs. Furthermore, the YACs themselves were exclusively derived from the CEPH YAC library, with its associated 40-60% chimerism rate (Green et al. 1991b), whereas a large fraction of the clone coverage in our map is provided by hybrid cell line-derived YACs with a 10-15% chimerism rate (Green et al. 1995a). The latter feature also prompted us to develop large numbers of STSs from YAC insert ends, an activity not performed in constructing the Whitehead/MIT map. Finally, some of the gaps in the Whitehead/MIT map were filled on the basis of available CEPH-Genethon mapping data (e.g., YAC fingerprint analysis), whereas we only used STS-content data for constructing contigs. Direct map-to-map comparisons (based on Release 11, October 1996; see http://www-genome.wi.mit.edu; L. Hui and L. Stein, personal communication) reveal several important numerical differences, including (Whitehead/MIT map vs. our map, respectively, in each case): (1) number of STSs mapped to either YACs or RH cell lines (1001 vs. 2150); (2) average spacing of mapped STSs (~170 kb vs. ~79 kb); (3) number of STSs mapped to YACs (659 vs. 2150); (4) number of STSs mapped to RH cell lines (696 vs. 259); (5) number of Genethon genetic markers mapped to YACs (208 vs. 260); and (6) number of YACs in contigs (968 vs. 3892).
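As a quick consistency check on these figures, multiplying the number of mapped STSs by the average spacing gives the length implied for each map. The sketch assumes the simple relation spacing = length / number of STSs, and the helper name is hypothetical. Both maps come out near 170 Mb, so the roughly twofold difference in average spacing reflects the roughly twofold difference in STS count.

```python
def implied_length_mb(n_sts, avg_spacing_kb):
    """Length (Mb) implied by an STS count and an average spacing (kb).

    Assumes spacing = length / number of STSs; a rough check only.
    """
    return n_sts * avg_spacing_kb / 1000.0


whitehead_mb = implied_length_mb(1001, 170)  # Whitehead/MIT map figures
this_map_mb = implied_length_mb(2150, 79)    # figures for the map reported here
density_ratio = 2150 / 1001                  # fold difference in STS count

print(f"Whitehead/MIT: {whitehead_mb:.2f} Mb")
print(f"This map:      {this_map_mb:.2f} Mb")
print(f"STS count ratio: {density_ratio:.2f}")
```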

Because of the similar strategies employed, we can more readily align the Whitehead/MIT chromosome 7 physical map with our map. In no instance does their map provide YAC connectivity between adjacent Genethon genetic markers that we were unable to establish. Thus, none of our remaining gaps appear to be filled in their map. Remarkably, there are only minor apparent discrepancies between the two maps, while there is generally good correlation with respect to the presence of notably large or small YAC contigs at specific locations along the chromosome.

Comparison to Physical Maps of Other Human Chromosomes

At a minimum, first-generation clone-based physical maps have been constructed for all human chromosomes. In most instances, the best available maps are those constructed by the genome-wide mapping efforts (Bellanne-Chantelot et al. 1992; Cohen et al. 1993; Chumakov et al. 1995; Hudson et al. 1995). However, in some cases, more focused efforts have produced more refined physical maps of individual chromosomes, including those for chromosomes 3 (Gemmill et al. 1995), 4 (see http://shgc.stanford.edu), 11 (Quackenbush et al. 1995; Qin et al. 1996), 12 (Krauter et al. 1995), 16 (Doggett et al. 1995), 19 (Ashworth et al. 1995), 21 (Chumakov et al. 1992a), 22 (Bell et al. 1995; Collins et al. 1995), X (Crollius et al. 1996; Nagaraja et al. 1997), and Y (Foote et al. 1992; Vollrath et al. 1992). There are, however, significant differences among these maps, especially with respect to the reagents utilized (e.g., clones, markers), mapping strategy, and overall map resolution. The goal of 100-kb average STS spacing (Collins and Galas 1993) has thus far been achieved with the chromosome 7 map reported here and that constructed by a very similar strategy for the X chromosome (Nagaraja et al. 1997).

Problems Encountered

The experience gained by the rigorous implementation of the YAC-based STS-content mapping paradigm has provided important insight about the virtues and problems of this physical mapping approach. The virtues are generally apparent: the strategy is quite effective for almost all segments of human DNA. However, specific problems are also encountered that hinder the mapping of specific regions; of course, such problems are neither unexpected nor unique to an STS-based approach.

The most difficult problems encountered relate to the presence of large non-unique blocks of DNA on human chromosomes. Unlike highly repetitive DNA sequences (e.g., Alu and L1 elements), these repeated segments are present only a few times in the genome. For our project, the use of a YAC collection highly enriched for chromosome 7 DNA helped to minimize problems caused by duplicated segments present on other chromosomes. However, several examples of duplicated blocks of DNA confined to chromosome 7 were encountered. In general, when the duplicated segment is small relative to the typical YAC size, the problem can often be overcome by simply discarding those STSs residing within the repeated DNA. However, when the duplicated segment is large, elimination of the non-unique STSs often precludes the establishment of contiguous clone coverage across the duplicated regions. In the latter case, one is thus presented with the dilemma of eliminating problematic STSs at the expense of continuity, a situation difficult to resolve.

A less severe problem is the failure to isolate certain genomic regions in YACs. For chromosome 7, we found this situation to be rare (1% of the STSs analyzed), similar to the experience for other human chromosomes (Chumakov et al. 1992a; Foote et al. 1992; Hudson et al. 1995; Qin et al. 1996). Thus, only a very small fraction of the human genome appears to be absent in carefully constructed YAC-based physical maps. Admittedly, there is evidence that some of the DNA that cannot be isolated in YACs is gene-rich and should thus not be simply ignored.

Utility of Map

The value of the physical map reported here should ultimately be assessed based on how it facilitates other experimental efforts. One set of studies that has already been greatly enhanced comprises those aiming to isolate chromosome 7 genes associated with genetic disease by a positional cloning strategy (Collins 1995b). A number of important advances have already been made as a direct result of our physical map (Lewanda et al. 1994; Curran et al. 1995; Johnson et al. 1995; Keen et al. 1995; Torigoe et al. 1995; Hoglund et al. 1996; Le Beau et al. 1996; McGuire et al. 1996; Reynolds et al. 1996; Howard et al. 1997). In studies such as these, the most common entry point into the physical map is through specific genetic markers, for example, those defining a critical region harboring a gene of interest. Thus, the presence of >430 genetic markers on our physical map should prove valuable for positional cloning efforts. Likewise, the established locations of >350 gene/EST-specific STSs as well as near complete clone coverage across the chromosome should facilitate virtually any positional cloning project involving chromosome 7.

Perhaps the most important utility of our physical map, however, will be to provide the necessary infrastructure for sequencing chromosome 7. The Human Genome Project is currently at a critical transition point, as the emphasis shifts from assembling a map of the human genome to elucidating its sequence (Olson 1995; Boguski et al. 1996). Ironically, despite the large investment in up-front physical mapping of the human genome and the relatively low per-base-pair cost of physical mapping compared to sequencing, the availability of high quality physical maps and supporting clone coverage of the mapped regions remains a rate-limiting step in the sequencing of the human genome. Our STS-based physical map provides an excellent starting point for deriving the necessary clones for large-scale sequencing. For example, similar to the strategy proposed by Hudson et al. (1995), we are actively using our STS-specific PCR assays to isolate corresponding BAC clones. While the average STS spacing across our map is ~79 kb, the precise distance between adjacent STSs is highly variable. As such, we are encountering many situations where multiple STSs are present in a single 100-200 kb BAC as well as other instances where the interval between adjacent STSs cannot be spanned by any one BAC. For the latter cases, achieving clone continuity will require the generation of new STSs from the BAC insert ends and subsequent walking, similar to the strategy we employed for constructing YAC contigs. Our experience to date, which includes the isolation of BACs that together account for >40 Mb of chromosome 7 (unpublished data), indicates that long-range BAC coverage can be achieved starting with the STS map reported here.
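The contrast between average spacing and individual intervals can be made concrete with a small sketch. The STS positions below are invented for illustration (they are not taken from the map) and were chosen so that the mean spacing is 79 kb while one interval still exceeds a 200-kb BAC insert. The function names and the `max_bac_kb` parameter are hypothetical.

```python
def mean_spacing_kb(positions_kb):
    """Average distance between adjacent STSs, in kb."""
    ordered = sorted(positions_kb)
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def gaps_needing_walk(positions_kb, max_bac_kb=200):
    """Adjacent STS pairs whose separation exceeds one BAC insert.

    Such intervals cannot be spanned by a single BAC and would need new
    STSs (for example from BAC insert ends) and subsequent walking.
    """
    ordered = sorted(positions_kb)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_bac_kb]


# Hypothetical STS positions (kb) along a short stretch of chromosome.
positions = [0, 20, 45, 300, 330, 395]

average = mean_spacing_kb(positions)      # mean of 20, 25, 255, 30, 65
long_gaps = gaps_needing_walk(positions)  # intervals longer than 200 kb
```

Here the first three STSs lie within 45 kb of one another and could fall in a single BAC, whereas the 255-kb interval between the STSs at 45 kb and 300 kb would require walking.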

A distinctive feature of our map relevant to the sequencing of chromosome 7 is the frequent presence of YACs derived from a monochromosomal hybrid cell line. These YACs represent virtually all of the chromosome and are rarely chimeric. Such clones may be valuable for the systematic sequencing of the chromosome. For example, a sizable fraction (~20%) of the nematode genome has been difficult to isolate in large-insert bacterial clones (Berks and the C. elegans Genome Mapping and Sequencing Consortium 1995; Hodgkin et al. 1995; Waterston and Sulston 1995). For these regions, YACs (Coulson et al. 1988; Coulson et al. 1991) are proving to be critical sources of sequencing templates (Vaudin et al. 1995). Should an analogous situation be encountered with chromosome 7, then the availability of well-mapped, high-quality YACs should facilitate the complete sequencing of the chromosome. Towards that end, we have taken great care in the handling and growth of these clones, so as to minimize serial propagation and the opportunity for rearrangement of the cloned inserts. Of note, several overlapping pairs of hybrid cell line-derived YACs have been subcloned and analyzed at very high resolution, with no evidence of any clone-to-clone variation (Wong et al. 1997). Thus, at least for the cases examined to date, these YACs appear to represent a reliable source of cloned DNA that can be used, when needed, for sequencing segments of chromosome 7.

The sequences of various chromosome 7 BACs and YAC-derived cosmids (generated in collaboration with the Washington University Genome Sequencing Center and the University of Washington Genome Center, respectively; see Table 2) are now being established and deposited in GenBank at a steady pace. It is our expectation that chromosome 7 will be among the first few human chromosomes completely sequenced, perhaps around the turn of this century.

