15. Genomes and Genomics
Sequencing the Genome
Problem 8
Textbook Question
You have just obtained 100 kb of genomic sequence from an as-yet-unsequenced mammalian genome. What are three methods you might use to identify potential genes in the 100 kb? What are the advantages and limitations of each method?

Examine the sequence for open reading frames (ORFs): ORFs are stretches of DNA that begin with a start codon (ATG in DNA, AUG in the mRNA) and end with a stop codon (TAA, TAG, or TGA in DNA; UAA, UAG, or UGA in the mRNA). Use computational tools to scan all six reading frames for ORFs that are long enough to potentially encode functional proteins. Advantage: This method is straightforward and computationally efficient. Limitation: Most mammalian genes are interrupted by introns, so a simple ORF scan may miss genes with short exons, as well as genes with non-standard coding sequences.
Search for conserved sequences using comparative genomics: Compare the 100 kb sequence to known gene sequences in other mammalian genomes using tools like BLAST. Conserved regions may indicate functional genes. Advantage: This method can identify genes based on evolutionary conservation. Limitation: It may not detect species-specific genes or genes with low conservation.
Analyze for regulatory elements and splice sites: Use software to identify promoter regions, enhancers, and splice junctions that are indicative of gene presence. Look for motifs such as TATA boxes or CpG islands near potential coding regions. Advantage: This method helps identify genes based on transcriptional and post-transcriptional regulatory features. Limitation: It requires prior knowledge of regulatory motifs and may not work well for genes with atypical regulatory structures.
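The first method above, ORF scanning, can be sketched in a few lines of Python. This is a minimal illustration rather than a production gene finder: the function names `find_orfs` and `reverse_complement` and the `min_codons` threshold are assumptions made for this example, and the scan ignores introns, which is exactly the limitation noted above.

```python
# Minimal six-frame ORF scan (illustrative sketch; names and threshold are assumptions).
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """Return (strand, frame, start, end) for each ATG...stop ORF.

    Coordinates are 0-based, half-open, and include the stop codon.
    For the "-" strand they refer to the reverse-complemented sequence.
    ORFs that run off the end without a stop codon are not reported.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3  # codons before the stop
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    print(find_orfs(toy, min_codons=3))
```

Raising `min_codons` reduces false positives from random ORFs but also discards short exons, which is the trade-off behind the stated limitation.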

Key Concepts
Here are the essential concepts you must grasp in order to answer the question correctly.
Gene Prediction Algorithms
Gene prediction algorithms are computational tools used to identify potential genes within a genomic sequence. These algorithms analyze sequence features such as open reading frames (ORFs), splice sites, and promoter regions to predict gene locations. While they can efficiently process large sequences, their accuracy can vary based on the quality of the input data and the specific algorithm used.
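As one concrete example of a sequence feature that such tools may score, the sketch below flags CpG-rich windows, which often lie near mammalian promoters. It assumes the commonly cited thresholds of a 200 bp window, GC content of at least 50%, and an observed/expected CpG ratio of at least 0.6; the function names `cpg_stats` and `cpg_island_windows` and the `step` parameter are hypothetical choices for this illustration.

```python
# Sliding-window CpG island screen (illustrative sketch; names are assumptions).

def cpg_stats(window):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA window."""
    w = window.upper()
    n = len(w)
    c = w.count("C")
    g = w.count("G")
    cpg = w.count("CG")
    gc_frac = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_frac, obs_exp


def cpg_island_windows(seq, window=200, step=50, min_gc=0.5, min_oe=0.6):
    """Return (start, end, gc_frac, obs_exp) for windows passing both thresholds."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc_frac, obs_exp = cpg_stats(seq[start:start + window])
        if gc_frac >= min_gc and obs_exp >= min_oe:
            hits.append((start, start + window, gc_frac, obs_exp))
    return hits


if __name__ == "__main__":
    toy = "AT" * 100 + "CG" * 100
    for hit in cpg_island_windows(toy):
        print(hit)
```

Windows that pass are only candidates: a CpG-rich region suggests a nearby promoter but does not by itself establish that a gene is present.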
Comparative Genomics
Comparative genomics involves comparing the newly obtained genomic sequence with the sequences of well-characterized genomes. By identifying sequences conserved across species, researchers can infer the presence of genes and their likely functions. This method is powerful for identifying evolutionarily conserved genes but may miss species-specific genes that are not conserved.
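The idea of finding conserved stretches can be illustrated with a toy exact-match seed search, loosely in the spirit of the first stage of alignment tools such as BLAST. This is a simplified sketch, not the BLAST algorithm itself: the function names `index_kmers` and `shared_seeds` and the word size `k` are assumptions for this example, and real tools go on to extend and score the seeds.

```python
# Toy k-mer seed search between a query and a reference (illustrative sketch).
from collections import defaultdict


def index_kmers(seq, k):
    """Map each k-mer in seq to the list of positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def shared_seeds(query, reference, k=11):
    """Return (query_pos, reference_pos) pairs where a k-mer matches exactly."""
    ref_index = index_kmers(reference.upper(), k)
    query = query.upper()
    hits = []
    for i in range(len(query) - k + 1):
        # .get avoids inserting empty entries into the defaultdict
        for j in ref_index.get(query[i:i + k], ()):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    print(shared_seeds("AAACGTACGTTT", "GGCGTACGGG", k=5))
```

Runs of hits along the same diagonal (constant `query_pos - reference_pos`) mark candidate conserved regions worth aligning in full.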
Transcriptome Analysis
Transcriptome analysis, often performed through RNA sequencing, examines the complete set of RNA transcripts produced in a cell or tissue. By analyzing the transcriptome, researchers can identify expressed genes and their variants. While this method provides direct evidence of gene activity, it detects only genes expressed in the tissues and conditions from which the RNA was collected, so it may not capture all potential genes, especially those that are not actively expressed.
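Once RNA-seq reads have been aligned to the genomic sequence, expressed regions can be picked out by read depth. The sketch below assumes the alignments are already available as 0-based, half-open `(start, end)` intervals; the function name `expressed_intervals` and the `min_depth` threshold are hypothetical, and spliced reads are not handled.

```python
# Call expressed regions from aligned read intervals (illustrative sketch).

def expressed_intervals(read_alignments, region_length, min_depth=2):
    """Return merged (start, end) intervals covered by at least min_depth reads."""
    # Difference array: +1 where a read starts, -1 where it ends.
    delta = [0] * (region_length + 1)
    for start, end in read_alignments:
        start = max(start, 0)
        end = min(end, region_length)
        if start < end:
            delta[start] += 1
            delta[end] -= 1

    intervals = []
    depth = 0
    open_start = None
    for pos in range(region_length):
        depth += delta[pos]  # running sum gives the depth at this position
        if depth >= min_depth and open_start is None:
            open_start = pos
        elif depth < min_depth and open_start is not None:
            intervals.append((open_start, pos))
            open_start = None
    if open_start is not None:
        intervals.append((open_start, region_length))
    return intervals


if __name__ == "__main__":
    reads = [(0, 10), (5, 15), (8, 20), (30, 40)]
    print(expressed_intervals(reads, region_length=50, min_depth=2))
```

Regions with no reads are not evidence that a gene is absent; they may simply be silent in the sampled tissue.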