12 Bio

Download as pdf or txt
Download as pdf or txt
You are on page 1of 2

Class Notes

Class: XII Topic: HUMAN GENOME PROJECT

Subject: BIOLOGY

HUMAN GENOME PROJECT


The sequence of bases in DNA determines the genetic information of a given organism. In other words,
genetic make-up of an organism or an individual lies in the DNA sequences. If two individuals differ, then
their DNA sequences should also be different, at least at some places. These assumptions led to the quest of
finding out the complete DNA sequence of human genome. With the establishment of genetic engineering
techniques where it was possible to isolate and clone any piece of DNA and availability of simple and fast
techniques for determining DNA sequences, a very ambitious project of sequencing human genome was
launched in the year 1990.
Human Genome Project (HGP) was called a mega project. You can imagine the magnitude and the
requirements for the project if we simply define the aims of the project as follows: Human genome is said to
have approximately 3 x 109 bp, and if the cost of sequencing required is US $ 3 per bp (the estimated cost in
the beginning), the total estimated cost of the project would be approximately 9 billion US dollars. Further,
if the obtained sequences were to be stored in typed form in books, and if each page of the book contained
1000 letters and each book contained 1000 pages, then 3300 such books would be required to store the
information of DNA sequence from a single human cell. The enormous amount of data expected to be
generated also necessitated the use of high speed computational devices for data storage and retrieval, and
analysis. HGP was closely associated with the rapid development of a new area in biology called
Bioinformatics.
Goals of HGP Some of the important goals of HGP were as follows:
(i) Identify all the approximately 20,000-25,000 genes in human DNA;
(ii) Determine the sequences of the 3 billion chemical base pairs that make up human DNA;
(iii) Store this information in databases;
(iv) Improve tools for data analysis;
(v) Transfer related technologies to other sectors, such as industries;
(vi) Address the ethical, legal, and social issues (ELSI) that may arise from the project.

The Human Genome Project was a 13-year project coordinated by the U.S. Department of Energy and the
National Institute of Health. During the early years of the HGP, the Wellcome Trust (U.K.) became a major
partner; additional contributions came from Japan, France, Germany, China and others. The project was
completed in 2003. Knowledge about the effects of DNA variations among individuals can lead to
revolutionary new ways to diagnose, treat and someday prevent the thousands of disorders that affect
human beings. Besides providing clues to understanding human biology, learning about non-human
organisms DNA sequences can lead to an understanding of their natural capabilities that can be applied
toward solving challenges in health care, agriculture, energy production, environmental remediation. Many
non-human model organisms, such as bacteria, yeast, Caenorhabditis elegans (a free living non-pathogenic
nematode), Drosophila (the fruit fly), plants (rice and Arabidopsis), etc., have also been sequenced
Methodologies
The methods involved two major approaches. One approach focused on identifying all the genes that are
expressed as RNA (referred to as Expressed Sequence Tags (ESTs). The other took the blind approach of simply
sequencing the whole set of genome that contained all the coding and non-coding sequence, and later
assigning different regions in the sequence with functions (a term referred to as Sequence Annotation).
For sequencing, the total DNA from a cell is isolated and converted into random fragments of relatively
smaller sizes (recall DNA is a very long polymer, and there are technical limitations in sequencing very long
pieces of DNA) and cloned in suitable host using specialised vectors. The cloning resulted into amplification of
each piece of DNA fragment so that it subsequently could be sequenced with ease.

The commonly used hosts were bacteria and yeast, and the vectors were called as BAC (bacterial artificial
chromosomes), and YAC (yeast artificial chromosomes). The fragments were sequenced using automated
DNA sequencers that worked on the principle of a method developed by Frederick Sanger. (Remember,
Sanger is also credited for developing method for determination of amino acid sequences in proteins). These
sequences were then arranged based on some overlapping regions present in them. This required generation
of overlapping fragments for sequencing. Alignment of these sequences was humanly not possible. Therefore,
specialised computer based programs were developed

These sequences were subsequently annotated and were assigned to each chromosome. The sequence of
chromosome 1 was completed only in May 2006 (this was the last of the 24 human chromosomes – 22
autosomes and X and Y to be sequenced).

Another challenging task was assigning the genetic and physical maps on the genome. This was generated
using information on polymorphism of restriction endonuclease recognition sites, and some repetitive DNA
sequences known as microsatellites (one of the applications of polymorphism in repetitive DNA sequences
shall be explained in next section of DNA fingerprinting).

Salient Features of Human Genome Some of the salient observations drawn from human genome project are
as follows:
(i)The human genome contains 3164.7 million nucleotide bases.
(ii) The average gene consists of 3000 bases, but sizes vary greatly, with the largest known human gene
being dystrophin at 2.4 million bases.
(iii) The total number of genes is estimated at 30,000–much lower than previous estimates of 80,000 to
(iv)1,40,000 genes. Almost all (99.9 per cent) nucleotide bases are exactly the same in all people.
(v) The functions are unknown for over 50 per cent of the discovered genes.
Less than 2 per cent of the genome codes for proteins.
(vi) Repeated sequences make up very large portion of the human genome.
(vii)Repetitive sequences are stretches of DNA sequences that are repeated many times,

Sometimes hundred to thousand times. They are thought to have no direct coding functions, but they shed
light on chromosome structure, dynamics and evolution. Chromosome 1 has most genes (2968), and the Y has
the fewest (231). Scientists have identified about 1.4 million locations where singlebase DNA differences
(SNPs – single nucleotide polymorphism, pronounced as ‘snips’) occur in humans. This information promises
to revolutionise the processes of finding chromosomal locations for disease-associated sequences and tracing
human history.

Applications and Future Challenges Deriving meaningful knowledge from the DNA sequences will define
research through the coming decades leading to our understanding of biological systems. This enormous task
will require the expertise and creativity of tens of thousands of scientists from varied disciplines in both the
public and private sectors worldwide. One of the greatest impacts of having the HG sequence may well be
enabling a radically new approach to biological research. In the past, researchers studied one or a few genes
at a time. They can study all the genes in a genome, for example, all the transcripts in a particular tissue or
organ or tumor, or how tens of thousands of genes and proteins work together in interconnected networks
to orchestrate the chemistry of life.
Note: This document is prepared from home.
x

You might also like