Genomic Libraries 



The bacteria transformed with the plasmid vector ligation reaction, or the phage particles packaged in vitro from the phage vector ligation reaction, represent a library of recombinant molecules.
Each bacterial transformant (plasmid vectors) or in vitro packaged virus particle (phage vectors) represents a single vector molecule joined to a unique foreign DNA fragment. Since our objective in constructing such a library is to identify and isolate a specific foreign DNA fragment, it is important that every DNA sequence in the sample is represented by one or more individual recombinants.


Foreign Genomic DNA Fragment Preparation
We have already looked at the pattern of genomic fragments generated by restriction enzyme digestion of genomic DNA. Given the size limitations imposed by the various vector systems, and in order to ensure that all genomic sequences are represented in our libraries, we must generate random genomic fragments and estimate the probability that our specific DNA fragment will be included in a statistically relevant number of recombinant molecules. To generate the random genomic DNA fragments and ligate them into our selected vector, we focus our attention on pairs of enzymes that generate compatible ends to facilitate ligation.


In order to generate a random population of genomic DNA fragments using an enzyme that cuts frequently, we use partial digest conditions in which the enzyme cuts the genome at only a fraction of its recognition sites (by limiting the enzyme concentration, the time of digestion, or both).


Since the recognition sites occur frequently and we cut them randomly, the ends of the fragments we produce are a random selection of genomic sites.
Following partial digestion of the genomic DNA, we isolate an appropriate size class of genomic fragments; each end of each fragment represents one of many possible SauIIIA sites. These fragments are then ligated into a BamHI-digested vector: a plasmid, or a lambda insertion or substitution vector. Each recombinant transformant or packaged phage resulting from this ligation represents an individual random genomic fragment. This produces a population of recombinant molecules that contain many overlapping genomic fragments, so any given genomic sequence will be represented by several of these overlapping fragments.
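The overlap among random fragments can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions that are not in the notes above: a hypothetical 100,000 bp genome, inserts of a fixed 1,000 bp, and start positions chosen uniformly at random (a real partial digest ends at restriction sites and yields a range of fragment sizes). The function name simulate_library is invented for illustration.

```python
import random


def simulate_library(genome_len, insert_len, n_clones, seed=0):
    """Place n_clones fixed-length inserts at random positions on a toy genome.

    Returns (fraction of positions covered at least once, mean depth).
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    depth = [0] * genome_len   # number of clones covering each position
    for _ in range(n_clones):
        # Assumption: every start position is equally likely.
        start = rng.randrange(genome_len - insert_len + 1)
        for pos in range(start, start + insert_len):
            depth[pos] += 1
    covered = sum(1 for d in depth if d > 0)
    return covered / genome_len, sum(depth) / genome_len


# 500 clones x 1,000 bp on a 100,000 bp toy genome = 5 genome equivalents
coverage, mean_depth = simulate_library(100_000, 1_000, 500)
print(f"fraction covered: {coverage:.3f}, mean depth: {mean_depth:.1f}")
```

In this toy library the mean depth is 5, so a typical position lies within several overlapping inserts, while a small fraction of positions may still be missed by chance.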


Since this library contains a random population of genomic fragments, we need a statistical approach to determine how many independent recombinant plasmids or phage are needed to ensure that all genomic sequences are represented in the library.
There are two statistical approaches used to estimate the number of recombinants required. The first is the 'lazy molecular biologist's rule of thumb'. We begin by calculating the number of individual recombinants that together contain enough genomic DNA sequence to represent one complete genome: the genome equivalent.




For the human genome, using an average plasmid insert size of 6 kb, 1 genome equivalent = (3 x 10^{9} bp) / (6 x 10^{3} bp) = 5 x 10^{5} plasmids.
For the human genome, using a phage substitution vector insert size of 20 kb, 1 genome equivalent = (3 x 10^{9} bp) / (2 x 10^{4} bp) = 1.5 x 10^{5} phage.
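The genome equivalent is a single division, and a short script makes it easy to repeat for other genome or insert sizes. The following is a minimal sketch; the function name genome_equivalent and the constant HUMAN_GENOME_BP are names chosen here for illustration, and the sizes are the ones used in these notes.

```python
def genome_equivalent(genome_bp, insert_bp):
    """Number of non-overlapping inserts that would span the genome once."""
    if genome_bp <= 0 or insert_bp <= 0:
        raise ValueError("genome and insert sizes must be positive")
    return genome_bp / insert_bp


HUMAN_GENOME_BP = 3e9  # genome size used in these notes

plasmid_ge = genome_equivalent(HUMAN_GENOME_BP, 6e3)  # 6 kb plasmid inserts
phage_ge = genome_equivalent(HUMAN_GENOME_BP, 2e4)    # 20 kb lambda inserts

print(f"plasmid library: {plasmid_ge:.1e} clones per genome equivalent")
print(f"phage library:   {phage_ge:.1e} clones per genome equivalent")
```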


The genome equivalent assumes that the genomic inserts are nonoverlapping. In reality, these are random overlapping fragments that are distributed in a Gaussian fashion across the genome. To approximate this distribution, the 'rule of thumb' stipulates that 5 genome equivalents are necessary to ensure that 95% of genomic sequences are represented, and that 10 genome equivalents are necessary to ensure that 99% are represented.


The second approach uses the simple statistical formula:


N = ln(1 - p) / ln(1 - f)

where p = probability (0.95 or 0.99) and f = fraction of the genome contained in a single average insert
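The reasoning behind this formula is that a given sequence is absent from one random clone with probability (1 - f), so it is absent from all N independent clones with probability (1 - f)^N; setting 1 - (1 - f)^N = p and solving for N gives the expression above. A minimal sketch of the calculation follows. The function name clones_required is chosen here for illustration, and the example values are the 6 kb plasmid inserts and 3 x 10^{9} bp genome used in these notes.

```python
import math


def clones_required(p, f):
    """Number of independent clones N so that a given sequence is present
    with probability p, where f is the fraction of the genome per insert."""
    if not (0.0 < p < 1.0) or not (0.0 < f < 1.0):
        raise ValueError("p and f must both lie strictly between 0 and 1")
    # log1p(-f) computes ln(1 - f) accurately when f is very small
    return math.log(1.0 - p) / math.log1p(-f)


f_plasmid = 6e3 / 3e9  # one 6 kb insert as a fraction of the genome

n_95 = clones_required(0.95, f_plasmid)
n_99 = clones_required(0.99, f_plasmid)
print(f"p = 0.95: about {n_95:.2e} clones")
print(f"p = 0.99: about {n_99:.2e} clones")
```

Because the result is a count of clones, in practice it would be rounded up to the next whole number.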


The molecular biologist's rule of thumb overestimates the number of recombinants needed to give complete representation of the genome, but then you can never really have TOO many recombinants, can you?
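To see how large the overestimate is, we can express the formula's answer in genome equivalents (N multiplied by f) and set it beside the rule of thumb. This is a sketch under the same assumptions as the formula (independent, random clones); the function name genome_equivalents_needed is chosen here for illustration, and the 6 kb insert and 3 x 10^{9} bp genome are the values used in these notes.

```python
import math


def genome_equivalents_needed(p, f):
    """Genome equivalents implied by N = ln(1 - p) / ln(1 - f)."""
    n_clones = math.log(1.0 - p) / math.log1p(-f)
    return n_clones * f  # each clone carries a fraction f of the genome


f_plasmid = 6e3 / 3e9  # 6 kb insert, 3 x 10^9 bp genome

comparison = {}
for p, rule_ge in ((0.95, 5), (0.99, 10)):
    formula_ge = genome_equivalents_needed(p, f_plasmid)
    comparison[p] = (rule_ge, formula_ge)
    print(f"p = {p}: rule of thumb {rule_ge} genome equivalents, "
          f"formula {formula_ge:.2f}")
```

Under these assumptions the formula asks for roughly 3 genome equivalents at p = 0.95 and roughly 4.6 at p = 0.99, compared with 5 and 10 from the rule of thumb.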
For the human genome, a random shotgun plasmid library constructed using genomic inserts of 6 kb needs to contain at least:

for 95% coverage
5 genome equivalents = 5 (3 x 10^{9} bp) / (6 x 10^{3} bp) = 2.5 x 10^{6} clones
alternatively, N = ln(1 - 0.95) / ln(1 - (6 x 10^{3} / 3 x 10^{9})) = approximately 1.5 x 10^{6} clones

for 99% coverage
10 genome equivalents = 10 (3 x 10^{9} bp) / (6 x 10^{3} bp) = 5 x 10^{6} clones
alternatively, N = ln(1 - 0.99) / ln(1 - 0.000002) = approximately 2.3 x 10^{6} clones


A random shotgun lambda substitution library constructed using genomic inserts of 20 kb needs to contain at least:

for 95% coverage
5 genome equivalents = 5 (3 x 10^{9} bp) / (2 x 10^{4} bp) = 7.5 x 10^{5} clones
alternatively, N = ln(1 - 0.95) / ln(1 - (2 x 10^{4} / 3 x 10^{9})) = approximately 4.5 x 10^{5} clones

for 99% coverage
10 genome equivalents = 10 (3 x 10^{9} bp) / (2 x 10^{4} bp) = 1.5 x 10^{6} clones
alternatively, N = ln(1 - 0.99) / ln(1 - 6.7 x 10^{-6}) = approximately 6.9 x 10^{5} clones


