Sequencing

Introduction

The Huntsman Cancer Institute purchased an Illumina GA II next-generation sequencer in early 2008 and placed it under the management of the Microarray Core Facility. This instrument has recently been upgraded to a GAIIx and fully supports single-end and paired-end sequencing with read lengths of either 36 or 76 bases. Bioinformatic assistance with high-throughput sequencing data analysis is available through the Bioinformatics Core Facility.

Sample Submission

  • Follow the recommendations below and, when in doubt, contact Brian Dalley for instructions on sample preparation methods.
  • Any sample exposed to phenol or other organic solvents should be run through a Qiagen cleanup column prior to submission to avoid contaminants that may inhibit the enzymes used in the Illumina library preparation protocols.
  • A completed University of Utah Work Authorization Form or an active Purchase Order must be on file at the Microarray Core Facility prior to submitting your samples for sequencing services.
  • GNomEx, the Microarray Core Facility LIMS, should be used to document the experimental details of your Illumina sequencing request. All Illumina Genome Analyzer sequencing requests should be submitted electronically through GNomEx before samples are sent to the Microarray Core Facility.
  • Sequencing access is currently provided on a first-come, first-served basis for all lab groups at the University of Utah.
  • Sequencing services for researchers outside of the University of Utah system are also available.

Sample Preparation and Data Processing

  • Samples submitted for Illumina Genome Analyzer sequencing will be checked by appropriate quality control measures (NanoDrop, Qubit, Bioanalyzer, gel electrophoresis) to assess the quantity and quality of the sample (1 day).
  • Samples that pass quality control steps will be used for generating an Illumina sequencing library using the appropriate sample prep kit (1-2 days).
  • Investigators have the option of performing the Illumina library preparation protocol in their own laboratory. A fraction of the library can then be submitted to the Microarray Core Facility for sequencing, specifying the type of flowcell (single-read or paired-end) to run the library on. See the Solexa catalog.
  • Following generation of the sequencing libraries, an aliquot will be run on an Agilent DNA 1000 Bioanalyzer chip to validate the quality and size range of the library.
  • Sequencing libraries will then be run on the Illumina Genome Analyzer (3 days for single-end runs; 5 days for paired-end runs).
  • The Illumina Genome Analyzer analysis pipeline will be run on the image files to call bases, generate reads, and map the reads to a genome build of choice (2 days). See pipeline user guide.
  • The reads and mapped reads will then be made available to the investigator. After one month the raw image files will be deleted.
  • Additional bioinformatic analysis can be requested from the Bioinformatics Core Facility. We are currently developing tools for the analysis of ChIP-seq, digital gene expression, transcriptome, and resequencing applications. See USeq.
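
The stage durations listed above can be added up to get a rough turnaround estimate. The following is a minimal sketch in Python, assuming the stages run back to back with no queue time and that the Bioanalyzer library validation adds no extra days; the names `STAGES`, `RUN_DAYS`, and `estimate_turnaround` are illustrative and not part of any facility tool.

```python
# Stage durations in days as (min, max), taken from the list above.
# Assumption: stages run consecutively and there is no waiting time in the queue.
STAGES = {
    "quality_control": (1, 1),
    "library_prep": (1, 2),
    "analysis_pipeline": (2, 2),
}

# Instrument run time in days for each flowcell type.
RUN_DAYS = {"single-end": 3, "paired-end": 5}


def estimate_turnaround(run_type):
    """Return (min_days, max_days) from sample receipt to mapped reads."""
    if run_type not in RUN_DAYS:
        raise ValueError(f"unknown run type: {run_type!r}")
    low = high = RUN_DAYS[run_type]
    for stage_min, stage_max in STAGES.values():
        low += stage_min
        high += stage_max
    return low, high


print(estimate_turnaround("single-end"))  # (7, 8)
print(estimate_turnaround("paired-end"))  # (9, 10)
```

Actual turnaround depends on instrument availability, so these figures are best read as a lower bound.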


Sample Pricing

Genomic DNA Sequencing Recommendations

  • Genomic DNA should be provided as high molecular weight DNA. Preferred sample preparation methods include the Qiagen DNeasy kit and the Qiagen Genomic-Tip System.
  • DNA samples should be treated with RNase. This step should be included and performed during the Qiagen DNeasy or Qiagen Genomic-tip purification methods.
  • Do not overload Qiagen columns during DNA purification. Overloading columns can introduce impurities (guanidine, protein, carbohydrate) that can have inhibitory activity during the Illumina library preparation protocol.
  • Avoid organic extraction methods (such as phenol or Trizol) to purify genomic DNA. Organic carryover can inhibit the enzymatic reactions used in Illumina library preparation. If an organic extraction method is used, this should be followed by purification on a Qiagen spin column. To remove residual organic contamination in a DNA sample it is recommended to use the Qiagen DNeasy kit.
  • The recommended quantity of genomic DNA needed to create an Illumina DNA sequencing library is 1-5 μg of DNA. The genomic DNA sample (50 μl) should be provided to the Microarray Core Facility at a concentration between 20 and 100 ng/μl. A concentration at the high end of this range is preferred.
  • A single library preparation will yield sufficient samples such that numerous lanes of DNA sequence analysis can be performed.
  • Genomic DNA quality can be assessed by running approximately 50 ng of the sample on a 1% agarose gel stained with ethidium bromide. Intact genomic DNA should appear as a high molecular weight (>10,000 bp) band with no lower molecular weight smear. A small amount of low molecular weight smear may be acceptable; however, this should be limited. A significant amount of visible low molecular weight smearing may be detrimental to library generation.
  • The genomic DNA sample will be fragmented by nebulization during the library preparation procedure. This process will uniformly reduce the size of the genomic DNA to an optimal range of 200-800 bp.
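
The quantity and concentration figures are linked by simple arithmetic: total mass equals concentration times volume, so 1-5 μg in 50 μl corresponds to 20-100 ng/μl. The sketch below, with hypothetical helper names `total_mass_ug` and `check_sample`, shows how a measured concentration could be checked against a recommended range before submission; the defaults are the genomic DNA values from this section, and other sample types would need their own limits.

```python
def total_mass_ug(concentration_ng_per_ul, volume_ul):
    """Total nucleic acid mass in micrograms (1 ug = 1000 ng)."""
    return concentration_ng_per_ul * volume_ul / 1000.0


def check_sample(concentration_ng_per_ul, volume_ul=50.0, min_ug=1.0, max_ug=5.0):
    """Classify a sample against a recommended mass range.

    The defaults are the genomic DNA recommendation (1-5 ug in 50 ul).
    """
    mass = total_mass_ug(concentration_ng_per_ul, volume_ul)
    if mass < min_ug:
        return "too little"
    if mass > max_ug:
        return "above range"
    return "ok"


print(total_mass_ug(20, 50))   # 1.0
print(total_mass_ug(100, 50))  # 5.0
print(check_sample(10))        # too little
print(check_sample(80))        # ok
```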


Array Capture/DNA Sequencing Recommendations

  • Consult with the Bioinformatics Core Facility on the design of a custom microarray that can be used for the Array Capture experiment.
  • Microarray Slides for Array Capture experiments can be designed in the following formats (arrays per slide x features per array): 1 x 1M; 1 x 244K; 2 x 105K.
  • Genomic DNA should be provided as high molecular weight DNA. Preferred sample preparation methods include the Qiagen DNeasy kit and the Qiagen Genomic-Tip System.
  • Do not overload Qiagen columns during DNA purification. Overloading columns can introduce impurities (guanidine, protein, carbohydrate) that can have inhibitory activity during the Illumina library preparation protocol.
  • DNA samples should be treated with RNase. This step should be included and performed during the Qiagen DNeasy or Qiagen Genomic-tip purification methods.
  • Avoid organic extraction methods (such as phenol or Trizol) to purify genomic DNA. Organic carryover can inhibit the enzymatic reactions used in Illumina library preparation. If an organic extraction method is used, this should be followed by purification on a Qiagen spin column. To remove residual organic contamination in a DNA sample it is recommended to use the Qiagen DNeasy kit.
  • The recommended quantity of genomic DNA needed to create an Illumina DNA sequencing library for Array Capture experiments is 1-5 μg of DNA. The genomic DNA sample (50 μl) should be provided to the Microarray Core Facility at a concentration between 20 and 100 ng/μl. A concentration at the high end of this range is preferred.
  • A single library preparation will yield sufficient samples such that numerous lanes of DNA sequence analysis can be performed.
  • Genomic DNA quality can be assessed by running approximately 50 ng of the sample on a 1% agarose gel stained with ethidium bromide. Intact genomic DNA should appear as a high molecular weight (>10,000 bp) band with no lower molecular weight smear. A small amount of low molecular weight smear may be acceptable; however, this should be limited. A significant amount of visible low molecular weight smearing may be detrimental to library generation.
  • The genomic DNA sample will be fragmented by nebulization during the library preparation procedure. This process will uniformly reduce the size of the genomic DNA to an optimal range of 200-800 bp.


ChIP-Seq Recommendations

  • For a general overview, see Wikipedia.
  • Use a robust ChIP protocol or kit, such as those from the Graves Lab, Active Motif, Epigentek, CellSignaling, USB, Imgenex, ...
  • Genomic DNA should be fragmented to a size range of 200-600 bp prior to immunoprecipitation. Preferred methods of fragmentation include the Covaris Adaptive Focused Acoustics technology, sonication with a Diagenode Bioruptor, or enzymatic digestion with NEB's dsDNA Fragmentase (http://www.neb.com/nebecomm/products/productM0348.asp).
  • The size distribution of a matched input DNA for each sample can be verified by running an aliquot of the sample on an Agilent High Sensitivity DNA Bioanalyzer chip (1 ng) or on a 2% agarose gel (50-100 ng). The DNA should show uniform size distribution in the 200-600 bp range. The presence of a substantial fraction of higher molecular weight DNA (>1 kb) indicates incomplete fragmentation and could result in higher risk of failure to generate a library due to an insufficient quantity of DNA in the size range that is selected for the library.
  • Improper fragmentation leads to a PCR bottleneck following library size selection, clumpy data, and thousands of false positive sites of enrichment. For this reason, we discourage the use of probe-based sonicators, as these routinely introduce significant sample-to-sample variability.
  • If a substantial fraction of the sample is larger than the 200-600 bp range, PCR artifacts can be introduced into the library due to insufficient template in the selected size range.
  • Gloves and filter tips should be used during all stages of handling ChIP-DNA or Input DNA. The quantity of immunoprecipitated DNA used for ChIP-seq library preparation is very small. The failure to use gloves or filter tips can contaminate the sample with human DNA from skin or by DNA that is introduced through the aerosol of pipettes. These sources of DNA can introduce a significant contribution to the final amplified ChIP-DNA library.
  • The recommended input for an Illumina ChIP-seq library is 10 ng of ChIP DNA. It often may require multiple immunoprecipitations to achieve this quantity. Quantities of less than 5 ng of ChIP-DNA can introduce significant bias into the ChIP-seq library due to preferential amplification of specific library members during the library preparation process. This can result in ChIP-seq data that has a significant level of false positive background.
  • The ChIP DNA samples should be provided to the Microarray Core Facility in a volume of 30 μl. The Microarray Core Facility can assist with concentrating samples that are provided in volumes of up to 100 μl.
  • A single library preparation will yield sufficient samples such that numerous lanes of DNA sequence analysis can be performed.
  • If known positives and negatives are available, perform qPCR prior to ChIP-Seq sample submission to verify enrichment for the regions of interest.
  • Additional sample from each ChIP preparation should be saved for future qPCR validation of the enriched regions.
  • Avoid organic extraction methods (such as phenol or Trizol) to purify genomic DNA (input) and ChIP-DNA samples. Organic carryover can inhibit the enzymatic reactions used in Illumina library preparation. If an organic extraction method is used, this should be followed by purification on a Qiagen spin column. Preferred sample purification kits include the Qiagen PCR purification kit.
  • Do not use salmon sperm DNA, calf thymus DNA or other DNA based carriers as a blocking agent at any step during the immunoprecipitation process. Carrier DNA that is added to your sample can function as template during Illumina library preparation and will contribute to the sequence reads along with your sample.
  • Magnetic beads (Dynabeads) do not readily absorb random DNA from the immunoprecipitation cocktail and are therefore preferred over Sepharose, Sephadex, etc.
  • ChIP DNA sample concentration is very low and reliable readings cannot be obtained using a spectrophotometer. Accurate concentration readings can be obtained for these samples using a PicoGreen assay such as that performed on an Invitrogen Qubit.
  • For higher eukaryotes, 10 million mapped reads per sample are recommended for performing a whole genome analysis. For high quality ChIP samples, this quantity of reads can be acquired from two lanes of sequence analysis on a GAII.
  • It is recommended to initially run one lane of sequence for each newly prepped sample to validate that the ChIP preparation shows enrichment and to define the quality of the data. Additional lanes of sequence on the same library preparation can be requested for samples that pass this initial level of analysis.
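
The read-depth guidance above (10 million mapped reads, typically from two lanes, with one pilot lane run first) implies roughly 5 million mapped reads per lane for a high-quality sample. A minimal sketch of the lane arithmetic follows; the function names are hypothetical, and the real mapped-read yield of a pilot lane should replace the example figures.

```python
import math


def lanes_needed(target_mapped_reads, mapped_reads_per_lane):
    """Number of whole lanes required to reach the target mapped-read count."""
    if mapped_reads_per_lane <= 0:
        raise ValueError("mapped_reads_per_lane must be positive")
    return math.ceil(target_mapped_reads / mapped_reads_per_lane)


def additional_lanes_after_pilot(target_mapped_reads, pilot_mapped_reads):
    """Lanes still to request once a single pilot lane has been sequenced.

    Assumes later lanes of the same library yield about as many mapped
    reads as the pilot lane did.
    """
    return max(0, lanes_needed(target_mapped_reads, pilot_mapped_reads) - 1)


print(lanes_needed(10_000_000, 5_000_000))                  # 2
print(additional_lanes_after_pilot(10_000_000, 5_000_000))  # 1
print(additional_lanes_after_pilot(10_000_000, 3_500_000))  # 2
```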



mRNA Sequencing Recommendations

  • RNA should be provided as total RNA. Preferred sample preparation methods include the Qiagen RNeasy kit. If working with plants or fungi, it is recommended to use the Qiagen RNeasy Plant mini kit.
  • RNA samples should be treated with DNase. This step should be included and performed during the Qiagen RNeasy purification methods. DNase for this procedure should be acquired from Qiagen.
  • Do not overload Qiagen columns during RNA purification. Overloading columns can introduce impurities (guanidine, protein, carbohydrate) that can have inhibitory activity during the Illumina library preparation protocol.
  • Avoid organic extraction methods (such as phenol or Trizol) to purify total RNA. Organic carryover can inhibit the enzymatic reactions used in Illumina library preparation and can increase the risk of failure of library generation. If an organic extraction method is used, it is recommended to follow the initial organic extraction with cleanup on a Qiagen RNeasy spin column. Residual DNA can also be removed during this cleanup step by including DNase.
  • The quantity of total RNA recommended for the construction of an Illumina mRNA-seq library is 1-10 μg. The sample should be provided to the Microarray Core Facility in a volume of 50 μl at a concentration between 20 ng/μl (total of 1 μg) and 200 ng/μl (total of 10 μg).
  • A single library preparation will yield sufficient samples such that numerous lanes of DNA sequence analysis can be performed.
  • Total RNA quality will be validated by running an aliquot of the sample on an Agilent Bioanalyzer RNA NanoChip. The client will be contacted following the Bioanalyzer run and prior to library preparation if there are any concerns about the quality of the RNA sample.
  • During the standard mRNA-sequencing library preparation procedure, poly A-enriched RNA will be selected by oligo-dT magnetic beads.
  • If the mRNA fraction of your sample has been enriched by other means (Cap-selection, Ribominus, etc.) or if the sample is of prokaryotic origin, please ask the Core Facility to bypass the poly-A selection step. In this case, the mRNA-seq library protocol can be started with 100 ng of enriched mRNA. This sample should be provided at a concentration of 10 ng/μl in a total volume of 15 μl. The excess sample that is requested will be used to perform appropriate quality control steps.
  • Initially, a single sequencing lane will be run for each newly prepared sample library to validate the library and confirm that high quality data is realized. After the sequence data from this first lane has passed all QC metrics, additional lanes of sequence analysis can then be requested. Runs from different days/months are similar in data quality and can be added to the prior data.
  • A good rule of thumb in determining how many lanes to run is to aim for 100-fold coverage for human, 20-fold for worm, and 10-fold for yeast. For complex organisms, each lane of sequencing generates ~5-10 million reads x 35 bp, yielding about 140 million mapped bases. For simple organisms with little repetitive DNA, expect ~8-12 million reads x 35 bp, yielding about 210 million mapped bases per lane.
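
The coverage rule of thumb can be turned into a lane count: mapped bases per lane equal mapped reads times read length (140 million bases at 35 bp is about 4 million mapped reads; 210 million is about 6 million), and fold coverage is total mapped bases divided by the size of the target. The sketch below assumes a user-supplied target size in bases; the 50 million base target in the example is a made-up value for illustration only, and the function names are hypothetical.

```python
import math


def mapped_bases(mapped_reads, read_length_bp):
    """Total mapped bases contributed by a set of reads."""
    return mapped_reads * read_length_bp


def lanes_for_coverage(fold, target_size_bases, mapped_bases_per_lane):
    """Whole lanes needed to reach `fold` coverage of a target of the given size."""
    if mapped_bases_per_lane <= 0:
        raise ValueError("mapped_bases_per_lane must be positive")
    return math.ceil(fold * target_size_bases / mapped_bases_per_lane)


# About 4 million mapped 35 bp reads give the 140 million bases quoted above.
print(mapped_bases(4_000_000, 35))  # 140000000

# Hypothetical example: 20-fold coverage of a 50 million base target
# with 210 million mapped bases per lane.
print(lanes_for_coverage(20, 50_000_000, 210_000_000))  # 5
```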


Small RNA Sequencing Recommendations

  • RNA should be provided as total RNA. Preferred sample preparation methods include the Qiagen miRNeasy kit. When using Qiagen products, it is essential to use the miRNeasy purification line of products. The standard RNeasy purification kit will largely lose products less than 200 nucleotides in size.
  • RNA samples should be treated with DNase. This step should be included and performed during the Qiagen miRNeasy purification protocol. DNase for this procedure should be acquired from Qiagen.
  • Do not overload Qiagen columns during RNA purification. Overloading columns can introduce impurities (guanidine, protein, carbohydrate) that can have inhibitory activity during the Illumina library preparation protocol.
  • Avoid organic extraction methods (such as phenol or Trizol) to purify total RNA. Organic carryover can inhibit the enzymatic reactions used in Illumina library preparation and can increase the risk of failure of library generation. If an organic extraction method is used, it is recommended to follow the initial organic extraction with cleanup on a Qiagen miRNeasy spin column. Residual DNA can also be removed during this cleanup step by including DNase.
  • The quantity of total RNA recommended for the construction of an Illumina Small RNA Sequencing library is 1-10 μg. The total RNA sample should be delivered to the Microarray Core Facility in a volume of 10 μl at a concentration between 100 ng/μl (total of 1 μg) and 1000 ng/μl (total of 10 μg). This sample volume will also allow us to perform appropriate quality control steps on the sample.
  • A single library preparation will yield sufficient samples such that numerous lanes of DNA sequence analysis can be performed.
  • Total RNA quality will be validated by running an aliquot of the sample on an Agilent Bioanalyzer RNA NanoChip. The client will be contacted following the Bioanalyzer run and prior to library preparation if there are any concerns about the quality of the RNA sample.
  • The RNA sample can be delivered to the Microarray Core Facility as total RNA. It is not necessary to separate or purify the small RNA fraction from the total RNA.


Analysis

  • The Illumina sequencing pipeline will be used to convert the images to reads and then map the reads to a reference genome. The ELAND mapped reads (xxx_sorted.txt) and unmapped reads (xxx_sequence.txt) will be made available to you through GNomEx. Additional files will be transferred if you provide an external hard drive (0.5-1 TB). All files, excluding image files, are archived to tape.
  • The USeq project, developed by the core, is a good start for additional analysis.
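
As a first sanity check on delivered data, it can help to count the reads and bases in a reads file. The sketch below assumes the xxx_sequence.txt file uses the common four-line FASTQ layout (header, sequence, separator, qualities); that layout is an assumption here and should be confirmed against the pipeline user guide. The function name `count_reads` is hypothetical.

```python
def count_reads(path):
    """Count records and total bases in a four-line-per-record FASTQ-style file.

    Assumption: every record spans exactly four lines and the second
    line of each record holds the sequence.
    """
    n_reads = 0
    n_bases = 0
    with open(path) as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:  # sequence line of the current record
                n_reads += 1
                n_bases += len(line.strip())
    return n_reads, n_bases


# Usage (the path is a placeholder for a file downloaded from GNomEx):
#   reads, bases = count_reads("xxx_sequence.txt")
```

Comparing the base count with the per-lane figures quoted in the mRNA section gives a quick indication of whether a lane performed as expected.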