Microarray Core Facility

PacBio RS II Sequencing

Welcome to the PacBio RS II Sequencing Service at the University of Michigan's DNA Sequencing Core.

PacBio sequencing is available campus wide and to external clients [upcharge required for indirect cost recovery].

The PacBio RS II Sequencer is a next-generation platform from Pacific Biosciences that records and identifies fluorescently tagged base additions to polymerase-bound DNA templates in real time by laser excitation. The PacBio RS sequencer at the University of Michigan DNA Sequencing Core was installed in July 2011 and began running investigators' samples in August 2011. The system was upgraded in August 2013 to the RS II configuration for additional capabilities.

Advantages:

  • No Amplification - The DNA you send us will be made into libraries by attaching end adapters and then read by the sequencer as individual templates.
  • A choice of high-accuracy Circular Consensus short reads (2 Kbp or less) for de novo sequencing and targeted variant resequencing, or Long Reads (10 Kbp to 20 Kbp) for contig scaffolding and for helping to align Illumina HiSeq sequences (hybrid analysis).
  • Cost effective for projects where 40,000 reads per sample is sufficient data.
  • Able to read in high GC areas and across mononucleotide or low complexity repeats.
  • Average read lengths of 3000-4000 bp or 5000 bp, depending on the project mode. These are averages; the longest reads can approach 15 to 20+ Kbp. Each read comes from a unique template, not an amplified or averaged cluster.
  • Methylation analysis available - we set the necessary run conditions and your bioinformaticist does the analysis.
  • Results are provided as FASTA, FASTQ, and HDF5 output files. See the PacBio Data and Bioinformatics Resource Guide. Any template short enough to be read two times (generally 2000 bp or less) will also be included in a file with the end adapters removed and the two reads combined into a higher-accuracy circular consensus sequence (CCS).
  • Errors have no known bias and are considered to be random. This greatly aids in the ability to accurately call SNPs and polymorphisms.
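
As a quick check on delivered FASTQ files, the following is a minimal sketch that counts reads and reports the mean and longest read length. It assumes plain four-line FASTQ records with no blank lines inside a record; the function names are illustrative and are not part of any PacBio software.

```python
from statistics import mean


def read_lengths(fastq_lines):
    """Return the sequence length of every record in four-line FASTQ text."""
    # Drop blank lines (e.g. a trailing newline at the end of the file).
    lines = [ln.rstrip("\n") for ln in fastq_lines if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input is not a whole number of 4-line records")
    lengths = []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        lengths.append(len(seq))
    return lengths


def summarize(fastq_lines):
    """Summarize read count, mean read length, and longest read."""
    lengths = read_lengths(fastq_lines)
    if not lengths:
        return {"reads": 0, "mean_length": 0, "longest": 0}
    return {
        "reads": len(lengths),
        "mean_length": mean(lengths),
        "longest": max(lengths),
    }
```

To use it on a file, pass an open file handle, for example `summarize(open("my_reads.fastq"))`, where the file name is a placeholder.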

Disadvantages:

  • Less data is generated per run than from an Illumina HiSeq or LifeTech Ion Torrent; however, you also pay less. For smaller-scale projects this can be an advantage.
  • Sensitive to damaged DNA, ethidium bromide exposure, RNA, and many untestable contaminants.  Genomic DNA preps have had occasional failures due to unknown contamination that inhibits polymerase activity (low level of reads).
  • There is soon to be a new chemistry, allowing even longer reads than current specs and some additional tools for cDNA analysis. 
  • The sequencing errors of the continuous long reads (CLR) are mostly insertions and/or deletions (indels) and can suffer from reference bias (during the alignment of reads).   

Typical Project Types:

Project | Uses | Min. Reads / SMRT Cell | Avg. Read Length (chemistry) | DNA Template Size
Short Template Size / High Accuracy Reads / CCS | De novo sequencing or variant analysis | 40,000 | 3000-4000 bp (P4-C2) | Uses 100 bp to 2,000 bp fragments, with multiple sequencing passes around the same circular template to get very high accuracy.
Long Mode / Lower Accuracy | Contig scaffolding and Illumina hybrid | 40,000 | 3000-4000 bp (P4-C2) | Anything >2,000 bp up to 20 kb. 12-15% random error rate, mostly indels.
Very Long Mode / Lowest Accuracy | Maximized length for scaffolding | 40,000 | 5000 bp (P4-XL) | DNA fragments >= 10 kb. Sacrifice some accuracy (~2% more) for the longest possible reads.
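
The figures in the table translate into a rough per-cell yield: minimum reads multiplied by average read length. The sketch below shows that arithmetic and uses it to estimate how many SMRT cells a target coverage would need. The 3500 bp midpoint of the 3000-4000 bp range, the 5 Mbp genome, and the 100x coverage are illustrative assumptions, not facility recommendations.

```python
import math


def yield_per_cell_bp(min_reads, avg_read_length_bp):
    """Approximate bases per SMRT cell: reads times average read length."""
    return min_reads * avg_read_length_bp


def cells_for_coverage(genome_size_bp, target_coverage, min_reads, avg_read_length_bp):
    """Estimate SMRT cells needed to reach a target raw coverage."""
    bases_needed = genome_size_bp * target_coverage
    per_cell = yield_per_cell_bp(min_reads, avg_read_length_bp)
    return math.ceil(bases_needed / per_cell)


# 40,000 reads at an assumed 3500 bp average is 140 Mbp per SMRT cell.
p4_c2_yield = yield_per_cell_bp(40_000, 3_500)

# Hypothetical 5 Mbp genome at 100x raw coverage.
cells_p4_c2 = cells_for_coverage(5_000_000, 100, 40_000, 3_500)
cells_p4_xl = cells_for_coverage(5_000_000, 100, 40_000, 5_000)
```

Because 40,000 is the stated minimum number of reads, these estimates lean conservative; actual yield per cell may be higher.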

Applications

  • De Novo Assembly
  • Targeted Sequencing
    • Compound Mutations and Haplotype Phasing - Multi-kilobase reads facilitate the study of linked mutations hundreds, even thousands, of bases apart.
    • Repeat Expansions - Long reads and low bias allow accurate sequencing across repeat expansions, even in low complexity regions.
    • Full-Length Transcripts and Splice Variants - Single-molecule resolution and long reads span entire cDNAs, allowing full characterization of splicing in the transcriptome.
    • Minor Variants and Quasispecies - Single molecule sequencing simplifies the analysis of mixed populations of sequences. Exquisitely sensitive and specific.
    • SNP Detection and Validation - Single molecule sequencing detects and validates SNPs with high accuracy by avoiding mapping errors and systematic errors.
  • Base Modification Detection.
    • Resolve strand-specific modifications - unamplified double-stranded input DNA enables detection of strand-specific modifications such as hemimethylation
    • Single-base resolution - visualize a wide variety of base modifications in high-resolution
    • Integrated sequence and chemical modification detection - epigenetic and other base modification data are gathered in the sequencing process, as an integral part of the sequencing workflow
    • Hypothesis-free detection - no prior knowledge of the modification is needed, allowing discovery of unknown or unexpected modifications
  • Adapter-free single pass sequencing - Inquire about options for this.
  • High accuracy circular consensus for microsatellite analysis.


Prices:

  • SpeedVac concentration: inquire
  • AMPure XP bead purification (e.g. unpurified PCR product or incompatible buffer): inquire
  • Shearing (250 bp to 5 Kbp) by Covaris S2 sonicator: $15.00 per sample
  • Shearing (10-20 Kbp) by Covaris G-Tube: $40.00 per sample
  • Basic library preparation (initial concentration check, library prep to template, analysis of template): $250.00 per sample
  • Sequencing, annealing, and binding of template on PacBio RS (P4 binding with C2 or XL reagents): $190.00 per SMRT cell
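
For budgeting, the listed prices can be combined into a simple estimate. This is a rough sketch only: it covers the fixed-price items above, leaves out the "inquire" services and the external-client upcharge (neither amount is stated on this page), and uses made-up function and key names.

```python
# Listed prices in US dollars, copied from the price list above.
SHEARING_PRICE = {
    None: 0,          # sample needs no shearing
    "covaris_s2": 15,  # 250 bp to 5 Kbp
    "g_tube": 40,      # 10-20 Kbp
}
LIBRARY_PREP_PRICE = 250   # per sample
SMRT_CELL_PRICE = 190      # per SMRT cell


def estimate_cost(samples, smrt_cells_per_sample, shearing=None):
    """Estimate total cost for samples that all use the same options."""
    if shearing not in SHEARING_PRICE:
        raise ValueError(f"unknown shearing option: {shearing!r}")
    per_sample = SHEARING_PRICE[shearing] + LIBRARY_PREP_PRICE
    sequencing = samples * smrt_cells_per_sample * SMRT_CELL_PRICE
    return samples * per_sample + sequencing


# Two samples, G-Tube shearing, four SMRT cells each:
# 2 * (40 + 250) + 8 * 190 = 2100
example_total = estimate_cost(samples=2, smrt_cells_per_sample=4, shearing="g_tube")
```

Treat the result as a starting point for discussion with the core; the final quote may differ.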

Submitting Samples for PacBio Sequencing

  1. Contact us (joewashb@umich.edu), open an account and have it enabled for PacBio sequencing. If you are a researcher in a PI's lab, please get a user ID within the PI's account and submit using your own user ID.

  2. Prepare the DNA. Download PacBio's Full Guidelines on Sample Preparation.

    You must follow these guidelines.  The Pacific Biosciences template preparation process does not utilize amplification techniques. As a result, input DNA quality will be directly reflected in sequencing results. Any DNA damage present in the input material (abasic sites, nicks, interstrand crosslinks, modified bases, etc.) or DNA polymerase inhibitors will result in impaired performance in the system.

  3. Submit the appropriate volume and concentration for your project:

    Ask us if you are unsure but in most cases a single library preparation from one sample requires:

    Protocols by Fragment Size:

    Fragment Length | Max Vol. for Shearing | Min. Mass for Shearing | Max Vol. of DNA Fragments (no shearing required) | Min. Mass of DNA Fragments (no shearing required)
    250 bp | 120 ul | 500 ng | 23 ul | 250 ng
    500 bp | 120 ul | 500 ng | 23 ul | 250 ng
    1000 bp | 120 ul | 1000 ng | 23 ul | 500 ng
    5 Kbp | 200 ul | 5 ug | 39 ul | 2.5 ug
    10 Kbp | 150 ul | 10 ug | 39 ul | 5 ug

    * If we have to speedvac your sample to a reasonable volume before starting the shearing or library preparation, you will be assessed an additional charge. Dilutions are not an issue but concentrating down large volumes is time consuming and delays the library production.

    * The number of SMRT Cells that can be run from one library is variable, but typically at least 16 are possible.


  4. Enter the samples online here. If you have discussed with us a project option that is not on the form, just mention it in the included text box.

  5. Drop off the sample at one of the two locations that accept PacBio samples: MSRBII or NCRC.

  6. Your results can be downloaded by FTP (for free) or provided on a hard drive (at an additional cost).

  7. Contact Chris McHenry (3-8896 or cmchenry at umich.edu) if you have any questions.
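
Before submitting, the volume and mass requirements in step 3 can be checked with a small lookup. The sketch below transcribes the fragment-size table (masses converted to ng) and flags samples that fall short on mass or exceed the maximum volume. The function name, the dictionary layout, and the messages are illustrative assumptions, not an official submission tool.

```python
# (max volume in ul, min mass in ng), transcribed from the table in step 3.
REQUIREMENTS = {
    250:    {"shear": (120, 500),    "no_shear": (23, 250)},
    500:    {"shear": (120, 500),    "no_shear": (23, 250)},
    1000:   {"shear": (120, 1000),   "no_shear": (23, 500)},
    5000:   {"shear": (200, 5000),   "no_shear": (39, 2500)},
    10000:  {"shear": (150, 10000),  "no_shear": (39, 5000)},
}


def check_sample(fragment_bp, volume_ul, conc_ng_per_ul, needs_shearing):
    """Return a list of problems; an empty list means the sample fits the table."""
    if fragment_bp not in REQUIREMENTS:
        return [f"{fragment_bp} bp is not a listed fragment size; ask the core"]
    key = "shear" if needs_shearing else "no_shear"
    max_vol_ul, min_mass_ng = REQUIREMENTS[fragment_bp][key]
    mass_ng = volume_ul * conc_ng_per_ul
    problems = []
    if mass_ng < min_mass_ng:
        problems.append(f"mass {mass_ng} ng is below the {min_mass_ng} ng minimum")
    if volume_ul > max_vol_ul:
        problems.append(
            f"volume {volume_ul} ul exceeds {max_vol_ul} ul; "
            "concentrating it would incur the SpeedVac charge"
        )
    return problems
```

For example, `check_sample(1000, 20, 30, needs_shearing=False)` describes 600 ng in 20 ul and returns an empty list, while a 10 Kbp sample at 50 ng/ul in 50 ul without shearing is flagged for both mass and volume.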

How to Retrieve Your Results:

It's free and easiest to retrieve your results by FTP (instructions below) but we can provide the data on a hard drive if you want to purchase one. We can also put the files onto a hard drive that you provide.

Instructions to download your data by FTP:

If you do not have an FTP client please google the free program FileZilla and install it.
The information you will need to connect is:
hostname: ruddle.brcf.med.umich.edu
Login: your PI's login
Password: Your lab's password
Port: leave default (blank in FileZilla).

Your data will be available for two weeks after email notification before being automatically deleted.  Please download your data ASAP.
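
If you prefer a script to a graphical client, the same download can be sketched with Python's standard `ftplib` module. The hostname is the one given above; the login and password are whatever your lab was issued. The sketch assumes the login directory contains only plain files (no subfolders), which may not match how your results are actually laid out.

```python
import os
from ftplib import FTP

HOST = "ruddle.brcf.med.umich.edu"  # hostname from the instructions above


def download_all(ftp, dest_dir):
    """Download every entry in the current FTP directory into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for name in ftp.nlst():
        local_path = os.path.join(dest_dir, os.path.basename(name))
        with open(local_path, "wb") as fh:
            # RETR streams the remote file; each chunk is written to disk.
            ftp.retrbinary("RETR " + name, fh.write)
        saved.append(local_path)
    return saved


def fetch_results(user, password, dest_dir="pacbio_results"):
    """Connect on the default port, log in, and download everything."""
    with FTP(HOST) as ftp:
        ftp.login(user=user, passwd=password)
        return download_all(ftp, dest_dir)
```

Calling `fetch_results("your_login", "your_password")` with your own credentials would save the files into a local `pacbio_results` folder; that folder name is just a default chosen for this example.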

PacBio Sequencing in More Detail:

An Overview:

  1. The DNA Template library consists of double stranded fragments of DNA from about 250bp to 20Kbp in length with single stranded end adapters. Fragments with adapters are called SMRTbell templates. Primers are then annealed to the end adapters and bound with polymerases. The end adapters make the double stranded DNA fragments into circular templates on which the polymerase will continue to run until it becomes inactive or until the end of the observation period. A read containing multiple passes around the circular template can be condensed into a higher accuracy consensus sequence. Requirements for DNA to be submitted for library preparation are listed above.  Here is the latest RS II brochure from PacBio and sequencing guide.

  2. The SMRT Cell contains 150,000 wells called zero-mode waveguides (ZMWs). A minimum of 40,000 ZMWs will contain readable templates and produce sequencing data. Each ZMW is visualized by a laser beamlet to record additions of fluorophore-tagged bases to a DNA template by a polymerase attached to the bottom of the ZMW. Each SMRT Cell is loaded with one sample, and up to 12 SMRT Cells may be run within a 24-hour day.

  3. The PacBio RS II (Realtime Sequencer) has a five-blade computing center to record the sequencing results and provide basecalls and primary analysis. Two 1-watt lasers are each broken into 150,000 beamlets to read each of the wells on a SMRT Cell.

  4. We use Pacific Biosciences' provided primary and secondary analysis software for quality control. Software is also available for your bioinformatics contact.

  5. Typical project examples are listed in the table above. Template protocols are based on fragment size in bins of 250 bp, 500 bp, 1 Kbp, 5 Kbp and 10+ Kbp. If your sample does not match one of the bins or has a range of sizes (e.g. some PCR products, cDNAs, or pooled fragments), ask us and we can probably find a way to accommodate it. An annealing/binding calculator is used to determine details for preparing completed SMRTbell templates for sequencing based on the bins, so nonstandard sizes are less dependable but generally can be made to work.
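
Item 1 above notes that a read containing multiple passes around the circular template can be condensed into a consensus sequence. As a back-of-the-envelope sketch, the number of passes can be approximated by dividing the read length by the template length. Adapter length is not stated on this page, so it is an optional parameter that defaults to zero; the two-pass threshold mirrors the "read two times" wording used earlier. Both function names are hypothetical.

```python
def estimated_passes(read_length_bp, insert_bp, adapter_bp=0):
    """Approximate passes around a SMRTbell: read length / (insert + adapter)."""
    if insert_bp <= 0:
        raise ValueError("insert_bp must be positive")
    return read_length_bp / (insert_bp + adapter_bp)


def likely_ccs(read_length_bp, insert_bp, min_passes=2, adapter_bp=0):
    """True if the read is long enough to cover the insert min_passes times."""
    return estimated_passes(read_length_bp, insert_bp, adapter_bp) >= min_passes


# A 3500 bp read on a 1000 bp insert covers it about 3.5 times.
passes_1kb = estimated_passes(3500, 1000)
# A 4000 bp read on a 2000 bp insert just reaches two passes.
passes_2kb = estimated_passes(4000, 2000)
```

This is consistent with the guidance that templates of roughly 2000 bp or less are the ones that yield circular consensus reads at average read lengths of 3000-4000 bp.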

(This webpage was last updated on 9/16/2013. Protocols and services are constantly being upgraded; ask us if something you want to request is not yet listed here.)