SPRITE Protocol DNA January 2018
SPRITE Protocol
Table of Contents:
1 Materials
1.1 Solutions
1.2 Equipment
1.3 Additional Materials and Reagents
2 Cell Culture
3 Adaptor and Barcode Design
3.1 DNA Phosphate Modified (DPM) Adaptor
3.2 Odd and Even Tags
3.3 Terminal Tag
3.4 Final Library Amplification
3.5 DPM primers for QC of DPM ligation
3.6 Adaptor annealing program
4 Sample Preparation
4.1 Formaldehyde-DSG Crosslinking
4.2 Cell Lysis
4.3 DNA Fragmentation
5 SPRITE Pre Split-and-Pool and QC Pt. 1
5.1 NHS Coupling
5.2 Phosphorylation and End Repair
5.3 DPM Adaptor Ligation
5.4 QC: Check to Determine Ligation Efficiency of the DPM Adaptor
6 SPRITE and Library Preparation Pt. 2
6.1 SPRITE
6.2 Library Preparation Pt. 2
6.3 Estimating Sequencing Depth
7 Sequencing and Data Analysis
7.1 Tag identification
7.2 Alignment
7.3 Filtration
7.4 Subsequence post-processing
7.5 Quality Controls of Successful SPRITE Libraries
1 Materials
1.1 Solutions
DSG Crosslinking Solution
1X PBS
2mM DSG in DMSO

Scraping Buffer
1X PBS pH 7.5
0.5% BSA
Store at 4 °C

Cell Lysis Buffer A
50mM Hepes pH 7.4
1mM EDTA
1mM EGTA
140mM NaCl
0.25% Triton-X
0.5% NP-40
10% Glycerol

Cell Lysis Buffer B
10mM Tris pH 8
1.5mM EDTA
1.5mM EGTA
200mM NaCl

Cell Lysis Buffer C
10mM Tris pH 8
1.5mM EDTA
1.5mM EGTA
100mM NaCl
0.1% DOC
0.5% NLS

10x Annealing Buffer
100mM Tris-HCl pH 7.5
2M LiCl
2mM EDgTA

10x DNase Buffer
200mM Hepes pH 7.4
1M NaCl
0.5% NP-40
5mM CaCl2
25mM MnCl2

25x DNase Stop Solution
250mM EDTA
125mM EGTA

MyRNK Buffer
20mM Tris pH 7.5
100mM NaCl
10mM EDTA
10mM EGTA
0.5% Triton-X
0.2% SDS

Coupling Buffer
1X PBS
0.1% SDS

RLT++ Buffer
1X Buffer RLT supplied by Qiagen
10mM Tris pH 7.5
1mM EDTA
1mM EGTA
0.2% NLS
0.1% Triton-X
0.1% NP-40

M2 Wash Buffer
20mM Tris pH 7.5
50mM NaCl
0.2% Triton-X
0.2% NP-40
0.2% DOC

PBLSD+ Wash Buffer
1X PBS
5mM EDTA
5mM EGTA
5mM DTT (add fresh)
0.2% Triton-X
0.2% NP-40
0.2% DOC
Note 1: RLT++ Buffer contains guanidine thiocyanate which when mixed with
bleach produces hydrogen cyanide gas and hydrogen chloride gas. Be careful to
ensure that all liquid RLT++ Buffer waste is disposed of in its own waste container.
Solids that have touched RLT++ Buffer such as tips and reservoirs should also be
discarded in a separate solid RLT++ Buffer container.
Note 2: DTT has a short half-life at pH 7.4 at 20C. It is important to keep PBLSD+
Buffer on ice during the procedure and frozen at -20C if not in use.
1.2 Equipment
Microcentrifuge
Plate Centrifuge
Sonication instrument and chiller
Gel Electrophoresis Equipment
Qubit Fluorometer
Eppendorf Thermomixer
Eppendorf SmartBlock 1.5mL thermoblock
Eppendorf SmartBlock PCR 96 thermoblock
Magnetic rack for 1.5mL tubes (e.g. Invitrogen DynaMag-2)
Magnetic rack for 15mL conical tubes
Magnetic rack for 96 well plate
PCR machine
Agilent Bioanalyzer
2 Cell Culture
The above figure demonstrates the adaptor and tag scheme that is central to the
SPRITE process. SPRITE uses a split-and-pool strategy to uniquely barcode all
molecules within a crosslinked complex by repeatedly splitting all complexes into a
96-well plate, ligating a specific tag sequence within each well, followed by pooling
of these complexes such that the final product contains a series of tags ligated to
each molecule, which we refer to as a barcode.
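As a rough illustration of the split-and-pool idea described above, the following Python sketch places each simulated complex in a random well in every round and records the resulting series of tags. The function names, the number of complexes, and the seeded random generator are assumptions made for illustration; they are not part of the SPRITE protocol or its pipeline.

```python
import random

WELLS_PER_PLATE = 96  # one tag sequence per well of a 96-well plate


def split_and_pool(n_complexes, n_rounds, seed=0):
    """Simulate barcoding: each complex lands in a random well every round.

    Returns one barcode per complex as a tuple of well indices. All molecules
    in a complex travel together, so they would all carry this same tuple.
    """
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(WELLS_PER_PLATE) for _ in range(n_rounds))
        for _ in range(n_complexes)
    ]


def possible_barcodes(n_rounds):
    """Number of distinct tag series after n_rounds of 96-well tagging."""
    return WELLS_PER_PLATE ** n_rounds


codes = split_and_pool(n_complexes=1000, n_rounds=5)
print("possible barcodes after 5 rounds:", possible_barcodes(5))
print("distinct barcodes among 1000 simulated complexes:", len(set(codes)))
```

The number of possible barcodes grows as 96 to the power of the number of rounds, which is why a few rounds are enough to make it unlikely that two different complexes receive the same series of tags.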
3.1 DNA Phosphate Modified (DPM) Adaptor
5'Phos  AAACACCCAAGATCGGAAGAGCGTCGTGTA 3’ Spcr
        ||||||||||||||||||||
3'     TTTTGTGGGTTCTAGCCTTCTGTACTGTTCAGT 5’Phos
The above dsDNA molecule is an example of one of the 96 DPM adaptors used
during our process. The 5’ end of the molecule has a modified phosphate group that
allows for the ligation between DPM and the target DNA molecules as well as the
subsequent tag. The highlighted regions on DPM have the following functions:
1. The yellow T overhang is a sticky end that ligates to our target DNA
molecules, which are given a 3’ A overhang by dA-tailing after end repair.
2. The pink region is the 9-nucleotide sequence unique to each of the 96 DPM
adaptors. These unique sequences help to identify post-sequencing DNA
molecules that are in a complex.
3. The green sequence is a sticky end that ligates to the first tag.
4. The grey sequence is complementary to the First Primer used for library
amplification. Part of the grey sequence makes up a 3’ spacer to prevent the
top strand of the Odd tag from ligating, and only the bottom
5’phosphorylated sticky end of the Odd tag will ligate to the green tag. Its
purpose is discussed in section 3.4.
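The geometry of the example DPM adaptor can be checked with a short Python sketch. It takes the two strands exactly as printed above, shifts the bottom strand by one base, and counts how many consecutive positions are Watson-Crick paired. The one-base offset and the helper name `paired_length` are assumptions used only for this check.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Sequences copied from the example DPM adaptor above.
TOP_5_TO_3 = "AAACACCCAAGATCGGAAGAGCGTCGTGTA"
BOTTOM_3_TO_5 = "TTTTGTGGGTTCTAGCCTTCTGTACTGTTCAGT"  # written 3'->5', as in the diagram


def paired_length(top, bottom_3to5, offset):
    """Count consecutive Watson-Crick pairs starting at top[0] and bottom[offset]."""
    n = 0
    for t, b in zip(top, bottom_3to5[offset:]):
        if COMPLEMENT[t] != b:
            break
        n += 1
    return n


# With an offset of 1, one unpaired base remains at the 3' end of the bottom strand.
overhang = BOTTOM_3_TO_5[:1]
duplex_bp = paired_length(TOP_5_TO_3, BOTTOM_3_TO_5, offset=1)
print("3' overhang on bottom strand:", overhang)
print("paired bases in the duplex region:", duplex_bp)
```

The paired stretch found this way has the same length as the row of `|` characters in the diagram, and the remaining bases on each strand are the unpaired single-stranded ends.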
2P_barcoded_85 (R primer)
5’ CAAGCAGAAGACGGCATACGAGATGCCTAGCCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT 3’
Due to reverse complementarity of the sequences, only one primer amplifies the
tagged DNA in the first PCR cycle. This First Primer anneals to a sequence in the
DPM adaptor and extends, synthesizing two daughter strands with reverse
sequences. This first primer serves as the Read1 primer during Illumina
sequencing. To synthesize the complement, the Second Primer anneals to the
daughter strand extended from the First Primer in the second PCR cycle.
The 2P_barcoded primer contains an 8-nucleotide barcode within the primer. This
barcode is read by the Illumina sequencer during the index priming step, and it
effectively serves as an additional round of tag addition during SPRITE.
The sample is diluted into multiple wells at the final step of SPRITE,
prior to proteinase K elution from the NHS beads, so that each dilution
isolates a subset of the tagged complexes in a different well.
Each dilution of complexes is amplified with a different 2P_barcoded primer.
Both the First and Second primers are around 30 nucleotides each. Yet the
sequences they anneal to initially are ~20 nucleotides. For this reason, we set two
different annealing temperatures during the final library PCR. The first annealing
temperature is for the first four cycles until enough copies are made with fully
extended primer regions. After these four cycles, the annealing temperature is
raised for a remaining five cycles.
The 2P_universal primer and 2P_barcoded serve as the Read 1 and Read 2 primers
for Illumina sequencing, respectively. Read 1 sequences the DNA molecule and the
DPM adaptor. Read 2 sequences the multiple tags, i.e. the unique barcode, ligated to the
DNA molecules.
3.5 DPM primers for Quality-Control of DPM ligation
These primers are used to ensure that the DPM adaptor has been successfully
ligated to DNA of the lysate. If no libraries are obtained at this step after 14-16
DPMQCprimerF 5’ TACACGACGCTCTTCCGATCT 3’
DPMQCprimerR 5’ TGACTTGTCATGTCTTCCGATCT 3’
The Forward and Reverse primers amplify the top strand and bottom strand of the
DPM adaptor, respectively (section 3.1).
3.6 Adaptor annealing program
The following adaptors are annealed to make the tags double-stranded adaptors for
dsDNA adaptor ligation:
1. DPM adaptors
2. Odd adaptors
3. Even adaptors
4. Terminal Tag adaptors
Mix the top and bottom strands of each adaptor into a PCR tube or 96-well plate
with 10x Annealing Buffer:
Reagents Volume
10x Annealing Buffer 10uL
Top Adaptor (200μM) 45uL
Bottom Adaptor (200μM) 45uL
Total 100uL
Incubate in a thermocycler with the following conditions. The program first
denatures any secondary structure within the top and bottom strands of each
adaptor, then cools slowly so that the two strands anneal:
Step           Temperature (°C)   Time (mm:ss)   Ramp (°C/s)   Cycle
Denaturation   95                 02:00
Annealing      85                 00:10          -1            60
Hold           25                 Infinite
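The numbers in the annealing recipe and program can be sanity-checked with a small Python sketch. It computes the concentration of each strand in the 100uL mix, and it lists the temperatures of the annealing step under the assumption that the -1 in the Ramp column is applied once per cycle over the 60 cycles. That per-cycle reading is an assumption, as are the function names; how the ramp is actually programmed depends on the thermocycler.

```python
def annealed_concentration_uM(stock_uM, strand_volume_uL, total_volume_uL):
    """Concentration of each strand after mixing, in uM."""
    return stock_uM * strand_volume_uL / total_volume_uL


def step_down_temperatures(start_c, step_c, cycles):
    """Temperature after each cycle, ASSUMING the ramp value is applied once per cycle."""
    return [start_c + step_c * (i + 1) for i in range(cycles)]


strand_uM = annealed_concentration_uM(stock_uM=200, strand_volume_uL=45, total_volume_uL=100)
temps = step_down_temperatures(start_c=85, step_c=-1, cycles=60)

print("each strand in the annealing mix (uM):", strand_uM)
print("first and last annealing temperature (C):", temps[0], temps[-1])
```

Under this reading, the last annealing temperature equals the 25 °C of the Hold step, so the program cools continuously from 85 °C down to the hold temperature.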
4 Sample Preparation
Goal: Crosslink cells to fix in vivo RNA-DNA-Protein complexes with disuccinimidyl
glutarate (DSG) and formaldehyde crosslinkers. Lyse cells and fragment DNA to
appropriate sizes via sonication and DNase.
Optimization of lysis conditions (amount of sonication, amount/timing of DNase) is
a critical step in establishing the protocol for the first time. The length of sonication
might vary from 30sec to several minutes and DNase treatment might vary from 10
to 20 minutes, depending on cell number, ploidy, crosslinking strength, and the
desired DNA fragment size. To optimize DNase timing and conditions, remove 5 µL
lysate aliquots every 2-4 minutes, quench with EDTA and EGTA on ice, and assay
DNA sizes for each time point as described in the protocol. If an appropriate
combination of solubilization and DNA fragment sizes cannot be obtained by
varying the amount of sonication or DNase, then reducing the strength of the
crosslinking may be necessary.(1)
4.1 Formaldehyde-DSG Crosslinking
1. Grow adherent cells on 15-cm plates. Before crosslinking, count one plate.
This protocol details crosslinking multiple plates of cells in one suspension,
but it is important to maintain consistency in lysate batches. We typically
store cells in 10M pellets.
2. An hour before starting, warm TVP and wash solution at 37C. Chill one bottle
of PBS at 4C, keep one at room temperature.
3. Lift cells from plate and wash: Remove media from plates. Add 5mL TVP to
each 15cm plate and rock gently for 3-4 minutes. Afterwards, add 25mL
wash solution to each plate. Vigorously suspend cells in the wash solution
and transfer from plate to a 50mL conical tube. Rinse the plate with extra
wash solution and add to the 50mL conical. Pellet in a centrifuge for 3
minutes at 3300 X G at room temperature. Wash cells by resuspending in
4mL room temperature 1X PBS per 10M cells and transfer to a 15mL conical.
Pellet again.
4. Resuspend cells in DSG Crosslinking Solution, 4mL per 10M cells. Rock
gently at room temperature for 45 minutes. NOTE: Upon addition of
crosslinkers to the pellet it is critical to pipette up and down repeatedly to
break up any clumps of cells to avoid any cell-cell crosslinking.
5. Pellet cells for 4 minutes at 1000 X G at room temperature. Discard
supernatant.
6. Wash cells with 4mL 1x PBS per 10M cells. Pellet as before, discarding
supernatant.
7. Resuspend cell pellet in 3% formaldehyde in PBS. Rock gently at room
temperature for 10 minutes. NOTE: Upon addition of crosslinkers to the
pellet it is critical to pipette up and down repeatedly to break up any clumps
of cells to avoid any cell-cell crosslinking.
8. Add 200uL of 2.5M glycine stop solution per 1mL of cell suspension. Rock
gently at room temperature for 5 minutes.
9. Pellet cells at 4C for 4 minutes at 1000 X G. Discard
formaldehyde supernatant in an appropriate waste container. From here,
keep cells at 4C.
10. Resuspend cell pellet in cold Scraping Buffer and gently rock for 1-2 minutes.
11. Pellet cells at 4C for 4 minutes at 1000 X G. Discard supernatant in
formaldehyde waste container.
12. Resuspend cell pellet in cold Scraping Buffer again and gently rock for 1-2
minutes. Pellet as before and discard supernatant.
13. Resuspend pellet in 1mL of Scraping Buffer per 10M cells.
14. Aliquot 10M cells each into microcentrifuge tubes and pellet at 4C for 5
minutes at 2000 X G. Remove supernatant.
15. Flash freeze in liquid nitrogen and store pellet at -80C.
(1) Engreitz, Jesse “RNA Antisense Purification (RAP): Experimental Protocols”
4.2 Cell Lysis
1. Chill Lysis Buffers A, B, and C on ice.
2. If using an electronic chiller for the sonication chamber, pre-chill to 4C.
3. Thaw 10M cell pellets on ice.
4. Add 1.4mL of Lysis Buffer A supplemented with 1x Protease Inhibitor
Cocktail (PIC) to each 10M cell pellet and resuspend.
5. Incubate mixtures on ice for 10 minutes.
4. Add 1uL of 25X DNase Stop Solution to each sample to terminate the
reaction.
5. Reverse the crosslinks in each sample.
Stock Solution Volume
Lysate 21uL
MyRNK Buffer 71uL
Proteinase K 8uL
Total 100uL
6. Incubate at 65C for at least three hours, optimally overnight.
7. Follow the protocol provided in the DNA Clean and Concentrator-5 Kit,
binding in 6 volumes of DNA Binding Buffer. Elute in 10uL of H20.
8. Run each DNase sample on a gel with a 100bp DNA ladder. An ideal
fragmentation sample will have most DNA around 200bp. Fragment sizes should not
greatly exceed 1kb, and the sample should not be over-fragmented to <100bp. Run
samples that show no long tail of DNA fragments on a DNA HS Bioanalyzer or
D1000 Tapestation. Example trace of DNA sizes obtained for SPRITE.
9. If none of these concentrations of TURBO DNase led to ideal fragmentation,
adjust concentrations and repeat the DNasing until optimal conditions are
found.
10. DNase the batch of crosslinked lysate at the identified optimal DNase
concentration:
Stock Solution Volume
10X DNase Buffer 110uL
Lysate 550uL
not to introduce water into the bottle, transfer 2mL of NHS beads into a clean
1.7mL tube. Place the tube on a magnetic rack to capture the beads.
2. Remove the DMAC and wash beads with 1mL ice-cold 1mM HCl.
3. Wash beads with 1mL ice-cold 1X PBS.
4. Add 1mL Coupling Buffer to the beads. Before mixing, add the appropriate
amount of lysate to the coupling buffer.
5. Incubate the lysate and beads overnight at 4C on a mixer.
6. Place beads on a magnet and remove a 500uL flowthrough aliquot to another
tube. This aliquot can be analyzed to determine how much lysate was
coupled.
7. Add 500uL 1M Tris pH 7.5 (3M ethanolamine pH 9.0 can also be used), to the
beads and incubate on a mixer at 4C for at least 45 minutes. This ensures that
all NHS beads will be quenched with protein from bound lysate or Tris, and
will not bind enzymes in the following steps.
8. Wash beads four times in cold RLT++ Buffer at 4C for 3-5 minutes each time.
9. Wash beads twice in PBLSD+ Wash Buffer at 50C for 4-5 minutes each time.
10. Wash beads once at room temperature in PBLSD+ buffer.
11. Wash beads three times with M2 Buffer.
12. Spin the beads down quickly in a microcentrifuge and place back on the
magnet to remove any remaining liquid.
5.2 Phosphorylation and End Repair
1. Phosphorylate the 5’ ends of the DNA molecules to allow subsequent barcode
ligation by adding the following mixture to the beads:
Stock Solution Volume
H20 167.5uL
T4 Polynucleotide Kinase Reaction Buffer (10X) 20uL
T4 Polynucleotide Kinase 10uL
100mM ATP 2.5uL
Total 200uL
2. Incubate on a thermomixer for 60 minutes at 37C, 1200RPM.
3. Wash beads three times with M2 Buffer.
4. Spin the beads down quickly in a microcentrifuge and place back on the
magnet to remove any remaining liquid.
5. Blunt the 5’ and 3’ ends of the DNA molecules to prevent unwanted ligation
by adding the following mixture to the beads:
Stock Solution Volume
H20 212.5uL
End Repair Reaction Buffer (10X) 25uL
End Repair Enzyme Mix 12.5uL
Total 250uL
6. Incubate on a thermomixer for 60 minutes at 24C, 1200RPM.
7. Wash once with RLT++ Buffer.
8. Wash three times with M2 Buffer.
9. Spin the beads down quickly in a microcentrifuge and place back on the
magnet to remove any remaining liquid.
10. Add dATP to the 3’ ends of each DNA molecule to allow for ligation of the
DPM adaptor by adding the following mixture to the beads:
Stock Solution Volume
H20 215uL
dA-Tailing Reaction Buffer (10X) 25uL
Klenow Fragment (exo-) 10uL
Total 250uL
11. Incubate on a thermomixer for 60 minutes at 37C, 1200RPM. If ligating the
first adaptor barcode on the same day, set up the reaction during this
incubation.
12. Wash once with RLT++ Buffer.
13. Wash three times with M2 Buffer.
14. Spin the beads down quickly in a microcentrifuge and place back on the
magnet to remove any remaining liquid.
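The three reaction tables in this section can be checked with a few lines of Python. The sketch below copies the volumes from the phosphorylation, end repair, and dA-tailing tables, then verifies that each set of components adds up to the stated total and that the 10X buffer ends up at 1X. The dictionary layout and the function name `check_reaction` are illustrative assumptions.

```python
# Volumes in uL, copied from the three reaction tables in section 5.2.
REACTIONS = {
    "phosphorylation": {
        "total": 200.0,
        "buffer": "T4 Polynucleotide Kinase Reaction Buffer (10X)",
        "components": {
            "H2O": 167.5,
            "T4 Polynucleotide Kinase Reaction Buffer (10X)": 20.0,
            "T4 Polynucleotide Kinase": 10.0,
            "100mM ATP": 2.5,
        },
    },
    "end repair": {
        "total": 250.0,
        "buffer": "End Repair Reaction Buffer (10X)",
        "components": {
            "H2O": 212.5,
            "End Repair Reaction Buffer (10X)": 25.0,
            "End Repair Enzyme Mix": 12.5,
        },
    },
    "dA-tailing": {
        "total": 250.0,
        "buffer": "dA-Tailing Reaction Buffer (10X)",
        "components": {
            "H2O": 215.0,
            "dA-Tailing Reaction Buffer (10X)": 25.0,
            "Klenow Fragment (exo-)": 10.0,
        },
    },
}


def check_reaction(reaction):
    """Return (sum of component volumes, final buffer concentration in X)."""
    volume_sum = sum(reaction["components"].values())
    buffer_x = 10.0 * reaction["components"][reaction["buffer"]] / reaction["total"]
    return volume_sum, buffer_x


for name, reaction in REACTIONS.items():
    volume_sum, buffer_x = check_reaction(reaction)
    print(f"{name}: components sum to {volume_sum} uL, buffer at {buffer_x}X")

# Final ATP concentration in the phosphorylation reaction, in mM.
atp_mM = 100.0 * REACTIONS["phosphorylation"]["components"]["100mM ATP"] / 200.0
print("ATP in phosphorylation reaction (mM):", atp_mM)
```

A check of this kind is useful when scaling the reactions to a different bead volume, since every component can be multiplied by the same factor and re-verified.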
5.3 DPM Adaptor Ligation
Note: There are 96 adaptors that are designed to ligate onto the DNA molecules.
These DPM adaptors are kept in a 96-well stock plate at 45uM. The ligation reaction
between the adaptors and the DNA occurs in a 96-well plate. The following steps
that detail set up are designed for optimum efficiency during the process.
Note: All ligation steps include M2 buffer, which contains detergents that prevent
beads from aggregating and from sticking to the plastic tips and tubes, and that
help distribute the beads evenly across a 96-well plate. We have verified that
these detergents do not significantly inhibit ligation efficiency.
1. Aliquot 200uL of 2x NEB Instant Sticky End Ligase Master Mix (NOTE: we
have found that other concentrations from 0.1x-1x final may be used but
might have differences in ligation efficiency) into each well of a 12-well strip
tube. Keep on ice until ready to use.
2. Centrifuge the DPM adaptor stock plate before removing the foil seal. Aliquot
2.4uL from the stock plate of DPM adaptors to a new low-bind 96-well plate.
Be careful to ensure that there is no mixing between wells at any point of the
process to avoid cross-contamination of barcodes. Use a new pipette tip for
each well. After transfer is complete, seal both plates with a new foil seal.
3. Create a diluted M2 Buffer by mixing 1100uL of M2 Buffer with 792uL of
H20.
4. Accounting for bead volume, add the M2+H20 mix to the beads to achieve a
final volume of 1700uL. Ensure that the beads are equally suspended in the
buffer.
5. Aliquot 140uL of the bead mix into each well of a 12-well strip tube.
6. Centrifuge the 96-well plate containing the aliquoted adaptors, and then
remove the foil seal.
7. Aliquot 17.6 uL of beads into each well of the 96-well plate that contains
2.4uL of the DPM adaptors. Be careful to ensure that there is no mixing
between wells at any point of the process. Use a new pipette tip for each well.
Also be careful to ensure that there are no beads remaining in the pipette tip.
8. Carefully add any remaining beads to individual wells on the plate in 1uL
aliquots.
9. Aliquot 20uL of Instant Sticky End Ligase Master Mix into each well, mixing
by pipetting up and down 10 times. Be careful to ensure that there is no
mixing between wells at any point of the process. Use a new pipette tip for
each well.
10. The final reaction components and volumes for each well should be as
follows:
Stock Solution Volume
Beads + M2 + H20 Mix 17.6uL
DPM Adaptor (45uM) 2.4uL
2X Instant Sticky End Ligation Master Mix 20uL
Total 40uL
11. Seal the plate with a foil seal and incubate on a thermomixer for 60 minutes
at 20C, shaking for 15 seconds at 1600RPM every minute to prevent beads
from settling to the bottom of the plate. NOTE: ligation time is critical for high
efficiency of ligation each round.
12. After incubation, centrifuge the plate before removing the foil seal.
13. Pour RLT++ Buffer into a sterile plastic reservoir, and transfer 100uL of
RLT++ into each well on the 96-well plate to stop the ligation reactions. It is
not necessary to use new tips for each well.
14. Pool all 96 stopped ligation reactions into a second sterile plastic reservoir.
15. Place a 15mL conical tube on an appropriately sized magnetic rack and
transfer the pool into the conical. Capture all beads on the magnet, disposing
all RLT++ in an appropriate waste receptacle.
16. Remove the 15mL conical containing the beads from the magnet and
resuspend beads in 1mL PBLSD+ Wash Buffer. Transfer the bead solution to
a microcentrifuge tube.
17. Wash three times with PBLSD+ Wash Buffer at 50C, 1200RPM for 3 minutes
each time.
18. Wash three times with M2 Buffer.
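The per-well volumes in step 10 and the bead suspension volume in step 4 can be tied together with a short calculation. The Python sketch below derives the final adaptor and ligase concentrations in each well and the amount of bead suspension a full 96-well plate consumes. Variable and function names are assumptions, and pipetting losses are ignored.

```python
WELLS = 96

# Per-well volumes (uL) from the table in step 10.
BEAD_MIX_UL = 17.6
ADAPTOR_UL = 2.4
LIGASE_MIX_UL = 20.0

ADAPTOR_STOCK_UM = 45.0        # DPM adaptor stock plate concentration
LIGASE_STOCK_X = 2.0           # 2X master mix
BEAD_MIX_PREPARED_UL = 1700.0  # final bead suspension volume from step 4


def per_well_summary():
    """Return reaction volume and final adaptor / ligase concentrations per well."""
    total = BEAD_MIX_UL + ADAPTOR_UL + LIGASE_MIX_UL
    return {
        "total_uL": total,
        "adaptor_uM": ADAPTOR_STOCK_UM * ADAPTOR_UL / total,
        "ligase_x": LIGASE_STOCK_X * LIGASE_MIX_UL / total,
    }


def plate_requirements():
    """Return volumes needed for the whole plate and the bead suspension left over."""
    needed = WELLS * BEAD_MIX_UL
    return {
        "bead_mix_needed_uL": needed,
        "bead_mix_left_uL": BEAD_MIX_PREPARED_UL - needed,
        "ligase_mix_needed_uL": WELLS * LIGASE_MIX_UL,
    }


well = per_well_summary()
plate = plate_requirements()
print({k: round(v, 2) for k, v in well.items()})
print({k: round(v, 2) for k, v in plate.items()})
```

Under these numbers a small volume of bead suspension remains after filling the plate, which is consistent with step 8 distributing any remaining beads in 1uL aliquots.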
5.4 Quality Control (QC): Check to Determine Ligation Efficiency of the DPM
Adaptor
1. Resuspend the beads in MyRNK Buffer so that the final beads + buffer volume
is 1mL. Remove a 5% aliquot (50uL) into a separate microcentrifuge tube.
2. Place the remaining 95% of beads back on the magnetic rack, remove the
MyRNK Buffer, and store beads in 1mL of RLT++ Buffer. Keep beads at 4C
overnight.
3. Remove the DNA+DPM adaptor molecules from the beads and reverse the
crosslinks in the lysate. Proteinase K will degrade the protein covalently
linking the proteins in the crosslinked lysate to the NHS group on the
beads, releasing the lysate from the beads. Proteinase K and heat will further
degrade any protein in the lysate and remove crosslinks, leaving nucleic acid
in the sample.
Stock Solution Volume
Sample on beads in MyRNK Buffer 50uL
MyRNK Buffer 42uL
Proteinase K 8uL
Total 100uL
4. Incubate at 65C overnight. (Time can be reduced to 1-4hrs if needed)
5. Place the microcentrifuge tube on a magnet and capture the beads. Remove
the flowthrough that contains the DNA ligated with DPM adaptor and place in
a clean microcentrifuge tube.
6. Pipette 25uL of H20 into the tube containing the beads. Vortex, and re-
capture the beads. Remove the 25uL of H20 that now contains any residual
nucleic acid and add to the new sample tube. Discard the beads.
7. Follow the protocol provided in the DNA Clean and Concentrator-5 Kit,
binding in 6 volumes of DNA Binding Buffer. Elute in 40uL of H20.
8. Amplify the DNA molecules that are ligated to the adaptors. The forward
primer should prime off the 5’ end of the DPM adaptor and the reverse
primer should prime off the 3’ end of the DPM adaptor. Before placing the
reaction in the thermocycler, split the sample into two tubes with 50uL in
each tube.
Stock Solution Volume
Sample (cleaned) 10uL
DPMQCForward Primer (100uM) 2uL
Example DPM library after 16 cycles on 5% of the ligated material on 2%
agarose gel.
2. Centrifuge the tag stock plate before removing the foil seal. Aliquot 2.4uL
from the stock plate of barcodes to a new low-bind 96-well plate. Be careful
to ensure that there is no mixing between wells at any point of the process.
Use a new pipette tip for each well. After transfer is complete, seal both
plates with a new foil seal.
3. Create a diluted M2 Buffer by mixing 1100uL of M2 Buffer with 792uL of
H20.
4. Accounting for bead volume, add the M2+H20 mix to the beads to achieve a
final volume of 1700uL. Ensure that the beads are equally suspended in the
buffer.
5. Aliquot 140uL of the bead mix into each well of a 12-well strip tube.
6. Centrifuge the 96-well plate containing the aliquoted barcodes, and then
remove the foil seal.
7. Aliquot 17.6 uL of beads into each well of the 96-well plate that contains
2.4uL of the tags. Be careful to ensure that there is no mixing between wells
at any point of the process. Use a new pipette tip for each well. Also be
careful to ensure that there are no beads remaining in the pipette tip.
8. Carefully add any remaining beads to individual wells on the plate in 1uL
aliquots.
9. Aliquot 20uL of Instant Sticky End Ligase Master Mix into each well, mixing
by pipetting up and down 10 times. Be careful to ensure that there is no
mixing between wells at any point of the process. Use a new pipette tip for
each well.
10. The final reaction components and volumes for each well should be as
follows:
Stock Solution Volume
Beads + M2 + H20 Mix 17.6uL
Tag (45uM) 2.4uL
2X Instant Sticky End Ligation Master Mix 20uL
Total 40uL
11. Seal the plate with a foil seal and incubate on a thermomixer for 60 minutes
at 20C, shaking for 15 seconds at 1600RPM every 5 minutes.
12. After incubation, centrifuge the plate before removing the foil seal.
13. Pour RLT++ Buffer into a sterile plastic reservoir, and transfer 100uL of
RLT++ into each well on the 96-well plate to stop the ligation reactions. It is
not necessary to use new tips for each well.
14. Pool all 96 stopped ligation reactions into a second sterile plastic reservoir.
15. Place a 15mL conical tube on an appropriately sized magnetic rack and
transfer the pool into the conical. Capture all beads on the magnet, disposing
all RLT++ in an appropriate waste receptacle.
16. Remove the 15mL conical containing the beads from the magnet and
resuspend beads in 1mL PBLSD+ Wash Buffer. Transfer the bead solution to
a microcentrifuge tube.
17. Wash three times with PBLSD+ Wash Buffer at 50C, 1200RPM for 3 minutes
each time.
18. Wash three times with M2 Buffer.
19. Repeat the process starting at Step 1 for the remaining three or more SPRITE
rounds.
6.2 Library Preparation Pt.2
1. Resuspend the beads in MyRNK Buffer so that the final beads + buffer volume
is 1mL.
2. Remove five aliquots into clean microcentrifuge tubes: 0.5%, 1%, 2.5%, 5%,
and 7.5% (5uL, 10uL, 25uL, 50uL, and 75uL) and elute the barcoded DNA
from the beads.
Stock Solution Volume
Sample on beads in MyRNK Buffer 5 / 10 / 25 / 50 / 75uL
MyRNK Buffer 87 / 82 / 67 / 42 / 17uL
Proteinase K 8uL
Total 100uL
3. Incubate at 65C overnight.
4. Place the microcentrifuge tubes on a magnet and capture the beads. Remove
the flowthrough that contains the barcoded DNA and place in a clean
microcentrifuge tube.
5. Pipette 25uL of H20 into the tube containing the beads. Vortex, and re-
capture the beads. Remove the 25uL of H20 that now contains any residual
nucleic acid and add to the new sample tube. Discard the beads.
6. Follow the protocol provided in the DNA Clean and Concentrator-5 Kit,
binding in 6 volumes of DNA Binding Buffer. Elute in 40uL of H20.
7. Amplify the final barcoded DNA through PCR. Refer to section 3.4 for details
about the final library amplification step. Before placing the reaction in the
thermocycler, split the sample into two tubes with 50uL in each tube.
Stock Solution Volume
Sample (cleaned) 40uL
First Primer (100uM) 2uL
Second Primer (100uM) 2uL
H20 6uL
Q5 Hot Start Master Mix 50uL
Total 100uL
PCR Program:
1. Initial denaturation: 98C- 180 seconds
2. 4 cycles:
a. 98C-10 seconds
b. First Annealing Temperature- 30 seconds
c. 72C- 90 seconds
3. 5 cycles:
a. 98C-10 seconds
b. Second Annealing Temperature- 30 seconds
c. 72C- 90 seconds
4. Final extension: 72C- 180 seconds
5. Hold 4C
8. Clean the PCR reaction and size select for your target libraries. The total
length of our barcode on one amplified product is around 160 base pairs and
each target DNA molecule is no less than 100 base pairs. Agencourt AMPure
XP beads size select while also cleaning the PCR reaction of unwanted
products.
a. Combine the two 50uL PCR reactions back into one tube.
b. Add 0.7X AMPure XP beads to the sample for a total volume of 170uL
and mix thoroughly.
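The volumes used in steps 2 and 8 follow from simple arithmetic, sketched below in Python. The first helper computes the MyRNK Buffer volume that brings a bead aliquot plus 8uL Proteinase K to 100uL; the second computes the AMPure XP bead volume for a given ratio. The helper names are assumptions made for illustration.

```python
def myrnk_buffer_volume(aliquot_uL, proteinase_k_uL=8.0, total_uL=100.0):
    """MyRNK Buffer needed to bring a bead aliquot plus Proteinase K to total_uL."""
    return total_uL - proteinase_k_uL - aliquot_uL


def ampure_bead_volume(sample_uL, ratio):
    """Volume of AMPure XP beads for a given bead-to-sample ratio."""
    return sample_uL * ratio


# Aliquots of the 1 mL bead suspension from step 2: 0.5%, 1%, 2.5%, 5% and 7.5%.
aliquots_uL = [5, 10, 25, 50, 75]
buffers_uL = [myrnk_buffer_volume(a) for a in aliquots_uL]
print("MyRNK Buffer per aliquot (uL):", buffers_uL)

# Step 8b: the two 50uL PCR reactions are recombined (100uL) and 0.7X beads are added.
pcr_volume_uL = 100.0
beads_uL = ampure_bead_volume(pcr_volume_uL, 0.7)
print("AMPure XP beads (uL):", round(beads_uL, 1))
print("total volume (uL):", round(pcr_volume_uL + beads_uL, 1))
```

The same helpers can be reused if a different aliquot fraction or a different bead ratio is chosen.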
7 Sequencing and Data Analysis
The Illumina, Inc. HiSeq v2500 platform was employed for next generation
sequencing of the generated libraries using a TruSeq Rapid SBS v1 Kit – HS (200
cycle) and TruSeq Rapid Paired End Cluster Kit – HS. All SPRITE data in this paper
was generated using Illumina paired-end sequencing. Reads must be long enough to
incorporate all tag information. Most read-pairs in this report were (115 bp, 100
bp).
All code for the SPRITE computational pipeline is found here:
https://github.com/GuttmanLab/sprite-pipeline/wiki
7.1 Tag identification
This step is performed using custom in-house software. The program takes as input
both FASTQ files, sorted by name so that the record with a particular line number in
the read 1 file corresponds with the record with the same line number in the read 2
file. The program also requires a text file containing the tag sequences with unique
identifiers and an identification tolerance -- the number of mismatches tolerated
between the tag and the read when searching for the tag.
The program first loads the tags from the tag file and stores them in a hashtable
keyed by sequence. Storing these sequences in a hashtable allows rapid (O(1))
string matching. Additional tags are generated according to the given identification
tolerances, and these are also stored. For example, if the tag TTTT has an
identification tolerance of 1, the tag will be inserted into the table, keyed by all
sequences within a Hamming distance of 1:
TTTT
ATTT
TATT
TTAT
TTTA
CTTT
TCTT
TTCT
TTTC
GTTT
TGTT
TTGT
TTTG
NTTT
TNTT
TTNT
TTTN
After storing the tags, the program iterates through the read-pairs by advancing
line-by-line through both FASTQ files simultaneously. For a given sequence, the
program queries the hash table for substrings that correspond to known tag
positions. (The exact details of this process depend on the barcoding scheme.) After
the identification process for a record is complete, the tags are appended to the
name of the record, and this modified record is output into new read 1 and read 2
FASTQ files.
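The hash-table lookup described in this section can be sketched in a few lines of Python. This is not the in-house program itself: the function names, the use of a plain `dict`, and the fixed tag position passed to `identify` are assumptions made for illustration. Substitutions are drawn from A, C, G, T and N, matching the TTTT example above.

```python
ALPHABET = "ACGTN"


def neighbors(tag, tolerance):
    """All sequences within `tolerance` substitutions of `tag`, including `tag`."""
    found = {tag}
    frontier = {tag}
    for _ in range(tolerance):
        next_frontier = set()
        for seq in frontier:
            for i, base in enumerate(seq):
                for sub in ALPHABET:
                    if sub != base:
                        next_frontier.add(seq[:i] + sub + seq[i + 1:])
        found |= next_frontier
        frontier = next_frontier
    return found


def build_tag_table(tags, tolerance):
    """Map every tolerated sequence to its tag identifier.

    If two tags are closer than the tolerance allows, a later tag silently
    overwrites an earlier one here; a real implementation would need to
    decide how to handle such ambiguous keys.
    """
    table = {}
    for tag_id, seq in tags.items():
        for variant in neighbors(seq, tolerance):
            table[variant] = tag_id
    return table


def identify(read, start, length, table):
    """Look up the substring at a known tag position; None if nothing matches."""
    return table.get(read[start:start + length])


table = build_tag_table({"ExampleTag": "TTTT"}, tolerance=1)
print(len(table))                                          # number of keys stored for TTTT
print(identify("GGTNTTCC", start=2, length=4, table=table))  # substring TNTT
print(identify("GGAATTCC", start=2, length=4, table=table))  # substring AATT, two mismatches
```

Generating the variants up front trades memory for speed: each read position is resolved with a single dictionary lookup rather than a comparison against every tag.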
7.2 Alignment
In our barcoding schemes, only one of the reads in a read pair contains an
appreciable amount of genomic sequence. These genomic-reads are aligned to the
appropriate reference with Bowtie2 under the default parameters, except for the
following.
Only one of the two FASTQ files is aligned. We do not run a paired-end alignment
despite having paired-end reads.
Before the genomic sequence on the read is an 11-mer DPM tag sequence. To
account for this, we run Bowtie2 with `--trim5 11`.
After the sequence, there are two possibilities. The read may extend into the tag
sequences on the other end of the fragment if the fragment is too short, or the read
may terminate before the tags if the fragment is long enough. To account for the
inclusion of tag sequences, we run Bowtie2 with `--local`. (This would also deal with
the DPM tag at the start of the sequence.)
We align to both the reference chromosomes and unplaced scaffolds (whose names
typically end in “random”).
We sort the resulting SAM file and convert it to a BAM file. The names of each SAM
record contain the identified tags, as these were present in the input FASTQ files.
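A minimal sketch of how the Bowtie2 call described above could be assembled from Python is shown below. Only the `--trim5 11` and `--local` settings come from the text; the index prefix and file names are hypothetical placeholders, and the command is built but not executed.

```python
import shlex


def bowtie2_command(index_prefix, fastq, sam_out, dpm_length=11):
    """Build (but do not run) a single-end, local-mode Bowtie2 command line."""
    return [
        "bowtie2",
        "--trim5", str(dpm_length),  # skip the DPM tag that precedes the genomic sequence
        "--local",                   # local mode, used in the text to cope with tag sequence in the read
        "-x", index_prefix,          # Bowtie2 index of the reference (hypothetical name below)
        "-U", fastq,                 # one unpaired FASTQ file: no paired-end alignment
        "-S", sam_out,               # SAM output, to be sorted and converted to BAM afterwards
    ]


cmd = bowtie2_command("reference_index", "genomic_reads.tagged.fastq", "aligned.sam")
print(shlex.join(cmd))
```

To run the command, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where Bowtie2 and the index are available.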
7.3 Filtration
The BAM file is then passed through successive filtration steps:
i. Remove all alignments with a MAPQ score less than 30. This removes all
unmapped reads. Note that the MAPQ score depends on the aligner used; it is
not standardized. If a different aligner is used, this step will need to be
replaced with a different quality-filtration step.
ii. Remove all alignments that align to the reference with a Hamming distance > 2.
We only tolerate two mismatches at most between the read and the
reference.
iii. Remove all alignments that overlap (in any amount) any region in the repeat-
mask BED file provided to us by B. Tabak. We used bedtools intersect with
the `-v` flag set.
iv. Remove all alignments that overlap (in any amount) any region in the mask
BED file generated by ComputeGenomeMask in the GATK package from the
Broad. This mask file was generated by shredding the reference into 35-mers
and BLATting them against the reference. Any non-unique location that a 35-
mer maps to is masked. The output of ComputeGenomeMask is not a BED file,
but a FASTA file where all masked bases are represented with 0s, and all
unmasked bases are represented with 1s. This mask file is converted to a
BED file with a custom Python script.
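The logic of the four filtration steps can be summarized in a small Python sketch. It works on a simplified in-memory alignment record rather than on a real BAM file, and it is not the pipeline's implementation: the `Alignment` class, its field names, and the representation of mask regions as `(chrom, start, end)` tuples are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    name: str
    chrom: str
    start: int        # 0-based, inclusive
    end: int          # 0-based, exclusive
    mapq: int
    mismatches: int   # mismatches between read and reference


def overlaps(aln, regions):
    """True if the alignment overlaps any (chrom, start, end) region by at least 1 bp."""
    return any(aln.chrom == c and aln.start < e and s < aln.end for c, s, e in regions)


def passes_filters(aln, masks, min_mapq=30, max_mismatches=2):
    """Apply steps i-iv: MAPQ, mismatch count, then every mask in turn."""
    if aln.mapq < min_mapq:
        return False
    if aln.mismatches > max_mismatches:
        return False
    return not any(overlaps(aln, mask) for mask in masks)


# Hypothetical mask regions standing in for the repeat mask and the mappability mask.
repeat_mask = [("chr1", 100, 200)]
genome_mask = [("chr2", 0, 50)]

alignments = [
    Alignment("kept", "chr1", 250, 300, mapq=42, mismatches=1),
    Alignment("low_mapq", "chr1", 250, 300, mapq=10, mismatches=0),
    Alignment("too_many_mismatches", "chr1", 250, 300, mapq=42, mismatches=3),
    Alignment("in_repeat", "chr1", 150, 260, mapq=42, mismatches=0),
    Alignment("in_genome_mask", "chr2", 40, 90, mapq=42, mismatches=0),
]
kept = [a.name for a in alignments if passes_filters(a, [repeat_mask, genome_mask])]
print(kept)
```

Discarding an alignment on any overlap, however small, mirrors the behaviour of `bedtools intersect -v`, which reports only the entries that have no overlap with the mask file.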
7.4 Subsequence post-processing
The post-processing scripts used to identify interactions and QC the library
are found on the Guttman Lab GitHub page:
https://github.com/GuttmanLab/barcoding-post/wiki
7.5 Quality Controls of Successful SPRITE Libraries
We use the following metrics to evaluate whether SPRITE tagging was successful on
the post-sequencing library:
Calculate the percentage of reads with all tags ligated (get_ligation_efficiency.py)
1. The ligation efficiency should be >90% each round. For
example, if r=5 rounds of SPRITE were performed, we can determine this by
calculating the fraction of reads f with all 5 barcodes and calculating the
ligation efficiency each round. Ligation efficiency per round = f^(1/r).
(Figure: example distribution of reads by the number of barcodes identified.)
2. Percentage of chromatin that is interacting with other chromatin: We have
found that over-fragmentation of the lysate via sonication for 15-20 minutes
results in a majority of SPRITE molecules that are non-interacting.
Specifically, the majority of reads after heavy sonication do not share
barcodes with other molecules. Spinning the chromatin for 1-2 minutes at
high speed after sonication and only performing SPRITE on the supernatant
results in a similar problem.
3. Run FastQC to check the quality of reads from the sequencer. Because all
reads share a sticky end, a monotemplate can occur, so we check whether any
monotemplate issues affect the quality of the reads when a common sequence is
reached on the machine. We have obtained high quality libraries on HiSeq
2500, HiSeq 4000, and NextSeq, loading at 8pM.
4. Calculate sufficient sampling of library to ensure all tagged molecules are
sequenced, which ensures that the majority of molecules in a complex are
identified on the sequencer. We use the program PreSeq from the Smith Lab
to ensure that most of the unique molecules (>70%) estimated by the
program in a library are sampled. We typically sample at a depth of 1.5-2x
reads per the total number of unique molecules estimated in a library. For
example, if we estimate there are 70M unique molecules in a library, we will
sample that library with 105-140M reads. Sequencing at higher depth results
in many duplicates without gaining many more molecules in each complex.
Heavy under-sampling (<40%), however, leaves many molecules that appear
not to interact with any other molecule, and such libraries need to be sequenced further.
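Two of the metrics above reduce to short formulas, sketched here in Python: the per-round ligation efficiency f^(1/r) from item 1 and the 1.5-2x sequencing depth from item 4. This is not the `get_ligation_efficiency.py` script or PreSeq; the function names and the example fraction are assumptions used only to illustrate the arithmetic.

```python
def ligation_efficiency_per_round(fraction_full, rounds):
    """Per-round efficiency given the fraction of reads carrying all `rounds` tags."""
    return fraction_full ** (1.0 / rounds)


def target_read_range(unique_molecules, low=1.5, high=2.0):
    """Range of reads to sequence for an estimated number of unique molecules."""
    return unique_molecules * low, unique_molecules * high


# Hypothetical example: 59% of reads carry all 5 tags after r = 5 rounds.
efficiency = ligation_efficiency_per_round(0.59, rounds=5)
print("ligation efficiency per round:", round(efficiency, 3))

# Example from the text: 70M unique molecules estimated in the library.
low_reads, high_reads = target_read_range(70e6)
print("reads to sequence:", int(low_reads), "to", int(high_reads))
```

Because the per-round efficiency is raised to the power r, a small drop per round compounds quickly: the fraction of fully tagged reads falls steeply as rounds are added.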