1 s2.0 S2095927323004474 Main
Science Bulletin
journal homepage: www.elsevier.com/locate/scib
Short Communication
Article info
Article history:
Received 12 April 2023
Received in revised form 24 May 2023
Accepted 12 July 2023
Available online 14 July 2023
© 2023 Science China Press. Published by Elsevier B.V. and Science China Press. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
DNA phosphorothioate (PT)-modification, with a non-bridging oxygen in the phosphodiester backbone substituted by sulfur, is an epigenetic marker in prokaryotes and is involved in the bacterial defense system, anti-oxidative stress, and gene regulation [1–4]. PT-modification can be specifically recognized by sulfur binding domains (SBDs) of PT-dependent restriction endonucleases (REases) [5,6], making it a potential tool for enabling biotechnology development [7]. However, the unclear recognition sequence-range of SBDs limits the application. Here, we report a technique named sulfur binding specificity-sequencing (SBS-seq) for high-resolution characterization of the complete sequence-range of SBDs. We then employed tandem PT-modifications to improve the binding affinity and extend the sequence-range. The strong affinity and broad range facilitate a new nucleic acid detection platform based on SBD and PT-DNA. Our work provides insights for research on modification-dependent REases and facilitates biotechnology applications of PT-DNA.

Epigenetic modifications play roles in the bacterial defense system, gene regulation, and genome organization [8]. Modified DNA or histones can be specifically recognized by their binding proteins, the epigenetic "readers". The methyl CG di-nucleotide (CpG) binding domain as well as the SET and RING-associated (SRA) domain are representative examples of DNA methylation readers [9]. Recently, novel PT-DNA binding SBDs were discovered in PT-dependent REases [5,6]. A typical SBD is monomeric, composed of about 160 amino acids, and contains a hydrophobic surface cavity. SBD recognizes PT-DNA by hydrophobic interactions between the Rp form sulfur atom and the surface cavity, and hydrogen bonds and electrostatic interactions between SBD and PT-DNA also contribute to the binding. In sharp contrast to DNA methylation, PT-modification has only been found in prokaryotes, suggesting that PT-modification could be utilized as a specific marker for enabling biotechnology development in vitro and in eukaryotic cells. We proposed a new nucleic acid detection system utilizing PT-modification as a specific marker and SBD as the targeting effector. However, SBDs are not only PT-dependent, but also sequence-specific. Although 10 SBDs have been characterized among more than 2000 SBD homologs in public databases so far, their sequence-specificity was only tested within a handful of biologically-occurring sequence patterns such as GPSGCC, GPSAAC, GPSTTC, GPSATC, and CPSCA (PS denotes the PT link) [6,10]. The ambiguous sequence-range of SBDs limits the targeting scope, so a method for measuring the complete sequence-specificity of SBDs is highly needed. In addition, expansion of the SBD sequence-range should facilitate the application of SBDs, like the PAM site extension of clustered regularly interspaced short palindromic repeats-associated protein 9 (CRISPR-Cas9) [11,12].

In this work, we developed SBS-seq and profiled the high-resolution complete sequence-specificity of three typical SBDs from different species, including SBDMmo from the gram-negative gut microbe Morganella morganii, SBDSpr from the gram-positive saprophytic soil strain Streptomyces pristinaespiralis, and the recently discovered SBDHga from the marine strain Hahella ganghwensis (unpublished data). SBS-seq involves four core steps (Fig. S1 online): (1) the construction of an unbiased random PT-DNA library, with the randomness of the constructed PT-DNA library (1-PT) being confirmed by deep-sequencing (sequence logo of the random library shown in Fig. S2a online); (2) in vitro binding selection with SBD protein; (3) the verification of the bound substrate, with the binding substrates selected by SBD being verified by electrophoretic mobility shift assay (EMSA, Fig. S3 online); (4) deep-sequencing data analysis. We then validated the SBS-seq results by fluorescence polarization (FP) assays and EMSA, and the mechanism was explained by molecular dynamics (MD) simulations. The complete sequence-specificity of three SBDs was characterized at single-base resolution, and the results are presented in Fig. 1a.

⇑ Corresponding authors.
E-mail addresses: lg1072@aliyun.com (G. Liu), lxzhang@ecust.edu.cn (L. Zhang).
https://doi.org/10.1016/j.scib.2023.07.012
2095-9273/© 2023 Science China Press. Published by Elsevier B.V. and Science China Press.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Y. Shuai et al. Science Bulletin 68 (2023) 1752–1756

Fig. 1. SBD binding sequence-specificity identified by SBS-seq on 1-PT (a), 3-PT (b), and 5-PT (c) substrates. For each of the substrates, the upper panel shows the binding specificity sequence logo of SBDMmo, SBDSpr, and SBDHga, which exhibits the base preferences of the SBD at each position in the 10-bp random region. The middle panel is a heat map of SBS-seq enrichment values for SBDMmo, SBDSpr, and SBDHga showing the log2 of the enrichment value for each oligonucleotide at each position in the 10-bp random region. The lower panel plots show the maximum and minimum R value of each oligonucleotide in the 10-bp random region for SBDMmo, SBDSpr, and SBDHga. The x-axis shows the 10-bp random region in the 5′ to 3′ direction.
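The enrichment (R) values summarized in Fig. 1 compare how often each base occupies each position of the random region after selection with how often it does so in the starting library. The sketch below illustrates one plausible form of that calculation. Treating R as a plain ratio of per-position base frequencies, the pseudocount, the function names, and the toy 4-nt reads are all assumptions made for illustration; this is not the authors' pipeline, and the source cites [13] for the actual method.

```python
import math
from collections import Counter

BASES = "ACGT"

def occupancy(reads, length, pseudocount=0.5):
    """Per-position base frequencies for reads of a fixed length.

    The pseudocount is an assumption of this sketch; it keeps ratios finite
    when a base is never observed at a position.
    """
    counts = [Counter() for _ in range(length)]
    n_reads = 0
    for read in reads:
        if len(read) != length:
            continue  # ignore reads that do not span the random region
        n_reads += 1
        for pos, base in enumerate(read):
            counts[pos][base] += 1
    if n_reads == 0:
        raise ValueError("no reads of the expected length")
    denom = n_reads + pseudocount * len(BASES)
    return [{b: (c[b] + pseudocount) / denom for b in BASES} for c in counts]

def enrichment(selected, library, length):
    """Assumed R value: occupancy after selection / occupancy in the library."""
    sel = occupancy(selected, length)
    lib = occupancy(library, length)
    return [{b: sel[p][b] / lib[p][b] for b in BASES} for p in range(length)]

# Hypothetical toy data: a uniform library and a selection enriched for G at position 0.
library = ["ACGT", "CGTA", "GTAC", "TACG"]
selected = ["GCGT", "GGTA", "GTAC", "GACG"]

r_values = enrichment(selected, library, length=4)
# log2 of R, the quantity a heat map like the middle panel would display
log2_r = [{b: math.log2(v) for b, v in row.items()} for row in r_values]
# (max R, min R) per position, analogous to the lower-panel plots
r_range = [(max(row.values()), min(row.values())) for row in r_values]
```

In this toy case only position 0 departs from R = 1, so a wide gap between the maximum and minimum R at a position is what flags it as specificity-determining. Real data would use the 10-bp random region and far deeper read counts.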
The sequence logos of three SBDs (Fig. 1a, upper panel) revealed that specificity was defined at a core region of −1 to +3 nt around the PT-modification (bases 5 to 8 of the 10-bp random region), which was consistent with previous studies [5,6]. To illustrate the enrichment after in vitro selection, the R value of each base was calculated using the occupancy in deep-sequencing results of the samples after selection and the original library (Fig. 1a, middle panel, and Table S1 online) [13]. The maximum and minimum R values of each site were plotted in the lower panel of Fig. 1a, and an obvious increase/decrease in R value was found in the core sequence, indicating that SBDs have sequence-specificity in this 4-bp region. We then analyzed the detailed sequence-specificity of the 4-bp core sequence, with the R values of all 256 patterns presented in Fig. S4a and Table S2 (online). In these results, SBDMmo favors GPSTAG and GPSTAC, SBDSpr prefers GPSGCC, GPSACC, GPSGAC, and GPSGCA, while SBDHga has the most relaxed sequence-specificity among the tested SBDs, recognizing almost all sequence patterns. We also fixed the core sequences and checked the appearance of the six flanking bases, and no significant specificity was found (Fig. S5 online).

To reveal the mechanism of sequence-specificity, we simulated the interactions between SBDs and the DNA nitrogenous bases from the crystal structure (PDB accession number 7CC9) and predicted structures. SBDMmo and SBDSpr have 1 and 4 sequence-specific interactions with the nitrogenous bases of the specific sequences, respectively. In contrast, the bases of unspecific sequences could not form interactions with the SBDs. We therefore speculate that SBD sequence-specificity is determined by the contacts between protein and DNA bases. The highly sequence-specific SBDSpr needs more contacts with the DNA bases, so it recognizes only a few sequence patterns, such as GPSGCC. However, the SBDHga residues are too far away from the nitrogenous bases (> 4.5 Å) to establish stable interactions (Fig. S6 online), and it may have more sequence-nonspecific interactions with the DNA backbone, allowing SBDHga to recognize a broad range of sequences.

Intriguingly, we found that SBDs have stronger binding efficiency on DNA with tandem PT-modification. This phenomenon suggested that SBDs may also have a broader sequence-range on tandemly PT-modified DNA. To test this hypothesis, we constructed random DNA libraries with three (3-PT) or five (5-PT) tandem PT-modifications, and the randomness of both libraries was verified by deep-sequencing (Fig. S2b, c online). We then used SBDSpr to perform EMSA with the PT-modified (1-PT, 3-PT, and 5-PT) and unmodified (0-PT) libraries. SBDSpr shifted significantly more substrates for the 3-PT and 5-PT libraries than the 1-PT library, and no shift was found for the 0-PT library, demonstrating that tandem PT-modifications enhance the binding affinities of DNAs with SBD (Fig. S7 online).

We then used SBS-seq to investigate the complete sequence-specificity of SBDs on 3-PT and 5-PT modified DNA substrates. The results of 3-PT and 5-PT substrates were plotted in Fig. 1b and c, respectively. The maximum and minimum R values of SBDMmo and SBDSpr core sequences decreased by 60% (Fig. 1b, lower panel) and 70% (Fig. 1c, lower panel) with 3-PT and 5-PT substrates, showing much more relaxed sequence-specificity than the 1-PT library. The detailed core sequence-specificity was analyzed in Figs. S4b, c (online). Violin plots were also generated from the SBS-seq results (Fig. S8 online), providing a full view of sequence-range expansion by tandem PT-modification.

FP assays and EMSA were carried out to validate the sequence-specificity measured by SBS-seq. We chose APSAGT with an R value of 0.01 as the unspecific sequence and GPSTCG with an R value of 6.3 as the specific sequence of SBDMmo. We measured the dissociation constants (Kd) of three SBDs (SBDMmo, SBDSpr, and SBDHga) with PT-modified DNA substrates of specific and unspecific sequences. The Kd values of all SBDs on 0-PT (unmodified) substrates were >100,000 nmol/L, showing no binding with unmodified substrates. The Kd values of SBDMmo on 1-PT specific and unspecific sequence substrates were 341.0 ± 47.8 nmol/L and 12,263.0 ± 4278.0 nmol/L, respectively, consistent with the SBS-seq results. For 3-PT and 5-PT substrates, the Kd values of the specific sequence decreased by 2–4 fold, to 176.4 ± 24.5 nmol/L and 92.8 ± 13.2 nmol/L, respectively. In contrast, the Kd values of the unspecific sequence sharply decreased by 37–139 fold for 3-PT and 5-PT substrates, to 327.5 ± 73.8 nmol/L and 88.0 ± 20.2 nmol/L, respectively (Figs. 2a, S9a,b and Table S3 online). This result revealed the enhancement in binding affinity by tandem PT-modification, and the significant improvement of binding with the unspecific sequence demonstrated the sequence-range expansion effect of tandem PT-modification. The Kd of SBDSpr displayed a similar phenomenon, with Kd values on 1-PT specific and unspecific sequences of 418.0 ± 181.0 nmol/L and 22,432.0 ± 5908.0 nmol/L, respectively. For tandem PT-modification, the Kd values decreased by 2–6 fold for the specific and 118–261 fold for the unspecific sequence (Figs. 2a, S9c,d and Table S3 online). SBDHga was also tested by FP assay. Since SBDHga has little sequence-specificity, we used both the specific and unspecific sequences of SBDMmo and SBDSpr for the experiment. SBDHga has comparable affinity with all four sequences (Kd values ranging from 330.7 ± 23.7 nmol/L to 418.8 ± 36.4 nmol/L), and the Kd values decreased by 2–7 fold with tandem modification (Figs. 2a, S9e–h and Table S3 online). We also tested the binding with EMSA. SBDMmo and SBDSpr both bound more substrates for the specific sequence than the unspecific sequence, and tandem PT-modification significantly increased the binding efficiency (Figs. S10 and S11 online). Sequences with moderate binding affinity were also tested (Fig. S12 online). The EMSA results were all consistent with the SBS-seq analysis.

Molecular dynamics simulations and binding energy calculations were performed to explain the mechanism of sequence-specificity and the effect of tandem PT-modification (Figs. S13–S16 online). The relative binding free energy (RBFE) of SBDSpr binding with specific sequences (take SS-1-PT as an example, SS represents specific sequence, Table S4 online) and unspecific sequences (take US-1-PT as an example, US represents unspecific sequence, Table S4 online) confirmed the sequence-specificity, and the independent gradient model (IGM) analysis provided information for understanding the expansion of the sequence-range for tandemly PT-modified DNA.

SBS-seq identified the high-resolution complete sequence-specificity, and the tandem PT-modification design further expanded the sequence-range and improved the binding affinity of SBDs. Based on this, we improved the SBD-based nucleic acid detection system for the detection of ssDNA. In previous research [14], SBDMmo was fused to each half of a split firefly luciferase reporter. Two PT-modified DNA oligos were used as probes, and the two probes were designed to be complementary to adjacent sequences of the target ssDNA (no PT-modification is required). In the presence of target ssDNA, the probes annealed with the target ssDNA, forming dsDNA with two adjacent PT-modified sites; then the SBDs bound the two PT-modified sites, helping complementation of the split firefly luciferase reporters for signal production (Fig. 2c).

To test the effect of tandem PT-modification, a 70-mer DNA oligo (TMP-70, Table S4 online) identical to a partial sequence of the N gene of the RNA virus SARS-CoV-2 was used as target 1 for nucleic acid detection, and the PT-sites (GPSAAC-GPSAAC) of target 1 were substituted with preferred PT-sites of SBDMmo to generate target 2 (GPSTAC-GPSTAG). Probes with 0-PT, 1-PT, 3-PT, and 5-PT modifications (Fig. 2d) were tested. For target 1, the signal of the negative control (without target ssDNA) was about 200 relative luminescence units (RLU), and the signals of samples (with 500 nmol/L target ssDNA) for 0-PT, 1-PT, 3-PT, and 5-PT probes were about
200 RLU, 3000 RLU, 6000 RLU, and 10,000 RLU, respectively. For target 2, the sample signals increased due to stronger binding on preferred sites, especially for 1-PT (Fig. 2e). The SNR was calculated using the negative control as background. For the 0-PT probes, no signal increase was detected in the presence of the sample. For the 1-PT probes, the SNRs were about 10 (target 1) and 17 (target 2). For the 3-PT probes, the SNRs increased to more than 20; and for the 5-PT probes, the SNRs increased to more than 30 (Fig. 2e). We next tested the sensitivity of the nucleic acid detection based on single and tandem PT-modification probes. For 1-PT probes, the detection threshold of the sample was about 50 nmol/L, while for 3-PT and 5-PT probes it was about 5 nmol/L, and the whole detection efficiency increased significantly (Fig. 2f). The preferred PT-sites provide higher signal and better SNR compared with biologically-occurring PT-sites, and the tandem PT-modification not only expanded the target range, but also improved the sensitivity.

Fig. 2. Binding affinity assay and nucleic acid detection system. (a) The Kd values of SBDMmo and SBDSpr on their specific (M-SS or S-SS) and unspecific (M-US or S-US) sequence substrates with 1-PT, 3-PT, and 5-PT-modifications measured by FP assays. (b) The Kd values of SBDHga on all four substrates with 1-PT, 3-PT, and 5-PT-modifications. M-SS (5′-GATCGPSTCGAT-3′), M-US (5′-TAGCAPSAGTGG-3′), S-SS (5′-ATCTGPSGCCTA-3′), S-US (5′-ATCCAPSGCTGC-3′). SBD protein solutions were diluted serially using 2-fold dilutions (5 μmol/L starting concentration, 16 dilutions) and mixed with a 5 nmol/L final concentration of PT-DNA. The detailed modified sequences are listed in Table S4 (online). (c) Schematic diagram of the SBD-based nucleic acid detection system. SBD is fused with a split firefly luciferase (Fluc) reporter, and two PT-probes that are complementary to the target sequence are used for detection (N half of firefly luciferase, NFluc; C half of firefly luciferase, CFluc). (d) Design of probes with 0-PT, 1-PT, 3-PT, and 5-PT modification. (e) Nucleic acid detection of target 1 (with core sequence of GPSAAC-GPSAAC) and target 2 (with core sequence of GPSTAC-GPSTAG) based on single and tandem PT-modification probes. NC: negative control without target sample. Sample: with target sample (adding 1 μL of 500 nmol/L sample). SNR: signal to noise ratio. (f) Sensitivity of the detection coupled with asymmetric PCR based on single and tandem PT-modification probes (1 μL of samples with serial concentration was added to the reaction mixture). * P < 0.05, ** P < 0.01, *** P < 0.001, and **** P < 0.0001 from the two-tailed Student's t test; error bars indicate the standard deviation of three replicates. ns: no significance.

In conclusion, we characterized the specificity of three SBDs and validated the results by EMSA and FP assays. In contrast with the handful of biologically-occurring PT-sites [2], SBDs display relatively lower sequence selectivity. One possible reason is that the 600 kDa PT-modification complex is able to adopt complicated machinery to react strictly at specified sequences, while the
20 kDa SBD has only limited interactions with the DNA bases, which is insufficient for high specificity. Moreover, the low sequence-specificity of modification-dependent REases may provide them with a broad spectrum in recognizing and restricting modified substrates. A similar phenomenon is seen with the methylases (GmATC for Dam, and CmCWGG for Dcm) and methylation-dependent REases (RmCN40–3000RmC for EcoKMcrBC, SmCNGS for SauUSI, and mCNNR for MspJI) [15]. We also expanded the sequence-range and improved the binding affinity by using tandem PT-modifications in DNA. Compared with our previous study [14], in which we used native PT-modification sites as baits for nucleic acid detection, the complete sequence-specificity measured by SBS-seq and confirmed by EMSA and FP assays provides precise instructions for target selection. Using 1-PT probes with a specific sequence (such as GPSTAG and GPSTAC for SBDMmo) will enhance the detection signal. In addition, tandem PT-modification not only improves the detection sensitivity, but also extends the target range to all sequence patterns. Our work provides insights into related research on modification-dependent proteins, facilitates the application of SBD-based detection methods, and paves the way for harnessing SBDs for other related biotechnologies.

Conflict of interest
The authors declare that they have no conflict of interest.

Acknowledgments
This work was supported by the National Key Research and Development Program of China (2020YFA0907800, 2022YFC3400200, and 2022YFA0912200), the National Natural Science Foundation of China (31900060), the Shanghai Pilot Program for Basic Research-Shanghai Jiao Tong University (21TQ1400204), and the Natural Science Foundation of Shanghai (20ZR1414500).

Author contributions
Yuting Shuai, Anan Xu, Zhaoxi Han, Dini Ma, and Hairong Duan performed the experiments. Jiayi Li and Yi-Lei Zhao performed the bioinformatics experiments. Xinye Wang, Lan Jiang, Jingyu Zhang, Gao-Yi Tan, and Xueting Liu processed the data and plotted the figures. Yaojun Tong, Shenlin Wang, Xinyi He, Zixin Deng, Guang Liu, and Lixin Zhang designed the project and wrote the paper.

References
[5] Liu G, Fu W, Zhang Z, et al. Structural basis for the recognition of sulfur in phosphorothioated DNA. Nat Commun 2018;9:4689.
[6] Yu H, Li J, Liu G, et al. DNA backbone interactions impact the sequence specificity of DNA sulfur-binding domains: revelations from structural analyses. Nucleic Acids Res 2020;48:8755–66.
[7] Yang W, Fomenkov A, Heiter D, et al. High-throughput sequencing of EcoWI restriction fragments maps the genome-wide landscape of phosphorothioate modification at base resolution. PLoS Genet 2022;18:e1010389.
[8] Chen Y, Hong T, Wang S, et al. Epigenetic modification of nucleic acids: from basic studies to medical applications. Chem Soc Rev 2017;46:2844–72.
[9] Nicholson TB, Veland N, Chen T. Chapter 3 - writers, readers, and erasers of epigenetic marks. In: Gray SG, editor. Epigenetic cancer therapy. Boston: Academic Press; 2015. p. 31–66.
[10] Lutz T, Czapinska H, Fomenkov A, et al. Protein domain guided screen for sequence specific and phosphorothioate-dependent restriction endonucleases. Front Microbiol 2020;11:1960.
[11] Hu JH, Miller SM, Geurts MH, et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 2018;556:57–63.
[12] Walton RT, Christie KA, Whittaker MN, et al. Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants. Science 2020;368:290–6.
[13] Bessen JL, Afeyan LK, Dancik V, et al. High-resolution specificity profiling and off-target prediction for site-specific DNA recombinases. Nat Commun 2019;10:1937.
[14] Shuai Y, Ju Y, Li Y, et al. A rapid nucleic acid detection platform based on phosphorothioate-DNA and sulfur binding domain. Synth Syst Biotechnol 2023;8:213–9.
[15] Loenen WA, Raleigh EA. The other face of restriction: modification-dependent enzymes. Nucleic Acids Res 2014;42:56–69.

Yuting Shuai received her Ph.D. degree from East China University of Science and Technology in 2023. She is now working as a postdoctoral researcher at Shanghai Jiao Tong University. Her current research interest focuses on the development of biotechnology methods based on DNA phosphorothioation, such as nucleic acid detection and gene editing tools.

Guang Liu obtained his B.S. degree in Biotechnology and Ph.D. degree in Microbiology from Shanghai Jiao Tong University in 2007 and 2015, respectively. His current research interest focuses on nucleic acid modification, such as methylation and phosphorothioation on DNA and RNA. He is also interested in nucleic acid detection and gene editing technologies.