Optimization of A Microarray For Fission Yeast
Dong-Uk Kim1, Minho Lee2, Sangjo Han3, Miyoung Nam4, Sol Lee4,
Jaewoong Lee4, Jihye Woo4, Dongsup Kim5, Kwang-Lae Hoe4*
Original article
1 Aging Research Center, Korea Research Institute of Bioscience & Biotechnology (KRIBB), Daejeon 34141, Korea
2 Catholic Precision Medicine Research Center, College of Medicine, The Catholic University of Korea, Seoul 06591, Korea
3 Data Analytics CoE, SK Telecom, Seongnam 13595, Korea
4 Department of New Drug Development, Chungnam National University, Daejeon 34134, Korea
5 Department of Bio and Brain Engineering, Korea Advanced Institute of Science & Technology (KAIST), Daejeon 34141, Korea
Genomics Inform 2019;17(3):e28
eISSN 2234-0742
https://doi.org/10.5808/GI.2019.17.3.e28
Received: June 3, 2019
Accepted: June 28, 2019
*Corresponding author: E-mail: kwanghoe@cnu.ac.kr

Bar-code (tag) microarrays of yeast gene-deletion collections facilitate the systematic identification of genes required for growth in any condition of interest. Anti-sense strands of amplified bar-codes hybridize with ~10,000 (5,000 each for up- and down-tags) different kinds of sense-strand probes on an array. In this study, we optimized the hybridization processes of an array for fission yeast. Compared to the first version of the array (11 µm, 100K) consisting of three sectors with probe pairs (perfect match and mismatch), the second version (11 µm, 48K) could represent ~10,000 up-/down-tags in quadruplicate along with 1,508 negative controls in quadruplicate and a single set of 1,000 unique negative controls at randomly dispersed positions without mismatch pairs. For PCR, the optimal annealing temperature (maximizing yield and minimizing extra bands) was 58°C for both tags. Intriguingly, up-tags required 3× higher amounts of blocking oligonucleotides than down-tags. A 1:1 mix ratio between up- and down-tags was satisfactory. A lower temperature (25°C) was optimal for cultivation instead of a normal temperature (30°C) because of extra temperature-sensitive mutants in a subset of the deletion library. Activation of frozen pooled cells for >1 day showed better resolution of intensity than no activation. A tag intensity analysis showed that tag(s) of 4,316 of the 4,526 strains tested were represented at least once; 3,706 strains were represented by both tags, 4,072 strains by up-tags, and 3,950 strains by down-tags. The results indicate that this microarray will be a powerful analytical platform for elucidating currently unknown gene functions.
2019, Korea Genome Organization
This is an open-access article distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction
Molecular bar-code arrays facilitate the parallel analysis of thousands of biological samples through a microarray [1]. In particular, the unique 20-bp DNA bar-codes or tags in each deletion strain enable the individual fitness of thousands of deletion mutants to be analyzed from a single pooled culture. In principle, the change in the number of cells of interest within the pooled library is visualized by the hybridization between fluorescence-labeled PCR amplicons of unique molecular bar-codes and their cognate probes on the array. This provides a powerful system for identifying the genes required for growth in any condition of interest [2].
Kim DU et al. • DNA microarray for fission yeast
These arrays are well known for their use with the tagged strains of yeast gene-deletion collections. Deletion collections have been constructed for budding yeast [3] and fission yeast [4,5]. Budding yeast is the pioneer model organism for gene-deletion collections, followed by fission yeast. As the two types of yeasts are distant within a phylogenetic tree [6], they play complementary roles in the systematic elucidation of gene function [7].
Among arrays for budding yeast [8,9], the first version of such an array (TAG3) was constructed with a 24-µm feature for each probe. In response to technological developments, the original TAG3 was improved to the TAG4 array. In particular, the feature size was reduced to 8 µm with a capacity of ~100,000 (100K). Furthermore, mismatch pairs and antisense-strand probes of each tag were removed, because they were proven to be uninformative. According to an analysis of data reproducibility, at most triple 8-µm features were needed to equal the performance of a single 24-µm feature. To test the ability of the TAG4 array to accurately measure differences in tag abundance, researchers conducted a signal ratio analysis and derived a correction function to adjust distorted intensity values due to the saturation effect.
Herein, we present detailed information on the optimization process of fission yeast arrays by incorporating useful pieces of information from earlier budding yeast arrays [8-10]. The optimization process of the array can reduce the inevitable defects caused by the innate hybridization bias. This study will provide a solid platform for fitness profiling using microarray technology.

Methods

Oligonucleotides, medium, and DNA samples
All the synthetic DNA oligonucleotides were obtained from Bioneer (Daejeon, Korea). Yeast cells were cultivated in YES medium (0.5% yeast extract, 3% glucose, and appropriate amino acid supplements) at 30°C unless otherwise stated, following the manufacturer’s instructions [11]. Genomic DNA from the fission yeast gene deletion library was extracted using the Quick-DNA Fungal/Bacterial kit (catalog #D6005, Zymo Research Co., Irvine, CA, USA).

Gene deletion library of fission yeast
The gene deletion library used in this study was constructed based on the principle of homologous recombination, as previously reported [4]. In brief, for each strain the open reading frame was replaced and tagged by homologous recombination with a deletion cassette consisting of the KanMX module (the selectable resistance gene KanMX4 and a pair of unique 20-mer molecular bar-codes (up-tag and down-tag) on both sides flanking the KanMX4 gene) and its flanking homologous regions to the chromosome (RHG).

Design of PCR primer pairs and gene-specific tags
Notable components of bar-code regions are represented in the schematic drawing shown in Fig. 1A. For amplification of up- and down-tags, pairs of each 20-mer primer, the universal primers U1/U2 and D1/D2, were theoretically designed (shown as two pairs of rectangles in the insets of Fig. 1A). Optimal PCR primer sets were empirically selected by the criteria of maximizing the yield and minimizing the extra bands (shown as two pairs of arrows, also refer to Fig. 3). The length of the four chosen primers was a 20/19-mer and a 17/18-mer for U1/U2 and D1/D2, respectively. For fluorescence detection of hybridization, biotin was linked at both ends of the anti-sense primers (shown as asterisks).
The sense strands of the bar-codes were designed for tiling on the array using the criteria of melting temperature (Tm, 60–65°C), GC content (30%–70%), and cross-hybridization with other bar-codes and genomic DNA regions (exact matches of no more than 10 bp, corresponding to a blast score lower than 20), as shown in Fig. 1B. Finally, ~11,000 bar-code sequences were selected through the above criteria with an average Tm of 62°C (Fig. 1C).

Hybridization: PCR and blocking oligonucleotides
Hybridization of the array was performed as previously described [4,12]. In brief, cells were collected during the pooled growth experiments, and their genomic DNA was prepared from frozen cell stocks. For each PCR sample, 10–20 OD600 (2–4 × 10⁸ cells/mL) were used. For amplification and labelling of gene-specific tags, PCR was performed with the indicated sets of universal PCR primers (as shown by the pairs of head-to-head arrows in Fig. 1A) using 0.2 µg of genomic DNA as a template. PCR amplification was performed through 30 cycles consisting of denaturation at 94°C, annealing at 55°C, and extension at 72°C for 30 s for each step in a total volume of 100 µL (2.5 mM MgCl2, 0.2 mM dNTP, and 1 µM each primer mix). Hybridization was carried out using the Affymetrix Fluidics Station 450 (Pasadena, CA, USA).
As shown in Fig. 1D, only anti-sense strands of PCR products (shown as the filled rectangles with white dots) labeled with biotin (shown as the asterisks) were used for hybridization against sense-strand probes (shown as the dotted rectangles) tiled on the chip. In addition, for each tag, four priming sequences were shielded by blocking oligonucleotides (shown as rectangles in gray; refer to the previous report [4] for the sequence information), which prevented melted strands from re-associating.

Design of the custom-made Affymetrix GeneChip
For the microarray experiments, two versions of Affymetrix GeneChips with 11 µm features were custom-made by Affymetrix, “Affy-KRIBB SP1 (Part No. 520429)” and “Affy-KRIBB SP2 (Part
Fig. 1. Schematic overview of the tag array in fission yeast. (A) Structure of deletion cassettes and detailed tag regions. The deletion cassette containing the KanMX4 gene is flanked by a pair of unique bar-codes (UPTAG and DNTAG, corresponding to up-tags and down-tags, respectively) and regions of homology to the gene of interest (RHG). The deletion cassette replaces the open reading frame of interest by homologous recombination at the RHG regions. To facilitate whole genome analysis, each deletion cassette is assigned a molecular bar-code, a pair of unique 20-mer sequences, which are referred to as the “UP tag” and “DN tag” as shown in the insets with dotted lines. To amplify each tag for hybridization, PCR was performed using the following universal primer pairs: U1/U2 for the “UP tag” and D1/D2 for the “DN tag.” For labeling, biotin was attached to the U2 and D1 primers (shown as asterisks). (B) Scheme of bar-code design. The 20-mer bar-codes used in the study were selected using the following criteria: (1) Tm of 60-65°C with a deviation of ±2°C, (2) GC content between 30% and 70%, (3) less than 10-bp cross-hybridization with other bar-codes, and (4) less than 10-bp cross-hybridization with genomic DNA of fission yeast. The algorithm loop was repeated until 11,000 bar-codes were obtained. (C) Tm distribution of the bar-codes. The bar-codes obtained from the above algorithm showed a normal distribution in the range of 57-67°C with a mean of 62°C. (D) Overall scheme of hybridization between tag amplicons and array probes. The pair of unique bar-codes consisting of an up-tag and a down-tag were amplified by PCR using a pair of flanking universal primers, one of which was labeled by biotin (shown with asterisks). The PCR product was hybridized with the custom-made array. Note that anti-sense strands (filled rectangles with white dots) of tags were hybridized to sense strands (dotted rectangles) of probes on the array. To reduce unwanted hybridization, all regions of the universal primers were shielded by blocking oligonucleotides (shown as gray rectangles).
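The selection loop of Fig. 1B can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the source does not say how Tm was computed, so the Wallace rule (2°C per A/T, 4°C per G/C) stands in for it; candidates are drawn at random; and "exact matches of no more than 10 bp" is interpreted as rejecting any candidate that shares an 11-mer with an accepted bar-code or with the genome on either strand. All function names are hypothetical.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Wallace-rule Tm in deg C (assumed stand-in for the unspecified Tm model)."""
    return 2 * (seq.count("A") + seq.count("T")) + 4 * (seq.count("G") + seq.count("C"))

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def passes_filters(cand, seen_kmers, genome_kmers, k=11,
                   tm_range=(60, 65), gc_range=(0.30, 0.70)):
    """Apply the Tm, GC, and cross-hybridization checks of Fig. 1B."""
    if not tm_range[0] <= wallace_tm(cand) <= tm_range[1]:
        return False
    if not gc_range[0] <= gc_fraction(cand) <= gc_range[1]:
        return False
    cand_kmers = kmers(cand, k) | kmers(revcomp(cand), k)
    # Reject any exact match longer than k - 1 bp with bar-codes or genome.
    return cand_kmers.isdisjoint(seen_kmers) and cand_kmers.isdisjoint(genome_kmers)

def design_barcodes(n, genome, length=20, k=11, seed=0, max_tries=100_000):
    """Repeat the loop until n bar-codes are recorded or max_tries is reached."""
    rng = random.Random(seed)
    genome_kmers = kmers(genome, k)
    seen_kmers, chosen = set(), []
    for _ in range(max_tries):
        if len(chosen) >= n:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        if passes_filters(cand, seen_kmers, genome_kmers, k):
            chosen.append(cand)
            seen_kmers |= kmers(cand, k) | kmers(revcomp(cand), k)
    return chosen
```

For example, `design_barcodes(5, "ACGT" * 50)` returns five 20-mers that satisfy all four criteria against a toy genome. A real design would substitute the fission yeast genome sequence and whichever Tm model was actually used.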
Fig. 2. Design scheme of custom-made GeneChip and probes. For the microarray experiments, two versions of an Affymetrix GeneChip were
made by a custom order. (A) Schematic overview of KRIBB-SP1 (100K) and KRIBB-SP2 (48K). The KRIBB-SP1 array was made to represent
43,721 probe pairs (perfect match and mismatch) in three separate sectors along with other control probes, corresponding to ~10,000
tags in triplicate or quadruplicate. The KRIBB-SP2 array was made to represent 41,169 probes without mismatch pairs at randomly dispersed
positions along with other control probes, corresponding to ~10,000 tags in quadruplicate. For the reader’s convenience, array chips are
represented in blue due to fluorescence, and the areas of negative control probes are represented by solid black squares within chips due
to the absence of hybridization. (B) Strategy of probe design. Sequence information for each probe was represented in the sense strands by
seven-digit numbers. The first digit represents whether its sequence was confirmed by Sanger sequencing, and the second digit specifies the
type of deletion mutant. The third digit specifies the type of probe, including up-tag, down-tag, positive, or negative control. The remaining
four digits represent working codes of deletion mutants, which replace complicated systematic ID numbers of each strain with simple
numbers for the convenience of experiments.
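The feature counts quoted for the two chip versions (91K for KRIBB-SP1 and 48K for KRIBB-SP2) follow from simple arithmetic on the probe numbers given in the Methods. The short Python check below reproduces them; the function names are hypothetical and serve only to make the bookkeeping explicit.

```python
def paired_features(probe_pairs, unique_negatives):
    """Features used when each probe has a perfect-match and a mismatch cell."""
    return probe_pairs * 2 + unique_negatives

def unpaired_features(tag_probes, negative_controls, replicates, unique_negatives):
    """Features used when mismatch cells are dropped and controls are replicated."""
    return tag_probes + negative_controls * replicates + unique_negatives

# Ideal first-version layout: 11 probe pairs (22 cells) per probe.
ideal_capacity = 4_800 * 22    # about 100K
required_cells = 10_000 * 22   # 220K, more than the chip can hold

# Layouts actually used, with the numbers reported in the text.
sp1_features = paired_features(43_721, 4_000)
sp2_features = unpaired_features(41_169, 1_508, 4, 1_000)

print(ideal_capacity, required_cells, sp1_features, sp2_features)
```

The calculation shows why the 22-cell layout had to be abandoned: it would need 220,000 cells, whereas the modified first version fits in 91,442 and the second version in 48,201.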
No. 520506),” by following the guidelines of the GeneChip CustomExpress Array Program.
Ideally, the first version (array format 100-3660) could represent 4,800 different probes by 11 probe pairs (perfect match and mismatch) with a maximum capacity of 100,000 (100K ≅ 4,800 × 22). As the features required for the ~10,000 tag probes in question (5,000 up-tags + 5,000 down-tags) amounted to 220K (10,000 × 22) and exceeded this capacity, the first version of the array was modified to represent 47,721 probe pairs in three separate sectors, consisting of 43,721 probe pairs (~10,000 tags in triplicate or quadruplicate, positive controls included) and a single set of 4,000 unique negative controls, resulting in 91K features (43,721 × 2 + 4,000) as shown in the left diagram in Fig. 2A. The second version (array format 400) was made to represent 48,201 features (48K), consisting of 41,169 probes for 10,292 tags in quadruplicate (~5,000 up-tags + ~5,000 down-tags) without mismatch pairs, 1,508 negative controls in quadruplicate, and a single set of 1,000 unique negative controls (~10,000 × 4 + 1,508 × 4 + 1,000) at randomly dispersed positions without separate sectors, as shown in the right diagram in Fig. 2A.
For tiling sense-strand probes on arrays, specific files containing the information about the total probes were generated in Excel spreadsheets, following the guidelines suggested by Affymetrix. Sequence information about each probe was represented in the sense strands by seven-digit numbers (Fig. 2B).

Analysis of tag intensity and signal ratio
The probe intensity was obtained using the GeneChip Scanner 3000 and GeneChip Operating Software (Expression-Affymetrix MAS) with high-resolution upgrades. In brief, scanning of GeneChips generated a variety of data in multiple file formats, including .EXP, .DAT, .CEL, and .CDF. Among them, the .CEL files harboring probe intensity data were then analyzed using the R package ‘affy’ [13]. The signal ratio (Fig. 4) and the distribution of tag signals (Figs. 5 and 7) were plotted by kernel density plots, implemented in the R statistical software [14].

Analysis of the relative growth rate
The relative growth rate (Fig. 6) was estimated from slopes of linear models with time measured in generations. The estimated data on the growth rates were added to the slope, which was set with a relative growth rate of 1.00 as the standard, as described previously [4].

Results

Optimization of bar-code PCR: annealing temperature
As the first step toward optimizing the GeneChip hybridization,
Fig. 4. Optimization of signal ratios depending on the concentration of blocking oligonucleotides. The blocking oligonucleotides for
each gene are required at eight primer sites for a pair of the four universal primers (U1/U2 and D1/D2). The concentration of blocking
oligonucleotides (12.5 pm/µL each) was defined as 1×, corresponding to a 1:1 ratio of amplified tags. To determine the optimal
concentrations of blocking oligonucleotides, a set of samples were hybridized with the indicated concentrations of blocking oligonucleotides
from 0.5× to 3×, and subjected to analysis for the best resolution of the signal ratio. The dots in green, blue, and red represent each intensity
from a variety of tag mixes, where up-tags comprise 25%, 50%, and 75% of the total mix and vice versa for down-tags. Note that 3× of
up-tag and 1× of down-tag blocking oligonucleotides resulted in the best resolution of the signal ratio (shown as upper and lower filled
rectangles).
into vials, and stored in a deep-freezer until use [12]. For systematic screening of target genes affected by drugs, a single vial was cultivated and treated with a drug. Cells were collected every five generations until 20 generations, and their genomic DNA was extracted for further microarray analysis. It was determined whether the frozen pooled cells should be activated for more than 1 day in order to obtain the best resolution of tag intensity. To do so, the resolution of tag intensity was compared between activated and non-activated samples (Fig. 7). When the number of up-tags (shown as curved red lines), down-tags (shown as curved blue lines), and total up/down-tags (shown as curved gray lines) were plotted against tag intensity using the same amounts of total intensity, the tag intensity of activated cells showed a better, broad distribution from 5.0 to 5.8 (right panel) in comparison to that of the non-activated cells (from 3.7 to 4.0; left panel).

Summary of available bar-codes
Initially, 4,526 strains in total were pooled and entered into the optimization process. However, the bar-codes of 85 strains were eliminated due to the possibility of cross-hybridization, because they harbored a sequence similarity longer than 15 bp with each other by chance. This left 4,441 strains for further analysis. In addition, 125 strains showing noisy intensity for up-tags and/or down-tags were also eliminated, because their intensities were less than 4× the background signal. Finally, 4,316 strains were proven to be useful for the microarray, as they were represented at least once among the up-tags and/or down-tags. In particular, 3,706 strains were represented by both tags, 4,072 strains by up-tags, and 3,950 strains by down-tags (4,072 + 3,950 − 3,706 = 4,316).

Discussion

Molecular bar-code arrays enable the systematic identification of genes required for growth in any condition of interest [2]. These arrays are best known for their use with collections of yeast gene deletions, each of which is tagged with a 20-mer identifying DNA sequence known as a molecular bar-code. However, their extensive usage has been hampered by inevitable defects, such as intensity
Fig. 5. Optimization of the tag mix between up- and down-tags. For the best resolution of tag intensity, up- and down-tags were mixed
at the indicated ratios with various concentrations of blocking oligonucleotides (shown as insets), and subjected to intensity analysis. A 1:1
mix between up- and down-tags resulted in the best resolution of tag intensity.
Fig. 6. Relative growth rate at 30°C and 25°C.
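The slope-based estimate of the relative growth rate described in the Methods can be illustrated with a minimal Python sketch. It rests on assumptions that the source does not spell out: tag intensity is taken as proportional to strain abundance, the response variable is the log2 ratio of a strain's intensity to a pool reference, and the relative growth rate is 1.00 plus the fitted slope per generation. The function names are hypothetical.

```python
import math

def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def relative_growth_rate(generations, strain_intensity, pool_intensity):
    """Return 1.00 + slope of log2(strain / pool) against generations.

    Assumption: a strain growing at r times the pool rate changes its
    abundance ratio by a factor of 2 ** (r - 1) per pool generation.
    """
    log_ratio = [math.log2(s / p) for s, p in zip(strain_intensity, pool_intensity)]
    return 1.0 + least_squares_slope(generations, log_ratio)

# Sampling every five generations up to 20, as in the pooled experiments.
generations = [0, 5, 10, 15, 20]
pool = [1000.0] * 5
slow_strain = [1000.0 * 2 ** (-0.05 * g) for g in generations]
print(round(relative_growth_rate(generations, slow_strain, pool), 3))
```

With these synthetic intensities the slow strain is recovered at 0.95 relative to a neutral strain at 1.00, the standard value used in the text.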
bias and bar-code mutations [15,16]. In a previous study [16], we reported that 16.6% of bar-codes contained mutations, as judged by a Sanger sequencing analysis of bar-codes. In the study, we optimized the hybridization processes of the array in order to reduce
Fig. 7. Optimization of intensity resolution by activation of frozen pooled cells. The effects of activation of frozen pooled cells on the
resolution of tag intensity were identified in order to determine whether to activate the frozen pooled cells. The activation of frozen library
cells for more than 1 day resulted in better resolution of tag intensity for up-tags (shown as the curved red line), down-tags (shown as the
curved blue line), and the mean of both (shown as the curved gray line).
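The tag-intensity distributions in Figs. 5 and 7 are kernel density plots, which the study produced with R. For readers who want to see what such a curve computes, here is a minimal Gaussian kernel density estimate in plain Python. It is a sketch, not the authors' code: the bandwidth uses a simplified rule of thumb (0.9 × standard deviation × n^(−1/5)) chosen for illustration, and the function names and sample values are hypothetical.

```python
import math
import statistics

def rule_of_thumb_bandwidth(values):
    """Simplified rule-of-thumb bandwidth (an assumption for this sketch)."""
    return 0.9 * statistics.stdev(values) * len(values) ** (-0.2)

def gaussian_kde(values, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `values` on `grid`."""
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(values)
    norm = 1.0 / (len(values) * h * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)
        for x in grid
    ]

# Hypothetical tag intensities (x1,000) spanning the activated-cell range.
intensities = [5.0, 5.2, 5.4, 5.6, 5.8]
step = 0.01
grid = [3.0 + step * i for i in range(501)]
density = gaussian_kde(intensities, grid)
peak = grid[density.index(max(density))]
area = sum(density) * step
```

A broader curve, as reported for activated cells, means that tag intensities are spread over a wider range and individual tags are easier to distinguish.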
the inevitable defects originating from dozens of rounds of each hybridization step.
The universal primer pairs to amplify molecular tags were theoretically designed as 20-mer oligonucleotides. However, during the optimization process, it was elucidated that their actual lengths were a 20-mer/19-mer and a 17-mer/18-mer for U1/U2 and D1/D2, respectively. Furthermore, the annealing temperature was empirically determined to obtain maximum yield with fewer noisy bands.
In response to technological advances in microarrays, the second version of the GeneChip array was improved to show better performance without a change in feature size (11 µm) despite a twofold reduction in feature capacity from 100K to 48K. For example, the second version did not have separate sectors and mismatched probes, with fewer negative and positive controls. In contrast, the second version of the array in budding yeast [8] reduced the feature size from 24 µm to 8 µm, while retaining 100K features, but using a similar strategy to ours for probe allocation.
Compared with the optimization process of budding yeast arrays [8,10], a couple of optimization steps in fission yeast arrays are peculiar. Regarding blocking oligonucleotides, an intriguing observation was made that the concentration required for up-tags was three-fold higher than that required for down-tags. This unexpected phenomenon would make sense under the assumption that the PCR yield of up-tags would be higher than that of down-tags. When the DNA sequences of PCR primers were carefully checked, the universal primer D2 for down-tag PCR contained a “G” stretch, with six “G’s” straight in a row. This “G” stretch could result in a lower yield for down-tag PCR than for up-tag PCR. It was a mistake that we did not carefully check for “G” stretches inside the PCR primers. Next, a recessive ts-mutation unrelated to the gene deletion was contained in a subset of our deletion collection, which required an extra step to optimize the culture temperature.
Overall, 95% of the tags (4,316/4,526) in the deletion library were represented at least once among the up-tags and/or down-tags. At first glance, a 95% success rate appears good, but there might exist ~250 false-positive tags supposing a 5% detection error among ~5,000 deletions in fission yeast. To circumvent this problem that arises from both yeast collections, the problematic bar-code probes were corrected in the second version of the array as referred to earlier, as far as sequence data of the tags were available. However, the array technology still requires an upgrade to completely eliminate potential defects, which promises to make the fission yeast gene deletion library a reliable tool for understanding molecular gene function in a systematic way. In this regard, a platform based on next-generation sequencing (NGS) analysis would be innovative and would avoid potential errors, because each bar-code could be counted by direct sequencing irrespective of the inevitable defects caused by array hybridization. The results of this study serve as a basis for an innovative upgrade of the present microarray technology to a future NGS technology.

ORCID

Dong-Uk Kim: https://orcid.org/0000-0003-4088-0893
Minho Lee: https://orcid.org/0000-0002-0168-9546
Sangjo Han: https://orcid.org/0000-0002-7644-3671
Miyoung Nam: https://orcid.org/0000-0002-7465-4029
Sol Lee: https://orcid.org/0000-0003-1743-2419
Jaewoong Lee: https://orcid.org/0000-0002-7833-9528
Jihye Woo: https://orcid.org/0000-0003-4019-5570
Dongsup Kim: https://orcid.org/0000-0002-5916-6799
Kwang-Lae Hoe: https://orcid.org/0000-0002-3943-4549