Spectronaut™: User Manual
1 General Information
Spectronaut can analyze DIA data without the use of a retention time calibration kit. However,
the addition of the iRT Kit is highly recommended as it ensures calibration on difficult matrices
and allows for detailed quality control readouts.
❖ Pulsar Database Search Engine
• Improved performance for large protein databases (> 2 GB) with Pulsar
• Improved identification and performance for unspecific searches
1.3 Computer System Requirements
Spectronaut™ is only available for Windows operating systems. Command line operation is
also supported (see Section 3.10.6). The minimum and recommended system specifications
are described in Table 1.
RAM: minimum 16 GB of memory or more; recommended 64 GB or more (1 precursor in 1 run
amounts to ~0.5 KB of RAM).
The memory growth for a given experiment can be estimated using the following equation:
RAM (GB) = 5.0 + (0.6 × n × r) / 1024²
where 𝑛 is the number of precursors in the library and 𝑟 is the number of runs in the
experiment. The baseline memory consumption (estimated at 5 GB in this equation) can vary
by vendor and gradient length. To get an estimate for the baseline, you can analyze a single,
representative raw file with the target library. Spectronaut has been successfully tested
running 1,000 2 h DIA raw files with a library of 200,000 precursors on a 128 GB RAM system.
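As a worked example (a quick sketch of the estimate above, not a measurement), the tested configuration of 200,000 library precursors targeted in 1,000 runs can be plugged into the formula as follows:

    def estimate_ram_gb(n_precursors: int, n_runs: int, baseline_gb: float = 5.0) -> float:
        """RAM estimate from the formula above: baseline + 0.6 * n * r / 1024**2 (in GB)."""
        return baseline_gb + 0.6 * n_precursors * n_runs / 1024 ** 2

    # Tested configuration mentioned above: 200,000 precursors in 1,000 runs.
    print(round(estimate_ram_gb(200_000, 1_000), 1))  # 119.4 -> fits the 128 GB system used

Remember that the 5.0 GB baseline used here is only an estimate and varies by vendor and gradient length.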
1.4 Post Installation Recommendations
1. Directories: Spectronaut™ will set all directories to the C:\ drive by default. However, it is
likely that the C:\ drive has limited storage capacity. Thus, we strongly recommend
changing the Temporary Directory and the Local Search Archives directory to a local
destination with enough free storage. To do that, go to the Settings Perspective → Global
→ Directories (Figure 1).
Figure 1. Change the default location of Local Search Archives and Temporary Directories to local
destinations with enough storage capacity.
1.5 Supported Mass Spectrometers
Spectronaut™ supports mass spectrometry DIA data from Thermo Scientific™, SCIEX,
Bruker, and Waters. The specific supported models are:
1.6 Supported Data Acquisition Methods
Spectronaut™ analyzes a large variety of DIA data. Minimum requirements are a reversed
phase chromatography with either a linear or nonlinear gradient that spans at least 10-35%
acetonitrile. Methods acquiring MS1 and MS2 scans are recommended; methods acquiring
only MS2 scans are also supported. For more information on setting up a DIA acquisition
method, please refer to Box 1. The cycle time of the DIA method should be in the range of 2-3
seconds depending on the peak width of the chromatography used. MS1 as well as MS2 scan
ranges can be segmented. The MS2 scans should cover at least 500-900 m/z of precursor
range. More specifically, Spectronaut supports HRM™ (Bruderer et al., 2015), WiSIM-DIA
(Kiyonami, 2014), AIF (Geiger et al., 2010), SWATH™ (Gillet et al., 2012), SWATH™ 2.0
(Lambert et al., 2013), SONAR™ (Moseley et al., 2018), BoxCar DIA (Meier et al., 2018),
FAIMS Pro (Box 2; Bekker-Jensen, Martinez-Val, et al., 2020), dia-PASEF (Box 3), and
HDMSE (Distler et al., 2014). RTwinDIA (Li et al., 2019) is supported, as is the
staggered/shifted window method (Amodei et al., 2019) upon MS2 demultiplexing via the
HTRMS converter (co-installed with Spectronaut; see Section 4). Although fractionation in DIA
experiments is not recommended (Box 9), it is supported, including gas phase fractionation.
Multiplexed DIA (Egertson et al., 2013) and MSE (Silva et al., 2006) are not supported. In case
you experience technical problems with the software, or if you have feature suggestions,
please contact support@biognosys.com.
Although MS1 information in DIA is not strictly required, it is highly beneficial. MS1 information boosts
sensitivity and provides useful orthogonal information that significantly improves peak picking and scoring,
leading to a higher number of identifications (roughly 20-30%). It also adds hardly any time to the cycle
time.
In a DIA analysis, the MS1 XIC is reconstructed and evaluated by a set of different scores that focus mostly
on mass accuracy, isotopic pattern, XIC shape, and intensity. Together with the MS2 level scores, they
are weighted and incorporated into the final peak scoring. In a peptide-centric approach, the MS1
information will add to the scoring, but the scoring is not dependent on an MS1 signal being of good quality
or even present at all. For example, the dynamic range of certain biological samples can cause some
peptides to remain undetected in the MS1 scan. In such cases, if the MS2 signal is good enough,
Spectronaut can recover that peptide ion information.
In contrast, a spectrum-centric approach relies strongly on the good quality of both MS1 and MS2 signals
for precursor identification. By default, the MS1 signal is not used for quantification, unless actively chosen
in the DIA analysis settings.
If you are interested in further insight, please learn more in our MCP article on the optimization of
experimental parameters in DIA (Bruderer et al., 2017).
With the introduction of Spectronaut 14, we greatly increased the support for the FAIMS Pro™ device for
ion mobility (IM) filtering. DIA methods with FAIMS usually come in two main flavors: single CV methods
and multi CV methods. While single CV methods can be used with any already existing spectral library
(including those not recorded with a FAIMS device), DIA methods with multiple FAIMS windows require
matching IM (CV) annotation to be present in the spectral library to function.
To generate spectral libraries for FAIMS DIA we recommend the usage of Pulsar, our integrated search
engine. Pulsar supports the search of both FAIMS DIA and DDA methods for library generation.
Alternatively, one can also use Proteome Discoverer for generating spectral libraries from FAIMS DDA.
                      Single CV      Multi CV
Standard Library          ✓
FAIMS Library             ✓              ✓
directDIA                 ✓              ✓
To analyze DIA data with multiple CV windows, one requires a library with CV annotation per precursor
(ideally measured on a DDA or DIA method with similar CV settings). The analysis of single CV DIA files
does not require any specialized libraries and can therefore be done with pre-existing libraries that were
generated on DDA or DIA data without the use of a FAIMS Pro™ device.
Alternatively, you can also analyze FAIMS DIA runs without the need for a spectral library using directDIA.
Box 3. Ion Mobility dia-PASEF with Bruker timsTOF Pro™
Spectronaut supports dia-PASEF workflows. Spectronaut processes dia-PASEF data based on a high-
precision ion mobility (IM) calibration workflow, which is conceptually similar to retention time calibration
(Escher et al., 2012).
To analyze dia-PASEF data, it is recommended to use a library in which the precursors are annotated with
their expected IM. While this is not mandatory, it will positively impact your analysis if your library contains
IM information. To generate a library with IM annotation, you must use the Pulsar search engine, as none
of the other external search engines, such as MaxQuant, are supported. Pulsar can generate libraries from both
PASEF and dia-PASEF runs.
Spectronaut can also generate libraries with in-silico predicted IM. In the library generation settings, you
can enable the deep neural network to predict IM for libraries that do not have the IM dimension. This also
allows you, for instance, to predict IM for libraries generated on instruments other than the timsTOF Pro,
such as Orbitrap libraries.
                  PASEF      dia-PASEF
Pulsar              ✓             ✓
MaxQuant
2 Getting Started
Spectronaut™ software licenses can be requested on our webpage. We also provide free
licenses for a trial period upon request on our webpage. After requesting a license, you will
get an email with a link to the installer and an activation key for the software. If this is not
the case, please contact us at support@biognosys.com.
IMPORTANT NOTES:
1. Your license will start running from the moment we generate the activation key.
2. Activation keys are computer-bound. If you need to install Spectronaut on more than
one computer, please contact us at support@biognosys.com.
When you install and start Spectronaut for the first time, you will be asked to activate your
software by pasting your activation key into the Spectronaut activation dialogue. If your
computer can reach our servers, activation will be automatic. If your Spectronaut computer
cannot reach our servers (no internet connection, firewall, etc.), you can also activate your
software offline. The respective instructions will appear after a few seconds if online activation
was not successful: save the registration information file on your computer and send this file
to support@biognosys.com. In general, you will receive an activation file within one or two
working days. To activate Spectronaut using an activation file, click on the "Browse Activation
File…" button in the Spectronaut Activation dialogue.
In Section 3, Spectronaut™ Usage, we will guide you through the software perspective by
perspective. The examples shown for the classic DIA analysis (Section 3.4.1.1) are generated
with the demo data available for downloading here. Please note this demo data was
intentionally prepared to be as small as possible for demonstration purposes. Most DIA
experiments will require larger storage space and more resources to be analyzed.
3 Spectronaut™ Usage
3.1.1 Layout
Spectronaut™ is structured in different levels (Figure 3). The highest level is the
perspectives. Within each perspective, you can often find several pages separated into tabs.
The layout of each page is normally structured into a left menu (tree) containing elements
(nodes) and a right panel containing information related to the selected nodes (plots, reports,
and summaries). The Analysis Perspective features Tree and Grid Views.
1. Spectronaut is full of informative tool-tips throughout the software (Figure 4). They will
appear as you hover over many of the elements.
2. Many functionalities are available by right-clicking an element: the experiment tab, plots, nodes,
etc.
Figure 4. Spectronaut contains nonobvious tips and menus when you hover over some elements or right-click
on them.
3. Warnings are sometimes just informational and do not require action. Errors during
spectral library generation and DIA analysis are shown in red. You can display the full
library and DIA analysis log in the respective perspective (Figure 5). The log provides
messages, i.e., information on the steps executed in the pipeline. Warnings and errors are
displayed in separate tabs. The log can be saved directly as a text file from the Library or
Analysis perspective or can be found in the About perspective under "Show error logs". If
an error occurs, please send the error log to support@biognosys.com.
Figure 5. The analysis and library log shows messages, warnings, and errors in separate tabs. One or more
tabs can be selected and simultaneously displayed.
4. Finally, we would also recommend watching the Spectronaut video tutorials that you can find
here. They will guide you through the basic steps of setting up your new DIA analysis with a desired
workflow.
Make sure you have everything you need ready before starting your analysis in Spectronaut™.
Two quantitative analyses are supported: classic DIA data extraction using a DDA spectral
library or direct search of the DIA data, directDIA™. Table 2 shows which resources are
required for each workflow.
Table 2. Input resources for each DIA approach supported in Spectronaut.
The main tasks you can perform in the Library Perspective of Spectronaut™ are:
1. Generating a library with Pulsar, Biognosys' proprietary search engine (Section 3.3.1).
2. Generating a library using search results from external search engines.
3. Importing an external library.
Guidelines on how to generate the data for an optimal spectral library can be found in Box 4.
• Thermo Scientific™, with and without FAIMS (DDA, DIA, and PRM with MS1
information)
• SCIEX (DDA and DIA/SWATH™)
• Bruker, including TimsTOF Pro (DDA and DIA)
• Waters (DDA and HDMSE)
Figure 6. Library Generation with Pulsar. In the Library Perspective, Spectral Library tab click "Generate
Library from Pulsar…". Follow the wizard to complete the process.
Every time a search is performed, Spectronaut will save the results (PSMs) for each run as a
Search Archive (Box 5). These search archives can be used to generate libraries without the
need to search the runs from scratch. Run files and Search Archives can be combined
conveniently to generate new libraries. Refer to Table 3 to see a summary of the resources
you will need in each of these cases.
To generate a library from Pulsar, go to the Library Perspective and click on "Generate Library
from Pulsar…" in the bottom left corner. A wizard will appear to help you set up the experiment
(Figure 6). A schematic view showing the wizard steps you will encounter, depending on your
input resources, is shown in Table 4. The sequential steps are described below:
1. Click on "Add Runs from File…" or "Add Runs from Folder…" and select the runs from
which you want to create the library. You can mix runs acquired in different modes. If
you want to generate a library from Search Archives only, skip this step.
2. Choose FASTA File(s) by clicking "Fasta File…". Protein databases can be assigned on
a run basis. Multiple protein databases can be selected. FASTA files can be added at this
step by clicking "Import…" in the bottom left corner. If you are generating a library from
Search Archives only, this page will not be shown.
3. Choose Pulsar Search Settings by clicking "Search settings…" (for detailed explanations
about each setting, see Appendix 2, Pulsar Search Settings, Section 7.2).
Choose either the default schema which can be modified on the fly or a previously saved
setting schema. Schemas can be assigned at either the experiment or run level. Only one
setting schema can be set per run. When nothing is selected for a run, default settings will
be applied. If you are generating a library from Search Archives only, this page will
not be shown.
Box 4. Library Generation Guidelines
To generate a spectral library, typically DDA runs of your samples of interest are acquired, searched
against a sequence database, and the results condensed into a spectral library. To maximize the coverage,
we recommend measuring pools of representative samples that have been fractionated (e.g. using high
pH reverse phase fractionation). Technical LC-MS/MS replicates are still recommended due to the
semi-stochastic nature of DDA proteomics. The optimal number of pools, fractions and replicates depends
on the experimental setup and the complexity of the samples. However, overly large spectral libraries
where only a small percentage can be recovered from the data might negatively influence the sensitivity
of your analysis.
We strongly recommend generating the library in Spectronaut™. Although for most common samples the
iRT Kit is not strictly required, we do recommend spiking the iRT Kit into samples aimed at library
generation. Spectronaut™ will take care of calculating iRTs for all peptides identified even if the iRT Kit
was not used.
4. Next, you can add Search Archives (for more information, see Box 5) to your library.
Search Archives prevent you from having to re-search run files that you have already searched
in the past.
5. Next, you can choose the FASTA files that will be used for protein inference. If you want to
keep the files that were selected for the search space (as set in a previous page for the
runs or the ones that were automatically saved in the archives), you can just skip this step.
6. In Spectronaut you can generate your library with Gene Ontology (GO) annotation
information. To select a GO annotation (*.goa) file at this point, you need to have the file
already imported in the Databases Perspective. Learn how to do so in Section 3.9.4.2.
Before Search Archives were introduced, already searched run files had to be searched again from scratch
to include them in a library with other runs while maintaining control of the FDR. This meant that a great amount
of time and computational resources had to be reinvested.
With Search Archives, every time a library is generated using Pulsar, the result from this Pulsar search is
saved, and will appear in the Search Archive page of the Library Perspective. Search Archives contain the
information from a search before applying any FDR filter. This allows several Search Archives to be
combined with each other or with run files to generate libraries with a proper, library-wide control of the FDR.
7. The next wizard page contains experiment-wide settings for library generation, such as
PSM, peptide, and protein FDR thresholds (for a detailed explanation about each setting,
see Appendix 4. Library Generation Settings - Section 7.4).
8. The last page shows an overview of the whole experiment set-up. Clicking "Finish" will start
the experiment. Using the "View Live Log…" it is possible to follow the progress of the
experiment. As soon as the library is generated, it will appear in the library tree. Libraries
with FASTA files assigned are marked with a blue protein icon.
Table 4. Schematic view of the wizard steps during library generation depending on the input resources
(runs only, runs combined with Search Archives, or Search Archives only)

Wizard step                              Runs only                  Runs + Search Archives     Search Archives only
Choose an experiment name                required                   required                   required
Choose Search Settings                   required (with default)    required (with default)    not applicable
Specify Protein Inference FASTA files    optional                   optional                   optional
Choose library settings                  required (with default)    required (with default)    required (with default)
3.3.2 Library Generation from External Search Engines
We strongly recommend using Pulsar to generate your libraries. However, Spectronaut also
supports generating a library from external search engine results. To do so you will need:
Table 5 summarizes the type of files or folders needed for each search engine, and whether
some actions are required for correct integration of the post-translational modification (PTM)
annotations.
In addition to the specific result formats above, Spectronaut also supports results in
mzIdentML (.MZID) format (containing fragment ion information). Finally, any search results
can be reformatted into the Biognosys (BGS) Generic Format.
Regarding ion mobility data, Spectronaut supports PASEF and dia-PASEF™ spectral libraries
generated only with Pulsar, whereas FAIMS spectral libraries can be generated with both Pulsar and
Proteome Discoverer™. Currently, no other search engines are supported in Spectronaut for
PASEF, dia-PASEF™, or FAIMS spectral library generation, and the BGS Generic Format
cannot be used to upload such libraries.
1. Go to the Library Perspective → Spectral Library and click on "Generate Spectral Library
from…" in the bottom left corner (Figure 7). Choose your search engine.
2. Navigate to the files or folders containing the search results (see Table 5). Spectronaut will
try to map the run files automatically (see Box 6). If it fails to do so, you will have to manually
link the files by clicking "Assign Shotgun Files…".
3. Choose your library settings in the Library Settings panel or run with the default settings (for
a detailed explanation of each setting, see Appendix 4, Library Generation Settings,
Section 7.4).
4. Choose a FASTA file in the FASTA File tab. If your FASTA file is not yet in the tree, you
can add it at this point by clicking "Import…" in the bottom left corner.
5. Select your Gene Ontology annotation information in the Gene Annotation panel. You
should have your file previously loaded into the Databases Perspective (to learn how to do
this, go to Section 3.9.4). By clicking "Load", Spectronaut will perform the library generation.
Your new library will automatically appear in the Library Perspective upon completion.
Spectronaut will try to map the run files automatically by name matching. First, it will look in your Shotgun
Raw Repository (Settings Perspective → Global → Directories). If unsuccessful, it will look in the search
results location. If the automatic mapping fails, you will see a red cross.
If this is the case, you will have to manually map the runs. Click on "Assign Shotgun Files…" to find the
missing runs. You can either navigate to a common directory or browse for your runs individually. After the
runs have been found, the red cross will change into a green tick mark.
Table 5. Supported search engines and information required by Spectronaut when generating a library from
search results

• MaxQuant. Search result files: evidence.txt or msms.txt. Default peptide modifications: included.
  Custom modifications: imported (*.xml file) from the MaxQuant installation folder
  (\bin\conf\modifications.xml).
• Proteome Discoverer. Search result files: *.msf for PD 1.4, *.pdResult for PD > 2.0. Peptide
  modifications: included with the search results.
• Protein Pilot. Search result files: MS Excel with the suffix "_FDR". Default peptide modifications:
  importation required; the "Unified Modification Catalog.xlsx" file is located in the \ProteinPilot\Help
  folder in Program Files (a).
• Mascot. Search result files: *.dat. Default peptide modifications: download the latest Unimod XML
  database from www.unimod.org/downloads.html (a). Custom modifications: add manually (see 3.9.2.2).

(a) These defaults apply only to upgrades from old versions of Spectronaut. If your first version of Spectronaut
was either X, 13.0 or 14.0, no action is required concerning default modifications.
3.3.2.1 Spectral Library Generation from BGS Generic Format
Spectronaut supports generating libraries from the minimal BGS Generic Format. This allows
end-users to use their favorite search engine with the aid of a basic script that converts
their search results into the BGS Generic Format. This is a tab-separated, plain-text format with
a defined header, where each row represents a PSM. Table 6 shows the information required in
this file. The user selects the BGS Generic Format file as well as the corresponding LC-MS
raw files. Spectronaut tries to automatically map possible modifications to the internal
modification database. If unambiguous mapping is not possible, a UI form will prompt the
user to make the link.
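As an illustration of such a conversion script, the sketch below writes one tab-separated row per PSM under a fixed header. The column names and the input record are placeholders chosen for this example only; the headers actually required by the BGS Generic Format are those listed in Table 6.

    import csv

    # Placeholder column names for illustration; replace them with the headers
    # required by the BGS Generic Format (Table 6).
    BGS_COLUMNS = ["PrecursorMz", "FragmentMz", "iRT", "RelativeIntensity",
                   "StrippedSequence", "PrecursorCharge"]

    def write_bgs_generic(psm_rows, out_path):
        """Write one tab-separated row per PSM, preceded by a single header line."""
        with open(out_path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=BGS_COLUMNS, delimiter="\t")
            writer.writeheader()
            for row in psm_rows:  # each row is a dict keyed by BGS_COLUMNS
                writer.writerow(row)

    # Example call with a single made-up PSM:
    write_bgs_generic(
        [{"PrecursorMz": 523.77, "FragmentMz": 658.37, "iRT": 42.1,
          "RelativeIntensity": 100.0, "StrippedSequence": "SAMPLEPEPTIDEK",
          "PrecursorCharge": 2}],
        "converted_library.tsv",
    )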
In the past, the iRT value of a given peptide in a spectral library was summarized by taking a median
across all runs where the peptide was identified. However, when building a spectral library from
chromatographically heterogeneous data, this can lead to a loss in iRT precision. A good example of this
case would be a situation where you would generate a Hybrid Library using DDA files from public
repositories together with your quantitative DIA files.
To improve the targeted extraction of such data, we introduced the concept of Source-Specific iRT
Calibration. Spectronaut will generate libraries containing as many iRT values as different sources exist
in the dataset. When using this library on a quantitative dataset, Spectronaut will use the iRT calibration
from the best source available for each assay.
By using source-specific iRT calibration, you will be able to keep the iRT precision of project-specific data
while benefiting from the depth of a large resource dataset. This feature, though, is not available for
spectral library generation from the BGS Generic Format.
Spectronaut will create different iRT sources in the libraries in the following cases:
Figure 7. Library Generation from external search engines. In the Library Perspective, Spectral Library tab,
click "Generate Library from…" and choose your search engine. Load your search results and assign your
run files (Box 6). Follow the wizard to complete the process.
Table 6. BGS Generic Format required information. The BGS Generic Format is a tab-separated, clear text
format with a header as specified below. PSM FDR, protein FDR and other filters must be already applied to
the PSMs beforehand.
Header Information
To import an external library into Spectronaut, click on "Import Spectral Library…" in the
bottom left corner of the Spectral Libraries tab in the Library Perspective (Figure 8).
1. Importing a *.kit library (Biognosys' library format). In this case, no further action is required,
and the library will be loaded automatically into the Library tree.
2. Importing a compatible spreadsheet as a plain text, separated value format (*.txt, *.csv,
*.tsv, *.xls). Headers defining your columns are mandatory in these files.
The "Import Spectral Library…" dialogue (Figure 8) will try to auto-detect column names. If
there are new column names, Spectronaut will ask you whether or not you want to store
them as a recognized synonym for this column. This allows Spectronaut to automatically
select these columns the next time you load a spectral library with a similar format (you can
remove the user-defined column synonyms in the Databases Perspective → Table Import).
Figure 8. Importing an external library. The "Import Spectral Library…" dialog only applies to formats other
than *.kit. You can refine your library using the lower panel tabs.
The import function also allows you to refine your library. In the Library Settings panel, you
can choose several options to be applied to your library (for details, see
Appendix 4. Library Generation Settings, in Section 7.4). For example, you can perform
protein inference again. To do this, go to the FASTA File panel and choose your protein
database. You can also add Gene Ontology annotation information using the Gene
Annotations panel. Table 7 shows the recommended fields to achieve the best possible
results.
3.3.3.1 Library Columns
A spectral library is similar to a typical MRM/SRM transition list. Refer to Table 7 to see what
information a library should contain.
Table 7. Library columns to achieve the best possible results with Spectronaut

Header             Requirement    Refers to
IonMobility        optional       For FAIMS raw files, IonMobility refers to the compensation
                                  voltage (CV) applied in the acquisition method. For PASEF data,
                                  ion mobility, whether empirical or predicted, refers to the value
                                  1/K0 expressed in Vs/cm2; for HDMSE files, it is the drift time
                                  expressed in ms.
FragmentType       recommended    The peptide fragment ion type. Usually, this is "y" or "b".
ModifiedSequence   optional       If your peptide is modified, use this column to specify the amino
                                  acid sequence including modifications. The modified sequence
                                  should be constant for one unique precursor. This information is
                                  used to label your precursors in Spectronaut and to automatically
                                  generate a unique ID if necessary. Spectronaut will try to parse
                                  and map modifications from the provided sequences to the
                                  internal modification database. This field does not contain any
                                  label-specific modifications (see LabeledSequence).
To view an example of a library, see our online material for download or export a library from
the Library Perspective in *.xls format.
3.3.3.2 Modification Parsing
Once the library is imported, Spectronaut will try to parse all values imported from the
"ModifiedSequence" and the "LabeledSequence" columns to assign modification
specifications to them. This allows Spectronaut to have greater control over decoy generation.
If possible, Spectronaut will automatically assign known modifications from its internal
database. If a certain modification is unknown, you will be prompted to assign the modification
specification from the database to the new keyword (Figure 9). The only parsing requirement
for external modification definitions is the modification tag which is specified within round or
square brackets. Spectronaut will not parse modifications specified as single letter special
amino acids (such as 'B' for carbamidomethyl cysteine or 'O' for oxidized methionine). You
can remove previously assigned parsing synonyms in the Modifications page of the
Databases Perspective (see Section 3.9.2).
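To illustrate the bracket convention described above, the sketch below splits a modified sequence into its plain residues and the bracketed modification tags. It is a simplified stand-in for Spectronaut's internal parser, and the sequence and modification names used are hypothetical.

    import re

    # A token is either a bracketed modification tag ([...] or (...)) or a single residue.
    TOKEN = re.compile(r"\[([^\]]+)\]|\(([^)]+)\)|([A-Z])")

    def extract_modifications(modified_sequence):
        """Return (stripped_sequence, [(residue_position, tag), ...])."""
        stripped, mods = [], []
        for match in TOKEN.finditer(modified_sequence):
            tag = match.group(1) or match.group(2)
            if tag:
                # The tag refers to the preceding residue (1-based position).
                mods.append((len(stripped), tag))
            else:
                stripped.append(match.group(3))
        return "".join(stripped), mods

    # Hypothetical modified sequence:
    print(extract_modifications("C[Carbamidomethyl (C)]PEPTM[Oxidation (M)]IDEK"))
    # ('CPEPTMIDEK', [(1, 'Carbamidomethyl (C)'), (6, 'Oxidation (M)')])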
Although Spectronaut allows merging libraries, we do not recommend this action, since this
can lead to uncontrolled inflation of the protein FDR. Instead, we strongly advise generating a
new library from a search that includes all relevant LC-MS/MS runs, so FDR remains
controlled. Alternatively, you can use Pulsar Search Archives (see Section 3.3.1 and Box 5).
This being said, two or more spectral libraries can be merged in the Library Perspective. To
do so, select the libraries you would like to merge while holding the Ctrl key and then right-
click to open the context menu and select the "Merge" option. This will open a setup window
similar to when generating a library from a database search (see Section 3.3.2 and Figure 7).
Figure 9. Modification assignment during import of an external library. Add the synonym from the database
by double-clicking or dragging it to the unassigned modifications.
Please note that if the libraries used for merging have different types of protein annotations,
protein counts in the merged library will be inflated (as the same protein could be counted
twice). This will not happen if the libraries were generated in Spectronaut performing protein
inference using the integrated IDPicker algorithm (Zhang et al., 2007).
Spectronaut provides several different plots with an overview of your library. You can access
these plots by clicking on the library node in the tree and selecting the relevant plot in the right
panel (Figure 10).
Figure 10. Spectral library overview. Several plots can be selected from the drop-down menus. In this
example, the top plot shows the Library Summary, while the bottom plot shows the missed cleavage overview.
There are two ways to generate a labeled (or spike-in) library: either from an existing label-
free library or from scratch. If you already have a label-free library that you would like to label,
you can generate a labeled library from it in silico. By right-clicking on a library
in the Library Perspective, you can attach heavy labels to an existing library (Generate
Labeled Library). Doing so will open the Label Editor form where you can select which labels
should be applied to the existing library by double-clicking from the list on the right-hand side
(Figure 11). The selected library will be stripped of any pre-existing labels. The selected
workflow will be included in the library to define how these new peptides will be treated during
analysis.
If you want to generate a labeled library from scratch, specify the channels (up to three
channels allowed) and labels to be searched (Pulsar Search > Labeling > Labeling applied,
and tick the box). Note that the light channel also has to be specified, by selecting a channel
and leaving the labels text box empty. Then, under the Workflow node of the Library Generation
settings, select “In-Silico Generate Missing Channels” and choose the appropriate workflow
(i.e., label, spike-in, inverted spike-in). This option will add the missing channels to yield a
homogeneous set of channels for all peptides.
For detailed information about the supported, labeled workflows, see Section 3.4.1.5.
Additionally, find a short video tutorial on labeled library generation here.
Figure 11. Right-clicking on a library opens a context menu with several options, such as generating a
labeled library or enabling a library for QC. The figure shows an example of applying a SILAC label to an
existing library. The two isotopic modifications Arg6 and Lys4 are selected by double-clicking to be applied
as labels to all applicable peptides.
When right-clicking on a spectral library in the Library Perspective you have the option to
generate a new QC kit using this library (Enable QC, Figure 11). This will select 250 highly
abundant peptides from the spectral library which will be added as a QC kit to the quality
control perspective. The selection of peptides can also be altered manually within the dialogue.
These peptides can then be tracked for quality control purposes within the quality control
perspective whenever the corresponding library is used.
3.4 Analysis Perspective
Spectronaut™ starts up in the Analysis Perspective. This perspective allows you to:
Setting up your DIA analysis is straightforward, thanks to the setup wizards in Spectronaut.
Before starting, see Table 2 to make sure you have everything you need. After completing the
wizard and clicking "Finish", Spectronaut will switch back to the Analysis Perspective and start
the analysis.
First, in every run, Spectronaut will perform a basic, linear iRT calibration using the iRT Kit
peptides; then, Precision iRT calibration will be applied using a stored set of endogenous iRT
peptides or using the novel Biognosys' Deep Learning Assisted iRT Calibration (as defined in
the settings). Browsing your data is possible a few seconds after the initial calibration process
is finished. Once the analysis has finished the number of unique precursors, peptides and
protein groups identified for the q-value cutoff defined in the settings (default 0.01, the
equivalent of an FDR cutoff of 1%) will be shown at the bottom right (Figure 21).
Box 8. Spectronaut plots: how to get the most out of them
Spectronaut provides, across all its perspectives, a comprehensive set of plots of many types to show
you all the relevant details about your analysis, from MS data acquisition to post-analysis results. Most
plots in Spectronaut are interactive and customizable to some extent. For example, you can zoom in on a plot by
selecting the area you want to enlarge (return to the default scale by right-clicking on the plot), and drag or
navigate a plot horizontally with Ctrl+click and drag.
By right-clicking on a plot, you will find a context menu with an extensive list of functionalities (see figure
below):
In this example, you can show or hide the legend, save the data used for the plot, and choose the unit you
want to show numbers for (proteins, peptides), among many other options.
Most examples shown in this Section are generated with the demo data available for
download. Please use this data to reproduce these results. Alternatively, you can generate
your own DIA data to test Spectronaut. To check what resources you will need to perform a
DIA library-based analysis, refer to Table 2.
To start a library-based DIA analysis, go to the Analysis Perspective (add a new experiment
tab if needed) and click on "Set up a DIA Analysis from…" in the bottom left corner. This will
let you navigate to your run files or folders. Once you have chosen your DIA data, a wizard
will start guiding you through the set-up (Figure 12):
Figure 12. Setting up a library-based DIA analysis. After selecting your run files, a wizard will guide you
through the process. You will be prompted to assign one or more libraries, the FASTA files, the GO annotation
file, and to select your analysis settings. Finally, you will see a summary of the analysis set-up (Figure 13).
2. Choose Spectral Library. Select the library from the Recently Used list, From File, or From
Library Perspective. If the library is chosen from file, this action is similar to importing an
external library, described in Section 3.3.3. Click "Load" to add the library to your analysis.
3. On the next page, you select your DIA Analysis Settings Schema. Use one of the schemas
available or modify one on the fly. These settings will define many important aspects of
the analysis, such as FDR thresholds, quantification preferences, how to filter your data,
among others. The BGS Factory Settings (default) schema is a good starting point for most
projects. Find a detailed explanation of the analysis settings in Appendix 1. DIA Analysis
Settings (Section 7.1).
4. Choose a protein database (FASTA file) if you want Spectronaut to perform protein
inference. Spectronaut performs protein inference according to the IDPicker algorithm
(Zhang et al., 2007). Refer to Appendix 1. DIA Analysis Settings (Section 7.1) for more
details about this option.
5. Specify your experimental set-up (conditions, replicates) so Spectronaut can test for
differential abundance (paired and unpaired Student's t-test) and perform other
Post Analysis processing steps. See Section 3.4.1.4 for more information about how the
condition editor works.
6. Choose a Gene Ontology (GO) annotation file if you want Spectronaut to give you extra
biological insight into your experiment. This includes GO term enrichment and GO
clustering.
7. Before clicking "Finish", a summary of your analysis set-up will be shown (Figure 13).
Figure 13. Summary of the analysis set-up. Click "Finish" to proceed with the calculations.
3.4.1.2 Performing a directDIA™ Analysis
Spectronaut enables directDIA™, Biognosys' library-free DIA workflow. This novel workflow
allows you to directly search DIA files using nothing but a FASTA file for identification.
DirectDIA is a two-step process: 1) it creates a library by performing a spectrum-centric analysis
of your DIA runs based on the specified protein database, which is conceptually similar to DIA
Umpire (Tsou et al., 2015), and 2) it automatically uses this library to perform a targeted
analysis of the same DIA runs. In the end, you get the results from the targeted analysis and,
optionally, you can save the generated search archive by selecting the option in the Reporting
node of the Global Settings. Currently, directDIA is supported for Thermo Scientific™ Orbitrap,
SCIEX TripleTOF®, Bruker timsTOF Pro, and Waters Synapt data. For the resources
required to perform a directDIA analysis, refer to Table 2.
To start a new directDIA analysis, go to the Analysis Perspective (add a new experiment tab
if needed) and click "Set up a directDIA Analysis…" in the bottom-left corner. This will prompt
you to navigate to your run files. Once you have chosen your DIA data, a wizard will start
guiding you through the set-up (Figure 14):
Figure 14. Starting a directDIA analysis. After selecting your run files, a wizard will guide you through the
process. You will be prompted to assign the FASTA files, the GO annotation file, and to select your analysis
settings. Finally, you will see a summary of the analysis set-up before clicking "Finish".
This feature allows for the direct comparison of different DIA methods within one directDIA
experiment. By enabling the Method Evaluation parameter in the directDIA Workflow settings
(Figure 15), the user will perform a separate Pulsar DIA search per condition to better compare
different DIA methods within one experiment. The Method Evaluation workflow is not suited for
quantitative experiments.
Figure 15. The Method Evaluation workflow allows for direct comparison of different DIA methods within one
directDIA experiment.
To let Spectronaut perform the differential abundance tests (paired or unpaired Student's t-
test) and other condition-wise metrics, you need to specify your experimental set-up during
the configuration of your analysis. Spectronaut will ask you to annotate your runs and specify
to which condition, biological replicate, and fraction (if applicable) they belong (see Box 9 to
learn more about fractionation in Spectronaut). Each condition in Spectronaut will get a color
assigned during the set-up, which will be used for post-analysis plot labelling. The
Condition Setup panel contains several columns (Figure 16):
Figure 16. Condition Setup panel during DIA Analysis set-up. You can manually adjust your conditions on
the panel or Import Condition Setup from a text file.
Unless actively changed to pairwise comparison (paired Student's t-test), or disabled in the
Analysis Settings, Spectronaut will perform an unpaired comparison (two-sample t-test)
between all conditions specified in the Condition Setup panel. The results are reported in the
Post Analysis Perspective.
There are several ways to introduce the annotation information into the Condition Setup panel:
1. If you maintain a file-name structure which is self-annotating, you can define a parsing rule
to automatically parse the conditions and replicates from it (Settings → Global → General
→ File Name Parsing Schema; see Section 3.10.5.1). A parsing rule is a set of instructions
that inform Spectronaut what type of information you want to extract from the file name and
how.
2. The Condition Setup table is editable: you can directly write in any of the fields to enter your
information (Figure 16). The table will recognize your changes and adapt them to the rest
of the fields automatically. Be aware that the condition editor is space and case sensitive.
3. Import your annotation from an external text file. The easiest way to do this is by exporting
the Condition Setup Spectronaut suggests, modifying it, saving it as a text file, and importing it
back, as in the sketch below.
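For option 3, the imported annotation is a plain-text table. A hypothetical example with tab-separated columns (shown aligned here for readability) is sketched below; the actual column headers should be taken from the Condition Setup export itself, so treat the ones here as placeholders:

    File Name              Condition    Replicate
    sample_control_01.raw  Control      1
    sample_control_02.raw  Control      2
    sample_treated_01.raw  Treated      1
    sample_treated_02.raw  Treated      2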
Box 9. Sample Fractionation in Spectronaut
We do not recommend sample fractionation in DIA analyses. While for DDA, sample fractionation results
in significantly higher coverage, the gains in the case of DIA are less significant. In general, increasing the
coverage in a DIA analysis is achieved by optimizing the acquisition method and building a better spectral
library.
In addition, one of the main features of DIA datasets is their low CVs and high reproducibility. The
process of sample fractionation introduces variability, sometimes notably high, which results in a dataset
of lower quality.
Although not recommended, Spectronaut supports sample fractionation. If you have your samples
fractionated, you need to annotate this properly in the Condition Setup. This will allow Spectronaut to
perform fraction-wise normalization. Furthermore, libraries might have to be optimized in case of
fractionation. If a peptide is not expected to be in a fraction, ideally it should not be targeted in that fraction.
The reason for this is shown in the following table:
Labeled Workflows
Spectronaut not only performs label-free quantification. Labeled workflows are also supported,
and specific scoring methods have been developed for each approach.
• Label-free: Default workflow for all channel experiments. Peak detection, scoring and
identification are applied as usual.
• Labeled: Peak detection and scoring will be applied to all channels. Quantification in
Post Analysis will be performed on the light to heavy ratio.
• Spike-in: Peak detection will be performed on only the reference (heavy) channel. Scoring
and identification will be performed on the target (light) channel. The heavy channel is
expected to be easily detectable and considered a peak-picking aid in this experiment.
Quantification in Post Analysis will be performed on the target to reference ratio.
• Inverted Spike-in: Similar to spike-in, but the light channel is considered as the reference.
For more details about how to set these workflows, see Appendix 1. DIA Analysis Settings
(Section 7.1).
Please note that for labeled workflows, post analysis with statistical testing is not currently
supported in Spectronaut. Both of those steps can be done in a downstream process after
exporting the Spectronaut Analysis Report. All labeled workflow reports contain separate
columns showing the light, medium, and heavy channels as well as the ratio of target to reference.
Spectronaut also supports the analysis of raw data acquired with an ion mobility dimension. You
can find more information on supported ion mobility DIA acquisition methods in Box 2 and Box
3 (Section 1.6). Ion mobility acquisition is compatible with all supported workflows.
PTM Workflow
The differential abundance testing is available at the modification site level if the PTM analysis
is selected.
First, Spectronaut performs a quantitative site collapse of parent peptides carrying a given
modification at a specific modification site (Bekker-Jensen, 2020). If the parent peptides carry
several modifications of the same type, a separate collapse is performed according to the
modification multiplicity (see the collapse of doubly and singly phosphorylated peptides in Figure
17). Singly phosphorylated parent peptides (multiplicity 1, M1) and doubly phosphorylated
ones (multiplicity 2, M2) will undergo site collapse separately. If a peptide carries three or more
modifications of the same type, it will be reported with multiplicity 3, M3.
Figure 17. Example of quantitative site collapse of phosphorylated parent peptides, performed according to
their multiplicity.
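The sketch below illustrates the collapse conceptually: parent-peptide quantities are grouped by protein, site, and multiplicity, so that M1 and M2 parents of the same site are collapsed separately. The input records, and the use of a simple sum as the aggregation rule, are assumptions made for this illustration, not Spectronaut's actual implementation.

    from collections import defaultdict

    # Hypothetical parent-peptide records: (protein, phospho site positions, quantity).
    parent_peptides = [
        ("P12345", (15,),    1.0e6),  # singly phosphorylated parent  -> multiplicity 1
        ("P12345", (15,),    2.0e6),  # another M1 parent covering the same site
        ("P12345", (15, 19), 5.0e5),  # doubly phosphorylated parent  -> multiplicity 2
    ]

    # Collapse quantities per (protein, site, multiplicity); M1 and M2 parents of the
    # same site are kept separate, and three or more sites are reported as M3.
    collapsed = defaultdict(float)
    for protein, sites, quantity in parent_peptides:
        multiplicity = f"M{min(len(sites), 3)}"
        for site in sites:
            collapsed[(protein, site, multiplicity)] += quantity

    print(dict(collapsed))
    # {('P12345', 15, 'M1'): 3000000.0, ('P12345', 15, 'M2'): 500000.0,
    #  ('P12345', 19, 'M2'): 500000.0}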
A major challenge for library based (peptide-centric) data independent acquisition (DIA) is the correct
localization of post translational modifications (PTMs). Standard targeted DIA processing algorithms are
often not specific enough to differentiate between multiple versions of the same peptide differing only in
the PTM site localization. Spectronaut's PTM localization workflow for DIA allows you to benefit from the
sensitivity, accuracy, and precision of a DIA targeted extraction with high-confidence site localization.
The PTM localization algorithm for peptide-centric analysis utilizes information not typically available to a
classic DDA analysis. This includes a usually complete isotopic pattern for all fragment ions and the possibility of
generating short elution chromatograms to correlate with the targeted peak shape. The latter allows for the
systematic removal of any interfering fragment ions that one could not account for in DDA. Combining
these two unique aspects with additional scores based on fragment mass accuracy and intensity shows
excellent performance.
The novel PTM localization algorithm can work on any variable modification (such as phosphorylation,
methylation, acetylation, sulfation, etc.) or combinations of different modifications and does not require
specially generated libraries.
The Analysis perspective features Grid and Tree Views (Figure 3). The Grid View is a protein
centric view of the analysis that allows easy visualization of differential protein expression
(Figure 18). Differential abundances across samples are displayed in colours: red for lower
and blue for higher abundances. You can also filter the list by identified PGs, complete profiles,
coefficient of variation, or by the candidates according to the differential abundance testing. By
dragging a column and dropping it onto the designated area above the grid, you group the list by
that column. For a selected PG you can also display the protein coverage and quantity profile.
Figure 18. The Grid View is a protein centric view of the analysis and shows differentially abundant proteins
across conditions – low abundant in red and high abundant in blue (if compared against the mean for that
protein group), protein coverage, and quantity profiles (log2 quantities).
The Tree View shows the data organized in an expandible tree (left panel) and corresponding
plots, reports, and summaries (right panels) (Figure 19). By default, the hierarchy of the tree
is:
Run
Precursor window
Elution group
Precursor
Fragment ions
The runs are by default filtered by identification, which means that only what has passed all
the identification thresholds specified in the settings is shown (see DIA identification settings
in Appendix 7.1). These include: precursor posterior error probability (PEP) cutoff, precursor
Q-value cutoff, and protein Q-value cutoff at both experiment and run level. You can also see
what has not been identified (“Not Identified”), or remove all the identification filters and
visualize all the data (“Not Filtered”).
You can change the tree structure by right-clicking on the experiment tab → Group by, then
select one of the options (more about these functions in Appendix 6. Experiment Tab Options,
Section 7.6). The most common actions are also accessible through intuitive icons that are
available under the experiment tab (Figure 19).
Figure 19. The analysis Tree View. The data tree is displayed on the left side. Plots and summaries are on
the right side. The drop-down menu in the bottom-left corner allows run filtering. A summary of the number
of identifications is shown in green at the bottom.
In the data tree, you can right-click on any of the elements and execute element-specific
actions such as accepting or rejecting a precursor ion or refining the fragment ion selection
(Figure 25).
The right side of the Tree View is divided into an upper and a lower panel to display two different
plots at the same time. By selecting the same plot in both panels, you will get it in a large view.
The plots change based on the selected element in the data tree (e.g. run level, protein level,
precursor ion level). To know which plots are available at each level, see Appendix 5, Analysis
Perspective Plots (Section 7.5).
To visualize more than two plots simultaneously, you can use the floating plotting windows
(Figure 21). You can open up to three floating windows per experiment.
Figure 20. Filtering the tree in the Analysis Perspective. Check the box for a filter and give the corresponding
value. An example of filtering for a peptide sequence is shown in this figure.
Figure 21. Visualization of several plots simultaneously on one or many monitors with the floating plotting
windows.
You can also visualize several perspectives simultaneously by detaching them (Figure 22).
This allows, for instance, to keep the Analysis and Post-Analysis perspectives open on
separate monitors. To detach a perspective, select it and press the F12 key on the keyboard.
Spectronaut provides a comprehensive set of plots and reports for the review of the analysis
at different levels: run, precursor and fragment ion.
• Run level plots: information about the calibration status, DIA method used, TICC, run meta
information and cross run performance
• Precursor and fragment level plots: XIC chromatograms, score-centric plots and cross-run
profile visualizations. The latter ones are only available in multi-run experiments and
disabled for experiments containing only one run or peptides that are only targeted in one
run.
Please visit Appendix 5, Analysis Perspective Plots (Section 7.5), to find an example and
a description of each plot.
To learn some tips about how to use the plots in Spectronaut, see Box 8.
3.4.2.2 Tree Filtering
One can apply one or several filters to the data tree. These filters only influence what is shown
in the Analysis Perspective but not, for instance, the Post Analysis Perspective. Select a filter
from the drop-down menu (Figure 20) and set the filter criteria. The filter is now checked within
the drop-down menu. To combine filters, select a different filter and define the value that
should be applied. A precursor must satisfy all selected filters in order to be shown in the
review tree. By default, the Identification filters apply (see Section 3.4.2).
Note: sometimes it is not obvious that a filter is applied. Make sure you check the filter
list before reviewing your analysis further.
You can specify a custom criterion with the "User Group" filter. This value can be set during
the library import by selecting a specific column as "User Group".
Right-clicking on the experiment tab in the Analysis Perspective opens a context menu with
many functionalities that can be applied to the experiment (Figure 19). The most common
actions are also accessible through intuitive icons. To see the full details for these options,
refer to Appendix 6. Experiment Tab Options (Section 7.6). Some of the most relevant ones
are:
1. Save and Save as: Spectronaut will not save the analysis automatically. To save an
analysis you will have to do it manually. You can save your analysis with or without ion
traces (XICs) (Figure 23):
a. With ion traces (FULL): the file generated (*.sne file) will be larger, but Spectronaut
will not require the run files to be available when you load your saved analysis again.
b. Without ion traces (XICs): The *.sne file will be smaller, and you can map the run files
after loading it.
Figure 23. Saving your *.sne file with or without ion traces (XICs).
2. Group by: change the structure of the data tree. The main level will still be "Run". See
details in Appendix 6. Experiment Tab Options (Section 7.6).
3. Settings: this option allows you to review and change many of the analysis settings without
having to run the analysis again. For instance, you can change the FDR cutoffs,
quantification settings, FASTA file for protein inference, conditions set-up, and many more
(for details see Appendix 6. Experiment Tab Options, Section 7.6).
Spectronaut implements a highly reliable peak-picking algorithm with scores and
confidence thresholds adapted to high-throughput workflows in which several thousand runs are
analyzed, several thousand proteins are identified, and hundreds of thousands of precursors are
targeted. Reviewing the results of your analysis is in general not expected or needed.
However, Spectronaut provides the possibility of exploring your results and fine tuning several
aspects manually, if you would like to do so (see Box 11 on how to optimize the manual
reviewing process). For this purpose, the Analysis Perspective provides a number of aspects
you can interact with and modify. The most relevant are:
• Refine elution group integration boundaries. Select an elution group in the data tree
and set the right-side plots to, for example, MS2 XIC. You will see the ion chromatogram
corresponding to the selected elution group within two green lines (Figure 24). You can
manually slide these lines on both sides of the integrated peak, to set different boundaries.
A new q-value should be calculated. A hand icon will appear next to the elution group in
the data tree, denoting it was manually modified.
Figure 24. Reviewing Spectronaut peak picking. Integration boundaries can be modified by dragging them.
The precursor will be marked as manually modified.
• Manually select a different peak in the XIC. Select an elution group in the data tree and
set the right-side plots to, for example, MS2. Hover over the peak you want to assign and
click when the cursor changes into a hand. The integration will be transferred, and a new
q-value should be calculated. Similar to the action above, the precursor will be marked as
manually modified in the data tree.
• Manually accept or reject an elution group: If you right-click on an elution group, you
can manually accept or reject it. The icon next to the precursor will change to denote it has
been manually modified (Figure 25).
• Manually define an interfering fragment ion: in the data tree, when you expand the
precursor, you can see the fragments present in the library for that specific precursor. The
ones used for quantification will have a blue icon, while the ones detected as interferences
have a grey icon (Figure 26). You can define interferences manually by right-clicking on the
fragment ion and unchecking the "Used for Quantification" option.
Figure 25. Reviewing Spectronaut peak picking. Precursors can be manually accepted or rejected by right-
clicking and choosing the option. The precursor will be marked as manually accepted or rejected.
Box 11. Tips to optimize manual reviewing of your data (UI responsiveness)
If you need to manually review and actively navigate through your analysis in the Analysis Perspective,
you might find some processes to be a bit slow and the software not as responsive as hoped. There are
several things you can check in order to make the process as fast as possible:
1. Have your run files locally. Having your run files on a network drive is not recommended and can
significantly slow down the computational processes.
2. Convert the run files to HTRMS files (see Section 4) before running your analysis.
3. If you saved your *.sne file without XICs, re-extract your XICs (see Section 3.4.2.3).
4. Group your data tree by precursor window.
Happy reviewing!
Figure 26. Manually define interfering fragment ions or manually accept fragment ions for quantitation that
were defined as interferences by Spectronaut.
Spectronaut allows you to refine the fragment ion selection of your libraries. You can remove
fragments that show interferences, specifically select fragments to cover interesting
modification sites, or add additional fragments that were not detected in the original DDA
analysis but are nicely visible in DIA.
To perform library refinement, the peptide assay must originate from a library generated in the
Library Perspective. Right-click on the elution group node and select "Refine Fragment
Selection". A new dialog will appear that shows the selected peak in detail (Figure 27). The
list on the left will show you all the fragments of this peptide that are present in the library, and
which of those are currently selected. Additionally, a list of theoretical fragments is generated.
In order to change the selection of theoretical fragments, right-click on the "Theoretical
Fragments" node and select "Set Fragment Filter". This will allow you to expand the set of
theoretical fragments so that it will contain different ion types, as well as common loss types.
Figure 27. Refine your library while reviewing your analysis. You can, for example, use different ion series
(z-ions) by adding them into the fragment tree and selecting them. A preview of how the XIC looks with the
current selected ions is shown on the right. You can also look at how individual ions XICs appear by selecting
individual ions.
On the right, you can see a preview of how the fragment selection will affect both the XIC and
the match between the predicted and measured fragmentation pattern. Please note that
theoretical fragments will not contain a predicted intensity. To add or remove a fragment from
the spectral library, simply check or uncheck the corresponding fragments.
After the new fragment selection is complete, click "Apply Selection" in the bottom-right corner.
Please note that the refinement is not performed immediately. In order to effectively change
the selection in your experiment, as well as in the spectral library, right-click on the experiment
tab in the Analysis Perspective and click "Commit Library Changes…".
This extra step is necessary since the update will require a re-extraction of the affected peptide
from all currently loaded runs. Spectronaut will, therefore, perform this operation in batches
once all manual fragment selection is done, and not for each peptide individually. All changes
made to a peptide's fragment selection will only take effect once the library changes are
committed.
After clicking "Commit Library Changes…" another popup window will appear asking you for
a version name. Spectronaut features a version control for spectral libraries that allows you to
switch between different versions of a given library. This way, any changes to the original
library can be reverted. In order to change the version of a library, go to the
Library Perspective, right-click on the respective library node and select "Set Selected
Version". A small window featuring a drop-down list will allow you to select which version of
this library to use.
3.5 Post Analysis Perspective
The Post Analysis Perspective in Spectronaut™ reports processed results for your analysis.
It shows summary information about identification, quantification, and results from the
differential abundance test, hierarchical clustering, principal component analysis and GO
terms enrichment and clustering (Figure 28). Moreover, when the PTM analysis in the PTM
workflow settings is selected, Spectronaut provides differential abundance results at the PTM
site level. For example, the PTM vs protein fold changes plot, showing protein group log2
ratios plotted against log2 ratios of modification sites, helps to identify changes on the
modification site level that are independent of changes of the protein abundance.
Furthermore, the digestion efficiency plot, especially useful for LiP workflows, shows the
number of peptides per digest type and enzyme. Finally, under specialized workflows, a plot
dedicated to LFQ benchmark studies is available (Figure 83).
Figure 28. Post Analysis Perspective. Several summaries, tables, and plots are available as you navigate
through the nodes in the tree. In the figure, you can visualize the Candidates list. You can modify your
candidate set by filtering directly on this table. The default applied filters are a 1.5-fold change and a q-value
of 0.05.
Here you will find some experiment-wide information that will give you a rough idea about
dataset characteristics. Under Overview, you will see a summary with the number of proteins
and peptides identified (by condition), missed cleavages, library recovery, and other general
metrics about the experimental outcome. To support this overview, you will find several plots
related to the number of identifications, the data completeness, the coefficient of variation, and
the normalization. On each of these plots, you can change many settings by using the right-
mouse click option. To see the full details of each plot, see Appendix 7. Post Analysis
Perspective Plots (Section 7.7). Learn more about plots in Spectronaut in Box 8.
Under this node, you will find plots related to the behavior of the target and the decoy
distribution estimation. This behavior defines the discriminant scores (Cscore), q-values
(Qvalue), and sensitivity on the precursor level. Scoring histograms are shown for each
workflow in the current experiment depending on your experimental set-up. Find all the details
about these plots in Appendix 7. Post Analysis Perspective Plots (Section 7.7).
The Binned Identification plots in the Analysis Details show the number of identifications
across conditions, binned according to three variables: iRT, intensity, and m/z. This
provides valuable feedback on the performance of the measurements for different conditions
according to technical criteria such as liquid chromatography and mass spectrometer
performance. The Data Points Per Peak plot shows the data points per peak in each condition, as well
as their distribution with the median. The Binned Coefficients of Variation plots show CVs within
conditions, binned according to the same three variables: iRT, intensity, and m/z. See
these plots in Appendix 7. Post Analysis Perspective Plots (Section 7.7).
The results of the differential abundance testing will show up under this node. The Candidates
node shows a table with the results, annotated by paired or unpaired t-test comparison
(Figure 28):
• The direction and the percentage of change are noted by color and color intensity,
respectively; the level of significance is noted by the size of the circle.
• The fold changes are expressed as log2 transformed ratios of averaged replicates
(AVG Log2 Ratio).
• The table is, by default, filtered by a q-value (multiple testing corrected p-value) of 0.05 and
an absolute log2 ratio of 0.58 (corresponding to a 1.5-fold change, since log2(1.5) ≈ 0.58). You
can change these filters to your preferred cutoffs. The filters applied to these two metrics in
the table will automatically apply to the volcano plot as well.
• You can add and hide columns in this table by right-clicking on any of the headers and
selecting Column Chooser. For example, you may want to add the p-value column.
• It is possible to search for any text in the table with the Search field at the bottom of the
table.
The candidates table can be exported as an Excel file by clicking on "Export Table…" at the
bottom.
In addition to the table, the candidates will be shown as plots on the right side. These plots
can be customized in several ways by right-clicking on them and choosing your preferred
options.
The Principal Component Analysis plot shows clustering of the samples based on their protein
profiles.
Figure 29. Principal Component Analysis plot. By right-clicking on the plot you can perform
several actions, such as Save Image As, export the data matrix, or modify the scaling.
3.5.4.3 GO Enrichment
Under the Differential Abundance node, you will also find the results from the Gene Ontology
(GO) term enrichment and the GO term clustering. If you added GO annotation to your
experiment, either within the library or during the analysis set-up, Spectronaut will perform a
GO term enrichment test.
Spectronaut comes with the human GO annotation implemented. If you are working with a
different organism, you can download the relevant annotation from
http://geneontology.org/page/download-annotations or any other source, and import it into
Spectronaut via the Databases Perspective (Databases Perspective → GO Databases →
Import Gene Annotation). Your annotation is then ready to be appended to a library during
library generation, or to be selected during the analysis set-up. Please note that the GO
annotation file you use for the library and for the analysis set-up should match the FASTA
file selected for those steps.
The term enrichment test checks whether biological processes, functions, or cell
compartments are over- or under-represented within the candidate set. In other words, it
highlights processes, functions, or compartments affected by the experimental conditions.
During the analysis, the first step is to determine how frequently a GO term occurs in the
background proteome, i.e., all proteins identified throughout the whole experiment. Based on
this information, the term is expected to be found a certain number of times in a random set
of a given size. If the GO term occurs more frequently in your candidate set than expected, it
is considered significantly overrepresented; if it occurs less frequently, the term is
considered significantly underrepresented (Mi et al., 2013). The level of significance is
given by a p-value. Spectronaut applies two multiple testing correction methods to this
test: Bonferroni (Dunn, 1961) and Benjamini-Hochberg (Benjamini & Hochberg, 1995), for
which the corresponding corrected p-values are also displayed.
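As a hedged, purely numerical illustration of this logic (all numbers are hypothetical): if a GO term annotates 5% of the background proteome and your candidate set contains 200 proteins, about 0.05 × 200 = 10 candidates would be expected to carry that term by chance. Observing, for example, 30 such candidates would indicate overrepresentation, and the p-value quantifies how unlikely a deviation of that size is under random sampling; the Bonferroni and Benjamini-Hochberg corrections then account for the fact that many GO terms are tested at once.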
If you change the candidate set, i.e., you apply a different filter in the Candidates table, the
enrichment must be recalculated.
The result of the enrichment test will be shown as an interactive table where you can group
the results according to a column or filter according to any of the features (Figure 30). Similar
to the Candidates table, you can easily search within the table with the search field at the
bottom. The GO Enrichment table can be exported by clicking “Export Table…" below the
table panel.
Figure 30. GO term enrichment result. Similar to the Candidates table, each column can be filtered according
to several options. You can also group the results by any of the column headers by dragging them to the
Group by field. If you want to do GO clustering on manually selected terms, use the first column of this table.
3.5.4.4 GO Clustering
GO clustering is a step further towards reducing the complexity of the differential abundance
test results into an easier to interpret picture. If your GO term enrichment seems too
convoluted, GO clustering will group related terms by similarity. The result is a shorter list
showing groups of GO terms. GO clustering works based on the REVIGO algorithm (Supek
et al., 2011). The semantic similarity of two terms is calculated based on their position/relation
in the Gene Ontology graph.
Spectronaut will perform a GO term clustering on a subset of the terms from the enrichment
analysis. This subset can be defined in two ways:
1. Manually selecting them in the GO Enrichment node, by activating the check box in the first
column (Figure 30).
2. Filtering from the GO enrichment node by:
• Namespace: biological process, molecular function, subcellular compartment
• Type of representation: over, under, or both
• Number of terms to cluster, ranked by p-value
• Fold change
• Number of proteins per term
The results show a list of GO terms (cluster representatives) on the left, with the dispensable
terms that were clustered underneath them. The Dispensability score shows at which similarity
cutoff a GO term would be clustered under another term.
As usual, you can export the table of results as an Excel sheet by clicking on "Export Table…"
at the bottom.
3.5.4.5 Differential Abundance Plots
Under the Differential Abundance node, several plots related to the post-analysis are also
generated. Please, find detailed information on each of these plots in Appendix 7. Post
Analysis Perspective Plots (Section 7.7). The most relevant are the Heatmap and the
Volcano Plot:
1. The Heatmap is clustered row- and column-wise according to the Post Analysis
settings. The raw data of the Heatmap can be exported via right-click on the plot (Figure 31).
Figure 31. Heatmap with clustering in both rows and columns. The heatmap is built using the set of
confidently identified datapoints. By right-clicking on the plot you can perform several actions, such as Save
Image As, export the data matrix, or modify the scaling.
2. The Volcano Plot shows the results of the differential abundance test by plotting the
peptides or proteins' fold changes against the significance level. The candidates will appear
in red on the plot (Figure 32). By selecting one of the boxes above the plot, you can display
annotations for all differential analysis candidates, or you can custom select proteins of
interest and highlight them in blue.
Figure 32. The Volcano Plot shows the candidates in red. This plot is updated when you modify the
Candidates table's thresholds. By right-clicking you can choose several actions, such as deactivate the
legend or change the scale in the graph.
The PTM analysis results are available for experiments where the PTM workflow settings
were chosen. Spectronaut 15 performs PTM differential abundance analysis on the
modification site level (for more details see Section 3.4.1.5).
The results of the PTM differential abundance analysis are reported in the candidates list and
the corresponding volcano plot. Among other information, the PTM analysis node contains a
principal component analysis that shows clustering of the samples based on their PTM site
quantification profiles. Additionally, PTM analysis specific graphs are available: Modification
Enrichment (see Section 3.5.5.2) and PTM vs Protein Fold Changes (see Section 3.5.5.3).
The results of the differential abundance testing on the modification site level will show up under
this node. The Candidates table contains a list of differentially abundant modification sites with
their fold changes and q-values, annotated by paired or unpaired t-test comparison. The
identification key of the specific modification site object, for which the differential abundance
analysis result is reported, can be found in the "Group" column.
The PTM analysis candidates table can be viewed, modified, and exported in a similar way as
the candidates of the differential abundance analysis on the protein level (see Section 3.5.4.1).
The candidates of the PTM differential analysis are visualized in the corresponding volcano
plot, an example of which is available in Appendix 7.7.
The plot shows the percentage of all identified precursors that carry a selected
modification in each of the experimental runs. If the modification can occur on different amino
acids, the plot shows the percentage of precursors carrying that particular modification on
each of those amino acids. An example of such a graph is presented in Figure 33, showing
an enrichment of phosphorylation localized on tyrosine, serine, and threonine. The Modification
Enrichment plot is intended for experiments that include a modified-peptide enrichment step.
Figure 33. The Modification Enrichment plot shows the percentage of all precursors carrying a given
modification in each of the experimental runs.
The PTM vs Protein Fold Changes plot shows protein group log2 ratios plotted against log2 ratios
of modification sites. The plot helps to identify changes on the PTM site level that are
independent of protein group abundance changes.
Figure 34. The PTM vs Protein Fold Changes plot shows the log2 ratios of the protein groups plotted against
the log2 ratios of the PTM sites.
3.6 Report Perspective
Spectronaut™ has a very powerful reporting strategy. In the Report Perspective, you can
design and customize your report to contain any information you may need about the analysis.
If the PTM workflow was used for the analysis, a specialized PTM site report will be
available. Report schemas of any report type can be saved and reused. You can also change
the column names to fit your needs (Figure 35).
Figure 35. Report Perspective. The figure shows the process of customizing a Normal Report schema and
exporting the data. Detailed explanations of the headers can be found by hovering over them or in
Appendix 8. Most Relevant Report Headers.
1. The Schema tree: all different report building schemas. If you save a custom one, it will
appear here.
2. Column chooser: all possible reportable elements with check boxes to add or remove them.
Below this panel, there is a search field to help you navigate through the different fields.
3. Filters applied to the report.
4. Report preview: a preview of how your report will look. This is very useful when you
are modifying a schema. When you are happy with your report structure, you can export it
by clicking on "Export Report…" in the bottom left corner to see the whole matrix.
3.6.1 Report Schemas
Spectronaut includes several preconfigured reporting schemas that cover the most frequent
needs. If you want to design your own, you can use one of the included schemas as a base to build
your preferred report.
Within the report schemas, there are two main formats you can export your data into:
Normal Report and Run Pivot Report. Find detailed information about each format below.
In a Normal Report (long format), you will find each reported event in a single row (Figure 35).
A Normal Report will usually have many more rows than a Run Pivot Report. This format is
the one allowing for the most comprehensive report of your data. To build your Normal Report,
add or remove columns from the Columns panel by checking or unchecking them (Figure 35).
The Columns are organized by levels, from more general (Experiment) to more specific
(Fragment):
Experiment
Run
Protein Group
Peptide
Elution Group
Fragment Group
Fragment
Within each of these levels, the columns are again organized by categories (e.g., identification,
quantification, scoring, etc.). The whole Columns tree is quite comprehensive, and
expanding/collapsing categories when looking for a column can be cumbersome: to make the
search for columns easier, there is a search field at the bottom of the Columns panel where
you can type what you are looking for (Figure 35). Finally, to know which information a header
contains, hover over it and you will see a text box popping up with a description.
To see a detailed description of some of the most relevant columns, see Appendix 8. Most
Relevant Report Headers
In a Run Pivot Report (wide format), each run (sample) will be a header column. You can
choose which element you want to be rows in the Columns panel under Row Labels (e.g.,
stripped peptide sequence) and which value you want in the cells under Cell Values (e.g.,
quantitative value, Figure 36). If you choose more than one Row Label or Cell Value, the table
will multiply its length column-wise. This report will probably have fewer rows than a
Normal Report.
Figure 36. Run Pivot Report. This report is in wide format and contains one column per run (sample).
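To make the difference between the two formats concrete, the sketch below shows the same two measurements laid out both ways. The column names and values are simplified placeholders and do not correspond to the exact report headers in Spectronaut.
Normal Report (long format), one row per reported event:
Run      Protein Group   Stripped Sequence   Quantity
Run01    P12345          ELVISLIVESK         1.2e6
Run02    P12345          ELVISLIVESK         1.4e6
Run Pivot Report (wide format), one column per run:
Stripped Sequence   Run01 Quantity   Run02 Quantity
ELVISLIVESK         1.2e6            1.4e6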
3.6.4 PTM site report
For experiments analyzed with the PTM workflow, a specialized PTM site report is
available. The Columns are organized by levels, similar to the standard report, from more
general (Experiment) to more specific (PTM site):
Experiment
Run
Protein Group
PTM site
Among other information, the PTM site level of the report contains details on how the precursor
collapse was performed in order to obtain quantitative data for each PTM site object, the PTM site
quantitation data, the PTM flanking region, and the PTM site localization probability. For a
detailed description of some of the most relevant columns of the PTM site report, see Appendix 7.8.
Figure 37. PTM site report. The report is available for experiments analyzed with the PTM workflow.
3.7 QC Perspective
The quality control perspective of Spectronaut™ is based on the iRT Kit. Chromatography,
mass spectrometer performance, and analysis can be monitored over time using several
performance indicators. Every successful analysis is stored in the quality control perspective
(Figure 38). Spectronaut automatically detects various instruments and creates a separate
quality control history for each of them. If you have more than one instrument of the same type,
it might be useful to rename them manually. Additional folder structures can be made
according to the established quality control procedures in a specific laboratory.
Figure 38. QC Perspective. Runs in which the QC panel is detected are saved in the History tree. You can
monitor instrument performance with help of many plots related to several aspects of the experiment, from
LC-MS to Spectronaut analysis.
Only as many runs as specified in Settings → Global → General → QC Plot History Length
are shown in the plots.
3.7.1 QC Panels
In addition to the iRT Kit, sample-specific QC panels can be created in the Library Perspective.
Right-click on a spectral library and select "Enable QC". Whenever this spectral library is used
in an analysis, a respective QC file is written for each of the runs included in this analysis.
When selecting this novel QC panel in the QC Perspective, all the corresponding QC files
(runs analyzed with the library enabled for QC) will be available for QC monitoring.
3.8 Pipeline Perspective
The Pipeline Perspective is used to batch process library-based DIA analyses using
predefined settings. Spectronaut™ works most efficiently when several experiments are
processed sequentially rather than in parallel (because of disk I/O). If you are not interested in
manual evaluation of your peaks, the Pipeline Perspective might be your preferred choice. The
set-up of an experiment works similarly to the set-up in the Analysis Perspective. The set-up
analyses will be added to the Pipeline Queue. Clicking "Run Pipeline" will start processing the
queued experiments sequentially. Spectronaut will automatically generate the report
according to the settings in the chosen schema (Figure 39).
Figure 39. Pipeline Perspective. Queue DIA analyses so Spectronaut can process them sequentially.
Experiment files, reports, plots, and summaries will be generated and stored according to the DIA Analysis
Reporting Settings.
3.8.1 SNE Combine Workflow
Introduced with Spectronaut 14, the SNE combine workflow allows for a novel type of analysis
workflow for processing large-scale experiments. The idea behind this workflow is that a large
experiment can be analyzed in small batches and the results stored as individual SNE files.
You can then use the SNE combine workflow to merge the identification results of the
individual batches in an FDR-controlled manner.
Additionally, SNE combine runs the same post-processing steps as for a single-batch
experiment. These include cross-run normalization, interference correction, protein inference,
quantification, and FDR control.
The entire process is geared towards memory scalability, allowing the analysis of 10,000+
DIA files on a low-cost desktop workstation.
To ensure a successful analysis, the workflow comes with a few limitations:
• All batches must be analyzed using the same spectral library and acquisition method.
• SNE files of directDIA experiments are currently not supported
• Regulation analysis and related processes (like Heatmap and Volcano plot) are
currently not supported
• Final report exports are currently only supported in long-format (no Pivot report
possibilities).
• Normalization is limited to the “Global Normalization” option
• Information on sample grouping will be merged from the individual SNE files’ Condition
Setups and cannot be changed in the SNE combine workflow
You can use the SNE combine workflow from the Pipeline perspective.
1. Click on “Set up a SNE combine Process” at the bottom left. The SNE combine wizard
dialog will open. Select multiple SNE files (or a folder with SNE files) you want to combine.
2. The next step in the wizard allows you to change settings for the DIA analysis. You can
define Quantification and Protein Inference settings as well as your output files (in the
“Pipeline Mode” settings; Figure 40).
▪ Note: Not all analysis settings are available because the identification results
from the individual SNE files are used, and the associated settings cannot be
changed during SNE file combination. Make sure you select the correct settings
when you initially generate the SNE files.
Figure 40. The SNE Combine Wizard guides through the parameters that are recalculated during multiple
SNE files merging.
3. In the next step of the wizard, you can optionally select a protein database (FASTA file)
that you would like to use for protein inference (instead of what was used for initial
generation of the SNE files). This can be useful if the SNE files were made with different
protein databases.
4. Finally, you need to define an output folder. The report files you selected will be saved
there in a new subfolder (named after the experiment name with a time stamp). Once
you click “Finish”, the SNE combine job will be added to your pipeline queue.
You can start the job by clicking “Run Pipeline”. Alternatively, you can first add additional
SNE combine or DIA analysis jobs to the pipeline.
3.8.2 DIA Analysis Pipeline Mode Settings
Here you can choose which reports should be written, whether run-based or experiment-based
reports should be performed, whether scoring histograms should be reported, and if the whole
experiment should be saved to an *.sne file (Figure 41).
Figure 41. DIA Analysis Pipeline Mode Settings. Define which results from the Pipeline Perspective should
be saved.
3.9 Databases Perspective
The Databases Perspective allows you to store and manage information that you will need to
use when setting up analyses. This includes protein databases, gene annotations, peptide
modifications, etc.
This section lets you import and manage your protein databases.
Spectronaut uses protein databases (FASTA files) to perform searches for library generation
with Pulsar and for protein inference. The protein databases contain all of the sequences,
as well as meta-information extracted from the FASTA protein headers using the specified
parsing rule. Spectronaut already contains the UniProt parsing rule, but you can add a new
rule by clicking "New Rule" in the Protein Databases page or during an import (Figure 42).
To import a new protein database from FASTA, click on "Import…" in the bottom left
corner (Figure 42). While importing a new protein database from FASTA, Spectronaut will try
to find the appropriate parsing rule for this file format from the already specified rules. Should
no matching parsing rule be found, you will be asked to specify a new one. Once your new protein
database is imported, it will be available in the Databases tree during the analysis set-up.
Figure 42. Importing a new FASTA file into Spectronaut. Your new database will appear in the Databases
tree and will be available for setting-up analyses.
3.9.2 Modifications
Spectronaut comes with a database of default modifications for all search engines. If you use
special modifications, please import the corresponding modifications file into Spectronaut.
To import non-default modifications into Spectronaut, you can batch import (see also Table 5):
• For ProteinPilot, using Unified Modification Catalog.xlsx, located in the ProteinPilot/Help
folder in the Program Files.
• For Proteome Discoverer, no action is required.
• For Mascot, non-default modifications have to be created as custom modifications, see
below.
When possible, Spectronaut will merge identical modifications from multiple sources and save
only the necessary search engine specific mapping information. However, if it is not able to
unambiguously merge two or more modifications, you will be asked to resolve any conflicts at
the time of import. You can tell whether a modification has been mapped to multiple search engines
by looking at the "Mapped to" data grid in the panel.
It is possible to specify a new modification. This action has two main applications:
1. Incorporate modifications for Mascot searches which are not in the Unimod database
(non-default ones).
2. Add a new label to generate a labeled library
To create a new modification, click "new" in the bottom left corner, give a name to your
modification and click "OK" (Figure 43). Edit your new modification as desired and click "Save"
in the bottom left corner. You can also modify an existing modification by clicking "Save As…".
Figure 43. Adding a new modification to the database.
This tool lets you define the rules used to digest your proteins in silico from the protein database(s).
Digest rules are applied whenever you do a Pulsar search (in library generation or in
directDIA™). The most frequent rules are already included in Spectronaut, such as Trypsin,
Trypsin/P, and LysC.
To design your own rule, click on an existing one, modify it, and click on "Save
as…" in the bottom left corner (Figure 44). The rules are defined by which sites are cleaved
by the enzyme. In the Digest Rule page, you will see a 20 x 20 matrix containing all possible
combinations of amino acids. Select the combinations where your enzyme cleaves (Figure 44).
At the bottom, you will see a preview of how a sequence will look after being cleaved
following your digest rule. You can also include a description.
Figure 44. Define a new Cleavage Rule. The Cleavage Rule editor allows the generation of new cleavage
rules in a user-friendly manner.
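As a hedged illustration of how a digest rule behaves (the sequence is an arbitrary example): with the Trypsin rule (cleavage after K or R, but not when the next residue is P), the sequence MAKRPDEFKGHR is digested into MAK, RPDEFK, and GHR. With Trypsin/P (cleavage after K or R regardless of the following residue), the same sequence yields MAK, R, PDEFK, and GHR. The preview at the bottom of the Digest Rule page shows this kind of result for the rule you are defining.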
3.9.4 GO Databases
Similar to the Protein Databases, Gene Ontology (GO) Databases in Spectronaut are used to
further annotate your data. This annotation will be used for calculating term enrichment and
give further biological insight into the differential abundance results. The GO databases
section manages two different data structures: Gene Ontologies and gene annotations. Find
more details about each below.
These allow you to import complex gene ontology structures in the form of graphs. These
structures are used for hierarchical grouping of functions, components, and processes.
Currently, Spectronaut supports the *.obo file format from the GO Consortium. The
go-basic.obo file is already part of the Spectronaut installation. Information from a gene ontology
tree can only be used in combination with an organism-specific gene annotation file.
3.9.4.2 Gene Annotations
The gene annotation file functions as a link between the protein identifier (UniProt accession
number) and the GO tree. In its most basic form, the gene annotation file must feature two
columns: the protein identifier and the corresponding GO-ID.
Using this format, Spectronaut will connect the protein entries of your analysis via the GO-ID
with the respective entries in the Gene Ontology to annotate your data further.
The official GO Consortium annotation file (*.gaf) is recommended, but you can specify a
custom annotation file.
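A minimal, hypothetical sketch of such a custom two-column annotation file is shown below; the accessions and GO-IDs are placeholders, and a real file would list the annotations relevant to your organism:
P12345    GO:0008150
P12345    GO:0003674
Q67890    GO:0005575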
To import a new gene annotation file into Spectronaut, go to the GO Databases page of the
Databases Perspective. Click "Import Gene Annotation…" and navigate to your *.gaf file. The
GO annotation will automatically appear in your Gene Annotations tree.
Spectronaut™ can remember column names in user spectral libraries. Once you import a new
library format into Spectronaut, it will ask you whether it should store novel synonyms for the
column headers. You can manage those synonyms in the column recognition settings tab.
3.10 Settings Perspective
The Settings Perspective of Spectronaut™ is meant to define custom settings schemas for
any of the processes performed by the software. In this perspective, you will see a tab
corresponding to each of these processes: DIA Analysis, Pulsar Search, directDIA™ and
Library Generation (Figure 45). In addition, you can alter global settings of Spectronaut in the
Global page (see below).
Detailed information regarding each setting option can be obtained by hovering the mouse
over the label of a specific settings variable (Figure 45).
Make your own settings schema by modifying one of the predefined ones. Go through the
nodes and edit the corresponding settings. Once you are done with the customization, click
"Save as…" in the bottom left corner, give a name to your schema, and click "OK"
(Figure 45). Your new schema will appear in the tree and will be available to be chosen
during the set-up of your next analysis. You can set your newly created schema as the default
Spectronaut settings by right-clicking on the schema and choosing "set as default".
See the Appendixes for detailed information about the numerous settings within each process.
Figure 45. Make a custom schema for your analysis. The new schemas will be available during the
subsequent analysis set-ups.
3.10.1 DIA Analysis Settings
The DIA Analysis Settings define the details of how Spectronaut should analyze the data, from
DIA targeted data extraction to post-analysis calculations. These settings will specify important
metrics, such as FDR cutoffs, decoy set generation to estimate scores, quantification settings,
workflow to be used (label-free, labeled, spike-in), among many others. Find details of each
setting in Appendix 1. DIA Analysis Settings (Section 7.1).
These settings define how Pulsar should create the search-space when performing a search.
You can specify the expected peptide characteristics (enzyme used, length, modifications,
among others). Find details of each setting in Appendix 7.2.
The directDIA™ Settings are divided into a section related to the Pulsar settings and a section
related to the quantitative DIA analysis. The Pulsar settings define how to generate the
search-space and how to perform the identification (FDR cutoffs). The DIA analysis settings
define how the quantification is performed. Find details of each setting in Appendix 3.
directDIA™ Settings (Section 7.3).
This set of settings defines the Library Generation process, either from Pulsar or from an
external search engine. It includes metrics such as MS1 and MS2 tolerances, FDR cutoffs for
identification confidence, and peptide-based filters for your library, among others. Find details of
each setting in Appendix 4. Library Generation Settings (Section 7.4).
The "Global" settings tab in the Settings perspective will allow you to change parameters that
can be considered analysis unspecific. Here you will find options regarding plotting, working
directories, as well as some general settings.
3.10.5.1 General
This section contains settings options that allow you to modify the default behavior of
Spectronaut. For more information about these options use the tool-tip hover for each
individual entry.
One important aspect of these settings is the File Name Parsing Strategy to let Spectronaut
read information directly from the run file name. One of the most relevant uses of this function
is the Condition Setup annotations. By defining the meaning of the different blocks in the file
name, Spectronaut is able to obtain this information automatically (Figure 46).
Figure 46. Parsing Rule Editor in the General Settings of the Global Settings page. Setting this rule properly
will let Spectronaut read annotation information directly from the run file name.
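As a hypothetical illustration (the file name and block assignment are placeholders): for a run file named 20230115_InstrA_Treated_Rep02.raw, a parsing rule that maps the underscore-separated blocks to date, instrument, condition, and replicate would let Spectronaut pre-fill the Condition Setup with Condition = Treated and Replicate = 2, without any manual annotation.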
3.10.5.2 Directories
Here you can set up the different storage paths for data managed by Spectronaut. Should you
have a central storage location for all your DDA data, you can specify this location here. This will
allow Spectronaut to automatically map the correct shotgun acquisitions during the setup of
the library generation pipeline (see Box 6). Please note that all changes within the "Directories"
section require a restart of Spectronaut in order to take effect. See some recommendations
in Section 1.4.
3.10.5.3 Plotting
The plotting section allows you to customize the look and feel of most of the plotting options
used in Spectronaut. You can specify whether XIC plots should show the integration
boundaries, as well as the expected elution time. Additionally, you can also apply smoothing
to your plots. For more information about these options use the tool tip hover for each
individual entry.
3.10.5.4 Reporting
In the Reporting section, you can define:
1. Where to store the results of analyses performed via the Pipeline Perspective.
2. The default name of the files exported from Spectronaut.
3.10.6 Spectronaut Command Line Mode
When generating a library from the command line, you simply call the Spectronaut.exe file
followed by “-lg -se”. The other parameters depend on the search engine of your choice.
When generating a library with the Pulsar search engine, start the command with
“-lg -se Pulsar”. After that, use the following parameters to set up the experiment:
Table 8. Command line arguments for spectral library generation from Pulsar
Argument Explanation
-lg -se Pulsar Used as first command argument to run a library generation pipeline using Pulsar.
-r [path to file] Adds a single raw file to the experiment. Any file format that is also supported during
the analysis setup from the user interface is possible (*.raw, *.wiff, *.bgms,
_HEADER.TXT, analysis.baf). This command can be used multiple times to add
additional files.
-d [path to directory] Adds all raw files, recognized by Spectronaut, from a given directory. This option
includes vendor files that are already represented as folders (Bruker .d folders,
Waters run folders). This command can be used multiple times to add additional
directories.
-sa [path to file] [OPTIONAL] Adds a specific Search Archive (*.psar file) to this library generation
experiment. This command can be used multiple times to add additional files. The
default location for Search Archives is configured in the Settings Perspective > Global
> Directories.
-sad [path to directory] [OPTIONAL] Adds all search archives (*.psar files) within a specified directory to this
library generation experiment.
-fasta [path to file] Specifies the path to a *.bgsfasta file (Managed FASTA, i.e., a FASTA file with an
associated parsing rule that tells Spectronaut how to read the protein headers) to be
used as the DDA search space. This command can be used multiple times to add
additional files. The default location for Managed FASTA files is configured in the
Settings Perspective > Global > Directories.
-rs [path to file] OR -rs [schema-name] [OPTIONAL] Specifies the Pulsar Search schema to be used for this search. If not
specified, whatever is selected as the default schema will be used. This command
can either be provided with a path to the schema file (*.prop) or with the schema
name. The latter requires the schema to be in the internal Spectronaut Search
schema repository (i.e., you should see it in the GUI).
-es [path to file] OR -es [schema-name] [OPTIONAL] Specifies the Library Generation schema to be used for this search. If
not specified, whatever is selected as the default schema will be used. This command
can either be provided with a path to the schema file (*.prop) or with the schema
name. The latter requires the schema to be in the internal Spectronaut Library
Generation schema repository (i.e., you should see it in the GUI).
-a [path to file] [OPTIONAL] Specifies the target location for the Search Archive file generated from
this search. If not provided, the Search Archive will be stored in the default location.
The default location for Search Archives is configured in the Settings Perspective >
Global > Directories.
-k [path to file] [OPTIONAL] Specifies the target location for the spectral library file generated from
this search. If not provided, the library will be stored in the default location. The default
location for Spectral Libraries is configured in the Settings Perspective > Global >
Directories.
-n [any text] [OPTIONAL] Specifies the experiment name for this search. This name will be used
to label the resulting spectral library and Search Archive. If not provided, Spectronaut
will automatically generate an experiment name from the selected run file names.
-inf [path to file] [OPTIONAL] Specifies the path to a *.bgsfasta (Managed FASTA) file to be used in
this library for protein inference. The default location for Managed FASTA files is
configured in the Settings Perspective > Global > Directories. By default, the FASTA
files specified for the search space will be used for protein inference. This command
can be used multiple times to add additional files.
-go [path to file] Path to Gene Annotation file (could be multiple). This command can be used multiple
times to add additional files.
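A minimal sketch of a Pulsar library generation call, combining the arguments from Table 8, could look as follows; all paths, file names, and the *.kit extension are placeholders and should be adapted to your setup:
Spectronaut.exe -lg -se Pulsar -d "D:\DDA_Runs" -fasta "D:\FASTA\human_uniprot.bgsfasta" -n "HeLa_Library" -k "D:\Libraries\HeLa_Library.kit"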
Table 9. Command line arguments for spectral library generation from an external search engine
Argument Explanation
-lg -se <SearchEngineName> Used as first command argument to run a library generation pipeline using
an external search engine. Replace <SearchEngineName> with one of the
following choices: ProteomeDiscoverer, MaxQuant, ProteinPilot,
BGSGenericSearchFormat, or Mascot.
-sr [path to search results] Path to the search results. Please refer to the search engine specific
section to see what type of results are needed for a given search engine.
-rd [path to raw files] Specify the path to the run files.
-o [output library file] Library file destination including the file name.
-s [path to file] [OPTIONAL] Specifies the Library Generation schema to be used for this
search result. If not specified, whatever is selected as the default schema
will be used. This command should be provided with a path to the schema
file (*.prop).
-fasta [path to file] [OPTIONAL] Specifies the path to a *.bgsfasta (Managed FASTA) file to
be used in this library for protein inference. The default location for
Managed FASTA files is configured in the Settings Perspective > Global
> Directories.
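A minimal sketch of a library generation call based on external search results, combining the arguments from Table 9, could look as follows; the search engine, paths, and file names are placeholders, and the required type of search results depends on the chosen search engine (see the search engine specific sections):
Spectronaut.exe -lg -se MaxQuant -sr "D:\SearchResults\MaxQuant_output" -rd "D:\DDA_Runs" -fasta "D:\FASTA\human_uniprot.bgsfasta" -o "D:\Libraries\HeLa_MQ_Library.kit"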
In addition to the visual pipeline mode, Spectronaut is also capable of running the pipeline
from the command line. To run Spectronaut in command line mode, you simply call the
Spectronaut.exe file using the following parameters.
Argument Explanation
-r [path to file] Adds a single run file to the experiment. Any file format that is also supported
during the analysis setup from the user interface is possible (*.htrms, *.raw,
*.wiff, *.bgms, _HEADER.TXT, analysis.baf). This command can be used
multiple times to add additional files.
-d [path to directory] Adds all run files, recognized by Spectronaut, from a given directory. This
option includes vendor files that are already represented as folders (Bruker:
.d folders, Waters: .raw folders).
-regex [regular expression] [OPTIONAL] Applies a regular expression as filter to the -d command.
-a [path to file] Assigns a spectral library to every run in the experiment. This command can
be used multiple times to add additional files.
-ar [path to file] Assigns a spectral library to the last run added with the -r command. This
command can be used multiple times to add additional files.
-s [path to file] OR -s [schema-name] [OPTIONAL] Specifies the settings schema to be used for the analysis. If not
specified, whatever is selected as the default schema will be used. This
command can either be provided with a path to the schema file (*.prop) or
with the schema name. The latter requires the schema to be in the default
location for Spectronaut analysis schemas.
-o [path to directory] [OPTIONAL] Specifies the output directory for this experiment. All generated
reports will be located in a sub-folder titled with the analysis date and the
experiment name. By default, the results will be placed in the user-specified
default output location (global settings in Spectronaut) or in
%Appdata%/Spectronaut/Results if nothing is specified.
-n [any text] [OPTIONAL] Specifies the name of this experiment. If not provided,
Spectronaut will automatically generate an experiment name from the
selected run file names.
-fasta [path to file] [OPTIONAL] Specifies the path to a *.bgsfasta file to be used in this
experiment. In a library-based (peptide-centric) analysis, this file is used for
protein inference. This command is not optional in combination with the -direct
command. This command can be used multiple times to add additional files.
-go [path to file] [OPTIONAL] Specifies the path to the Gene Ontology (GO) annotation file to
append. This command can be used multiple times to add additional files.
-direct [OPTIONAL] Triggers a directDIA (spectrum-centric) pipeline instead of a
library-based (peptide-centric) pipeline. Most commands work the same in
this mode. However, the -fasta command is required while the -a and -ar
commands will be ignored. This command has to be the first one to be called.
-con [path to file] [OPTIONAL] Specifies the condition setup to be used for the post processing
(such as regulation analysis). This condition setup file is best generated from
the Spectronaut user interface and then exported for use in the command line
mode. If not used, Spectronaut will use the specified run name parsing
strategy to derive condition names for the samples.
-command [path to file] [OPTIONAL] Specifies the path to a command (or arguments) file to be used
instead of the standard command line arguments. This must be the first
command if used. Any subsequent commands will be ignored. The command
arguments file works the same as the regular command line arguments but is
a line-based file that is read in instead.
One can generate an example for the command arguments file from the last
page of the Spectronaut experiment setup.
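Two minimal sketches of command line pipeline calls are shown below; all paths, file names, and extensions are placeholders, and the setup exported from the experiment wizard should be treated as the authoritative template for your own system:
Spectronaut.exe -d "D:\DIA_Runs" -a "D:\Libraries\HeLa_Library.kit" -s "D:\Schemas\my_DIA_settings.prop" -fasta "D:\FASTA\human_uniprot.bgsfasta" -n "HeLa_DIA_Experiment" -o "D:\Results"
Spectronaut.exe -direct -d "D:\DIA_Runs" -fasta "D:\FASTA\human_uniprot.bgsfasta" -n "HeLa_directDIA_Experiment" -o "D:\Results"
Alternatively, the same arguments can be placed in a command arguments file and passed with -command, e.g., Spectronaut.exe -command "D:\Experiments\my_pipeline_arguments.txt"; the exact layout of that file is best taken from the example generated on the last page of the experiment setup.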
If you encounter problems with the automatic parsing of your spectral library, please first try
to load the spectral library using Spectronaut's graphical user interface and make sure that all
necessary columns are recognized automatically.
One can also export a command line setup from the last page of the experiment setup wizard
for a DIA analysis in Spectronaut.
To get more information on command line mode, watch our short video tutorial here.
4 HTRMS Converter
The HTRMS Converter converts DIA run files into a Biognosys compatible format called
HTRMS. These files are pre-processed and optimized to be analyzed in Spectronaut.
Converting run files into HTRMS files is very useful if you need to analyze the same files
several times. The overall analysis time will be significantly reduced.
The HTRMS converter is free to use and can be run on multiple computers without the
requirement of a license key. To get more information on the use of the HTRMS converter,
watch our short video tutorial here.
Figure 47. The new HTRMS converter with multiple tasks added to the task-list and monitoring a local
directory for new MS/MS files.
In order to select one or more files from your hard drive to be converted, click on the drop-
down arrow in the Converter perspective and select "Add Files…". After selecting one or more
MS/MS files from the hard drive, an input form will appear which allows you to specify the
conversion parameters. Click on "OK" to add all selected files to the main task list in the
Converter perspective. You can add more tasks to the list at any moment.
The HTRMS Converter also allows the deconvolution of spectra collected using overlapping
windows, as proposed by Amodei et al. (2019). This feature can be applied to varied- or
fixed-size MS/MS scans. To enable this function, go to the HTRMS file settings and select
the MS2 DeMultiplexing feature.
Using the drop-down arrow and selecting "Add Folder…" will ask you to specify an input folder.
The folder conversion will automatically convert all valid MS/MS files within the target folder
that meet some basic filter criteria.
Using the folder conversion, you now have access to the "Batch Conversion" settings, which
allows you to specify filter criteria such as vendor or file age. You can also specify monitoring
of the folder in order to automatically convert every new file that is added to the input folder.
-o [path] Destination path for an HTRMS file (including file name) or path to a
destination folder.
-nogui Runs the conversion in a command line window instead of starting
the task in UI mode.
5 BGMS Raw API
Besides the specified vendor formats, Spectronaut™ also supports the Biognosys generic MS
file format called BGMS. This file format can be generated using the BGSRawAPI.dll that is
installed together with Spectronaut. This API is written in C# .NET and can be used with any
Microsoft .NET language to build a custom file processor for raw vendor formats. This is
especially useful if you want to process DIA data from an instrument that Spectronaut does
not natively support or if your DIA method requires some special scan pre-processing not
implemented in Spectronaut. An example project showing how to use the BGMS Raw API can be
requested by contacting support@biognosys.com.
6 References
Amodei, D., Egertson, J., MacLean, B. X., Johnson, R., Merrihew, G. E., Keller, A., Marsh, D.,
Vitek, O., Mallick, P., & MacCoss, M. J. (2019). Improving Precursor Selectivity in Data-
Independent Acquisition Using Overlapping Windows. Journal of the American Society
for Mass Spectrometry, 30(4), 669–684. https://doi.org/10.1007/s13361-018-2122-8
Bekker-Jensen, D. B., Bernhardt, O. M., Hogrebe, A., Martinez-Val, A., Verbeke, L., Gandhi,
T., Kelstrup, C. D., Reiter, L., & Olsen, J. V. (2020). Rapid and site-specific deep
phosphoproteome profiling by data-independent acquisition without the need for spectral
libraries. Nature Communications, 11(1), 787. https://doi.org/10.1038/s41467-020-
14609-1
Bekker-Jensen, D. B., Martinez-Val, A., Steigerwald, S., Rüther, P. L., Fort, K. L., Arrey, T. N.,
Harder, A., Makarov, A. A., & Olsen, J. V. (2020). A Compact Quadrupole-Orbitrap Mass
Spectrometer with FAIMS Interface Improves Proteome Coverage in Short LC Gradients.
Molecular & Cellular Proteomics, mcp.TIR119.001906.
https://doi.org/10.1074/mcp.TIR119.001906
Benjamini, Y., & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and
Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B
(Methodological), 57(1), 289–300.
Bilbao, A., Zhang, Y., Varesio, E., Luban, J., Strambio-De-Castillia, C., Lisacek, F., &
Hopfgartner, G. (2015). Ranking Fragment Ions Based on Outlier Detection for Improved
Label-Free Quantification in Data-Independent Acquisition LC-MS/MS. Journal of
Proteome Research, 14(11), 4581–4593.
https://doi.org/10.1021/acs.jproteome.5b00394
Bruderer, R., Bernhardt, O. M., Gandhi, T., Miladinović, S. M., Cheng, L.-Y., Messner, S.,
Ehrenberger, T., Zanotelli, V., Butscheid, Y., Escher, C., Vitek, O., Rinner, O., & Reiter,
L. (2015). Extending the limits of quantitative proteome profiling with data-independent
acquisition and application to acetaminophen-treated three-dimensional liver
microtissues. Molecular & Cellular Proteomics : MCP, 14(5), 1400–1410.
https://doi.org/10.1074/mcp.M114.044305
Bruderer, R., Bernhardt, O. M., Gandhi, T., Xuan, Y., Sondermann, J., Schmidt, M., Gomez-
Varela, D., & Reiter, L. (2017). Optimization of Experimental Parameters in Data-
Independent Mass Spectrometry Significantly Increases Depth and Reproducibility of
Results. Molecular & Cellular Proteomics : MCP, 16(12), 2296–2309.
https://doi.org/10.1074/mcp.RA117.000314
Callister, S. J., Barry, R. C., Adkins, J. N., Johnson, E. T., Qian, W.-J., Webb-Robertson, B.-
J. M., Smith, R. D., & Lipton, M. S. (2006). Normalization approaches for removing
systematic biases associated with mass spectrometry and label-free proteomics. Journal
of Proteome Research, 5(2), 277–286. https://doi.org/10.1021/pr050300l
Cox, J., Neuhauser, N., Michalski, A., Scheltema, R. A., Olsen, J. V, & Mann, M. (2011).
Andromeda: a peptide search engine integrated into the MaxQuant environment. Journal
of Proteome Research, 10(4), 1794–1805. https://doi.org/10.1021/pr101065j
Distler, U., Kuharev, J., Navarro, P., Levin, Y., Schild, H., & Tenzer, S. (2014). Drift time-
specific collision energies enable deep-coverage data-independent acquisition
proteomics. Nature Methods, 11(2), 167–170. https://doi.org/10.1038/nmeth.2767
Dunn, O. J. (1961). Multiple Comparisons among Means. Journal of the American Statistical
Association, 56(293), 52–64. https://doi.org/10.1080/01621459.1961.10482090
Egertson, J. D., Kuehn, A., Merrihew, G. E., Bateman, N. W., MacLean, B. X., Ting, Y. S.,
Canterbury, J. D., Marsh, D. M., Kellmann, M., Zabrouskov, V., Wu, C. C., & MacCoss,
M. J. (2013). Multiplexed MS/MS for improved data-independent acquisition. Nature
Methods, 10(8), 744–746. https://doi.org/10.1038/nmeth.2528
Escher, C., Reiter, L., MacLean, B., Ossola, R., Herzog, F., Chilton, J., MacCoss, M. J., &
Rinner, O. (2012). Using iRT, a normalized retention time for more targeted measurement
of peptides. Proteomics, 12(8), 1111–1121. https://doi.org/10.1002/pmic.201100463
Geiger, T., Cox, J., & Mann, M. (2010). Proteomics on an Orbitrap benchtop mass
spectrometer using all-ion fragmentation. Molecular & Cellular Proteomics : MCP, 9(10),
2252–2261. https://doi.org/10.1074/mcp.M110.001537
Gillet, L. C., Navarro, P., Tate, S., Rost, H., Selevsek, N., Reiter, L., Bonner, R., & Aebersold,
R. (2012). Targeted Data Extraction of the MS/MS Spectra Generated by Data-
independent Acquisition: A New Concept for Consistent and Accurate Proteome
Analysis. Molecular & Cellular Proteomics, 11(6), O111.016717-O111.016717.
https://doi.org/10.1074/mcp.O111.016717
Huang, T., Bruderer, R., Muntel, J., Xuan, Y., Vitek, O., & Reiter, L. (2019). Combining
Precursor and Fragment Information for Improved Detection of Differential Abundance in
Data Independent Acquisition. Molecular & Cellular Proteomics,
mcp.RA119.001705. https://doi.org/10.1074/mcp.RA119.001705
Kiyonami, R. (2014). Large Scale Targeted Protein Quantification Using WiSIM-DIA on an
Orbitrap Fusion Tribrid Mass Spectrometer.
Lambert, J.-P., Ivosev, G., Couzens, A. L., Larsen, B., Taipale, M., Lin, Z.-Y., Zhong, Q.,
Lindquist, S., Vidal, M., Aebersold, R., Pawson, T., Bonner, R., Tate, S., & Gingras, A.-
C. (2013). Mapping differential interactomes by affinity purification coupled with data-
independent mass spectrometry acquisition. Nature Methods, 10(12), 1239–1245.
https://doi.org/10.1038/nmeth.2702
Li, W., Chi, H., Salovska, B., Wu, C., Sun, L., Rosenberger, G., & Liu, Y. (2019). Assessing
the Relationship Between Mass Window Width and Retention Time Scheduling on
Protein Coverage for Data-Independent Acquisition. Journal of the American Society for
Mass Spectrometry, 30(8), 1396–1405. https://doi.org/10.1021/jasms.8b06074
Meier, F., Geyer, P. E., Virreira Winter, S., Cox, J., & Mann, M. (2018). BoxCar acquisition
method enables single-shot proteomics at a depth of 10,000 proteins in 100 minutes.
Nature Methods, 15(6), 440–448. https://doi.org/10.1038/s41592-018-0003-5
Mi, H., Muruganujan, A., Casagrande, J. T., & Thomas, P. D. (2013). Large-scale gene
function analysis with the PANTHER classification system. Nature Protocols, 8(8), 1551–
1566. https://doi.org/10.1038/nprot.2013.092
Moseley, M. A., Hughes, C. J., Juvvadi, P. R., Soderblom, E. J., Lennon, S., Perkins, S. R.,
Thompson, J. W., Steinbach, W. J., Geromanos, S. J., Wildgoose, J., Langridge, J. I.,
Richardson, K., & Vissers, J. P. C. (2018). Scanning Quadrupole Data-Independent
Acquisition, Part A: Qualitative and Quantitative Characterization. Journal of Proteome
Research, 17(2), 770–779. https://doi.org/10.1021/acs.jproteome.7b00464
Silva, J. C., Gorenstein, M. V, Li, G.-Z., Vissers, J. P. C., & Geromanos, S. J. (2006). Absolute
quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Molecular &
Cellular Proteomics : MCP, 5(1), 144–156. https://doi.org/10.1074/mcp.M500230-
MCP200
Supek, F., Bošnjak, M., Škunca, N., & Šmuc, T. (2011). REVIGO summarizes and visualizes
long lists of gene ontology terms. PloS One, 6(7), e21800.
https://doi.org/10.1371/journal.pone.0021800
Tsou, C.-C., Avtonomov, D., Larsen, B., Tucholska, M., Choi, H., Gingras, A.-C., &
Nesvizhskii, A. I. (2015). DIA-Umpire: comprehensive computational framework for data-
independent acquisition proteomics. Nature Methods, 12(3), 258–264, 7 p following 264.
https://doi.org/10.1038/nmeth.3255
Zhang, B., Chambers, M. C., & Tabb, D. L. (2007). Proteomic parsimony through bipartite
graph analysis improves accuracy and transparency. Journal of Proteome Research,
6(9), 3549–3557. https://doi.org/10.1021/pr070230d
Data Extraction
Intensity Extraction: Defines how the ion intensity for a requested m/z value is calculated from
a specific scan.
• Maximum Intensity (default setting): Picks the highest data-point within the selected m/z
tolerance
• Sum within Tolerance: Sums up all intensity values that are within the selected m/z
tolerance range
• Highest Peak: Pick the intensity corresponding to the highest peak within the m/z tolerance
• Nearest Peak: Pick the peak closest to the calibrated m/z that is within the m/z tolerance
MS Mass Tolerance Strategy: Spectronaut™ will calculate the ideal mass tolerances for data
extraction and scoring based on its extensive mass calibration. However, you can also specify
your preferred tolerances for both MS1 and MS2 levels. You can choose amongst:
The XIC IM Extraction Window defines if ion mobility should be used to predict the elution of a
peptide:
• Dynamic (default setting): Spectronaut will dynamically adjust the XIC window in an Ion Mobility-dependent manner, based on a large sample set during calibration. A correction factor can be applied; e.g., a factor of 2.0 means that 2 times the window suggested by Spectronaut is used. The default settings are recommended for most applications.
• Full: Spectronaut will use the full ion mobility space to find the target.
The XIC RT Extraction Window defines if iRT should be used to predict the elution of a peptide:
• Dynamic (default setting): Spectronaut will determine the ideal extraction window
dynamically depending on iRT calibration and gradient stability. Sections of the gradient
that show higher variability during the calibration step will automatically be extracted using
wider windows.
• Static: Spectronaut will use a fixed width (in min).
• Full: Spectronaut will use the full gradient width to find the target.
Calibration Mode (only relevant when using HTRMS files): Determines whether a re-calibration of the selected HTRMS files should be triggered or not.
• None: keep the currently stored calibration.
• Automatic (default): Spectronaut will decide whether a recalibration is needed or not.
• Force: will recalibrate the HTRMS files.
MZ Extraction Strategy: Defines which MS peak within the m/z tolerance will be picked during mass calibration.
Precision iRT: If selected and available, Spectronaut will use a larger set of calibration peptides to perform an extensive calibration and improve iRT precision and accuracy.
MS1 Mass Tolerance Strategy: Determines the MS1 tolerance for XIC extraction and scoring during calibration. You can choose from the following three options:
MS2 Mass Tolerance Strategy: Determines the MS2 tolerance for XIC extraction and scoring during calibration. The options are the same as described above for the MS1 Mass Tolerance Strategy.
Exclude Duplicate Assays: Spectronaut will keep only the best performing assay if a peptide is duplicated in the libraries.
Generate Decoys:
• If unchecked, decoys have to be provided a priori in the library for Spectronaut to estimate the q-values (Qvalue). In this case the decoys need to be annotated in the column "IsDecoy" with the label "TRUE", and the targets with the label "FALSE".
• If checked, you can specify the following:
Decoy Method: defines how to generate the decoys. For details, please use the text hovers in the software. The default option is Mutated. Since Spectronaut 14, you can also choose the Preferred Fragment Source between two options: Template Fragments (carries over the fragmentation pattern from the template peptide and only recalculates the masses based on the new sequence) and NN Predicted Fragments (neural network-based strategy; generates the ideal fragmentation pattern based on the newly generated decoy sequence).
Decoy Limit Strategy: sets the maximum number of decoys to be generated:
• Dynamic: specify the number of decoys as a fraction of the number of targets.
• Static: choose a decoy limit as a fixed number of decoys.
• None: generate the same number of decoys as targets.
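A minimal sketch (not Spectronaut code) of how the three Decoy Limit Strategy options above could resolve to a concrete decoy count; the function name and the example parameters are hypothetical:

    def decoy_limit(n_targets, strategy, fraction=0.1, fixed=5000):
        # Dynamic: a user-defined fraction of the number of targets
        if strategy == "Dynamic":
            return int(n_targets * fraction)
        # Static: a fixed, user-defined number of decoys
        if strategy == "Static":
            return fixed
        # None: as many decoys as targets
        return n_targets

    print(decoy_limit(200000, "Dynamic", fraction=0.1))  # -> 20000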
Machine Learning:
• Per Run (default): calculates the discriminant scores (Cscores) and q-values (Qvalues) per run.
• Across Experiment: builds an experiment-wise Cscore space. This can compromise sensitivity.
Precursor PEP Cutoff: Specify the Posterior Error Probability (Precursor PEP) cutoff below which a precursor is considered identified. This primarily affects which precursors are quantifiable. When using either the global or run-wise imputation strategy, precursors that do not satisfy the PEP cutoff will be imputed.
Single Hit Definition: Define what should be considered a protein single hit: stripped sequence, modified sequence, or peptide precursor ID. If only a single instance of the selected definition is identified across the entire experiment, the protein group will be marked as a single-hit protein group.
Exclude Single Hit Proteins: Discard protein groups identified with only one peptide hit (as defined above).
Pvalue Estimator: Specify how you prefer the null distribution to be estimated to calculate the p-values: Kernel Density or Normal Distribution Estimator.
Interference Correction: Exclude fragment ions detected as interferences across all runs (Bilbao et al., 2015). If checked, set a minimum number of features to be kept at MS1 and MS2 levels in order for Spectronaut to still perform the quantification.
Protein LFQ method: Specify how protein-level label-free quantification should be performed.
• Automatic will pick MaxLFQ for smaller experiments (<= 500 runs) and Quant 2.0 for experiments exceeding this cutoff. This setting is recommended because of the exponential run-time behavior of the MaxLFQ algorithm.
• MaxLFQ derives label-free quantities based on inter-run peptide ratios.
• Quant 2.0 (SN standard) aggregates minor group quantities using the user-specified Top N and summation strategies.
Proteotypicity Filter (only with automatic inference): Choose whether you want to quantify only based on non-shared peptides, either at the level of protein (very stringent) or at the level of protein group.
Major Group Quantity: Specify how you want the minor groups (peptides) to be used to calculate the major group (proteins) quantities.
Major Group Top N: Use the best N minor group elements to calculate the major group quantities. The elements are ranked by evidence count and quantity.
Minor Group Quantity: Specify how you want the precursors to be used to calculate the minor groups (peptides) quantities.
Quantity MS-Level: Choose which MS level you want to use to perform quantification: MS1 or MS2. At the MS2 level, the precursor quantity is the sum of the quantities of the fragment ions specified for that precursor in the spectral library.
Quantity type: Decide which feature of the peaks should be used for quantification: area under the curve within integration boundaries or apex peak height.
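As a rough illustration of the roll-up described above (precursor quantities summed from fragment ion areas, peptide quantities aggregated from precursors, and a protein quantity taken from the Top N peptides), here is a hedged pandas sketch; the column names and the simple sum aggregations are assumptions for illustration, not Spectronaut's exact implementation:

    import pandas as pd

    # one row per fragment ion with its integrated peak area
    df = pd.DataFrame({
        "protein":   ["P1"] * 6,
        "peptide":   ["A", "A", "B", "B", "C", "C"],
        "precursor": ["A.2", "A.2", "B.2", "B.3", "C.2", "C.2"],
        "area":      [1e5, 2e5, 3e5, 1e5, 5e4, 5e4],
    })

    # MS2-level precursor quantity: sum of its fragment ion areas
    prec = df.groupby(["protein", "peptide", "precursor"])["area"].sum()
    # minor group (peptide) quantity: aggregate the precursor quantities
    pep = prec.groupby(["protein", "peptide"]).sum()
    # major group (protein) quantity: Top N (here N = 2) peptides by quantity
    prot = pep.groupby("protein").apply(lambda s: s.nlargest(2).sum())
    print(prot)  # P1 -> 700000.0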
Imputing strategy: This option is available for any Data Filtering selection other than Qvalue. The imputing strategy defines how to estimate the missing values (identifications not fulfilling the FDR threshold).
PTM Workflow
PTM localization: Calculates a PTM localization probability for all variable modification site options. A specified probability cut-off can be applied (default is 0.75).
Workflow
Here you can specify if you are running a label-free analysis or a different kind of quantification.
Multi-Channel Workflow Definition: Specifies what kind of workflow should be used for multi-channel peptide assays.
Template Correlation Profiling: takes the best peptide signal in all runs as a template to find low abundant signals in the rest of the runs.
iRT Profiling: takes the best peptide signal in all runs as a template and translates the empirical iRT to the integration boundaries of the low abundant signals in the rest of the runs.
Unify Peptide Peaks Strategy: Unify the peak picking across different charge states of the same modified peptide based on the highest scoring instance.
In-Silico Library Optimization: Changes a few parameters in the calibration and machine learning processes to optimize the DIA analysis towards very large (>800k precursors) spectral libraries with very low expected recovery rate. This is typically the case for in-silico predicted spectral libraries made from a theoretical FASTA digest.
Spectronaut is able to perform protein inference using the IDPicker algorithm (Zhang et al.,
2007). Protein grouping will be well defined and protein group counts will be comparable
across search engines and spectral libraries. Spectronaut also checks which peptides are
proteotypic. The options are:
• Automatic: (default) When using a Spectronaut formatted library (.KIT) and all information is
available, Spectronaut will use the protein inference parameters (the FASTA file used and
the specified protein cleavage rules) to re-calculate the protein inference based on all
identified peptides in the experimental DIA evidence.
• From Search Engine: will keep the protein inference as defined by the search engine within
the provided spectral library. The protein entries will not be re-grouped based on the
experimental DIA evidence.
• From protein-db matching: you can overwrite the existing grouping by choosing a FASTA
sequence database and specifying the protein cleavage rules.
Calculate Explained TIC: The explained TIC is the proportion of the TIC that can be associated with identified peaks. Choose if and how the relative explained TIC is calculated.
Calculate Sample Correlation Matrix: Choose whether you want to calculate the sample correlation matrix. If selected, a new plot called Sample Correlation will be available in the Post Analysis perspective, under the Analysis overview node. For large experiments it can be very time-consuming.
Differential Abundance Grouping: Select the biological unit you want your results to be based on, as defined in the quantification settings: Major (proteins) or Minor (peptides) Group.
Pipeline Mode
These settings are only relevant when running analyses from the Pipeline Perspective or from
command line.
Specify if you want your analysis to be saved (with or without ion traces), which reports should
be generated and saved, etc.
Peptides
Digest Type: Specific: both N- and C-terminus follow the specified digest rules. Semi-specific: only one of the termini follows the specified digest rules.
Enzyme/Cleavage Rules: Proteases used to in silico digest the proteins from the protein database(s). Defined in Databases → Cleavage Rules.
Missed Cleavages: How many consecutive cleavage sites the protease could miss.
Toggle N-terminal M: Pre-processing of the protein database by toggling (processing both with and without) the protein N-terminal methionine (when there is one) to account for N-terminal methionine excision.
Labeling
Labeling: If selected, there are up to 3 channels where one can specify which labels, applied from the modifications database, are in each channel.
The directDIA™ Settings consist of settings related to the Pulsar search and library generation, and settings related to the DIA Analysis.
Peptides
Enzyme/Cleavage Rules: Proteases used to in silico digest the proteins from the protein database(s). Defined in Databases → Cleavage Rules.
Digest Type: Specific: both N- and C-terminus follow the specified digest rules.
Missed Cleavages: How many consecutive cleavage sites the protease could miss.
Toggle N-terminal M: Pre-processing of the protein database by toggling (processing both with and without) the protein N-terminal methionine (when there is one) to account for N-terminal methionine excision.
Modifications
Identification
Peptide FDR: Specify the FDR threshold on peptide level.
PTM Localization Filter: Calculates a PTM localization probability for all variable modification site options. A specified probability cut-off can be applied (default is 0.75).
Spectronaut™ will, by default, calculate the ideal mass tolerances to generate the library.
Spectronaut performs two calibration searches: based on the first-pass calibration (rough
calibration), the ideal tolerance for the second-pass calibration is defined; based on the
second-pass calibration (finer calibration), the ideal tolerance for the main search is defined.
Spectronaut will do this under default settings (Dynamic).
However, Spectronaut allows you to set your preferred tolerances for the different MS instruments (Thermo Ion Trap, Thermo Orbitrap, TOF). Hence, for both the calibration search (second-pass, finer calibration) and the main search, you can define your tolerances:
• Dynamic: determined by Spectronaut based on the preceding search (default). You can set a correction factor for MS1 and MS2 levels (default is no correction = 1).
• Relative: set a relative mass tolerance in ppm for both MS1 and MS2 levels.
• Static: set a fixed mass tolerance in Thomson for both MS1 and MS2 levels.
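To make the difference between the Relative and Static options concrete: a relative tolerance scales with m/z, whereas a static tolerance is a fixed width in Thomson. A small worked example with illustrative values:

    def ppm_half_window(mz, ppm):
        # absolute half-width in Thomson for a relative (ppm) tolerance
        return mz * ppm / 1e6

    print(ppm_half_window(400.0, 20.0))   # 0.008 Th at m/z 400
    print(ppm_half_window(1200.0, 20.0))  # 0.024 Th at m/z 1200 (3x wider)
    # a Static tolerance of e.g. 0.02 Th applies the same width at every m/z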
Fragment Ion Selection Strategy: defines the strategy to be used for selecting the top N fragment ions per peptide precursor.
• Intensity Based: Prioritizes fragment ions by their intensity in the consensus spectra.
• Evidence Based: Prioritizes fragment ions by how often they have been observed in the experiment for the same precursor.
• Maximize sequence coverage: Groups fragment ions by type and position and ranks them based on the best fragment ion per group (either by intensity or by evidence) in an iterative manner.
Spike-in workflow: will create a light channel for all heavy (SIS) peptides that are identified without a light counterpart.
Labeled workflow: will detect the labeling setup of an experiment and add the channels that are missing for a given peptide.
Inverted spike-in: will create a heavy channel for all light peptides that are identified.
Use DNN Ion Mobility decides if ion mobility should be predicted with a deep neural network (DNN) for library generation.
• Auto: ion mobility is always predicted during library generation, but a predicted value is only used if no empirical ion mobility value is available for the peptide.
• Always use predicted Ion Mobility: the library will contain predicted ion mobility values, regardless of whether empirical information is available.
• Never predict Ion Mobility: ion mobility prediction will not be performed. The library will contain only empirical ion mobility values (if available).
Result Filters
Fragment Ions: Filters out peptides not fulfilling the conditions specified regarding fragment ions. Find more details by hovering over the option in the software. You can specify your criteria for:
Best N Fragments per Peptide: specify the allowed range for the number of fragment ions, ranked by response.
Best N Peptides per Protein Group: keep only the N most abundant peptides per protein.
FASTA matched: keep only peptides that are found in a user-specified FASTA sequence database and abide by the digest rules. Only available if protein inference is selected.
Data Extraction
Intensity Extraction: Defines how the ion intensity for a requested m/z value is calculated from
a specific scan.
• Maximum Intensity (default setting): Picks the highest data-point within the selected m/z
tolerance
• Sum within Tolerance: Sums up all intensity values that are within the selected m/z
tolerance range
• Highest Peak: Pick the intensity corresponding to the highest peak within the m/z tolerance
• Nearest Peak: Pick the peak closest to the calibrated m/z that is within the m/z tolerance
MS Mass Tolerance Strategy: Spectronaut™ will calculate the ideal mass tolerances for data
extraction and scoring based on its extensive mass calibration. However, you can also specify
your preferred tolerances for both MS1 and MS2 levels. You can choose amongst:
The XIC IM Extraction Window defines if ion mobility should be used to predict the elution of a
peptide:
• Dynamic (default setting): Spectronaut will dynamically adjust the XIC window in an Ion Mobility-dependent manner, based on a large sample set during calibration. A correction factor can be applied; e.g., a factor of 2.0 means that 2 times the window suggested by Spectronaut is used. The default settings are recommended for most applications.
• Full: Spectronaut will use the full ion mobility space to find the target.
The XIC RT Extraction Window defines if iRT should be used to predict the elution of a peptide:
• Dynamic (default setting): Spectronaut will determine the ideal extraction window
dynamically depending on iRT calibration and gradient stability. Sections of the gradient that
show higher variability during the calibration step will automatically be extracted using wider
windows.
• Static: Spectronaut will use a fixed width (in min).
• Full: Spectronaut will use the full gradient width to find the target.
Calibration
MZ Extraction Strategy: Defines which MS peak within the m/z tolerance will be picked during mass calibration.
MS1 Mass Tolerance Strategy: Determines the MS1 tolerance for XIC extraction and scoring during calibration. You can choose from the following three options:
MS2 Mass Tolerance Strategy: Determines the MS2 tolerance for XIC extraction and scoring during calibration. The options are the same as described above for the MS1 Mass Tolerance Strategy.
Exclude Duplicate Assays: Spectronaut will keep only the best performing assay if a peptide is duplicated in the libraries.
Generate Decoys:
• If unchecked, decoys have to be provided a priori in the library for Spectronaut to estimate the q-values (Qvalue). In this case the decoys need to be annotated in the column "IsDecoy" with the label "TRUE", and the targets with the label "FALSE".
• If checked, you can specify the following:
Decoy Method: defines how to generate the decoys. For details, please use the text hovers in the software. The default option is Mutated. Since Spectronaut 14, you can also choose the Preferred Fragment Source between two options: Template Fragments (carries over the fragmentation pattern from the template peptide and only recalculates the masses based on the new sequence) and NN Predicted Fragments (neural network-based strategy; generates the ideal fragmentation pattern based on the newly generated decoy sequence).
Decoy Limit Strategy: sets the maximum number of decoys to be generated:
• Dynamic: specify the number of decoys as a fraction of the number of targets.
• Static: choose a decoy limit as a fixed number of decoys.
• None: generate the same number of decoys as targets.
Machine Learning:
• Per Run (default): calculates the discriminant scores (Cscores) and q-values (Qvalues) per run.
• Across Experiment: builds an experiment-wise Cscore space. This can compromise sensitivity.
Precursor PEP Cutoff: Specify the Posterior Error Probability (Precursor PEP) cutoff below which a precursor is considered identified. This primarily affects which precursors are quantifiable. When using either the global or run-wise imputation strategy, precursors that do not satisfy the PEP cutoff will be imputed.
Single Hit Definition: Define what should be considered a protein single hit: stripped sequence, modified sequence, or peptide precursor ID. If only a single instance of the selected definition is identified across the entire experiment, the protein group will be marked as a single-hit protein group.
Exclude Single Hit Proteins: Discard protein groups identified with only one peptide hit (as defined above).
Pvalue Estimator: Specify how you prefer the null distribution to be estimated to calculate the p-values: Kernel Density or Normal Distribution Estimator.
Interference Correction: Exclude fragment ions detected as interferences across all runs (Bilbao et al., 2015). If checked, set a minimum number of features to be kept at MS1 and MS2 levels in order for Spectronaut to still perform the quantification.
Protein LFQ method: Specify how protein-level label-free quantification should be performed.
• Automatic will pick MaxLFQ for smaller experiments (<= 500 runs) and Quant 2.0 for experiments exceeding this cutoff. This setting is recommended because of the exponential run-time behavior of the MaxLFQ algorithm.
• MaxLFQ derives label-free quantities based on inter-run peptide ratios.
Proteotypicity Filter (only with automatic inference): Choose whether you want to quantify only based on non-shared peptides, either at the level of protein (very stringent) or at the level of protein group.
Major Group Quantity: Specify how you want the minor groups (peptides) to be used to calculate the major group (proteins) quantities.
Major Group Top N: Use the best N minor group elements to calculate the major group quantities. The elements are ranked by observability (evidence count within the experiment) and quantity (sum quantity within the experiment).
Minor Group Quantity: Specify how you want the precursors to be used to calculate the minor groups (peptides) quantities.
Quantity MS-Level: Choose which MS level you want to use to perform quantification: MS1 or MS2.
Quantity type: Decide which feature of the peaks should be used for quantification: area under the curve within integration boundaries or apex peak height.
Imputing strategy: This option is available for any Data Filtering selection other than Qvalue. The imputing strategy defines how to estimate the missing values (identifications not fulfilling the FDR threshold).
PTM Workflow
PTM localization: Calculates a PTM localization probability for all variable modification site options. A specified probability cut-off can be applied (default is 0.75).
Workflow
Here you can specify if you are running a label-free analysis or a different kind of quantification.
Method Evaluation: Will perform a separate Pulsar DIA search per condition in order to better compare different DIA methods within one experiment. This workflow is not meant for quantitative experiments.
MS2 DeMultiplexing: Allows the processing of alternating, shifted MS2 windows as presented in D. Amodei et al. 2019.
Template Correlation Profiling: takes the best peptide signal in all runs as a template to find low abundant signals in the rest of the runs.
iRT Profiling: takes the best peptide signal in all runs as a template and translates the empirical iRT to the integration boundaries of the low abundant signals in the rest of the runs.
Run Limit for directDIA Library: Defines the number of DIA runs which are randomly selected for the library construction in directDIA. The software records which specific runs were selected. If this number is set to -1, Spectronaut will use all the DIA runs for construction of that library.
Unify Peptide Peaks Strategy: Unify the peak picking across different charge states of the same modified peptide based on the highest scoring instance.
Spectronaut is able to perform protein inference using the IDPicker algorithm (Zhang et al.,
2007). Protein grouping will be well defined and protein group counts will be comparable
across search engines and spectral libraries. Spectronaut also checks which peptides are
proteotypic. The options are:
• Automatic: (default) When using a Spectronaut formatted library (.KIT) and all information
is available, Spectronaut will use the protein inference parameters (the FASTA file used
and the specified protein cleavage rules) to re-calculate the protein inference based on all
identified peptides in the experimental DIA evidence.
• From Search Engine: will keep the protein inference as defined by the search engine
within the provided spectral library. The protein entries will not be re-grouped based on
the experimental DIA evidence.
• From protein-db matching: you can overwrite the existing grouping by choosing a
FASTA sequence database and specifying the protein cleavage rules.
Post-Analysis
Calculate Explained TIC: The explained TIC is the proportion of the TIC that can be associated with identified peaks. Choose if and how the relative explained TIC is calculated.
Quick: will use a simple and fast feature detector and correct the results based on a heuristic technique.
Calculate Sample Correlation Matrix: Choose whether you want to calculate the sample correlation matrix. For large experiments it can be very time-consuming.
Run Clustering: Choose whether you want to cluster your samples and potential candidates, and how.
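Conceptually, the explained TIC described above is the summed intensity of identified peaks divided by the total ion current. A minimal sketch under that assumption (the feature detection that Spectronaut actually performs is not shown):

    def explained_tic(identified_peak_intensities, total_tic):
        # fraction of the total ion current accounted for by identified peaks
        return sum(identified_peak_intensities) / total_tic

    print(explained_tic([4.0e9, 1.5e9], 8.0e9))  # 0.6875, i.e. ~69% explained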
Pipeline Mode
These settings are only relevant when running analyses from the Pipeline Perspective or from
command line.
Specify if you want your analysis to be saved (with or without ion traces), which reports should
be generated and saved, etc.
Configure the library generation settings. Most settings are described below. For further
information, there are also helpful text hovers directly in the software.
Tolerances
Spectronaut™ will, by default, calculate the ideal mass tolerances to generate the library.
Spectronaut performs two calibration searches: based on the first-pass calibration (rough
calibration), the ideal tolerance for the second-pass calibration is defined; based on the
second-pass calibration (finer calibration), the ideal tolerance for the main search is defined.
Spectronaut will do this under default settings (Dynamic).
However, Spectronaut allows you to set your preferred tolerances for the different MS instruments (Thermo Ion Trap, Thermo Orbitrap, TOF). Hence, for both the calibration search (second-pass, finer calibration) and the main search, you can define your tolerances:
• Dynamic: determined by Spectronaut based on the preceding search (default). You can set a correction factor for MS1 and MS2 levels (default is no correction = 1).
• Relative: set a relative mass tolerance in ppm for both MS1 and MS2 levels.
• Static: set a fixed mass tolerance in Thomson for both MS1 and MS2 levels.
Identification
You can specify the search engine scoring type and thresholds
Protein Inference
You can have this option activated or deactivated. If you let Spectronaut do your protein inference, you can further refine the sequence settings here:
Enzyme/Cleavage Rules: Proteases used to in silico digest the proteins from the protein database(s). Defined in Databases → Cleavage Rules.
Digest Type: Specific: both N- and C-terminus follow the specified digest rules.
Toggle N-terminal M: Pre-processing of the protein database by toggling (processing both with and without) the protein N-terminal methionine (when there is one) to account for N-terminal methionine excision.
You have a number of options to filter the search engine results for library generation. There are filters at the level of the fragment ion and at the level of the precursor. The filters are quite self-explanatory; please use the hover text tools if you need more information. The most relevant ones are described below.
Fragment Ions: Filters out peptides not fulfilling the conditions specified regarding fragment ions. Find more details by hovering over the option in the software. You can specify your criteria for:
Best N Peptides per Protein Group: keep only the N most abundant peptides per protein.
iRT Calibration
iRT Reference Strategy: Define how the reference iRT is derived for iRT calibration:
• Deep Learning Assisted iRT Regression. Use the new Deep Learning algorithm to generate the iRT reference set. This is useful when working with non-model organisms hardly covered in Spectronaut's internal empirical iRT reference dataset.
• Empirical iRT Database. Use Spectronaut's internal empirical iRT reference database of more than 100,000 iRT reference peptides from multiple sources.
• Use RT as iRT. No iRT calibration will be performed. It should only be used if the peptide separation method is very stable, homogeneous and non-standard, such as capillary zone electrophoresis (CZE).
Minimum Rsquare: Choose how strict you want to be to accept the iRT calibration of your data by specifying the minimum coefficient of determination (R²) allowed.
Use Stripped Sequence to Identify Reference Peptides: If selected, any modifications of the iRT reference peptides will be ignored when identifying them for calibration.
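The Minimum Rsquare threshold can be pictured as the coefficient of determination of the iRT-to-RT calibration fit. The sketch below uses a simple linear fit with made-up values; Spectronaut's actual calibration can be non-linear (see the calibration charts later in this manual), so this only illustrates the R² check itself:

    import numpy as np

    irt = np.array([-20.0, 0.0, 25.0, 50.0, 75.0, 100.0])  # library iRT values
    rt  = np.array([10.1, 18.0, 28.2, 38.1, 47.9, 58.2])   # observed RT (min)

    slope, intercept = np.polyfit(irt, rt, 1)               # linear iRT -> RT fit
    pred = slope * irt + intercept
    r2 = 1 - np.sum((rt - pred) ** 2) / np.sum((rt - rt.mean()) ** 2)
    print(round(r2, 4))  # compared against the Minimum Rsquare threshold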
Fragment Ion Selection Strategy: defines the strategy to be used for selecting the top N fragment ions per peptide precursor.
• Intensity Based: Prioritizes fragment ions by their intensity in the consensus spectra.
• Evidence Based: Prioritizes fragment ions by how often they have been observed in the experiment for the same precursor.
• Maximize sequence coverage: Groups fragment ions by type and position and ranks them based on the best fragment ion per group (either by intensity or by evidence) in an iterative manner.
Spike-in workflow: will create a light channel for all heavy (SIS) peptides that are identified without a light counterpart.
Labeled workflow: will detect the labeling setup of an experiment and add the channels that are missing for a given peptide.
Inverted spike-in: will create a heavy channel for all light peptides that are identified.
Use DNN Ion Mobility decides if ion mobility should be predicted with a deep neural network (DNN) for library generation.
• Auto: ion mobility is always predicted during library generation, but a predicted value is only used if no empirical ion mobility value is available for the peptide.
• Always use predicted Ion Mobility: the library will contain predicted ion mobility values, regardless of whether empirical information is available.
• Never predict Ion Mobility: ion mobility prediction will not be performed. The library will contain only empirical ion mobility values (if available).
This chart shows the status of run calibration. Spectronaut™ supports non-linear gradients
using a refined calibration based on the initial calibration and detailed information in the user
library.
Figure 48. iRT Calibration Chart showing the non-linear transformation from library iRT to actual predicted
retention times. The chart is shown after extended non-linear calibration has been performed. The extended
calibration allows you to correct even small gradient fine-structure fluctuations to get the most accurate
retention time prediction for your library.
Based on extensive calibration using the respective calibration Kit and information about the
spectral library used, Spectronaut can determine the optimal XIC extraction width for your
experiment. This extraction width is calculated dynamically over the whole gradient and allows
Spectronaut to automatically adapt to areas with lower retention time prediction accuracy. The
window width can be influenced with a correction factor in the Settings perspective (Settings
→ Analysis → Peak Detection → Correction Factor).
Figure 49. The XIC Extraction Width plot gives insight into your gradient stability and the overall accuracy of your library's iRT values. The blue and the orange lines show the window selections as suggested by Spectronaut (blue) and as set by the user (orange → correction factor in the settings). These extraction window widths (y-axis) change over time (x-axis) based on gradient stability and iRT accuracy. The red dots show your library's iRT accuracy assuming a linear iRT to RT transformation. The green dots show the iRT accuracy using the extended non-linear iRT to RT transformation. The latter is used for the actual analysis.
Figure 50. The Ion Mobility Calibration plot shows the empirical ion mobility as a function of the ion mobility
values in the library.
Figure 51. The Ion Mobility Extraction Width plot shows the difference between predicted and measured ion
mobility for each ion mobility value as well as the applied extraction window.
The MS1 TIC Chromatogram plot shows the signal of all ions as a function of RT. The Base Peak Chromatogram shows the signal of the most intense precursor ion at a given RT.
Figure 52. The MS1 TIC chromatogram shows the total ion current of a run, giving insight into the amount of
sample injected.
The analysis log contains all the information pertaining to the analysis of your whole experiment. In the event of errors, one can consult the analysis log for detailed information about what went wrong.
Figure 53. The analysis log with detailed information about the analysis processes in Spectronaut.
Figure 54. The Ion Mobility Overview plot shows the ion mobility dimension as a function of m/z values. The red lines show consecutive DIA boxes. Their overlay with the registered MS1 signal can be used for DIA method optimization and validation.
MS2 XIC
The default plot on Elution Group (EG or peptide), Fragment Group (FG or peptide precursor)
and Fragment (F or fragment ion) level. This plot shows the extracted ion current
chromatogram of the selected peptide. The plot contains the XICs for all fragments present in the library. Additionally, the expected retention time is marked (black dotted line).
On the fragment level, this plot only shows the selected fragment in color and all others in gray.
There are a number of options available upon right-clicking in the plot such as switching the
y-axis to log scale, toggling accept/reject the peak, changing x-axis scale to iRT, showing
normalized intensities and showing the XIC chromatogram for the whole gradient.
Figure 55. XIC chromatogram for the peptide MPEMNIK++. The color coding of the fragments indicates an
overall good correlation to the expected fragment intensities. The dotted blue line indicates a potential
interference that was detected by Spectronaut and automatically removed for relative quantitation.
The XIC Sum chromatogram chart shows the selected peptide's quantitative information. The XIC shown is the sum of all fragment XICs that qualified for quantitation. Fragments that were excluded due to interfering signals are not used to calculate the sum XIC chromatogram.
This plot shows detailed information about the correlation of predicted and measured relative
fragment ion intensity. The predicted values in red correspond to the relative intensities
provided by the spectral library. The black lines correspond to the relative measured intensity
of each fragment ion. Fragment ions with potential interferences are displayed as dotted lines.
Figure 57. The MS2 Intensity Correlation plot for a given peptide precursor. The plot indicates a very good
correlation between the expected relative intensities (red) and the measured intensities of the library
fragments.
This plot shows the monoisotopic precursor plus its most abundant isotopic forms as an XIC
chromatogram. The XIC chromatograms on MS1 and MS2 level are expected to have identical
apex retention times and elution shapes. As with the MS2 XIC chromatogram plot, the MS1
XIC chromatogram is also color coded to reflect the predicted relative intensities. A color
coding from red (highest) to blue (lowest) indicates high correlation with the predicted
abundance.
Figure 58. The MS1 Isotope Envelope XIC chromatogram for the precursor IILDLISESPIK++ as extracted by
Spectronaut. The color pattern again indicates a high correlation to the expected relative abundances.
Similar to the MS2 Intensity Correlation plot, this plot highlights the correlation between the
expected and the observed fragment isotope patterns. A high correlation between the
measured (black) and the predicted (red) abundances signals high confidence in the
identification and quantification.
Figure 59. The envelope correlation plot for two fragments with very high correlation with respect to the
predicted abundance. The measured intensities (black) are almost perfect mirror images of the predicted
(red) intensities.
Figure 60. The new PTM localization plot shows the different possible modified versions of a peptide assay, depicting the corresponding scores for each of the fragment ions, either confirming or refuting the given site probabilities.
The MS1 and MS2 XIC Alignment plots allow visualization of the extracted ion chromatograms of a peptide across runs. This visualization allows you to easily and quickly browse through thousands of XICs at a glance. By right-clicking on the panel you can also sort the grid by experiment conditions and replicates - conditions are sorted in rows and replicates in columns. The name above the XIC plot is a clickable link, which redirects you to that specific precursor in a run.
Figure 61. The MS2 XIC Alignment across runs. The x-axis is automatically changed to iRT to reduce
chromatographic variance. The axis can be changed to retention time by right-clicking on the plot and un-
selecting the "Use iRT Scale" option.
XIC graph
This plot combines the MS2 XIC and the MS1 Isotope Envelope XIC together and across all your runs. This is a convenient way of seeing how well the MS2 and MS1 peaks correlate and how reproducibly they behave across runs.
This plot allows you to show all sum XIC chromatograms of your selected peptide from all runs
as an overlay plot. The x-axis scale is by default in iRT but on right-click can be changed to
actual retention time.
Figure 63. The iRT XIC Sum Overlay chart for the peptide IILDLISESPIK++. The 4 XICs correspond to the sum XIC of one peptide in the 4 different runs loaded for this experiment. The title of the plot additionally shows the peptide's coefficient of variation, in this case 4.5%.
Ion Mobilogram
For data that use ion mobility as a fourth dimension for ion separation, such as dia-PASEF data, the mobilogram is available. The mobilogram allows for manual peak integration in the ion mobility dimension.
Figure 64. The ion mobilogram for visualization and peak re-integration in the ion mobility dimension.
Similar to the MS2 XIC Alignment, this plot gives detailed information about the signal stability for one peptide across several runs. The different bars show the relative abundance of each fragment ion across multiple runs. Using this plot, one can quickly identify an inconsistent signal by the change in the color pattern. Right-click on the plot and un-select "Normalize" to show intensities on an absolute scale.
Figure 65. The MS2 Intensity alignment for a peptide containing 8 fragment ions. Each fragment ion's relative intensity compared to the total peak height is indicated using a differently colored bar. The peptide was targeted in 24 runs. An inconsistent signal can be easily identified due to the sudden change in the color pattern.
Similar to the MS2 Intensity Alignment chart, the Cross Run RT Accuracy plot allows one to
quickly validate the peak picking across several LC-MS runs. The x-axis shows colored bars
that correspond to the peptide in different runs. On the y-axis one can see the retention time
in iRT.
The height of each box corresponds to the peak width at the start and the end iRT according
to the y-axis. The line through the middle of the box shows the apex retention time in iRT while
the blue colored boxes in the back show the total XIC extraction width. The colors of the bars
again correspond to the relative intensities of the measured fragment ions. The bar with the
green background is the currently selected node. The black, dotted, horizontal line
corresponds to the expected retention time in iRT. You can hop to any other node by clicking
on the colored bar.
Figure 66. The Cross Run RT Accuracy plot for a peptide measured in 4 different runs. The multi-colored bars correspond to the detected peak, with the colors encoding the relative fragment intensities and the upper and lower boundaries of the bar corresponding to the peak's start and end retention time in iRT.
This plot shows the MS1 signal at apex retention time for a given peptide. The chart
automatically zooms in on the isotopic envelope and labels the different peaks accordingly.
The red bars indicate the expected relative intensities of the different isotopic forms of the
precursor.
Figure 67. MS1 spectrum at apex for the peptide IILDLISESPIK++ showing the monoisotopic precursor plus
its first 3 isotopic forms.
Similar to the MS1 Spectrum at Apex, this chart shows the full recorded MS2 spectrum
corresponding to a given peptide’s LC-peak apex. By default, the plot only highlights the
fragments as provided by the library. Additionally, other fragments can be calculated and
annotated by right-clicking on the plot and selecting "Show All Theoretical Fragments". You
can also use the "Set Fragment Filter" window to open up the range of fragments and show,
for instance, a-series ions.
Figure 68. MS2 Spectrum at Apex for the peptide IILDLISESPIK++. The option "Show All Theoretical Fragments" was turned on and a-ions were selected in the fragment filter editor.
The PDM plot is newly introduced with Spectronaut 13. This plot shows the MS features for a
precursor and how well they match the expected values. The main area of the plot (on the right
side) shows the full recorded MS2 spectrum corresponding to the peptide LC-peak apex.
Highlighted, you can see expected fragments for that peptide (full line denotes the calibrated
m/z value and the dashed line denotes the theoretical m/z value of a specific ion). At the
bottom of this plot, the mass errors of each of the highlighted fragments are shown. On the
left side, the MS1 spectrum at Apex plot is depicted (similar to the MS1 Spectrum at Apex plot
described before, the full line denotes the calibrated m/z value and the dashed line denotes
the theoretical m/z value of a specific ion). Finally, at the top-left side, you can choose among
several elements to show on the MS1 and MS2 plots. Some of the options are:
• Label the fragment peaks with the predicted ion name or with the m/z of the peak
• Show the mass error in ppm or in Thomson
• Choose which fragments to highlight on the MS2 spectrum: the ones used for the assay
(Library Fragments), the ones detected in the library but not selected (Extended Library
Fragments), or All Theoretical. You can also filter by fragment length, by type and by neutral
loss.
Figure 69. Peptide Data Match (PDM) plot for the precursor SHGQDYLVGNK+++. At the top-right panel you can select what to highlight on the MS1 and MS2 spectrum plots. In this case, all fragments in the library are highlighted.
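For reference, the mass errors shown at the bottom of the PDM plot can be expressed either as an absolute difference in Thomson or relative to the theoretical mass in ppm. A small worked example with made-up values:

    def mass_error(measured_mz, theoretical_mz):
        error_th = measured_mz - theoretical_mz          # absolute error in Thomson
        error_ppm = error_th / theoretical_mz * 1e6      # relative error in ppm
        return error_th, error_ppm

    print(mass_error(500.2505, 500.2500))  # (~0.0005 Th, ~1 ppm)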
Protein Coverage
Spectronaut gives you a detailed overview of a protein’s coverage within your analysis. The
protein coverage plot shows you all the peptides of a protein that were targeted within your
current experiment. The color coding by default indicates the confidence level for each
peptide. This can be toggled on right-click to show the proteotypicity status for each peptide.
This option is only available if protein inference was enabled during the DIA analysis. By
choosing the Digest Specificity option, you can see if the identified peptides were digested
specifically. For PTM workflows, you can also visualize PTM site annotation. Only confidently
localized PTMs are highlighted in the plot. If you click on one of the peptides in the plot, you
can additionally visualize its XIC.
Figure 70. Protein coverage plot. The protein sequence coverage can be displayed at run and experiment level. Right-clicking on a peptide also allows you to see proteotypicity or digest specificity.
In the Analysis Perspective, right-click on the experiment tab. A context menu will open
with several functionalities to apply to the analysis (Figure 21). The most common
actions are available in intuitive icons displayed below the experiment tab.
Add runs and Remove runs: Add or remove runs from the analysis. For the changes to take effect, you need to refresh the Post-Analysis.
Map missing runs: If Spectronaut™ lost the link with the run files, you can map them back. If your analysis contains XICs, this is not needed.
Recalculate Qvalues: If you manually change the peak picking, by selecting a different peak or changing the integration boundaries, the q-values need to be re-calculated.
Refresh Post Analysis: If you do any modification on the analysis, such as manually modifying a peak or adding new runs, the Post-Analysis has to be refreshed.
Re-extract all XICs: If you saved your experiment without XICs, it is recommended that you re-extract them from the run files for better performance.
Export Experiment Settings: Export a report with the settings you used to run your analysis.
Group by: You can group your data tree under different criteria. The default and recommended grouping for better performance is by precursor window.
Reset All Peaks: Revert the manually modified peak picking back to the automatic one. The q-values need to be recalculated.
Commit Library Changes: If you refined your library in the Analysis Perspective, the changes will not take effect until you commit them (see Section 3.4.2.5).
Export All… (Ctrl + R): With this function, you can batch export many reports of your choice relevant to your analysis.
Settings: This tool allows you to explore and change many settings of your analysis. Changing these settings will let you recalculate your analysis, which is significantly less expensive than running it again from scratch.
Run Identifications
The Run Identification panel summarizes the number of identifications per run, displayed in a
bar plot (Figure 71) and table format (Figure 72). On right-click you can change the basis of
quantification between precursor, modified sequence, stripped sequence and protein group.
Figure 71. The upper-panel plot under the "Run Identifications" node shows the number of precursors per
run and their completeness.
Figure 72. The lower-panel table under the "Run Identifications" node shows the number of precursors per
run and their completeness.
In the data completeness plot the "Cumulative Sparse Profiles" describes the cumulative
number of identifications when looking at 1 to n runs, where n corresponds to the complete
experiment. The "Cumulative Full Profiles" corresponds to identifications that were consistent
across 1 to n runs, i.e. identified in all the runs currently looked at. The Coefficients of Variation
plot shows the distribution of CVs in the experiment. On right-click you can change the basis
of quantitation between precursor, modified sequence, stripped sequence and protein group.
Figure 73. The upper plot under the "Data Completeness" node shows you the difference between cumulative sparse and full profiles. The green bars represent the growth of cumulative precursor identification. For example, if you only consider the first 3 runs (x = 3), 26,878 precursors were identified at least once in these 3 runs. The blue bars represent the decline of full profiles. For example, after the first 3 runs, only 22,747 precursors were identified in all 3 previous runs. Adding more runs, the green bars can only ever go up or remain constant while the blue bars can only ever go down or remain constant. As with many other plots, you can change the context to display the data completeness on precursor, peptide, protein-group or protein level.
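The sparse and full profiles in this plot can be reproduced from per-run identification lists as a running union and intersection, as in this small sketch with made-up precursor IDs:

    runs = [
        {"p1", "p2", "p3"},        # precursors identified in run 1
        {"p1", "p2", "p4"},        # run 2
        {"p1", "p3", "p4", "p5"},  # run 3
    ]

    seen, common = set(), None
    for n, ids in enumerate(runs, start=1):
        seen |= ids                                        # cumulative sparse profile (union)
        common = ids if common is None else common & ids   # cumulative full profile (intersection)
        print(n, len(seen), len(common))
    # run 1: 3 sparse / 3 full, run 2: 4 / 2, run 3: 5 / 1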
Figure 75. The Ranked Protein Groups plot shows all the identified protein groups ranked according to quantitative value. You can hover over each data point to find out which protein group corresponds to each point. In addition, by selecting one of the boxes above the graph, you can choose to show labels of all candidates of the differential analysis or only of the selected ones.
Coefficients of Variations
Figure 76. The upper plot under the "Coefficients of Variation" node shows the %CV distribution for all
conditions in your experiment. You can also change the context by right-clicking on the plot to show you the
CVs for precursor, peptide, protein-group and protein quantities or to show the CV distribution across all
conditions.
CVs Below X
Figure 78. The CVs below X plot shows the number of precursors that were below either 20% or 10% CVs.
The red bar shows the number of all identifications for each condition regardless of quantitative precision.
The bar for ≤ 20% CVs also includes all counts from ≤ 10% CVs. As with the previous figures, you can change
the context of the plot by right-clicking on it to select either precursor, peptide, protein-group or protein scope.
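The %CV used in these plots is the standard deviation of the quantities across replicates divided by their mean, expressed in percent. A minimal sketch with made-up protein quantities (using the sample standard deviation):

    import statistics

    def percent_cv(values):
        return statistics.stdev(values) / statistics.mean(values) * 100

    replicate_quantities = [1.00e6, 1.08e6, 0.95e6]
    print(round(percent_cv(replicate_quantities), 1))  # ~6.5 %CV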
In the normalization you can see boxplots of responses for the individual runs before and after
normalization.
Figure 79. The "Normalization" node shows details about the normalization status of your experiment. The left side shows boxplots of precursor quantities before normalization for each run. The right side shows boxplots of the same precursor quantities after normalization.
Figure 80. The Coefficients of Variation plot shows the distribution of coefficients of variation in each of the experimental conditions. By right-clicking on the plot, the CV basis can be changed from precursor to peptide, protein or protein group. Also, you can select to show data per sample or per whole experiment.
Figure 81. The Datapoints per Peak plot shows the data points per peak in each condition as well as their distribution with the median. By right-clicking, the grouping of the data can be changed to grouping per sample, per condition or across the whole experiment.
Figure 82. The Binned Identification plot shows relative IDs plotted for Retention Time buckets, Log10 Quantity buckets and m/z buckets. Data is grouped by condition, which, upon right-clicking, can be changed to run or fraction.
LFQBench
Figure 83. The LFQBench plot for mixed proteomes of three organisms. The y-axis (log2 ratio) shows the ratios of the mixed proteomes based on experimental data. The expected ratios (dashed lines) can be set by the user.
Heatmap
Figure 84. Heatmap showing the clustering of 9 runs from 3 conditions. Runs within the same condition
cluster nicely as illustrated by the condition-based color code in the bottom of the heatmap and the x-axis
dendrogram.
Figure 85. Volcano plot showing the potential candidates of an experiment containing 3 conditions with 3 replicates each. By default, the filters are set to ≥ 1.5 absolute fold change and ≤ 0.05 q-value. Selecting the Highlight Candidates box above the graph will show labels of all candidates. Custom Selection can be chosen to highlight only selected ones.
Figure 86. The sample correlation matrix shows correlation of precursor quantities between all samples. The
rows and columns of the matrix are ordered by condition and replicate annotation. A high correlation between
technical replicates of the same sample is to be expected while low correlation between different samples
might indicate biological variance. By default, the coloring range is set from 0.75 to 1.0 but can be changed
via right-click option.
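The matrix shown in Figure 86 is a run-versus-run correlation of precursor quantities. It can be approximated from a wide quantity table with pandas, as in the sketch below; the run names and values are made up, and the exact correlation measure and any transformation Spectronaut applies are not reproduced here:

    import pandas as pd

    # rows = precursors, columns = runs, values = quantities
    quant = pd.DataFrame({
        "Ctrl_R1":  [1.0e5, 2.2e6, 3.1e4],
        "Ctrl_R2":  [1.1e5, 2.0e6, 3.3e4],
        "Treat_R1": [2.4e5, 1.1e6, 8.0e4],
    }, index=["prec1", "prec2", "prec3"])

    print(quant.corr())  # pairwise correlation between runs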
PTM Analysis plots in the Post Analysis Perspective are available when the PTM workflow is
selected for the analysis of a given experiment.
Figure 87. Principal Component Analysis plot shows clustering of the samples based on their modification
sites profiles.
Volcano Plot
Figure 88. The Volcano Plot of the PTM analysis shows the -Log10 p-value plotted against the Log2 fold change of differential abundance on the modified site level. By default, the filters are set to ≥ 1.5 absolute fold change and ≤ 0.05 q-value. Selecting the Highlight Candidates box above the graph will show labels of all modified site candidates. Custom Selection can be chosen to highlight only user-selected ones.
Figure 89. The PTM vs Protein Fold Changes plot shows the Log2 ratios of the protein groups plotted against the Log2 ratios of the PTM sites. Selecting the Highlight Candidates box above the graph will show labels of all PTM differential analysis candidates. Custom Selection will enable highlighting only candidates of interest.
Figure 90. Example of the Modification Enrichment plot showing the percentage of the identified precursors that carry a phosphorylation at a particular amino acid in each of the runs. The plot can display the relative identification count (as in the graph above) or the quantity.
Many headers have a text hover tool directly in the software. If you don’t find the information
you are looking for, do not hesitate to contact us at support@biognosys.com.
Headers related to Protein Group (PG) as defined in the settings. Most headers related to PGs are self-explanatory. Here are the most relevant and some which are not too obvious.
PG.ProteinGroups: One or several protein groups separated with a "|". Protein ids within protein groups are separated with a ";". The protein groups can either originate from the Spectronaut IDPicker protein grouping or from the search engine used to generate the spectral library.
PG.Qvalue: The q-value (FDR) for that PG. The q-values for protein groups are experiment-wise.
PG.IsSingleHit: True or False. It tells you whether the PG was found with only one hit, as defined in the settings.
PG.ProteinAccessions: The protein accessions in the same order as the protein ids in "PG.ProteinGroups". The value corresponds to what was defined by the parsing rule. If all the values are identical, a unique value is reported.
PG.ProteinDescriptions: The protein descriptions in the same order as the protein ids in "PG.ProteinGroups". The value corresponds to what was defined by the parsing rule. If all the values are identical, a unique value is reported.
PG.ProteinNames: The protein names in the same order as the protein ids in "PG.ProteinGroups". The value corresponds to what was defined by the parsing rule. If all the values are identical, a unique value is reported.
PG.FastaHeaders: The FASTA headers in the same order as the protein ids in "PG.ProteinGroups". The value corresponds to what was defined by the parsing rule. If all the values are identical, a unique value is reported.
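When post-processing a report, the separators described for PG.ProteinGroups ("|" between protein groups, ";" between protein ids within a group) can be split as follows; the example value is made up:

    value = "P01857;P01859|Q9Y6R7"
    groups = [group.split(";") for group in value.split("|")]
    print(groups)  # [['P01857', 'P01859'], ['Q9Y6R7']]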
Peptide headers
Headers related to Peptides (PEP) as defined in the settings. Many headers related to Peptides are self-explanatory. Here are the most relevant and some which are not too obvious.
PEP.UsedForProteinGroupQuantity: True or False. Tells you whether this peptide was used to calculate PG quantities, as defined in the settings.
Headers related to Elution Groups (EG). Many headers related to EG are self-explanatory. Here are the most relevant and some which are not too obvious.
EG.PTMPositions [Mod-Name]: The sequence positions for all amino acids that could carry this modification.
EG.MeanTailingFactor: The average tailing factor of the elution group across all the fragment ions, determined at the FWHM.
EG.AllProteinAccessions: All protein accessions this peptide points to. This field is only reported when the Spectronaut protein inference was used. It represents the input to the IDPicker protein grouping algorithm.
Headers related to Fragment Group (FG). FG is only relevant in labeled and spike-in workflows. Two FGs belong to one EG. The FG id corresponds to the EG id plus the isotopic labelling. Many headers related to FG are self-explanatory. Here are the most relevant and some which are not too obvious.
FG.Label: A label of the fragment ion group. The label is not necessarily unique and therefore not intended for structuring data.
FG.TotalPeakArea: The summed-up peak area of all fragment ions for the corresponding peptide precursor.
FG.TotalPeakHeight: The summed-up apex peak height of all fragment ions for the corresponding peptide precursor.
Headers related to Fragment ions. If you choose fragment ion level information in your report, you will have one row per fragment ion. This can make the report considerably larger.
Most headers related to Fragments are self-explanatory. Here are the most relevant and some which are not too obvious.
F.NormalizePeakArea: The quantitative value calculated as the area under the curve.
The PTM site report is available whenever the experiment was analyzed with the PTM workflow. The most relevant PTM site headers are listed and described below.
PTM.CollapseKey: The key used to group and condense parent peptides together into the PTM site object.
PTM.FlankingRegion: The flanking region of amino acids around the site location modified with the given PTM.
PTM.Group: All parent peptide sequences that were condensed in the PTM site object.
PTM.ModificationTitle: The title of the modification that is represented by this PTM site object.
PTM.Multiplicity: The maximum multiplicity observed for this PTM site. The multiplicity is defined by the number of modifications of the same type being identified on any of the parent peptides containing this modification.
PTM.NrOfCollapsedPeptides: The number of peptides that were condensed into this PTM site object.
PTM.Quantity: The condensed quantity of all parent peptides that cover this PTM site.
PTM.SiteLocation: The amino acid sequence position of this PTM site in the parent protein sequence.
PTM.SiteProbability: The highest observed site probability corresponding to this PTM site in this run/sample.