
Solar Physics
DOI: 10.1007/•••••-•••-•••-••••-•
A Multi-Wavelength Analysis of Active Regions and Sunspots by Comparison of Automatic Detection Algorithms
C. Verbeeck1 · P.A. Higgins2 · T. Colak3 · F.T. Watson4 · V. Delouille1 · B. Mampaey1 · R. Qahwaji3
arXiv:1109.0473v1 [astro-ph.SR] 2 Sep 2011
© Springer ••••
Abstract Since the Solar Dynamics Observatory (SDO) began recording ∼ 1 TB
of data per day, there has been an increased need to automatically extract
features and events for further analysis. Here we compare the overall detection
performance, correlations between extracted properties, and usability for feature
tracking of four solar feature-detection algorithms: the Solar Monitor Active Re-
gion Tracker (SMART) detects active regions in line-of-sight magnetograms; the
Automated Solar Activity Prediction code (ASAP) detects sunspots and pores in
white-light continuum images; the Sunspot Tracking And Recognition Algorithm
(STARA) detects sunspots in white-light continuum images; the Spatial Possi-
bilistic Clustering Algorithm (SPoCA) automatically segments solar EUV images
into active regions (AR), coronal holes (CH) and quiet Sun (QS). One month of
data from the SOHO/MDI and SOHO/EIT instruments during 12 May – 23 June
2003 is analysed. The overall detection performance of each algorithm is bench-
marked against National Oceanic and Atmospheric Administration (NOAA) and
Solar Influences Data Analysis Centre (SIDC) catalogues using various feature
properties such as total sunspot area, which shows good agreement, and the
number of features detected, which shows poor agreement. Principal Component
Analysis indicates a clear distinction between photospheric properties, which are
highly correlated to the first component and account for 52.86% of variability in
the data set, and coronal properties, which are moderately correlated to both the
first and second principal components. Finally, case studies of NOAA 10377 and
10365 are conducted to determine algorithm stability for tracking the evolution
of individual features. We find that magnetic flux and total sunspot area are
the best indicators of active-region emergence. Additionally, for NOAA 10365,
it is shown that the onset of flaring occurs during both periods of magnetic-flux
emergence and complexity development.
1 Royal Observatory of Belgium, Belgium email: cis.verbeeck@oma.be
2 Trinity College Dublin, Ireland email: pohuigin@gmail.com
3 University of Bradford, UK email: t.colak@bradford.ac.uk
4 University of Glasgow, UK email: f.watson@astro.gla.ac.uk

C. Verbeeck, et al.
Keywords: Active Regions; Magnetic Fields; Coronal Structures; Sunspots
1. Introduction
In the 1960s, NASA launched the Pioneer 6, 7, 8, and 9 spacecraft, which were
tasked with observing the solar wind and interplanetary magnetic fields, forming
the first space-based space-weather network and recording 512 bits per second.
By comparison, the recently launched Solar Dynamics Observatory (SDO) is
currently relaying solar data back to Earth at a rate of 150 000 000 bits per
second. With SDO returning the equivalent of an image with 4096 by 4096
pixels every second, human analysis of every image would require a large team
of people working 24 hours a day. The technological advances that have allowed
the increased flow of data, such as improving communication bandwidths and
onboard processing power, allow us to record data with a much greater cadence
and spatial resolution than ever before. However, there are problems with the
storage, transfer, and analysis of such a large flow of data. SDO generates around
1 TB of data per day, which is unprecedented in solar physics. Getting this volume
of data to researchers around the world, as well as storing it in convenient places
for analysis, is essential to make good use of it. An effective solution to the
problem is to use automated feature-detection methods, which allow users to
selectively acquire interesting portions of the full data set.
Development of automated solar feature detection and identification meth-
ods has increased dramatically in recent years due to the growing volume of
data available. An overview of the fundamental image-processing techniques
used in these algorithms is presented in Aschwanden (2010). These techniques
are used to detect many features in various types of observations at differ-
ent heights in the solar atmosphere (Pérez-Suárez et al., 2011). In the present
work, we focus on detecting sunspot groups and active regions in photospheric
continuum images, magnetograms, and EUV images. Previously, detection of
sunspot groups in photospheric images was investigated by Zharkov et al. (2004)
and Curto, Blanca, and Martínez (2008). As well as detecting sunspot groups,
Nguyen, Nguyen, and Nguyen (2005) and Colak and Qahwaji (2008) also make
automated classifications. The detection of active regions in magnetograms is
explored by McAteer et al. (2005), LaBonte, Georgoulis, and Rust (2007a),
Lefebvre and Rozelot (2004), and Qahwaji and Colak (2006), while Dudok de Wit (2006) introduces a
supervised segmentation of EUV images into AR, CH, and QS regions.
The purpose of this article is to determine the robustness of four algorithms
for detecting and physically characterising active regions and sunspot groups
by comparison of their outputs. We determine overall detection performance,
the correlations between extracted feature properties using Principal Compo-
nent Analysis, and the usability of these algorithms for tracking feature evo-
lution over time. The tools that we consider are the Solar Monitor Active
Region Tracker (SMART: Higgins et al., 2011) which detects magnetic features
using magnetograms, the Automated Solar Activity Prediction code (ASAP:
Colak and Qahwaji, 2009) which detects sunspots and pores using photospheric
intensity images, the Sunspot Tracking And Recognition Algorithm (STARA:

Watson et al., 2009) which also detects sunspots in photospheric intensity im-
ages, and the Spatial Possibilistic Clustering Algorithm (SPoCA: Barra et al.,
2009) which detects active regions in the corona using extreme ultraviolet images.
More detail on how these algorithms operate is provided in Section 3.
The overall performance of the algorithms is compared by determining the
total number of features detected as well as their full-disk area. Our meth-
ods are benchmarked against National Oceanic and Atmospheric Administra-
tion (NOAA) and Solar Influences Data Analysis Centre (SIDC) catalogues.
Few studies include comparisons of detection methods or different data types.
Benkhalil et al. (2006) compare the detection of active regions in several data
types using a single region-growing method. Other studies compare the detection
of features in magnetograms using a variety of region-growing and morphological
methods (DeForest et al., 2007; Parnell et al., 2009). Direct comparison of algo-
rithms is important for their characterisation, as each is designed in a different
way to detect features for a specific purpose.
Correlations between the properties determined by the algorithms are in-
vestigated using Principal Component Analysis (Jolliffe, 2002). PCA has been
used previously for various purposes in solar-physics and space-weather litera-
ture, e.g. to detect outliers (Sarro and Berihuete, 2011), to reduce dimension-
ality (Dudok de Wit and Auchère, 2007), or for exploratory data analysis of
space-weather data sets (Habash Krause, Franz, and Stevenson, 2011).
Finally, the stability of the algorithms is tested for tracking feature evolu-
tion through time. The evolution of two ARs is studied in detail, including
their emergence in several layers of the solar atmosphere. Lites et al. (1995)
present a similar multi-layered analysis of the emergence of an AR. As non-
potentiality increases in an AR, it may begin to exhibit enhanced coronal activ-
ity. This effect has been studied in many articles, and it is related to dynamic
behaviours such as helicity injection (Morita and McIntosh, 2005), turbulent
cascades (Hewett et al., 2008; Conlon et al., 2008, 2010), enhanced polarity sepa-
ration line gradient (Falconer, Moore, and Gary, 2008), and changes in magnetic
connectivity (Georgoulis and Rust, 2007; Ahmed et al., 2010). In this article
we study multiple behaviours in the same AR using magnetic property de-
terminations. Finally, the decay of the AR in the corona and photosphere is
compared. To our knowledge, this is the first time that automated feature-
detection algorithms have been used to study temporal evolution using properties
of magnetic non-potentiality, sunspot characteristics, and coronal activity of ARs
simultaneously.
The following sections detail these investigations. Observations used in this
study are described in Section 2, and the four algorithms to be compared are
introduced in Section 3. Our results are presented in Section 4, including an
evaluation of the algorithms’ overall performance, a correlation study of the
complete sample of active regions and a detailed case study of two different
active regions. Finally, a discussion of the results and concluding remarks are
presented in Section 5.

2. Observations
In this study we analyse data from the interval 12 May – 23 June 2003. The
detections obtained from each algorithm for the entire data set are studied as a
whole in Section 4.2 and NOAA ARs 10377 and 10365 are individually studied
in detail in Section 4.3. This particular data set was selected for the diversity of
solar features present. SOHO/MDI magnetograms are used for magnetic region
detection by SMART, while SOHO/MDI continuum images are used for sunspot
detection by ASAP and STARA, and SOHO/EIT images are employed for active
region detection by SPoCA. These algorithms are described in Section 3.
The MDI instrument on SOHO provides almost continuous observations of
the Sun in the white-light continuum, in the vicinity of the Ni i 676.78 nm
photospheric absorption line. These photospheric intensity images are primar-
ily used for sunspot observations. MDI data are available in several processed
“levels”. We used level-2 images, which are smoothed, filtered, and rotated
(Scherrer et al., 1995). SOHO provides two to four MDI photospheric intensity
images per day with continuous coverage since 1995.
Using the same instrument, level-1.8 line-of-sight (LOS) MDI magnetograms
are recorded with a nominal cadence of 96 minutes. The magnetograms show the
magnetic fields of the solar photosphere, with negative (represented as black) and
positive (as white) areas indicating opposite LOS magnetic-field orientations.
The Extreme ultraviolet Imaging Telescope (EIT: Delaboudinière et al., 1995),
onboard SOHO, delivers synoptic observations consisting of 1024 by 1024 pixel images
of the solar corona recorded in four different wavelengths every six hours. Every
SPoCA segmentation in this article was based on a pair of 17.1 and 19.5 nm
EIT images. All images used have been preprocessed using the standard
eit_prep procedure of the SolarSoftware library. A fixed-centre segmentation with
six classes was performed on the logarithms of the image pixel values. The AR
centre values are 401.74 and 324.25 DN s−1 in 17.1 and 19.5 nm, respectively.
These values were derived from a cumulative run of SPoCA on a data set of
monthly EIT image pairs from February 1997 until April 2005; see Barra et al.
(2009).
In the case studies presented in Section 4.3, we compare observations of NOAA
10365 and 10377 with flares characterised by the Reuven Ramaty High Energy
Solar Spectroscopic Imager (RHESSI) team and distributed in the RHESSI flare
list.
3. Methods
SMART, ASAP, and STARA detect photospheric features such as active regions
and sunspots while SPoCA uses images in the extreme ultraviolet to observe
active regions at the coronal level. In this section, we will describe each of the
feature-detection algorithms used (Sections 3.1, 3.2, 3.3, and 3.4). The outputs
from each of the algorithms are associated using the method explained in Section
3.5.

[Figure 1 image: "Magnetic Structure Detections 10-Jun-2003"; axes X (arcsecs) and Y (arcsecs), each spanning -1000 to 1000; detections labelled AR, PL, UD, and UE.]
Figure 1. An example set of SMART detections from 10 June 2003. PL, UD, and UE identify
three classes of unipolar feature, while AR denotes multipolar features.
3.1. The SMART algorithm
The Solar Monitor Active Region Tracker (SMART: Higgins et al., 2011) is an
algorithm that uses magnetograms to automatically extract, characterise, and
track active regions over multiple solar rotations – from first emergence to decay.
This allows one to study the complete life-cycle of ARs. The algorithm uses a
combination of image-processing techniques to determine the boundary of an
AR. Two consecutive line-of-sight magnetograms are smoothed using a Gaussian
kernel with a full-width at half-maximum of five pixels and thresholded by 70 G
to identify potential features. The two detections are overlaid to identify and
remove features that are not present in both magnetograms. The remaining
detection boundaries are then dilated by ten pixels to create the final mask.
Dilation is performed to include nearby decaying and plage fragments that
may have separated from the main AR. This is intended to help conserve the
measured polarity balance of the AR as it evolves. An example set of SMART
detections is shown in Figure 1. In this article, SOHO/MDI LOS magnetograms
are used for detection, but recently the algorithm has been adapted for use with
SDO/HMI magnetograms (for near-realtime detections, see http://solarmonitor.org/smart disk).
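
The detection steps described above (smoothing, thresholding, a two-frame consistency check, and dilation) can be sketched in a few lines. The following is only a minimal illustration on small nested lists, not the SMART implementation: all function names are hypothetical, the conversion from FWHM to the Gaussian width uses the standard relation sigma = FWHM / (2 sqrt(2 ln 2)), the two-frame check is simplified to a pixel-wise overlap rather than a feature-level comparison, and the dilation uses a square neighbourhood.

```python
import math

def gaussian_kernel(fwhm_px):
    """1D normalised Gaussian kernel; sigma = FWHM / (2 sqrt(2 ln 2))."""
    sigma = fwhm_px / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = int(math.ceil(3 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_rows(img, kernel):
    """Convolve each row with the kernel, clamping indices at the edges."""
    half = len(kernel) // 2
    n = len(img[0])
    return [[sum(w * row[min(max(x + k - half, 0), n - 1)]
                 for k, w in enumerate(kernel))
             for x in range(n)] for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def smooth(img, fwhm_px=5.0):
    """Separable Gaussian smoothing (rows first, then columns)."""
    kernel = gaussian_kernel(fwhm_px)
    return transpose(smooth_rows(transpose(smooth_rows(img, kernel)), kernel))

def threshold(img, level=70.0):
    """Mark pixels whose absolute field strength exceeds the level (in G)."""
    return [[abs(v) > level for v in row] for row in img]

def dilate(mask, radius):
    """Binary dilation with a square neighbourhood of half-width `radius`."""
    ny, nx = len(mask), len(mask[0])
    out = [[False] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(ny, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(nx, x + radius + 1)):
                        out[yy][xx] = True
    return out

def smart_like_mask(mag_prev, mag_curr, fwhm_px=5.0, level=70.0, radius=10):
    """Keep only pixels detected in both magnetograms, then dilate.
    Assumption: pixel-wise overlap stands in for the feature-level check."""
    m_prev = threshold(smooth(mag_prev, fwhm_px), level)
    m_curr = threshold(smooth(mag_curr, fwhm_px), level)
    both = [[a and b for a, b in zip(r1, r2)] for r1, r2 in zip(m_prev, m_curr)]
    return dilate(both, radius)
```

In this sketch a patch of strong field that appears in only one of the two magnetograms is discarded before dilation, which mirrors the intent of removing transient detections.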

[Figure 2 image: two panels, "NOAA 10365 28-May-2003" (left) and "Ising Energy Map" (right); axes X (arcsecs) and Y (arcsecs).]
Figure 2. Left: NOAA 10365 highlighting the PSL (white contour), a linear fit to the locus
of PSL positions (dotted red line), bipolar connection line (solid red line), and heliographic
longitude and latitude reference lines (dotted black lines). Right: An Ising energy map of the
same region. Red represents the magnitude of energy for each pixel from highest (light) to
lowest (dark). Since the connection between pixels of opposite polarity is being represented,
the energy map is only shown for one of the polarities.
Several new physical-property modules have been added to SMART for this
study. The tilt of an AR is obtained by measuring the angle between the line
connecting the centroids of the largest flux-weighted positive and negative blobs
(solid red line in left panel of Figure 2) and the heliographic-latitude line passing
through the centroid of the AR. The length of this bipolar connecting line [BCL]
is also determined and provides a measure of the relative compactness of
an AR when compared to the evolution of AR area (left panel of Figure 2).
Additionally, the angle [α] detected between the best-fit line to the locus of
pixels forming the main polarity separation line [PSL] and the BCL is measured.
The main PSL is defined as the interface between the aforementioned largest
flux-weighted positive and negative blobs. The temporal derivative of the angle
α is shown to be a useful proxy for the occurrence of helicity injection in an
AR (Morita and McIntosh, 2005), which may be an important flare predictor
(LaBonte, Georgoulis, and Rust, 2007b). The evolution of this angle is studied
in Section 4.3. These properties are less informative when studying AR com-
plexes or non-bipolar ARs, since often no main axes can be discerned, making a
description of the AR orientation impossible.
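
To make the geometry of these properties concrete, the sketch below computes flux-weighted centroids, the BCL length, a tilt angle, and the angle between a fitted PSL line and the BCL. It is a hedged illustration only: the function names are hypothetical, heliographic coordinates are treated as a flat plane, angles are folded into [0, 90] degrees (the sign convention of SMART is not stated here), and the "best-fit line" is taken to be the principal axis of the PSL pixel positions, which is an assumption.

```python
import math

def flux_weighted_centroid(pixels):
    """pixels: list of (lon_deg, lat_deg, flux); weights are |flux|."""
    total = sum(abs(f) for _, _, f in pixels)
    lon = sum(x * abs(f) for x, _, f in pixels) / total
    lat = sum(y * abs(f) for _, y, f in pixels) / total
    return lon, lat

def bcl_length_and_tilt(pos_blob, neg_blob):
    """Length (deg) of the bipolar connecting line and its angle (deg)
    to a line of constant heliographic latitude, folded into [0, 90]."""
    x1, y1 = flux_weighted_centroid(pos_blob)
    x2, y2 = flux_weighted_centroid(neg_blob)
    dx, dy = x2 - x1, y2 - y1
    return math.hypot(dx, dy), math.degrees(math.atan2(abs(dy), abs(dx)))

def line_orientation(points):
    """Orientation (deg, in [0, 180)) of the principal axis of a point set.
    Assumption: this stands in for the best-fit line to the PSL pixels."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy)) % 180.0

def angle_between_lines(theta1_deg, theta2_deg):
    """Smallest angle (deg) between two undirected lines."""
    d = abs(theta1_deg - theta2_deg) % 180.0
    return min(d, 180.0 - d)
```

Using the principal axis rather than an ordinary least-squares fit avoids the singular case of a nearly vertical PSL; a time series of the resulting angle could then be differenced to approximate its temporal derivative.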
The original Ising model is used for the analysis of magnetic interactions and
structures of ferromagnetic substances. Here, as by Ahmed et al. (2010), Ising
energy is used as a proxy for magnetic connectivity and complexity within an AR
(right panel of Figure 2). We use a modified form of that given by Ahmed et al.
(2010).
This version is calculated using
\[ \sum_{i,j} \frac{B_i B_j}{D_{ij}}, \qquad (1) \]
where $B_i$ ($B_j$) are pixel values of positive (negative) line-of-sight magnetic field
and $D_{ij}$ is the spatial distance between pixels $i$ and $j$. The Ising energy increases
as the negative and positive magnetic footprints within an active region become
more entangled. The evolution of this property is studied in Section 4.3.

Figure 3. An example set of ASAP detections from 10 June 2003.
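
The double sum of Equation (1) translates directly into a loop over all pairs of positive and negative pixels. The sketch below is an illustration under stated assumptions rather than the SMART module: the function name and the (x, y, B) pixel representation are hypothetical, and distances are in pixel units. As written, each term is negative because the two field values have opposite signs, so it is the magnitude of the sum that grows as opposite polarities lie closer together.

```python
import math

def ising_sum(pos_pixels, neg_pixels):
    """Sum of B_i * B_j / D_ij over all positive/negative pixel pairs.
    Each pixel is a tuple (x, y, B), with B > 0 in pos_pixels and
    B < 0 in neg_pixels; D_ij is the Euclidean pixel separation."""
    total = 0.0
    for xi, yi, bi in pos_pixels:
        for xj, yj, bj in neg_pixels:
            total += bi * bj / math.hypot(xi - xj, yi - yj)
    return total
```

The cost scales with the product of the numbers of positive and negative pixels, so for large regions a practical implementation would probably need vectorisation or spatial binning.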
The modules added to SMART for this work will be used for future large-scale
active region studies and will be added to the pipeline versions of SMART run-
ning at the Heliophysics Events Knowledgebase (http://www.lmsal.com/hek/index.html:
Hurlburt et al., 2010) and included in the Heliophysics Integrated Observatory
(http://www.helio-vo.eu/index.php: Bentley et al., 2009). In the future, other prop-
erty modules will be added to calculate a physically motivated magnetic-connectivity
measurement (Georgoulis and Rust, 2007), multi-scale energy spectrum slope
(Hewett et al., 2008), and multi-fractal spectrum properties (Conlon et al., 2008).
3.2. The ASAP algorithm
Automated Solar Activity Prediction (ASAP) is the collective name for a set of
algorithms used to process solar images. It is composed of algorithms for sunspot,
faculae, and active-region detections (Colak and Qahwaji, 2008) and solar-flare
prediction (Colak and Qahwaji, 2009). Unlike other algorithms described in this
article, ASAP uses quick-look images (in GIF or JPEG format) for its processes.
In this article a recently developed sunspot detection algorithm for ASAP is
used. This new sunspot-detection algorithm works with continuum images and
is described in detail by Colak et al. (2011). The main steps in this algorithm
can be summarized as follows:

• Images are pre-processed to detect the solar disk, and to remove limb
darkening.
• Detected solar-disk data are converted from heliocentric coordinates to
Carrington heliographic coordinates.
– The key point in heliographic conversion is to choose the size of the
resulting image. If a very small image size is selected, this will cause
truncation and loss of data. If a very large one is chosen, there will be
many spaces in the resulting image. In this study, each heliographic
degree is represented by ten pixels; therefore, the resulting heliographic
images are 3600 by 1800 pixels.
– Initially, an empty 3600 by 1800 image is created. When the Car-
rington longitude and latitude of all of the pixels on the solar disk
are calculated, their pixel intensities are placed in corresponding lo-
cations.
– The distribution of pixels representing degrees in heliocentric coordi-
nates is not uniform due to the spherical shape of the Sun. Towards
the limb of the Sun on a two-dimensional heliocentric image, each
degree is represented by fewer pixels, whereas in a heliographic
image each degree is represented by the same number of pixels. The
heliographic image is larger than the heliocentric image, and therefore
there will be gaps (empty pixels that are not filled with data from
the heliocentric image) in the initial heliographic image after
conversion.
– The following algorithm is applied to estimate the pixel intensities of
the gaps in the heliographic image. Every pixel on the heliographic
image is examined and when a pixel without any information is found,
its neighbouring pixels are searched using variable-size windows. First
a 3 × 3 pixel window is centred on the empty pixel and the values of the
non-empty neighbouring pixels within this window are added. If all
the neighbouring pixels within the initial window were empty, the size
of the window is increased by one and the process continued until at
least one pixel with information is available within the window. Then
the average value of the valid pixels found within the final window is
assigned to the empty pixel. The algorithm continues until all of the
pixels have been processed.
– After all data gaps have been filled, a smoothing algorithm using a
3 × 3 linear uniform filter is applied to create the final heliographic
image.
• Subsequently, the following filter is applied to the Carrington heliographic
image for detecting sunspots:
– An intensity filtering threshold value T = µ − (σ × α) is calculated
where µ is the mean, σ is the standard deviation of the image, and α
is a constant equal to 2.5.
– The intensity of each pixel in the image is compared to this T value.
If it is less than the calculated threshold value, the pixel under con-
sideration is marked as a sunspot.
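
The gap-filling and thresholding steps listed above can be sketched as follows. This is a minimal illustration, not the ASAP code: the function names are hypothetical, empty pixels are represented by None, the window is grown by one pixel on each side per iteration (one reading of "increased by one"), only original data values are averaged, and the population standard deviation over all supplied pixels is assumed for σ.

```python
import statistics

def fill_gaps(img):
    """Fill None pixels with the mean of valid pixels in a growing window."""
    ny, nx = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(ny):
        for x in range(nx):
            if img[y][x] is not None:
                continue
            half = 1  # half-width 1 corresponds to a 3 x 3 window
            while True:
                vals = [img[yy][xx]
                        for yy in range(max(0, y - half), min(ny, y + half + 1))
                        for xx in range(max(0, x - half), min(nx, x + half + 1))
                        if img[yy][xx] is not None]
                if vals:
                    out[y][x] = sum(vals) / len(vals)
                    break
                if half > max(ny, nx):
                    break  # no valid pixel anywhere in the image
                half += 1
    return out

def sunspot_mask(img, alpha=2.5):
    """Flag pixels darker than T = mu - alpha * sigma; returns (mask, T)."""
    pixels = [v for row in img for v in row]
    mu = statistics.mean(pixels)
    sigma = statistics.pstdev(pixels)
    t = mu - alpha * sigma
    return [[v < t for v in row] for row in img], t
```

A typical use would be `mask, t = sunspot_mask(fill_gaps(heliographic_image))`, where `heliographic_image` is a hypothetical nested list produced by the coordinate conversion; the 3 × 3 smoothing step is omitted here for brevity.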

Figure 4. An example set of STARA detections from 10 June 2003 (image time stamp: 10-Jun-2003 06:24:00).
Although heliographic conversion can be computationally expensive, it yields
detections that are more accurate than those made in heliocentric
coordinates. A tracking algorithm was added to ASAP for this study, finding the
intersections between objects (e.g. sunspots) on two consecutive heliographic
images. Since the differential rotation is very small in Carrington heliographic
coordinates, there is no need for longitudinal corrections. ASAP tends to detect
small sunspots, which can be classified as pores. This is useful especially when
grouping and classifying sunspots. However, the number of tracked sunspots
increases because most pores are only visible for a few hours on the solar disk.
An example set of ASAP detections is shown in Figure 3.
It was not necessary to update ASAP for this work, but some computational
issues have been discovered. For instance, the application of ASAP to SDO/HMI
images larger than 1024 × 1024 pixels shows that heliographic conversion algo-
rithms must be made more efficient to tackle larger images.
3.3. The STARA algorithm
The Sunspot Tracking And Recognition Algorithm (STARA) code was written
in 2008 in order to perform consistent long-term observations of sunspots over
Solar Cycle 23 (Watson et al., 2009). It was originally developed for use with

MDI data but has since been extended for use with data from SDO as well as
from a number of ground-based instruments. A simple detection method was
required to speed up processing when large data sets were used and suitable
techniques were found in the field of morphological image processing.
The STARA detection method works as follows: The image is read in and
inverted so that the sunspots appear as bright areas on a dark background
for compatibility with the top-hat transform. Morphological erosion is applied,
which removes peaks and works by treating the 2D image as a 3D surface with
the pixel values indicating the height of the surface at that point. A probe known
as a structuring element is then chosen (in this case, a sphere with a radius of
14 MDI pixels) and is “rolled” underneath the surface whilst always touching
it. The centre of the sphere then maps out a new surface that is close to 14
units below the original. However, any sharp peaks present (such as sunspots)
would not allow the sphere to fit inside them and so are not represented in the
eroded image. A morphological dilation is then performed which is identical to
an erosion apart from the sphere being rolled on top of the surface. The dilated
surface is subtracted from the original to leave only the sunspot peaks present.
As the sphere rolls over the surface it also carries the limb-darkening profile
through each step and when the final surface is subtracted from the original, the
limb-darkening effects are automatically removed.
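
The erosion, dilation, and subtraction sequence described above is a top-hat transform, and a small version of it can be written directly. The sketch below is an illustration only and not the STARA code: the function names are hypothetical, and a flat square structuring element replaces the rolling sphere of radius 14 MDI pixels, so the removal of a smooth background such as limb darkening is only approximated.

```python
def _window_op(img, half, op):
    """Apply op (min or max) over a square window of half-width `half`."""
    ny, nx = len(img), len(img[0])
    return [[op(img[yy][xx]
                for yy in range(max(0, y - half), min(ny, y + half + 1))
                for xx in range(max(0, x - half), min(nx, x + half + 1)))
             for x in range(nx)] for y in range(ny)]

def erode(img, half):
    """Grey-scale erosion with a flat square element (local minimum)."""
    return _window_op(img, half, min)

def dilate(img, half):
    """Grey-scale dilation with a flat square element (local maximum)."""
    return _window_op(img, half, max)

def top_hat_sunspots(intensity, half=2):
    """Invert the image so spots are bright, open it (erode then dilate),
    and subtract the opened surface from the inverted image."""
    peak = max(max(row) for row in intensity)
    inverted = [[peak - v for v in row] for row in intensity]
    opened = dilate(erode(inverted, half), half)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(inverted, opened)]
```

Features narrower than the structuring element survive the subtraction, while slowly varying background is largely removed; a threshold and the size filter would then be applied to the result.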
More detail on this step and the top-hat transform is given by Watson et al.
(2009) and Dougherty and Lotufo (2003). A size filter is then applied that re-
moves areas containing fewer than ten pixels, as they are far more likely to be pores
than sunspots. The remaining areas are recorded along with their locations as
well as the number of umbral regions detected within the sunspot boundary. This
is repeated for a number of consecutive images and using the solar-rotation model
of Howard, Harvey, and Forgach (1990) the sunspots can be tracked throughout
a sequential data set. This allows the evolution of individual sunspots to be
followed as well as the overall properties of the sunspot population as a whole.
An example of a typical set of STARA sunspot detections is given in Figure 4.
STARA has undergone very few changes over the course of this work as
the code was well established beforehand. Nevertheless, some subtle problems
have been discovered in the process. As sunspots approach the limb (at lon-
gitudes greater than 75◦ ) the sunspot position returned by the code quickly
loses accuracy. This is a common problem with feature-detection methods as the
geometrical foreshortening effects test the limits of automated systems. There
are also potential problems present with bad data. Obviously the best remedy
is to remove it altogether but with MDI it is possible to have images with only
half of the solar disk present or with large artifacts. Both of these situations can
have substantial effects on the detected global properties and cause problems
with analysis.
3.4. The SPoCA algorithm
The Spatial Possibilistic Clustering Algorithm (SPoCA) is a multi-channel fuzzy
clustering algorithm that automatically segments solar EUV images into a set
of features; see Barra et al. (2009) for a complete presentation. It optimally sep-
arates active regions, quiet Sun, and coronal holes, even though the boundaries
of these regions are not always well defined. The description of the segmentation
process in terms of fuzzy logic was motivated by the fact that the information
provided by a solar EUV image is noisy (corruption by Poisson and readout
noise as well as by cosmic-ray hits) and subject to both observational biases (line-
of-sight integration of a transparent volume) and interpretation (the apparent
boundary between regions is a matter of convention).

Figure 5. An example set of SPoCA detections from 10 June 2003.
SPoCA takes as input an image in one (or several) EUV passband(s) and
uses as “feature vector” the pixel value (or the pixel value vector in the multi-
channel case) in order to classify a pixel as belonging to one of three classes,
namely AR, QS, and CH. SPoCA is based on a fuzzy clustering technique called
“Possibilistic C-Means” (PCM: Krishnapuram and Keller, 1993, 1996). For each
class, it assigns a “probability” or membership value ∈ [0, 1] to every feature
vector.
PCM is an iterative method that searches for three compact clusters in the
space of feature vectors, corresponding to AR, QS and CH. In practice, this is
achieved through a gradient-descent scheme that minimizes an objective function
that is related to the total intracluster variance plus some penalty term. In every
iteration, new membership values are calculated based on the class centre values.

The membership values are used in turn to compute the new class centres, and
so on, until the class centres converge to within a preset accuracy.
In order to cope successfully with intensity outlier pixels such as those affected
by cosmic rays and proton storms, a spatial regularization term was added to
the PCM objective function, forcing membership values in a neighbourhood to
be as close as possible. By assigning each pixel to the class for which its feature
vector has the largest membership value, the image is segmented. An example
set of SPoCA detections is shown in Figure 5.
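
The alternating update of membership values and class centres, followed by assignment to the class of largest membership, can be illustrated with a plain Possibilistic C-Means loop. The sketch below is not SPoCA: it handles a single channel of scalar feature values, it omits the spatial regularization term entirely, the function names are hypothetical, and the membership formula and the per-class scale parameters `eta` follow the standard form of Krishnapuram and Keller (1993) with values chosen by hand.

```python
def pcm_memberships(x, centres, eta, m=2.0):
    """Possibilistic membership of scalar feature x in each class."""
    return [1.0 / (1.0 + ((x - c) ** 2 / e) ** (1.0 / (m - 1.0)))
            for c, e in zip(centres, eta)]

def pcm_iterate(data, centres, eta, m=2.0, tol=1e-6, max_iter=200):
    """Alternate membership and centre updates until centres stop moving."""
    centres = list(centres)
    for _ in range(max_iter):
        u = [pcm_memberships(x, centres, eta, m) for x in data]
        new = []
        for i in range(len(centres)):
            w = [row[i] ** m for row in u]
            new.append(sum(wi * x for wi, x in zip(w, data)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(new, centres))
        centres = new
        if shift < tol:
            break
    return centres

def segment(data, centres, eta, m=2.0):
    """Assign each feature value to the class with the largest membership."""
    labels = []
    for x in data:
        u = pcm_memberships(x, centres, eta, m)
        labels.append(u.index(max(u)))
    return labels
```

With three initial centres ordered from faint to bright, the returned labels 0, 1, and 2 would correspond to CH, QS, and AR in this toy setting; unlike probabilistic fuzzy clustering, the memberships of one pixel are not constrained to sum to one.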
Since the solar corona is optically thin, and since the intensity in EUV
images is obtained through an integration along the line of sight, there is a
limb-brightening effect in those images, which may hinder the segmentation pro-
cess. Therefore, the EUV images are pre-processed so as to lower the enhanced
brightness near the limb. The initial SPoCA class contours are automatically
postprocessed using a morphological opening with a circular isotropic element
of size unity.
Since the publication of Barra et al. (2009), the SPoCA algorithm has been opti-
mized and extended in several ways:
• In order to gain more consistent results, we introduced some constraints on
the penalty term of the objective function to be minimized.
• The limb correction is now applied in a continuously increasing way towards
the limb instead of introducing it abruptly from some point onwards.
• For individual AR detection, first the Bright Points are removed (size
threshold is 1500 square arcseconds) and then a spherical dilation (ra-
dius: 12 EIT pixels) is employed to group the remaining bright blobs into
individual active regions.
• Individual ARs are tracked through time by comparing the masks of regions
in two consecutive time frames, taking into account differential rotation.
SPoCA has been running in near-realtime on AIA data since September 2010
as part of the SDO Feature Finding Project (Martens et al., 2011), a suite of
software pipeline modules for automated feature recognition and analysis for the
imagery from SDO. The resulting AR events are automatically ingested by the
Heliophysics Events Knowledgebase (Hurlburt et al., 2010).
SPoCA is the only algorithm presented here that detects ARs in the solar
corona. The method is generic enough to allow the introduction of other channels
or data. It has been applied to SOHO/EIT, SDO/AIA, PROBA2/SWAP, and
STEREO/EUVI images, and could potentially be used on other multi-channel
maps such as Differential Emission maps. In this article we focus on ARs, but
QS and CH can also be detected and tracked.
3.5. Association of Detected Features
The SMART tracking module, called “Multiple Disk Passage” (MuDPie: Higgins et al.,
2011), is used to associate individual SMART detections of the same physical
feature over time by comparing the centroids of all detections in consecutive
magnetograms. Two detections are associated if their centroids match within
5◦ heliographic latitude and longitude. The tracked SMART detections are then
associated with the best-matched detections in each of the other algorithms as
described in the following paragraphs.
In order to analyse the relation between the features detected by different
algorithms, a routine developed in Python associates detections from each algo-
rithm in two ways. First, outputs from ASAP, STARA, and SPoCA are associated
with SMART outputs based on time and location information. Second, individual
association outputs (ASAP vs. SMART, STARA vs. SMART, SPoCA vs. SMART)
from the first step are combined using SMART IDs and timing information.
For associations, SMART is chosen as the base algorithm because SMART
detections usually encircle the corresponding ASAP and STARA detections and
they are also more stable over time than SPoCA detections, due to the frequent
splitting and merging of coronal ARs detected by SPoCA. Also, SMART detects
magnetic regions from MDI magnetograms, which are more frequently available than
the continuum and EIT images that the other algorithms are working on. The
association rules are described below.
First Step: Individual associations (ASAP, STARA, SPoCA vs. SMART)
• The time difference between the solar detections under consideration (i.e.,
sunspots from ASAP and STARA, active regions from SPoCA versus mag-
netic regions from SMART) is calculated.
• If the time difference between a magnetic region detected by SMART and
a solar region detected by another algorithm is less than 0.2 Julian days
and their heliographic bounding boxes intersect, then these detections are
associated. Since SPoCA does not deliver heliographic bounding boxes, a
bounding box of 5◦ in longitude and latitude is assumed.
• If the same solar detection is associated to more than one SMART region,
only the closest (in terms of time and distance between centres) SMART
region is selected as associated.
• Associations are saved in separate files (three files; ASAP vs. SMART,
STARA vs. SMART, SPoCA vs. SMART) including the selected characteristics
from each algorithm output that are going to be analysed.
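The first-step rules above can be sketched in Python as follows. This is a minimal illustration rather than the routine used in this study: the `Detection` record, the function names, the reading of the 5° SPoCA box as a full width centred on the detection, and the way time and centre distance are ranked when choosing the closest SMART region are all assumptions.

```python
import math
from collections import namedtuple

MAX_DT_DAYS = 0.2     # time threshold quoted in the text (Julian days)
SPOCA_BOX_DEG = 5.0   # assumed full width of the box placed around a SPoCA centre

# Hypothetical record: box = (lon_min, lon_max, lat_min, lat_max) in degrees, or None.
Detection = namedtuple("Detection", "ident jd lon lat box", defaults=(None,))

def bounding_box(det):
    """Return the stored box, or a box centred on the detection when none is given."""
    if det.box is not None:
        return det.box
    half = SPOCA_BOX_DEG / 2.0
    return (det.lon - half, det.lon + half, det.lat - half, det.lat + half)

def boxes_intersect(a, b):
    """True when two (lon_min, lon_max, lat_min, lat_max) boxes overlap."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]

def associate(det, smart_regions):
    """Return the SMART region matched to `det`, or None if no region qualifies."""
    box = bounding_box(det)
    candidates = [
        s for s in smart_regions
        if abs(s.jd - det.jd) < MAX_DT_DAYS and boxes_intersect(box, bounding_box(s))
    ]
    if not candidates:
        return None
    # Assumed ranking: smallest time difference first, then smallest centre separation.
    return min(
        candidates,
        key=lambda s: (abs(s.jd - det.jd),
                       math.hypot(s.lon - det.lon, s.lat - det.lat)),
    )
```

A detection that overlaps two SMART regions is assigned to the one closest in time, and a detection without a bounding box (as for SPoCA) falls back to the assumed 5° box.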
Second Step: Combining all of the associations
• The SMART algorithm uses an ID for each magnetic region detected and
in this second step, the association data saved in the three separate files
from the first step are combined using this ID and time information. The
association data with same SMART ID and closest timing are combined
together. Timing information still has to be used due to the difference
between the image times.
• The final combined data are saved in one file.
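The second step can be sketched in the same spirit. The record layout (dictionaries with hypothetical `smart_id` and `jd` keys, where `jd` is the time of the associated SMART detection) and the 0.2-day tolerance are assumptions made for illustration; the text only states that records with the same SMART ID and the closest timing are combined.

```python
MAX_DT_DAYS = 0.2  # assumed tolerance; the text only says "closest timing"

def nearest_in_time(records, smart_id, jd, max_dt=MAX_DT_DAYS):
    """Return the record with the same SMART ID that is closest in time, or None."""
    same_region = [
        r for r in records
        if r["smart_id"] == smart_id and abs(r["jd"] - jd) <= max_dt
    ]
    return min(same_region, key=lambda r: abs(r["jd"] - jd), default=None)

def combine(asap, stara, spoca):
    """Keep only SMART detections that have a partner in all three association lists."""
    combined = []
    for a in asap:
        s = nearest_in_time(stara, a["smart_id"], a["jd"])
        p = nearest_in_time(spoca, a["smart_id"], a["jd"])
        if s is not None and p is not None:
            combined.append({"smart_id": a["smart_id"], "jd": a["jd"],
                             "asap": a, "stara": s, "spoca": p})
    return combined
```

Requiring a partner in every list is one way to arrive at a combined set that is much smaller than any individual association list, as reported below.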
SMART provided 9356 detections (207 magnetic region features), ASAP 3039
detections (952 sunspot features), STARA 1329 detections (433 sunspot features)
and SPoCA 1222 detections (190 coronal active-region features) within the
considered time-frame. In the first step, 714 SMART detections were associated to
2889 ASAP detections, 550 SMART detections were associated to 1315 STARA
detections and 1089 SMART detections were associated to 1117 SPoCA detec-
tions. In the second step when all of these data were combined, 350 detection (33
feature) associations were created for SMART, ASAP, STARA and SPoCA. The
daily averages of some of the outputs such as average daily sunspot numbers,
active region numbers and average areas are compared to the NOAA active
region catalogue in Section 4.1. In the considered period, NOAA recorded 217
detections (37 features).
In case of merging or splitting of neighbouring coronal regions as detected
by SPoCA, the association procedure described above does not relate the new
SPoCA detection to the corresponding SMART detection. This happened several
times in the case studies in Section 4.3. For these cases, we applied a manual
association of SPoCA detections to SMART detections.
4. Results
The feature detections from each algorithm are compared in the following sec-
tions. First, in Section 4.1 the overall detection performance of the algorithms is
presented, and compared to the corresponding NOAA detections and the daily
international sunspot number. Next, Principal Component Analysis is performed
on the full set of detections to probe the overall structure of the physical prop-
erties calculated by the algorithms in Section 4.2. Finally, in Section 4.3 the
evolution and flare activity of NOAA 10377 and 10365 are analysed in depth,
using physical properties determined by each algorithm.
4.1. Algorithm Performance
The performance of the algorithms is measured by comparing the daily total and
average values of some of the solar feature properties to each other, to values
reported by NOAA (http://www.swpc.noaa.gov/ftpdir/forecasts/SRS/README),
and to the international daily sunspot numbers (SIDC-Team, 2003) between 12
May and 23 June 2003.
A comparison of these data is provided in Figure 6. The graph on the up-
per left side of Figure 6 compares the daily number of sunspots detected by
ASAP and STARA to the total number of spots within NOAA regions and to
the international sunspot number. Generally, peaks and valleys in all of the
series follow each other but the international sunspot numbers and the sunspot
numbers for NOAA are usually higher than the sunspot numbers for ASAP and
STARA. When sunspots are detected manually, each umbra within a penumbra is
counted as one sunspot, whereas the automated algorithms discussed here count
each penumbra as one sunspot although it could have more than one umbra
within. Therefore the difference in sunspot numbers increases when the number
of complex sunspot regions increases. Also, the number of sunspots detected by
ASAP is always higher than the ones detected by STARA. This is because ASAP
tends to detect very small sunspots (sometimes pores) while STARA has a higher
threshold for the size of sunspot candidates.

[Figure 6: time-series panels with time axes labelled from 15-May to 24-Jun. Upper left: # of Spots versus Time (Int. Sunspot Num., NOAA, ASAP, STARA). Upper right: # of Regions versus Time (NOAA, SMART, SPOCA). Lower left: Area [Millionths of Solar Hemisphere] versus Time (NOAA, ASAP, STARA, Int. Sunspot Num.).]
Figure 6. Comparison of average detection results of algorithms to reported NOAA and in-
ternational sunspot numbers. Upper-left: Comparison of number of sunspots detected by ASAP
and STARA and reported by NOAA and recorded international sunspot numbers. Upper-right:
Number of regions detected by SMART and SPoCA compared with the ones reported by NOAA.
Lower-left: Comparison of average daily sunspot areas detected by ASAP and STARA versus
NOAA. The normalised international sunspot number is over-plotted for context. Lower-right:
Comparison of average daily region areas detected by SMART and SPoCA.
The graph on the upper right side of Figure 6 compares the daily number of
regions detected by SMART and SPoCA to the daily number of NOAA regions.
SMART and SPoCA detect more regions than NOAA because the NOAA number
is given to a region only if it has one or more sunspots, while SMART and SPoCA
regions do not depend on the existence of sunspots within detected boundaries.
Because of the projection effects of large coronal loops, two close but distinct
regions in the photosphere will often be detected by SPoCA as one region. This
explains why SMART has a higher tally of daily regions than SPoCA. This effect
is most visible near the solar limb.
A comparison of the areas of ARs and sunspots as detected by the four algo-
rithms and NOAA is presented in the lower part of Figure 6. SMART, ASAP, and
STARA areas were corrected for the line-of-sight projection effect that decreases
the observed area as the feature moves away from the central meridian. Since the
line-of-sight projected area of coronal loops does not necessarily decrease with
longitude, no systematic effect is expected for the observed SPoCA area, so the
raw area is presented. Sunspot areas are given in millionths of solar hemisphere
to be consistent with the units of the NOAA catalogue.
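The two area conventions mentioned above can be illustrated with a short sketch. The exact correction applied by each algorithm is not given here, so the simple division by the cosine of the angular distance from disk centre (with the solar B0 angle taken as zero) is an assumption, as are the function names. The solar radius is the IAU nominal value of 695.7 Mm.

```python
import math

R_SUN_MM = 695.7  # nominal solar radius in Mm

def deprojected_area(observed_area, lat_deg, lon_deg):
    """Correct a projected area for foreshortening.

    lon_deg is measured from the central meridian; B0 = 0 is assumed, so the
    cosine of the angle from disk centre is cos(lat) * cos(lon).
    """
    cos_theta = math.cos(math.radians(lat_deg)) * math.cos(math.radians(lon_deg))
    return observed_area / cos_theta

def mm2_to_msh(area_mm2):
    """Convert an area in Mm^2 to millionths of the solar hemisphere."""
    hemisphere_mm2 = 2.0 * math.pi * R_SUN_MM ** 2
    return 1.0e6 * area_mm2 / hemisphere_mm2
```

With these assumptions a feature on the equator at 60° longitude has its observed area doubled, and one millionth of the solar hemisphere corresponds to roughly 3.04 Mm².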
The graph on the lower left side of Figure 6 compares the sunspot areas
detected daily by ASAP and STARA to the NOAA sunspot areas, while the
international sunspot number is added for context. These three time series agree
well but there appears to be a one-day shift in NOAA sunspot areas. The ASAP
and STARA sunspot measurements are averages of observations throughout the
whole day, whereas, depending on the day, the NOAA sunspot observation may
be quite early in the day. Since sunspots emerge quickly and decay slowly, any
sunspots that emerge late in a day are likely to be missed by NOAA, but
registered by ASAP and STARA. The following day, the new sunspots are likely
to remain visible, so ASAP and STARA are likely to show the same area as the
previous day, while the NOAA area will increase.
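The difference between a daily average and a single early observation can be made concrete with a small sketch. The input format (tuples of day, image time, and sunspot area) and the function name are hypothetical, and the averaging scheme (total area per image, then the mean over the images of a day) is an assumption about how such daily values could be formed.

```python
from collections import defaultdict
from statistics import mean

def daily_mean_total_area(detections):
    """Average, per day, the total sunspot area found in each image of that day."""
    per_image = defaultdict(float)
    for day, image_time, area in detections:
        per_image[(day, image_time)] += area
    per_day = defaultdict(list)
    for (day, _image_time), total in per_image.items():
        per_day[day].append(total)
    return {day: mean(totals) for day, totals in per_day.items()}

# Invented numbers: a spot of area 100 emerges after the first image of day 1.
observations = [
    ("day1", "0000", 150.0),
    ("day1", "1200", 150.0), ("day1", "1200", 100.0),
    ("day2", "0000", 150.0), ("day2", "0000", 100.0),
]
daily = daily_mean_total_area(observations)
```

In this invented example the daily average for day 1 already includes part of the new spot, whereas a single observation at the start of day 1 would only register it on day 2.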
The graph on the lower right side of Figure 6 shows the comparison between
active region areas detected daily by SMART and by SPoCA. Considering we are
dealing here with photospheric versus coronal areas, a good agreement between
the areas is obtained. Both SMART and SPoCA areas vary smoothly. Moreover,
they are large enough to include the whole sunspot group if present, and to
measure changes in topology or complexity consistently. In summary, the time
series of sunspot and AR areas are well correlated, showing similar behaviour
in time, and the differences observed are likely due to the data and detection
methods used.
4.2. Principal Component Analysis
Principal Component Analysis (PCA: Jolliffe, 2002) aims at reducing the
dimensionality of a problem while retaining as much as possible of the structure
of the data in the principal component space. More precisely, for a data set
containing n observations of p variables, the principal components are the
orthogonal directions in p-dimensional variable space along which the data set
exhibits maximal variance.
In this article, PCA (based on linear values) is used to get some insight into the
correlation structure of the following p variables: the Schrijver R value (Schrijver,
2007), length of the strong gradient line, magnetic flux, maximum B field, area,
length of the bipole connecting line, Ising energy, and Ising energy per pixel (Ising
E ppx) as computed by the SMART algorithm, the sunspot area and number of
sunspots as given by ASAP, and the raw AR area, maximum, variance, kurtosis
and skewness of the EUV intensity as computed by SPoCA.
These variables were computed on data recorded 12 May – 23 June 2003, at
a cadence of 96 minutes for photospheric features, and of six hours for coronal
features. Data from the various algorithms were then associated as described in
Section 3.5.
We excluded data points corresponding to regions whose centre was more
than 60◦ from the central meridian, as projection errors involved become too
large. Table 1 lists the percentage and cumulative percentage of the variance
explained by the principal components. The first two components explain 67%
of the total variability in the data set.
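As a check on these figures, the cumulative percentages can be recomputed from the per-component percentages listed in Table 1. The sketch below only re-adds the published, rounded values; the helper name `components_needed` is introduced here for illustration.

```python
from itertools import accumulate

# Percentage of variance per principal component, as listed in Table 1.
VARIANCE_PCT = [52.86, 14.61, 10.30, 6.55, 5.17, 3.61, 2.82, 1.61,
                0.93, 0.44, 0.39, 0.29, 0.22, 0.12, 0.07]

cumulative = list(accumulate(VARIANCE_PCT))

def components_needed(threshold_pct):
    """Smallest number of leading components whose summed variance reaches the threshold.

    Returns None if the threshold is never reached; the rounded percentages
    add up to slightly less than 100.
    """
    for k, total in enumerate(cumulative, start=1):
        if total >= threshold_pct:
            return k
    return None
```

With these values the first two components reach 67.47%, and six components are needed to exceed 90% of the variance.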
Figure 7 represents the variables in the plane of the first two components.
Each variable lies within a circle of radius one in this figure. Variables that
lie close to the circle are well represented by the first two components, while
variables close to the origin are not. For two variables that both lie close to
the circle, the cosine of the angle subtended at the origin by their points in
Figure 7 approximates the correlation between them. This figure thus yields a
graphical representation of the correlation structure between variables.
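The geometric reading of such a plot can be sketched in a few lines. The coordinates used in the example are invented and are not read from Figure 7; the function names are likewise hypothetical.

```python
import math

def representation_quality(point):
    """Distance from the origin; values near 1 mean the variable is well represented."""
    return math.hypot(point[0], point[1])

def approx_correlation(p, q):
    """Cosine of the angle at the origin between two variable points.

    This approximates the correlation only when both points lie close to the
    unit circle.
    """
    dot = p[0] * q[0] + p[1] * q[1]
    return dot / (math.hypot(p[0], p[1]) * math.hypot(q[0], q[1]))

# Invented coordinates (correlation with PC1, correlation with PC2).
var_a = (0.9, 0.3)
var_b = (0.6, 0.2)   # same direction as var_a, but closer to the origin
var_c = (0.0, 0.5)
```

Here `var_a` and `var_b` point in the same direction, so the cosine equals one, but `var_b` lies further from the circle and the value should be trusted less.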

Table 1. Percentage and cumulative percentage of the variance of the
15-dimensional variable space described above, that can be explained by the
consecutive principal components. Note that the first two principal components
comprise 67% of the variance.
Component   % variance   Cumulative % variance
1           52.86        52.86
2           14.61        67.47
3           10.30        77.77
4            6.55        84.32
5            5.17        89.49
6            3.61        93.11
7            2.82        95.93
8            1.61        97.54
9            0.93        98.47
10           0.44        98.92
11           0.39        99.30
12           0.29        99.59
13           0.22        99.81
14           0.12        99.93
15           0.07       100.00
The first observation is that no variable is strongly anti-correlated with
another in this data set. Roughly speaking, all variables evolve in the same
direction. A more detailed inspection shows that the Schrijver R value, the length
of the strong gradient line [Lsg], and the Ising energy per pixel are strongly
(positively) correlated to each other, as are the Ising Energy and the ASAP
area. To a lesser extent, these five variables are all correlated to each other, as
well as to the magnetic flux. This related behaviour of PSL length, R value, Ising
energy, and ASAP area is apparent in Figures 9, 11, 13, 15, 16 and 18. These
variables are linked to a measure of complexity of the AR and its capability to
produce a flare, see Colak et al. (2010).
The maximum and skewness of the EUV intensity are strongly correlated:
indeed a high value of maximum EUV intensity implies a long tail in the intensity
distribution function, hence a high skewness. Similarly, the variance and kurtosis
of the EUV intensity are strongly correlated, which is expected since they are
related by definition. Note that variance and kurtosis lie further from the circle
than maximum and skewness, and hence are less well represented by the first two
principal components. Finally, the maximum magnetic field, length of the Bipole
Connecting Line, SPoCA area, SMART area, and ASAP number of sunspots are
not well represented by the two first components, and hence their correlation
structure cannot be interpreted from this plot.
PCA also tends to separate photospheric and coronal contributions. Features
computed at the photospheric level such as R, Lsg, Ising energy, ASAP area, and
flux have a large contribution to the first component, which accounts for 52.86%
of the variability in the total data set. The maximum, variance, skewness, and
kurtosis in EUV intensity images are moderately correlated to both the first and
second principal components.

[Figure 7: plot titled "Correlation between variables and Principal Components 1 and 2"; horizontal axis Principal Component 1 and vertical axis Principal Component 2, both from -1.0 to 1.0; plotted variables: kurt I, skew I, max I, var I, B max, BCL length, R, SPoCA area, IsingE ppx, Lsg, ASAP area, IsingE, SMART area, flux, ASAP spots.]
Figure 7. The projections of the algorithm variables upon the first and second principal
components are plotted. They provide a measure of the extent to which these variables are
correlated with the first and second principal components.
This study shows that a reduction in dimensionality using PCA can be per-
formed without losing too much information. Such reduction can enhance the
accuracy and robustness of a subsequent classification scheme (Jiang, 2011) that
would aim for example at separating active regions that are prone to flares from
quiet active regions.
4.3. Case Studies
In the following section we analyse the time evolution of the ARs that emerge as
NOAA 10377 (a simple region) and 10365 (a complex, flaring region). Of special
interest is how activity in the corona results from changes in the photosphere.
Drawing this connection is essential for flare prediction, since the photosphere is
more easily physically characterised than the corona, where flares actually occur.
The photosphere–corona connection is not well understood (see, e.g., the work of
Leka and Barnes (2007) and of Handy and Schrijver (2001), and the references
therein).
We compare observations of NOAA 10365 and 10377 with flares characterised
by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) team
and distributed in the RHESSI flare list (http://sprg.ssl.berkeley.edu/~jimm/hessi/hsi_flare_list.html).
The flares, which have been associated with the individual ARs by the RHESSI
team, are represented in plots (Section 4.3) as downward pointing arrows, whose
size is logarithmically proportional to their peak count rate.
4.3.1. NOAA 10377
NOAA 10377 first emerges just before rotating onto the visible disk on 4 June
2003. It continues to gradually develop as it progresses across the disk producing
very little activity (only one B9.1 event is listed in the NOAA events catalogue
(http://www.swpc.noaa.gov/ftpdir/indices/events/README)). Some of the flares
produced by 10377 may have been missed due to the presence of 10375, which
produced many large flares, swamping any signal that could be attributed to
10377.
[Figure 8: two SOHO/MDI panels, 6-Jun-2003 and 9-Jun-2003, with axes in arcsecs.]
Figure 8. A comparison of NOAA AR 10377 detections. ASAP sunspots are represented by
black crosses. The contours represent SMART in black (with NOAA 10377 outlined in red) for
the magnetic features, SPoCA in dashed blue for coronal features, and STARA in orange for
sunspot penumbrae and magenta for umbrae.
Figure 8 shows the SMART detection of 10377 in red, while other features are
outlined in black. The extended dashed blue contours are SPoCA AR detections
and the small symbols and contours are sunspot detections from ASAP and
STARA, respectively. It is clear from Figure 8 that positions of the SMART,
ASAP, STARA, and SPoCA detections agree quite well. Whereas the sunspots
detected by ASAP and STARA are well confined within the SMART magnetic
region boundary, the SPoCA region most often contains most of the SMART
detection. In the case of coronal loop structures forming between nearby ARs,
adjacent SPoCA detections will merge and the SMART and SPoCA centroids will
diverge. This is especially apparent near the solar limb, where coronal structures
extending above the solar surface will be superimposed.
Figures 9 – 11 show the evolution of NOAA 10377 as it progresses across
the disk. In the top panel of Figure 9 the Stonyhurst longitudes of the re-
gion centroids from each algorithm are shown. The vertical dotted lines indi-
cate where the AR magnetic bounding box edges (dashed–dotted) and centroid
(dashed) cross −60° and 60° longitude. The cosine correction used to correct for
line-of-sight effects on magnetic-field properties is not sufficient outside of this
range. Also, beyond 60°, sunspot visibility is below ≈ 1/3 of that at disk centre
(Watson et al., 2009) due to the Wilson depression.
The top panel of Figure 9 tracks the longitude of centroids over time. We
see that ASAP and STARA curves are above the SMART curve on this plot,
suggesting that the centroid of the magnetic footpoints (SMART) follows behind
the sunspot centroids (ASAP and STARA). Since the longitudinal speeds of the
white-light and magnetic detections are the same, this implies that the following
polarity of the AR extends beyond the embedded sunspots, while the leading
polarity remains compact. As the NOAA region 10377 is close to 10375, the
latter region affects the SPoCA detections. From 3 – 6 June, SPoCA detects both
NOAA regions within a single boundary. When this region splits into two parts
on 6 June, the SPoCA longitude and area curves decrease abruptly, and can now
be directly compared to the photospheric structures. This changes when the two
NOAA regions merge again on 11 June. Whenever the region detected by SPoCA
corresponds to the region detected by the other algorithms, all four longitudes
agree well.
[Figure 9: three stacked panels versus Time (04-Jun to 16-Jun): Longitude [deg] (SMART, SPOCA, ASAP, STARA); Area (SMART, SPOCA/3) [Mm2] on the left axis and Area (ASAP, STARA) [Mm2] on the right axis; # Spots.]
Figure 9. Time series of position, area, and sunspot information characterising the evolution
of NOAA AR 10377. The legend indicates symbols and colors for each of the detection
algorithms. The axes of the area plot are split between left (SPoCA and SMART) and right (ASAP
and STARA). The SPoCA areas have been divided by three for display.
The total sunspot area determined by ASAP and STARA (Figure 9, middle)
is very similar except for one data point near 12 June 2003. This is due to the
MDI image on 11 June 2003 at 1736 UT being distorted. Most of the distortion
is visible on the south limb of the image where this area is darker than the
rest of the solar disk. Because ASAP detects the solar disk directly from the
image, while STARA uses FITS keywords, the determination of the solar disk
by these two methods is different. This explains why on this image the ASAP
sunspot area is much smaller than the STARA area: whereas the distorted area is
detected by STARA as a large sunspot, it is completely discarded by ASAP. The
SMART and SPoCA areas of photospheric magnetic regions and coronal active
regions obey the same general trend as the sunspot areas, although the absolute
scales are different. While the area measurements are stable, the total number of
sunspots is not. The total area is dominated by the largest sunspots, while the
total number of spots is affected by small transients which ASAP is especially
sensitive to.

[Figure 10: two stacked panels versus Time. Top: ΦTOT [Mx] (left axis) and ΣEUV Int. [DN/s] (right axis). Bottom: B-field Max. [G] (left axis) and EUV Max. [DN/s] (right axis).]
Figure 10. Time series of (top) total magnetic flux, total EUV intensity, (bottom) maximum
magnetic field, and maximum EUV intensity for NOAA AR 10377. The axes of the plots are
split between left (magnetic-field properties, black crosses) and right (coronal properties, blue
squares). RHESSI flares associated with the AR are indicated by downward arrows.
In the top panel of Figure 10, the emergence of the magnetic structure of
10377 is clearly seen in measurements of its total flux. The AR is stable until
≈ 8 June 2003 when a phase of rapid emergence begins, lasting until ≈ 11 June
when the total magnetic flux has more than doubled. Comparing Figure 9 and
Figure 10, we see that the total magnetic flux increases faster than the magnetic
area, implying that the AR magnetic fields emerge relatively faster than they
diffuse. The same general smooth trend is observed in the SPoCA total EUV
intensity between 6 and 11 June. After NOAA 10377 merges with 10375 on 11
June, we see a clear decay of the total EUV intensity in this combined region.
Note that both SMART flux and SPoCA total EUV intensity behave similarly
to the region area time series.
In the bottom panel the maximum magnetic field is much less stable than the
flux, and shows no clear trend. The maximum SPoCA EUV intensity does not
change significantly between 6 and 11 June for NOAA region 10377, but exhibits
three clear peaks afterwards which can be attributed to region 10375. The first
peak, on 11 June, can be attributed to the SPoCA detection merging with NOAA 10375. The
peak on 12 June, near 1300, is probably associated with the M1.0 flare in 10375
at around 1358 UT, whereas the 13 June 0700 UT peak is probably related to
the M1.8 (0628 UT) or C6.1 (0710 UT) flares in 10375. These flares (appearing
in the NOAA events catalogue) are not indicated by the RHESSI arrows, since
we have only displayed those flares attributed to 10377. This shows that SPoCA
maximum intensity is capable of indicating solar eruptions.
Magnetic properties related to polarity mixing and complexity are shown in
Figure 11. In the top panel, the angle between the bipole connection line and
polarity separation line (PSL) is presented. Since the PSL in this AR is only
a few megameters (or pixels) long (cf. middle-top panel), this angle cannot be
measured in a reliable way. Indeed, a small growth in the PSL detection in any
direction can cause the angle to change dramatically. In the middle-bottom panel
the total flux near the PSL [R] is very small until it begins to increase as a false
PSL is detected due to the near-horizontal fields of the large leading polarity
sunspot approaching the west limb on 12 June.
[Figure 11: four stacked panels versus Time: PSL Angle [deg]; Length [Mm] (Bipole Separation, PSL Length); R [Mx]; Ising Energy (Energy, (Energy/Px)*1E3).]
Figure 11. Time series of (top) PSL orientation with respect to the bipole separation line,
(middle-top) bipole separation line length (crosses) and PSL length (dashed), (middle-bottom)
R, and (bottom) Ising energy (crosses) and Ising energy per pixel (dashed; multiplied by 1000
for display) for NOAA AR 10377.
Ising energy, a proxy for magnetic connectivity, is shown in the bottom panel.
This property increases during the main magnetic emergence phase (≈ 8 to 10
June 2003) since it is dependent on the magnetic-field strength and inversely
dependent on the distance between individual magnetic elements. The Ising
energy per pixel (dashed line) appears to be very susceptible to geometrical
effects as the large decrease near the west limb and increase near the east limb
both coincide with the formation of false PSLs in the leading sunspot. It should
be noted that this quantity was calculated without remapping the data to disk-
centre as done by Ahmed et al. (2010), giving the measurement an even larger
viewing-angle dependence.
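The qualitative behaviour described above (growth with field strength, decrease with the separation of magnetic elements) can be illustrated with a toy Ising-type sum over opposite-polarity pixel pairs. The exact expression and normalisation used by SMART are not given in this section, so the inverse-square distance weighting, the pixel format `(x, y, B)` and the function names are assumptions for illustration only.

```python
def ising_energy(pixels):
    """Sum |B_i * B_j| / d_ij^2 over all pairs of opposite-polarity pixels.

    pixels: iterable of (x, y, b) with b the signed field value of the pixel.
    """
    positive = [(x, y, b) for x, y, b in pixels if b > 0]
    negative = [(x, y, b) for x, y, b in pixels if b < 0]
    energy = 0.0
    for xp, yp, bp in positive:
        for xn, yn, bn in negative:
            dist_sq = (xp - xn) ** 2 + (yp - yn) ** 2
            energy += abs(bp * bn) / dist_sq
    return energy

def ising_energy_per_pixel(pixels):
    """Energy divided by the number of pixels supplied."""
    pixels = list(pixels)
    return ising_energy(pixels) / len(pixels) if pixels else 0.0
```

In this toy form, bringing two opposite-polarity pixels closer together or strengthening their fields raises the energy, while a unipolar patch contributes nothing.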
4.3.2. NOAA 10365
Active region NOAA 10365 rotates onto the visible solar disk on 19 May 2003
at heliographic latitude −5°. At this point 10365 is mature and decaying, having
emerged and evolved on the far side of the Sun. On 24 May, a new bipolar
structure rapidly emerges in the extended plage of the trailing (positive) polarity.
NOAA switches the 10365 designation to this newly emerged bipole several days
later. As the bipole evolves it develops a strong double PSL by merging with the
decayed flux. It produces many C- and M-class flares and several X-class flares.
The AR progresses around the visible disk, eventually returning as NOAA 10386.
The onset of decay occurs as C- and M-class flares are produced with decreasing
frequency and the spot areas, magnetic flux, and field strengths decrease.
[Figure 12: two SOHO/MDI panels, 21-May-2003 and 26-May-2003, with axes in arcsecs.]
Figure 12. A comparison of detection contours for NOAA AR 10365. ASAP sunspots are
represented by black crosses. The contours represent SMART in black (with NOAA 10365
outlined in red) for the magnetic features, SPoCA in dashed blue for coronal features, and
STARA in orange for sunspot penumbrae and magenta for umbrae.
Figure 12 shows a comparison of the heliographic positions and sizes of two
sets of SMART, ASAP, STARA and SPoCA detections of NOAA 10365. We can
see that positions of the SMART, ASAP, STARA and SPoCA detections agree
well. The SPoCA detection, however, includes coronal loops extending away from
the footpoint boundary of NOAA 10365. Before 24 May, SPoCA merges NOAA
region 10367 with its detection of 10365. From 24 May – 27 May it only detects
10365, on 27 May at 1300 UT there is a single data point where these regions are
merged by SPoCA, and from 29 May at 0100 UT onwards, SPoCA merges them
for the remaining observation period. The longitudes of all detections within 24 –
27 May agree well. After 27 May the SPoCA longitude drifts, reflecting changes
in the merged coronal structures. Unlike 10377, the magnetic centroid of 10365
at first trails behind the sunspot centroid but then precedes it, as evidenced by
the top panel in Figure 13. This is because the new bipole, which develops many
spots, emerges behind the existing weakly spotted bipole. The new emergence
is clear in the plot of total sunspot area (middle panel), and is unclear in the
magnetic and EUV area plots since the new bipole emerges partially within the
boundary of the old one. Note that all areas for NOAA 10365 are much larger
than those for the simpler region 10377. The SMART area, however, is very
sensitive to weak magnetic plage. This can be seen in the sudden jumps around 25
and 28 May, which are due to nearby plage temporarily merging with the AR. The
jump in STARA area on 27 May can be attributed to a bad data file (note that
there is no ASAP data point at that time).
From 25 May onwards, the total magnetic flux increases gradually to over
four-fold the initial value during development and levels off around 29 May (see
Figure 14). The maximum magnetic field increases abruptly on 25 May and also
increases over time, albeit less smoothly than the magnetic flux. The maximum
magnetic field did not show an overall increasing or decreasing trend in the case
of simpler NOAA region 10377.
[Figure 13: three stacked panels versus Time (19-May to 31-May): Longitude [deg] (SMART, SPOCA, ASAP, STARA); Area (SMART, SPOCA/3) [Mm2]; # Spots.]
Figure 13. Time series of position, area, and sunspot information characterising the
evolution of NOAA AR 10365. The legend indicates symbols and colors for each of the detection
algorithms. The axes of the area plot are split between left (SPoCA and SMART) and right
(ASAP and STARA). The SPoCA areas have been divided by three for display.
[Figure 14: two stacked panels versus Time. Top: ΦTOT [Mx] (left axis) and ΣEUV Int. [DN/s] (right axis). Bottom: B-field Max. [G] (left axis) and EUV Max. [DN/s] (right axis).]
Figure 14. Time series of (top) total magnetic flux, total EUV intensity, (bottom) maximum
magnetic field, and maximum EUV intensity for NOAA AR 10365. The axes of the plots are
split between left (magnetic-field properties, black crosses) and right (coronal properties, blue
squares). RHESSI flares associated with the AR are indicated by downward arrows.
[Figure 15: four stacked panels versus Time: PSL Angle [deg]; Length [Mm] (Bipole Separation, PSL Length); R [Mx]; Ising Energy (Energy, (Energy/Px)*1E3).]
Figure 15. Time series showing proxies for the complexity and polarity mixing in NOAA
AR 10365. (top) PSL orientation with respect to the bipole separation line, (middle-top) bipole
separation line length (crosses) and PSL length (dashed), (middle-bottom) R, and (bottom)
Ising energy (crosses) and Ising energy per pixel (dashed; multiplied by 1000 for display).
The time series of SPoCA maximum intensity exhibits some peaks, which can
be related to the following flares produced by NOAA 10365: the M1.9 flare at
0534 UT on 26 May, the M1.6 flare at 0506 UT on 27 May, the X1.2 flare at
0051 UT on 29 May, and the M9.3 flare at 0213 UT on 31 May. The last two
flares are even visible in the total SPoCA intensity, which shows more or less a
gradual increase over time, but less smooth than in the case of the simpler NOAA
region 10377. The flares not picked up by SPoCA likely occurred in between EIT
images.
Signatures in the evolution of the magnetic topology of NOAA 10365 precede
its intense coronal activity, indicated by the associated RHESSI flares in Figure
15. Just before 25 May 2003 the new emergence causes a jump in the main bipole
separation line length. As the emergence continues and strong PSLs develop, this
length decreases, while the total PSL length increases, as shown in the middle-
top panel of Figure 15. Also, there are signs of gradual helicity injection as the
angle between the main bipole connection line and the main PSL grows from
near perpendicular (90◦ ) to around 120◦ (top panel). The flux near PSL [R]
grows during this time, as does the Ising energy (middle-bottom and bottom
panels, respectively). A bump in R just before 26 May is followed by an intense
RHESSI flare. Intense flaring begins again around the second bump in R on 28
May. Examining the development of Ising energy, it appears that sharp increases
in the property are followed by the most intense flaring.

[Figure 16: three stacked panels versus Time (16-Jun to 24-Jun): Longitude [deg] (SMART, SPOCA, ASAP, STARA); Area (SMART, SPOCA/3) [Mm2]; # Spots.]
Figure 16. Time series of position, area, and sunspot information characterising the decay
phase of NOAA AR 10365 (renamed 10386) during its second disk passage. The legend indicates
symbols and colors for each of the detection algorithms. The axes of the area plot are split
between left (SPoCA and SMART) and right (ASAP and STARA). The SPoCA areas have been
divided by three for display.
NOAA 10365 returns for a second disk passage, renamed 10386. We are able to
observe its decay phase, as shown in Figures 16 – 18. As no RHESSI data on flares
is available for this period, no flare arrows were added to these figures. While
the longitude of the SMART magnetic centroid increases linearly with time,
the ASAP and STARA sunspot centroids show small departures from this line
between 19 and 21 June, preceding the magnetic centroid. The SPoCA detection
of NOAA 10386 merges with 10388 and 10389, so a direct comparison with the
other algorithms cannot be made.
The magnetic area does not change significantly, but the total sunspot area
clearly decreases (middle panel, Figure 16), and has already decreased substan-
tially since the previous disk passage (as NOAA 10365). The total magnetic
flux decreases (top panel, Figure 17) as its magnetic fields diffuse and weaken.
Comparing the values to Figure 14, we notice that the flux had already decreased
significantly since the previous solar rotation. The total EUV intensity does not
change substantially, regardless of the weakening magnetic footpoints, although
it has decreased since the previous solar rotation. Its increase on 22 June is
due to the detection merging with a large region near the limb. The maximum
magnetic-field value shows a gradual, although not very smooth decrease, and
has also decreased since the previous passage. The maximum EUV intensity does
not show a clear trend, and although several jumps are detected, the intensity
levels are much less than those associated with flares over the previous passage.
The peak on 18 June at 0100 UT, for instance, can likely be associated to the
M6.8 flare produced by region 10386 at 2227 UT on 17 June. The PSL length has
decreased since the previous solar rotation, and shows a further gradual decrease
in Figure 18. The same is true for both R and the Ising energy.
[Figure 17: two stacked panels versus Time. Top: ΦTOT [Mx] (left axis) and ΣEUV Int. [DN/s] (right axis). Bottom: B-field Max. [G] (left axis) and EUV Max. [DN/s] (right axis).]
Figure 17. Time series of (top) total magnetic flux, total EUV intensity, (bottom) maximum
magnetic field, and maximum EUV intensity for NOAA AR 10365 on its second disk passage
as 10386. The axes of the plots are split between left (magnetic-field properties, black crosses)
and right (coronal properties, blue squares).
[Figure 18: four stacked panels versus Time: PSL Angle [deg]; Length [Mm] (Bipole Separation, PSL Length); R [Mx]; Ising Energy (Energy, (Energy/Px)*1E3).]
Figure 18. Time series showing proxies for the complexity and polarity mixing in NOAA
10386. (top) PSL orientation with respect to the bipole separation line, (middle-top) bipole
separation line length (crosses) and PSL length (dashed), (middle-bottom) R, and (bottom)
Ising energy (crosses) and Ising energy per pixel (dashed; multiplied by 1000 for display).
5. Discussion and Conclusion
In this article the performance of the presented algorithms is tested in several
ways: the overall detection performance is compared with that of established
methods; the correlations between extracted physical properties are
determined using Principal Component Analysis; and the stability and usefulness
of the algorithms and of various feature properties for studying feature time
evolution are tested. Additionally, some lessons were learned about data analysis
using automated detection algorithms. The main results of these investigations
are summarised below.
In Section 4.1 we compared ASAP, STARA, SIDC, and NOAA sunspot detections
and found that, although different numbers of features are detected,
the total disk areas agree well. The same is true for the comparison between
SMART and SPoCA active-region detections. The difference in numbers can
be explained as follows: ASAP is sensitive to pores while STARA only detects
developed sunspots. Pores are detected in large numbers but clearly add little
area to the total disk coverage. NOAA sunspot counts
include both penumbrae and umbral cores, whereas ASAP and STARA only count
sunspot groups. Consequently, sunspot counts should be very carefully assessed
when applied to a long-term inhomogeneous historical record. NOAA only counts
those active regions that possess sunspots, whereas SMART and SPoCA do not impose
this requirement. SMART detects both small AR fragments and complexes of
bipolar structures, while SPoCA is sensitive to bright loop connections between
adjacent features, often merging them into a single detection. These differences
are partially dependent on thresholds chosen by the developers and may be tuned
to return an agreeable feature boundary. Boundaries are inherently arbitrary as
there is no established definition of each feature. Additionally, features evolve
continuously and are prone to merging and fragmentation, so universally defining
a feature is very difficult. The summed area of features is a preferred quantity
as it is both more stable and less ambiguous.
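The point about summed area can be illustrated with a small sketch. The detections below are hypothetical numbers invented for the example (they are not outputs of SMART, SPoCA, ASAP, or STARA); they show how merging or fragmenting the same structures changes the feature count while leaving the summed area unchanged.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str        # hypothetical feature label
    area_mm2: float   # hypothetical feature area [Mm^2]

def summarise(detections):
    """Return (number of features, summed area) for one image."""
    return len(detections), sum(d.area_mm2 for d in detections)

# The same structures as seen by two hypothetical algorithms:
# one merges neighbours B and C, the other reports them separately.
merged = [Detection("A", 900.0), Detection("B+C", 1500.0)]
fragmented = [Detection("A", 900.0), Detection("B", 1000.0), Detection("C", 500.0)]

count_merged, area_merged = summarise(merged)
count_fragmented, area_fragmented = summarise(fragmented)

print("counts:", count_merged, count_fragmented)
print("summed areas [Mm^2]:", area_merged, area_fragmented)
```

In this toy case the counts differ (2 versus 3) although both algorithms cover the same 2400 Mm² of the disk, which is the behaviour described above for real detections.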
The use of Principal Component Analysis (Section 4.2) has allowed us to
determine which feature properties contain the most information about our data
set. PCA also tends to separate photospheric and coronal contributions. Features
computed at the photospheric level such as R, Lsg, Ising energy, ASAP sunspot
area, and magnetic flux are highly correlated to each other and have a large
contribution to the first component, which accounts for 52.86% of the variability
in the total data set. The maximum, variance, skewness, and kurtosis in EUV
intensity images are highly correlated to each other and moderately correlated to
both the first and second principal components. By reducing dimensionality, the
accuracy and robustness of a classification scheme can be enhanced (Jiang, 2011).
For example, this could be used to discriminate between the properties of flaring
and non-flaring active regions.
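The quantities discussed here (percentage of variability per component, and correlation between each property and each component) can be computed as in the following minimal sketch. It assumes PCA on standardised variables, i.e. on the correlation matrix, which may differ in detail from the procedure of Section 4.2. The data are synthetic: `photo` and `corona` are hypothetical latent factors, and the five columns are stand-ins for measured properties, not values from this study.

```python
import numpy as np

def pca_on_correlation(X):
    """PCA of standardised variables.

    X has shape (n_detections, n_properties).
    Returns (explained, var_pc_corr): explained[k] is the fraction of total
    variance carried by component k; var_pc_corr[j, k] is the correlation
    between property j and component k (sign of each column is arbitrary).
    """
    corr = np.corrcoef(X, rowvar=False)            # property-property correlations
    eigvals, eigvecs = np.linalg.eigh(corr)        # symmetric eigen-decomposition
    order = np.argsort(eigvals)[::-1]              # largest variance first
    eigvals = np.clip(eigvals[order], 0.0, None)   # guard against tiny negatives
    eigvecs = eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    var_pc_corr = eigvecs * np.sqrt(eigvals)       # scale column k by sqrt(eigenvalue k)
    return explained, var_pc_corr

rng = np.random.default_rng(0)
n = 500
photo = rng.normal(size=n)                         # hypothetical photospheric factor
corona = 0.5 * photo + rng.normal(size=n)          # hypothetical, partly coupled coronal factor
X = np.column_stack([
    photo + 0.2 * rng.normal(size=n),              # stand-in for magnetic flux
    photo + 0.2 * rng.normal(size=n),              # stand-in for sunspot area
    photo + 0.2 * rng.normal(size=n),              # stand-in for R
    corona + 0.2 * rng.normal(size=n),             # stand-in for maximum EUV intensity
    corona + 0.2 * rng.normal(size=n),             # stand-in for EUV intensity variance
])

explained, var_pc_corr = pca_on_correlation(X)
print("variance explained [%]:", np.round(100 * explained, 2))
print("correlation with PC1 and PC2:")
print(np.round(var_pc_corr[:, :2], 2))
```

Keeping only the first few columns of `var_pc_corr` gives the kind of reduced representation that a classification scheme could be trained on.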
Through the time series analysis of two AR case studies (Section 4.3), we
have observed three physical processes evident in their evolution: emergence of

a bipolar magnetic structure, sunspots, and EUV loops (Section 4.3.1); increase
and peak in non-potentiality, followed by the onset of flaring (Section 4.3.2);
decay and weakening of magnetic footpoints (Section 4.3.2). We find that the
algorithms show good correspondence between centroid positions and areas but
significant divergence is seen in other properties.
In the case study of the simple active region NOAA 10377 (Section 4.3.1)
we see that the total number of detected sunspots fluctuates wildly as transient
spots rapidly emerge and disappear. This is partly due to the visibility curve
since the area of small spots is highly impacted by the observer’s viewing angle
(Dalla, Fletcher, and Walton, 2008; Watson et al., 2009). In order to study AR
evolution, sunspot area is more indicative of the emergence and decay of an AR
than coronal or magnetic area, which do not necessarily decrease during decay
(see Section 4.3). As a basis for long-term AR tracking, magnetic flux is more
useful than the sunspot area (since sunspots are much more transient than their
magnetic footprints) or the maximum magnetic-field value (since it is affected by the
MDI saturation problem; Liu, Norton, and Scherrer, 2007). Also, the maximum
magnetic-field value is unstable since different positions in the active region will
over-take each other in field magnitude as they develop, causing the location
that the value is sampled from to vary wildly. Finally, the total EUV intensity
determined by SPoCA has a smooth behaviour over time and is closely linked to
area. The maximum EUV intensity peaks when the active region emits a large
flare, and appears to be a useful indicator of eruptions in the corona.
The case study of complex, flaring NOAA 10365 (Section 4.3.2) shows that
flaring can happen both in periods of flux emergence and in periods of
non-potentiality enhancement in an active region. Following the initial flaring
during the emergence phase of evolution, further flaring occurs as the main PSL rotates with
respect to the bipole connection line. This is a sign of helicity injection and is
coincident with increases of other properties related to polarity mixing. Helicity
injection has been established as a method of increasing non-potentiality and
may be caused by the emergence of subsurface twisted flux ropes, as seen in
Dun et al. (2007).
As NOAA 10365 returns after one solar rotation, decay is seen in the strength
of its magnetic footprint. However, the area is not seen to decrease significantly,
since supergranular diffusion causes a radial dispersal of magnetic elements.
Coronal structures do not appear to decay readily, either. This result agrees with
Lites et al. (1995), where it is reasoned that if the coronal magnetic structure
is closed, it may be in a state of quasi-static equilibrium, whereby the magnetic
buoyancy of the loops is cancelled by the weight of plasma trapped at the bottom
of the closed structure.
By performing these studies, it is found that magnetic active-region detections
provide the most stable base for feature tracking. Sunspots are only visible for
short periods of time, and coronal detections continually form bright loop connec-
tions to nearby features. The simple feature tracking method used in this article
(see Section 3.5) is novel in that it allows features to be tracked between multiple
disk passages. This is essential for analysing the complete life-cycle of an active
region, as exemplified in the analysis of NOAA 10365 (Section 4.3.2). Future work
on active-region evolution should combine morphological information to better

handle merging and splitting (as done by Welsch and Longcope (2003)) with our
method of multiple disk passage tracking. Future work on our algorithms will
also address the automatic detection and handling of structural or visible errors
in solar data, to avoid discontinuities in the time series due to a corrupted image,
as was seen in the STARA outputs used in the case studies (see Section 4.3.1).
This work will be expanded in the future to include an analysis of the full
SOHO archive as well as detailed studies of photospheric and coronal SDO data
sets. Many physical studies will benefit from this work: investigations of
coronal heating as a result of large-scale magnetic fields (Schrijver, 1987;
Fisher et al., 1998), coupling between the photosphere and corona (Handy and Schrijver,
2001), sources of coronal mass ejections (Subramanian and Dere, 2001), flux
emergence and distribution (Liu and Kurokawa, 2004; Abramenko and Longcope,
2005), and flare forecasting (Gallagher, Moon, and Wang, 2002) can all be
repeated with these automated detection methods. Using these methods allows a
far greater number of features to be analysed and reduces human bias in the
detection of features in the solar data.
The algorithms presented here are automated (once thresholds have been
fixed), independent, and unsupervised. Although some development remains to
be done, they detect features as intended, and will provide
useful additions to the SDO pipeline feature-detection methods. However, this
work shows that automated methods cannot replace human data analysis, although
they can help to streamline the process.
Acknowledgements Funding of CV and VD by the Belgian Federal Science Policy Office

(BELSPO) through the ESA/PRODEX SIDC Data Exploitation program, as well as by the
Solar–Terrestrial Center of Excellence/ROB, is hereby appreciatively acknowledged. FTW
acknowledges Ph.D. funding from the Science and Technology Facilities Council and the
guidance of his supervisor, Lyndsay Fletcher. We acknowledge support from ISSI through
funding for the International Team on “Mining and exploiting SDO data in Europe” led
by V. Delouille. ASAP is supported by an EPSRC Grant (EP/F022948/1), which is entitled
“Image Processing, Machine Learning and Geometric Modelling for the 3D Representation of
Solar Features”. PAH acknowledges support from ESA/PRODEX and a grant from the EC
Framework Programme 7 (HELIO) and the guidance of his supervisor, Peter T. Gallagher. We
would like to thank the SOHO team for making both their data and analysis software publicly
available, Omar W. Ahmed for use of the Ising Energy software, and Aidan M. O’Flannagain
for advice on the RHESSI flare list.
References
Abramenko, V.I., Longcope, D.W.: 2005, Distribution of the Magnetic Flux in

Elements of the Magnetic Field in Active Regions. Astrophys. J. 619, 1160 –
1166. doi:10.1086/426710.
Ahmed, O., Qahwaji, R., Colak, T., DudokDeWit, T., Ipson, S.: 2010, A new
technique for the calculation and 3d visualisation of magnetic complexities on
solar satellite images. The Visual Computer 26, 385 – 395.
doi:10.1007/s00371-010-0418-1.

Aschwanden, M.J.: 2010, Image Processing Techniques and Feature Recognition

in Solar Physics. Solar Phys. 262, 235 – 275. doi:10.1007/s11207-009-9474-y.
Barra, V., Delouille, V., Kretzschmar, M., Hochedez, J.: 2009, Fast and robust
segmentation of solar EUV images: algorithm and results for solar cycle 23.
Astron. Astrophys. 505, 361 – 371. doi:10.1051/0004-6361/200811416.
Benkhalil, A., Zharkova, V.V., Zharkov, S., Ipson, S.: 2006, Active Region De-
tection and Verification With the Solar Feature Catalogue. Solar Phys. 235,
87 – 106. doi:10.1007/s11207-006-0023-7.
Bentley, R.D., Aboudarham, J., Csillaghy, A., Jacquey, C., Hapgood, M.A.,
Messerotti, M., Gallagher, P., Bocchialini, K., Hurlburt, N.E., Roberts, D.,
Sanchez Duarte, L.: 2009, Addressing Science Use Cases with HELIO. AGU
Fall Meeting Abstracts, A6.
Colak, T., Qahwaji, R.: 2008, Automated McIntosh-Based Classification

of Sunspot Groups Using MDI Images. Solar Phys. 248, 277 – 296.
doi:10.1007/s11207-007-9094-3.
Colak, T., Qahwaji, R.: 2009, Automated Solar Activity Prediction: A

hybrid computer platform using machine learning and solar imaging
for automated prediction of solar flares. Space Weather 7, S06001.
doi:10.1029/2008SW000401.
Colak, T., Ahmed, O.W., Qahwaji, R., Higgins, P.A.: 2010, Automated So-
lar Flare Prediction: Is it a Myth? Presentation in Seventh European Space
Weather Week. http://spaceweather.inf.brad.ac.uk/colak19nov.pdf.
Colak, T., Qahwaji, R., Ipson, S., Ugail, H.: 2011, Representation of Solar Fea-
tures in 3D for Creating Visual Solar Catalogues. Adv. Space Res. 47(12),
2092 – 2104. doi:10.1016/j.asr.2010.08.030.
Conlon, P.A., Gallagher, P.T., McAteer, R.T.J., Ireland, J., Young,

C.A., Kestener, P., Hewett, R.J., Maguire, K.: 2008, Multifractal
Properties of Evolving Active Regions. Solar Phys. 248, 297 – 309.
doi:10.1007/s11207-007-9074-7.
Conlon, P.A., McAteer, R.T.J., Gallagher, P.T., Fennell, L.: 2010, Quantifying
the Evolving Magnetic Structure of Active Regions. Astrophys. J. 722, 577 –
585. doi:10.1088/0004-637X/722/1/577.
Curto, J.J., Blanca, M., Martı́nez, E.: 2008, Automatic Sunspots Detection on
Full-Disk Solar Images using Mathematical Morphology. Solar Phys. 250,
411 – 429. doi:10.1007/s11207-008-9224-6.
Dalla, S., Fletcher, L., Walton, N.A.: 2008, Invisible sunspots and
rate of solar magnetic flux emergence. Astron. Astrophys. 479, 1 – 4.
doi:10.1051/0004-6361:20078800.

DeForest, C.E., Hagenaar, H.J., Lamb, D.A., Parnell, C.E., Welsch, B.T.:
2007, Solar Magnetic Tracking. I. Software Comparison and Recommended
Practices. Astrophys. J. 666, 576 – 587. doi:10.1086/518994.
Delaboudinière, J., Artzner, G.E., Brunaud, J., Gabriel, A.H., Hochedez, J.F.,
Millier, F., Song, X.Y., Au, B., Dere, K.P., Howard, R.A., Kreplin, R., Michels,
D.J., Moses, J.D., Defise, J.M., Jamar, C., Rochus, P., Chauvineau, J.P.,
Marioge, J.P., Catura, R.C., Lemen, J.R., Shing, L., Stern, R.A., Gurman,
J.B., Neupert, W.M., Maucherat, A., Clette, F., Cugnon, P., van Dessel, E.L.:
1995, EIT: Extreme-Ultraviolet Imaging Telescope for the SOHO Mission.
Solar Phys. 162, 291 – 312. doi:10.1007/BF00733432.
Dougherty, E.R., Lotufo, R.A.: 2003, Hands-on morphological image processing,

SPIE Optical Engineering Press, Washington, 130.
Dudok de Wit, T., Auchère, F.: 2007, Multispectral analysis of solar EUV im-
ages: linking temperature to morphology. Astron. Astrophys. 466, 347 – 355.
doi:10.1051/0004-6361:20066764.
Dudok de Wit, T.D.: 2006, Fast Segmentation of Solar Extreme Ultraviolet

Images. Solar Physics 239, 519 – 530.
Dun, J., Kurokawa, H., Ishii, T.T., Liu, Y., Zhang, H.: 2007, Evolution of
Magnetic Nonpotentiality in NOAA AR 10486. Astrophys. J. 657, 577 – 591.
doi:10.1086/510373.
Falconer, D.A., Moore, R.L., Gary, G.A.: 2008, Magnetogram Measures of Total
Nonpotentiality for Prediction of Solar Coronal Mass Ejections from Active
Regions of Any Degree of Magnetic Complexity. Astrophys. J. 689, 1433 –
1442. doi:10.1086/591045.
Fisher, G.H., Longcope, D.W., Metcalf, T.R., Pevtsov, A.A.: 1998, Coronal Heat-
ing in Active Regions as a Function of Global Magnetic Variables. Astrophys.
J. 508, 885 – 898. doi:10.1086/306435.
Gallagher, P.T., Moon, Y., Wang, H.: 2002, Active-Region Monitoring and Flare
Forecasting I. Data Processing and First Results. Solar Phys. 209, 171 – 183.
doi:10.1023/A:1020950221179.
Georgoulis, M.K., Rust, D.M.: 2007, Quantitative Forecasting of Major Solar

Flares. Astrophys. J. Lett. 661, 109 – 112. doi:10.1086/518718.
Habash Krause, L., Franz, A., Stevenson, A.: 2011, On the application of Ex-
ploratory Data Analysis for characterization of space weather data sets. Adv.
Space Res. 47, 2199 – 2209. doi:10.1016/j.asr.2011.03.017.
Handy, B.N., Schrijver, C.J.: 2001, On the Evolution of the Solar Pho-
tospheric and Coronal Magnetic Field. Astrophys. J. 547, 1100 – 1108.
doi:10.1086/318429.

Hewett, R.J., Gallagher, P.T., McAteer, R.T.J., Young, C.A., Ireland, J., Conlon,
P.A., Maguire, K.: 2008, Multiscale Analysis of Active Region Evolution. Solar
Phys. 248, 311 – 322. doi:10.1007/s11207-007-9028-0.
Higgins, P.A., Gallagher, P.T., McAteer, R.T.J., Bloomfield, D.S.: 2011, Solar
magnetic feature detection and tracking for space weather monitoring. Adv.
Space Res. 47, 2105 – 2117. doi:10.1016/j.asr.2010.06.024.
Howard, R.F., Harvey, J.W., Forgach, S.: 1990, Solar surface velocity fields
determined from small magnetic features. Solar Phys. 130, 295 – 311.
doi:10.1007/BF00156795.
Hurlburt, N., Cheung, M., Schrijver, C., Chang, L., Freeland, S., Green,
S., Heck, C., Jaffey, A., Kobashi, A., Schiff, D., Serafin, J., Seguin, R.,
Slater, G., Somani, A., Timmons, R.: 2010, Heliophysics Event Knowledgebase
for the Solar Dynamics Observatory (SDO) and Beyond. Solar Phys.
doi:10.1007/s11207-010-9624-2.
Jiang, X.: 2011, Linear subspace learning-based dimensionality reduction. IEEE
Signal Processing Magazine 28(2), 16 – 26.
Jolliffe, I.T.: 2002, Principal component analysis, 2nd edition, Springer, New
York, 487 p.
Krishnapuram, R., Keller, J.M.: 1993, A possibilistic approach to clustering.

IEEE Transactions on Fuzzy Systems 1, 98 – 110.
Krishnapuram, R., Keller, J.M.: 1996, The Possibilistic C-Means Algorithm:

Insights and Recommendations. IEEE Transactions on Fuzzy Systems 4, 385 –
393.
LaBonte, B.J., Georgoulis, M.K., Rust, D.M.: 2007a, Survey of Magnetic Helicity
Injection in Regions Producing X-Class Flares. Astrophys. J. 671, 955 – 963.
doi:10.1086/522682.
LaBonte, B.J., Georgoulis, M.K., Rust, D.M.: 2007b, Survey of Magnetic Helicity
Injection in Regions Producing X-Class Flares. Astrophys. J. 671, 955 – 963.
doi:10.1086/522682.
Lefebvre, S., Rozelot, J.: 2004, A new method to detect active features at the so-
lar limb. Solar Phys. 219, 25 – 37. doi:10.1023/B:SOLA.0000021818.97402.1e.
Leka, K.D., Barnes, G.: 2007, Photospheric Magnetic Field Properties of Flar-
ing versus Flare-quiet Active Regions. IV. A Statistically Significant Sample.
Astrophys. J. 656, 1173 – 1186. doi:10.1086/510282.
Lites, B.W., Low, B.C., Martinez Pillet, V., Seagraves, P., Skumanich, A.,
Frank, Z.A., Shine, R.A., Tsuneta, S.: 1995, The Possible Ascent of a
Closed Magnetic System through the Photosphere. Astrophys. J. 446, 877 – .
doi:10.1086/175845.

Liu, Y., Kurokawa, H.: 2004, On a Surge: Properties of an Emerging Flux Region.
Astrophys. J. 610, 1136 – 1147. doi:10.1086/421715.
Liu, Y., Norton, A.A., Scherrer, P.H.: 2007, A Note on Saturation
Seen in the MDI/SOHO Magnetograms. Solar Phys. 241, 185 – 193.
doi:10.1007/s11207-007-0296-5.
Martens, P.C.H., Attrill, G.D.R., Davey, A.R., Engell, A., Farid, S., Grigis,
P.C., Kasper, J., Korreck, K., Saar, S.H., Savcheva, A., Su, Y., Testa, P.,
Wills-Davey, M., Bernasconi, P.N., Raouafi, N., Delouille, V.A., Hochedez,
J.F., Cirtain, J.W., Deforest, C.E., Angryk, R.A., de Moortel, I., Wiegelmann,
T., Georgoulis, M.K., McAteer, R.T.J., Timmons, R.P.: 2011, Computer
Vision for the Solar Dynamics Observatory (SDO). Solar Phys.
doi:10.1007/s11207-010-9697-y.
McAteer, R.T.J., Gallagher, P.T., Ireland, J., Young, C.A.: 2005, Automated
Boundary-extraction And Region-growing Techniques Applied To Solar Mag-
netograms. Solar Phys. 228, 55 – 66. doi:10.1007/s11207-005-4075-x.
Morita, S., McIntosh, S.W.: 2005, Genesis of AR NOAA10314. In: K. Sankara-
subramanian, M. Penn, & A. Pevtsov (ed.) Large-scale Structures and their
Role in Solar Activity 346, Astron. Soc. Pacific, San Francisco, 317 – .
Nguyen, S.H., Nguyen, T.T., Nguyen, H.S.: 2005, Rough set approach to sunspot
classification problem. In: Slezak, D., Yao, J., Peters, J.F., Ziarko, W., Hu, X.
(eds.) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing Lecture
Notes in Computer Science 3642, Springer, Berlin / Heidelberg, 263 – 272.
Parnell, C.E., DeForest, C.E., Hagenaar, H.J., Johnston, B.A., Lamb, D.A.,
Welsch, B.T.: 2009, A Power-Law Distribution of Solar Magnetic Fields
Over More Than Five Decades in Flux. Astrophys. J. 698, 75 – 82.
doi:10.1088/0004-637X/698/1/75.
Pérez-Suárez, D., Higgins, P.A., McAteer, R.T.J., Bloomfield, D.S., Gallagher,
P.T.: 2011, Automated Solar Feature Detection for Space Weather Ap-
plications. In: Qahwaji, R., Green, R., Hines, E.L. (eds.) Applied Signal
and Image Processing: Multidisciplinary Advancements, IGI Global, Hershey,
Pennsylvania, 207 – 225. doi:10.4018/978-1-60960-477-6.
Qahwaji, R., Colak, T.: 2006, Hybrid imaging and neural networks techniques
for processing solar images. I. J. Comput. Appl. 13(1), 9 – 16.
Sarro, L.M., Berihuete, A.: 2011, Statistical techniques for the detection
and analysis of solar explosive events. Astron. Astrophys. 528, A62+.
doi:10.1051/0004-6361/201014894.
Scherrer, P.H., Bogart, R.S., Bush, R.I., Hoeksema, J.T., Kosovichev, A.G.,
Schou, J., Rosenberg, W., Springer, L., Tarbell, T.D., Title, A., Wolf-
son, C.J., Zayer, I., MDI Engineering Team: 1995, The Solar Oscilla-
tions Investigation - Michelson Doppler Imager. Solar Phys. 162, 129 – 188.
doi:10.1007/BF00733429.

Schrijver, C.J.: 1987, Solar active regions - Radiative intensities and large-scale
parameters of the magnetic field. Astron. Astrophys. 180, 241 – 252.
Schrijver, C.J.: 2007, A Characteristic Magnetic Field Pattern Associated with

All Major Solar Flares and Its Use in Flare Forecasting. Astrophys. J. Lett.
655, 117 – 120. doi:10.1086/511857.
SIDC-Team: 2003, The International Sunspot Number. Monthly
Report on the International Sunspot Number, online catalogue.
http://www.sidc.be/sunspot-data/dailyssn.php.
Subramanian, P., Dere, K.P.: 2001, Source Regions of Coronal Mass Ejections.
Astrophys. J. 561, 372 – 395. doi:10.1086/323213.
Watson, F., Fletcher, L., Dalla, S., Marshall, S.: 2009, Modelling the Longitu-
dinal Asymmetry in Sunspot Emergence: The Role of the Wilson Depression.
Solar Phys. 260, 5 – 19. doi:10.1007/s11207-009-9420-z.
Welsch, B.T., Longcope, D.W.: 2003, Magnetic Helicity Injection by Horizontal

Flows in the Quiet Sun. I. Mutual-Helicity Flux. Astrophys. J. 588, 620 – 629.
doi:10.1086/368408.
Zharkov, S., Zharkova, V., Ipson, S., Benkhalil, A.: 2004, Automated recognition
of sunspots on the SOHO/MDI white light solar images. In: Knowledge-Based
Intelligent Information and Engineering Systems Lecture Notes in Computer
Science 3215, Springer, Berlin / Heidelberg, 446 – 452.

