1109.0473v1
DOI: 10.1007/•••••-•••-•••-••••-•
© Springer ••••
1. Introduction
In the 1960s, NASA launched the Pioneer 6, 7, 8, and 9 spacecraft, which were
tasked with observing the solar wind and interplanetary magnetic fields, forming
the first space-based space-weather network and recording 512 bits per second.
By comparison, the recently launched Solar Dynamics Observatory (SDO) is
currently relaying solar data back to Earth at a rate of 150 000 000 bits per
second. With SDO returning the equivalent of an image with 4096 by 4096
pixels every second, human analysis of every image would require a large team
of people working 24 hours a day. The technological advances that have allowed
the increased flow of data, such as improving communication bandwidths and
onboard processing power, allow us to record data with a much greater cadence
and spatial resolution than ever before. However, there are problems with the
storage, transfer, and analysis of such a large flow of data. SDO generates around
1 TB of data per day, which is unprecedented in solar physics. Getting this volume
of data to researchers around the world, as well as storing it in convenient places
for analysis, is essential to make good use of it. An effective solution to the
problem is to use automated feature-detection methods, which allow users to
selectively acquire interesting portions of the full data set.
Development of automated solar feature detection and identification meth-
ods has increased dramatically in recent years due to the growing volume of
data available. An overview of the fundamental image-processing techniques
used in these algorithms is presented in Aschwanden (2010). These techniques
are used to detect many features in various types of observations at different
heights in the solar atmosphere (Pérez-Suárez et al., 2011). In the present
work, we focus on detecting sunspot groups and active regions in photospheric
continuum images, magnetograms, and EUV images. Previously, detection of
sunspot groups in photospheric images was investigated by Zharkov et al. (2004)
and Curto, Blanca, and Martínez (2008). As well as detecting sunspot groups,
Nguyen, Nguyen, and Nguyen (2005) and Colak and Qahwaji (2008) also make
automated classifications. The detection of active regions in magnetograms is
explored by McAteer et al. (2005), LaBonte, Georgoulis, and Rust (2007a),
Lefebvre and Rozelot (2004), and Qahwaji and Colak (2006), while Dudok de Wit (2006)
introduces a supervised segmentation of EUV images into active-region (AR),
coronal-hole (CH), and quiet-Sun (QS) regions.
The purpose of this article is to determine the robustness of four algorithms
for detecting and physically characterising active regions and sunspot groups
by comparison of their outputs. We determine overall detection performance,
the correlations between extracted feature properties using Principal Component
Analysis, and the usability of these algorithms for tracking feature evolution
over time. The tools that we consider are the Solar Monitor Active
Region Tracker (SMART: Higgins et al., 2011), which detects magnetic features
using magnetograms; the Automated Solar Activity Prediction code (ASAP:
Colak and Qahwaji, 2009), which detects sunspots and pores using photospheric
intensity images; the Sunspot Tracking And Recognition Algorithm (STARA:
Watson et al., 2009), which also detects sunspots in photospheric intensity images;
and the Spatial Possibilistic Clustering Algorithm (SPoCA: Barra et al.,
2009), which detects active regions in the corona using extreme ultraviolet images.
More detail on how these algorithms operate is provided in Section 3.
The overall performance of the algorithms is compared by determining the
total number of features detected as well as their full-disk area. Our meth-
ods are benchmarked against National Oceanic and Atmospheric Administra-
tion (NOAA) and Solar Influences Data Analysis Centre (SIDC) catalogues.
Few studies include comparisons of detection methods or different data types.
Benkhalil et al. (2006) compare the detection of active regions in several data
types using a single region-growing method. Other studies compare the detection
of features in magnetograms using a variety of region-growing and morphological
methods (DeForest et al., 2007; Parnell et al., 2009). Direct comparison of algo-
rithms is important for their characterisation, as each is designed in a different
way to detect features for a specific purpose.
Correlations between the properties determined by the algorithms are in-
vestigated using Principal Component Analysis (Jolliffe, 2002). PCA has been
used previously for various purposes in solar-physics and space-weather litera-
ture, e.g. to detect outliers (Sarro and Berihuete, 2011), to reduce dimension-
ality (Dudok de Wit and Auchère, 2007), or for exploratory data analysis of
space-weather data sets (Habash Krause, Franz, and Stevenson, 2011).
Finally, the stability of the algorithms is tested for tracking feature evolu-
tion through time. The evolution of two ARs is studied in detail, including
their emergence in several layers of the solar atmosphere. Lites et al. (1995)
present a similar multi-layered analysis of the emergence of an AR. As non-
potentiality increases in an AR, it may begin to exhibit enhanced coronal activ-
ity. This effect has been studied in many articles, and it is related to dynamic
behaviours such as helicity injection (Morita and McIntosh, 2005), turbulent
cascades (Hewett et al., 2008; Conlon et al., 2008, 2010), enhanced polarity sepa-
ration line gradient (Falconer, Moore, and Gary, 2008), and changes in magnetic
connectivity (Georgoulis and Rust, 2007; Ahmed et al., 2010). In this article
we study multiple behaviours in the same AR using magnetic property de-
terminations. Finally, the decay of the AR in the corona and photosphere is
compared. To our knowledge, this is the first time that automated feature-
detection algorithms have been used to study temporal evolution using properties
of magnetic non-potentiality, sunspot characteristics, and coronal activity of ARs
simultaneously.
The following sections detail these investigations. Observations used in this
study are described in Section 2, and the four algorithms to be compared are
introduced in Section 3. Our results are presented in Section 4, including an
evaluation of the algorithms’ overall performance, a correlation study of the
complete sample of active regions and a detailed case study of two different
active regions. Finally, a discussion of the results and concluding remarks are
presented in Section 5.
2. Observations
In this study we analyse data from the interval 12 May – 23 June 2003. The
detections obtained from each algorithm for the entire data set are studied as a
whole in Section 4.2 and NOAA ARs 10377 and 10365 are individually studied
in detail in Section 4.3. This particular data set was selected for the diversity of
solar features present. SOHO/MDI magnetograms are used for magnetic region
detection by SMART, while SOHO/MDI continuum images are used for sunspot
detection by ASAP and STARA, and SOHO/EIT images are employed for active
region detection by SPoCA. These algorithms are described in Section 3.
The MDI instrument on SOHO provides almost continuous observations of
the Sun in the white-light continuum, in the vicinity of the Ni i 676.78 nm
photospheric absorption line. These photospheric intensity images are primar-
ily used for sunspot observations. MDI data are available in several processed
“levels”. We used level-2 images, which are smoothed, filtered, and rotated
(Scherrer et al., 1995). SOHO provides two to four MDI photospheric intensity
images per day with continuous coverage since 1995.
Using the same instrument, level-1.8 line-of-sight (LOS) MDI magnetograms
are recorded with a nominal cadence of 96 minutes. The magnetograms show the
magnetic fields of the solar photosphere, with negative (represented as black) and
positive (as white) areas indicating opposite LOS magnetic-field orientations.
The Extreme ultraviolet Imaging Telescope (EIT: Delaboudinière et al., 1995), onboard
SOHO, delivers synoptic observations consisting of 1024 by 1024 pixel images
of the solar corona recorded in four different wavelengths every six hours. Every
SPoCA segmentation in this article was based on a pair of 17.1 and 19.5 nm
EIT images. All images used have been preprocessed using the standard eit_prep
procedure of the SolarSoftware library. A fixed-centre segmentation with
six classes was performed on the logarithms of the image pixel values. The AR
centre values are 401.74 and 324.25 DN s−1 in 17.1 and 19.5 nm, respectively.
These values were derived from a cumulative run of SPoCA on a data set of
monthly EIT image pairs from February 1997 until April 2005; see Barra et al.
(2009).
In the case studies presented in Section 4.3, we compare observations of NOAA
10365 and 10377 with flares characterised by the Reuven Ramaty High Energy
Solar Spectroscopic Imager (RHESSI) team and distributed in the RHESSI flare
list.
3. Methods
SMART, ASAP, and STARA detect photospheric features such as active regions
and sunspots while SPoCA uses images in the extreme ultraviolet to observe
active regions at the coronal level. In this section, we will describe each of the
feature-detection algorithms used (Sections 3.1, 3.2, 3.3, and 3.4). The outputs
from each of the algorithms are associated using the method explained in Section
3.5.
[Figure 1: full-disk map, X (arcsecs) versus Y (arcsecs), with detections labelled PL, UD, UE, and AR.]
Figure 1. An example set of SMART detections from 10 June 2003. PL, UD, and UE identify
three classes of unipolar feature, while AR denotes multipolar features.
The Solar Monitor Active Region Tracker (SMART: Higgins et al., 2011) is an
algorithm that uses magnetograms to automatically extract, characterise, and
track active regions over multiple solar rotations – from first emergence to decay.
This allows one to study the complete life-cycle of ARs. The algorithm uses a
combination of image-processing techniques to determine the boundary of an
AR. Two consecutive line-of-sight magnetograms are smoothed using a Gaussian
kernel with a full-width at half-maximum of five pixels and thresholded by 70 G
to identify potential features. The two detections are overlaid to identify and
remove features that are not present in both magnetograms. The remaining
detection boundaries are then dilated by ten pixels to create the final mask.
Dilation is performed to include nearby decaying and plage fragments that
may have separated from the main AR. This is intended to help conserve the
measured polarity balance of the AR as it evolves. An example set of SMART
detections is shown in Figure 1. In this article, SOHO/MDI LOS magnetograms
are used for detection, but recently the algorithm has been adapted for use with
SDO/HMI magnetograms (For near-realtime detections, see http://solarmonitor.org/smart disk).
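
The detection steps described above (smooth, threshold, keep what persists, dilate) can be sketched with NumPy alone. This is a minimal illustration and not the SMART code itself: the function names are hypothetical, the threshold is assumed to act on the absolute smoothed field, the overlap test is done pixel by pixel rather than feature by feature, no differential rotation is applied between the two magnetograms, and the dilation uses a flat disk.

```python
import numpy as np

FWHM_PX = 5.0        # smoothing kernel FWHM quoted in the text
THRESHOLD_G = 70.0   # threshold quoted in the text
DILATE_PX = 10       # dilation distance quoted in the text


def gaussian_smooth(image, fwhm):
    """Separable Gaussian smoothing with edge padding (NumPy only)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    half = int(np.ceil(3.0 * sigma))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(np.asarray(image, dtype=float), half, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def dilate(mask, radius):
    """Binary dilation by a disk of the given radius, built from array shifts."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy * dy + dx * dx <= radius * radius:
                out |= padded[radius + dy:radius + dy + h,
                              radius + dx:radius + dx + w]
    return out


def smart_like_mask(mag_a, mag_b):
    """Mask of pixels above threshold in BOTH consecutive magnetograms."""
    det_a = np.abs(gaussian_smooth(mag_a, FWHM_PX)) > THRESHOLD_G
    det_b = np.abs(gaussian_smooth(mag_b, FWHM_PX)) > THRESHOLD_G
    persistent = det_a & det_b  # assumption: pixel-wise overlap test
    return dilate(persistent, DILATE_PX)
```

A strong patch that appears in only one of the two magnetograms is dropped by the overlap test, while a patch present in both survives and is enlarged by the dilation.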
[Figure 2: two panels, X (arcsecs) versus Y (arcsecs).]
Figure 2. Left: NOAA 10365 highlighting the PSL (white contour), a linear fit to the locus
of PSL positions (dotted red line), bipolar connection line (solid red line), and heliographic
longitude and latitude reference lines (dotted black lines). Right: An Ising energy map of the
same region. Red represents the magnitude of energy for each pixel from highest (light) to
lowest (dark). Since the connection between pixels of opposite polarity is being represented,
the energy map is only shown for one of the polarities.
Several new physical-property modules have been added to SMART for this
study. The tilt of an AR is obtained by measuring the angle between the line
connecting the centroids of the largest flux-weighted positive and negative blobs
(solid red line in left panel of Figure 2) and the heliographic-latitude line passing
through the centroid of the AR. The length of this bipolar connecting line [BCL]
is also determined and provides a measure of the relative compactness of
an AR when compared to the evolution of AR area (left panel of Figure 2).
Additionally, the angle [α] detected between the best-fit line to the locus of
pixels forming the main polarity separation line [PSL] and the BCL is measured.
The main PSL is defined as the interface between the aforementioned largest
flux-weighted positive and negative blobs. The temporal derivative of the angle
α is shown to be a useful proxy for the occurrence of helicity injection in an
AR (Morita and McIntosh, 2005), which may be an important flare predictor
(LaBonte, Georgoulis, and Rust, 2007b). The evolution of this angle is studied
in Section 4.3. These properties are less informative when studying AR com-
plexes or non-bipolar ARs, since often no main axes can be discerned, making a
description of the AR orientation impossible.
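
As a rough illustration of the tilt and BCL-length measurements, the sketch below takes the centroids of the two main polarity blobs as (longitude, latitude) pairs in degrees. The function name is hypothetical, and the geometry is treated as flat in heliographic degrees (no cos(latitude) scaling of longitude), which is an assumption that is only reasonable for compact regions away from the poles.

```python
import math


def bipole_geometry(pos_centroid, neg_centroid):
    """Return (tilt_deg, bcl_length_deg) for two (lon, lat) centroids in degrees."""
    d_lon = neg_centroid[0] - pos_centroid[0]
    d_lat = neg_centroid[1] - pos_centroid[1]
    # Angle between the connecting line and a line of constant latitude,
    # folded into [-90, 90] because a line has no preferred direction.
    tilt = math.degrees(math.atan2(d_lat, d_lon))
    if tilt > 90.0:
        tilt -= 180.0
    elif tilt < -90.0:
        tilt += 180.0
    length = math.hypot(d_lon, d_lat)
    return tilt, length
```

Swapping the two centroids leaves both outputs unchanged, which matches the idea that the tilt describes the orientation of a line rather than a direction.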
The original Ising model is used for the analysis of magnetic interactions and
structures of ferromagnetic substances. Here, as by Ahmed et al. (2010), Ising
energy is used as a proxy for magnetic connectivity and complexity within an AR
(right panel of Figure 2). We use a modified form of that given by Ahmed et al.
(2010).
This version is calculated using
$$\sum_{i,j} \frac{B_i B_j}{D_{ij}}, \qquad (1)$$
where $B_i$ ($B_j$) are pixel values of positive (negative) line-of-sight magnetic field
and $D_{ij}$ is the spatial distance between pixels $i$ and $j$. The Ising energy increases
as the negative and positive magnetic footprints within an active region become
more entangled. The evolution of this property is studied in Section 4.3.
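
A direct evaluation of the sum in Equation (1) might look like the following sketch. The function name is hypothetical, distances are taken in pixel units, and no sign convention or normalisation is applied, so the raw sum is negative (positive times negative field) and it is its magnitude that grows as opposite polarities move closer together. Both choices are assumptions, not statements about the SMART module.

```python
import numpy as np


def ising_energy(b_los):
    """Sum of B_i * B_j / D_ij over all (positive pixel i, negative pixel j) pairs."""
    b = np.asarray(b_los, dtype=float)
    pos = np.argwhere(b > 0)   # (row, col) of positive-polarity pixels
    neg = np.argwhere(b < 0)   # (row, col) of negative-polarity pixels
    if len(pos) == 0 or len(neg) == 0:
        return 0.0
    b_pos = b[pos[:, 0], pos[:, 1]]
    b_neg = b[neg[:, 0], neg[:, 1]]
    # Pairwise pixel distances D_ij, shape (n_pos, n_neg)
    d = np.linalg.norm(pos[:, None, :] - neg[None, :, :], axis=-1)
    return float(np.sum(b_pos[:, None] * b_neg[None, :] / d))
```

The pairwise distance array scales with the product of the two pixel counts, so for large regions the sum would need to be evaluated in chunks.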
The modules added to SMART for this work will be used for future large-scale
active region studies and will be added to the pipe-line versions of SMART run-
ning at the Heliophysics Events Knowledgebase (http://www.lmsal.com/hek/index.html:
Hurlburt et al., 2010) and included in the Heliophysics Integrated Observatory
(http://www.helio-vo.eu/index.php: Bentley et al., 2009). In the future, other prop-
erty modules will be added to calculate a physically motivated magnetic-connectivity
measurement (Georgoulis and Rust, 2007), multi-scale energy spectrum slope
(Hewett et al., 2008), and multi-fractal spectrum properties (Conlon et al., 2008).
Automated Solar Activity Prediction (ASAP) is the collective name for a set of
algorithms used to process solar images. It is composed of algorithms for sunspot,
faculae, and active-region detections (Colak and Qahwaji, 2008) and solar-flare
prediction (Colak and Qahwaji, 2009). Unlike the other algorithms described in this
article, ASAP uses quick-look images (in GIF or JPEG format) for its processes.
In this article a recently developed sunspot detection algorithm for ASAP is
used. This new sunspot-detection algorithm works with continuum images and
is described in detail by Colak et al. (2011). The main steps in this algorithm
can be summarized as follows:
• Images are pre-processed to detect the solar disk, and to remove limb
darkening.
• Detected solar-disk data are converted from heliocentric coordinates to
Carrington heliographic coordinates.
– The key point in heliographic conversion is to choose the size of the
resulting image. If a very small image size is selected, this will cause
truncation and loss of data. If a very large one is chosen, there will be
many spaces in the resulting image. In this study, each heliographic
degree is represented by ten pixels; therefore, the resulting heliographic
images are 3600 by 1800 pixels.
– Initially, an empty 3600 by 1800 image is created. When the Car-
rington longitude and latitude of all of the pixels on the solar disk
are calculated, their pixel intensities are placed in corresponding lo-
cations.
– The distribution of pixels representing degrees in heliocentric coordinates
is not uniform because of the spherical shape of the Sun. Towards
the limb of the Sun in a two-dimensional heliocentric image, each
degree is represented by fewer pixels, whereas in a heliographic
image each degree is represented by the same number of pixels. The
heliographic image is larger than the heliocentric image, and therefore
there will be gaps (empty pixels that are not filled with data from
the heliocentric image) in the initial heliographic image after
conversion.
– The following algorithm is applied to estimate the pixel intensities of
the gaps in the heliographic image. Every pixel on the heliographic
image is examined and when a pixel without any information is found,
its neighbouring pixels are searched using variable-size windows. First
a 3 × 3 pixel window is centred on the empty pixel and the values of the
non-empty neighbouring pixels within this window are added. If all
the neighbouring pixels within the initial window are empty, the size
of the window is increased by one and the process continues until at
least one pixel with information is available within the window. Then
the average value of the valid pixels found within the final window is
assigned to the empty pixel. The algorithm continues until all of the
pixels have been processed.
– After all data gaps have been filled, a smoothing algorithm using a
3 × 3 linear uniform filter is applied to create the final heliographic
image.
• Subsequently, the following filter is applied to the Carrington heliographic
image for detecting sunspots:
– An intensity filtering threshold value T = µ − (σ × α) is calculated
where µ is the mean, σ is the standard deviation of the image, and α
is a constant equal to 2.5.
– The intensity of each pixel in the image is compared to this T value.
If it is less than the calculated threshold value, the pixel under con-
sideration is marked as a sunspot.
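
The gap-filling and thresholding steps listed above can be sketched as follows. This is an illustration under stated assumptions rather than the ASAP implementation: function names are hypothetical, empty pixels are represented by NaN, the window grows by one pixel on each side per step (3 × 3, 5 × 5, ...) so that it stays centred, and only original (not previously filled) values are averaged.

```python
import numpy as np


def fill_gaps(image):
    """Fill NaN gaps with the mean of valid pixels in a growing window."""
    img = np.asarray(image, dtype=float)
    filled = img.copy()
    n_rows, n_cols = img.shape
    for r, c in np.argwhere(np.isnan(img)):
        half = 1  # half-width 1 corresponds to a 3 x 3 window
        while True:
            window = img[max(r - half, 0):r + half + 1,
                         max(c - half, 0):c + half + 1]
            valid = window[~np.isnan(window)]
            if valid.size > 0:
                filled[r, c] = valid.mean()
                break
            if half > max(n_rows, n_cols):
                break  # the image contains no valid pixels at all
            half += 1
    return filled


def sunspot_mask(image, alpha=2.5):
    """Mark pixels darker than T = mean - alpha * std as sunspot candidates."""
    img = np.asarray(image, dtype=float)
    threshold = img.mean() - alpha * img.std()
    return img < threshold
```

Because the threshold is derived from the mean and standard deviation of the whole image, it adapts to the overall brightness of each frame.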
The Sunspot Tracking And Recognition Algorithm (STARA) code was written
in 2008 in order to perform consistent long-term observations of sunspots over
Solar Cycle 23 (Watson et al., 2009). It was originally developed for use with
MDI data but has since been extended for use with data from SDO as well as
from a number of ground-based instruments. A simple detection method was
required to speed up processing when large data sets were used and suitable
techniques were found in the field of morphological image processing.
The STARA detection method works as follows: The image is read in and
inverted so that the sunspots appear as bright areas on a dark background
for compatibility with the top-hat transform. Morphological erosion is applied,
which removes peaks and works by treating the 2D image as a 3D surface with
the pixel values indicating the height of the surface at that point. A probe known
as a structuring element is then chosen (in this case, a sphere with a radius of
14 MDI pixels) and is “rolled” underneath the surface whilst always touching
it. The centre of the sphere then maps out a new surface that is close to 14
units below the original. However, any sharp peaks present (such as sunspots)
would not allow the sphere to fit inside them and so are not represented in the
eroded image. A morphological dilation is then performed which is identical to
an erosion apart from the sphere being rolled on top of the surface. The dilated
surface is subtracted from the original to leave only the sunspot peaks present.
As the sphere rolls over the surface it also carries the limb-darkening profile
through each step and when the final surface is subtracted from the original, the
limb-darkening effects are automatically removed.
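
The erosion, dilation, and subtraction described above amount to a top-hat transform of the inverted image. The sketch below is one way to write it with NumPy and a ball-shaped (non-flat) structuring element. The function names are hypothetical, and the relative scaling between pixel distance and intensity units in the ball is an assumption. It is not the STARA code.

```python
import numpy as np


def ball(radius):
    """Offsets and heights of a non-flat, ball-shaped structuring element."""
    r = int(radius)
    dy, dx = np.mgrid[-r:r + 1, -r:r + 1]
    inside = dy ** 2 + dx ** 2 <= r ** 2
    height = np.sqrt(r ** 2 - dy[inside] ** 2 - dx[inside] ** 2)
    return dy[inside], dx[inside], height


def grey_morph(img, radius, sign):
    """Grey erosion (sign=-1) or dilation (sign=+1) with the ball element."""
    r = int(radius)
    h, w = img.shape
    fill = np.inf if sign < 0 else -np.inf   # padding never wins the min/max
    padded = np.pad(img, r, mode="constant", constant_values=fill)
    result = np.full(img.shape, fill)
    for dy, dx, s in zip(*ball(radius)):
        shifted = padded[r + dy:r + dy + h, r + dx:r + dx + w] + sign * s
        if sign < 0:
            result = np.minimum(result, shifted)
        else:
            result = np.maximum(result, shifted)
    return result


def top_hat_spots(continuum, radius=14):
    """Residual of the inverted image after a morphological opening."""
    img = np.asarray(continuum, dtype=float)
    inverted = img.max() - img                 # sunspots become bright peaks
    opened = grey_morph(grey_morph(inverted, radius, -1), radius, +1)
    return inverted - opened                   # only narrow peaks survive
```

Structures broader than the ball are reproduced by the opening and cancel in the subtraction, while peaks narrower than the ball remain in the residual.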
More detail on this step and the top-hat transform is given by Watson et al.
(2009) and Dougherty and Lotufo (2003). A size filter is then applied that removes
areas containing fewer than ten pixels as they are far more likely to be pores
than sunspots. The remaining areas are recorded along with their locations as
well as the number of umbral regions detected within the sunspot boundary. This
is repeated for a number of consecutive images and using the solar-rotation model
of Howard, Harvey, and Forgach (1990) the sunspots can be tracked throughout
a sequential data set. This allows the evolution of individual sunspots to be
followed as well as the overall properties of the sunspot population as a whole.
An example of a typical set of STARA sunspot detections is given in Figure 4.
STARA has undergone very few changes over the course of this work as
the code was well established beforehand. Nevertheless, some subtle problems
have been discovered in the process. As sunspots approach the limb (at lon-
gitudes greater than 75◦ ) the sunspot position returned by the code quickly
loses accuracy. This is a common problem with feature-detection methods as the
geometrical foreshortening effects test the limits of automated systems. There
are also potential problems present with bad data. Obviously the best remedy
is to remove it altogether but with MDI it is possible to have images with only
half of the solar disk present or with large artifacts. Both of these situations can
have substantial effects on the detected global properties and cause problems
with analysis.
of these regions are not always well defined. The description of the segmentation
process in terms of fuzzy logic was motivated by the facts that information
provided by a solar EUV image is noisy (corruption by Poisson and readout
noise as well as by cosmic-ray hits) and subject to both observational biases (line-
of-sight integration of a transparent volume) and interpretation (the apparent
boundary between regions is a matter of convention).
SPoCA takes as input an image in one (or several) EUV passband(s) and
uses as “feature vector” the pixel value (or the pixel value vector in the multi-
channel case) in order to classify a pixel as belonging to one of three classes,
namely AR, QS, and CH. SPoCA is based on a fuzzy clustering technique called
“Possibilistic C-Means” (PCM: Krishnapuram and Keller, 1993, 1996). For each
class, it assigns a “probability” or membership value ∈ [0, 1] to every feature
vector.
PCM is an iterative method that searches for three compact clusters in the
space of feature vectors, corresponding to AR, QS and CH. In practice, this is
achieved through a gradient-descent scheme that minimizes an objective function
that is related to the total intracluster variance plus some penalty term. In every
iteration, new membership values are calculated based on the class centre values.
The membership values are used in turn to compute the new class centres, and
so on, until the class centres converge to within a preset accuracy.
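
The alternation between membership and centre updates can be illustrated with the textbook PCM update rules. This sketch omits the spatial regularisation and the penalty-term constraints that SPoCA adds, and the names `pcm`, `memberships`, `eta` (the per-class scale of the penalty term), and `m` (the fuzzifier) are illustrative choices rather than SPoCA parameters.

```python
import numpy as np


def memberships(x, centres, eta, m):
    """PCM membership of every feature vector in every class, in [0, 1]."""
    d2 = ((x[None, :, :] - centres[:, None, :]) ** 2).sum(axis=-1)
    return 1.0 / (1.0 + (d2 / eta[:, None]) ** (1.0 / (m - 1.0)))


def pcm(features, centres, eta, m=2.0, tol=1e-4, max_iter=100):
    """Alternate membership and centre updates until the centres settle."""
    x = np.asarray(features, dtype=float)   # shape (n_pixels, n_channels)
    c = np.asarray(centres, dtype=float)    # shape (n_classes, n_channels)
    eta = np.asarray(eta, dtype=float)      # shape (n_classes,)
    for _ in range(max_iter):
        w = memberships(x, c, eta, m) ** m
        new_c = (w @ x) / w.sum(axis=1, keepdims=True)
        converged = np.abs(new_c - c).max() < tol
        c = new_c
        if converged:
            break
    u = memberships(x, c, eta, m)
    labels = u.argmax(axis=0)               # class with the largest membership
    return c, u, labels
```

Unlike probabilistic fuzzy clustering, the memberships of a pixel are not forced to sum to one across classes, so an outlier can have a low membership in every class.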
In order to cope successfully with intensity outlier pixels such as those affected
by cosmic rays and proton storms, a spatial regularization term was added to
the PCM objective function, forcing membership values in a neighbourhood to
be as close as possible. By assigning each pixel to the class for which its feature
vector has the largest membership value, the image is segmented. An example
set of SPoCA detections is shown in Figure 5.
Since the solar corona is optically thin, and since the intensity in EUV
images is obtained through an integration along the line of sight, there is a
limb-brightening effect in those images, which may hinder the segmentation pro-
cess. Therefore, the EUV images are pre-processed so as to lower the enhanced
brightness near the limb. The initial SPoCA class contours are automatically
postprocessed using a morphological opening with a circular isotropic element
of size unity.
Since the publication of Barra et al. (2009), the SPoCA algorithm has been optimized
and extended in several ways:
• In order to gain more consistent results, we introduced some constraints on
the penalty term of the objective function to be minimized.
• The limb correction is now applied in a continuously increasing way towards
the limb instead of introducing it abruptly from some point onwards.
• For individual AR detection, first the Bright Points are removed (size
threshold is 1500 square arcseconds) and then a spherical dilation (ra-
dius: 12 EIT pixels) is employed to group the remaining bright blobs into
individual active regions.
• Individual ARs are tracked through time by comparing the masks of regions
in two consecutive time frames, taking into account differential rotation.
SPoCA has been running in near-realtime on AIA data since September 2010
as part of the SDO Feature Finding Project (Martens et al., 2011), a suite of
software pipeline modules for automated feature recognition and analysis for the
imagery from SDO. The resulting AR events are automatically ingested by the
Heliophysics Events Knowledgebase (Hurlburt et al., 2010).
SPoCA is the only algorithm presented here that detects ARs in the solar
corona. The method is generic enough to allow the introduction of other channels
or data. It has been applied to SOHO/EIT, SDO/AIA, PROBA2/SWAP, and
STEREO/EUVI images, and could potentially be used on other multi-channel
maps such as Differential Emission maps. In this article we focus on ARs, but
QS and CH can also be detected and tracked.
The SMART tracking module, called “Multiple Disk Passage” (MuDPie: Higgins et al.,
2011), is used to associate individual SMART detections of the same physical
feature over time by comparing the centroids of all detections in consecutive
magnetograms. Two detections are associated if their centroids match within
5◦ heliographic latitude and longitude. The tracked SMART detections are then
associated with the best matched detections in each of the other algorithms as
described in the following paragraphs.
In order to analyse the relation between the features detected by different
algorithms, a routine developed in Python associates detections from each algorithm
in two ways. First, outputs from ASAP, STARA, and SPoCA are associated
with SMART outputs based on time and location information. Second, individual
association outputs (ASAP vs. SMART, STARA vs. SMART, SPoCA vs. SMART)
from the first step are combined using SMART IDs and timing information.
For associations, SMART is chosen as the base algorithm because SMART
detections usually encircle the corresponding ASAP and STARA detections and
they are also more stable over time than SPoCA detections, due to the frequent
splitting and merging of coronal ARs detected by SPoCA. Also, SMART detects
magnetic regions from MDI magnetograms, which are more frequently available than
the continuum and EIT images that the other algorithms are working on. The
association rules are described below.
• The time difference between the solar detections under consideration (i.e.,
sunspots from ASAP and STARA, active regions from SPoCA versus mag-
netic regions from SMART) is calculated.
• If the time difference between a magnetic region detected by SMART and
a solar region detected by another algorithm is less than 0.2 Julian days
and their heliographic bounding boxes intersect, then these detections are
associated. Since SPoCA does not deliver heliographic bounding boxes, a
bounding box of 5◦ in longitude and latitude is assumed.
• If the same solar detection is associated to more than one SMART region,
only the closest (in terms of time and distance between centres) SMART
region is selected as associated.
• Associations are saved in separate files (three files; ASAP vs. SMART,
STARA vs. SMART, SPoCA vs. SMART) including the selected characteris-
tics from each algorithm.
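
The first-step rules above can be sketched as follows. This is not the routine used in the study: the `Detection` class and function names are hypothetical, "closest" is assumed to mean smallest time difference with centre distance as the tie-breaker, the assumed SPoCA box is taken to be 5° wide in total and centred on the detection, and longitude wrap-around is ignored.

```python
from dataclasses import dataclass

MAX_DT_DAYS = 0.2     # time tolerance quoted in the text
SPOCA_BOX_DEG = 5.0   # assumed box size for SPoCA detections


@dataclass
class Detection:
    ident: str
    jd: float        # observation time in Julian days
    lon_min: float   # heliographic bounding box, degrees
    lon_max: float
    lat_min: float
    lat_max: float

    @property
    def centre(self):
        return (0.5 * (self.lon_min + self.lon_max),
                0.5 * (self.lat_min + self.lat_max))


def box_from_centre(ident, jd, lon, lat, size=SPOCA_BOX_DEG):
    """Build a detection with an assumed square box around a centre."""
    half = 0.5 * size
    return Detection(ident, jd, lon - half, lon + half, lat - half, lat + half)


def boxes_intersect(a, b):
    return (a.lon_min <= b.lon_max and b.lon_min <= a.lon_max and
            a.lat_min <= b.lat_max and b.lat_min <= a.lat_max)


def associate(other, smart_regions):
    """Return the single SMART region associated with `other`, or None."""
    candidates = [s for s in smart_regions
                  if abs(s.jd - other.jd) < MAX_DT_DAYS
                  and boxes_intersect(s, other)]
    if not candidates:
        return None

    def closeness(s):
        d_lon = s.centre[0] - other.centre[0]
        d_lat = s.centre[1] - other.centre[1]
        return (abs(s.jd - other.jd), (d_lon ** 2 + d_lat ** 2) ** 0.5)

    return min(candidates, key=closeness)
```

Returning at most one SMART region per detection mirrors the rule that a detection matching several SMART regions is assigned only to the closest one.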
output that is going to be analysed.
• The SMART algorithm uses an ID for each magnetic region detected and
in this second step, the association data saved in the three separate files
from the first step are combined using this ID and time information. The
association data with same SMART ID and closest timing are combined
together. Timing information still has to be used due to the difference
between the image times.
• The final combined data are saved in one file.
SMART provided 9356 detections (207 magnetic region features), ASAP 3039
detections (952 sunspot features), STARA 1329 detections (433 sunspot features)
and SPoCA 1222 detections (190 coronal active-region features) within the con-
sidered time-frame. In the first step, 714 SMART detections were associated to
2889 ASAP detections, 550 SMART detections were associated to 1315 STARA
detections and 1089 SMART detections were associated to 1117 SPoCA detec-
tions. In the second step when all of these data were combined, 350 detection (33
feature) associations were created for SMART, ASAP, STARA and SPoCA. The
daily averages of some of the outputs such as average daily sunspot numbers,
active region numbers and average areas are compared to the NOAA active
region catalogue in Section 4.1. In the considered period, NOAA recorded 217
detections (37 features).
In case of merging or splitting of neighbouring coronal regions as detected
by SPoCA, the association procedure described above does not relate the new
SPoCA detection to the corresponding SMART detection. This happened several
times in the case studies in Section 4.3. For these cases, we applied a manual
association of SPoCA detections to SMART detections.
4. Results
The feature detections from each algorithm are compared in the following sec-
tions. First, in Section 4.1 the overall detection performance of the algorithms is
presented, and compared to the corresponding NOAA detections and the daily
international sunspot number. Next, Principal Component Analysis is performed
on the full set of detections to probe the overall structure of the physical prop-
erties calculated by the algorithms in Section 4.2. Finally, in Section 4.3 the
evolution and flare activity of NOAA 10377 and 10365 are analysed in depth,
using physical properties determined by each algorithm.
The performance of the algorithms is measured by comparing the daily total and
average values of some of the solar feature properties to each other, to values
reported by NOAA (http://www.swpc.noaa.gov/ftpdir/forecasts/SRS/README),
and to the international daily sunspot numbers (SIDC-Team, 2003) between 12
May and 23 June 2003.
A comparison of these data is provided in Figure 6. The graph on the up-
per left side of Figure 6 compares the daily number of sunspots detected by
ASAP and STARA to the total number of spots within NOAA regions and to
the international sunspot number. Generally, peaks and valleys in all of the
series follow each other but the international sunspot numbers and the sunspot
numbers for NOAA are usually higher than the sunspot numbers for ASAP and
STARA. When sunspots are detected manually, each umbra within a penumbra is
counted as one sunspot, whereas the automated algorithms discussed here count
each penumbra as one sunspot although it could have more than one umbra
within. Therefore the difference in sunspot numbers increases when the number
of complex sunspot regions increases. Also, the number of sunspots detected by
ASAP is always higher than the ones detected by STARA. This is because ASAP
tends to detect very small sunspots (sometimes pores) while STARA has a higher
threshold for the size of sunspot candidates.
[Figure 6 panels: # of Spots, # of Regions, and Area [Millionths of Solar Hemisphere] versus Time, 15 May to 24 June.]
Figure 6. Comparison of average detection results of algorithms to reported NOAA and
international sunspot numbers. Upper-left: Comparison of number of sunspots detected by ASAP
and STARA and reported by NOAA and recorded international sunspot numbers. Upper-right:
Number of regions detected by SMART and SPoCA compared with the ones reported by NOAA.
Lower-left: Comparison of average daily sunspot areas detected by ASAP and STARA versus
NOAA. The normalised international sunspot number is over-plotted for context. Lower-right:
Comparison of average daily region areas detected by SMART and SPoCA.
The graph on the upper right side of Figure 6 compares the daily number of
regions detected by SMART and SPoCA to the daily number of NOAA regions.
SMART and SPoCA detect more regions than NOAA because the NOAA number
is given to a region only if it has one or more sunspots, while SMART and SPoCA
regions do not depend on the existence of sunspots within detected boundaries.
Because of the projection effects of large coronal loops, two close but distinct
regions in the photosphere will often be detected by SPoCA as one region. This
explains why SMART has a higher tally of daily regions than SPoCA. This effect
is most visible near the solar limb.
A comparison of the areas of ARs and sunspots as detected by the four algo-
rithms and NOAA is presented in the lower part of Figure 6. SMART, ASAP, and
STARA areas were corrected for the line-of-sight projection effect that decreases
the observed area as the feature moves away from the central meridian. Since the
line-of-sight projected area of coronal loops does not necessarily decrease with
longitude, no systematic effect is expected for the observed SPoCA area, so the
raw area is presented. Sunspot areas are given in millionths of solar hemisphere
to be consistent with the units of the NOAA catalogue.
The graph on the lower left side of Figure 6 compares the sunspot areas
detected daily by ASAP and STARA to the NOAA sunspot areas, while the
international sunspot number is added for context. These three time series agree
well but there appears to be a one-day shift in NOAA sunspot areas. The ASAP
and STARA sunspot measurements are averages of observations throughout the
whole day, whereas, depending on the day, the NOAA sunspot observation may
be quite early in the day. Since sunspots emerge quickly and decay slowly, any
sunspots that emerge late in a day are likely to be missed by NOAA, but
registered by ASAP and STARA. The following day, the new sunspots are likely
to remain visible, so ASAP and STARA are likely to show the same area as the
previous day, while the NOAA area will increase.
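One way to quantify such a shift between two daily time series is to look for the lag that maximises their correlation. The sketch below is illustrative only and is not part of the analysis in this article; the function name `best_lag` and the sample values are assumptions.

```python
import numpy as np

def best_lag(reference, series, max_lag=3):
    """Return (lag, r): the lag in samples at which `series` correlates best
    with `reference`. A positive lag means `series` trails `reference`."""
    reference = np.asarray(reference, dtype=float)
    series = np.asarray(series, dtype=float)
    best, best_r = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = reference[:len(reference) - lag], series[lag:]
        else:
            a, b = reference[-lag:], series[:len(series) + lag]
        r = np.corrcoef(a, b)[0, 1]  # Pearson correlation of the overlap
        if r > best_r:
            best, best_r = lag, r
    return best, best_r

# Hypothetical daily areas: `snapshot` repeats `daily_mean` one day later.
daily_mean = [0, 0, 1, 3, 6, 8, 7, 5, 3, 2, 1, 0]
snapshot = [0] + daily_mean[:-1]
lag, r = best_lag(daily_mean, snapshot)
```

Applied to the real ASAP, STARA, and NOAA series, a positive lag for NOAA would be consistent with the one-day shift discussed above.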
The graph on the lower right side of Figure 6 shows the comparison between
active region areas detected daily by SMART and by SPoCA. Considering we are
dealing here with photospheric versus coronal areas, a good agreement between
the areas is obtained. Both SMART and SPoCA areas vary smoothly. Moreover,
they are large enough to include the whole sunspot group if present, and to
measure changes in topology or complexity consistently. In summary, the time
series of sunspot and AR areas are well correlated, showing similar behaviour
in time, and the differences observed are likely due to the data and detection
methods used.
Principal Component Analysis (PCA: Jolliffe, 2002) aims at reducing the dimen-
sionality of a problem. It does so by retaining as much of the variance in the
data as possible in a small number of principal components. More precisely, for
a data set containing n observations of p variables, the principal components
are the orthogonal directions in p-dimensional variable space along which the
data set exhibits maximal variance.
In this article, PCA (based on linear values) is used to gain some insight into the
correlation structure of the following p variables: the Schrijver R value (Schrijver,
2007), length of the strong gradient line, magnetic flux, maximum B field, area,
length of the bipole connecting line, Ising energy, and Ising energy per pixel (Ising
E ppx) as computed by the SMART algorithm, the sunspot area and number of
sunspots as given by ASAP, and the raw AR area, maximum, variance, kurtosis
and skewness of the EUV intensity as computed by SPoCA.
These variables were computed on data recorded 12 May – 23 June 2003, at
a cadence of 96 minutes for photospheric features, and of six hours for coronal
features. Data from the various algorithms were then associated as described in
Section 3.5.
We excluded data points corresponding to regions whose centre was more
than 60° from the central meridian, as the projection errors involved become too
large. Table 1 lists the percentage and cumulative percentage of the variance
explained by the principal components. The first two components explain 67%
of the total variability in the data set.
Figure 7 represents the variables in the plane of the first two components.
Each variable lies within a circle of radius one in this figure. Variables that
lie close to the circle are well represented by the first two components, while
variables close to the origin are not. The cosine of the angle formed by the origin
and two points on the graph of Figure 7 gives the correlation between the two
corresponding variables. This figure thus yields a graphical representation of the
correlation structure between variables.
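The construction of such a correlation circle can be sketched as follows. This is a minimal illustration assuming PCA on standardised variables (that is, on the correlation matrix); the function name `pca_correlation_circle` and the synthetic data are hypothetical and are not the variables analysed here. Note that the cosine rule is exact only when all components are used; in the plane of the first two components it is a good approximation for variables lying close to the unit circle.

```python
import numpy as np

def pca_correlation_circle(X):
    """PCA on the correlation matrix of X (n observations x p variables).

    Returns the fraction of variance explained by each component and the
    loadings (p x p): entry [j, k] is the correlation between variable j and
    principal component k, i.e. the coordinates plotted in the circle.
    """
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    # Clip tiny negative eigenvalues caused by round-off before the sqrt.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return explained, loadings

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] += 2.0 * X[:, 0]                       # make two variables correlated
explained, loadings = pca_correlation_circle(X)
circle_xy = loadings[:, :2]                    # points in the PC1-PC2 plane
```

Each row of `circle_xy` has length at most one, and the cumulative sum of `explained` gives the kind of cumulative percentage reported in Table 1.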
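[Figure 7 plot: variables projected onto the first two principal components;
horizontal axis Principal Component 1 (−1.0 to 1.0). Surviving point labels:
kurt I, skew I, max I, var I.]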
Figure 7. The projections of the algorithm variables upon the first and second principal
components are plotted. They provide a measure of the extent to which these variables are
correlated with the first and second principal components.
This study shows that a reduction in dimensionality using PCA can be per-
formed without losing too much information. Such reduction can enhance the
accuracy and robustness of a subsequent classification scheme (Jiang, 2011) that
would aim for example at separating active regions that are prone to flares from
quiet active regions.
In the following section we analyse the time evolution of the ARs that emerge as
NOAA 10377 (a simple region) and 10365 (a complex, flaring region). Of special
interest is how activity in the corona results from changes in the photosphere.
Drawing this connection is essential for flare prediction, since the photosphere is
more easily physically characterised than the corona, where flares actually occur.
The photosphere–corona connection is not well understood (see, e.g., Leka and
Barnes, 2007; Handy and Schrijver, 2001, and references therein).
We compare observations of NOAA 10365 and 10377 with flares characterised
by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) team
and distributed in the RHESSI flare list (http://sprg.ssl.berkeley.edu/~jimm/hessi/hsi_flare_list.html).
The flares, which have been associated with the individual ARs by the RHESSI
team, are represented in plots (Section 4.3) as downward-pointing arrows, whose
size is proportional to the logarithm of their peak count rate.
NOAA 10377 first emerges just before rotating onto the visible disk on 4 June
2003. It continues to gradually develop as it progresses across the disk, producing
very little activity (only one B9.1 event is listed in the NOAA events catalogue;
http://www.swpc.noaa.gov/ftpdir/indices/events/README). Some of the flares
produced by 10377 may have been missed due to the presence of 10375, which
produced many large flares, swamping any signal that could be attributed to
10377.
[Figure 8 plot: two panels with axes X (arcsecs) and Y (arcsecs); caption not
recovered.]
Figure 8 shows the SMART detection of 10377 in red, while other features are
outlined in black. The extended dashed blue contours are SPoCA AR detections
and the small symbols and contours are sunspot detections from ASAP and
STARA, respectively. It is clear from Figure 8 that positions of the SMART,
ASAP, STARA, and SPoCA detections agree quite well. Whereas the sunspots
detected by ASAP and STARA are well confined within the SMART magnetic
region boundary, the SPoCA region most often contains most of the SMART
detection. In the case of coronal loop structures forming between nearby ARs,
adjacent SPoCA detections will merge and the SMART and SPoCA centroids will
diverge. This is especially apparent near the solar limb, where coronal structures
extending above the solar surface will be superimposed.
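The centroid divergence caused by merged coronal detections can be illustrated with two binary masks. This is a toy sketch in pixel coordinates, not the comparison code used for Figure 8; the mask shapes and the helper names `mask_centroid`, `centroid_offset`, and `contained_fraction` are assumptions.

```python
import numpy as np

def mask_centroid(mask):
    """Unweighted centroid (x, y) of the True pixels of a 2-D mask."""
    ys, xs = np.nonzero(mask)
    return xs.mean(), ys.mean()

def centroid_offset(mask_a, mask_b):
    """Distance in pixels between the centroids of two masks."""
    (xa, ya), (xb, yb) = mask_centroid(mask_a), mask_centroid(mask_b)
    return float(np.hypot(xa - xb, ya - yb))

def contained_fraction(inner, outer):
    """Fraction of the pixels of `inner` that also lie inside `outer`."""
    return float((inner & outer).sum() / inner.sum())

# Hypothetical magnetic region (3 x 3 pixels).
magnetic = np.zeros((10, 20), dtype=bool)
magnetic[4:7, 2:5] = True

# Hypothetical coronal detection that has merged with a neighbouring region.
coronal = magnetic.copy()
coronal[4:7, 12:15] = True

offset = centroid_offset(magnetic, coronal)
fraction = contained_fraction(magnetic, coronal)
```

Here the coronal mask still contains the whole magnetic region, yet its centroid is displaced by five pixels, which mirrors the behaviour described for merged SPoCA detections.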
Figures 9 – 11 show the evolution of NOAA 10377 as it progresses across
the disk. In the top panel of Figure 9 the Stonyhurst longitudes of the re-
gion centroids from each algorithm are shown. The vertical dotted lines indi-
cate where the AR magnetic bounding box edges (dashed–dotted) and centroid
(dashed) cross −60° and 60° longitude. The cosine correction used to correct for
line-of-sight effects on magnetic-field properties is not sufficient outside of this
range. Also, beyond 60°, sunspot visibility is below ≈ 1/3 of that at disk centre
(Watson et al., 2009) due to the Wilson depression.
The top panel of Figure 9 tracks the longitude of centroids over time. We
see that ASAP and STARA curves are above the SMART curve on this plot,
suggesting that the centroid of the magnetic footpoints (SMART) follows behind
the sunspot centroids (ASAP and STARA). Since the longitudinal speeds of the
white-light and magnetic detections are the same, this implies that the following
polarity of the AR extends beyond the embedded sunspots, while the leading
polarity remains compact. As NOAA region 10377 is close to 10375, the
latter affects the SPoCA detections. From 3 – 6 June, SPoCA detects both
NOAA regions within a single boundary. When this region splits into two parts
on 6 June, the SPoCA longitude and area curves decrease abruptly, and can now
be directly compared to the photospheric structures. This changes when the two
NOAA regions merge again on 11 June. Whenever the region detected by SPoCA
corresponds to the region detected by the other algorithms, all four longitudes
agree well.
[Figure 9 plot: three stacked panels against Time (04-Jun to 16-Jun): Longitude
[deg]; Area (SMART, SPoCA/3) [Mm²]; # Spots. Legend: SMART, SPoCA, ASAP, STARA.]
Figure 9. Time series of position, area, and sunspot information characterising the evolution
of NOAA AR 10377. The legend indicates symbols and colors for each of the detection algo-
rithms. The axes of the area plot are split between left (SPoCA and SMART) and right (ASAP
and STARA). The SPoCA areas have been divided by three for display.
The total sunspot areas determined by ASAP and STARA (Figure 9, middle)
are very similar except for one data point near 12 June 2003. This is due to the
MDI image on 11 June 2003 at 1736 UT being distorted. Most of the distortion
is visible on the south limb of the image where this area is darker than the
rest of the solar disk. Because ASAP detects the solar disk directly from the
image, while STARA uses FITS keywords, the determination of the solar disk
by these two methods is different. This explains why on this image the ASAP
sunspot area is much smaller than the STARA area: whereas the distorted area is
detected by STARA as a large sunspot, it is completely discarded by ASAP. The
SMART and SPoCA areas of photospheric magnetic regions and coronal active
regions obey the same general trend as the sunspot areas, although the absolute
scales are different. While the area measurements are stable, the total number of
sunspots is not. The total area is dominated by the largest sunspots, while the
total number of spots is affected by small transients, to which ASAP is especially
sensitive.
[Figure 10 plot: two panels against Time (04-Jun to 16-Jun); axis labels not
recovered.]
Figure 10. Time series of (top) total magnetic flux, total EUV intensity, (bottom) maximum
magnetic field, and maximum EUV intensity for NOAA AR 10377. The axes of the plots are
split between left (magnetic-field properties, black crosses) and right (coronal properties, blue
squares). RHESSI flares associated with the AR are indicated by downward arrows.
In the top panel of Figure 10, the emergence of the magnetic structure of
10377 is clearly seen in measurements of its total flux. The AR is stable until
≈ 8 June 2003 when a phase of rapid emergence begins, lasting until ≈ 11 June
when the total magnetic flux has more than doubled. Comparing Figure 9 and
Figure 10, we see that the total magnetic flux increases faster than the magnetic
area, implying that the AR magnetic fields emerge relatively faster than they
diffuse. The same general smooth trend is observed in the SPoCA total EUV
intensity between 6 and 11 June. After NOAA 10377 merges with 10375 on 11
June, we see a clear decay of the total EUV intensity in this combined region.
Note that both SMART flux and SPoCA total EUV intensity behave similarly
to the region area time series.
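The argument that flux rising faster than area implies emergence outpacing diffusion can be made concrete through the mean flux density. The numbers below are hypothetical and chosen only to illustrate the reasoning (they are not read from Figures 9 and 10); the helper `mean_flux_density` is likewise an assumption.

```python
MM2_TO_CM2 = 1.0e16  # 1 Mm = 1e8 cm, so 1 Mm^2 = 1e16 cm^2

def mean_flux_density(flux_mx, area_mm2):
    """Mean unsigned flux density in gauss (Mx per cm^2)."""
    return flux_mx / (area_mm2 * MM2_TO_CM2)

# Hypothetical values, for illustration only.
flux_start, flux_end = 1.0e22, 2.2e22    # total flux [Mx]
area_start, area_end = 1.0e4, 1.5e4      # magnetic area [Mm^2]

flux_growth = flux_end / flux_start      # factor by which the flux grows
area_growth = area_end / area_start      # factor by which the area grows
density_start = mean_flux_density(flux_start, area_start)
density_end = mean_flux_density(flux_end, area_end)
```

Whenever `flux_growth` exceeds `area_growth`, the mean flux density increases, meaning that new flux is being added faster than the existing fields spread out.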
In the bottom panel the maximum magnetic field is much less stable than the
flux, and shows no clear trend. The maximum SPoCA EUV intensity does not
change significantly between 6 and 11 June for NOAA region 10377, but exhibits
three clear peaks afterwards which can be attributed to region 10375. The first
peak, on 11 June, can be attributed to SPoCA merging with NOAA 10375. The
peak on 12 June, near 1300 UT, is probably associated with the M1.0 flare in 10375
at around 1358 UT, whereas the 13 June 0700 UT peak is probably related to
the M1.8 (0628 UT) or C6.1 (0710 UT) flares in 10375. These flares (appearing
in the NOAA events catalogue) are not indicated by the RHESSI arrows, since
we have only displayed those flares attributed to 10377. This shows that SPoCA
maximum intensity is capable of indicating solar eruptions.
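The association between intensity peaks and catalogued flares made above can be sketched as a simple time-window match. The flare classes and times are those quoted in the text; the three-hour tolerance (half of the six-hour coronal cadence) and the function name `associate` are assumptions made for this illustration.

```python
from datetime import datetime, timedelta

def associate(peaks, flares, tolerance=timedelta(hours=3)):
    """Map each peak time to the names of flares within `tolerance` of it."""
    return {
        peak: [name for name, time in flares if abs(time - peak) <= tolerance]
        for peak in peaks
    }

# Peak times of the SPoCA maximum EUV intensity, as quoted in the text.
peaks = [datetime(2003, 6, 12, 13, 0), datetime(2003, 6, 13, 7, 0)]

# Flares in NOAA 10375 from the NOAA events catalogue, as quoted in the text.
flares = [
    ("M1.0", datetime(2003, 6, 12, 13, 58)),
    ("M1.8", datetime(2003, 6, 13, 6, 28)),
    ("C6.1", datetime(2003, 6, 13, 7, 10)),
]

matches = associate(peaks, flares)
```

With a coarse image cadence, several flares can fall within the window of a single peak, which is why the second peak cannot be attributed unambiguously to one event.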
Magnetic properties related to polarity mixing and complexity are shown in
Figure 11. In the top panel, the angle between the bipole connection line and
polarity separation line (PSL) is presented. Since the PSL in this AR is only
a few megameters (or pixels) long (cf. middle-top panel), this angle cannot be
measured in a reliable way. Indeed, a small growth in the PSL detection in any
direction can cause the angle to change dramatically. In the middle-bottom panel
the total flux near the PSL [R] is very small until it begins to increase as a false
PSL is detected due to the near-horizontal fields of the large leading polarity
sunspot approaching the west limb on 12 June.
[Figure 11 plot: stacked panels against Time (04-Jun to 16-Jun); surviving axis
labels: R [Mx]; Ising Energy (Energy, (Energy/Px)*1E3).]
Figure 11. Time series of (top) PSL orientation with respect to the bipole separation line,
(middle-top) bipole separation line length (crosses) and PSL length (dashed), (middle-bottom)
R, and (bottom) Ising energy (crosses) and Ising energy per pixel (dashed; multiplied by 1000
for display) for NOAA AR 10377.
Ising energy, a proxy for magnetic connectivity, is shown in the bottom panel.
This property increases during the main magnetic emergence phase (≈ 8 to 10
June 2003) since it is dependent on the magnetic-field strength and inversely
dependent on the distance between individual magnetic elements. The Ising
energy per pixel (dashed line) appears to be very susceptible to geometrical
effects as the large decrease near the west limb and increase near the east limb
both coincide with the formation of false PSLs in the leading sunspot. It should
be noted that this quantity was calculated without remapping the data to disk-
centre as done by Ahmed et al. (2010), giving the measurement an even larger
viewing-angle dependence.
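The dependence of the Ising energy on field strength and on the separation of magnetic elements can be sketched as a sum over opposite-polarity pixel pairs. The pairwise form below, with each pair contributing the product of the field values divided by the squared distance, is an assumption based on the description above; the actual SMART computation may differ, and the function name `ising_energy` and the `threshold` parameter are hypothetical.

```python
import numpy as np

def ising_energy(bz, threshold=0.0):
    """Assumed form: E = -sum over opposite-polarity pixel pairs (i, j) of
    B_i * B_j / d_ij**2, with d_ij the pixel separation."""
    bz = np.asarray(bz, dtype=float)
    yp, xp = np.nonzero(bz > threshold)      # positive-polarity pixels
    yn, xn = np.nonzero(bz < -threshold)     # negative-polarity pixels
    if yp.size == 0 or yn.size == 0:
        return 0.0
    # Squared distances between every positive and every negative pixel.
    d2 = (xp[:, None] - xn[None, :]) ** 2 + (yp[:, None] - yn[None, :]) ** 2
    products = bz[yp, xp][:, None] * bz[yn, xn][None, :]
    return float(-(products / d2).sum())

far_apart = ising_energy([[100.0, 0.0, -100.0]])   # separation of 2 pixels
adjacent = ising_energy([[100.0, -100.0, 0.0]])    # separation of 1 pixel
```

In this toy case, halving the separation of the two polarities quadruples the energy. Because the distances are measured in the image plane, the value also changes with viewing angle unless the data are first remapped to disk centre.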
Active region NOAA 10365 rotates onto the visible solar disk on 19 May 2003
at heliographic latitude −5°. At this point 10365 is mature and decaying, having
emerged and evolved on the far side of the Sun. On 24 May, a new bipolar
structure rapidly emerges in the extended plage of the trailing (positive) polarity.
NOAA switches the 10365 designation to this newly emerged bipole several days
later. As the bipole evolves it develops a strong double PSL by merging with the
decayed flux. It produces many C- and M-class flares and several X-class flares.
The AR progresses around the visible disk, eventually returning as NOAA 10386.
The onset of decay occurs as C- and M-class flares are produced with decreasing
frequency and the spot areas, magnetic flux, and field strengths decrease.
[Figure 12 plot: two panels with axes X (arcsecs) and Y (arcsecs).]
Figure 12. A comparison of detection contours for NOAA AR 10365. ASAP sunspots are
represented by black crosses. The contours represent SMART in black (with NOAA 10365
outlined in red) for the magnetic features, SPoCA in dashed blue for coronal features, and
STARA in orange for sunspot penumbrae and magenta for umbrae.
Figure 12 shows a comparison of the heliographic positions and sizes of two
sets of SMART, ASAP, STARA and SPoCA detections of NOAA 10365. We can
see that positions of the SMART, ASAP, STARA and SPoCA detections agree
well. The SPoCA detection, however, includes coronal loops extending away from
the footpoint boundary of NOAA 10365. Before 24 May, SPoCA merges NOAA
region 10367 with its detection of 10365. From 24 to 27 May it detects only
10365; on 27 May at 1300 UT there is a single data point where these regions are
merged by SPoCA; and from 29 May at 0100 UT onwards, SPoCA merges them
for the remaining observation period. The longitudes of all detections within 24 –
27 May agree well. After 27 May the SPoCA longitude drifts, reflecting changes
in the merged coronal structures. Unlike 10377, the magnetic centroid of 10365
at first trails behind the sunspot centroid but then precedes it, as evidenced by
the top panel in Figure 13. This is because the new bipole, which develops many
spots, emerges behind the existing weakly spotted bipole. The new emergence
is clear in the plot of total sunspot area (middle panel), and is unclear in the
magnetic and EUV area plots since the new bipole emerges partially within the
boundary of the old one. Note that all areas for NOAA 10365 are much larger
than those for the simpler region 10377. The SMART area, however, is very
sensitive to weak magnetic plage, as can be seen in the sudden jumps around 25
and 28 May, which are due to nearby plage temporarily merging with the AR. The
jump in STARA area on 27 May can be attributed to a bad data file (note that
there is no ASAP data point at that time).
From 25 May onwards, the total magnetic flux increases gradually to more than
four times its initial value during development and levels off around 29 May (see
Figure 14). The maximum magnetic field increases abruptly on 25 May and also
increases over time, albeit less smoothly than the magnetic flux. The maximum
magnetic field did not show an overall increasing or decreasing trend in the case
of the simpler NOAA region 10377.
[Figure 13 plot: three stacked panels against Time (19-May to 31-May): Longitude
[deg]; Area (SMART, SPoCA/3) [Mm²]; # Spots. Legend: SMART, SPoCA, ASAP, STARA.]
Figure 13. Time series of position, area, and sunspot information characterising the evolu-
tion of NOAA AR 10365. The legend indicates symbols and colors for each of the detection
algorithms. The axes of the area plot are split between left (SPoCA and SMART) and right
(ASAP and STARA). The SPoCA areas have been divided by three for display.
[Figure 14 plot: two panels against Time (19-May to 31-May). Top: ΦTOT [Mx] (left
axis) and ΣEUV Int. [DN/s] (right axis). Bottom: B-field Max. [G] (left axis) and
EUV Max. [DN/s] (right axis).]
Figure 14. Time series of (top) total magnetic flux, total EUV intensity, (bottom) maximum
magnetic field, and maximum EUV intensity for NOAA AR 10365. The axes of the plots are
split between left (magnetic-field properties, black crosses) and right (coronal properties, blue
squares). RHESSI flares associated with the AR are indicated by downward arrows.
[Figure 15 plot: stacked panels against Time (19-May to 31-May); surviving axis
labels: Ising Energy (Energy, (Energy/Px)*1E3).]
Figure 15. Time series showing proxies for the complexity and polarity mixing in NOAA
AR 10365. (middle-top) bipole separation line length (crosses) and PSL length (dashed),
(middle-bottom) R, and (bottom) Ising energy (crosses) and Ising energy per pixel (dashed;
multiplied by 1000 for display).
The time series of SPoCA maximum intensity exhibits some peaks, which can
be related to the following flares produced by NOAA 10365: the M1.9 flare at
0534 UT on 26 May, the M1.6 flare at 0506 UT on 27 May, the X1.2 flare at
0051 UT on 29 May, and the M9.3 flare at 0213 UT on 31 May. The last two
flares are even visible in the total SPoCA intensity, which shows more or less a
gradual increase over time, but less smooth than in the case of the simpler NOAA
region 10377. The flares not picked up by SPoCA likely occurred in between EIT
images.
Signatures in the evolution of the magnetic topology of NOAA 10365 precede
its intense coronal activity, indicated by the associated RHESSI flares in Figure
15. Just before 25 May 2003 the new emergence causes a jump in the main bipole
separation line length. As the emergence continues and strong PSLs develop, this
length decreases, while the total PSL length increases, as shown in the middle-
top panel of Figure 15. Also, there are signs of gradual helicity injection as the
angle between the main bipole connection line and the main PSL grows from
near perpendicular (90°) to around 120° (top panel). The flux near PSL [R]
grows during this time, as does the Ising energy (middle-bottom and bottom
panels, respectively). A bump in R just before 26 May is followed by an intense
RHESSI flare. Intense flaring begins again around the second bump in R on 28
May. Examining the development of Ising energy, it appears that sharp increases
in the property are followed by the most intense flaring.
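The angle between the bipole connection line and the PSL used above can be sketched from two direction vectors. This is a generic geometric illustration, not the SMART implementation; the vectors and the function name `angle_between` are hypothetical.

```python
import math

def angle_between(v1, v2):
    """Angle in degrees (0 to 180) between two 2-D direction vectors."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))

bipole = (1.0, 0.0)                         # bipole connection line direction
psl_early = (0.0, 1.0)                      # PSL perpendicular to the bipole
psl_late = (-1.0, math.sqrt(3.0))           # PSL rotated by a further 30 degrees

angle_early = angle_between(bipole, psl_early)
angle_late = angle_between(bipole, psl_late)

# Sensitivity to a one-pixel displacement of the PSL end point.
short_psl_change = angle_between((3.0, 0.0), (3.0, 1.0))
long_psl_change = angle_between((100.0, 0.0), (100.0, 1.0))
```

The last two values show why the angle is reliable only for long PSLs: the same one-pixel displacement rotates a three-pixel PSL by roughly 18°, but a 100-pixel PSL by well under 1°.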
[Figure 16 plot: three stacked panels against Time (16-Jun to 24-Jun): Longitude
[deg]; Area (SMART, SPoCA/3) [Mm²]; # Spots. Legend: SMART, SPoCA, ASAP, STARA.]
Figure 16. Time series of position, area, and sunspot information characterising the decay
phase of NOAA AR 10365 (renamed 10386) during its second disk passage. The legend indicates
symbols and colors for each of the detection algorithms. The axes of the area plot are split
between left (SPoCA and SMART) and right (ASAP and STARA). The SPoCA areas have been
divided by three for display.
NOAA 10365 returns for a second disk passage, renamed 10386. We are able to
observe its decay phase, as shown in Figures 16 – 18. As no RHESSI data on flares
is available for this period, no flare arrows were added to these figures. While
the longitude of the SMART magnetic centroid increases linearly with time,
the ASAP and STARA sunspot centroids show small departures from this line
between 19 and 21 June, preceding the magnetic centroid. The SPoCA detection
of NOAA 10386 merges with 10388 and 10389, so a direct comparison with the
other algorithms cannot be made.
The magnetic area does not change significantly, but the total sunspot area
clearly decreases (middle panel, Figure 16), and has already decreased substan-
tially since the previous disk passage (as NOAA 10365). The total magnetic
flux decreases (top panel, Figure 17) as its magnetic fields diffuse and weaken.
Comparing the values to Figure 14, we notice that the flux had already decreased
significantly since the previous solar rotation. The total EUV intensity does not
change substantially, regardless of the weakening magnetic footpoints, although
it has decreased since the previous solar rotation. Its increase on 22 June is
due to the detection merging with a large region near the limb. The maximum
magnetic-field value shows a gradual, although not very smooth decrease, and
has also decreased since the previous passage. The maximum EUV intensity does
not show a clear trend, and although several jumps are detected, the intensity
levels are much lower than those associated with flares over the previous passage.
The peak on 18 June at 0100 UT, for instance, can likely be associated with the
M6.8 flare produced by region 10386 at 2227 UT on 17 June. The PSL length has
decreased since the previous solar rotation, and shows a further gradual decrease
in Figure 18. The same is true for both R and the Ising energy.
[Figure 17 plot: two panels against Time (16-Jun to 24-Jun); axis labels not
recovered.]
Figure 17. Time series of (top) total magnetic flux, total EUV intensity, (bottom) maximum
magnetic field, and maximum EUV intensity for NOAA AR 10365 on its second disk passage
as 10386. The axes of the plots are split between left (magnetic-field properties, black crosses)
and right (coronal properties, blue squares).
[Figure 18 plot: four stacked panels against Time (16-Jun to 24-Jun): PSL Angle
[deg]; Length [Mm] (Bipole Separation, PSL Length); R [Mx]; Ising Energy (Energy,
(Energy/Px)*1E3).]
Figure 18. Time series showing proxies for the complexity and polarity mixing in NOAA
10386. (top) PSL orientation with respect to the bipole separation line, (middle-top) bipole
separation line length (crosses) and PSL length (dashed), (middle-bottom) R, and (bottom)
Ising energy (crosses) and Ising energy per pixel (dashed; multiplied by 1000 for display).
a bipolar magnetic structure, sunspots, and EUV loops (Section 4.3.1); increase
and peak in non-potentiality, followed by the onset of flaring (Section 4.3.2);
decay and weakening of magnetic footpoints (Section 4.3.2). We find that the
algorithms show good correspondence between centroid positions and areas but
significant divergence is seen in other properties.
In the case study of the simple active region NOAA 10377 (Section 4.3.1)
we see that the total number of detected sunspots fluctuates wildly as transient
spots rapidly emerge and disappear. This is partly due to the visibility curve
since the area of small spots is highly impacted by the observer’s viewing angle
(Dalla, Fletcher, and Walton, 2008; Watson et al., 2009). For studying AR
evolution, sunspot area is more indicative of the emergence and decay of an AR
than coronal or magnetic area, which do not necessarily decrease during decay
(see Section 4.3). As a basis for long-term AR tracking, magnetic flux is more
useful than the sunspot area (since sunspots are much more transient than their
magnetic footprints) or the maximum magnetic-field value (since it is affected by
the MDI saturation problem; Liu, Norton, and Scherrer, 2007). Also, the maximum
magnetic-field value is unstable since different positions in the active region will
over-take each other in field magnitude as they develop, causing the location
that the value is sampled from to vary wildly. Finally, the total EUV intensity
determined by SPoCA has a smooth behaviour over time and is closely linked to
area. The maximum EUV intensity peaks when the active region emits a large
flare, and appears to be a useful indicator of eruptions in the corona.
The case study of complex, flaring NOAA 10365 (Section 4.3.2) shows that
flaring can happen both in periods of flux emergence and in periods of
non-potentiality enhancement in an active region. Following the initial flaring
during the emergence phase of evolution, further flaring occurs as the main PSL rotates with
respect to the bipole connection line. This is a sign of helicity injection and is
coincident with increases of other properties related to polarity mixing. Helicity
injection has been established as a method of increasing non-potentiality and
may be caused by the emergence of subsurface twisted flux ropes, as seen in
Dun et al. (2007).
As NOAA 10365 returns after one solar rotation, decay is seen in the strength
of its magnetic footprint. However, the area is not seen to decrease significantly,
since supergranular diffusion causes a radial dispersal of magnetic elements.
Coronal structures do not appear to decay readily, either. This result agrees with
Lites et al. (1995), where it is reasoned that if the coronal magnetic structure
is closed, it may be in a state of quasi-static equilibrium, whereby the magnetic
buoyancy of the loops is cancelled by the weight of plasma trapped at the bottom
of the closed structure.
By performing these studies, it is found that magnetic active-region detections
provide the most stable base for feature tracking. Sunspots are only visible for
short periods of time, and coronal detections continually form bright loop connec-
tions to nearby features. The simple feature tracking method used in this article
(see Section 3.5) is novel in that it allows features to be tracked between multiple
disk passages. This is essential for analysing the complete life-cycle of an active
region, as exemplified in the analysis of NOAA 10365 (Section 4.3.2). Future work
on active-region evolution should combine morphological information to better
handle merging and splitting (as done by Welsch and Longcope (2003)) with our
method of multiple disk passage tracking. Future work on our algorithms will
also address the automatic detection and handling of structural or visible errors
in solar data, to avoid discontinuities in the time series due to a corrupted image,
as was seen in the STARA outputs used in the case studies (see Section 4.3.1).
This work will be expanded in the future to include an analysis of the full
SOHO archive as well as detailed studies of photospheric and coronal SDO data
sets. Many physical studies will benefit from this work, as investigations that
examine coronal heating as a result of large scale magnetic fields (Schrijver, 1987;
Fisher et al., 1998), coupling between the photosphere and corona (Handy and Schrijver,
2001), sources of coronal mass ejections (Subramanian and Dere, 2001), flux
emergence and distribution (Liu and Kurokawa, 2004; Abramenko and Longcope,
2005) and flare forecasting (Gallagher, Moon, and Wang, 2002) can all be re-
peated with these automated detection methods. Using these methods allows a
far greater number of features to be analysed and reduces human bias in the
detection of features in the solar data.
The algorithms presented here are automated (once thresholds have been
fixed), independent, and unsupervised. Although some development remains to
be done, they detect features as intended, and will provide
useful additions to the SDO pipeline feature-detection methods. However, this
work shows that automated methods cannot replace human data analysis, although
they can help to streamline the process.
References
Ahmed, O., Qahwaji, R., Colak, T., Dudok de Wit, T., Ipson, S.: 2010, A new
technique for the calculation and 3D visualisation of magnetic complexities on
solar satellite images. The Visual Computer 26, 385 – 395.
doi:10.1007/s00371-010-0418-1.
Barra, V., Delouille, V., Kretzschmar, M., Hochedez, J.: 2009, Fast and robust
segmentation of solar EUV images: algorithm and results for solar cycle 23.
Astron. Astrophys. 505, 361 – 371. doi:10.1051/0004-6361/200811416.
Benkhalil, A., Zharkova, V.V., Zharkov, S., Ipson, S.: 2006, Active Region De-
tection and Verification With the Solar Feature Catalogue. Solar Phys. 235,
87 – 106. doi:10.1007/s11207-006-0023-7.
Bentley, R.D., Aboudarham, J., Csillaghy, A., Jacquey, C., Hapgood, M.A.,
Messerotti, M., Gallagher, P., Bocchialini, K., Hurlburt, N.E., Roberts, D.,
Sanchez Duarte, L.: 2009, Addressing Science Use Cases with HELIO. AGU
Fall Meeting Abstracts, A6.
Colak, T., Ahmed, O.W., Qahwaji, R., Higgins, P.A.: 2010, Automated So-
lar Flare Prediction: Is it a Myth? Presentation in Seventh European Space
Weather Week. http://spaceweather.inf.brad.ac.uk/colak19nov.pdf.
Colak, T., Qahwaji, R., Ipson, S., Ugail, H.: 2011, Representation of Solar Fea-
tures in 3D for Creating Visual Solar Catalogues. Adv. Space Res. 47(12),
2092 – 2104. doi:10.1016/j.asr.2010.08.030.
Conlon, P.A., McAteer, R.T.J., Gallagher, P.T., Fennell, L.: 2010, Quantifying
the Evolving Magnetic Structure of Active Regions. Astrophys. J. 722, 577 –
585. doi:10.1088/0004-637X/722/1/577.
Curto, J.J., Blanca, M., Martínez, E.: 2008, Automatic Sunspots Detection on
Full-Disk Solar Images using Mathematical Morphology. Solar Phys. 250,
411 – 429. doi:10.1007/s11207-008-9224-6.
Dalla, S., Fletcher, L., Walton, N.A.: 2008, Invisible sunspots and
rate of solar magnetic flux emergence. Astron. Astrophys. 479, 1 – 4.
doi:10.1051/0004-6361:20078800.
DeForest, C.E., Hagenaar, H.J., Lamb, D.A., Parnell, C.E., Welsch, B.T.:
2007, Solar Magnetic Tracking. I. Software Comparison and Recommended
Practices. Astrophys. J. 666, 576 – 587. doi:10.1086/518994.
Delaboudinière, J., Artzner, G.E., Brunaud, J., Gabriel, A.H., Hochedez, J.F.,
Millier, F., Song, X.Y., Au, B., Dere, K.P., Howard, R.A., Kreplin, R., Michels,
D.J., Moses, J.D., Defise, J.M., Jamar, C., Rochus, P., Chauvineau, J.P.,
Marioge, J.P., Catura, R.C., Lemen, J.R., Shing, L., Stern, R.A., Gurman,
J.B., Neupert, W.M., Maucherat, A., Clette, F., Cugnon, P., van Dessel, E.L.:
1995, EIT: Extreme-Ultraviolet Imaging Telescope for the SOHO Mission.
Solar Phys. 162, 291 – 312. doi:10.1007/BF00733432.
Dudok de Wit, T., Auchère, F.: 2007, Multispectral analysis of solar EUV im-
ages: linking temperature to morphology. Astron. Astrophys. 466, 347 – 355.
doi:10.1051/0004-6361:20066764.
Dun, J., Kurokawa, H., Ishii, T.T., Liu, Y., Zhang, H.: 2007, Evolution of
Magnetic Nonpotentiality in NOAA AR 10486. Astrophys. J. 657, 577 – 591.
doi:10.1086/510373.
Falconer, D.A., Moore, R.L., Gary, G.A.: 2008, Magnetogram Measures of Total
Nonpotentiality for Prediction of Solar Coronal Mass Ejections from Active
Regions of Any Degree of Magnetic Complexity. Astrophys. J. 689, 1433 –
1442. doi:10.1086/591045.
Fisher, G.H., Longcope, D.W., Metcalf, T.R., Pevtsov, A.A.: 1998, Coronal Heat-
ing in Active Regions as a Function of Global Magnetic Variables. Astrophys.
J. 508, 885 – 898. doi:10.1086/306435.
Gallagher, P.T., Moon, Y., Wang, H.: 2002, Active-Region Monitoring and Flare
Forecasting I. Data Processing and First Results. Solar Phys. 209, 171 – 183.
doi:10.1023/A:1020950221179.
Habash Krause, L., Franz, A., Stevenson, A.: 2011, On the application of Ex-
ploratory Data Analysis for characterization of space weather data sets. Adv.
Space Res. 47, 2199 – 2209. doi:10.1016/j.asr.2011.03.017.
Handy, B.N., Schrijver, C.J.: 2001, On the Evolution of the Solar Pho-
tospheric and Coronal Magnetic Field. Astrophys. J. 547, 1100 – 1108.
doi:10.1086/318429.
Hewett, R.J., Gallagher, P.T., McAteer, R.T.J., Young, C.A., Ireland, J., Conlon,
P.A., Maguire, K.: 2008, Multiscale Analysis of Active Region Evolution. Solar
Phys. 248, 311 – 322. doi:10.1007/s11207-007-9028-0.
Higgins, P.A., Gallagher, P.T., McAteer, R.T.J., Bloomfield, D.S.: 2011, Solar
magnetic feature detection and tracking for space weather monitoring. Adv.
Space Res. 47, 2105 – 2117. doi:10.1016/j.asr.2010.06.024.
Howard, R.F., Harvey, J.W., Forgach, S.: 1990, Solar surface velocity fields
determined from small magnetic features. Solar Phys. 130, 295 – 311.
doi:10.1007/BF00156795.
Hurlburt, N., Cheung, M., Schrijver, C., Chang, L., Freeland, S., Green,
S., Heck, C., Jaffey, A., Kobashi, A., Schiff, D., Serafin, J., Seguin, R.,
Slater, G., Somani, A., Timmons, R.: 2010, Heliophysics Event Knowledgebase
for the Solar Dynamics Observatory (SDO) and Beyond. Solar Phys.
doi:10.1007/s11207-010-9624-2.
Jolliffe, I.T.: 2002, Principal component analysis, 2nd edition, Springer, New
York, 487 p.
LaBonte, B.J., Georgoulis, M.K., Rust, D.M.: 2007a, Survey of Magnetic Helicity
Injection in Regions Producing X-Class Flares. Astrophys. J. 671, 955 – 963.
doi:10.1086/522682.
LaBonte, B.J., Georgoulis, M.K., Rust, D.M.: 2007b, Survey of Magnetic Helicity
Injection in Regions Producing X-Class Flares. Astrophys. J. 671, 955 – 963.
doi:10.1086/522682.
Lefebvre, S., Rozelot, J.: 2004, A new method to detect active features at the so-
lar limb. Solar Phys. 219, 25 – 37. doi:10.1023/B:SOLA.0000021818.97402.1e.
Leka, K.D., Barnes, G.: 2007, Photospheric Magnetic Field Properties of Flar-
ing versus Flare-quiet Active Regions. IV. A Statistically Significant Sample.
Astrophys. J. 656, 1173 – 1186. doi:10.1086/510282.
Lites, B.W., Low, B.C., Martinez Pillet, V., Seagraves, P., Skumanich, A.,
Frank, Z.A., Shine, R.A., Tsuneta, S.: 1995, The Possible Ascent of a
Closed Magnetic System through the Photosphere. Astrophys. J. 446, 877 – .
doi:10.1086/175845.
Liu, Y., Kurokawa, H.: 2004, On a Surge: Properties of an Emerging Flux Region.
Astrophys. J. 610, 1136 – 1147. doi:10.1086/421715.
Liu, Y., Norton, A.A., Scherrer, P.H.: 2007, A Note on Saturation
Seen in the MDI/SOHO Magnetograms. Solar Phys. 241, 185 – 193.
doi:10.1007/s11207-007-0296-5.
Martens, P.C.H., Attrill, G.D.R., Davey, A.R., Engell, A., Farid, S., Grigis,
P.C., Kasper, J., Korreck, K., Saar, S.H., Savcheva, A., Su, Y., Testa, P.,
Wills-Davey, M., Bernasconi, P.N., Raouafi, N., Delouille, V.A., Hochedez,
J.F., Cirtain, J.W., DeForest, C.E., Angryk, R.A., de Moortel, I., Wiegelmann,
T., Georgoulis, M.K., McAteer, R.T.J., Timmons, R.P.: 2011, Computer
Vision for the Solar Dynamics Observatory (SDO). Solar Phys.
doi:10.1007/s11207-010-9697-y.
McAteer, R.T.J., Gallagher, P.T., Ireland, J., Young, C.A.: 2005, Automated
Boundary-extraction And Region-growing Techniques Applied To Solar Mag-
netograms. Solar Phys. 228, 55 – 66. doi:10.1007/s11207-005-4075-x.
Morita, S., McIntosh, S.W.: 2005, Genesis of AR NOAA10314. In: Sankarasubramanian,
K., Penn, M., Pevtsov, A. (eds.) Large-scale Structures and their
Role in Solar Activity 346, Astron. Soc. Pacific, San Francisco, 317 – .
Nguyen, S.H., Nguyen, T.T., Nguyen, H.S.: 2005, Rough set approach to sunspot
classification problem. In: Slezak, D., Yao, J., Peters, J.F., Ziarko, W., Hu, X.
(eds.) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing Lecture
Notes in Computer Science 3642, Springer, Berlin / Heidelberg, 263 – 272.
Parnell, C.E., DeForest, C.E., Hagenaar, H.J., Johnston, B.A., Lamb, D.A.,
Welsch, B.T.: 2009, A Power-Law Distribution of Solar Magnetic Fields
Over More Than Five Decades in Flux. Astrophys. J. 698, 75 – 82.
doi:10.1088/0004-637X/698/1/75.
Pérez-Suárez, D., Higgins, P.A., McAteer, R.T.J., Bloomfield, D.S., Gallagher,
P.T.: 2011, Automated Solar Feature Detection for Space Weather Ap-
plications. In: Qahwaji, R., Green, R., Hines, E.L. (eds.) Applied Signal
and Image Processing: Multidisciplinary Advancements, IGI Global, Hershey,
Pennsylvania, 207 – 225. doi:10.4018/978-1-60960-477-6.
Qahwaji, R., Colak, T.: 2006, Hybrid imaging and neural networks techniques
for processing solar images. I. J. Comput. Appl. 13(1), 9 – 16.
Sarro, L.M., Berihuete, A.: 2011, Statistical techniques for the detection
and analysis of solar explosive events. Astron. Astrophys. 528, A62+.
doi:10.1051/0004-6361/201014894.
Scherrer, P.H., Bogart, R.S., Bush, R.I., Hoeksema, J.T., Kosovichev, A.G.,
Schou, J., Rosenberg, W., Springer, L., Tarbell, T.D., Title, A., Wolf-
son, C.J., Zayer, I., MDI Engineering Team: 1995, The Solar Oscilla-
tions Investigation - Michelson Doppler Imager. Solar Phys. 162, 129 – 188.
doi:10.1007/BF00733429.
Schrijver, C.J.: 1987, Solar active regions - Radiative intensities and large-scale
parameters of the magnetic field. Astron. Astrophys. 180, 241 – 252.
Subramanian, P., Dere, K.P.: 2001, Source Regions of Coronal Mass Ejections.
Astrophys. J. 561, 372 – 395. doi:10.1086/323213.
Watson, F., Fletcher, L., Dalla, S., Marshall, S.: 2009, Modelling the Longitu-
dinal Asymmetry in Sunspot Emergence: The Role of the Wilson Depression.
Solar Phys. 260, 5 – 19. doi:10.1007/s11207-009-9420-z.
Zharkov, S., Zharkova, V., Ipson, S., Benkhalil, A.: 2004, Automated recognition
of sunspots on the SOHO/MDI white light solar images. In: Knowledge-Based
Intelligent Information and Engineering Systems Lecture Notes in Computer
Science 3215, Springer, Berlin / Heidelberg, 446 – 452.