Computers and Electronics in Agriculture 169 (2020) 105164

Computers and Electronics in Agriculture 169 (2020) 105164
Contents lists available at ScienceDirect
Computers and Electronics in Agriculture

journal homepage: www.elsevier.com/locate/compag
Pre-harvest classification of crop types using a Sentinel-2 time-series and T

machine learning
Mmamokoma Grace Maponya , Adriaan van Niekerk, Zama Eric Mashimbye
⁎
Department of Geography and Environmental Studies, University of Stellenbosch, Stellenbosch, South Africa
ARTICLE INFO ABSTRACT
Keywords: Timely crop type information (preferably before harvest) is useful for predicting food surpluses or shortages.
Pre-harvest crop type classification This study assesses the performance of several machine learning classifiers, namely SVM (support vector ma-
Image selection chine), DT (decision tree), k-NN (k-nearest neighbour), RF (random forest) and ML (maximum likelihood) for
Operational crop type mapping crop type mapping based on a series of Sentinel-2 images. Four experiments with different combinations of
Machine learning classifiers
image sets were carried out. The first three experiments were undertaken with 1) single-date (uni-temporal)
images; 2) combinations of five images selected from the best performing single-date images; and 3) five images
selected manually based on crop development stages. The fourth experiment involved the chronologic addition
of images to assess the performance of the classifiers when only pre-harvest images are used, with the purpose of
investigating how early in the season reasonable accuracies can be achieved. The experiments were carried out
in two different sites in the Western Cape Province of South Africa to provide a good representation of the grain-
producing areas in the region which has a Mediterranean climate. The significance of image selection on clas-
sification accuracies as well as the performance of machine learning classifiers when only pre-harvest images are
used were evaluated. The classification results were analysed by comparing overall accuracies and kappa
coefficients, while McNemar′s test and ANOVA (analysis of variance) were used to assess the statistical sig-
nificance of the differences in accuracies among experiments. The results show that by selecting images based on
individual performance, a viable alternative to selecting images based on crop developmental stages is offered,
and that the classification of crops with an entire time series can be just as accurate as when they are classified
with a subset of hand-selected images. We also found that good classification accuracies (77.2%) can be obtained
with the use of SVM and RF as early as eight weeks before harvest. This result shows that pre-harvest images
have the potential to identify crops accurately, which holds much potential for operational within-season crop
type mapping.
1. Introduction to make economic forecasts such as the estimated contribution (and

impact) of agricultural activities to the gross domestic product (GDP)
The need to produce more food for the growing global population (Immitzer et al., 2016; McNairn et al., 2014; Siachalou et al., 2015).
and increasing biofuel production put continuing pressure on limited Also, major food security programmes such as food aid, strategic food
agricultural resources (Inglada et al., 2016). Consequently, most agri- reserves, import and export licensing for private firms and distribution
cultural systems around the world are being intensified, which makes through social security programmes depend on crop production fore-
agricultural practices highly dynamic as new technologies are im- casts derived from crop type information (Jayne & Rashid, 2010; Foley
plemented to ensure that food supply meets demand. Intensive and et al., 2011).
dynamic agricultural systems require innovative, timely, objective and According to Peña et al. (2014), traditional methods for producing
cost-effective methods for effective monitoring and management. Crop crop type maps require multiple sources of information, for example
type information is particularly useful for monitoring and managing the ground and aerial surveys. These methods are tedious, laborious and
sustainability of agricultural resources. Such information is also useful costly and are reported to produce inconsistent results (Gilbertson et al.,
for generating crop statistics, which aid decision-making regarding 2017). In addition, they can be biased as they may often rely on a small
subsidy payments and the calibration of crop models (Peña-Barragán number of observations made at easily accessible sampling sites (Pena-
et al., 2011). Crop type information further provides a basis from which Barragan et al., 2008). The use of remote sensing and freely available
⁎
Corresponding author.
https://doi.org/10.1016/j.compag.2019.105164
Received 2 May 2019; Received in revised form 17 December 2019; Accepted 18 December 2019
Available online 15 January 2020
0168-1699/ © 2019 Elsevier B.V. All rights reserved.
M.G. Maponya, et al. Computers and Electronics in Agriculture 169 (2020) 105164
satellite imagery is a viable alternative as it enables cost-effective temporal, textural and contextual features for capturing the variations
mapping and continuous monitoring of crop fields across large areas in crop growth stages, crop patterns and field sizes, make it an ideal
(Peña et al., 2014). During the past two decades, remote sensing resource for crop type classifications (Peña et al., 2014).
methods have been developed for discriminating crop types using dif- There is a general agreement that multi-temporal data is important
ferent types of remotely sensed data (Mingwei et al., 2008). This in- for crop type classification. Tavakkoli et al. (2008) argued that the
cludes radio detection and ranging (RADAR) data, which is beneficial selection of the most useful images from a time series is beneficial and
because of its ability to penetrate clouds, thus eliminating the problem generally gives more accurate results than when using all available
of cloud contamination (Forkuor et al., 2014; Jiao et al., 2014; Joshi images. This is supported by several studies which demonstrated that
et al., 2016). However, RADAR data has been shown to have issues with images acquired during peak growth stages are more useful than those
low resolutions and high levels of noise and is expensive (Grandoni, acquired during low growth periods (Leroux et al., 2014; Peña et al.,
2013; LaDue et al., 2010; You et al., 2016). Hyperspectral imagery has 2014; Tavakkoli et al., 2008). This was attributed to the observation
also been used for crop discrimination, but is often unavailable or that some crops might be spectrally similar or constitute high levels of
prohibitively expensive to acquire (Delegido et al., 2010; Thenkabail soil interference due to a lack of growth during certain stages in the
et al., 2013). Limitations associated with RADAR and hyperspectral growing season (Tavakkoli et al., 2008; Veloso et al., 2017). According
data led to the adoption of low spatial but high temporal resolution to Inglada et al. (2015), the selection of optimal dates for mapping crop
multispectral data, with MODIS (moderate resolution imaging spec- types is axiomatically important but not always possible, especially in
troradiometer) and AVHRR (advanced very high resolution radiometer) operational crop type data production workflows in which the hand
being the most exploited datasets for crop type classifications (Hao selection of images is not feasible. Hao et al. (2015) found that five
et al., 2015; Liu et al. 2014; Mingwei et al., 2008; Zheng et al., 2015). images selected from a crop calendar are sufficient for effective crop
Liu et al. (2014) argued that the ability of remote sensors to acquire classification. However, the selection of images based on crop calendars
multiple images during a growing season is the main reason data pro- is greatly dependent on the availability of images and is prone to fail
vided by optical sensors has become the most popular data for crop type when images are not available during the key developmental stages of
differentiation (Conrad et al., 2010; Foerster et al., 2012; Leroux et al., the target crops. According to Blaes et al. (2005), temporal gaps in the
2014; Siachalou et al., 2015). For many years, multi-temporal methods growing season result in significant drops in classification accuracies
for crop type mapping relied on satellite sensors with high temporal but when even a single image is missing at a key period. Also, selecting
low spatial resolutions (Bolton & Friedl, 2013; Gumma et al., 2011; crops based on crop calendars is limiting as it requires a priori
Lunetta et al., 2010; Mingwei et al., 2008). For example, Wu et al. knowledge of all crops cultivated in the area of interest. This compli-
(2014) used multi-sensor data with medium to low spatial resolutions cates the classification workflow, making the approach unsuitable for
to monitor crop production globally (known as the CropWatch system) operational implementations.
and reported such data to be effective for monitoring global crop pro- Classification algorithms can also influence the success of crop type
duction. Using MODIS-NDVI data, Mingwei et al. (2008) used the mapping (Myburgh & Van Niekerk, 2013). Gilbertson et al. (2017)
Fourier analysis method to discriminate between crops on a regional found the ability of non-parametric machine learning algorithms to use
scale and reported the multi-temporal approach useful for this appli- known data for classifying large sets of imagery, while incorporating
cation. However, low resolution satellite sensors such as MODIS have ancillary spatial data, to be ideal for crop type classifications. Popular
limited ability to provide detailed (i.e. field-level) crop type informa- machine learning algorithms include support vector machine (SVM),
tion. According to Gilbertson et al. (2017), imagery with medium (for decision tree (DT), k-nearest neighbour (k-NN) and random forest (RF).
example Landsat-8) to high (for example Sentinel-2) spatial resolutions SVM is a good choice for vegetation classification applications, since it
is needed for successful discrimination of different crops grown on can handle high dimensional data and perform well, even with few
neighbouring fields, and recent studies have shown that high spatial training samples (Gilbertson et al., 2017; Myburgh & Van Niekerk,
and temporal resolution data is critical for producing timely and ac- 2013; Ozdarici-Ok et al., 2015; Peña et al., 2014). One of the ad-
curate crop type maps (Inglada et al., 2016). vantages of RF is its ability to perform well even with incomplete
The recent launch of the Sentinel-2 satellite, which offers 10 m re- (noisy) data or when data with high levels of redundancy is used as
solution imagery at five-day intervals, represents a substantial techno- input. DTs, k-NN and ML have been highly successful in a range of
logical advancement and provides many opportunities for generating remote sensing applications, including land cover mapping and crop
more accurate, up-to-date and detailed crop type maps. For instance, type mapping (Myburgh & Van Niekerk, 2013; Myint et al., 2011).
Immitzer et al. (2016) investigated the appropriateness of pre-opera- The availability of crop type information early in the growing
tional single-date Sentinel-2 data for mapping crop types and tree season (before harvest) is critical for many applications, but in most
species. They recorded cross-validated accuracies ranging between 65% cases such data only becomes available post-harvest. This study assesses
and 76% for tree species and crop types respectively, but concluded the performance of several machine-learning classifiers when applied to
that the full potential of Sentinel-2 data could not be assessed with a pre-harvest images. Sentinel-2 imagery was used to generate a large set
single-date acquisition. They postulated that multi-temporal Sentinel-2 of features (predictor variables) – inclusive of spectral bands, vegeta-
data would likely significantly improve classification accuracies. The tion indices, principal components and texture measures – which was
combination of high spatial resolution, novel spectral capabilities (in- used as input for five different classifiers. The classifiers were trained
cluding three bands in the red-edge and two bands in the shortwave and validated using in situ data. The experiments were carried out in
infrared regions) and high temporal resolution provides a dataset of two sites located approximately 310 km from each other to assess the
unprecedented richness from which crop type differentiations can be consistency of the results. The results were interpreted in the context of
made. finding an operational solution for classifying crop types at regional
Using simulated Sentinel-2 data, Inglada et al. (2016) reviewed a scales.
range of methods for crop type mapping and concluded that the ima-
gery closes the gap between the availability of timely and accurate crop 2. Materials and methods
type maps and users′ needs. Lebourgeois et al. (2017) used the same
data for mapping smallholder agriculture in a tropical region char- 2.1. Study sites
acterised by high intra- and inter-field spatial variability. They found
that it is effective for mapping crops on smallholder farms and attrib- This study was conducted in two sites (covered by Sentinel-2 tiles
uted its success to the imagery’s relatively high spatial resolution. In T34HJB and T34HEH) in the Western Cape Province of South Africa.
addition, Sentinel-2 imagery′s novel spectral bands, as well as its For the purpose of this study, T34HJB and T34HEH will be referred to
2
Fig. 1. Location of the study sites in the Western Cape, South Africa.
Fig. 2. RGB Sentinel-2A imagery collected in August: (A) the extent of Study Site (A, B) the extent of Study Site B.
as study Site A and B respectively (Figs. 1 and 2). Site A is located about average minimum temperature of 11 °C and an average maximum
50 km north of Cape Town, while Site B is located east of the Langeberg temperature of 22 °C (Tererai et al., 2015). The area where Site B is
mountain range and south of the Swartberge mountain range. Both sites located has an average rainfall of 800 mm, with an average minimum
have a Mediterranean climate and is as such characterised by warm and temperature of 7 °C and an average maximum temperature of 22 °C
dry summers and cool and wet winters (Malan, 2016). The area where (Malan, 2016). These sites were chosen owing to the diversity of annual
Site A is located has an average annual rainfall of 550 mm, with an winter crops cultivated and the availability of crop census data. The
3
Table 1
Images collected, crop calendars and the approximate number of days before harvest in study Site A.
# Acquisition date Canola Lucerne Pasture Wheat
Calendar Weeks before harvest Calendar Weeks before harvest Calendar Weeks before harvest Calendar Weeks before harvest
1 6 Apr 2016 Sowing 25 Multiple cutting options. Sowing 14

2 26 Apr 2016 Sowing 22 Stages may vary Sowing Sowing 11
3 16 May 2016 Sowing 19 Sowing Sowing 8
4 6 June 2016 Sowing 16 Possible summer senescence 5
5 4 Aug 2016 8 Harvest 0
6 3 Sept 2016 3 Harvest 0
7 3 Oct 2016 Harvest 0 Harvest 0
10 31 Jan 2017 Post-harvest Post-harvest Post-harvest Post-harvest
most common crops cultivated in both sites are wheat, lucerne, planted Table 3
pasture and canola. Canola, wheat and planted pastures are grown The number of samples (fields) per class in study Sites A and B.
during winter, while lucerne is harvested throughout the year. Crop type Number of sampled fields
2.2. Satellite imagery acquisition and preparation Study site A Study site B
Canola 66 93
A selection of cloud free Sentinel-2 images, pre-processed at Level Lucerne 98 98
1C and captured between April 2016 and January 2017, were sourced Pasture 167 87
from the Sentinel Hub (https://www.sentinel-hub.com/). The temporal Wheat 96 97
Fallow 213 43
period was chosen to represent a typical winter growing season. Data
Total 640 418
preparation involved resampling and stacking the 20 m spectral bands
to 10 m (Immitzer et al. 2016). Tables 1 and 2 present the images used
in both sites, the respective crop stages, and the number of weeks before aerosol (band 1), water vapour (band 9) and SWIR cirrus (band 10)
harvest that each image represents. bands were deliberately excluded, as they were deemed unsuitable for
crop type classification as reported by Immitzer et al. (2016) and
2.3. In situ data collection and field delimitation Inglada et al. (2016). The generated features included 24 variations
(Table A-1) of vegetation indices (VIs) derived from the equations in
A crop type and field boundary dataset (i.e. agricultural census) Table 2.4 (Immitzer et al. 2016). Mean values for all spectral features
containing polygons of agricultural fields was obtained from the were calculated for each agricultural field. Homogeneity, dissimilarity
Western Cape Department of Agriculture. These field boundaries were and entropy texture features measuring pixel uniformity and disorder
used as basic units for classification. Due to an insufficient number of were also calculated and included in the classifications (Peña-Barragán
samples for some classes, additional samples (fields with crop type in- et al., 2011; Schmedtmann et al., 2015). A principal component ana-
formation) were collected by visual interpretation of Sentinel-2 and lysis (PCA) was performed on all spectral bands per image date, and the
high resolution satellite imagery. The total number of samples for Site A first four principle components (PCs) were retained as recommended by
and B was 640 and 418 respectively. The crop types and number of Gilbertson et al. (2017).
samples per class for the two study sites are given in Table 3.
2.5. Experimental design
2.4. Feature set development
Four experiments were undertaken (Fig. 3). The first experiment
Forty-one features (per image date) were generated from the (Experiment 1) involved performing the classification using available
Sentinel-2 bands (Table 4). For this study, only the blue (band 2), green cloud free images collected throughout the winter growing season, i.e.
(band 3), red (band 4), vegetation red edge (band 5), vegetation red each classification was based on a single image (henceforth called uni-
edge (band 6), vegetation red edge (band 7), NIR (band 8), narrow NIR temporal). In the second experiment (Experiment 2), four different
(band 8A), SWIR (band 11) and SWIR (band 12) were used. The coastal combinations of five uni-temporal images were used as input to the
Table 2
Images collected, crop calendars and the approximate number of days before harvest in study Site B.
# Acquisition Date Canola Lucerne Pasture Wheat
Calendar Weeks before harvest Calendar Weeks before harvest Calendar Weeks before harvest Calendar Weeks before harvest
1 3 Apr 2016 Sowing 25 Multiple cutting options. Sowing Sowing 19

2 22 June 2016 Sowing 14 Stages may vary Sowing 12
3 2 July 2016 12 Sowing 11
4 1 Aug 2016 8 Possible summer 7
5 11 Aug 2016 7 senescence 5
6 21 Aug 2016 5 4
8 29 Dec 2016 Post-harvest Post-harvest Post-harvest Post-harvest
4
Table 4
Features used as input for classifications (refer to Appendix 1 for the full names, formula and bands used to compute each feature).
Variable source Features
Spectral bands Blue, green, red, vegetation red edge1, vegetation red edge2, vegetation red edge3, NIR, Narrow NIR, SWIR1, SWIR2
Vegetation indices AFRI, EVI2, NDMI, NDVI, NDVI red edge, AFRI red edge, EVI2 red Edge
Textural features dissimilarity, entropy, homogeneity
Image transforms PC1, PC2, PC3, PC4
classifiers. The images were selected based on the overall accuracy (OA) and validation samples were maintained for all the experiments. SLICE
rankings obtained from Experiment 1 (i.e. the five individual images automatically generates confusion matrices and calculates overall ac-
that provided the highest uni-temporal OAs were selected). curacies (OAs) and Kappa coefficients (Ks) for every classification
For Experiment 3, the following five image dates were hand-picked (Gilbertson et al., 2017). OA is interpreted as the percentage of fields
and used for classification: 26 April, 6 June, 4 August, 3 September and corresponding to the errors of omission and commission (Campell &
23 October for Site A; and 3 April, 22 June, 2 July, 11 August and 20 Wynne, 2011), while K assesses the statistical differences between
October for Site B. Figs. 4 and 5 show how these dates relate to the crop classifications (Foody & Atkinson, 2004). McNemar′s test, ANOVA and
calendars for each site. The number of images chosen was based on t-tests (as implemented in Microsoft Excel) were used for assessing the
findings by Hao et al. (2015) and Gilbertson et al. (2017). statistical significance of the accuracies obtained from the experiments,
The fourth experiment involved performing the classification using as recommended by Foody et al. (2003) and applied by Duro et al.
all the available cloud free images captured throughout the growing (2015) and Gilbertson et al., (2017). The alpha values for ANOVA and t-
season. This experiment was executed by the chronological addition of tests were set to 0.05, while the alpha value for McNemar′s test was set
images to the classifiers to assess how early in the season reasonable to 1.96.
accuracies can be achieved.
3. Results
2.6. Classification and accuracy assessment
3.1. Experiment 1: uni-temporal, individual images
The supervised learning and image classification environment
(SLICE) software, developed by the Centre for Geographical Analysis Experiment 1 was undertaken with uni-temporal images collected
(CGA) at the Stellenbosch University, was used for classification and between April 2016 and January 2017, thus covering a full winter
accuracy assessment (Myburgh & Van Niekerk, 2013). SLICE includes growing season. Results for Experiment 1 are listed in Table 5. At Site A,
several classification algorithms, namely SVM, k-NN, DT, RFs and ML. the highest individual OA (80%) was achieved using SVM with an
The ML and k-NN classifiers were implemented using the Open CV 2.2 image acquired early in September (Image 6, approximately four weeks
libraries (Bradski, 2000), while Libsvm 3.0 was used to implement the before harvest). This OA is only marginally higher than the second
SVM classifier (Chang & Lin, 2011). The parameters for DT and RF were highest OA (79.6%), also achieved with SVM (P = −0.40) using an
based on the guidelines provided in the OpenCV library documentation. image acquired in early August (Image 5, about eight weeks before
The Geospatial Data Abstraction Library (GDAL) was used to manip- harvest). The highest OAs achieved with SVM and RF (77.2%) are
ulate the raster files and shapefiles. (See Myburgh & Van Niekerk significantly different (P > 3.10) from those achieved with k-NN and
(2013) for an incisive description of how the various classifiers were DT. It is important to note that the number of weeks before harvest
configured.) greatly depends on individual crop planting and harvest dates. For the
A 3:2 sample split ratio was employed for accuracy assessment. sake of simplicity, the first harvest date is henceforth used as reference.
Forty percent of the samples were randomly selected and reserved for At Site A, the five images with the highest mean OAs were collected
independent validation of the classifications. The same set of training between June and October with the image collected early in August (4
Fig. 3. Experimental design.
5
Fig. 4. Phenological stages of targeted crops in study Site B (the dotted lines represent the selected images).
Fig. 5. Phenological stages of targeted crops in study Site B (the dotted lines represent selected images).
Aug) recording the highest mean OA (73.9%), followed by early The highest individual OA (79.6%) for Site B (Table 5) was achieved
September (3 Sept = 70.3%), mid-October (13 Oct = 60.7%), early in August using SVM (79.6%) (approximately eleven weeks before
October (3 Oct = 56.6%) and early June (6 Jun = 51.5%). The 4th of harvest). The second highest OA (77.8%) was achieved in the second
August image also returned a relatively low OA standard deviation week of August with DT (roughly ten weeks before harvest). The dif-
(SD = 5.41%), indicating that it consistently produced good results ference in OA between these classifications was, however, statistically
across all classifiers. The differences in mean OAs between the images insignificant (P < 1.09). In contrast, the outputs of k-NN (67.6%) and
collected in August and those collected in September (maturation/ ML (47.9%) were significantly (P > 3) lower than those achieved by
harvesting stage, depending on the crop and the respective planting and SVM and RF. For Site B, the highest mean OA was recorded with images
harvest dates) are not statistically significant (two-tailed t-test collected between July and October. Similar to Site A, the highest mean
P = 0.387). However, the difference in OAs between the August image OA was recorded with an image collected early in August (1
and those collected at the beginning of the growing season (June) and Aug = 65.4%), followed by images collected mid-August (11 & 21
after harvest (January) are statistically significant (two-tailed t-test Aug = 63.5%), beginning of July (2 Jul = 51.6%) and mid-October (20
P < 0.0009). Oct = 50.9%). The 21 Aug image returned a relatively low OA standard
Table 5
Uni-temporal image classification overall accuracies (OAs %) and Kappa coefficients (Ks), as well as the average (AVG) and standard deviation (SD) of OAs and Ks, for
all classifiers and image dates in study sites A& B.
Site Image Date SVM k-NN DT RF ML AVG SD
OA K OA K OA K OA K OA K OA K OA K
A 1 6 Apr 46.6 0.29 42.7 0.24 40.7 0.22 50.9 0.34 42.7 0.24 44.7 0.26 4.06 0.04
2 26 Apr 48.2 0.30 35.6 0.15 41.1 0.23 45.8 0.27 42.3 0.25 42.6 0.24 4.88 0.05
3 16 May 52.1 0.36 42.3 0.25 40.7 0.22 53.7 0.39 47.0 0.31 47.1 0.30 5.75 0.07
4 6 Jun 60.3 0.48 42.3 0.24 48.6 0.32 58.4 0.44 48.2 0.29 51.5 0.35 7.56 0.10
5 4 Aug 79.6 0.73 70.1 0.61 66.6 0.56 77.2 0.70 76.4 0.69 73.9 0.65 5.41 0.07
6 3 Sep 80.0 0.73 62.3 0.51 65.0 0.54 74.5 0.66 69.8 0.59 70.3 0.60 7.14 0.08
7 3 Oct 74.9 0.67 55.6 0.55 51.7 0.37 67.4 0.57 33.7 0.23 56.6 0.47 15.8 0.17
8 13 Oct 74.5 0.66 53.3 0.38 56.0 0.42 67.0 0.56 52.9 0.37 60.7 0.47 9.58 0.12
9 23 Oct 70.1 0.60 54.9 0.40 51.7 0.37 68.6 0.58 48.2 0.31 58.7 0.45 10.0 0.13
10 31 Jan 54.1 0.39 47.8 0.32 50.5 0.35 57.6 0.44 50.1 0.35 52.0 0.37 3.84 0.04
AVG 64.0 0.52 50.6 0.36 51.2 0.36 62.1 0.49 51.1 0.36
SD 13.2 0.17 10.5 0.15 9.31 0.12 10.3 0.14 12.8 0.15
B 1 3 Apr 45.5 0.30 34.1 0.16 41.9 0.26 50.2 0.36 20.3 0.00 38.4 0.21 11.7 0.14
2 22 Jun 66.4 0.57 42.9 0.34 52.6 0.39 67.6 0.58 22.7 0.00 50.4 0.37 18.5 0.23
3 2 Jul 70.0 0.61 50.2 0.37 54.4 0.42 63.4 0.53 20.3 0.00 51.6 0.38 19.1 0.23
4 1 Aug 79.6 0.74 65.2 0.56 66.4 0.57 75.4 0.68 40.7 0.23 65.4 0.55 15.1 0.19
5 11 Aug 73.6 0.66 67.6 0.58 77.8 0.71 76.6 0.70 20.3 0.00 63.5 0.53 24.2 0.30
6 21 Aug 72.4 0.64 64.6 0.54 61.6 0.50 71.2 0.63 47.9 0.32 63.5 0.52 9.83 0.12
7 20 Oct 63.4 0.53 55.1 0.42 51.4 0.38 61.6 0.50 23.3 0.04 50.9 0.37 16.2 0.19
8 29 Dec 63.4 0.53 42.5 0.27 44.9 0.30 57.4 0.45 32.3 0.14 48.1 0.33 12.3 0.15
9 8 Jan 58.6 0.46 44.3 0.28 48.5 0.35 58.6 0.46 43.7 0.28 50.7 0.36 7.40 0.09
10 18 Jan 59.2 0.47 53.2 0.40 43.7 0.28 59.2 0.47 42.5 0.27 51.5 0.37 8.11 0.09
AVG 65.2 0.55 51.9 0.39 54.3 0.41 64.1 0.53 20.3 0.00
SD 9.59 0.12 11.2 0.13 11.3 0.14 8.47 0.10 11.2 0.13
6
deviation (SD = 9.83%) indicating that it consistently produced good increase in the number of images in Site A, while the opposite is ob-
results across all classifiers. Comparable to Site A, there is a statistically served in Site B (Figs. A-4 & A-5). These results confirm those of Ex-
significant difference (two-tailed t-test P < 0.0488) in OAs achieved periment 1 in that the highest OAs are achievable with combinations of
with the images collected in August (complete maturity) compared to images collected between August and September. This finding applies
those collected in June and January. to both sites.
With regards to the overall performance of the classifiers, SVM Overall, SVM achieved superior results across all image combina-
yielded the highest mean OA (64%), followed by RF (62.1%), DT tions, with a mean OA of 80% (Fig. A-4) in Site A. Although there is no
(51.2%), k-NN (51.1%) and ML (50.6%) in Site A (Fig. A-1). Both SVM statistically significant difference between SVM, RF, k-NN and DT (two-
and RF performed significantly better than k-NN, DT and ML (two- tailed t-test P < 0.174), the difference between SVM and ML is sta-
tailed: t-test P < 0.039). Similarly, SVM yielded the highest mean OA tistically significant (two-tailed t-test P = 0.003). A similar pattern is
(65.2%) in Site B, followed by RF (64.1%), DT (54.1%), k-NN (51.9%) seen in the results of Site B – there is no statistically significant dif-
and ML (20.3%) (Fig. A-2). Again, SVM and RF performed significantly ference in the performance of SVM, RF, k-NN and DT (Fig. A-5), but
better than k-NN, DT and ML (two-tailed: t-test P < 0.032). In Site A, SVM preformed significantly better than ML (two-tailed t-test
SVM was the least stable classifier (OA SD = 13.2%), while it generated P = 0.0003). From the standard deviations, it is evident that SVM, RF,
accuracies that were the second most stable (OA SD = 9.59%) in Site B. k-NN and DT were more stable (SD < 2.5) in both sites, while ML
RF produced consistently good results (OA SDs of 10.3% and 8.47% in displayed some instability (see Table 3.2).
sites A and B respectively). Overall, for Experiment 1 (uni-temporal
images), OAs increased from the beginning of the growing season and 3.3. Experiment 3: multitemporal, hand-selected image set
peaked around August and/or September. The OAs then drops at the
start of harvest. Experiment 3 was undertaken using five images which were hand
selected on the basis of the critical developmental stages in the growing
3.2. Experiment 2: multi-temporal, ranked-based image set cycle of crops, similar to Gilbertson et al. (2017). Results are presented
in Table 7. The highest individual OA (81.5%) in Site A was achieved
This experiment was conducted using combinations of five images with RF (Fig. A-5), which is marginally higher than the second highest
whose selection was based on the performance of individual images OA achieved with SVM (79.6%) (Table 3.5). Overall, DT and ML
(from Experiment 1). Four different combinations of these images were achieved the lowest OA, with ML being the worst performing classifier
tested and the results are presented in Table 6 The highest individual (28.2%), which was found to be significantly lower (P = 11.1) than the
OA (82.4%) in Site A was achieved using SVM on a combination of the OA achieved with RF (81.5%). Overall, this experiment achieved a
two images with the highest ranking (4 Aug & 3 Sept). However, this mean OA of 67.0%, which is not significantly different (two-tailed t-
was only marginally (2.2%) and insignificantly (P = -1.34) higher than test; P = 0.515) from the mean OA achieved with the highest mean OA
when SVM was applied to a combination of three images (4 Aug, 3 Sept (73.9%) obtained with Experiment 2 (Table 6) in Site A (although a 5%
& 13 Oct). When the mean OAs of all classifiers are considered, the difference would be regarded as significant by many analysts).
combination of the 4 Aug & 3 Sept images yielded a mean OA of 70.4%, With regard to Site B, the highest OA was achieved with SVM
which was slightly lower than when only the 4 Aug image was used as (80.2%), followed by RF (77.2%) (Table 3.3; Fig. A-6). The difference
input (OA of 73.9%). between the classifiers with the highest and second highest OAs was
The highest individual OA (79.6%) for Site B was achieved using the found to be statistically insignificant (P = 1.14). k-NN and ML achieved
single image with the highest rank (1 Aug) as input to SVM (Table 6). the lowest OA. Overall, this experiment achieved a mean OA of 72.1%,
However, this is only marginally (0.6%) and insignificantly (P-value of which is identical to the highest mean OA obtained with Experiment 2
0.25) higher than when SVM was applied to a combination of three (Table 6) in Site B. Similar to Experiment 1 & 2, the highest OAs for
images (1, 11 & 21 August). The combination of the two highest-ranked both sites were based on both SVM and RF.
individual images (1 August & 11 August) yielded the highest mean OA
(72.1%), but this was not significantly (two-tailed t-test P = 0.757) 3.4. Experiment 4: Chronologic image addition
higher than the second highest mean OA (70.8%) achieved when all
five images were used as input. Overall, the mean OA declines with an Experiment 4 was conducted by chronologically adding images from
Table 6
OAs and Kappa coefficients for rank-based image combinations, OAs obtained with all the images available for the season (All) and the average (AVG) for classifiers
and image combinations in study Site A & B.
Site # Dates SVM k-NN DT RF ML AVG SD
Image OA K OA K OA K OA K OA K OA K OA K
A Best 4 Aug 79.6 0.73 70.1 0.61 66.6 0.56 77.2 0.70 76.4 0.69 73.9 0.65 5.41 0.07
2 Best 4 Aug & 3 Sep 82.4 0.76 70.6 0.62 72.0 0.63 77.8 0.71 49.2 0.36 70.4 0.62 12.7 0.15
3 Best 4 Aug, 3 Sep, 13 Oct 80.2 0.74 69.6 0.58 69.6 0.59 79.0 0.73 49.2 0.36 69.5 0.60 12.4 0.15
4 Best 4 Aug, 3 Sep, 3 & 13 Oct 79.3 0.73 72.5 0.65 67.5 0.58 80.2 0.74 38.6 0.26 67.6 0.59 17.0 0.19
5 Best 4 Aug, 3 Sep, 3, 13 & 23 Oct 78.4 0.72 74.3 0.68 70.4 0.60 79.3 0.73 35.8 0.18 67.6 0.58 18.1 0.23
All 81.1 0.75 69.8 0.60 67.4 0.57 80.3 0.74 32.5 0.03 66.2 0.53 19.8 0.29
AVG 80.0 0.74 71.2 0.62 69.2 0.59 78.7 0.72 49.8 0.37
SD 1.50 0.01 1.94 0.03 2.18 0.02 1.20 0.01 16.0 0.19
B Best 1 Aug 79.6 0.74 65.2 0.56 66.4 0.57 75.4 0.68 40.7 0.23 65.4 0.55 15.1 0.19
2 Best 1 & 11 Aug 78.4 0.72 68.2 0.59 71.2 0.63 77.2 0.70 65.8 0.55 72.1 0.64 5.50 0.07
3 Best 1, 11 & 21Aug 79.0 0.73 67.6 0.58 69.4 0.61 78.4 0.72 58.6 0.46 70.6 0.62 8.45 0.11
4 Best 1, 11 & 21 Aug & 2 Jul 79.0 0.73 69.4 0.61 68.8 0.60 79.0 0.73 54.4 0.41 70.9 0.61 10.0 0.13
5 Best 1, 11, 21 Aug, 2 Jul & 20 Oct 78.4 0.72 65.8 0.56 71.8 0.64 77.2 0.71 61.0 0.49 70.8 0.62 7.42 0.09
All 80.8 0.75 66.4 0.57 71.8 0.64 76.0 0.69 62.8 0.52 71.5 0.63 7.21 0.09
AVG 78.9 0.73 67.2 0.58 69.5 0.61 77.4 0.70 56.1 0.42
SD 0.50 0.00 1.72 0.02 2.13 0.02 1.38 0.01 9.54 0.12
7
Table 7
OA and Kappa coefficients and the average (AVG), for the five images selected based on the crop development stages for study Site A & B (refer to Fig. 2.2 for
respective crop development stages corresponding to the selected images).
Site # Image dates SVM k-NN DT RF ML AVG SD
image
A Five 26 Apr, 6 Jun, 4 Aug, 3 Sept & 23 Oct 79.6 0.73 76.8 0.69 69.0 0.59 81.5 0.75 28.2 0.13 67.0 0.57 22.2 0.25
B Five 3 Apr, 22 Jun, 11 & 21 Aug & 20 Oct 80.2 0.74 65.2 0.55 73.6 0.66 77.2 0.70 0.54 72.1 0.63 7.03 0.08
64.6
Table 8
OAs and Kappa coefficients, as well as the average (AVG) and standard deviation (SD) of OAs and Ks, for all classifiers and image dates in study sites A& B for the
incremental classifications in Site A and B.
Site Image Dates SVM k-NN DT RF ML AVG SD
A 1– 2 6 Apr – 26 Apr 51.7 0.35 44.3 0.27 41.5 0.24 54.5 0.39 26.6 0.12 43.7 0.30 10.9 0.10
1– 3 6 Apr – 16 May 56.0 0.42 50.5 0.36 46.2 0.30 57.6 0.44 22.7 0.10 46.6 0.32 14.1 0.13
1– 4 6 Apr – 6 Jun 61.5 0.49 57.2 0.44 49.0 0.33 63.1 0.51 27.4 0.10 51.6 0.35 14.6 0.16
1– 5 6 Apr – 4 Aug 79.6 0.73 63.1 0.51 70.1 0.61 78.0 0.71 21.5 0.06 62.6 0.52 23.8 0.27
1– 6 6 Apr – 3 Sept 80.3 0.74 68.2 0.58 70.1 0.61 79.6 0.73 34.9 0.05 66.6 0.54 18.5 0.28
1– 7 6 Apr – 3 Oct 79.2 0.72 69.0 0.59 69.4 0.60 79.6 0.73 33.3 0.02 66.1 0.53 19.0 0.29
1– 8 6 Apr – 13 Oct 78.0 0.71 69.4 0.60 66.6 0.56 83.1 0.78 32.5 0.13 65.9 0.55 19.8 0.25
1– 9 6 Apr – 23 Oct 78.8 0.72 69.0 0.59 66.6 0.56 78.8 0.72 32.5 0.13 65.4 0.54 19.0 0.24
All 6 Apr – 31 Jan 81.1 0.75 69.8 0.60 67.4 0.57 80.3 0.74 32.5 0.03 66.2 0.53 19.8 0.29
AVG 69.2 0.59 60.3 0.47 58.7 0.46 70.5 0.60 30.6 0.08
SD 11.8 0.15 9.49 0.12 11.6 0.15 11.0 0.14 4.91 0.04
B 1– 2 3 Apr – 22 Jun 56.8 0.45 52.0 0.39 54.4 0.42 62.2 0.52 55.0 0.41 56.0 0.43 3.82 0.05
1– 3 3 Apr – 2 Jul 68.8 0.60 55.6 0.44 50.2 0.36 67.6 0.58 47.9 0.32 58.0 0.46 9.71 0.12
1– 4 3 Apr – 1 Aug 77.2 0.70 65.2 0.56 74.2 0.67 77.8 0.71 55.0 0.41 69.8 0.61 9.72 0.12
1– 5 3 Apr – 11 Aug 79.0 0.73 67.6 0.59 74.2 0.67 82.0 0.77 59.8 0.48 72.5 0.64 8.95 0.11
1– 6 3 Apr – 21 Aug 78.4 0.72 70.6 0.62 71.8 0.64 77.8 0.71 67.6 0.58 73.2 0.65 4.69 0.05
1– 7 3 Apr – 20 Oct 80.2 0.74 68.8 0.60 72.4 0.64 78.4 0.72 64.0 0.53 72.7 0.64 6.70 0.08
1– 8 3 Apr – 29 Dec 80.2 0.74 71.2 0.63 72.4 0.65 76.6 0.70 65.2 0.55 73.2 0.65 5.68 0.07
1– 9 3 Apr – 8 Jan 80.8 0.75 68.2 0.59 71.8 0.64 78.4 0.72 64.0 0.53 72.6 0.64 6.97 0.09
All 3 Apr – 18 Jan 80.5 0.75 66.4 0.57 71.8 0.64 76.0 0.69 62.8 0.52 71.5 0.63 7.12 0.09
AVG 72.7 0.64 61.9 0.51 65.5 0.55 72.7 0.64 72.1 0.43
SD 8.02 0.10 6.71 0.08 9.08 0.11 6.22 0.07 6.33 0.08
the beginning to the end of the growing season. At Site A (Table 8), used as input. The difference between these results was statistically
SVM and RF were able to achieve accuracies of greater than 75% insignificant (P = 0.22). The highest individual classification result
(SVM = 79.6%; RF = 78%) by the first week of August (about eight (82%) was achieved when all images up to the image collected in the
weeks before harvest). The highest individual OA for this site (83.1%) second week of August (11 August) was used as input to RF (Fig. A-8).
was achieved when all images up to 13 October (approximately one This is only marginally higher than the highest individual OA (80.5%)
week after harvest started) was used as input to RF. A slightly lower OA obtained when the entire time series was used with SVM. Although
(81.1%) was achieved when all images up to January were used with there is no statistically significant difference between the highest in-
SVM. Overall, the highest mean OA (66.6%) for Site A was obtained dividual OAs obtained when using images collected pre-harvest only
when the image collected in the first week of September (Image 6) was and when the entire time series is used in both Site A & B (P < 0.5),
added to the classification (Fig. A-7). This mean OA is slightly higher there is a notable difference, with pre-harvest images achieving a
than what was obtained with the entire time series (66.2%), but is not slightly higher (1.5%) OA. In Site B, the highest mean OA (73.2%) was
significant (two tailed: t-test P = 0.97). For Experiment 4, RF generally achieved with all images up to the image collected in the third week of
outperformed the other classifiers, with a mean OA of 70.5% for all August (21 August: Image 6), which is approximately four weeks before
scenarios, followed by SVM (69.2%), k-NN (60.3%), DT (58.7%) and harvest. This is slightly higher than the mean OA achieved when using
ML (30.6%). the entire time series (71.5%). Similar to Site A, the difference in these
The results for the entire time series were slightly better than when OAs is not statistically significant (two tailed: t-test P = 0.793).
the pre-harvest images were used, but the differences in OAs are sta- Concerning classifiers, RF recorded the highest mean OA (73.4%),
tistically insignificant across all classifiers (two tailed: t-test followed by SVM (72%) when only pre-harvest images were used. The
P > 0.321). Generally, RF and SVM outperformed the other classifiers two classifiers performed on par when the entire time series was used as
and there is no statistical significance between the SVM and RF (two input (72.7%). Although there is no significant difference between the
tailed: t-test P > 0.864) results. There is, however, a statistically sig- performance of SVM and RF, they both achieved significantly higher
nificant difference in the OAs of both RF and SVM in comparison to k- OAs than those of k-NN and ML (two tailed: t-test P > 0.1) in both
NN, DT and ML (two tailed: t-test P < 0.04). All classifiers displayed sites. Overall, all classifiers achieved slightly higher results in Site A
some instability across the different growth stages (SD > 11), except when the entire time series was used compared to when only pre-har-
for ML which consistently produced low OAs. vest images were used (Fig. A-9). No clear trend could be established
Similar to Site A, SVM and RF achieved OAs of greater than 75% regarding the difference in OA between pre-harvest and the entire time
(SVM = 77.2%; RF = 77.8%) in Site B (Table 8) when images up to the series for the various classifiers in Site B (Fig. A-10).
beginning of August (approximately eleven weeks before harvest) were In comparison to Experiment 2, Experiment 4 achieved significantly
8
Table 9 producer′s accuracies (PAs) exceeding 90% in both sites. The only ex-
Summary of the highest OAs achieved across all Experiments in site A & B. ception was in Experiment 1 (87% for canola). Planted pasture was the
Experiment Site A best OAs Site B best OAs AVG best OAs most confused class in Site A (PA of below 60%), while lucerne was the
most confused in Site B (PA of below 55%). Wheat returned OAs of
1 80.0 79.6 79.8 above 70% with SVM and RF across all experiments during the August/
2 82.4 79.6 81
September period.
3 81.5 80.2 80.8
4 (pre-harvest) 80.3 82.0 81.1
4 (all) 81.1 80.8 80.9 4. Discussion
AVG best OAs 81.4 80.4 80.7
From the results, it is noticeable that rank-based selection

(Experiment 2) and pre-harvest images (Experiment 4, pre-harvest)
lower results (two-tailed t-test P = 0.014) in Site A, while Experiment 4
outperformed the other experiments (Table 9). However, based on the
achieved a slightly higher OA in Site B (although the difference is sta-
marginal differences in accuracies among the experiments, it is ques-
tistically insignificant; P = 0.587). Table 9 summarises the overall re-
tionable whether selecting images based on individual performance is
sults per experiment (using only the best results in each experiment).
worth the effort (and expense) of carrying out multiple uni-temporal
Compared to Experiment 3, Experiment 4 achieved slightly lower OAs
classifications, especially given that hand-picking images based on
in Site A while achieving slightly higher OAs in Site B. When both sites
critical crop developmental stages (Experiment 3) produced similar
are considered in combination, Experiment 4 (pre-harvest images)
results. This finding is in agreement with Asgarian et al. (2016),
outperformed the other experiments (average OA of 81.1%), but this is
Gilbertson et al. (2017) and Peña et al. (2014) who reported that the
only marginally higher than the 81% OA achieved in Experiment 2 and
selection of images at important crop developmental stages is an ef-
80.9% and 80.8% attained in Experiment 4 (all) and Experiment 3 re-
fective strategy to improve crop type classification accuracies. The re-
spectively.
sults of Experiment 3 and 4 are in agreement with Lussem et al. (2016)
With regard to the best period for identifying crop types, it is evi-
who found that the addition of “good” images (images that produced
dent that crops can be classified with good OAs at the beginning of
high OAs) to “bad” images (images that produced low OAs) does not
August, which marks the complete maturation stage for most crops.
compensate for errors caused by the latter. The statistical insignificance
Based on the confusion matrices of the best classification results ob-
in the differences in OAs achieved across all experiments is in agree-
tained within the August/September period across all experiments,
ment with Hao et al. (2015) who assessed the influence of time series
SVM and RF were able to classify canola and fallow fields with
length on classification accuracies and found that the addition of more
Fig. 6. Spectral profiles of the targeted crops during (a) planting, (b) maturation and (c) post-harvest stages in both study sites.
9
than five images did not have any significant impact on classification Images collected after harvest are unlikely to have any positive effect on
accuracies. crop classification accuracies. This is supported by slightly higher ac-
Although it was not possible to clearly establish trends with curacies achieved in Experiment 2, which did not include images from
Experiment 2 and 3 (image selection), it is clear from Experiment 1 and this period.
4 that the OAs of images acquired at the beginning of the season are low With regard to classifier performance, SVM and RF achieved sig-
(mean OA < 55%), peak around Aug/Sept (mean OA > 65%) and nificantly higher OAs compared to the other classifiers and were able to
then decline towards the end of the season (mean < 55%). Thus, achieve relatively good results by August/September (approximately
images collected during the maturation period (August/September) eight weeks before harvest when the image captured early in August is
yielded significantly higher classification accuracies when compared to used; the period will vary depending on planting and harvesting dates
those acquired during the early stages of development (April/June) and of different crops). The highest individual accuracy (83.1%) was
after harvest (post October). Low OAs are expected in the beginning of achieved with RF when all images up to mid-October were used. The
the season (January–April) due to a lack of growth in most fields, strength of RF over the other classifiers, especially with the addition of
making the soil background the main scattering contributor to the re- the eighth image (Experiment 4) can be attributed to the classifier′s
flected signal and maximising soil interference with the spectral re- ability to handle highly dimensional data without deletion. This is
flectance of the crops in focus. This is supported by the low classifica- supported by the classifier′s ability to perform well (mean
tion accuracies obtained in Experiment 1 when images collected OA = 80.3%) when all images were used (410 variables) as input. This
between April and early June (pre-maturation) and after December is comparable to the findings of Gilbertson et al. (2017) who compared
(post-harvest) were used. A comparison of spectral curves generated for feature sets with different sizes to assess the impact of dimensionality
the different crops during three main stages of development (planting, reduction on crop type classification accuracies and found both SVM
maturation and post-harvest) is shown Fig. 6. The generated profiles and RF to produce high accuracies (> 85%) when all features (2 0 5)
show that the crops had similar reflectance during planting and post- were used as input. This is also in agreement with Rodriguez-Galiano
harvest periods, but during maturation their spectral characteristics et al. (2012) who used RF to classify Mediterranean land cover with
were more distinct allowing for better discrimination. multi-temporal imagery and multi-seasonal texture measures – 972
Similarities in spectral responses of crops in their early stages of input variables in total – and found RF to be able to handle the high
development have also been reported to have a negative effect on dimensional data well.
classification accuracies (Azar et al., 2016; Hao et al., 2018). For ex- The strong performance of SVM corresponds to the findings of
ample, the separation of crops such as lucerne and planted pastures are Myburgh & Van Niekerk (2013). While differences within crop fields
expected to be the most challenging because of their spectral simila- (e.g. crop conditions), differences in developmental stages, etc. have
rities, especially at the beginning of the growing season. This is con- been shown to have a negative impact on classification accuracies when
firmed by the confusion matrices, which show that most confusion remotely sensed imagery is used (Peña-Barragán et al. 2011), SVM has
occurred between these two classes. been shown to be less sensitive to intra-class variations compared to
As the crops reach maturation (late July–late September) they de- other classifiers (Zheng et al., 2015). This characteristic is believed to
velop more distinguishing properties, providing classifiers with more have been a major contributing factor to its above-average performance
information for differentiation. For example, crops such as canola turn in this study. Furthermore, SVM′s use of an optimised sample for cal-
bright yellow during maturation (flowering), making the crop easy to culating support vectors and defining the hyperplane by prioritising
distinguish from other crops. Wheat was also well differentiated, likely samples that lie on the edge of the class distribution in feature space
due to the absence of other cereals such as barley and oats in the study (Zheng et al. 2015) make it suitable for use with high dimensional
sites (the classes most commonly confused with wheat) (Veloso et al., datasets. This explains the classifier′s relatively good performance when
2017). The confusion between planted pastures and lucerne persisted all images were included in the classification (Experiment 4).
throughout the maturation stage, suggesting that these two classes do The insignificance in the different OAs achieved among the various
not have sufficient differentiating attributes. Fallow fields were the best experiments in this study suggests that image selection does not have
differentiated when the cultivated crops reached maturation (Aug/ any significant impact on crop type mapping. This is an important
Sept), which was expected given that the biomass difference between finding for operational crop type mapping as the use of all available
cultivated and uncultivated fields (bare soil interspersed with weed cloud-free images reduces the complexity of automated workflows.
growth) is most dramatic at this stage. However, images during the August/September period (a few weeks
Although OAs peaked throughout the maturation stage (until the before harvest) are critical as it seems to have contributed the most to
end of September), a decline in OA is observed at the beginning of the OAs achieved in this study.
harvest (start of October) (Table 5). This decline in OA at the start of In this study, cloud-contaminated images were manually identified
harvest is likely caused by the sudden removal of biomass and variation (through visual interpretation) and excluded from the analyses. Given
in harvest dates. This is supported by the increase in confusion among that manual selection of cloud-free images is not viable for many op-
cultivated and fallow fields (see Table A2–A9). This finding is in erational implementations (e.g. over very large areas and involving
agreement with Veloso et al. (2017) who found that crops are mostly multiple images), alternative approaches are needed. One option is to
misclassified in their early stages of development, with the accuracies make use of cloud-masking algorithms (Sedano et al., 2011; Bai et al.
peaking as the vegetation cover increases and crops mature, then de- 2016; Han et al. 2017) to automatically select suitable images. Another
clining again at the beginning of harvest. Also, temporal gaps (missing option is to perform image compositing (Vancutsem et al., 2007; Flood,
images at important development stages) have been cited as one of the 2013; Lück and Van Niekerk, 2016) to reduce the effect of cloud-con-
primary causes of reduced classification accuracies in multi-temporal tamination. These approaches warrant more research, particularly
crop type mapping studies (Inglada et al., 2015). For instance, Blaes within the context of crop type classification.
et al. (2005) reported significant drops in classification accuracies when This study made use of Sentinel-2A imagery only. It would be of
even a single optical image was unavailable at a key period within the interest to perform a similar set of experiments using the improved
growing season. temporal resolution provided by the Sentinel-2B satellite (which be-
Additionally, the value of including images collected post-harvest came operational after the in situ survey of this study was carried out).
was evaluated in Experiments 1 and 4. The accuracies were generally The denser time-series provided by the combination of imagery from
low (< 65%) when the post-harvest (January) images were used as both satellites is expected to increase classification accuracies, but more
input to the uni-temporal classifications (Experiment 1). Fields are research is needed to test this hypothesis.
generally left fallow after harvest and are thus spectrally similar. The classifications in this study were carried out with a relatively
10
small set of training and test samples (as little as 43 for some classes). It crop type census and the Centre for Geographical Analysis is thanked
is likely that some of these samples were inaccurate, which could have for providing access to the classification and accuracy assessment
had an impact on the relatively low overall accuracies obtained in this software. We are also grateful to www.linguafix.net for the language
study. The collection of suitable training and test data is a major lim- checking and editing services provided.
iting factor of supervised crop type mapping approaches and more work
is needed to develop methods whereby training samples collected in Appendix A. Supplementary material
one season can be reused in subsequent seasons and to extend training
samples collected in one region to another. Supplementary data to this article can be found online at https://
doi.org/10.1016/j.compag.2019.105164.
5. Conclusion
References
This study investigated the use of five machine learning classifiers
for crop type classification in two sites located in the Western Cape Asgarian, A., Soffianian, A., Pourmanafi, S., 2016. Crop type mapping in a highly frag-
province of South Africa. A set of cloud-free Sentinel-2 images was used mented and heterogeneous agricultural landscape: a case of central Iran using multi-
temporal Landsat 8 imagery. Comput. Electron. Agric. 127, 531–540.
as input to the classifiers. Considering the effort involved in developing Azar, R., Villa, P., Stroppiana, D., Crema, A., Boschetti, M., Brivio, P.A., 2016. Assessing
crop calendars, as well as the impact of seasonal weather variations on in-season crop classification performance using satellite data: a test case in Northern
the accuracy of such calendars, alternative approaches for selecting Italy. European J. Remote Sens. 49 (1), 361–380.
Bai, T., Li, D., Sun, K., Chen, Y., Li, W., 2016. Cloud detection for high-resolution satellite
images were evaluated. Specifically, the efficacy of selecting images imagery using machine learning and multi-feature fusion. Remote Sens. 8 (9), 715.
based on their uni-temporal performance (i.e. using a single image as Blaes, X., Vanhalle, L., Defourny, P., 2005. Efficiency of crop identification based on
input to the classifiers) was assessed. The performance of the machine optical and SAR image time series. Remote Sens. Environ. 96 (3–4), 352–365.
Bradski, G. R., &Pisarevsky, V. (2000). Intel's Computer Vision Library: applications in
learning classifiers when only pre-harvest images are used as input was calibration, stereo segmentation, tracking, gesture, face and object recognition. In
also tested. The results showed that the selection of images based on Computer Vision and Pattern Recognition, 2000. Proceedings. IEEE Conference on
individual performance offers a viable alternative to selecting images (Vol. 2, pp. 796–797). IEEE.
Bolton, D.K., Friedl, M.A., 2013. Forecasting crop yield using remotely sensed vegetation
based on crop developmental stages, but the effort and cost of im-
indices and crop phenology metrics. Agric. For. Meteorol. 173, 74–84.
plementing such an approach may not warrant the marginal (and in- Campbell, J.B., Wynne, R.H., 2011. Introduction to remote sensing. Guilford Press.
significant) accuracy improvements observed. Furthermore, even Chang, C.C., Lin, C.J., 2011. LIBSVM: a library for support vector machines. ACM Trans.
though this approach offers a viable alternative to the conventional Intell. Syst. Technol. (TIST) 2 (3), 27.
Conrad, C., Fritsch, S., Zeidler, J., Rücker, G., Dech, S., 2010. Per-field irrigated crop
method of selecting images, the results should be interpreted within the classification in arid Central Asia using SPOT and ASTER data. Remote Sens. 2 (4),
context of Mediterranean climates. The findings of this study suggest 1035–1056.
that the accuracy of the classification of crops when using an entire Delegido, J., Alonso, L., González, G., Moreno, J., 2010. Estimating chlorophyll content of
crops from hyperspectral data using a normalized area over reflectance curve
time series can be as accurate as when a subset of hand selected images (NAOC). Int. J. Appl. Earth Obs. Geoinf. 12 (3), 165–174.
are used. This suggests that image selection is not necessary for op- Duro, D.C., Franklin, S.E., Dubé, M.G., 2015. A comparison of pixel-based and object-
erational crop type mapping, which simplifies automated image pro- based image analysis with selected machine learning algorithms for the classification
of agricultural landscapes using SPOT-5 HRG imagery. Remote Sens. Environ. 118,
cessing workflows. 259–272.
The principal finding of this study is that the crops (and fallow Foody, G., & Atkinson, P., 2004. Uncertainty in Remote Sensing and GIS. Wiley,
fields) considered in this study can effectively be classified with images Chichester, 2002–2002.
Foley, J.A., Ramankutty, N., Brauman, K.A., Cassidy, E.S., Gerber, J.S., Johnston, M.,
acquired from the beginning of June to before harvest (up to Sept). SVM Muller, N.D., O’Connell, C., Ray, D.K., West, P.C., Balzer, C., Bennett, E.M., Hill, C.J.,
and RF are recommended as these classifiers performed consistently Mofreda, S.P., Polasky, S., Rockstrom, J., Sheehan, J., Siebert, S., Tilman, D., Zaks,
well in all the experiments. D.P.M., 2011. Solutions for a cultivated planet. Nature 478 (7369), 337–342.
Foody, G. M., & Atkinson, P. M. (Eds.). (2003). Uncertainty in remote sensing and GIS.
John Wiley&Sons.
CRediT authorship contribution statement Flood, N., 2013. Seasonal composite Landsat TM/ETM+ images using the medoid (a
multi-dimensional median). Remote Sens. 5 (12), 6481–6500.
Mmamokoma Grace Maponya: Conceptualization, Methodology, Foerster, S., Kaden, K., Foerster, M., Itzerott, S., 2012. Crop type mapping using spec-
tral–temporal profiles and phenological information. Comput. Electron. Agric. 89,
Validation, Formal analysis, Investigation, Resources, Data curation, 30–40.
Writing - original draft, Visualization, Project administration. Adriaan Forkuor, G., Conrad, C., Thiel, M., Ullmann, T., Zoungrana, E., 2014. Integration of op-
van Niekerk: Conceptualization, Project administration, Visualization, tical and synthetic aperture radar imagery for improving crop mapping in north-
western Benin West Africa. Remote Sens. 6 (7), 6472–6499.
Funding acquisition. Zama Eric Mashimbye: Conceptualization, Gilbertson, J.K., Kemp, J., Van Niekerk, A., 2017. Effect of pan-sharpening multi-tem-
Visualization, Project administration, Funding acquisition. poral Landsat 8 imagery for crop type differentiation using different classification
techniques. Comput. Electron. Agric. 134, 151–159.
Grandoni, D. (2013). Advantages and Limitations of Using Satellite Images for Flood
Declaration of Competing Interest Mapping. in Workshop on the Use of the Copernicus Emergency Service for Floods.
Gumma, M.K., Nelson, A., Thenkabail, P.S., Singh, A.N., 2011. Mapping rice areas of
The authors declare that they have no known competing financial South Asia using MODIS multitemporal data. J. Appl. Remote Sens. 5 (1), 053547.
Hao, P., Tang, H., Chen, Z., Liu, Z., 2018. Early-season crop mapping using improved
interests or personal relationships that could have appeared to influ-
artificial immune network (IAIN) and Sentinel data. PeerJ 6, e5431.
ence the work reported in this paper. Hao, P., Zhan, Y., Wang, L., Niu, Z., Shakir, M., 2015. Feature selection of time series
MODIS data for early crop classification using random forest: a case study in Kansas,
USA. Remote Sens. 7 (5), 5347–5369.
Acknowledgement
Han, Y., Bovolo, F., Lee, W.H., 2017. Automatic cloud-free image generation from high-
resolution multitemporal imagery. J. Appl. Remote Sens. 11 (2), 025005.
This work forms part of a larger project titled “Wide-scale modelling Immitzer, M., Vuolo, F., Atzberger, C., 2016. First experience with Sentinel-2 data for
of water use and water availability with earth observation/satellite crop and tree species classifications in central Europe. Remote Sens. 8 (3), 166.
Inglada, J., Arias, M., Tardy, B., Hagolle, O., Valero, S., Morin, D., Koetz, B., 2015.
imagery”, which was initiated and funded by the Water Research Assessment of an operational system for crop type map production using high tem-
Commission (WRC) of South Africa. More information about this pro- poral and spatial resolution satellite optical imagery. Remote Sens. 7 (9),
ject is available in WRC Report no. TT 745 available at 12356–12379.
Inglada, J., Vincent, A., Arias, M., Marais-Sicre, C., 2016. Improved early crop type
www.wrc.org.za. The authors also thank the Agricultural Research identification by joint use of high temporal resolution SAR and optical image time
Council (ARC) for co-funding this study. The European Space Agency series. Remote Sens. 8 (5), 362.
(ESA) is also acknowledged for providing the Sentinel-2 data, while the Jayne, T. S., & Rashid, S. (2010). The value of accurate crop production forecasts. In:
Fourth African Agricultural Markets Program (AAMP) Policy Symposium, Lilongwe,
Western Cape Department of Agriculture is credited for supplying the
11
Malawi, September 6–10, 2010. environmental measures by remote sensing. Agron. Sustain. Dev. 28 (2), 355–362.
Jiao, X., Kovacs, J.M., Shang, J., McNairn, H., Walters, D., Ma, B., Geng, X., 2014. Object- Peña-Barragán, J.M., Ngugi, M.K., Plant, R.E., Six, J., 2011. Object-based crop identifi-
oriented crop mapping and monitoring using multi-temporal polarimetric cation using multiple vegetation indices, textural features and crop phenology.
RADARSAT-2 data. ISPRS J. Photogramm. Remote Sens. 96, 38–46. Remote Sens. Environ. 115 (6), 1301–1316.
Joshi, N., Baumann, M., Ehammer, A., Fensholt, R., Grogan, K., Hostert, P., Reiche, J., Peña, J.M., Gutiérrez, P.A., Hervás-Martínez, C., Six, J., Plant, R.E., López-Granados, F.,
2016. A review of the application of optical and radar remote sensing data fusion to 2014. Object-based image classification of summer crops with machine learning
land use mapping and monitoring. Remote Sens. 8 (1), 70. methods. Remote Sens. 6 (6), 5019–5041.
LaDue, D.S., Heinselman, P.L., Newman, J.F., 2010. Strengths and limitations of current Rodriguez-Galiano, V.F., Chica-Olmo, M., Abarca-Hernandez, F., Atkinson, P.M.,
radar systems for two stakeholder groups in the southern plains. Bull. Am. Meteorol. Jeganathan, C., 2012. Random Forest classification of Mediterranean land cover
Soc. 91 (7), 899–910. using multi-seasonal imagery and multi-seasonal texture. Remote Sens. Environ. 121,
Lebourgeois, V., Dupuy, S., Vintrou, É., Ameline, M., Butler, S., Bégué, A., 2017. A 93–107.
combined random forest and OBIA classification scheme for mapping smallholder Schmedtmann, J., Campagnolo, M.L., 2015. Reliable crop identification with satellite
agriculture at different nomenclature levels using multisource data (simulated sen- imagery in the context of common agriculture policy subsidy control. Remote Sens. 7
tinel-2-time series, VHRS and DEM). Remote Sens. 9 (3), 259. (7), 9325–9346.
Leroux, L., Jolivot, A., Bégué, A., Seen, D.L., Zoungrana, B., 2014. How reliable is the Siachalou, S., Mallinis, G., Tsakiri-Strati, M., 2015. A hidden Markov models approach for
MODIS land cover product for crop mapping Sub-Saharan agricultural landscapes? crop classification: Linking crop phenology to time series of multi-sensor remote
Remote Sens. 6 (9), 8541–8564. sensing data. Remote Sens. 7 (4), 3633–3650.
Liu, M.W., Ozdogan, M., Zhu, X., 2014. Crop type classification by simultaneous use of Sedano, F., Kempeneers, P., Strobl, P., Kucera, J., Vogt, P., Seebach, L., San-Miguel-
satellite images of different resolutions. IEEE Trans. Geosci. Remote Sens. 52 (6), Ayanz, J., 2011. A cloud mask methodology for high resolution remote sensing data
3637–3649. combining information from high and medium resolution optical sensors. ISPRS J.
Lück, W., Van Niekerk, A., 2016. . Evaluation of a rule-based compositing technique for Photogramm. Remote Sens. 66 (5), 588–596.
Landsat-5 TM and Landsat-7 ETM+ images. Int. J. Appl. Earth Observ. Geoinformat. Tavakkoli, S. S., Lohmann, P., &Soergel, U. (2008). Monitoring agricultural activities
47, 1–14. using multi-temporal ASAR ENVISAT data. The International Archives of the
Lunetta, R.S., Shao, Y., Ediriwickrema, J., Lyon, J.G., 2010. Monitoring agricultural Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII,
cropping patterns across the Laurentian Great Lakes Basin using MODIS-NDVI data. Part B7, WG VII/5 Processing of Multi-Temporal Data and Change Detection,
Int. J. Appl. Earth Obs. Geoinf. 12 (2), 81–88. 735–742.
Lussem, U., Hütt, C., Waldhoff, G., 2016. Combined analysis of sentinel-1 and rapideye Tererai, F., Gaertner, M., Jacobs, S.M., Richardson, D.M., 2015. Resilience of invaded
data for improved crop type classification: an early season approach for rapeseed and riparian landscapes: the potential role of soil-stored seed banks. Environ. Manage. 55
cereals. Int. Arch. Photogr. Remote Sens. Spatial Inform. Sci. 41. (1), 86–99.
Malan, G. J. (2016). Investigating the suitability of land type information for hydrological Thenkabail, P.S., Mariotto, I., Gumma, M.K., Middleton, E.M., Landis, D.R., Huemmrich,
modelling in the mountain regions of Hessequa, South Africa (Doctoral dissertation, K.F., 2013. Selection of hyperspectral narrowbands (HNBs) and composition of hy-
Stellenbosch: Stellenbosch University). perspectral two band vegetation indices (HVIs) for biophysical characterization and
McNairn, H., Kross, A., Lapen, D., Caves, R., Shang, J., 2014. Early season monitoring of discrimination of crop types using field reflectance and Hyperion/EO-1 data. IEEE J.
corn and soybeans with TerraSAR-X and RADARSAT-2. Int. J. Appl. Earth Obs. Sel. Top. Appl. Earth Obs. Remote Sens. 6 (2), 427–439.
Geoinf. 28, 252–259. Vancutsem, C., Pekel, J.F., Bogaert, P., Defourny, P., 2007. Mean compositing, an alter-
Mingwei, Z., Qingbo, Z., Zhongxin, C., Jia, L., Yong, Z., Chongfa, C., 2008. Crop dis- native strategy for producing temporal syntheses. Concepts and performance as-
crimination in Northern China with double cropping systems using Fourier analysis of sessment for SPOT VEGETATION time series. Int. J. Remote Sens. 28 (22),
time-series MODIS data. Int. J. Appl. Earth Obs. Geoinf. 10 (4), 476–485. 5123–5141.
Myburgh, G., Van Niekerk, A., 2013. Effect of feature dimensionality on object-based land Veloso, A., Mermoz, S., Bouvet, A., Le Toan, T., Planells, M., Dejoux, J.F., Ceschia, E.,
cover classification: a comparison of three classifiers. South African J. Geomat. 2 (1), 2017. Understanding the temporal behavior of crops using Sentinel-1 and Sentinel-2-
13–27. like data for agricultural applications. Remote Sens. Environ. 199, 415–426.
Myint, S.W., Gober, P., Brazel, A., Grossman-Clarke, S., &Weng, Q. (2011). Per-pixel vs. Wu, B., Meng, J., Li, Q., Yan, N., Du, X., Zhang, M., 2014. Remote sensing-based global
object-based classification of urban land cover extraction using high spatial resolu- crop monitoring: experiences with China's CropWatch system. Int. J. Digital Earth 7
tion imagery. Remote Sens. Environ., 115(5), 1145–1161. (2), 113–137.
Ozdarici-Ok, A., Ok, A.O., Schindler, K., 2015. Mapping of agricultural crops from single You, H., Jianjuan, X., Xin, G., 2016. Radar data processing with applications. John Wiley
high-resolution multispectral images—Data-driven smoothing vs. parcel-based & Sons.
smoothing. Remote Sens. 7 (5), 5611–5638. Zheng, B., Myint, S.W., Thenkabail, P.S., Aggarwal, R.M., 2015. A support vector machine
Peña-Barragán, J.M., López-Granados, F., García-Torres, L., Jurado-Expósito, M., de La to identify irrigated crop types using time-series Landsat NDVI data. Int. J. Appl.
Orden, M.S., García-Ferrer, A., 2008. Discriminating cropping systems and agro- Earth Obs. Geoinf. 34, 103–112.
12

Computers and Electronics in Agriculture 169 (2020) 105164

Uploaded by

Copyright:

Available Formats

Computers and Electronics in Agriculture 169 (2020) 105164

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Computers and Electronics in Agriculture 169 (2020) 105164

Uploaded by

Copyright:

Available Formats

Computers and Electronics in Agriculture 169 (2020) 105164

Contents lists available at ScienceDirect

Computers and Electronics in Agriculture

Pre-harvest classification of crop types using a Sentinel-2 time-series and T

ARTICLE INFO ABSTRACT

1. Introduction to make economic forecasts such as the estimated contribution (and

1 6 Apr 2016 Sowing 25 Multiple cutting options. Sowing 14

1 3 Apr 2016 Sowing 25 Multiple cutting options. Sowing Sowing 19

Fig. 3. Experimental design.

From the results, it is noticeable that rank-based selection

You might also like