Academia.eduAcademia.edu

Change Detection Based on Information Measure

2011, IEEE Transactions on Geoscience and Remote Sensing

A new unsupervised change detection method for modeling nonlinear temporal dependences based on local information is proposed. A theoretical analysis is presented, demonstrating how to derive optimal parameters for automating the method. It is then validated on both simulated data and very high resolution remote sensing imagery. The results show a clear improvement in change detection using the proposed method compared to other state-of-the-art change detection techniques.

1 Change Detection between Very High Resolution Images Based on Information Measure Lionel Gueguen, Pierre Soille, and Martino Pesaresi Abstract A new method for performing change detection between very high resolution images in the context of post crisis situation assessment is presented. The method adopts an object based image representation to focus on human settlements and uses information based change indicator. An automatic thresholding of the proposed change detectors is proposed. Simulations and real data sets enable to assess the improvement, in terms of receiver operational characteristics, of such detectors with respect to change vector analysis ones. Also the automatic thresholding process is assessed in terms of total error probability. Finally, real case scenarios about internally displaced people camp detection and rebuilt buildings detection are presented, highlighting the benefits of the methodology in crisis management context. I. I NTRODUCTION Post crisis situation assessment is one of the most important aspect for establishing sustainable security as well as restoring normal life in a hit area. Within this context, earth observation imagery is of particular interest to analyze the anthropogenic or biophysical changes engendered by a conflict or a natural hazard. Population movements, building and infrastructure changes, environment modification can be retraced from the analysis of image time series and help in assessing the population vulnerability which is a key parameter for peace keeping operations in the area. As time evolution of human settlement is of interest, often pre- and post-event very high spatial L. Gueguen, P. Soille, M. Pesaresi are with the Joint Research Center of the European Commission, Ispra, Italy. They are with the Global Security and Crisis Management Unit in the Institute for the Protection and Security of the Citizen. August 13, 2010 DRAFT 2 ground resolution imagery is required to detect small scale changes. In such a crisis management context, rapid mapping of changes is often of high demand, such that two dates dataset is sufficient: one pre event image generally available in archives and the firstly available post event image. Indeed, analyzing image time series (more than 2 images) is irrelevant within this context, since the event time scale is much finer than the images series time resolution. As mentioned in [1], such constraints result in solving the problem of detecting abrupt unmodeled transitions in a temporal series with only two dates. Change detection [2] is the paradigm for solving such a problem. Nevertheless, this methodology has to cope with the following inherent problematics: • very high resolution images have low spatial consistency requiring advanced registration tool; • registration does not resolve the problem of parallax distortion, visible for high structures; • atmospheric conditions may vary, inducing unwanted changes in optical datasets; • multi sensors data may be used with respect to their availabilities. In literature, change detection methods, as developed for middle resolution images, are often based on efficient registration tool followed by a pixel wise temporal analysis [3]. Nevertheless, due to the enunciated problematics, these methods are not directly applicable. In this paper, a new change detection methodology is proposed, which copes with some of the described problematics. While integrating spatial uncertainty of data sets, this method enables to capture non linear dependencies due to atmospheric conditions, and multi sensors acquisitions [4]. Developed in the Information theory framework, the proposed change detection measure discriminates changes from steady states by evaluating the time statistical dependence. The paper is organized as follows. Section II extends the introduction by reviewing change detection background, concluding with a generic workflow. Then, information based change indicators are described in section III, in order to evaluate time statistical dependencies. An automatic change detection process is proposed in section IV and is based on information based change indicators. Simulations are performed to assess the benefits of such indicators with respect to classical approaches in section V. In section VI, the overall change detection methodology is assessed in two real case scenarios; the first one being internally displaced people camp detection, the second one being post-war damaged building detection. The proposed methodology is also August 13, 2010 DRAFT 3 compared to state of the art techniques, showing the efficiency of information based change detection. Finally section VII concludes and gives some perspectives. II. C HANGE D ETECTION BACKGROUND In this section, change detection in remote sensing is briefly reviewed, while crucial points are highlighted. First, Change Vector Analysis is discussed. Then, the use of spatial context modeling in change detection is described. Dealing with Very High Resolution images, the concept of geometrical features extraction is introduced. Finally, a generic change detection workflow is proposed. A. Change Vector Analysis Several unsupervised change-detection methods for images acquired by passive sensors have been proposed in the remote-sensing literature. Among them, a widely used technique is the change vector analysis (CVA) [3], [5]. The CVA technique consists in computing differences between two images measuring the same scene at two distinct instants. As result each image pixel is associated to a single difference value or a multidimensional spectral change vector. Then, the analysis of these spectral change vectors by supervised [6] or unsupervised classification [7] produces a change map. Especially, automatic analysis of differences [2] try to determine classification boundary in the difference space or the spectral change vector space, where generally low density parts are considered as changes. While this analysis simplifies the change detection problem, it limits inherently the extracted information as the difference space is a partial view of the actual multitemporal space. Subsequently, new analysis transforms are proposed to tackle this problem. B. Spatial Context Modeling Performing a pixel based analysis surely discards spatial information, which is inherent to natural images. Therefore, pixel context is generally modeled and used as features for performing change detection. Some approaches use Gibbs Markov Random Field (GMRF) to model spatiotemporal dependencies [8], [9] or spatio-difference dependencies [7], and others adopte an object based representation [10], [11]. While these methods are computationally expensive, simpler context modeling has been proposed in literature [1], by considering independent spatial relations August 13, 2010 DRAFT 4 in both images. An independent Gaussian modeling is proposed, such that all values in the vicinity of the considered pixel are considered independent and identically distributed, Vxi ∼ N (µi , σi ), where x is the pixel and Vxi is a random variable representing neighbors values in the image i. Then, comparing two time separated configurations Vx1 , Vx2 at location x is performed through the calculation of Kullback-Leibler divergences [12] being the change indicator at location x: δx = DKL (Vx1 | Vx2 ) + DKL (Vx2 | Vx1 ), σ 4 + σ24 + (µ1 − µ2 )2 (σ12 + σ22 ) − 1. = 1 2σ12 σ22 (1) (2) Generally the neighborhood is chosen to be isotropic, such that it is defined by a disk. For example, computing the mean µi of the neighborhood of x in the image i can be easily implemented by a low pass filtering of the original image by a disk. A key parameter of this model, is the neighborhood definition, which mainly depends on a scale parameter as mentioned in [1], [13]. The bigger the neighborhood scale is, the more robust the change detection is, although spatial precision decreases also. Indeed, a small scale, defining an unpopulated neighborhood, implies unreliable statistics and weakens the benefit of spatial context. In addition, registration errors have a statistical significance at small scales, and thus may be confused with changes [14]. Therefore enlarging the neighborhood scale disminishes misalignment effects on statistics estimation. On the counter part, a large scale, defining an extended neighborhood, should remain below the typical change scale in order to avoid a statistical shadowing. Such phenomenon easily appears as the probability of changes is generally small compared to the probability of steady states. As a remark, if equal variances are considered, the change indicator δx becomes a CVA indicator because only intensity differences are considered for making decisions. In addition, this change detection analysis may suffer from illumination variations and saturation which often occur in multitemporal satellite images [15], and in the case of multisensor based change detection such indicator becomes requires normalization of both images. C. Geometrical Features Extraction Many change detection methods consider full image information for performing change detection, where color and texture are modeled [16], [17]. However in some cases, it has been demonstrated that focusing on geometry related features enhances the efficiency of change August 13, 2010 DRAFT 5 detection, as the geometrical information content becomes predominant in Very High Resolution Images. In [18], morphological mathematics features are used previously to the change detection in order to detect buildings having lost their roof. In both cases, the extraction focuses on the relevant information in both images, before performing the change detection, thus avoiding falsealarm due to other type of changes. Morphological filtering [19] is a powerfull image analysis tool for extracting geometrical features from images, including satellite imagery [20]. Opening and closing are well known morphological filters depending on a structuring element B. For example, an opening γB by a structuring element B will remove all bright blobs which cannot contain the set B. Area opening [21] is a morphological filter which does not depend on the structuring element shape, and removes from the image all the bright blobs with an area inferior to some area threshold λ expressed in number of pixels. An area opening is defined as the supremum of multiple opening: _ γλ = {γBi | Bi is connected and |Bi | = λ}, (3) i where Bi is a connected structuring element containing λ pixels. Therefore, if all bright structures with an area smaller than λ are of interest, a top-hat by area opening will reveal those: THλ = Id − γλ. (4) On the contrary, a top-hat by area closing extracts all dark structures with an area smaller than some threshold. Top-hat transforms are used in the sequel, in order to extract geometrical features previous to the change detection. D. Change Detection Workflow Combining geometrical features extraction with spatial context modeling enables the definition of an efficient change detection workflow, as depicted in Fig. 1. First, geometrical features are extracted to focus on the relevant information. This transform is application dependent as various features may be of interest. For example, one may be interested in small blobs which change into bigger blobs. In our application case, bright blobs of some areas are the key image features and are extracted through Differential Area Decomposition [22]. Then, spatial context is modeled with a Gaussian distribution of unknown mean. As outputs, images of means are obtained through a low pass filtering of the top-hat images. As hinted previously, the scale parameter is of major August 13, 2010 DRAFT 6 Image Pre Image Post Feature extraction Feature extraction Spatial Context Modeling Spatial Context Modeling Change Indicator Automatic Threshold Thresholding Change map Fig. 1. Change detection workflow. importance for defining the neighborhood size. Again the scale parameter is selected to conform with the change detection problem. The first steps of the change detection workflow are processed separately on both images, producing two data sets. Then, a change indicator is computed at each location between the two means images. Finally, the change indicator can be thresholded to obtain a change image. Either the threshold is selected manually or automatically. Many change indicators have been proposed in literature [1], where most common ones are based on CVA, like δx . Nevertheless, such indicators are unlikely to work when images are formed differently or acquired in different conditions. Linear transform compensation have been proposed to overcome these problems [23]. Unfortunately, these techniques may fail with non linear temporal relations, which appear after non linear features extraction, such as morphological filtering. III. I NFORMATION BASED C HANGE I NDICATOR In this section, an information based change indicator is proposed to model non linear temporal relations. Then, this change indicator is related to the Bayesian test of independance, giving insight into the information based indicator. Finally, extension lead to the definitions of variational and mixed informations based change indicators, which improve detection power. August 13, 2010 DRAFT 7 A. Mutual Information Based Change Indicator As mentioned in [24], a major drawback of the CVA technique is that difference and magnitude operators are not bijective and result in a loss of information with respect to the original multitemporal/spectral feature space. Instead of analyzing change vectors or differences, multitemporal dependances should be considered [11], where pre- and post-values are jointly analyzed. Mutual information was first introduced in [25] in order to analyze jointly two random variables, and to measure their dependances. This metric has been successfully applied for image alignment [26], and more recently for automatic change detection [27], [28], where the change indicator was demonstrated to model non linear dependancies between two images. In the following, we remind the mutual information change detection criterion for two coregistered images j, k. Let x be a pixel in the images grid, and j(x), k(x) be its bitemporal values. Usually values belong to finite sets j(x) ∈ J and k(x) ∈ K due to quantization. Making the strong assumption that the pixels values within an image are independent, only the joint probability pJ,K is modeled to analyze changes. Several non parametric techniques exist to estimate such joint probability [29]. They are usually based on the computation of joint histogram hJ,K defined over the joint space J × K: hJ,K (l, p) = 1 + X 1l=j(x)1p=k(x) , (5) x∈G where G is the shared images grid. Thus, the joint probability pJ,K is estimated by: pJ,K (l, p) = P hJ,K (l, p) . (l,p)∈J ×K hJ,K (l, p) (6) The non parametric estimation of probability density is the key to perform non linear dependence modeling. A direct use of the joint probability would lead to consider changes as rare cooccurences, such that its complement function is a change indicator: τx = 1 − pJ,K (j(x), k(x)). (7) In other words, a change is likely to correspond to low probability density parts as mentioned in [7]. More than a frequency analysis, the mutual information is a measure of statictical link between two random variables. It measures the Kullback-Leibler divergence between the joint probability and the joint distribution where both variables are independent. In this context, low mutual information indicates the lack of statistical link between two variables. Denoting by pJ , August 13, 2010 DRAFT 8 pK , the marginal distributions: pJ (l) = X pK (p) = X pJ,K (l, p), (8) pJ,K (l, p), (9) p∈K l∈J the mutual information I(J; K) is expressed by: X pJ,K (l, p) , I(J; K) = pJ,K (l, p) log pJ (l)pK (p) (10) (l,p)∈J ×K I(J; K) = DKL (pJ,K | pJ × pK ). (11) The mutual information I(J; K) is a global measure and is not directly suited for change detection. Nevertheless, the mutual information is an average of single measurements which were used for performing change detection in [30]. The corresponding information based change indicator is then expressed by: ηx = log pJ,K (j(x), k(x)) . pJ (j(x))pK (k(x)) (12) This information based change indicator measures, with respect to the global behavior encoded into the joint distribution pJ,K , if the two realizations j(x) and k(x) are independent, highlighting a possible change. Apart from being rare, changes should be first characterized by weak statistical links in comparison to the average statistical link measured in the whole dataset. Such considerations are often made for outliers detection [28], where outliers have a atypical behavior in comparison to the overall dataset behavior. By analogy between outliers and changes, the independent realizations become the atypical realization of the statistically linked variables. B. Bayesian Testing of Independence As mentioned in the previous section, independence testing is an adapted tool for detecting changes in between non linearly dependent random variables. In this section, Bayesian testing of independence [31] is reviewed, thus giving insight into the information based change detector efficiency. In the following, we review the independence test theoretical framework. Studying operational characteristics of Lautum information, bounds on detection efficiency are derived in [32]. Given a set of n realizations of the joint random variable (J, K), the independence test identifies if the set follows the joint distribution pJ,K or the independent joint distribution August 13, 2010 DRAFT 9 pJ × pK . Let πi|d , πd|i be the operational characteristics of this independence test, which are defined by: πi|d = p(decide Hi | Hd is true), (13) πd|i = p(decide Hd | Hi is true), (14) where Hi : (J, K) ∼ pJ × pK and Hd : (J, K) ∼ pJ,K are the hypothesis of independence and dependence, respectively. If there was a bijective relation between change detection and independence testing, πi|d would represent the probability of false alarm that change occur, while πd|i would represent the probability of omitting a change. In [32], the independence test is performed by thresholding the following log-ratio defined for n realizations {(jz , kz )}nz=1 of the joint random variable (J, K): p(Hd ) X pJ,K (jz , kz ) = + log . p(Hi ) z=1 pJ (jz )pK (kz ) n l({(jz , kz )}nz=1 ) (15) It was demonstrated [32] that the proposed independence test has the following operational characteristics asymptotic lower bounds: πi|d ≥ e−nL(J;K), (16) πd|i ≥ e−nI(J;K), (17) where I(J; K) is the mutual information expressed in (10) and L(J; K) is the Lautum information: L(J; K) = X pJ (l)pK (p) log (l,p)∈J ×K pJ (l)pK (p) , pJ,K (l, p) L(J; K) = DKL (pJ × pK | pJ,K ). (18) (19) The Mutual and Lautum information control the exponential decay of operational characteristics. In addition, when Mutual and Lautum information are different, operational characteristics become unbalanced. For example in the case where L ≥ I, the probability of omission is likely to be greater than the probability of false alarm, πd|i ≥ πi|d . To gain in confidence about the test, meaning minimizing the operational characteristics with an increasing n, several realizations should be considered. Therefore, instead of considering a August 13, 2010 DRAFT 10 pixel based change detection ηx as expressed in (12), the following information based change indicator is defined: γx = X ηy , (20) y∈N (x) = X y∈N (x) log pJ,K (j(y), k(y)) , pJ (j(y))pK (k(y)) (21) where N(x) is the set of neighbors of x. A fast implementation consists in performing a low pass filtering of the change indicator image obtained through the operator ηx . Such low pass filtering yields a more spatially robust independence indicator. C. Variational and Mixed Information Measures Recent studies have experimentally demonstrated that variational and mixed information measures may enhance change detection efficiency [27], [33]. In this section, such information derived metrics are reminded. First, the variational information [34], measures the complement of the mutual information with respect to the joint entropy and is expressed by: V I(J; K) = H(J; K) − I(J; K) X pJ,K (l, p)2 V I(J; K) = − pJ,K (l, p) log pJ (l)pK (p) (22) (23) (l,p)∈J ×K In other words, it measures the quantity of variational information that exhibit both variables J, K. As mutual information enables to indicate dependencies (21), this metric enables to define a change indicator based on variational information: X pJ,K (j(y), k(y))2 , βx = log pJ (j(y))pK (k(y)) (24) y∈N (x) where negative values highlight changes. This variational based change indicator is evaluated experimentally in section V, as its theoretical operational characteristics are unknown. Experimentally, the variational information based change indicator βx decreases the false alarm probability πd|i , for very small values of πi|d , in comparison to the information based change indicator γx . On the other hand, this last information based change indicator performs better for bigger values of πi|d . Consequently, by benefiting of both metric advantages, some compromise measure should increase the overall operational characteristics. Such a compromise metric is proposed in [27], which is a trade-off between mutual information August 13, 2010 DRAFT 11 and variational information. This compromise metric was named mixed information and is a linear combination of mutual and variational information measures: MIα (J; K) = (1 − α)I(J; K) + α(−V I(J; K)), = I(J; K) − αH(J; K), X pJ,K (l, p)(1+α) = pJ,K (l, p) log , pJ (l)pK (p) (25) (26) (27) (l,p)∈J ×K where 0 ≤ α ≤ 1 trades off between mutual and variational information predominance. Because of this simple formulation, the following relations holds: MI0 (J; K) = I(J; K) and MI1 (J : K) = −V I(J; K). The sign of the measure encodes the predominant measurement, since positive values correspond to mutual information and negative values are linked to variational information. Then, a straightforward change indicator based on the mixed information is defined by: X pJ,K (j(y), k(y))1+α . ωxα = log pJ (j(y))pK (k(y)) (28) y∈N (x) This change detector depends on the parameter α, which will be of interest when dealing with automatic detection. As remarked previously, this last change indicator embraces both mutual and variational change indicators as the following relations hold: γx = ωx0 (29) βx = ωx1 (30) A fine tuning of the trade off parameter α will be discussed in the next section in order to select the compromise change indicator. IV. AUTOMATIC T HRESHOLD C OMPUTATION Automatic threshold estimation based on Expectation Maximization algorithm has been proposed for CVA techniques [7]. However, such framework is not suited for the proposed case. When dealing with information derived change indicators, defining an automatic threshold can be done by analyzing some extremal limits of such indicators [31]. In this section, we propose an analytical formulation of automatic threshold which fastens the automatic process in comparison to Expectation Maximization based techniques. First, when dealing with mutual information based change indicator, as the number of realizations August 13, 2010 DRAFT 12 grows, the mutual information based indicator γx tends to two limits depending if the realizations are dependent or independent: γx → I(J; K), if (J; K) ∼ pJ,K , |N (x)|→∞ |N(x)| γx → −L(J; K), if (J; K) ∼ pJ pK , lim |N (x)|→∞ |N(x)| lim (31) (32) where |N(x)| is the size of the neighborhood of x. As a remark, the distance between both limits is given by I(J; K) + L(J; K) and is linked to the independence test efficiency, since a great distance highlights an easier delineation between both hypothesis. In the case where the realizations are generated from the mixture of dependent and independent distributions in equal proportions, the information based change indicator has the following limit: γx I(J; K) − L(J; K) → |N (x)|→∞ |N(x)| 2 lim (33) In this specific case, where dependent and independent realizations are in equal proportions, the test becomes undecidable. Therefore, an automatic threshold of the indicator γx is given by this mixed limit (I(J; K) − L(J; K))/2, since being apart enables to decide. In the more general case of the mixed information change indicator ωxα , as the neighborhood size grows the indicator as two limits: ωxα → I(J; K) − αH(J; K), |N (x)|→∞ |N(x)| if (J; K) ∼ pJ,K . α ωx lim → −L(J; K) − α(H(J) + H(K)), |N (x)|→∞ |N(x)| if (J; K) ∼ pJ pK . lim (34) (35) This results in a mixed limit with equal proportions expressed by: ωxα (1 − α)I(J; K) − L(J; K) − 2αH(J; K) → |N (x)|→∞ |N(x)| 2 lim (36) Thus, this last limit depending on α can be considered as an automatic threshold for the corresponding change indicator ωxα . |N (x)| It should be mentioned that the analytical formulation, makes the automatic threshold computation an easy task in real scenarios. As a remark, the difference between both extremal limits, emphasizing in some way the test power, increases linearly with α. Thus, for α = 1, the difference is maximal and should improves August 13, 2010 DRAFT 13 −2.5 250 Mutual Information Variational Information Mixed Information −3 −3.5 200 −4 log(πd|i ) 150 K (J, K) ∼ pJ,K (J, K) ∼ pJ pK 100 −4.5 −5 −5.5 −6 −6.5 50 −7 0 0 a) Fig. 2. 50 100 150 J 200 −7.5 −10 250 −8 b) −6 −4 −2 0 log(πi|d ) a) Simulated bivariate data highlighting non linear dependence, green crosses. b) logarithm of ROC curves obtained with three different change indicators γx = ωx0 , βx = ωx1 and ωxα̂ applied to the synthesized dataset. the test. However, we observe that in practice the mixed information based indicator gives better results. V. S IMULATION In this section, simulations are performed to assess the efficiency of information based change detectors and the relevance of automatic threshold determination. We simulate data which highlight non linear dependences, such as an exponential link with a saturation phenomenon. The data is synthesized such that 5% of the realizations follow the independent distribution pJ pK . These 5% of independent realizations are collocated, such that a neighborhood base indicator becomes relevant. In this test case, the synthesized data are composed of 10000 dependent realizations, followed by 500 independent realizations, displayed in Fig.2.a. Then, the joint probability is estimated by histogram computation (6) without excluding independent realizations. Using the proposed information based methodology, three change indicators with a neighborhood of width 81 are assessed on this data set in terms of receiver operational characteristics (ROC) πi|d and πd|i . The three change indicators are γx = ωx0 , βx = ωx1 and ωxα̂ , where α̂ = I(J; K)/H(J; K), and the corresponding ROC curves are displayed in Fig.2.b. A straightforward ROC analysis of the mutual and variational information based change indicators γx , βx underlines their complementarity. The first change indicator γx is more efficient than August 13, 2010 DRAFT 14 −3 2.4 x 10 2.2 2 α̂ = I(J ; K)/H (J ; K) AUC 1.8 1.6 1.4 1.2 1 0.8 Fig. 3. 0 0.2 0.4 α 0.6 0.8 1 The efficiencies of the series of ωxα are assessed for 0 ≤ α ≤ 1. We observe that α̂ = I(J; K)/H(J; K) is a good indicator of the most efficient test measurement in terms Area Under the Curve. βx for high πi|d . On the contrary, βx is a better test for small πi|d . As hinted before, a mixed information base indicator may benefits form both behaviors, which is successfully realized in our test case for the particular compromise indicator ωxα̂, where α̂ = I(J; K)/H(J; K). In fact, α̂ is derived from the following equality MIα = 0, which heuristically corresponds to the switch from mutual predominance to variational predominance in the output metric. In Fig.2.b, we observe that the proposed mixed information based change indicator ωxα̂ performs as well as both others over the all possible domain. In this simulation we have I = 1.18, H = 9.15 and L = 1.58. As noted before, because I < L, the inequality πd|i > πi|d is likely to happen . This phenomenon probably induces the unbalanced operational characteristics observable in Fig.2.b. To complete the α impact analysis, the test indicators ωxα are assessed for varying α. Their efficiencies are measured by the Area Under the Curve AUC. This measure is a good indicator for comparing the overall ROC curves. As displayed in Fig.3, selecting α̂ = I(J; K)/H(J; K) gives a good estimate of the best compromise indicator, as it reaches almost the efficiency minimum. Now evaluating automatic threshold computation as expressed in (36), the following analysis is performed. Let p(Hd ) and p(Hi ) be the prior probabilities of dependence and independence, respectively, such the total probability of errors is expressed by π = p(Hd )πi|d + p(Hi )πd|i . Given an information based indicator, thresholding it by t gives a probability of errors denoted by π(t). August 13, 2010 DRAFT 15 2 1.9 πα (t̂)/ min t πα (t) 1.8 1.7 1.6 1.5 1.4 1.3 1.2 1.1 0 Fig. 4. 0.2 0.4 α 0.6 0.8 1 Assessment of automatic threshold determination by comparison of total errors probabilities. Thus, the optimal threshold t̄ corresponding to this indicator provides the minimum probability of errors, t̄ = arg mint π(t). Let π α (t) be the total probability of errors which is associated to the mixed information based indicator ωxα |N (x)| and thresholded at t. Let t̂ be the corresponding automatic threshold given by (36): t̂ = (1 − α)I(J; K) − L(J; K) − 2αH(J; K) . 2 (37) In this context, the automatic threshold process is assessed by comparing the minimum error probability to the automatic threshold probability: π α (t̂) π α (t̂) = . π α (t̄) mint π α (t) (38) Values close to one indicate a good automatic threshold determination. This last quantity is evaluated for varying α and is displayed in Fig.4. In this case, we observe that values are between 1 and 2 for any α, meaning that the automatic threshold determination is efficient for providing the minimum probability of errors, in an analytical way. VI. A PPLICATIONS In this section, two applications of the information based change detection are described. The first application deals with the detection of Internally Displaced People camps in Haiti after 2010 earthquake, exploiting very high resolution panchromatic pre- and post-event images. The second application pertains to the detection of damaged houses in Koidu, Sierra Leone, after an August 13, 2010 DRAFT 16 armed conflict. In both cases, the information based changed indicators are evaluated thanks to ground truth collected by visual interpretation. A. Internally Displaced People Detection in Haiti The 12th January 2010, an earthquake ravaged the island of Haiti, forcing the creation of refugee camps. Thanks to a pre-event image acquired in 2009 by Quickbird satellite and apost event image acquired the 21th January 2010 by aerial sensor, a change detection was made possible to detect Internally Displaced People camps. The post panchromatic image of resolution 15cm has been degraded to the resolution of the pre image which is 60cm. Then, both images, of size 30000 × 35000 pixels, have been coregistered by using B-spline deformation model [35] which is implemented in the Orfeo Toolbox implementation [36]. These images were then used as entries of the proposed change detection workflow. The workflow parameters have been chosen to detect IDP camps, which are composed of tents. In the images, tents are bright structures of area inferior to 90m2 = 250pix. Through a tophat by area opening, such structures are extracted in both images, as presented in Fig.5. Then, both images have been filtered by a disk of radius 15m = 30pix corresponding to an area of 700m2 , because IDP camps are expected to contain several tents producing changes at this scale. This last step corresponds to the spatial context modeling. Finally, information based change indicators are computed with a disk neighborhood of radius 15m = 30pix. Thanks to a visual interpretation of both images, IDP camps were vectorized as polygons delineating the borders of the camps. Considering these vectors as ground truth, several change indicators were assessed by computing their ROC curves. The results are displayed in Fig. 6. In our case, the mutual information is smaller than the Lautum information, resulting in a greater omission rate than the false-alarm rate as discussed in section III-B. We observe that the mutual information based change indicator perform better than variational information based change indicator δx for a wide range of false alarm rates. However, their complementarity remains noticeable. The mixed information based change indicator benefiting from both advantages gives the best operational characteristics. The CVA like change indicator is also assessed and compared to the information based change indicators. The results obviously highlights the better efficiency of information based indicators. The automatic threshold determination is tested only for the mixed information based change inAugust 13, 2010 DRAFT 17 Fig. 5. Pre and post-images of Haiti are processed, where an IDP camp was created. Both images are transformed to extract bright structures before being spatially averaged. Finally, the Mutual Information base change indicator is obtained highlighting the IDP camp. 1 δx ωxα̂ ωx0 = ιx ωx1 = βx 0.9 0.8 omission rate 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 false alarm rate Fig. 6. ROC curves obtained for four changes indicators δx , ωxα̂ , ωx0 = γx and ωx0 = βx . False alarm and omission rates are estimated by comparing change image to ground truth obtained represented by polygons and obtained by visual interpretation. August 13, 2010 DRAFT 18 0.025 probability of errors π(t) 0.024 automatic threshold t̂ 0.023 0.022 0.021 0.02 0.019 1.2 1.4 1.6 1.8 2 threshold t Fig. 7. The total probability of errors π α̂ (t) is computed for varying threshold values t. In addition, the optimal threshold is displayed. The automatic threshold attains an optimum of the total error curve. This curve corresponds to the Haiti case. dicator ωxα̂ . Approximatively, the prior probability of change is 2%, such that the total probability of errors π α̂ (t) becomes computable. A part of this function is displayed in Fig.7, with a reminder of the automatic threshold. The automatic threshold attains the optimum of the error probability curve. This example underlines the relevance of this automatic threshold determination in real scenario for minimizing the global probability of errors, while remaining analytically computable. As a remark, such computation does not require any knowledge of the a priori probabilities. Finally, as an indicator of the overall computational complexity, the change detection process took around 15 minutes with an Intel 3.33GHz dual core processor. Since the procedure has a linear complexity with the number of pixels, this makes it possible to integrate such process for emergency response. B. Detection of Rebuilt Buildings In between 2002 and 2005, the city of Koidu was rebuilt following a conflict. Two Ikonos panchromatic images of resolution 1m, were used to assess the changes. Such dataset was processed previously [18] to detect rebuilt houses. Destroyed houses do not have any roof highlighting small bright and dark blobs. Through the intersection of two top-hats by area opening and closing , such structures are extracted from the first image before being low pass filtered. On the contrary the post images is directly low pass filtered with an area opening, highlighting buildings and specifically reconstructed buildings. Then, the mixed information paradigm is applied by computing the change indicator between the small structures image and the large August 13, 2010 DRAFT 19 a) Fig. 8. b) c) A subset of size 400 × 400 pixels of the Koidu test case is displayed. a) Pre image is displayed, containing destroyed buildings. b) Post image is presented containing rebuilt buildings. c) The union of yellow and brown parts represents the ground truth mask, which are destroyed buldings which have been rebuilt. The union of brown and blue is the automatic detection based on Mixed Information. Its ROC curve is further analyzed in following figures. 1 low pass scale : 15 pix low pass scale : 20 pix low pass scale : 40 pix 0.9 0.8 omission rate 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.2 0.4 0.6 0.8 1 false alarm rate Fig. 9. ROC curves obtained for the mixed information based change indicator computed in the Koidu case. The effects of low pass filter scale on ROC is assessed. bright areas image. This detector is then assessed in terms of ROC computed from a ground truth mask displayed in Fig.8. As both images have non-linear dependences, CVA paradigm becomes irrelevant and is not assessed in this case. The change indicator is computed for the whole image which is of size 10000 × 10000 pixels, where the previous figure is a subset. Then, the change indicator is analyzed on this restricted set. The corresponding ROC curves are given in Fig.9, and underlines the change scale effect. We observe that the neighborhood size is of major importance for optimal efficiency of the detectors. Indeed an optimal neighborhood scale exists, which should be determined previously depending on the application. August 13, 2010 DRAFT 20 VII. C ONCLUSION In this paper, new information based change detection scheme has been proposed. Dealing with very high resolution images, a structure based information extraction is first applied to pre and post images, highlighting information of interest. Modeling spatial context enables to overcome spatial inconsistency problems, and to fit the scale of changes. This process was efficiently implemented by a low pass filtering. These steps being generic to any change detection work flow for processing of very high resolution images, new information based change indicators have been proposed, where changes are independent temporal realizations. This approach was used the capture non linear temporal dependencies inherent to very high resolution change detection. Such indicators where discussed and presented in two intricate frameworks being information and detection theories. Then, an automatic threshold analytical calculation was presented in order to perform fully automatic change detection. With simulated and real data sets, proposed and CVA change indicators were assessed and compared in terms of receiver operational characteristics. It is worth to note that in all cases information based change indicators were more efficient than CVA one, which are incapable of modeling non linear dependencies. In addition, automatic thresholding procedure was validated by the computation of total probability of errors, giving quasi optimal results. Finally, by applying the change detection workflow to real case scenarios (IDP camp detection in Haiti and Rebuilt buildings detection in Koidu), it was proven that such methodology was applicable for post crisis situation assessment, since: • it performs better than state-of-the-art change detection techniques, • it can be fully automatic, • it can be run on very big images. As remaining questions about the use of mixed information change indicator, further studies should focus on the computation of the optimal parameter α, which makes the balances between variational and mutual information measures. Concerning the context modeling, the impact of change scale needs to be better understood to make it automatically selected. As further studies, this proposed indicator should be compared with unsupervised or supervised classification in terms of accuracy and complexity, since the proposed is very low computational. August 13, 2010 DRAFT 21 R EFERENCES [1] J. Inglada and G. Mercier, “A new statistical similarity measure for change detection in multitemporal SAR images and its extension to multiscale change analysis,” IEEE Tran. Geoscience Remote Sensing, vol. 45, no. 5, pp. 1432–1445, May 2007. [Online]. Available: http://public.enst-bretagne.fr/ mercierg/articles/2007/ingladaTGRS07.pdf [2] A. Singh, “Digital change detection techniques using remotely-sensed data,” Int. J. Remote Sensing, vol. 10, no. 6, pp. 989–1003, 1989. [3] L. Bruzzone and S. Serpico, “An iterative technique for the detection of land-cover transitions in multitemporal remotesensing images,” IEEE Tran. Geoscience and Remote Sensing, vol. 35, no. 4, pp. 858–867, Jul. 1997. [4] L. Gueguen, S. Cui, G. Schwarz, and M. Datcu, “Multitemporal analysis of multisensor data: Information theoretical approaches,” in IGARSS, Honolulu, USA, July 2010. [5] F. Bovolo and L. Bruzzone, “A theoretical framework for unsupervised change detection based on change vector analysis in the polar domain,” IEEE Tran. Geoscience and Remote Sensing, vol. 45, no. 1, pp. 218–236, Jan. 2007. [6] L. Bruzzone, D. Prieto, and S. Serpico, “A neural-statistical approach to multitemporal and multisource remote-sensing image classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 3, pp. 1350–1359, May 1999. [7] L. Bruzzone and D. Prieto, “Automatic analysis of the difference image for unsupervised change detection,” IEEE Tran. Geoscience and Remote Sensing, vol. 38, no. 3, pp. 1171–1182, May 2000. [8] M. Schröder, H. Rehrauer, K. Seidel, and M. Datcu, “Spatial information retrieval from remote-sensing images. ii. GibbsMarkov random fileds,” IEEE Transactions on Geoscience and Remote Sensing, vol. 36, no. 5, pp. 1446–1455, Sept. 1998. [9] L. Gueguen and M. Datcu, “Image time-series data mining based on the information-bottleneck principle,” IEEE Tran. Geoscience and Remote Sensing, vol. 45, no. 4, pp. 827–838, April 2007. [10] C. Le Men, A. Julea, N. Meger, M. Datcu, B. Bolon, and H. Maı̂tre, “Radiometric evolution classification in high resolution satellite image time series,” in ESA-EUSC on Image Information Mining: pursuing automation of geospatial intelligence for environment and security, Frascati, Italy, March 2008. [11] P. Heas and M. Datcu, “Modeling trajectory of dynamic clusters in image time-series for spatio-temporal reasoning,” IEEE Trans. Geoscience and Remote Sensing, vol. 43, no. 7, pp. 1635–1647, July 2005. [12] S. Kullback and R. Leibler, “On information and sufficiency,” Annals of Mathematicals Statistics, vol. 22, no. 1, pp. 79–86, 1951. [13] L. Carvalho, L. Fonseca, F. Murtagh, and J. Clevers, “Digital change detection with the aid of multiresolution wavelet analysis,” Int. J. Remote Sensing, vol. 2001, no. 18, pp. 3871–3876, Dec. 22. [14] P. Gong, E. Ledrew, and J. Miller, “Registration-noise reduction in difference images for change detection,” International Journal of Remote Sensing, vol. 13, no. 4, pp. 773–779, March 1992. [15] J. Inglada, “Similarity measures for multisensor remote sensing images,” in IEEE International Geoscience and Remote Sensing Symposium, vol. 1, June 2002, pp. 104–106. [16] L. Gueguen and M. Datcu, “A similarity metric for retrieval of compressed objects: Application for mining satellite image time series,” IEEE Tran. Knowledge and Data Engineering, vol. 20, no. 4, pp. 562–575, April 2008. [17] ——, “Spatio-temporal structures characterization based on multi-information bottleneck,” in ESA-EUSC 2006: Image Information Mining for Security and Intelligence, Madrid, Spain, Nov. 2006. [18] E. Pagot and M. Pesaresi, “Systematic study of the urban postconflict change classification performance using spectral August 13, 2010 DRAFT 22 and structural features in a support vector machine,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 1, no. 2, pp. 120–128, Jun. 2008. [19] P. Soille, Morphological image analysis. Springer-Verlag, 2003. [20] P. Soille and M. Pesaresi, “Advances in mathematical morphology applied to geoscience and remote sensing,” vol. 40, no. 9, pp. 2042–2055, 2002. [21] L. Vincent, “Grayscale area openings and closings: their applications and efficient implementation,” in Proc. EURASIP Workshop on Mathematical Morphology and its Applications to Signal Processing, Barcelona, Spain, May 1993. [22] G. Ouzounis and P. Soille, “Differential area profiles,” in Int. Conf. Pattern Recognition, Istanbul, Turkey, Aug. 2010. [23] J. Morisette and S. Khorram, “An introduction to using generalized linear models to enhance satellite-based change detection,” in IEEE IGARSS ’97, vol. 4, Aug. 1997, pp. 1769–1771. [24] F. Bovolo, L. Bruzzone, and M. Marconcini, “A novel approach to unsupervised change detection based on semisupervised SVM and similarity measure,” IEEE Tran. Geoscience and Remote Sensing, vol. 46, no. 7, pp. 2070–, July 2008. [25] C. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 4, pp. 379–423, July 1948. [26] D. Sarrut and S. Miguet, “Similarity measures for image registration,” in European Workshop on Content-Based Multimedia Indexing, Toulouse, France, Oct. 1999. [27] L. Gueguen and M. Datcu, “Mixed information measure: Application to change detection in earth observation,” in MultiTemp, Connecticut, USA, July 2009. [28] J. Theiler and S. Perkins, “Proposed framework for anomalous change detection,” in ICML Workshop on Machine Learning Algorithms for Surveillance and Event Detection, 2006, pp. 7–14. [29] E. Wegman, “Nonparametric probability density estimation,” Journal of Statistical Computation and Simulation, vol. 1, no. 3, pp. 225–245, July 1972. [30] A. Winter, H. Maı̂tre, N. Cambou, and E. Legrand, “Entropy and multiscale analysis: a new feature extraction algorithm for aerial images,” in IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, April 1997, pp. 2765–2768. [31] S. Verdu, Multiuser Detection, S. Verdu, Ed. Cambridge University Press, 1998. [32] D. Palomar and S. Verdu, “Lautum information,” IEEE Trans. Information Theory, vol. 54, no. 3, pp. 964–975, March 2008. [33] L. Gueguen, S. Cui, G. Schwarz, and M. Datcu, “Multitemporal analysis of multisensor data: Information theoretical approaches,” in IEEE Int. Geoscience and Remote Sensing Symposium, Honolulu, USA, July 2010. [34] M. Meila, “Comparing clustering by the variation of information,” in Proc. 16th Annual Conf. Comp. Learning Theory, 2003. [35] D. Rueckert, L. Sonoda, C. Hayes, D. Hill, M. Leach, and D. Hawkes, “Nonrigid registration using free-form deformations: Application to breast MR images,” IEEE Trans. Medical Imaging, vol. 18, no. 8, pp. 712–721, 1999. [36] The Orfeo tool box software guide. [Online]. Available: http://www.orfeo-toolbox.org August 13, 2010 DRAFT