Computational Uses of Colour

Brian Funt and Vlad Cardei

Appears in Color Perception: Philosophical, Psychological, Artistic and Computational Perspectives, Vancouver Studies in Cognitive Science Vol. 9, ed. S. Davis, Oxford University Press, 2000. Copyright OUP.

Abstract

Why do we have colour? What use is it to us? Some of the obvious answers are that we see colour so that we can recognise objects, spot objects more quickly, and tell when fruit is ripe or rotten. These reasons make a lot of sense, but are there others? In this paper, we explore the things that colour makes easier for computational vision systems. In particular, we examine the role of colour in understanding specularities, processing interreflections, distinguishing metals from plastics and wet surfaces from dry ones, choosing foveation points, disambiguating stereo matches, discriminating textures and identifying objects. Of course, what is easier for a computational vision system is not necessarily the same for the human visual system, but it can perhaps help us form some hypotheses about the role of colour in human perception. We also consider the role of colour constancy, in terms of whether or not it is required for colour to be useful to a computer vision system.

Introduction

Often in computer vision, colour has been treated as an unnecessary complication of an already complex problem. Processing colour requires more data storage and, on the face of it, more computation; however, one of our premises in this paper is that the extra information provided by colour may simplify the processing rather than complicate it. To see the extent to which this might be true, we examine how colour has been applied in solving a variety of computer vision problems. Much of computer vision can be viewed as an effort to extract information about a scene automatically from a digitized image of it. What are the ways in which colour can help in this process?

In general, colour is a perception that people have and as such is quite difficult to quantify. In this paper, however, we will be talking about colour in the context of computer vision and will use 'colour' to refer to the more restricted notion of simply the digitized 3-band signal (R, G and B) obtained from a typical colour video camera. Each pixel location in the camera's image provides 3 measurements of portions of the spectrum of the light impinging on the camera's sensor at that point. We can describe the camera's response to a given spectrum of light in terms of three sensitivity functions, much as there are 3 cone types in the human visual system, each with its own sensitivity as a function of wavelength. In the case of the camera, however, the sensitivity functions are not necessarily the same as those of the human cones, and in fact they generally differ significantly from them. Furthermore, different colour video camera models tend to generate different RGB signals in response to the same input, so even in the case of computer vision the definition of colour is unfortunately underdetermined.

The light entering a camera depends on both the surface from which the light is reflected and the light which illuminates that surface. If the illuminating light changes, so will the light entering the camera, and as a result so will the RGB values in the colour image the camera records.
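To make this concrete, here is a minimal Python sketch of image formation for a hypothetical camera: each sensor response is the integral of illuminant, reflectance and sensor sensitivity over wavelength, approximated by a sum. The Gaussian sensitivity curves and the two illuminant spectra are invented purely for illustration.

```python
import numpy as np

# Wavelength samples across the visible spectrum (nm).
wavelengths = np.arange(400, 701, 10)

# Invented Gaussian sensitivity functions for the camera's R, G and B
# sensors; real camera curves differ, as noted above.
def gaussian(mu, sigma):
    return np.exp(-0.5 * ((wavelengths - mu) / sigma) ** 2)

sensors = np.stack([gaussian(600, 40),   # R
                    gaussian(540, 40),   # G
                    gaussian(460, 40)])  # B

def camera_response(illuminant, reflectance):
    """RGB_k = integral of E(lambda) S(lambda) R_k(lambda) d(lambda),
    approximated by a sum over the sampled wavelengths."""
    return sensors @ (illuminant * reflectance)

# The same white surface under two invented illuminant spectra:
white = np.ones(len(wavelengths))
tungsten = np.linspace(0.4, 1.0, len(wavelengths))   # reddish light
daylight = np.linspace(1.0, 0.6, len(wavelengths))   # bluish light
print(camera_response(tungsten, white))
print(camera_response(daylight, white))
```

Running the sketch, the same white surface yields noticeably different RGB triples under the two lights.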
The same is of course true for light entering the eye; however, as experiments by Brainard and colleagues (Brainard, Brunt and Speigle 1997) have shown, human subjects can adjust for changes in the incident illumination quite well. This ability to account for changes in illumination and see colours as approximately the same under different lighting conditions is generally known as colour constancy. It would seem helpful for a computer vision system also to have colour constancy, and we will discuss one approach to computational colour constancy below, but it turns out that many of the computational uses for colour do not in fact require colour constancy for their operation.

An important tool in computational colour analysis is the colour histogram. A colour histogram of an image provides a count, for each colour (RGB), of the number of pixels in the image which have that colour. Since image pixels are all the same size, the histogram also describes, for each colour, the total image area covered with that colour. Often only binary-valued (i.e. 0 or 1) histograms are required, in which case the histogram indicates only whether or not a particular colour is present in the scene, not the amount of the colour present. Histograms discard all spatial information about the image: we cannot tell from an image's histogram where the colours were located in the image, only that they were there. Although often the RGB values are histogrammed directly, histograms are also commonly made using the values obtained in other colour spaces derived from the original, such as a chromaticity space. It must be noted that, since the sensors are camera dependent, all resulting RGB values will also be camera dependent, so the same scene viewed with two different cameras will most probably generate two different images (i.e. RGB sets) and similarly two different colour histograms.

As the illumination changes, the colours in the image will also change, and so will the histogram of the image. This colour shift poses a problem concerning the stability of the colours being used. If the colours change as the illumination changes, how can colour be used for object recognition? Clearly, if the colours change, the colour histograms will change as well. How can computational vision systems compensate for the change? One possibility is to imitate human colour constancy in some way and thereby provide an approximately illumination-independent description of colours; however, as we will show, it is not always necessary to compute colour-constant descriptors in order to discount the effects of changes in illumination.
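As a simple illustration of the colour histogram described above, the following Python sketch quantises each channel into a few bins and counts pixels per colour bin; the binary variant merely records presence or absence. The choice of eight bins per channel is arbitrary, for illustration only.

```python
import numpy as np

def colour_histogram(image, bins=8, binary=False):
    """Count pixels per quantised (R, G, B) bin.  `image` is an
    H x W x 3 array of 8-bit RGB values.  All spatial layout is
    discarded: only which colours occur, and how often, survives."""
    q = (image.astype(int) * bins) // 256               # 0..bins-1 per channel
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins ** 3)
    return (hist > 0).astype(int) if binary else hist

# Example on a random test image:
image = np.random.default_rng(0).integers(0, 256, size=(64, 64, 3))
print(colour_histogram(image).sum())    # 4096, the number of pixels
```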
Colour-Based Object Recognition

Swain and Ballard (Swain and Ballard 1991) showed that colour can be very useful for object recognition. They designed a technique called 'colour indexing' that compares the colour histogram of a given object to a list of histograms of objects previously stored in a database. This method is based solely on how well the histogram of the test object matches the histograms in the database. The method works surprisingly well even though it completely ignores all spatial information (i.e. regarding object shape or geometry). As a consequence, the algorithm is not affected by any rotation, translation or scaling of the test object. However, even slight changes in the scene's illumination, in intensity or chromaticity, adversely affect the algorithm's performance, as can be seen in Figure 1. When the illumination changes, all colours in the image change, so the new colour histogram of the object no longer matches its histogram stored in the database. To overcome this problem, Swain proposed adding colour constancy as a preprocessing step to compensate for changes in the illuminant.

Instead of pre-processing for colour constancy, Funt and Finlayson (Funt and Finlayson 1995) proposed using relative colour instead of absolute colour as a way of circumventing the colour constancy problem. Their 'colour constant colour indexing' method cancels out the effects of the illumination by histogramming the ratios of colours at neighbouring pixels rather than the colours themselves. So long as the illumination varies smoothly, if at all, throughout the image, the ratios will be approximately independent of the illumination no matter what it is. This assumption is reasonable since in practice abrupt changes in the incident illumination rarely occur.

How does comparing neighbouring pixels eliminate the illumination effects? For many camera sensors, the change from one illumination to another can be approximated as scaling each of the R, G and B bands by a different factor. For example, moving from indoor tungsten bulb lighting to the much bluer daylight will generally mean scaling the B image band by some factor α, from B to αB. Consider two neighbouring pixels, of the same or different colour, and just the B band for simplicity: if they have components B1 and B2 under daylight, they will have components αB1 and αB2 indoors. The ratio of the two pixels therefore remains constant:

\frac{B_1}{B_2} = \frac{\alpha B_1}{\alpha B_2}

Modelling illumination's effects by a scaling such as α, however, holds only approximately, as can be seen from the fact that E(λ) cannot be correctly divided out inside the integrals:

\frac{\int E(\lambda) S_1(\lambda) R_B(\lambda)\, d\lambda}{\int E(\lambda) S_2(\lambda) R_B(\lambda)\, d\lambda} \cong \frac{\int S_1(\lambda) R_B(\lambda)\, d\lambda}{\int S_2(\lambda) R_B(\lambda)\, d\lambda}    (1)

where S1 and S2 are the surface reflectances corresponding to the two neighbouring pixels, RB is the spectral sensitivity of the B channel and E is the illumination. This approximation holds quite well for 'sharp' sensors (Finlayson, Drew and Funt 1994b), which are found in many cameras.

The colour indexing and colour constant colour indexing methods show that colour can be very useful for object recognition in a computer vision system. Both methods result in very high, and very comparable, recognition rates. So while colour information is an important object recognition cue, it is not necessary to have the absolute, or true, colour, but rather only a measure of the relative colours of neighbouring locations.

Figure 1. Left: object to be recognised. Centre: object found by simple colour indexing when illumination effects are not taken into account. Right: object found by the colour constant colour indexing method, based on the illumination independence of colour ratios.
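The invariance of the ratios under the diagonal model is easy to verify numerically. The following sketch, with an invented scene and invented per-channel scale factors, confirms that the neighbouring-pixel ratios, and hence any histogram built from them, are unchanged by the illumination change:

```python
import numpy as np

def channel_ratios(image):
    """Per-channel ratios between horizontally neighbouring pixels.
    (Real code must guard against zero-valued pixels.)"""
    return image[:, 1:, :] / image[:, :-1, :]

rng = np.random.default_rng(0)
scene = rng.uniform(0.1, 1.0, size=(32, 32, 3))   # image under daylight

# Model the illumination change as scaling R, G and B by different
# factors, as in the diagonal model above.
indoor = scene * np.array([1.4, 1.0, 0.6])

# The ratios, and hence any histogram built from them, are unchanged.
print(np.allclose(channel_ratios(scene), channel_ratios(indoor)))   # True
```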
Interpreting Specularities

Shiny, glossy surfaces are very common, so it is important for a computer vision system to be able to recognise glossy regions and interpret them correctly; otherwise, algorithms such as those that extract shape from shading will be misled. The shiny, mirror-like component of surface reflection is known as the specular component, and the bright highlights are called specularities. Shafer (Shafer 1985) pointed out that the specular component creates colour effects that can be used in understanding images of glossy surfaces.

A lot of work in computer vision has assumed that the surfaces seen in an image are primarily matte and can be modelled by the Lambertian reflectance model, which states that surfaces appear equally bright from all viewing angles. While this is a useful assumption from a theoretical standpoint, truly matte surfaces are rare, and many not-very-shiny surfaces, like a sheet of paper, reflect a significant specular component. Unlike the matte component, the amount of specularity seen on the surface depends on the angle from which it is viewed and the relative position of the illuminating light. What is interesting from the standpoint of colour analysis of specular reflection is that the specular and matte components generally differ in colour. This is captured in Shafer's dichromatic model of reflection (Shafer 1985).

Materials like plastics, but not metals, are inhomogeneous in that they are composed of a medium and a colorant which scatters the light. In such materials, incident light first interacts with the interface between the air and the material, and some of the light is reflected by the interface without penetrating the material further. This part of the light reflection is called interface reflection. Interface reflection is often seen as a highlight or specularity, and so it is also called specular reflection. The relative amount of light reflected at the interface, as well as the reflection angle, is predicted by Fresnel's laws. Since the index of refraction is relatively constant over the visible spectrum, the interface reflection is assumed to be constant with respect to wavelength and consequently to have the same spectral composition as the incident light. Hence, the part of the light reflected by the interface will have the same colour as the illuminating light.

The remaining part of the incident light, which was not reflected at the interface, penetrates into the medium, is scattered by the colorant particles, and is eventually either absorbed, transmitted (if the material is not opaque) or re-emitted back through the interface. The portion of the light re-emitted from the body of the surface is called the body reflection component. Generally the body reflection component will differ in colour from the illuminating light because of the selective absorption of the colorant.

The dichromatic model of reflection states that the total radiance of reflected light is equal to the sum of two independent components: the radiance of the interface reflection and the radiance of the body reflection. The relative amounts of the interface and body reflection components will vary with the angle at which the incident illumination hits the surface and the direction from which the surface is viewed.
If the colours of the interface and body components are given by RiGiBi and RbGbBb respectively, then the colour seen by the camera will be

RGB = m_i\, R_i G_i B_i + m_b\, R_b G_b B_b,    (2)

where mi and mb are the respective magnitudes of the two components. Suppose we now consider an R-versus-G-versus-B plot of the colours found in an image containing only a single shiny surface. So long as the surface is not completely flat, then as the surface curves, the position of each small patch of surface will vary with respect to the light and camera, so that the relative magnitudes of the interface and body components change. The dichromatic model predicts that all the colours will fall on a single plane in the R-versus-G-versus-B plot and that they will in fact all be contained within a parallelogram defined by the RiGiBi and RbGbBb colours. The magnitude coefficients mi and mb determine the position inside the parallelogram.

Specular reflection thus results in well-defined distributions of image colours, so colour can help in the analysis of images of shiny surfaces by applying the dichromatic model. By fitting a plane and then a parallelogram to the distribution of plotted RGBs, the colours RiGiBi and RbGbBb of the interface and body components can be extracted. They are useful since the interface component, for example, tells us about the colour of the incident illumination. The components of the original image can also be separated to generate an image of the object as if it were completely matte and another as if it were completely mirror-like, both of which are useful in determining the shape of the object.
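A small numerical sketch of the dichromatic prediction: mixing two invented component colours with random magnitudes, as in Equation (2), produces RGBs whose third singular value is numerically zero, confirming that they span only a plane through the origin. Fitting a parallelogram within that plane would then recover the two component colours.

```python
import numpy as np

rng = np.random.default_rng(1)
interface = np.array([0.9, 0.9, 0.8])   # illuminant colour (invented)
body = np.array([0.7, 0.2, 0.1])        # body colour of a red plastic (invented)

# Each surface patch mixes the two components in different amounts,
# as in Equation (2): RGB = m_i * interface + m_b * body.
m = rng.uniform(0.0, 1.0, size=(500, 2))
colours = np.outer(m[:, 0], interface) + np.outer(m[:, 1], body)

# The colours span a 2-D subspace of RGB space, so the third
# singular value is zero up to rounding error.
print(np.linalg.svd(colours, compute_uv=False))
```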
Interpreting Interreflections

Light interreflected between surfaces often accounts for a significant amount of the illumination impinging on them. The amount can easily be as high as 15%, and yet it seems that people rarely notice interreflections even though they can introduce significant colour shifts. Figure 2 shows quite an extreme example of interreflection, where significant components of red and green occur on the white cup; nonetheless, the cup still appears basically white.

Figure 2. Example of interreflection.

In terms of a computer vision system, colour can be used as a clue in analysing images containing interreflections. Correctly interpreting interreflections can be very important since, for example, many computer vision models, such as those that extract shape from shading information, presume that all the illumination a surface receives comes directly from the light source.

Figure 3. Light arriving at the eye or camera will consist of blue light reflected directly from the blue surface along with red-blue light reflected by both the red and blue surfaces.

Consider a scene that contains two non-coplanar surfaces of different colour, red and blue for example, but illuminated by a common light source, as shown in Figure 3. Some of the red light reflected from the red surface may impinge on the blue surface, which will then be lit by both the main light source and the red surface. Similarly, the blue surface may add some blue illumination to the red surface. In both cases the extra light interreflected between the surfaces will change the colour of those surfaces as seen by the camera. Ignoring the change in colour created by interreflection could easily adversely affect colour segmentation and object recognition algorithms. The interreflection process, of course, does not necessarily end after one iteration. Red light that hit the blue surface could again be interreflected back to the red surface, but the magnitude of these multiply reflected components drops quickly.

Analysis of the colour shifts created by interreflection turns out to have a lot in common with that for specular reflection. As with specular reflection, the goal in interpreting interreflections is to separate the colour image into components arising from the direct illumination of the light source and those arising from indirect illumination via interreflection. While the most general model of interreflection would allow for an infinite number of reflections between the two neighbouring surfaces, in practice a simplified one-bounce model works well in many situations and is much easier to analyse. The one-bounce model, based on a single interreflection of the light from one surface to the next, predicts that the light reflected from a surface consists of a linear combination of a no-bounce contribution and a one-bounce contribution. The no-bounce contribution results from the illuminant being directly reflected from the surface, while the one-bounce contribution comes from the reflection of another surface. The spectrum of the light C_A(x, λ) from a pixel at location x on surface A with reflectance S_A(λ), receiving light both from the light source of spectrum E(λ) and from surface B with reflectance S_B(λ), is described by the one-bounce model as

C_A(x, \lambda) = \alpha_A(x) E(\lambda) S_A(\lambda) + \beta_{BA}(x) S_A(\lambda) E(\lambda) S_B(\lambda),    (3)

where α_A(x) and β_BA(x) are the relative magnitudes of the two contributions. In terms of camera RGB rather than full spectra, this means that the final colour registered by the camera results from a no-bounce colour R0G0B0 and a one-bounce interreflection colour R1G1B1:

RGB = \alpha_A\, R_0 G_0 B_0 + \beta_{BA}\, R_1 G_1 B_1    (4)

This equation exactly parallels Equation (2) above, so, as in the case of specularities, the colours created by interreflection will again lie within a parallelogram in RGB space. The case of light reflected from A to B is symmetrical to that of light reflected from B to A. This is reflected in the symmetry of the second term in Equation (3), so the one-bounce colour reflected from surface A to B matches that from B to A. The RGBs from A and B will lie in two separate parallelograms in colour space; however, the equality of the one-bounce colour between them implies that the two parallelograms will share a common edge. Hence, the one-bounce colour can be determined (Funt and Drew 1993) as the intersection of the two parallelograms. This can then be used to extract the no-bounce components and produce images representing the scene with or without interreflection effects.
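A sketch of this intersection step, assuming the pixel RGBs from the two surfaces have already been grouped into two N x 3 arrays: the plane through the origin spanned by each surface's colours is found by SVD, and the shared one-bounce colour direction lies along the intersection of the two planes.

```python
import numpy as np

def one_bounce_colour(colours_a, colours_b):
    """Estimate the shared one-bounce colour direction as the
    intersection of the two planes (through the origin) spanned by
    the RGBs from surfaces A and B.  The plane normals are the
    right singular vectors with the smallest singular values; the
    intersection line runs along the cross product of the normals."""
    normal_a = np.linalg.svd(colours_a)[2][-1]
    normal_b = np.linalg.svd(colours_b)[2][-1]
    direction = np.cross(normal_a, normal_b)
    return direction / np.linalg.norm(direction)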
Hence, as in the case of specularities, colour provides important information that helps in understanding interreflections.

Colour and Shape

In computer vision there have been many different approaches to extracting the shape of surfaces from their shading. Most of these have assumed that the surfaces are of a single colour, but of course this assumption is generally violated in any practical scene. Using a grey-level, 'black and white' image of a scene does not solve the problem for shape-from-shading algorithms, since of course the world is still coloured even if the image is not.

One use of colour in the context of shape from shading is in separating the changes in an image created by changes in shading (i.e. the amount of incident illumination) from those due to variations in surface colour. Funt, Drew and Brockington (Funt, Drew and Brockington 1992) propose an algorithm that recovers the intrinsic shading field from a colour image. The shading field describes the changes in image intensity associated only with changes in surface orientation and illumination intensity. The algorithm detects colour edges by examining a chromaticity-space version of the image in which intensity is eliminated. It keeps all small colour changes, but thresholds out all large ones, in a manner derived from Horn's version (Horn 1974) of Land's retinex computation (Land 1977). The algorithm then reconstructs the image created by all the small changes, which encodes the shading information with the confounding surface-colour information removed. The resulting shading field can then be used by any shape-from-shading algorithm to calculate surface shape.
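The thresholding step is easy to illustrate in one dimension. This sketch is a deliberate simplification in the spirit of Horn's retinex reconstruction: small log-intensity steps along a single row are kept as shading, while large steps, which mark surface-colour edges, are removed before reintegration (the actual algorithm works in 2-D and decides which steps to remove using chromaticity rather than intensity, and the threshold here is an invented value).

```python
import numpy as np

def shading_from_row(row, threshold=0.3):
    """Keep small log-intensity steps (shading); zero out large ones
    (surface-colour edges); reintegrate.  The threshold is an
    illustrative choice."""
    log_row = np.log(row)
    steps = np.diff(log_row)
    steps[np.abs(steps) > threshold] = 0.0
    return np.exp(np.concatenate(([log_row[0]], log_row[0] + np.cumsum(steps))))

# Smooth shading multiplied by a piecewise-constant reflectance:
shading = np.linspace(0.2, 1.0, 100)
reflectance = np.where(np.arange(100) < 50, 0.9, 0.3)
recovered = shading_from_row(shading * reflectance)

# Recovered shading matches the true shading up to a constant scale;
# only the tiny shading step hidden under the colour edge is lost.
print(np.max(np.abs(recovered / (shading * 0.9) - 1.0)))   # ~0.013
```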
A different approach to shape and colour is to consider whether colour information can be used directly to extract information about surface shape. Petrov (Petrov 1993) shows that if the illumination on a surface varies in colour as a function of incident angle, a given shape can be identified. He considers the set of possible images that could be produced under all illumination conditions. He then proves that, for uniformly curved surfaces, the set of colour images taken under all possible illuminants, with the surface painted any possible colour, forms a manifold on which a co-ordinate system can be introduced. A given shape will have specific coordinates in this space and so can be identified.

Another approach to shape and colour follows from an extension to the photometric stereo method (Woodham 1980). In the original photometric stereo method, multiple images are taken under 3 different light sources. Given the images and the positions and relative intensities of the sources, the surface normals at all image points can be calculated. In practice, photometric stereo is often implemented by using a single colour image with red, green and blue light sources in different locations. The 3 bands of the colour image then have the same effect as 3 images taken in sequence.

The requirement that the light source positions and intensities be known can be relaxed, as shown by Woodham (Woodham 1994). He considers the set of colours which might appear on an arbitrarily shaped object given that the colour of the incident light varies as a function of its incident angle. For a sphere, which contains all possible orientations, he shows that the set of colours must fall on an ellipsoid in RGB colour space. This is analogous to the colours created by specularities and interreflections, discussed above, but now an ellipsoid is created instead of a plane. An object with fewer orientations than a sphere will simply create part of the full ellipsoid. Different illumination conditions will result in different ellipsoids, so by fitting an ellipsoid to the available data, Woodham can extract the relevant parameters of the illumination. Once the illumination parameters are fixed, the regular photometric stereo method can be applied to determine the surface normals (up to a global rotation factor) at all locations.
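A sketch of the basic photometric stereo computation, assuming known unit light directions of equal intensity and a Lambertian surface; the three bands of one colour image stand in for three separate grey-level images.

```python
import numpy as np

def photometric_stereo(image, light_dirs):
    """Classic photometric stereo: `image` is H x W x 3, with each
    band lit by a differently placed red, green or blue source;
    `light_dirs` is 3 x 3 with one unit light direction per row.
    Each band k satisfies I_k = albedo * dot(light_k, normal), so
    solving the 3 x 3 system per pixel gives g = albedo * normal."""
    h, w, _ = image.shape
    g = np.linalg.solve(light_dirs, image.reshape(-1, 3).T)   # 3 x (H*W)
    albedo = np.linalg.norm(g, axis=0)
    normals = (g / np.maximum(albedo, 1e-9)).T.reshape(h, w, 3)
    return normals, albedo.reshape(h, w)
```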
Petrov's and Woodham's methods show that colour encodes shape information that can, at least in principle, be extracted. Their methods involve many restrictions and assumptions; nonetheless, it is interesting that colour has the potential to inform us directly about shape.

Colour and Texture

Colour can aid the analysis of texture, as has been shown in the work of Panjwani and Healey (Panjwani and Healey 1995). They developed an algorithm which segments images into regions based on their colour texture. Figure 4 shows an example of the algorithm's excellent results.

Figure 4. Top: input image containing natural colour textures. Bottom: image segmentation results based on aggregating similar colour textures.

Texture is a random pattern, but one with fixed statistical properties, so the authors used Markov random fields to model the stochastic properties of a texture. In the case of texture in grey-level images, the Markov fields describe the statistical dependence of the intensity of a pixel as a function of the intensities of neighbouring pixels. In coloured textures, however, a statistical relationship also holds between the different colour channels, information which is lost by reduction to a single grey-level intensity channel. To exploit the local inter-dependencies between the colour planes, Panjwani and Healey use three-dimensional Markov fields to encode the stochastic properties of a texture not only within a colour plane, but also across colour planes. The Markov fields are used to determine a stochastic dependence between the RGB value of a pixel at a certain location and the RGB values at neighbouring locations. This model takes into account the dependence between all three colour planes and provides a compact description of a texture's properties.

Using this texture description method, scenes can be segmented based on the texture description obtained at each point in the scene. The textures in a scene differ according to the constraints embodied in the stochastic model, with similar textures having similar Markov random field descriptors. Thus, similar pixels within a region of a single texture will belong to the same class and will be segmented as a single object. The segmentation algorithm is unsupervised (i.e. it makes no a priori assumptions regarding the textures present in a scene) and operates in two stages. First is a region-splitting phase that divides the image into small regions that are guaranteed to contain only one type of texture. This is followed by a clustering phase that merges similar adjacent regions until a stopping criterion is satisfied. After clustering, the textural properties of the resulting regions are recomputed. As the results in Figure 4 show, the method is very good at creating an intuitively appealing segmentation. For example, the ocean has been appropriately segmented as a single region even though it contains a variety of different colours.

Other Applications of Colour

In addition to the ways in which colour can help in the interpretation of images discussed in detail in the preceding sections, there are numerous other ways in which colour has proven useful in computer vision; these are briefly described below.

Classifying the materials of surfaces in an image is one area in which colour can be helpful. Healey (Healey 1989) designed an algorithm that distinguishes between metals and inhomogeneous dielectrics using the colour reflectance properties of these two classes. The algorithm is founded on the observation that for metals the sensor measurements lie in a one-dimensional space (a line in RGB colour space), whereas for inhomogeneous dielectrics, for which the dichromatic model of reflection applies, the sensor responses lie in a two-dimensional space (a parallelogram in RGB colour space). The method, therefore, examines the colour-space histograms of the colours found in a region and uses the dimension of the space to determine the material type.
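The dimensionality test at the heart of this distinction can be sketched with a singular value decomposition; the tolerance here is an invented threshold for illustration.

```python
import numpy as np

def material_class(region_colours, tolerance=0.02):
    """Dimensionality test, sketched: a metal region's RGBs span
    (approximately) a line through the origin, while an inhomogeneous
    dielectric's span a plane, per the dichromatic model.
    `region_colours` is an N x 3 array of pixel RGBs."""
    s = np.linalg.svd(region_colours, compute_uv=False)
    s = s / s[0]                  # normalise by the largest singular value
    if s[1] < tolerance:
        return 'metal'            # effectively one-dimensional
    elif s[2] < tolerance:
        return 'dielectric'       # effectively two-dimensional
    return 'unknown'
```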
Colour also plays a role in stereo correspondence. Jordan et al. (Jordan, Geisler and Bovik 1990) have shown that, in some cases, chromatic information is used more efficiently than luminance information. This provides some clues about human stereo processing and how colour influences the visual system. Nguyen and Cohen (Nguyen and Cohen 1992) use colour for stereo correspondence matching in a computer system.

Wet surfaces look different from dry ones. Mall and da Vitoria Lobo (Mall and da Vitoria Lobo 1995) developed a model which predicts how the grey-level image of a dry surface changes when it is wetted. Their model predicts that wet surfaces are darker than dry ones and that the darkening caused by wetting is not the same as the simple darkening created by a shadow. Instead of being a uniform darkening, as would occur in a shadow, the darkening due to wetness is a function of the surface albedo. While Mall and da Vitoria Lobo applied their model to grey-level images only, for coloured surfaces the albedo varies with wavelength, so wetting will create colour shifts. It has yet to be tested, but it follows that colour should help in recognising surface wetness.

Colour Cues for Foveation

Camera sensors usually maintain constant spatial resolution across the entire image. This differs from the human visual system, where acuity is highly non-uniform, decreasing from the optical axis towards the periphery. Whether in a human or a computer vision system (e.g. Funt 1980), varying-resolution imagery necessitates saccadic movement of the eyes or camera to fixate areas of interest and extract high-resolution information. Swain et al. (Swain, Kahn and Ballard 1992) propose using simple multi-coloured regions as landmarks to guide the fixation process. One aspect of the colour indexing method described above in the section 'Colour-Based Object Recognition' is a technique called histogram backprojection, which finds instances of a target object in an image based on the target's colour histogram. To guide fixations, Swain et al. suggest creating colour histograms of a few colourful regions in the visual field to be used as landmarks. The regions do not have to correspond to an actual object, but rather are simply collections of colours. After a saccade, the landmarks can be found in the new view by histogram backprojection. Conveniently, colour histograms are not affected very much by changes in spatial resolution, which means that a histogram created from the high-resolution fovea can be used to backproject onto the low-resolution periphery.
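A minimal sketch of histogram backprojection, using the same coarse RGB binning as the earlier histogram sketch: each pixel is scored by the ratio histogram of the landmark, and the resulting map (after smoothing, omitted here) peaks at likely landmark locations.

```python
import numpy as np

def backproject(image, target_hist, bins=8):
    """Score each pixel of an 8-bit H x W x 3 image by how strongly
    its (quantised) colour features in the landmark's histogram."""
    q = (image.astype(int) * bins) // 256
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    image_hist = np.bincount(idx.ravel(), minlength=bins ** 3).astype(float)
    # Ratio histogram: colours common in the target but rare in the
    # rest of the image score highly.
    scores = np.minimum(target_hist / np.maximum(image_hist, 1.0), 1.0)
    return scores[idx]          # H x W map of per-pixel evidence
```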
Is colour constancy really necessary?

Many colour-based algorithms, such as 'colour constant colour indexing' or the segmentation algorithms above, require only relative colours to perform well, and the computation of absolute colours (i.e. object surface colours as seen under a fixed canonical illuminant) is not necessary. In other cases, such as colour reproduction or digital photography, it is important to estimate absolute values for the colours in a scene. Often there is more than one illuminant in a scene, so the chromaticity of a surface might vary spatially, which will affect most segmentation and object recognition algorithms. Some colour constancy algorithms (Barnard, Finlayson and Funt 1997) can eliminate the effect of multiple illuminants.

Colour constancy is an under-determined problem, so the solution will not be unique. It has been shown that under certain circumstances (Finlayson, Drew and Funt 1994a) images can be converted to look as if they were taken under a standard 'canonical' illuminant. This conversion is done by a simple diagonal transformation applied to all the pixels in a scene. Thus, once the colours of the initial and canonical illuminants are specified, the problem is reduced to estimating the colour of the illuminant.

We designed a neural network (Funt, Cardei and Barnard 1996; Cardei, Funt and Barnard 1997) that is able to recover the chromaticity of the illuminant in a scene based only on a binarised chromaticity histogram of the image. The neural network is trained on a large set of synthesised scenes for which the illuminant is known. Once the learning period is completed, the network is presented with the histogram of a scene and outputs an estimate of the chromaticity of the illuminant. The estimates obtained are comparable with those of the best colour constancy algorithms to date (e.g. Finlayson 1996 and 1997, Forsyth 1990) and much superior to traditional methods such as retinex (Land 1977). This non-parametric approach to colour constancy has the advantage that it has no built-in constraints and that it is adaptive: it can always be retrained for new conditions, such as a different camera. An example of the colour correction carried out by the neural network is shown in Figure 5.

Figure 5. An example of neural-network-based colour correction results. Top left: input image taken under unknown illumination. Top right: target image taken under the canonical illumination. Bottom left: good colour correction result obtained based on the neural network's estimate of the illumination. Bottom right: poor colour correction result obtained based on the grey-world estimate of the illumination.
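Once an illuminant estimate is in hand, the correction itself is just the diagonal transform described above. A minimal sketch, in which the illuminant RGB triples are placeholders for, e.g., the network's estimate:

```python
import numpy as np

def correct_to_canonical(image, scene_illuminant, canonical_illuminant):
    """Diagonal colour correction: scale each channel so the estimated
    scene illuminant maps onto the canonical one.  `image` is an
    H x W x 3 array; the illuminants are RGB triples."""
    gains = np.asarray(canonical_illuminant) / np.asarray(scene_illuminant)
    return image * gains            # broadcasts the per-channel gains

# Example with an invented reddish scene illuminant:
image = np.random.default_rng(2).uniform(0.0, 1.0, size=(4, 4, 3))
corrected = correct_to_canonical(image, [1.2, 1.0, 0.7], [1.0, 1.0, 1.0])
```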
Conclusion

Whether or not a computer vision system employs colour imagery, the world contains colours. Reducing the colours to a one-dimensional grey-level image is more likely to make image interpretation more, rather than less, difficult. We have tried to demonstrate this point by example, and have considered how colour can be used in computer vision systems to help in interpreting interreflections, specularities, material types and surface wetness, as well as in recognising objects, enhancing stereo correspondences, providing foveation landmarks and extracting shape. Many other applications of colour are possible: interpreting shadows or recognising transparency, for example.

We find it somewhat surprising that none of these example applications requires absolute colour, that is, object surface colour which is independent of the illuminant. For all of them, changes in the illumination either do not matter or can easily be factored out by considering colour ratios. While humans clearly possess a fair degree of colour constancy, and colour constancy can also be achieved in a computer vision system, it remains unclear exactly what crucial role it might play.

Acknowledgement

This work has been supported by the Natural Sciences and Engineering Research Council of Canada.

References

• Barnard, K., G. Finlayson and B. Funt (1997). Color Constancy for Scenes with Varying Illumination. Computer Vision and Image Understanding, 65(2), 311-321.
• Brainard, D.H., W.A. Brunt and J.M. Speigle (1997). Color Constancy in the Nearly Natural Image. 1. Asymmetric Matches. J. Opt. Soc. Am. A, 14(9).
• Cardei, V., B. Funt and K. Barnard (1997). Modeling Color Constancy with Neural Networks. Proc. Int. Conf. on Vision, Recognition, and Action: Neural Models of Mind and Machine, Boston, May 29-31.
• Finlayson, G., P.M. Hubel and S. Hordley (1997). Color by Correlation. Proc. IS&T/SID Fifth Color Imaging Conference: Color Science, Systems and Applications, 6-11.
• Finlayson, G. (1996). Color in Perspective. IEEE Trans. on PAMI, 18(10), 1034-1038.
• Finlayson, G., M. Drew and B. Funt (1994a). Color Constancy: Generalized Diagonal Transforms Suffice. J. Opt. Soc. Am. A, 11(11), 3011-3020.
• Finlayson, G., M. Drew and B. Funt (1994b). Spectral Sharpening: Sensor Transformations for Improved Color Constancy. J. Opt. Soc. Am. A, 11(5), 1553-1563.
• Forsyth, D.A. (1990). A Novel Algorithm for Color Constancy. Int. J. of Computer Vision, 5(1), 5-36.
• Funt, B. (1980). Problem-Solving with Diagrammatic Representations. Artificial Intelligence, 13, 201-230.
• Funt, B., M. Drew and M. Brockington (1992). Recovering Shading from Color Images. ECCV'92, 124-132.
• Funt, B., and M. Drew (1993). Color Space Analysis of Mutual Illumination. IEEE Trans. on PAMI, 15(12).
• Funt, B., and G. Finlayson (1995). Color Constant Color Indexing. IEEE Trans. on PAMI, 17(5).
• Funt, B., V. Cardei and K. Barnard (1996). Learning Color Constancy. Proc. IS&T/SID Fourth Color Imaging Conference: Color Science, Systems and Applications, 58-60.
• Healey, G. (1989). Using Color for Geometry-Insensitive Segmentation. J. Opt. Soc. Am. A, 6(6), 920-936.
• Horn, B.K.P. (1974). Determining Lightness from an Image. Computer Graphics and Image Processing, 3(1), 277-299.
• Jordan III, J.R., W. Geisler and A. Bovik (1990). Color as a Source of Information in the Stereo Correspondence Process. Vision Res., 30(12), 1955-1970.
• Land, E.H. (1977). The Retinex Theory of Color Vision. Scientific American, 108-129.
• Mall, H., and N. da Vitoria Lobo (1995). Determining Wet Surfaces Versus Dry. Proc. ICCV'95 Fifth International Conference on Computer Vision, 963-968.
• Nguyen, H.H., and P. Cohen (1992). Correspondence from Color Shading. Proc. ICPR'92 Intl. Conf. on Pattern Recognition, I:113-116.
• Panjwani, D.K., and G. Healey (1995). Markov Random Field Models for Unsupervised Segmentation of Textured Color Images. IEEE Trans. on PAMI, 17(10).
• Petrov, A.P. (1993). On Obtaining Shape from Color Shading. Color Res. and Appl., 18(6), 375-379.
• Shafer, S.A. (1985). Using Color to Separate Reflection Components. Color Res. Appl., 10(4), 210-218.
• Swain, M., and D. Ballard (1991). Color Indexing. Int. J. of Computer Vision, 7(1), 11-32.
• Swain, M., R.E. Kahn and D. Ballard (1992). Low Resolution Cues for Guiding Saccadic Eye Movements. Proc. CVPR IEEE Computer Vision and Pattern Recognition Conference, 737-740.
• Woodham, R.J. (1980). Photometric Method for Determining Surface Orientation from Multiple Images. Optical Engineering, 19(1), 139-144.
• Woodham, R.J. (1994). Gradient and Curvature from the Photometric-Stereo Method, Including Local Confidence Estimation. J. Opt. Soc. Am. A, 11(11), 3050-3068.