Environmental Challenges
Keywords: Industrial catchment; Urban ecosystem; River pollution; Multivariate statistics; Urban river

Urban river pollution is considered a ‘necessary evil’ consequence of disproportionate developmental expansion in metropolises. Unprecedented expansion and anthropic activities amongst other reasons lead to choking of urban rivers with municipal and industrial sewage. Urban rivers are dying while awaiting rescue, despite the hard efforts by civic authorities, largely due to lack of coordination amongst river authorities and stakeholders, underlying conflicts spanning across all levels, balkanisation, pluralization, scarcity of reliable technical data, and financial constraints. Challenges faced by stakeholders in river pollution management as revealed in informational interviews are foregrounded. In an attempt to reduce some of the overwhelm faced by officials on ground, a geospatial framework is proposed which if functionally implemented can transparentize river water quality (WQ) monitoring, and facilitate pollution control. The pollution situation in a persistently polluted urban river near Mumbai city, India was explored and assessed in the middle of a restoration battle waged by environmental activists. Following secondary river water quality data acquisition, primary sample collection campaign and WQ testing, multivariate statistical data analysis was performed under data- and resource-constraint situation. Spatiotemporal monitoring and visualization of river water quality data holds great promise for effectively controlling anthropic river pollution. A subsequent geospatial analysis of the study area was performed using digital elevation model (DEM) based watershed delineation, land use land cover (LULC) classification, mapping of WQ monitoring locations, mapping of industrial clusters, integration of spatial data, and identifying polluter industries. This steered us to formulate and propose the geospatial framework and supplementary recommendations on how better to save this dying river. The framework displays near real-time information on river water quality at different impaired river stretches in urban industrial ecosystem. Applicable universally for monitoring any river in urban industrial catchment, it can be used as a reference by stakeholders and research aspirants.
1. Introduction

Largely unbridled developmental expansion and urbanization result in dense clustering of micro/small/medium/large industrial catchments and surrounding residential areas across global urban ecosystems, especially in low- and middle-income countries (LMICs) (Elmqvist et al., 2013; L Sun et al., 2020). As a seemingly inescapable consequence urban rivers continue getting polluted and choked by municipal and industrial sewage, thus paving way for their demise (Chandrashekhar, 2018; Strokal et al., 2021; Wen et al., 2017). Pollution situation of river stretches in urban India is exponentially worse due to presence of clusters of industries as compared to rural India (Panda et al., 2018). The anthropical land use/cover (LULC) changes within a watershed negatively impacting river water quantity (Samal & Gedam, 2021) and quality is a well-researched and widely accepted phenomenon in literature (Santy et al., 2020; Xiaoping Wang & Zhang, 2018; Xiao et al., 2016). Polluted urban waters are deliberated as a necessary evil in pursuit of economic development/growth despite its unsustainability quotient.
Received 31 August 2021; Received in revised form 6 February 2022; Accepted 24 February 2022
S. Sekharan, D.R. Samal, H.C. Phuleria et al. Environmental Challenges 7 (2022) 100496
Around 63% of the total 62 million litres/day generated sewage is dumped in Indian urban rivers without treatment, as per a Central Pollution Control Board report submitted to National Green Tribunal, India (URL1, 2016; URL2, n.d.). Despite well-defined pollution control programs and policies in place for rivers set up by concerned authorities, river water-quality across global cities is on a constant downhill on account of myriad anthropic reasons (Best, 2019). Implementation of river pollution policies faces an enforcement gap due to several challenges associated with urban rivers: undercurrent hierarchization (Chien & Hong, 2018), pluralization (Sherly et al., 2015) and balkanization (Parthasarathy, 2016) of the environmental management authorities, municipal bodies and other water regulatory stakeholder authorities who are obligated to collaborate and coordinate with each other to manage water quality and keep rivers pristine (Cribb, 1990; Weale, 1992); and unorganized clusters of urban industrial catchments and common effluent treatment plants (CETPs)/sewage treatment plants (STPs) not functioning to their full capacity that directly discharge their effluents into rivers (Lucas, 2002). Most micro, small, and medium scale industries either do not bother or struggle to afford the cost-intensive industrial effluent treatment plants, and tend to directly/indirectly discharge their semi-treated/untreated effluents to rivers. There are numerous news articles and eye-witness accounts of illegal, unregulated and untreated toxic waste-dumping into rivers (Helmer & Hespanhol, 1997; Undirwade, 2015). At many river stretches, domestic sewage is either directly poured into rivers without any form of treatment (Goel, 2006; Suthar et al., 2010; Xu et al., 2019) or is mixed at CETPs meant for industrial wastewater (Salunke et al., 2014) for which they are not equipped. Source apportionment studies, which derive information about sources of pollution and the amounts they contribute (Huang et al., 2010; Kumar et al., 2018; Rout et al., 2013; Zhou et al., 2007), are very important for any pollution control programme but are rarely considered in practice. In addition to the above-mentioned challenges faced in the context of river pollution control, river water quality analyses data performed by authorities are often found to be ambiguous, missing and incongruous, which hinders the effective design and implementation of river pollution control measures.

The above-discussed issues were uncovered as a result of multiple stakeholder discussions, informational interviews and field visits, while conducting a pollution assessment study of a heavily polluted urban river located in the Mumbai Metropolitan Region, which is a coastal region in western India. Secondary river WQ data collected from concerned river authorities was subjected to multivariate statistical analyses but was found to be unreliable and sparse for making any robust inferences. In view of such resource- and data-sparse situation, a one-time comprehensive river monitoring was conducted to get an instantaneous status of water quality at different stretches of the study area. Comprehending the practical challenges faced by the concerned stakeholder authorities in reducing pollution levels helped germinate the idea of developing an integrated, geospatial decision-support framework for near real-time river pollution monitoring. Remote sensing and GIS technologies are widely being utilized for effective river water pollution monitoring in data-sparse LMICs (El-Rawy et al., 2020; Griffin et al., 2018; Kapalanga et al., 2021; Kloiber et al., 2002; Sultana & Dewan, 2021; X Sun et al., 2022).

The proposed remote-sensed and Geographic Information System (GIS) based integrated framework, universally applicable to any river system, is a facilitative attempt to transparentize river pollution monitoring by incorporating GIS technology and possibly reduce some of the decision-making overwhelm faced by river authorities in pollution control. The GIS based monitoring, if functionally implemented, shall enable the extraction of near real-time river water quality status information at any point in the river stretch, which shall help track the impaired river stretches. This can assist in transparent source apportionment analyses. Scrutinizing spatiotemporal variations in river water quality and LULC patterns contributes to identifying pollution sources of urban rivers (Liu et al., 2016; Yang et al., 2021), and GIS will help achieve that. It will also help in monitoring the frequency of impairment happening at river stretches. Using the proposed framework, the river authorities can map near real-time instream water quality to identify impaired river stretches, thus stimulating timely action for pollution control. This shall not only help in reconfiguring the river monitoring activities in a fool-proof manner but also in generating consistent and reliable WQ data for further analyses.

Most urban polluted rivers in LMICs are the direct consequence of erratic urban sprawl within highly industrialised areas (Wen et al., 2017; Xu et al., 2019). People living in urban industrial catchments are usually dependent on the same polluted rivers for their domestic needs, directly or indirectly. Previous studies have highlighted the importance of GIS based water quality monitoring in India and abroad (Ramadas & Samantaray, 2018; Xiaoping Wang & Zhang, 2018). However, rarely do such studies pertain to rivers flowing through urban industrial ecosystems. The novel objective of this paper is to bring to light the challenges faced in managing urban watershed pollution, and to demonstrate the efficacy of integrating GIS into the existing WQ monitoring procedures through advanced monitoring strategies and robust data inventorization, specifically in urban industrial catchments.

2. Material & methods

Every possible secondary dataset available with concerned stakeholders on water quality parameter analyses in the study area was collected for systematic data inventorization and analysed to the extent possible. They were largely inconsistent and had a significant number of missing values. Nevertheless, multivariate statistical techniques were applied on the part of the data found to be reliable and consistent, and its results have been displayed for demonstrating the advantage of such techniques. A corroborative WQ monitoring was also performed at strategic locations in the study area and multiple geospatial data surveys were conducted.

2.1. Study area description

The study area physically explored for pollution assessment is the polluted Ulhas-Waldhuni (UW) watershed located in the intensely urbanized Mumbai Metropolitan Region, Maharashtra state, western India. As seen in Fig. 1, Ulhas river originates from Western Ghats and flows in north-west direction into the Arabian Sea. Ulhas River at Vangani village is the most upstream point in the study area, and is thus relatively in cleaner form. The downstream point of NRC Bund is the first point in the saline Ulhas Creek. Ulhas Creek at Retibunder is the most downstream point in the study area. The urban areas around the study area being part of MMR are industrially important and are hence densely populated. For much of its flow route, Ulhas river is either dying with low oxygen supply struggling to support marine life or a glorified gutter laden with anthropogenic and industrial effluents (Jadhav & Singare, 2015a, 2015b, 2015c, 2015d; Sahu & Mukherjee, 1983; Singare, 2012). Waldhuni river, which was originally a tributary of Ulhas river outflowing from GIP and Chikhloli dams in Ambernath district, is a visibly dead zone (Nabar et al., 2011; Pardeshi & Vaidya, 2015; Undirwade, 2015). It is surrounded by a plethora of micro/small/medium scale manufacturing industrial clusters (IC) with registered and unregistered industries and residential slums that contribute heavily to pollution load. A significant portion of the total wastewater generated in the study area is from domestic sources, which is disposed into the study area without any treatment.

The total study area is 317.7 km². Both Waldhuni watershed (66.4 km²) and two impaired stretches of Ulhas river (167.6 km² sweet inland zone from Vangani village in Thane district to upstream of NRC Bund at Shahad, Kalyan; 83.7 km² tidal saline zone from Ulhas Creek stretch from NRC Bund till Retibunder in Dombivli), where pollution concentration is reported to be high, were considered for the study. Six CETPs
(catering to around 1000 industries) and four STPs (catering to a population of 20 million approx.) discharge their industrial and domestic effluents respectively into the study area.

2.2. Multivariate statistical analysis on river water quality data

Multivariate statistical analysis (MSA) (Anderson et al., 1958; Johnson & Wichern, 2002) of statistically-eligible river water quality data is extensively applied to assess pollution situation at different stretches of river systems (Kazi et al., 2009; Noori et al., 2010; Shrestha & Kazama, 2007; Singh et al., 2004; Vega et al., 1998; Zhang et al., 2011). Multivariate statistical techniques - cluster analysis (CA), factor analysis (FA), principal component analysis (PCA) - are applied for water quality assessment (A Mishra, 2010), identification (Puckett, 1995) and apportionment of pollution sources/water-quality parameters (Gholizadeh et al., 2016; Singh et al., 2005) to get robust information about the water quality and design of monitoring network/strategy (Bartram & Ballance, 1996; Dixon & Chiswell, 1996;
Strobl & Robillard, 2008; Varekar et al., 2015) for effective management of water resources (Singh et al., 2004), and subsequent water quality modelling (Biswas, 2006; Chapra, 2008; Kannel et al., 2007). Application of multivariate statistics on water quality datasets helps in tracing out the origin of different pollutants such as municipal sewage, industrial effluent and agricultural run-off, provided large and consistent datasets are available for relevant water quality parameters (Mavukkandy et al., 2014; Simeonov et al., 2003).

In the current study, multivariate statistical analysis techniques were applied to rationalize water quality parameters and monitoring locations on two sets of reliably consistent secondary datasets from 5 locations (numbered in Fig. 1) procured from Monitoring of Indian National Aquatic Resources System (MINARS) project under National Water Quality Monitoring Program (NWMP). Set-I consisted of 10 months (Jan 2014 - Oct 2014) data on 21 water quality parameters. Set-II consisted of 47 months (Jan 2011 - Nov 2014) data on 6 water quality parameters (pH, DO, BOD, COD, Nitrate Nitrogen, Faecal Coliform).

Multivariate statistical methods require that the data be dimensionless, hence data distribution was standardized and normalised (Zhang et al., 2011). Missing data imputation included replacing by mean values from two adjacent stations. Below Detection Limit (BDL) values were imputed using a technique explained in Croghan & Egeghy (2003). CA was applied on Set-I for stakeholder demonstrations, and FA, being best suited for large sample sizes, was applied on Set-II. In water quality context, CA essentially creates homogenous groups or ‘clusters’ based on similar/dissimilar behavioural patterns observed amongst WQ variables. Agglomerative (bottom-up) hierarchical CA, which is quite suitable for pattern-recognition in small datasets and can be visualized through a dendrogram, classifies data by considering each datapoint as a singleton cluster and then merging the closest clusters together stepwise till an all-encompassing root cluster remains. FA helps identify significant monitoring locations that display the maximum possible variations in water quality. FA is performed by examining the pattern of correlations between the variable values. Variables that are highly correlated are likely influenced by the same factors while those that are relatively uncorrelated are likely influenced by different factors. The default case of 2 factors was considered and factor correlation coefficients were computed. The factor correlation coefficient was considered significant if its value was greater than 0.7 (Noori et al., 2010; Ouyang, 2005). The variables with factor loading (correlation coefficients between the factor and the variable) less than this threshold (0.7) were considered to be non-principal variables (Mavukkandy et al., 2014). PCA and FA are pattern recognition and dimension reduction techniques, and in WQ context assuming there are $p$ variables, the idea is that $m < p$ independent and representative factor variables exist that can capture the water quality variation in river system. PCA explains variance in a huge dataset of correlated variables with lesser number of uncorrelated independent variables (principal components (PC)), thus removing redundant variables with minimal loss of information (Iscen et al., 2008; Simeonov et al., 2003). PCA extracts eigenvalues and eigenvectors from covariance matrix of original WQ parameter variables. The $pc_i$, weighted linear combinations of original WQ variables, are the orthogonal i.e., uncorrelated variables obtained by multiplying the original correlated standardized WQ variables with the eigenvector weights, as seen in Eq. (1).

$$pc_i = \sum_{j=1}^{N} V_{ij} \left( \frac{x_j - \bar{x}_j}{\sigma_j} \right), \quad i = 1, 2, \ldots, p \qquad (1)$$

where $x_j$ are observations of the $j$th WQ variable with mean $\bar{x}_j$ and standard deviation $\sigma_j$, and $V_{ij}$ are eigenvectors of covariance matrix of WQ data, $N$ is total number of observations. $pc_1$ accounts for maximum variance in the data, $pc_2$ accounts for the second largest amount of variance in the data and is uncorrelated with $pc_1$, and so on.

FA is performed by examining the covariance data matrix i.e., pattern of correlations between the WQ parameters. FA considers the variance in the observed WQ variables to be influenced by presence of latent factors (common factors). Highly correlated WQ parameters are likely influenced by the same factors, and uncorrelated WQ parameters are likely influenced by different factors. FA follows PCA. FA reduces the less significant WQ variables coming from PCA by rotating the axis defined by PCA and constructing new variables, vari-factors (VFs). VFs include latent variables and PCs are linear combinations of observable water-quality variables but both VFs and PCs have the same variation (Iscen et al., 2008).

In matrix notation, factors can be written in linear combination form as

$$X - \mu = \Lambda f + \varepsilon \qquad (2)$$

where $X = (x_1, x_2, \cdots, x_p)'$; mean $\mu = (\mu_1, \mu_2, \cdots, \mu_p)'$; factors $f = (f_1, f_2, \cdots, f_m)'$; error matrix $\varepsilon = (\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_p)'$; factor loadings matrix $\Lambda = (\lambda_{ij})$, where $\lambda_{ij}$ is the loading of $X_i$ on factor $f_j$ with $i = 1, \ldots, p$; $j = 1, \ldots, m$; $m < p$.

In this study where getting reliable and consistent data was a constant challenge, a brief part of secondary river water quality data collected from stakeholders was found eligible for applying multivariate techniques. Their results have been included ahead for demonstration purposes. The objective was to familiarize river stakeholders with the practical functionality of applying multivariate statistical analysis techniques in WQ monitoring.

Fig. 3a. Clusters formed of Inland stations (in Blue) & tidally influenced stations (in Red).

2.3. Development of a geospatial framework for pollution monitoring in rivers

Remote sensing and GIS technologies are being widely studied and applied for effective river quality monitoring (Ritchie et al., 2003; Satapathy et al., 2010; Somvanshi et al., 2012; Sultana & Dewan, 2021). As evident in recent literature, futuristic geospatial and Internet of Things (IoT) based techniques are also being used for WQ monitoring (Cao et al., 2022; Chen et al., 2022; Chowdury et al., 2019; Radhika et al., 2022; Sharmila et al., 2022; X Sun et al., 2022). Alongside the application of multivariate statistical techniques on secondary data, a geospatial framework was thought out in order to develop a near real-time system for WQ monitoring. The development of geospatial framework as seen in Fig. 2 includes four steps: a) DEM-based automated watershed delineation, b) mapping of water sampling locations and visualization of pollution concentration, c) preparation of LULC map from satellite image, d) demarcation of industrial catchments through spatial analysis; finally, all the datasets were integrated in GIS environment to develop a framework for pollution monitoring in the study area. The globally available Shuttle Radar Topography Mission (SRTM) data (30 m) was used to delineate the UW watershed and sub-watersheds. The process flow includes sink/fill, determination of flow direction, flow accumulation, pour point selection and watershed delineation. The entire watershed in the study area is further divided in several sub-watersheds to identify the areas having close proximity to the impaired stretches of the river. The water sample locations in the study area were collected through a Differential Global Positioning System (DGPS) survey method. The sampling locations (Lat/Long, World Geodetic System 1984 (WGS-84)) were plotted in GIS and their real-world position was validated visually with the help of Google Earth platform. The LULC map was prepared to establish a qualitative relation between anthropogenic effects and water quality. The Landsat En-
Fig. 3b. Clustered Heat map (double dendrogram) showing standardized WQ parameter values at each station. Tidally-influenced stations ([1], [3]) seem to be
crossing limits in several WQ parameters.
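The standardization behind the heat map and the principal-component construction of Eq. (1) can be illustrated with a minimal, pure-Python sketch. It is not the authors' code: the readings in `raw` are invented for illustration, the helper names are hypothetical, and power iteration is used here only as a simple way to obtain the leading eigenvector (a statistics package would normally compute all components at once).

```python
import math

def standardize(column):
    """Return z-scores: (x - mean) / sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]

def covariance(a, b):
    """Sample covariance of two equally long columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def leading_eigenpair(matrix, iterations=200):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    size = len(matrix)
    vector = [1.0] + [0.0] * (size - 1)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(v * v for v in product))
        vector = [v / norm for v in product]
    product = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
    eigenvalue = sum(v * w for v, w in zip(vector, product))
    return eigenvalue, vector

# Hypothetical readings (NOT from the study): five samples of three parameters.
raw = {
    "BOD": [4.0, 8.0, 15.0, 22.0, 30.0],
    "COD": [20.0, 35.0, 60.0, 90.0, 120.0],
    "DO": [6.5, 5.8, 4.0, 2.5, 1.2],
}
names = list(raw)
z = {name: standardize(values) for name, values in raw.items()}

# Covariance matrix of the standardized variables (equal to the correlation matrix).
cov = [[covariance(z[a], z[b]) for b in names] for a in names]

eigenvalue, weights = leading_eigenpair(cov)

# First principal-component score of each sample: sum over j of V_1j * z_j.
n_samples = len(raw["BOD"])
pc1 = [sum(w * z[name][k] for w, name in zip(weights, names)) for k in range(n_samples)]

# Share of total variance carried by pc1 (total variance = number of variables).
explained = eigenvalue / len(names)
```

In this invented example BOD and COD rise together while DO falls, so the first component loads BOD and COD with one sign and DO with the opposite sign and carries nearly all of the variance.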
hanced Thematic Mapper (ETM+) (30 m) data was downloaded from open-source domain and pre-processed before classification. Broad land use classes, such as water bodies, built-up, waste land and forested areas were identified, and their spectral signatures were taken for image classification. In post-processing, a majority filter was used on the image classification to remove salt-pepper effects and produce a relatively better LULC map (Samal & Gedam, 2015). In the study area, the ICs are a major contributor to the industrial pollution. Therefore, all the IC boundary, CETPs, member industry units along with their cadastral information were collected to identify the probable sources of pollution in the study area. In the due process, it was noticed that the cadastral information was in non-GIS (*.dwg) format which must be converted to a GIS compatible (*.shp) format for seamless integration with other spatial data in a geospatial framework. The entire conversion process was carried out and the maps were geo-referenced using ground control points (GCP). In addition, the plot boundaries were spatially adjusted to realign the exact orientation of cadastral plots within the ICs.

3. Applications

Challenges encountered with secondary river WQ data and related insights

The significant observations made through field-visits and data analyses were: a) high level of visible river pollution in the study area, discoloured water, several stretches with no flow owing to sewage accumulation, low awareness on effects of river pollution in surrounding public, b) overburdened CETPs with each plant catering to more member industries than it is designed to, c) non-optimal functioning of STPs and CETPs, industries bypassing wastewater discharge rules, d) considerable difficulty in data collection from stakeholders requiring numerous follow-up attempts, e) presence of intermittent, inconsistent and unreliable secondary data from stakeholders on river water quality, rendering most of them unqualified for efficient application of multivariate statistical techniques.

During secondary data inventorization, a water-quality data matrix was created that helped visualise the extent of the data scarcity
Fig. 3c. Dendrogram showing clusters of inland (‘red’ stations [2], [4], [5] as also marked in Fig. 1) and tidally-influenced stations (‘green’ ones are [1], [3] stations
in Fig. 1)
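The bottom-up merging that a dendrogram records can be sketched in a few lines of pure Python. This is an illustration only: the station labels and standardized profiles below are invented, single linkage with Euclidean distance is an assumption (the linkage rule used in the study is not stated here), and the function name `agglomerate` is hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(profiles, n_clusters=1):
    """Single-linkage agglomerative clustering.

    Starts with one singleton cluster per label and repeatedly merges the two
    closest clusters until n_clusters remain. Returns the remaining clusters
    and the merge log (cluster, cluster, distance), i.e. the dendrogram steps.
    """
    clusters = [frozenset([label]) for label in profiles]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                distance = min(
                    euclidean(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        distance, i, j = best
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges

# Hypothetical standardized profiles (NOT study data), one per monitoring station.
stations = {
    "A": [1.2, 1.1, -1.0],
    "B": [-0.8, -0.7, 0.6],
    "C": [1.0, 1.3, -1.2],
    "D": [-0.6, -0.9, 0.8],
    "E": [-0.8, -0.8, 0.8],
}

groups, merge_log = agglomerate(stations, n_clusters=2)
```

Calling `agglomerate(stations)` with the default `n_clusters=1` continues merging until the single root cluster remains, which corresponds to the top of the dendrogram.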
problems. Its thorough scrutiny revealed irregular water quality samples, with different parameters analysed in each sample, producing a huge quantum of missing values. It was also observed that each sampling location has different sample sizes for each parameter. Such a data-scarce situation encountered due to balkanization and pluralisation makes it challenging to generate robust inferences using multivariate statistical analyses for which substantial amount of past data, with minimal missing values that is rearrangeable to multidimensional matrix form, is of utmost necessity. Sufficient past data collected through efficient surface water quality monitoring (Bartram & Ballance, 1996; Dixon & Chiswell, 1996; Strobl & Robillard, 2008; Varekar et al., 2015) that is temporally continuous with minimal missing values is a pre-requisite for consequential robust statistical analyses so as to assess long-term trends, influence of seasonal fluctuations, and any identification (Yang et al., 2021) and impact of point (as well as non-point) source of water pollution (Kamarudzaman et al., 2011; Loehr, 1974; Puckett, 1995; Schaffner et al., 2009; Xuejun Wang et al., 2004), and surface water quality modeling (C Wang et al., 2006; Q Wang et al., 2013; Whitehead et al., 2018). Nevertheless, two secondary water quality datasets procured from NWMP MINARS for 5 locations (marked [1,2,3,4,5] in Fig. 1) in the study area were found satisfactorily eligible for applying multivariate statistical analysis techniques. CA was suitably applied on Set-I to create clusters of monitoring locations exhibiting similar behavior, displayed as double dendrogram or in other words ‘clustermap’ in Figs. (3a, 3b, 3c). Stakeholders will be able to decipher visually the cluster of river stretches that need attention and relevant actions can be taken up.

The suitability of the secondary data (Set 2) for factor analysis was evaluated by Kaiser–Meyer–Olkin (Cerny & Kaiser, 1977) measuring of sampling adequacy and Bartlett tests of sphericity (Arsham & Lovric, 2011). A p-value < 0.05 obtained in Bartlett test reveals that correlation is present among the WQ variables which ensures successful application of factor analysis. A Kaiser–Meyer–Olkin (KMO) value, i.e., overall proportion of variance, of 0.8023 also shows that factor analysis can be applied on Set 2. In Fig. 4, Scree plot generated after applying factor analysis, the number of eigenvalues greater than 1.0 reveals the apt number of factors to be considered since eigenvalues are the amount of variance explained by the factors. Varimax orthogonal rotation that requires the factors to be uncorrelated is performed to generate factor loadings. Loadings are correlations between factors and the variables and range from -1 to 1 and indicate the extent to which a factor explains a variable. Values close to -1 or 1 indicate that the factor has an influence on these variables; the first factor always accounts for the most variance (and hence has the highest eigenvalue), and the next factor accounts for as much of the left-over variance as it can, and so on. Communality measures the proportion of each WQ variable’s variance that can be explained by the two factors jointly, e.g., 77.9% of COD’s variance is explained by the two factors. Table 1 shows the factor loadings between the WQ parameters, and it can be inferred that pH, BOD, COD with higher factor loadings (values > 0.7 in bold in Table 1) are significant enough to be considered for continuous monitoring in future. FA is to be necessarily applied to determine the optimal number of stations and WQ parameters to be included in a monitoring network so as to optimize costs.

Challenges encountered with corroborative study and related insights

A corroborative one-time sampling study was undertaken with the objective of getting an instantaneous water quality picture of the study area and the extent of pollution caused by its point sources (i.e., the STPs, CETPs, and Nallahs). Around 130+ locations were initially iden-
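The relation between the loadings and communalities reported in Table 1 can be checked with a short sketch. The numbers are copied from Table 1; the variable names are hypothetical, and the check relies on the standard property that, for orthogonally rotated factors (varimax is stated in the text), a variable's communality equals the sum of its squared loadings.

```python
# Factor loadings (Factor 1, Factor 2) and communalities as printed in Table 1.
loadings = {
    "pH": (-0.095, 0.706),
    "DO": (-0.592, 0.504),
    "BOD": (0.794, -0.248),
    "COD": (0.853, -0.229),
    "Nitrate": (0.449, -0.348),
    "Fecal Coliform": (0.492, -0.046),
}
reported = {
    "pH": 0.508,
    "DO": 0.605,
    "BOD": 0.692,
    "COD": 0.779,
    "Nitrate": 0.323,
    "Fecal Coliform": 0.244,
}

THRESHOLD = 0.7  # significance cut-off for a loading, as stated in the text

# Communality = sum of squared loadings across the (orthogonal) factors.
communalities = {name: sum(l * l for l in pair) for name, pair in loadings.items()}

# Parameters with at least one loading whose magnitude exceeds the threshold.
principal = [
    name for name, pair in loadings.items()
    if max(abs(l) for l in pair) > THRESHOLD
]

# Largest gap between recomputed and printed communalities (rounding only).
max_gap = max(abs(communalities[name] - reported[name]) for name in loadings)
```

With the printed values, `principal` contains pH, BOD and COD, in line with the parameters singled out in the text, and the recomputed communalities differ from the printed ones only by rounding of the three-decimal loadings.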
Table 1
Factor loadings corresponding to each WQ parameter variable in Set 2.

WQ parameter variable   Factor 1   Factor 2   Communality
pH                       -0.095      0.706       0.508
DO                       -0.592      0.504       0.605
BOD                       0.794     -0.248       0.692
COD                       0.853     -0.229       0.779
Nitrate                   0.449     -0.348       0.323
Fecal Coliform            0.492     -0.046       0.244

tified through virtual exploration of the study area, discussions with stakeholders, and physical field visits. Out of these 130+ locations, 26 strategic locations were finalised for the one-time primary sampling. River surface water samples were collected in plastic bottles for physicochemical and microbiological analyses with water quality parameters (pH, Color, Odour, DO, TDS, Chlorides, Turbidity, BOD, COD, Lead, Cadmium, Mercury, Fluoride, Iron, Copper, Zinc, Arsenic, Oil and Grease, Sulphate, Phosphate, Hexavalent Chromium, Total Ammoniacal Nitrogen, Cyanide, Nickel, Total Chromium, Nitrate Nitrogen, Total Coliform, Faecal Coliform). Upstream and downstream river points in each catchment were sampled (Bartram & Ballance, 1996). It facilitated a quick validation of the pollution situation in the study area as observed in secondary data and confirmed its deteriorating status. The water quality test results from the corroborative study were compared with standard accepted norms (CPCB, 1986).

The CETPs and STPs are considered major sources of urban river pollution (Kathuria & Turaga, 2014) as they discharge into rivers directly and practically most such plants do not run at full capacity. Operational performance of sewage/effluent treatment plants in the study area was found to be highly questionable (officially, only 79% capacity was operational in effluent treatment plants and 50% of sewage treatment plants in the study area were dysfunctional). Several water quality parameters were found to be not fulfilling the regulatory discharge standards (CPCB, 1986) in the outlet. BOD, COD, trace metal values were in non-compliance with regulatory effluent standards and highly fluctuating at all plants. TDS, chloride, and sulphate were observed to be higher at outlet compared to inlet. These observations hinted at poor operation and maintenance at the plants. Discussions with the plants’ staff revealed receipt of higher inflow as compared to inlet design, no primary treatment by member industries, and mixing of sewage at effluent treatment plants as some reasons for their less than ideal functioning (CPCB, 2005; Metcalf, 2003).

For overcoming the challenges faced by local river authorities as discussed above, what we recommend are the following: (i) conduct frequent post monsoon technical performance assessment of sewage/effluent treatment plants, particularly in tropical regions where monsoon rainfall is quite significant, for process optimization and upgradation of treatment technologies as deemed necessary, (ii) regular water quality analysis at inlet and outlet of sewage/effluent treatment plants for a considerable time period (> 1 year) to help in precise segregation of pollutant-concentrations discharged from different types of industries, (iii) delineation of urban industrial catchments and accurate grid mapping of their member industries at regular intervals to identify and prune cluster of industries that discharge without appropriate treatment; geospatial mapping responsibilities can be undertaken by industrial development agencies, (iv) install real-time non-contact type flow/discharge measuring devices, (v) dredging at river confluences and creek points both pre-monsoon and post-monsoon, (vi) sensitizing pollution defaulter industries about zero-discharge (Van der Bruggen & Braeken, 2006) and recycling (Ranade & Bhandari, 2014) principles, (vii) a central portal for secondary data from stakeholders accessible on request for researchers.

Despite the challenges discussed in this paper, inferences that could be drawn from secondary data and corroborative sampling analyses were that river water-quality in the study area is heavily deteriorated, and existing data is intermittent and hence insufficient to generate any constructive in-depth statistical insights. It was recognised that geospatial representation of secondary/primary data gives a better understanding of WQ status.

Recommendations to overcome stakeholder challenges

Quality of an analysis is highly dependent on quality and extent of the data. Grab samples can at most reveal instantaneous river water quality status but do not hold enough substance to contribute to decision-making process. Long-term decision-making demands uncovering of past latent trends and patterns in river water quality, which can be achieved through building an efficient water quality monitoring net-
S. Sekharan, D.R. Samal, H.C. Phuleria et al. Environmental Challenges 7 (2022) 100496
Fig. 5. Flowchart displaying recommendations derived post-investigation of the river stakeholder challenges
work (Dixon & Chiswell, 1996; A. K Mishra & Coulibaly, 2009.; Strobl is finalized statistically, a river water quality database can be generated
& Robillard, 2008). However, it is difficult and almost futile without after regular monitoring at the rationalised network using the proposed
availability of significant amount of consistent and robust water quality geospatial framework, and a surface water quality model can be devel-
data. Thus, acquiring relevant and reliable water quality data, though oped using the water quality database. Refer the flowchart in Fig. 5.
an admittedly tedious task, is the first and most important step before We recommend that the water quality model be used as a planning
applying the proposed geospatial framework. Identifying the potential tool for future by compulsorily running it prior to giving consent to new
polluting locations in the river on the basis of field visits, informational industries, expansion of industries/CETP or augmentation of water sup-
interviews with various river stakeholders (such as municipal bodies, in- ply etc. Prior to this, however, reliable and effective data generation
dustrial development agencies, pollution control agencies, CETPs/STPs) and management is the first and foremost feasibility criteria for effec-
and the river-surrounding public, virtual exploration, and geo-spatial tive multivariate statistical analyses and surface water quality modelling
catchment analyses of the river are the further requisite steps before ap- which in turn would facilitate strategic decision-making for rejuvenation
plying the proposed framework. It is recommended that short-term wa- of pollution-impaired river stretches.
ter quality monitoring be carried out at the identified strategic locations
diurnally for 3-4 months and subsequently analysed for all relevant wa- 3.1. Development of geospatial framework for WQ data
ter quality parameters. This database be then subjected to multivariate
statistical analysis to identify and rationalize homogenous location clus- The overall approach here was to visualize the pollution scenario
ters (Alameddine et al., 2013; Mavukkandy et al., 2014; Varekar et al., of an urban ecosystem in the study area. A geospatial framework was
2015, 2016, 2021), and arrive at a rationalized network i.e., reduced developed to integrate all spatial data pertaining to UW watershed in
and optimally-representative subset of monitoring locations and WQ a GIS platform in order to understand and visualize the impaired river
parameters which consistently captures significant variability in most stretches on near real-time basis. The framework consists of GIS-based
of the WQ parameters. Once the optimally rationalised set of locations watershed/sub-watershed delineation, mapping of primary/secondary
water sampling locations, preparing the LULC map, mapping ICs, and identifying different industry types within the ICs.
The framework starts with downloading and processing open-source SRTM DEM data (spatial resolution of 30 m). The DEM is capable of capturing the natural topography and drainage systems of the study area. The entire watershed in the study area is divided into several sub-watersheds in the vicinity of the impaired river stretch. Dividing the watershed into a finite number of sub-watersheds might provide vital information about the flow of pollutants as well as its impact on downstream areas/sub-watersheds. It also helps to identify the source and flow of effluents within a natural drainage system. The watershed approach to monitoring river pollution is the building block of the proposed geospatial framework, which can be implemented in urban industrial catchments in other parts of the world.
The primary water sampling locations were mapped with respect to a geographic reference system (WGS-84/Universal Transverse Mercator (UTM)) in a GIS environment. In addition, the relative locations of sampling points were validated against high-resolution Google Earth Pro imagery using an overlaying technique. Several water quality parameters, such as pH, TDS, BOD, COD, Color, Fluoride, Arsenic, etc., were attributed to the sampling locations. The concentrations of some representative WQ parameters have been presented using the proportional symbol method in Fig. 6. The visualization of WQ parameters reveals the spatial distribution of the effluents, which helps to identify their probable sources in the study area.
Fig. 6. Representation of spatial distribution of pollution levels for a. BOD, b. COD, c. TDS, and d. Fluoride (all in mg/L units) in the study area
The study area is highly urbanized and most of the urban settlements are located in close proximity to the study area. Therefore, the land use pattern in the study area was prepared and quantified from the Earth Observation (EO) satellite image using a supervised classification approach (Fig. 7). The spectral signatures of prominent land use classes, such as wastelands/open lands, built-up area, forested area, and water bodies, were taken for LULC map preparation. More than half of the study area is covered by wastelands/open lands (55.3%), followed by built-up area (23.8%), forest (16.2%), and water bodies (4.7%). The water bodies consist of natural streams, isolated ponds, and minor reservoirs. The built-up area is very dense due to high population density and its close proximity to Mumbai city. The LULC classification map showed the distribution of urban areas that discharge a significant part of the domestic/industrial effluents to the study area.
Further, the geospatial framework focuses on the highly polluting regions, such as the ICs present within the study area. Each IC, along with its member industries, was plotted with the help of cadastral information (Fig. 8). The position/shape of the cadastral plots was spatially adjusted/rectified with the help of a high-resolution image obtained from Google Earth. The type of industry, the amount of effluent discharge, and other properties were attributed in the GIS platform. The mapping of plot-level information would help to identify the type of industry and its effluent discharge to the nearby drainage system or CETPs in the study area. However, it was found that the number of industries listed in the secondary data had some discrepancies with respect to the field information. In addition, an effort has been made to demonstrate the concept of an industrial catchment in ICs that contribute to particular CETPs, for better monitoring and management of pollution in the region. The industrial catchment is very similar to the watershed concept: several industries discharge their effluents to a CETP, and this contributing area can be delineated at a cadastral level. The present effort is mainly to understand primary/secondary WQ data in urban industrial catchments statistically and geospatially on a near real-time basis in order to easily identify impaired river stretches.

tent and reliable river data, and other routine stakeholder concerns. In an effort to help manage their overwhelm, transparentize pollution monitoring processes, and by extension facilitate anthropic river pollution control, the use of the proposed geospatial framework is demonstrated for spatiotemporal pollution monitoring that is tech-friendly and can facilitate collaborative management of pollution issues in any urban river system. This paper discusses the benefits of multivariate statistical analysis of WQ data and of integrating GIS in pollution monitoring through the application of remote sensing technology, spatial analysis, and the role of IoT for near real-time information on the water quality status quo at impaired river stretches, particularly in urban industrial ecosystems. Moreover, the unbiased data acquisition capabilities of remote sensing technology and its integration with field information in a GIS platform will enable policymakers to visualize the ground situation and take timely decisions toward better planning and management of rivers over industrial catchments in urbanized ecosystems.
This is not a policy tool but a planning tool to facilitate GIS-based decision-making, particularly for the ground technical staff. The proposed framework is appropriate for decision support for pollution control agencies in study areas with data and resource constraints. Urban river stakeholders in other LMICs will be able to relate to the challenges recognized in the current study. Applying the proposed geospatial framework, which is capable of incorporating numerous WQ parameters and sampling locations, would facilitate collating river WQ data consistently and without bias. The proposed GIS tool may be coupled with optimization models to maintain and predict river WQ in future. With the availability of spatial data and advanced GIS tools in the open-source domain (e.g., QGIS), the proposed geospatial framework can be replicated for any urban river ecosystem at no software cost. It is a given that the condition of rivers in highly urbanized areas, particularly in LMICs, is deteriorating with increasing population density and disproportionate infrastructural expansion. This paper demonstrates the possibility and advantages of GIS-based near real-time WQ monitoring of rivers flowing through urbanized industrial catchments consisting of heavy built-up areas, which will also prevent illegal wastewater dumping/accidental spills with the use of sensor-based systems; however, the need of the hour is consistent and reliable river WQ data for robust visualization and analysis. Unless aligned efforts from river stakeholders and the use of geospatial technology are made part of the solution, urban rivers will keep struggling to stay alive. The proposed geospatial framework and the recommendations derived during execution of this study are universally applicable to any river and have been presented as an effort in this direction to facilitate effective pollution mitigation of urban rivers.
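The network rationalization step recommended in this paper (standardize the short-term monitoring data, then group monitoring locations that behave alike) can be illustrated with a minimal sketch. The paper does not specify the clustering algorithm, so single-linkage grouping at a fixed distance cut-off is used here purely as an assumed stand-in; the station values, the three parameters chosen, and the 0.5 threshold are all hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each parameter (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, (m, s) in zip(row, stats)]
        for row in rows
    ]


def single_linkage_clusters(points, threshold):
    """Group points connected by chains of Euclidean distances <= threshold."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical station means for (BOD, COD, TDS) in mg/L; not data from the study.
stations = ["S1", "S2", "S3", "S4", "S5"]
means = [
    [4.0, 20.0, 300.0],
    [5.0, 22.0, 310.0],
    [40.0, 180.0, 1200.0],
    [42.0, 190.0, 1250.0],
    [90.0, 400.0, 2500.0],
]

# Assumed cut-off of 0.5 in standardized units; a real study would justify this.
clusters = single_linkage_clusters(zscore_columns(means), threshold=0.5)

# Keep one representative station per homogeneous cluster.
rationalized = [stations[group[0]] for group in clusters]
print(clusters, rationalized)
```

In this toy data set, stations with near-identical standardized profiles fall into the same group, so only one of each pair would need to be retained in a reduced network. Standardization matters because TDS values are orders of magnitude larger than BOD values and would otherwise dominate the distance.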
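The proportional symbol method used to present WQ concentrations in Fig. 6 can be sketched in a few lines. The paper does not state how symbol sizes were scaled; the sketch below assumes the common cartographic convention that symbol area (not radius) is proportional to the value, and the station names, BOD values, and maximum radius are hypothetical.

```python
import math


def proportional_radius(value, max_value, max_radius=20.0):
    """Return a symbol radius such that the symbol AREA scales with value.

    Area is proportional to radius squared, so the radius scales with the
    square root of value / max_value.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical BOD concentrations (mg/L) at three sampling locations.
bod_mg_per_l = {"S1": 25.0, "S2": 100.0, "S3": 400.0}

max_bod = max(bod_mg_per_l.values())
radii = {
    name: proportional_radius(value, max_bod)
    for name, value in bod_mg_per_l.items()
}
print(radii)
```

With area-based scaling, a station with four times the concentration is drawn with twice the radius, which avoids visually exaggerating high values compared with scaling the radius linearly.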
