Deep learning for water quality

  • Wei Zhi 1,2,
  • Alison P. Appling 3,
  • Heather E. Golden 4,
  • Joel Podgorski 5 &
  • Li Li 2

Nature Water volume 2, pages 228–241 (2024)

  • Environmental sciences

Understanding and predicting the quality of inland waters are challenging, particularly in the context of intensifying climate extremes expected in the future. These challenges arise partly from the complex processes that regulate water quality, and partly from arduous and expensive data collection that exacerbates data scarcity. Traditional process-based and statistical models often fall short in predicting water quality. In this Review, we posit that deep learning represents an underutilized yet promising approach that can unravel intricate structures and relationships in high-dimensional data. We demonstrate that deep learning methods can help address data scarcity by filling temporal and spatial gaps, and can aid in formulating and testing hypotheses by identifying influential drivers of water quality. This Review highlights the strengths and limitations of deep learning methods relative to traditional approaches, and underscores their potential as an emerging and indispensable approach to overcoming challenges and discovering new knowledge in water-quality sciences.

Temperature outweighs light and flow as the predominant driver of dissolved oxygen in US rivers

Intercomparison of deep learning models in predicting streamflow patterns: insight from CMIP6

Machine learning approach towards explaining water quality dynamics in an urbanised river

Data availability.

Streamflow data (Fig. 1a) from the Global Streamflow Indices and Metadata Archive (GSIM) were compiled from repositories at and . Water-quality data (Fig. 1b) from the Global River Water Quality Archive (GRQA) were downloaded from .

W.Z. was supported by the National Natural Science Foundation of China (52121006) and by the Barry and Shirley Isett Professorship (to L.L.) at Penn State University. L.L. was supported by the US National Science Foundation via the Critical Zone Collaborative Network (EAR-2012123 and EAR-2012669), Frontier Research in Earth Sciences (EAR-2121621), Signals in Soils (EAR-2034214), and US Department of Energy Environmental System Science (DE-SC0020146). J.P. was supported by Swiss Agency for Development and Cooperation (SDC) (WABES project, 7F-09963.02.01). This paper has been reviewed in accordance with the US Environmental Protection Agency’s peer and administrative review policies and approved for publication. Mention of trade names or commercial products does not constitute endorsement or recommendation for use. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement or recommendation for use by the US Government. Statements in this publication reflect the authors’ professional views and opinions and should not be construed to represent any determination or policy of the US Environmental Protection Agency.

  • Mourade Azrour 1,
  • Jamal Mabrouki 2,
  • Ghizlane Fattah 3,
  • Azedine Guezzaz 4 &
  • Faissal Aziz 5

Water is an essential resource for human existence. In fact, more than 60% of the human body is made up of water. Our bodies use water in every cell, organ and tissue. Water thus helps stabilize body temperature and supports the normal functioning of other bodily activities. Nevertheless, in recent years, water pollution has become a serious problem affecting water quality. Designing a model that predicts water quality is therefore very important, both for controlling water pollution and for alerting users when poor quality is detected. Motivated by these reasons, in this study we take advantage of machine learning algorithms to develop a model capable of predicting the water quality index and then the water quality class. The proposed method is based on four water parameters: temperature, pH, turbidity and coliforms. Multiple regression algorithms proved effective in predicting the water quality index. In addition, the artificial neural network provided the most efficient way to classify water quality.
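As an illustration of the regression step described above, the following is a minimal sketch of a multiple linear regression that maps the four parameters (temperature, pH, turbidity, coliforms) to a water quality index. It is not the authors' implementation: the data are synthetic, the coefficients in `true_coef` are invented for the demonstration, and the helper names (`solve_linear_system`, `fit_multiple_regression`, `predict`) are hypothetical. The neural network classifier is not shown.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_multiple_regression(rows, targets):
    """Least-squares fit of target ~ b0 + sum(b_k * x_k) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def predict(coef, row):
    """Predicted index for one sample [temperature, pH, turbidity, coliforms]."""
    return coef[0] + sum(c * x for c, x in zip(coef[1:], row))

# Synthetic samples: ranges and coefficients are assumptions, not measured data.
rng = random.Random(0)
true_coef = [100.0, -0.5, -2.0, -0.6, -0.05]
rows, targets = [], []
for _ in range(200):
    row = [rng.uniform(5, 35),    # temperature
           rng.uniform(6, 9),     # pH
           rng.uniform(0, 50),    # turbidity
           rng.uniform(0, 500)]   # coliforms
    noise = rng.gauss(0, 0.5)
    rows.append(row)
    targets.append(predict(true_coef, row) + noise)

coef = fit_multiple_regression(rows, targets)
rmse = (sum((predict(coef, r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)) ** 0.5
print("fitted coefficients:", [round(c, 3) for c in coef])
print("training RMSE:", round(rmse, 3))
```

On real measurements the index values would come from a chosen WQI definition rather than from a known linear rule, and the fit should be evaluated on held-out samples rather than on the training data.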

Hybrid Machine Learning Algorithms for Effective Prediction of Water Quality

Optimizing Water Quality Parameters Using Machine Learning Algorithms

Classification and Analysis of Water Quality Using Machine Learning Algorithms

  • Artificial Intelligence

Ahmed U, Mumtaz R, Anwar H et al (2019) Efficient water quality prediction using supervised machine learning. Water 11(11):2210.

Aldhyani THH, Al-Yaari M, Alkahtani H, Maashi M (2020) Water quality prediction using artificial intelligence algorithms. Appl Bionics Biomech.

Asadollah SBHS, Sharafati A, Motta D, Yaseen ZM (2021) River water quality index prediction and uncertainty analysis: a comparative study of machine learning models. J Environ Chem Eng 9(1):104599.

Azrour M, Farhaoui Y, Ouanan M, Guezzaz A (2019) SPIT detection in telephony over IP using K-means algorithm. Proc Comput Sci 148:542–551.

Bekesiene S, Meidute-Kavaliauskiene I, Vasiliauskiene V (2021) Accurate prediction of concentration changes in ozone as an air pollutant by multiple linear regression and artificial neural networks. Mathematics 9(4):356.

Ciulla G, D’Amico A (2019) Building energy performance forecasting: a multiple linear regression approach. Appl Energy 253:113500.

Deng T, Chau K-W, Duan H-F (2021) Machine learning based marine water quality prediction for coastal hydro-environment management. J Environ Manag 284:112051.

Dezfooli D, Hosseini-Moghari S-M, Ebrahimi K, Araghinejad S (2018) Classification of water quality status based on minimum quality parameters: application of machine learning techniques. Model Earth Syst Environ.

El Bilali A, Taleb A (2020) Prediction of irrigation water quality parameters using machine learning models in a semi-arid environment. J Saudi Soc Agric Sci 19(7):439–451.

Ewaid SH (2017) Water quality evaluation of Al-Gharraf river by two water quality indices. Appl Water Sci 7(7):3759–3765

Griffiths O, Henderson H, Simpson M (2010) Environmental Health Practitioner Manual (Commonwealth of Australia). . Accessed 10 Aug 2021

Guezzaz A, Asimi Y, Azrour M, Asimi A (2021b) Mathematical validation of proposed machine learning classifier for heterogeneous traffic and anomaly detection. Big Data Min Anal 4(1):18–24.

Guezzaz A, Asimi A, Asimi Y, Azrour M, Benkirane S (2021) A distributed intrusion detection approach based on machine leaning techniques for a cloud security. In: Gherabi N, Kacprzyk J (eds) Intelligent systems in big data, semantic web and machine learning. Advances in intelligent systems and computing, vol 1344. Springer, Cham.

Guo Q, Zhuang T, Li Z, He S (2021) Prediction of reservoir saturation field in high water cut stage by bore-ground electromagnetic method based on machine learning. J Petrol Sci Eng 204:108678.

Haghiabi AH, Nasrolahi AH, Parsaie A (2018) Water quality prediction using machine learning methods. Water Qual Res J 53(1):3–13.

Harkins RD (1974) An objective water quality index. J Water Pollut Control Fed 46(3):588–591

Hasan MK, Alam MA, Das D, Hossain E, Hasan M (2020) Diabetes prediction using ensembling of different machine learning classifiers. IEEE Access 8:76516–76531.

Ighalo JO, Adeniyi AG, Marques G (2021) Artificial intelligence for surface water quality monitoring and assessment: a systematic literature analysis. Model Earth Syst Environ 7(2):669–681.

Imani M, Hasan MM, Bittencourt LF, McClymont K, Kapelan Z (2021) A novel machine learning application: water quality resilience prediction Model. Sci Total Environ 768:144459.

Kapadia D, Jariwala N (2021) Prediction of tropospheric ozone using artificial neural network (ANN) and feature selection techniques. Model Earth Syst Environ.

Kicsiny R (2014) Multiple linear regression based model for solar collectors. Sol Energy 110:496–506.

Kumar MJV, Samalla K (2019) Design and development of water quality monitoring system in IOT. Int J Recent Technol Eng 7(5):7

Li D, Liu S (2019) System and platform for water quality monitoring. Water Qual Monit Manag.

Lu H, Ma X (2020) Hybrid decision tree-based machine learning models for short-term water quality prediction. Chemosphere 249:126169.

Lumb A, Sharma TC, Bibeault J-F, Klawunn P (2011) A comparative study of USA and Canadian water quality index models. Water Qual Expo Health 3(3–4):203–216

Mabrouki J, Azrour M, Boubekraoui A, El Hajjaji S (2021a) Intelligent system for the protection of people. In: Intelligent systems in big data, semantic web and machine learning. Springer, pp 157–165

Mabrouki J, Azrour M, Dhiba D, Farhaoui Y, Hajjaji SE (2021b) IoT-based data logger for weather monitoring using arduino-based wireless sensor networks with remote graphical application and alerts. Big Data Min Anal 4(1):25–32.

Mabrouki J, Azrour M, Fattah G, Dhiba D, Hajjaji SE (2021c) Intelligent monitoring system for biogas detection based on the Internet of Things: Mohammedia, Morocco city landfill case. Big Data Min Anal 4(1):10–17

Mabrouki J, Fattah G, Al-Jadabi N, Abrouki Y, Dhiba D, Azrour M, Hajjaji SE (2021d) Study, simulation and modulation of solar thermal domestic hot water production systems. Model Earth Syst Environ.

Mabrouki J, Azrour M, El Hajjaji S (2021e) Use of internet of things for monitoring and evaluation water’s quality: comparative study. Int J Cloud Comput (in press)

Mabrouki J, Azrour M, Farhaoui Y, El Hajjaji S (2021f) Intelligent system for monitoring and detecting water quality. In: Farhaoui Y (ed) Big data and networks technologies, vol 81. Springer International Publishing, pp 172–182.

Miry AH, Aramice GA (2020) Water monitoring and analytic based thingspeak. Int J Electr Comput Eng (IJECE) 10(4):3588–3595.

Momenzadeh L, Zomorodian A, Mowla D (2011) Experimental and theoretical investigation of shelled corn drying in a microwave-assisted fluidized bed dryer using artificial neural network. Food Bioprod Process 89(1):15–21.

Nabavi-Pelesaraei A, Rafiee S, Hosseini-Fashami F, Chau K (2021) Artificial neural networks and adaptive neuro-fuzzy inference system in energy modeling of agricultural products. In: Predictive modelling for energy management and power systems engineering. Elsevier, pp 299–334.

Naga C, Talnan Jean Honoré C, Delfin OA, Bernard YO, Guillaume ZS, Henoc Sosthène A, Mpakama Z, Issiaka S (2018) Spatio-temporal analysis and water quality indices (WQI): case of the Ébrié Lagoon, Abidjan, Côte d’Ivoire. Hydrology 5(3):32.

Pasika S, Gandla ST (2020) Smart water quality monitoring system with cost-effective using IoT. Heliyon 6(7):e04096.

Rath S, Tripathy A, Tripathy AR (2020) Prediction of new active cases of coronavirus disease (COVID-19) pandemic using multiple linear regression model. Diabetes Metab Syndr 14(5):1467–1474.

Singha S, Pasupuleti S, Singha SS, Singh R, Kumar S (2021) Prediction of groundwater quality using efficient machine learning technique. Chemosphere 276:130265.

Sossi Alaoui S, Aksasse B, Farhaoui Y (2020) Data mining and machine learning approaches and technologies for diagnosing diabetes in women. In: Farhaoui Y (ed) Big data and networks technologies. Springer International Publishing, pp 59–72.

The California Water System (2021) . Accessed 3 June 2021

Tunc Dede O, Telci IT, Aral MM (2013) The use of water quality index models for the evaluation of surface water quality: a case study for Kirmir Basin, Ankara, Turkey. Water Qual Expo Health 5(1):41–56.

Westall F, Brack A (2018) The importance of water for life. Space Sci Rev 214(2):50.

Zotou I, Tsihrintzis VA, Gikas GD (2020) Water quality evaluation of a lacustrine water body in the Mediterranean based on different water quality index (WQI) methodologies. J Environ Sci Health Part A 55(5):537–548

A review of the artificial neural network models for water quality prediction.

1. Introduction

  • First, we identified ANN-related papers in influential water- and environment-related journals to ensure that high-quality papers were included in the review. These papers are mainly from journals whose subjects are environmental science and ecology, water resources, and engineering and application.
  • Second, a keyword search of the ISI Web of Science was conducted for the period 2008–2019 using the keywords water quality, river, lake, reservoir, WWTP, groundwater, pond, prediction, and forecasting, accompanied by the names of one or more ANN methods, such as neural network, MLP, RBFNN, GRNN, and RNN, to name but a few.
  • Third, through steps 1 and 2, 151 English-language articles relevant to our focus were selected. The basic information of the papers, including authors (year), locations, water quality variables, meteorological factors, other factors, output strategy, data size, time step, data dividing, methods, and prediction lengths, is provided in Appendix A.
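The keyword search in the second step can be pictured as a boolean query. The sketch below is only an illustration: the source does not say how the keywords were combined, so the grouping into water-body terms, task terms, and method terms joined by AND is an assumption, the query syntax is generic rather than the Web of Science syntax, and `build_query` and `in_review_period` are hypothetical helpers.

```python
def build_query(*term_groups):
    """Join terms within a group with OR, and the groups with AND."""
    def group(terms):
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    return " AND ".join(group(g) for g in term_groups)

def in_review_period(year, start=2008, end=2019):
    """True if a publication year falls inside the search period stated in the text."""
    return start <= year <= end

# Keyword lists taken from the text; their grouping is an assumption.
water_terms = ["water quality", "river", "lake", "reservoir", "WWTP", "groundwater", "pond"]
task_terms = ["prediction", "forecasting"]
method_terms = ["neural network", "MLP", "RBFNN", "GRNN", "RNN"]

query = build_query(water_terms, task_terms, method_terms)
print(query)
print([y for y in (2005, 2008, 2015, 2019, 2021) if in_review_period(y)])
```

Any real search would need the field tags and operators of the database actually used, and the returned records would still require the manual relevance screening described in the third step.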

3. Three Basic Model Structures in Water Quality Prediction

3.1. Feedforward Architectures
3.2. Recurrent Architectures
3.3. Hybrid Architectures
3.4. Emerging Methods
4. Artificial Neural Networks Models for Water Quality Prediction
5.1. Data Collection
5.2. Output Strategy
5.3. Input Selection
5.4. Data Dividing
5.5. Data Preprocessing
5.6. Model Structure Determination
5.7. Model Training
6. Discussion
6.1. Data Are the Foundation
6.2. Data Processing Is Key
6.3. Model Is the Core
Author Contributions
Acknowledgments
Conflicts of Interest

Rahmi Fadhilah

Institut Teknologi Sepuluh Nopember

Heri Kuswanto

affiliation not provided to SSRN

Dedy Dwi Prastyo

Handling imbalanced datasets is a significant challenge in water quality assessment. This study evaluates the performance of three machine learning models, Naïve Bayes (NB), Extreme Gradient Boosting (XGBoost), and Random Forest (RF), on imbalanced water quality datasets. Various sampling strategies, including Random Undersampling (RUS), Rapidly Converging Gibbs Sampler (RACOG), and a combined RACOG-RUS approach, were employed to enhance model performance. The analysis shows notable variation in model accuracy and F1 scores depending on the sampling method and whether feature selection was applied. XGBoost with RACOG achieved the highest performance without feature selection (accuracy: 0.958), while Naïve Bayes with RUS performed exceptionally well (accuracy: 0.986; F1 score: 0.979). With feature selection, XGBoost with RACOG outperformed other models, reaching an F1 score of 0.982 and an accuracy of 0.606. These findings highlight the importance of advanced sampling techniques and feature selection in enhancing machine learning models for water quality classification. The method used: apply advanced sampling techniques to handle imbalanced datasets effectively; evaluate model performance with and without feature selection to identify the best approach; enhance classification accuracy for better water quality assessment.
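To make the sampling and scoring ideas concrete, here is a minimal sketch of random undersampling (RUS) together with accuracy and F1 computed by hand. It uses an invented 95/5 label split and is not the study's pipeline; RACOG and the three classifiers are not shown, and the function names are hypothetical.

```python
import random
from collections import Counter

def random_undersample(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class matches the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        out_samples.extend(rng.sample(items, target))  # sampling without replacement
        out_labels.extend([label] * target)
    return out_samples, out_labels

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented imbalanced labels: 95 "good" samples and 5 "poor" samples.
labels = ["good"] * 95 + ["poor"] * 5
samples = list(range(len(labels)))

# A classifier that always predicts the majority class looks accurate but never finds "poor".
always_good = ["good"] * len(labels)
majority_accuracy = accuracy(labels, always_good)
minority_f1 = f1_score(labels, always_good, positive="poor")

balanced_samples, balanced_labels = random_undersample(samples, labels)
balanced_counts = Counter(balanced_labels)
print(majority_accuracy, minority_f1, dict(balanced_counts))
```

The gap between accuracy and minority-class F1 in this toy case is one reason to report both metrics on imbalanced data. Undersampling discards majority-class samples, which is the trade-off that synthetic oversamplers such as RACOG aim to avoid.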

Keywords: Water Quality Classification, Imbalanced Data Handling, Random Forest, Naïve Bayes, XGBoost, Random Under Sampling (RUS), Rapidly Converging Gibbs Sampler (RACOG), RACOG-RUS

  1. Water quality prediction and classification based on principal

    Estimating water quality has been one of the significant challenges faced by the world in recent decades. This paper presents a water quality prediction model utilizing the principal component regression technique. Firstly, the water quality index (WQI) is calculated using the weighted arithmetic index method.

  2. A Comprehensive Review of Machine Learning for Water Quality Prediction

    Water quality prediction, a well-established field with broad implications across various sectors, is thoroughly examined in this comprehensive review. Through an exhaustive analysis of over 170 studies conducted in the last five years, we focus on the application of machine learning for predicting water quality. The review begins by presenting the latest methodologies for acquiring water ...

  3. (PDF) Water Quality Prediction Using Machine Learning Classification

    International Journal of Scientific & Engineering Research, Volume 8, Issue 9, October-2022. ISSN 2229-5518. Water Quality Prediction Using Machine Learning Classification Algorithm. Michael ...

  4. Reliable water quality prediction and parametric analysis using

    Chen, K. et al. Comparative analysis of surface water quality prediction performance and identification of key water parameters using different machine learning models based on big data. Water Res ...

  5. Real-time water quality prediction in water distribution networks using

    1. Introduction. Ensuring the safety and quality of drinking water is a critical concern for water infrastructure management (Assembly, 2015; Tortajada, 2020). Water quality monitoring plays a vital role in achieving this objective by facilitating the detection and mitigation of potential risks, thereby ensuring the delivery of clean and safe water to consumers (Li et al., 2022; Mondejar et al ...

  6. Deep learning for water quality

    Here we (1) describe the challenges in water-quality sciences that DL can help to resolve, (2) review opportunities for DL in water quality prediction, particularly in addressing data scarcity and ...

  7. Predicting Water Quality with Artificial Intelligence: A Review of

    The articles reviewed in this review study were selected to cover experiments focused specifically on water quality prediction. We found 83 research articles as shown in Table 1 and Fig. 1. Most of these articles were published in the last 5 years as shown in Fig. 2. Additionally, we selected to review these articles because they used various input parameters to predict the water quality as ...

  8. Water quality prediction using machine learning models based on grid

    Water quality is vital for humans, animals, plants, industries, and the environment. In recent decades, the quality of water has been affected by contamination and pollution. In this paper, the challenge is to predict the Water Quality Index (WQI) and Water Quality Classification (WQC), where WQI is a vital indicator of water validity. In this study, parameters optimization and ...

  9. Research progress in water quality prediction based on deep learning

    Water, an invaluable and non-renewable resource, plays an indispensable role in human survival and societal development. Accurate forecasting of water quality involves early identification of future pollutant concentrations and water quality indices, enabling evidence-based decision-making and targeted environmental interventions. The emergence of advanced computational technologies ...

  10. PDF A Comprehensive Review of Machine Learning for Water Quality Prediction

    The presented research underscores the transformative impact of machine learning on water quality prediction in coastal areas. Our review on the limitations of current models, the need for diverse datasets, and the consideration of evolving environmental conditions points to avenues for future research. 5. Conclusions.

  11. A review of the application of machine learning in water quality

    The application of machine learning in surface water quality research has become a hotspot [16, 17]. A series of surface water quality prediction and analysis methods have been developed (Table 1). Many efforts have been devoted to optimizing machine learning models and improving their prediction accuracy.

  12. Advancing Water Quality Prediction: the Role of Machine Learning in

    decision-making in environmental policy, resource management, and urban planning. In this context, the application of machine learning techniques offers a promising avenue to enhance the pre ...

  13. Water Quality Prediction Using Machine Learning Techniques

    Water is the most crucial resource of life and it is necessary for the survival of all living creatures including human beings. The survival of business and agriculture depends on freshwater. An essential step in managing freshwater assets is the evaluation of the quality of the water. Before using water for anything, including drinking, chemical spraying (pesticides, etc.), or animal ...

  14. Water Quality Prediction Based on Multi-Task Learning

    Water pollution seriously endangers people's lives and restricts the sustainable development of the economy. Water quality prediction is essential for early warning and prevention of water pollution. However, the nonlinear characteristics of water quality data make them challenging to predict accurately with traditional methods. Recently, methods based on deep learning can better deal with ...

  15. (PDF) Water Quality Prediction Based on Machine Learning and

    Additionally, these models can incorporate other environmental factors and meteorological data, thereby improving the accuracy and reliability of water quality prediction. However, machine ...

  16. Summary of Water Quality Prediction Models Based on Machine Learning

    Abstract: Water quality prediction is a research hotspot in the field of ecological environment, which is of great significance to the prevention of water pollution and the construction of automatic water quality monitoring network. The accuracy of prediction model results will affect the scientificity and correctness of applied engineering projects, as well as the accuracy of water pollution ...

  17. Machine learning methods for better water quality prediction

    During these processes, two scenarios were introduced: Scenario 1 and Scenario 2. Scenario 1 constructs a prediction model for water quality parameters at every station, while Scenario 2 develops a prediction model on the basis of the value of the same parameter at the previous station (upstream). Both the scenarios are based on the value of ...

  18. Machine learning algorithms for efficient water quality prediction

    In addition, reliable predictions of water quality are also the best evidence that can help policy makers to make good decisions before disaster strikes (Lu and Ma 2020). In this research paper, our goal is to suggest a new model for predicting water quality based on machine learning algorithms and with minimal parameters. In addition, the ...

  19. A Review of the Artificial Neural Network Models for Water Quality

    Water quality prediction plays an important role in environmental monitoring, ecosystem sustainability, and aquaculture. Traditional prediction methods cannot capture the nonlinear and non-stationarity of water quality well. In recent years, the rapid development of artificial neural networks (ANNs) has made them a hotspot in water quality prediction. We have conducted extensive investigation ...

  20. Data-Driven Water Quality Analysis and Prediction: A Survey

    This paper reviews the published research results relating to water quality evaluation and prediction. Moreover, the paper classifies and compares the applied big data analytics approaches and big data based prediction models for water quality assessment. Furthermore, the paper also discusses the future research needs and challenges.

  21. On the Search of the Optimum Method for Water Quality Prediction Using

    These findings highlight the importance of advanced sampling techniques and feature selection in enhancing machine learning models for water quality classification. The method used: apply advanced sampling techniques to handle imbalanced datasets effectively; evaluate model performance with and without feature selection to identify the best ...

  22. Analysis and prediction of water quality using deep learning and auto

    Taking a step further, this paper explores Automated Deep Learning, which is a new research domain. This paper straddles perfectly built DL models and automated DL models. With the same baseline data, the authors intend to explore the potential of automated processing, as well as the shortcomings in it. ... The water quality prediction using ...

  23. Predictive Models for River Water Quality using Machine Learning and

    The increase in pollution influences the quantity and quality of water, which results in high risks to health and other issues for humans as well as for other living organisms on the planet. Hence, evaluating and monitoring the quality of water, and predicting it, have become a crucial and applicable area of research in the current scenario.

  24. Water quality prediction using machine learning methods

    Water Quality Research Journal 1 February 2018; 53 (1): 3-13. doi: ... The aim of this study is the prediction of water quality components using artificial intelligence (AI) techniques including MLP, SVM, and group method of data handling (GMDH). ... In this part of the paper, the results of prediction of the internal relations between the ...

  25. Aquaponic Farming Water Quality Prediction

    This review paper explores the latest developments in IoT-based automated water monitoring systems, focusing on their role in predicting and managing water quality in aquaponic systems.

  26. Quantitative prediction of water quality in Dongjiang Lake watershed

    1. Introduction. Surface water is a non-renewable resource that is important in the daily life of human beings (Chen et al., 2020). Predicting the trend of water quality in a watershed is necessary for ensuring that water quality remains within manageable limits (Kut et al., 2019; Peng et al., 2020). The task of predicting water quality has become more complex because water quality is affected by ...