Technical Papers
May 8, 2017

Correcting Systematic Underprediction of Biochemical Oxygen Demand in Support Vector Regression

This article has been corrected.
Publication: Journal of Environmental Engineering
Volume 143, Issue 9

Abstract

Biochemical oxygen demand (BOD) is missing or inaccurate in many water quality data sets because highly polluted water samples are difficult to dilute. Machine learning algorithms, particularly support vector regression (SVR), are useful for building regression models that fill gaps in these data sets. However, SVR can underpredict extreme-high values when they are few in number and therefore underrepresented in the training data. This paper evaluates two methods, bootstrapping and data expansion, that mitigate the problem by increasing the proportion of extreme-high BOD values in the data set before training the gap-filling model. Both methods were tested on water quality data from Yuen Long Creek, Hong Kong, for the years 2000–2014. Both were effective in mitigating systematic underprediction and reducing residual errors when the proportion of extreme-high values in the data set was increased from 3% to 30–40%. Both are useful for gap filling in BOD time series because extreme-high values are often the ones that are missing or inaccurate when highly polluted samples are diluted.
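The idea of raising the share of extreme-high values before training can be illustrated with a minimal sketch. It shows only a generic bootstrap-style oversampling step (drawing extreme-high records with replacement until they reach a target fraction) and is not the paper's actual procedure, which the abstract does not detail; the data expansion method is not covered here. The function name `oversample_extremes`, the `"bod"` field, the threshold of 50, and the synthetic records are all assumptions made for illustration.

```python
import math
import random


def oversample_extremes(records, threshold, target_fraction, seed=0):
    """Return a training set in which records with bod >= threshold make up
    at least `target_fraction` of the rows, by resampling them with replacement.

    Hypothetical helper: the threshold and record layout are assumptions.
    """
    if not 0 < target_fraction < 1:
        raise ValueError("target_fraction must be in (0, 1)")

    extremes = [r for r in records if r["bod"] >= threshold]
    normals = [r for r in records if r["bod"] < threshold]
    if not extremes:
        raise ValueError("no records at or above the threshold")

    # Solve (n_ext + k) / (n_total + k) >= target_fraction for the
    # number of extra draws k.
    n_ext, n_total = len(extremes), len(records)
    k = math.ceil((target_fraction * n_total - n_ext) / (1 - target_fraction))
    k = max(k, 0)

    # Bootstrap step: draw k extreme-high records with replacement.
    rng = random.Random(seed)
    extra = [rng.choice(extremes) for _ in range(k)]
    return normals + extremes + extra


# Synthetic example: 3% extreme-high records, raised to about 30%.
records = [{"bod": 5.0} for _ in range(97)] + [{"bod": 80.0} for _ in range(3)]
balanced = oversample_extremes(records, threshold=50.0, target_fraction=0.30)
share = sum(r["bod"] >= 50.0 for r in balanced) / len(balanced)
print(len(balanced), round(share, 3))
```

The resampled set would then be passed to whatever regression model is being trained. The original, unresampled records should still be used for evaluation, since duplicated rows in a validation split would overstate accuracy.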



Information & Authors

Information

Published In

Journal of Environmental Engineering
Volume 143, Issue 9, September 2017

History

Received: Dec 7, 2016
Accepted: Feb 7, 2017
Published online: May 8, 2017
Published in print: Sep 1, 2017
Discussion open until: Oct 8, 2017


Authors

Affiliations

Graduate Student, Graduate School of Science, Univ. of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan (corresponding author). ORCID: https://orcid.org/0000-0003-4930-9829.
Alan D. Ziegler
Professor, Dept. of Geography, National Univ. of Singapore, 1 Arts Link, Kent Ridge, Singapore 117570.
