Evaluation of different curve fitting models for prediction of municipal solid waste composition

Abstract In this study, a methodology was applied to predict municipal solid waste (MSW) composition in Eskişehir, Turkey. For this purpose, MSW samples were collected and manually separated into food waste, paper-cardboard, plastics, glass, and metals. Two-dimensional (2D) curve fitting functions were adopted for forecasting the MSW composition. To assess the performance of the proposed system, the Root Mean Square Error (RMSE) and Sum of Squared Error (SSE) were chosen as statistical residual evaluation metrics. According to the results, the polynomial curve fitting model is the most suitable. The effect of socio-economic structure on waste composition was also observed using different nonlinear models. With the contribution of this study, it would be possible to forecast the MSW composition of other cities that have similar factors.


Introduction
Municipal solid waste (MSW) is an important concern because its constituents can directly harm human health and pollute the environment. It is widely known that MSW can lead to the proliferation of insects, the transfer of harmful substances into plants and foodstuffs, and the spread of epidemic diseases. In particular, glass and plastics take many years to decompose and cause permanent soil pollution. To reduce the problems caused by MSW, municipalities need a sophisticated solution that facilitates the reduction, reuse, recycling, and disposal stages. Therefore, the general trend behind a sufficient and accurately projected sample can be used to make decisions about the size and quantity of municipal solid waste.
Recent work on MSW monitoring deals with methodologies and techniques for determining the location of organic recycling and processing plants. Two major families of models, time series models and factor models, can be used to predict the volume and future scale of MSW generation [1]. The key idea of time series models is that data obtained over a period can be modeled based on the trend or autocorrelation between consecutive data points. In this respect, time series analysis can extract meaningful information by considering the characteristics of the available data. In this study, we review only the techniques that have been used to forecast the MSW composition of different locations from projected data. Typical time series methods are curve fitting, exponential smoothing, and autocorrelation-based (Auto-Regressive Integrated Moving Average (ARIMA)) models. The ARIMA model was first proposed by Box and Jenkins [2]. Previously, the limitations of the S-curve fitting methodology were investigated for long-term forecasting of MSW composition in the UK [3]. Mwenda et al. [4] used ARMA/ARIMA and exponential smoothing models to predict the quantity of solid waste generation in Arusha City, Tanzania. Sodanil and Chatthong [5] applied time-series-based Artificial Neural Network (ANN) models to predict solid waste generation for the city of Bangkok, Thailand. Additionally, two forecasting techniques, the seasonal Auto-Regressive and Moving Average (sARIMA) model [2] and a discrete dynamical system for the analysis of non-linear systems, were used to predict MSW generation rates based on the MSW time series of three cities in Spain and Greece [6].
Another technique uses socio-economic factors (age, educational level, and income level) as the basis for forecasting the MSW production rate. The aim of factor-based methods is to unveil the relationship between these factors and the composition of the generated waste. In one work, a multivariate regression model was developed to reveal the relationship between demographic parameters and the amount of collected waste paper [7]. For this purpose, it was hypothesized that the waste paper potential and other factors influence the amount of collected paper and the density of collection sites. Ying et al. [8] modeled the projected data by combining the coefficients of a Gray forecast model and a multiple linear regression model with respect to the Minimal Sum of Square Error (MSSE). The rate of municipal solid waste from 2010 to 2014 was then estimated based on statistical data from Beijing satellite towns for 1992-2005. Grey fuzzy dynamic modeling was carried out for the prediction of MSW generation in a case study of the city of Beijing, China [9]. The authors emphasized that the new forecasting technique should be preferred over the conventional grey dynamic model, the least squares regression approach, and the fuzzy goal regression model in terms of accuracy. Similarly, Dai et al. [10] used a two-stage support vector regression optimization model (TSOM) for monitoring MSW management in the urban districts of Beijing. In a different study, an optimized multivariate grey model was applied to forecast the MSW collected in Thailand, with prediction intervals over a long-term period [11]. The potential impact of coupling the support vector regression (SVR) model with interval-parameter mixed-integer linear programming (IMILP) was investigated to optimize MWM planning.
To evaluate the accuracy of the system, three performance criteria, namely prediction accuracy (PA), fitting accuracy (FA), and overall accuracy (OA), were determined using four kernel functions in TSOM: the linear kernel, polynomial kernel, radial basis function, and multilayer perceptron kernel. The SVR model with the best kernel function was then chosen to forecast the waste generation rate in Beijing.
The composition of municipal solid waste is a result of regional and cultural aspects as well as social behavior, and it is strongly influenced by economic factors. Changes in MSW composition may strongly influence the quality of the waste, which affects emissions from landfills, the quality of incineration residues, and other parameters of waste management systems [12]. Moreover, Denafas et al. [12] analyzed the effects of seasonal variation in MSW composition by applying time series forecasting models, including nonparametric seasonal exponential smoothing, Winters additive, and Winters multiplicative methods, to collected data. Furthermore, the potential of ANNs to provide convincing accuracy in MSW monitoring has been examined. Specifically, Zade and Noori [13] addressed the ability of ANNs to forecast MSW generation in Mashhad. Kumar et al. [14] estimated the amount of MSW in Eluru, India, with an ANN model. ANN models have also been used for MSW prediction in Gujarat, India, by Patel and Meka [15].
In fact, predicting MSW generation is similar to estimating the severity of a possible future earthquake or next week's weather conditions, since in each case a model is constructed from data collected over a past period. Usually, an accurate mathematical modeling technique can yield satisfactory results. However, each method has favorable and unfavorable aspects that must be taken into account under different conditions. For instance, when the dimension of the training data increases in ANN-based modeling, problems arise in determining the network architecture, avoiding local minima, and selecting parameters [11], [16]. In our previous study, 1D curve functions were used to make predictions of MSW generation [17]. Similarly, one must make a trade-off between accuracy and computation time when time-series-based approaches are used. Moreover, no single method is accepted as producing the best accuracy in forecasting MSW generation for the different data types of different municipalities.
Eskişehir, the city studied in this paper, has a land area of 13,925 km² and is located in the northwest of the Central Anatolia region of Turkey. The city is also known as a university town. According to the 2014 census report, the population of the city is around 810,000 and the population density is about 58/km². There are two municipalities in Eskişehir, Odunpazari and Tepebasi. Along with the growth of the student population, consumption has also increased. Statistical records indicate that the average MSW production rate is 750 tons/day. At certain times of the day, two private companies collect the MSW in plastic bags. The collected MSW is transferred to a landfill.
In this study, a feasibility study was carried out to extract meaningful information from historical statistical data on MSW generation for the two municipalities of Eskişehir, Turkey: Odunpazari and Tepebasi. To this end, experiments were conducted on a sample of projected MSW generation to investigate the limitations of three two-dimensional (2D) curve fitting methods: Power, Exponential, and Polynomial. These curve types were chosen because they are faster than others in terms of computation time and model construction from large-scale data. According to the literature on predicting the MSW generation rate, one-dimensional (1D) curve fitting strategies have been applied to forecasting MSW generation [18], [19], but 2D methods have not yet been attempted for this purpose. Technically, a 1D curve fitting model has only one independent variable, whereas the 2D case has two independent variables. Moreover, no previous work has proposed a prediction of MSW generation in Eskişehir. The objectives of this study are (i) to analyze the relationship of population and income level with the food waste, paper-cardboard, plastics, glass, metals, and others composition types and (ii) to investigate the ability to make predictions for different municipalities and countries that share the same socio-economic characteristics.

Material and method
The motivation behind curve fitting is to construct a mathematical function that captures the trend in the data and the relationship between the factors (independent variables) and the outcomes (dependent variables). For this purpose, the parameters of the fitted function are adjusted to yield the 'best fit' of the model. To determine the characteristics of the data and obtain the best-fitted model, linear or nonlinear curve fitting functions can be used. Curve fitting is used either for interpolation or for smoothing. In this study, we have extended the curve fitting strategy for interpolation purposes in order to predict the quantity of MSW generation.
In the power-based strategy, the two-variable model given in Equation (1) is used, with coefficients a0 through a6 to be determined.
In the case of exponential-based curve fitting, the model denoted by Equation (2) is fitted using the factors and dependent variables. In that equation, a0, a1 and a2 are the coefficients to be determined, whereas µx and µy denote the means of the variables along the x and y axes.
If the aim is to model the data with a second-order polynomial function (Equation 3), where e is the expected residual value, Equation (4) can be used [20]. The derivative of both sides of the equation is then taken, as in Equation (5), and the resulting expressions can be rearranged (Equation 6). By extending the concept of Equation (3), one can easily develop an m-th-order polynomial curve fit, as shown in Equation (7).
In the present study, the concept of 2D curve fitting functions is adopted for forecasting MSW generation. In other words, the relation between the dependent variables and the two factors (income and population) is investigated using each of the 2D curve fitting functions. For accurate prediction, the 2D curve fitting methodology is more suitable than the 1D one when minimum error and stability are the principal constraints. This can be attributed to the fact that the 2D methodology analyzes the rising and declining behavior of the data from two aspects, by generating a surface based on the two factors. The curve fitting methods used are power, exponential, and polynomial.
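The least-squares steps referenced in Equations (3) to (7) can be sketched as follows. This is the standard textbook derivation for a second-order polynomial and is an assumption about the form used in [20], not a reproduction of the paper's equations; N (the number of data points) and the index i are notation introduced here for illustration.

```latex
\begin{align}
% Second-order polynomial model with residual e
y &= a_0 + a_1 x + a_2 x^2 + e \\
% Sum of squared residuals over the N data points
S_r &= \sum_{i=1}^{N} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2 \\
% Partial derivatives with respect to each coefficient, set to zero
\frac{\partial S_r}{\partial a_0} &= -2 \sum_{i=1}^{N} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0 \\
\frac{\partial S_r}{\partial a_1} &= -2 \sum_{i=1}^{N} x_i \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0 \\
\frac{\partial S_r}{\partial a_2} &= -2 \sum_{i=1}^{N} x_i^2 \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0 \\
% Rearranged normal equations (linear in a_0, a_1, a_2)
N a_0 + \Big(\sum x_i\Big) a_1 + \Big(\sum x_i^2\Big) a_2 &= \sum y_i \\
\Big(\sum x_i\Big) a_0 + \Big(\sum x_i^2\Big) a_1 + \Big(\sum x_i^3\Big) a_2 &= \sum x_i y_i \\
\Big(\sum x_i^2\Big) a_0 + \Big(\sum x_i^3\Big) a_1 + \Big(\sum x_i^4\Big) a_2 &= \sum x_i^2 y_i \\
% Extension to an m-th-order polynomial
y &= a_0 + a_1 x + a_2 x^2 + \dots + a_m x^m + e
\end{align}
```

Because the normal equations are linear in the coefficients, the 1D polynomial case has a closed-form solution, which is why differentiation alone suffices there.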
Assume that two factors, x and y, are given to build a model on the processed data. In this study, population (x) and income level (y) are considered as the two factors that influence the variation in the dependent variables, which are food waste, paper-cardboard, plastics, glass, metals, and others. These two variables reflect the effect of socio-economic factors on MSW prediction. For the polynomial case, the model is constructed with Equation (8).
Here, n and m are the degrees of the x and y factors, respectively. In this study, both n and m are equal to 3.
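As a minimal sketch of a 2D polynomial surface fit, the following Python code assumes the full tensor-product form, in which the surface is a sum of terms x^i y^j for i = 0..n and j = 0..m; the paper's Equation (8) may restrict the cross terms. The function names (design_row, solve_linear_system, fit_surface, predict) are hypothetical, and an unconstrained linear least-squares solve via the normal equations is used here in place of the constrained MATLAB routine described in the paper.

```python
from itertools import product


def design_row(x, y, n=3, m=3):
    """Return the monomials x**i * y**j for i = 0..n and j = 0..m."""
    return [x**i * y**j for i, j in product(range(n + 1), range(m + 1))]


def solve_linear_system(a, b):
    """Solve a * coeffs = b by Gauss-Jordan elimination with partial pivoting."""
    size = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        # Swap in the row with the largest pivot to limit round-off error.
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


def fit_surface(xs, ys, zs, n=3, m=3):
    """Least-squares coefficients of a polynomial surface (normal equations)."""
    rows = [design_row(x, y, n, m) for x, y in zip(xs, ys)]
    k = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    atz = [sum(r[p] * z for r, z in zip(rows, zs)) for p in range(k)]
    return solve_linear_system(ata, atz)


def predict(coeffs, x, y, n=3, m=3):
    """Evaluate the fitted surface at the point (x, y)."""
    return sum(c * t for c, t in zip(coeffs, design_row(x, y, n, m)))
```

With n = m = 3 the model has 16 coefficients, so at least 16 well-spread samples are needed. Raw population and income values differ by orders of magnitude, so rescaling both inputs (for example to the range 0 to 1) before fitting is advisable to keep the normal equations well conditioned.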
As mentioned above, for 1D curve fitting functions the parameters of the best-fitted model can be derived easily by differentiation. For the 2D curve fitting concept, however, the optimization problem is solved with a nonlinear optimization technique; specifically, constrained nonlinear optimization is employed. The constrained nonlinear optimization for 2D functions has the form indicated in Equation (9), where x and y represent the population and income level, and the index i varies from 1 to 6. Here, zi denotes the percentage for each dependent variable: food waste, paper-cardboard, plastics, glass, metals, and others. Nonlinear optimization is used to extract the coefficients of the 2D fitted functions under the residual minimization criterion. Since the waste fractions are given as percentages, the sum over all waste components should equal 100. To solve the constrained nonlinear optimization, the lsqcurvefit function of the MATLAB Optimization Toolbox is used. This function solves nonlinear curve-fitting (data-fitting) problems in the least-squares sense.
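One plausible reading of the constrained problem described above can be written as follows. This is a sketch under stated assumptions rather than the paper's Equation (9): the symbols f_i (the fitted 2D function for waste component i), theta_i (its coefficient vector), N (the number of samples) and the sample index k are all introduced here for illustration.

```latex
\begin{align}
\min_{\theta_1, \dots, \theta_6} \quad
  & \sum_{i=1}^{6} \sum_{k=1}^{N}
    \left( z_{i,k} - f_i(x_k, y_k; \theta_i) \right)^2 \\
\text{subject to} \quad
  & \sum_{i=1}^{6} f_i(x_k, y_k; \theta_i) = 100,
    \qquad k = 1, \dots, N
\end{align}
```

The objective is the residual minimization criterion, and the equality constraint expresses the requirement that the six predicted percentages sum to 100 for every sample.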
To assess the performance of the proposed system, the Root Mean Square Error (RMSE) and Sum of Squared Error (SSE) are chosen as statistical residual evaluation metrics [21]. The RMSE is the square root of the mean square error and measures the difference between the estimated and expected values. The general formula used to compute the RMSE is shown in Equation (10).
The SSE is computed with the formula shown in Equation (11).
In this study, the accuracy of the system is calculated based on the SSE and RMSE values. The real data are fed into the constructed model and the predictions for the compositions are obtained. The SSE and RMSE residuals are then computed from the predicted and observed results. The smaller the SSE and RMSE values, the more accurate the prediction of unknown values.
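The two metrics can be sketched in Python as follows, assuming the common definitions in which the SSE is the sum of squared differences between observed and predicted values and the RMSE is the square root of the SSE divided by the number of samples. Some tools divide by the residual degrees of freedom instead, so the paper's Equations (10) and (11) may differ in that detail; the function names are illustrative.

```python
import math


def sse(observed, predicted):
    """Sum of squared errors between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def rmse(observed, predicted):
    """Root mean square error, dividing the SSE by the number of samples."""
    return math.sqrt(sse(observed, predicted) / len(observed))
```

Both functions take two equal-length sequences, for example the observed and predicted percentages of one waste component across districts.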
In this study, 293 MSW samples were collected from different regions in different seasons. The amount of each component was determined gravimetrically after manual sorting [22], [23]. The population data were obtained from the Turkish Statistical Institute. The income level values were assigned according to the land market values in the sub municipalities' databases.
To investigate the relationship between socio-economic factors and MSW composition, different curve types and loss metrics are used for prediction. Given a limited sample for each factor and composition, the variation of food waste, paper-cardboard, plastics, glass, and metals with income and population is examined using the constructed mathematical models. The models are determined by the coefficients of the fitted curves obtained with the power, exponential, and polynomial curve fitting methods. Treating the task as an optimization problem, the minimum RMSE and SSE values are taken to indicate the highest accuracy in forecasting MSW generation. For the sake of accurate prediction, only the 2D curve fitting concept is used in the experimental stage. Conceptually, the residual between the predicted 2D surface and the observed 2D surface is used to assess whether MSW generation is forecast with plausible accuracy.

Results and Discussion
To assess the performance of the system, a simple leave-one-out cross-validation methodology is applied in Experiments 1 to 3. Leave-one-out is the limiting case of k-fold cross-validation in which k equals the number of samples, so that each fold holds out a single sample. Hence, among the n districts of the aforementioned municipalities, n-1 samples are used for model construction and the remaining one for prediction, until all districts have been held out once. Moreover, the MSW composition of Tepebasi (sub municipality A) is predicted with the model obtained from the Odunpazari (sub municipality B) data, and vice versa. Once all cross-validation stages are completed, an overall residual error is obtained for each MSW composition. Experiments 1 and 2 are carried out to cross-validate the performance of the proposed system in predicting the MSW composition of one municipality from the data of the other. Since the real values of MSW generation for both municipalities are available, the performance of the system can be computed easily in terms of SSE and RMSE.
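The leave-one-out procedure described above can be sketched as follows. The helper names (leave_one_out_sse, fit_mean, predict_mean) are hypothetical, and the mean-value model is only a stand-in so that the example runs on its own; in the study's setting the fit and predict callables would wrap one of the 2D curve fitting models.

```python
def leave_one_out_sse(samples, fit, predict):
    """Hold out each sample once, fit on the rest, and sum the squared errors.

    samples: list of (inputs, target) pairs, e.g. ((population, income), percent)
    fit:     callable taking a list of samples and returning a model
    predict: callable taking (model, inputs) and returning a prediction
    """
    total = 0.0
    for i, (inputs, target) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]  # all samples except the i-th
        model = fit(training)
        total += (target - predict(model, inputs)) ** 2
    return total


# Hypothetical stand-in model: predict the mean target of the training samples.
def fit_mean(training):
    return sum(target for _, target in training) / len(training)


def predict_mean(model, inputs):
    return model
```

Passing the model as a pair of callables keeps the validation loop independent of the curve type, so the same loop can compare the power, exponential and polynomial fits.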

Experiment 1: Prediction of MSW generation for sub municipality B with sub municipality A data
In this experiment, the aim was to forecast the amount of MSW generation for sub municipality B using the mathematical models constructed from the projected samples of sub municipality A. To this end, 2D Power, Exponential, and Polynomial curve fitting functions are used to build three different models for the sub municipality B data. Using these models and the error metrics (SSE and RMSE), predictions were made and the system was evaluated in terms of overall accuracy.
The obtained error rates and formulations for MSW generation in sub municipality B are given in Table 1. Inspecting the overall results, it can be observed that the performance of the Power model is more favorable than that of the others. The proposed system yields an RMSE of 2.03 as the overall error rate. This supports the claim that the proposed models can be used to predict MSW generation accurately. The small SSE and RMSE values imply that the proposed system is useful for predicting MSW generation for any municipality in Eskişehir.

Experiment 2: Prediction of MSW generation for sub municipality A with sub municipality B data
To evaluate the ability of the proposed prediction system, the waste composition of sub municipality A is estimated using the models computed from the sub municipality B data. As before, the 2D Power, Exponential, and Polynomial curve types are used to build models that estimate MSW generation in sub municipality A. The performance of the system is analyzed based on the SSE and RMSE error metrics. The residual values and formulations corresponding to sub municipality A are demonstrated in

Experiment 3: Performance evaluation on all data
In this experiment, the MSW quantities of sub municipality A and sub municipality B are combined to make more accurate predictions of MSW generation in Eskişehir. For this purpose, the predictions of the food waste, paper-cardboard, plastics, glass, metals, and others fractions of the MSW composition of each district are evaluated in terms of SSE and RMSE. The accuracy evaluation scheme used is known in the literature as leave-one-out: a large proportion of the data is used to construct the model, whereas the remaining portion is used for testing (prediction). Finally, the SSE and RMSE values are obtained.
According to the results in Table 3 obtained from leave-one-out system, it can be emphasized that curve type should be utilized in the prediction of food wastes,

Comparison with similar studies
Multiple linear regression (MLR) has many advantages for finding an accurate model that reveals the relation between the generated MSW rate and socio-economic factors. To put the limitations of some similar studies in context, we have extended the performance evaluation with a comparison stage. Since the existing methods are task-oriented and use different datasets, a direct comparison of performances would be unfair.
ANNs have clear advantages in finding a model that captures the characteristics of the data. In one study [24], an RMSE of 467 was obtained in predicting municipal solid waste generation. Double exponential smoothing models [4] were used to find the best time series model for forecasting the amount of solid waste generation for the following year; the obtained RMSE was 348.60. In a similar study [12], the seasonal variation of municipal solid waste generation and composition for Kaunas, Lithuania, was predicted with an RMSE of 0.371. Recently, an algorithm with a concept similar to our approach was proposed by Kannangara et al. [25]. In that study, the prediction of regional municipal solid waste generation was investigated by considering socio-economic variables including population, median personal income, and employment rate. After the ANN parameters were set, the model gave satisfactory prediction accuracy, with an MSE of 28.996 when tested with income and population as the influencing factors. Moreover, a multiple linear regression model [26] was built on social and demographic explanatory variables such as daily per capita income and population. In that study, the multiple linear regression model achieved an RMSE of 0.19. The main idea in all cases is to represent the characteristics of the data with a robust model. To analyze the merits and limitations of some curve fitting models for MSW prediction, we have also carried out various experiments. These experiments show that the power model gives better results than the other models. Overall, the obtained RMSE values imply that curve-fitting-based models give accurate predictions and reveal the relationship between MSW generation and independent factors such as population and income.

Conclusion
In this paper, a new perspective is presented on the use of curve fitting functions for the prediction of MSW composition. Based on the obtained error rates, we offer some suggestions for planning a solid waste management system. Considering Experiments 1 to 3, it is useful to combine the population and income factors to predict the amount of MSW generation in Eskişehir, instead of using a single factor. In particular, predicting the metal fraction gives the best accuracy across all experiments. While predictions for the paper-cardboard, glass, and plastic fractions give good accuracy in terms of the SSE and RMSE metrics, predicting the food waste and others fractions remains a major challenge according to the same metrics. Possible reasons include high variation within the variables and a weak connection between the variables and the outcomes. Given the small RMSE and SSE values, curve fitting models can be used for waste composition estimation. When all the results are evaluated together, the polynomial curve fitting model is seen to be the most suitable. With the contribution of this study, it would be possible to forecast the MSW composition of other cities that have similar factors (income level and population) in common. Therefore, the curve-fitting-based, data-driven methodology can be used to generate predictions for other municipalities. To enhance its prediction capability, a proposed model must be retrained to give useful forecasts in simulations. For unseen data, the model has to generalize in order to capture the trend behind the new variables.

Author contribution statements
Within the scope of this study, Aysun ÖZKAN contributed to the formation of the idea, the design, and the literature review; Kemal ÖZKAN to the assessment of the obtained results; Şahin IŞIK to the formation of the models; and Müfide BANAR to the editing of the article in terms of content.

Ethics committee approval and conflict of interest statement
There is no need to obtain permission from the ethics committee for the article prepared. There is no conflict of interest with any person/institution in the article prepared.