Waste generation prediction under uncertainty in smart cities through deep neuroevolution


ABSTRACT

The unsustainable development of countries has created a problem due to the unstoppable waste generation. Moreover, waste collection is carried out following a pre-defined route that does not take into account the actual filling level of the containers collected. Therefore, optimizing the way the waste is collected presents an interesting opportunity. In this study, we tackle the problem of predicting the waste generation ratio in real-world conditions, i.e., under uncertainty. Particularly, we use a deep neuroevolutionary technique to automatically design a recurrent network that captures the filling level of all waste containers in a city at once, and we study the suitability of our proposal when faced with noisy and faulty data. We validate our proposal using a real-world case study, consisting of more than two hundred waste containers located in a city in Spain, and we compare our results to the state-of-the-art. The results show that our approach outperforms all its competitors and that its accuracy in a real-world scenario, i.e., under uncertain data, is good enough for optimizing the waste collection planning.

Keywords:

Deep neuroevolution, deep learning, evolutionary algorithms, smart cities, waste collection


RESUMEN

El desarrollo insostenible de los países ha creado un problema debido a la imparable generación de residuos. Más aún, la recogida de residuos se realiza siguiendo una ruta predefinida que no tiene en cuenta el nivel real de los contenedores recogidos. Por lo tanto, optimizar la forma en que se recolectan los residuos presenta una oportunidad interesante. En este estudio, abordamos el problema de predecir la tasa de generación de residuos en condiciones reales, es decir, bajo incertidumbre. En particular, utilizamos una técnica neuroevolutiva profunda para diseñar automáticamente una red recurrente que encapsula el nivel de llenado de todos los contenedores de residuos en una ciudad a la vez, y estudiamos la idoneidad de nuestra propuesta cuando nos enfrentamos a datos ruidosos y defectuosos. Validamos nuestra propuesta utilizando un caso real, que consta de más de doscientos contenedores de residuos ubicados en una ciudad de España, y comparamos nuestros resultados con el estado del arte. Los resultados muestran que nuestra propuesta supera a todos sus competidores y que su precisión en un escenario del mundo real, es decir, bajo datos inciertos, es lo suficientemente buena para optimizar la planificación de la recolección de residuos.

Palabras clave:

Neuroevolución profunda, aprendizaje profundo, algoritmos evolutivos, ciudad inteligente, gestión de residuos


1. Introduction

The world’s population is moving from rural to urban areas, and this trend is expected to continue: by 2050, about 75% of the world’s population will live in cities [1].

Fast demographic growth, together with the concentration of the population in cities and the increasing amount of daily waste, pushes Nature's capacity to assimilate waste to its limit. This fact has forced the authorities to examine the cost-effectiveness and environmental impact of our economic system.

The linear structure of our economy has reached its limits and the natural resources of our planet are drained. Thus, a more sustainable economic model is needed. One example is the circular economy [2, 3], which consists in transforming our waste into raw materials and proposes a new paradigm for a more sustainable future.

The unsustainable development of countries has created a problem due to the unstoppable waste generation. In addition, there are hardly any technological means to manage the waste collection process optimally. Nowadays, solid waste collection is carried out without a previous analysis of the demand, i.e., following a manually defined route. This approach has severe limitations; one of the most important is the variability in the amount of waste that needs to be picked up. This is especially critical in the case of selective collection (plastic, paper, glass, etc.), where the waste volume is smaller than in the organic case. Thus, when dealing with recyclable waste, the planning of optimal collection routes is even more influential.

An alternative way to tackle the planning of the collection routes is to determine which containers should be collected. Note that the recyclable waste collection process represents 70% of the operational cost in waste treatment [4]. Thus, reducing the number of unnecessary visits to semi-empty containers saves money. We aim to provide a way to predict whether a container should be collected or not. Particularly, we propose to predict the filling level of the waste containers (all the containers involved in the operation at once) using a Recurrent Neural Network (RNN).

RNNs are among the best-performing techniques for time series prediction; however, as with all Deep Learning (DL) techniques, selecting an appropriate network design is a tough task [5]. The use of automatic intelligent tools seems mandatory when addressing the design of RNNs, since the vast number of RNN architectures that can be generated defines a huge search space. In this sense, metaheuristics [6] have emerged as efficient stochastic techniques able to address hard-to-solve optimization problems. Indeed, these algorithms are currently employed in a multitude of real-world problems, e.g., in the domain of Smart Cities [7-13], showing successful performance. Nevertheless, the use of such a methodology in the domain of DL is still limited [14].

On the other hand, real-world problems present many challenges, from technological issues to political restrictions. In our particular case, an interesting issue arises when predicting the filling level: uncertainty. Following previous works [15], we distinguish two types of uncertainty: reliability against noise, and robustness. The latter measures uncertainty caused by imprecision in the decision variables of the solution, which is not relevant for our problem because solutions can always be implemented precisely. The former measures uncertainty that may come from many different sources, such as sensor or human measurement errors, as is the case for the historical data of the filling level of the waste containers. Therefore, in this study, we examine whether an RNN can provide reliable solutions when provided with noisy or faulty data.

In this article, we extend the ideas presented by [16]. Particularly, using a hyper-parameter optimization technique based on evolutionary computation, we design and train an RNN that predicts the filling level of the containers of a whole city. Then, we study the behavior of this approach when faced with prediction under uncertainty. To validate our proposal, we analyze a real-world case consisting of more than two hundred waste containers located in a city in Spain, and we compare our results against the ones presented by [17].

As a summary, in this study:

  • We define a deep neuroevolutionary technique to automatically design an efficient RNN.

  • We use our proposal to design and train an RNN that predicts the filling level of the waste containers of a real city and benchmark our results against the state-of-the-art.

  • We study the RNN approach for predicting under uncertainty.

Therefore, the main contribution of this work in contrast to the previous work [16] is the analysis of the behavior of deep neuroevolution and RNN when faced with uncertain data.

The remainder of this paper is organized as follows. The next section briefly reviews the state-of-the-art of smart waste management. Section 3 discusses the use of DL to predict the waste generation rate. Section 4 presents a deep neuroevolutionary approach to design an artificial neural network-based predictor of the filling level of the waste containers. Section 5 presents the experiments carried out, results, benchmark, and analyses. Finally, Section 6 outlines our conclusions and proposes future work.

2. Smart waste management

Waste collection is a process with countless variants and constraints, and its importance has led to a multitude of studies in recent years. The works in the literature can be classified, among other ways, according to the type of waste treated: residential waste, commonly known as garbage [10, 18]; industrial waste, where customers are more dispersed and the amount of waste is higher [19]; recyclable waste [20], increasingly important for our society, where the collection frequency is lower than for organic waste; and hazardous waste, where the probability of damage is minimized [21].

In municipal solid waste collection [22], the authorities need global studies that quantify the waste generated in a period of time in order to manage it. Particularly, waste generation forecasting for the inhabitants of Xiamen city (China) was studied by [23]. The main difference with our approach is the granularity of the object under study. They predict the amount of waste produced by the whole city; in contrast, we predict every single container in a city (i.e., a disaggregated prediction of the whole city). This considerably increases the complexity of the problem, because it is necessary to consider multiple aspects such as the location, the customs of the citizens, the population density of the area, etc. In the same research line, the impact of the intervention of local authorities on waste collection has also been studied [24], which is relevant in the medium to long term.

Regarding the location where the collection takes place, there exist multiple variants of the problem. There are communal collections, where the local authority identifies a place shared by the community [11, 25], in most cases a local waste facility for recycling. At the other extreme, we find kerbside collection [26], where household waste is collected from individual small containers located near each house. The intermediate case studied here is the analysis of containers that serve several streets and blocks of flats [27].

In previous works [17, 28], the authors used machine learning techniques to predict the filling level of a container. Particularly, they used Linear Regression, Gaussian Processes, and Support Vector Machines for regression to predict each container individually. In this work, we present a single RNN able to generate predictions for the whole set of containers, instead of creating and training an individual predictor for each container.

3. Deep learning for waste generation prediction

In this study, we focus on waste generation prediction by applying DL based on a specific type of artificial neural network (ANN): the RNN. Like other ANNs, this type of network is composed of multiple hidden layers between the input and output layers. RNNs incorporate feedforward and feedback connections between layers to capture long-term dependencies in the input. Thus, RNNs have been successfully applied to learning applications that involve sequential modeling and prediction, such as natural language, image, and speech recognition and modeling [29]. In turn, they have been applied to Smart City problems that require time-dependent prediction [14].

We apply supervised learning, which consists in an iterative process that requires a training data set (N input-output pairs). As this study deals with the prediction of filling levels, the inputs are the current filling levels of each container and the outputs are the next (future) filling levels. Thus, for each input, the ANN produces an output (i.e., a tentative future filling level), which is compared to the expected output by using an error (cost or distance) function. Then, a procedure is applied to reduce this error by updating the network until a given stopping criterion is reached [30].

Minimizing such a learning error is a tough task. Backpropagation [31] (BP), a first-order gradient descent algorithm, is the most widely used method to address this issue. In order to apply BP to an RNN, the network has to be unfolded [32], i.e., the network is copied and connected in series a finite number of times (known as the look back) to build an unrolled version of the RNN.
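To make the role of the look back concrete, the following minimal sketch builds the input-output pairs used in supervised training from a multivariate series. The function name `make_pairs` and the toy values are hypothetical and only illustrate the windowing idea; they are not taken from the authors' implementation.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]  # filling levels of all containers on one day


def make_pairs(series: Sequence[Row], look_back: int) -> Tuple[List[List[Row]], List[Row]]:
    """Build (input window, next-day target) pairs from a multivariate series."""
    if look_back < 1 or look_back >= len(series):
        raise ValueError("look_back must be in [1, len(series) - 1]")
    inputs, targets = [], []
    for t in range(look_back, len(series)):
        inputs.append(list(series[t - look_back:t]))  # the look_back previous days
        targets.append(series[t])                     # the day to be predicted
    return inputs, targets


# Toy series: 5 days, 2 containers (values are made up for illustration).
days = [[10, 0], [20, 5], [30, 10], [0, 15], [10, 20]]
X, Y = make_pairs(days, look_back=2)
print(len(X), X[0], Y[0])
```

With a look back of 2, each input holds two consecutive days and the target is the day that follows them, so a series of five days yields three training pairs.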

Large ANNs (such as unfolded RNNs) suffer from overfitting to the training data set, i.e., the error on the training set is driven to a very small value, but when new, unseen data is presented to the network the error increases dramatically [33]. In order to address this issue, a technique called dropout, which consists in introducing a stochastic procedure into the training process, is applied [34].
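As an illustration of the stochastic procedure behind dropout, the sketch below zeroes each activation with a given probability and rescales the surviving ones (the so-called inverted variant). This is an assumption chosen for illustration; the text does not state which dropout variant the networks use, and the function `dropout` is hypothetical.

```python
import random
from typing import List


def dropout(activations: List[float], rate: float, rng: random.Random) -> List[float]:
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    # Surviving units are divided by `keep` so the expected activation is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
out = dropout([1.0] * 8, rate=0.5, rng=rng)
print(out)  # every entry is either 0.0 or 2.0
```

Dropout is applied only during training; at prediction time all units are kept.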

The accuracy and the generalization capability of the RNN prediction depend on a set of configuration hyper-parameters: the number of layers, the number of hidden units per layer, the activation function, the kernel size of a layer, etc. Thus, a promising research line in DL proposes to find specific hyper-parameter configurations for an ANN to improve its numerical accuracy [35, 36]. The results demonstrated that selecting the most suitable hyper-parameters for a given dataset provides more competitive results than using generalized networks.

Since training an RNN is costly (in terms of computational resources) and the number of RNN architectures is infinite (or extremely large if we impose restrictions on the number of hidden layers or neurons), we are forced to define a smart search strategy to find an optimal RNN.

Among the many potential optimization techniques to find an efficient ANN hyper-parameterization, a few authors have already applied metaheuristics [37, 38]. However, these solutions cannot be directly applied to deep neural networks (DNN), i.e., ANNs with one or more hidden layers, due to the high computational complexity of DNNs. Recently, new solutions specifically defined to address the hyper-parameter optimization of DNNs by using metaheuristics have been emerging: the deep neuroevolutionary approaches [5, 14, 39-41], which show competitive results in finding parameters that improve the accuracy and minimize the generalization error.

In this study, we focus on applying a deep neuroevolution approach to address the generation of container filling predictions. Our optimization method deals with the following main RNN parameters: the look back (i.e., how many times the net is unfolded during training), the number of hidden layers, and the number of neurons of each hidden layer.

4. Deep neuroevolutionary architecture optimization

In this section, we present the details of our proposal. First, we formally state the architecture optimization problem, and then we outline our deep neuroevolutionary approach to solve the problem.

4.1 Architecture optimization

Optimizing an ANN consists in finding an appropriate network structure (architecture) and a set of weights to solve a given problem [30]. Particularly, we can analyze the suitability of an ANN by measuring its generalization capability, i.e. the ability to predict/classify new (unseen) data.

In our case, we are interested in optimizing the architecture of an RNN. Therefore, we decided to train an RNN using BP (i.e. we are finding an appropriate set of weights given a network structure) and measure the mean absolute error (MAE) of the predicted values against the observed ones.

Equation (1) states the problem of finding an optimal architecture as a minimization problem, where N corresponds to the number of samples in the testing data set (X, Y), z_i stands for the predicted value of the i-th sample, and y_i corresponds to the ground truth of the i-th sample. Note that the RNN is fed with already predicted data x, and that the architecture is constrained by B, H, and L.
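Since the equation itself is not reproduced in this text, the objective can be written out from the definitions above as the usual mean absolute error. The following is a reconstruction under that assumption and not necessarily the authors' exact notation; the constraints imposed by B, H, and L are left out because their precise form is not given here.

```latex
\min \; \mathrm{MAE} \;=\; \frac{1}{N} \sum_{i=1}^{N} \left| z_i - y_i \right|
```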

4.2 Deep neuroevolution

To solve the problem stated in Equation (1) we designed a deep neuroevolutionary algorithm based on the (1 + 1) Evolutionary Strategy (ES) [6] and on the Adam weights optimizer [42]. Our proposal is presented in Algorithm 1.

A solution represents an RNN architecture and it is encoded as an integer vector of variable length, solution = <s_0, s_1, …, s_H>. The first element, s_0 ∈ [1, max_look_back], corresponds to the look back, while the following elements (s_j, j ∈ [1, H]) correspond to the number of Long Short-Term Memory (LSTM) cells of the j-th hidden layer, subject to s_j ∈ [1, max_neurons_per_layer] and H ∈ [1, max_hidden_layers]. Note that the number of hidden layers is defined by the length of the vector. The number of neurons of the output layer is defined according to the input time series, i.e., we add a dense (fully connected) layer with a number of neurons equal to the number of dimensions of the output.
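The encoding described above can be illustrated with a small decoding step that turns the integer vector into an architecture description. The `Architecture` class and the `decode` function are hypothetical names introduced for this sketch; building and training the actual network (e.g., with a DL framework) is deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Architecture:
    look_back: int          # s_0: number of time steps the net is unfolded
    lstm_cells: List[int]   # s_1..s_H: LSTM cells of each hidden layer
    output_neurons: int     # size of the final dense layer


def decode(solution: List[int], n_outputs: int) -> Architecture:
    """Interpret an integer vector <s_0, s_1, ..., s_H> as an RNN architecture."""
    if len(solution) < 2:
        raise ValueError("need the look back plus at least one hidden layer")
    return Architecture(solution[0], list(solution[1:]), n_outputs)


# Illustrative vector: look back of 7 and two hidden layers; one output per container.
arch = decode([7, 120, 60], n_outputs=217)
print(arch)
```

Because the number of hidden layers is implied by the vector length, changing the depth of the network only requires inserting or removing an element.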

First, the Initialize function creates a new random solution. Then, the Evaluate function computes the Fitness of the solution. Specifically, the solution is decoded (into an RNN), then the net is trained using the Adam optimizer [42] for evaluation_epochs epochs using the training data set and finally the fitness value is computed using the testing data set.

Then, while the number of evaluations is less than or equal to max_evaluations, the evolutionary process takes place. Starting from a solution, the Mutate function generates a new mutated solution, which is later evaluated. The Mutate function consists in a two-step process applied to the input solution. In the first step, with a probability equal to mut_element_p, the j-th element of the solution is perturbed by adding a value drawn uniformly from the range [-max_step, max_step]. In the second step, with a probability equal to mut_length_p, the length of the solution is modified by copying or removing (with equal probability) an element of the solution. Before returning the new solution, a validation process is performed to ensure that the mutated solution is valid (i.e., that its values comply with the restrictions).
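The two-step mutation can be sketched as follows. Several details are assumptions made for this illustration and are not specified in the text: the validation step is implemented as clamping to the bounds, the look back element is never copied or removed, and a removal is skipped when only one hidden layer remains. The bounds used in the usage line are made up.

```python
import random
from typing import List, Tuple


def mutate(solution: List[int], mut_element_p: float, mut_length_p: float,
           max_step: int, bounds: Tuple[int, int, int],
           rng: random.Random) -> List[int]:
    """Return a mutated, valid copy of `solution` (= [look_back, cells_1, ...])."""
    max_look_back, max_neurons, max_hidden_layers = bounds
    child = list(solution)
    # Step 1: perturb each element with probability mut_element_p.
    for j in range(len(child)):
        if rng.random() < mut_element_p:
            child[j] += rng.randint(-max_step, max_step)
    # Step 2: with probability mut_length_p, copy or remove one hidden layer.
    if rng.random() < mut_length_p:
        j = rng.randrange(1, len(child))  # assumption: index 0 (look back) is kept
        if rng.random() < 0.5:
            child.insert(j, child[j])     # duplicate a layer
        elif len(child) > 2:
            del child[j]                  # remove a layer, keeping at least one
    # Validation (assumed to be a repair by clamping to the allowed ranges).
    child = child[:max_hidden_layers + 1]
    child[0] = min(max(child[0], 1), max_look_back)
    for j in range(1, len(child)):
        child[j] = min(max(child[j], 1), max_neurons)
    return child


rng = random.Random(1)
print(mutate([5, 20, 40], 0.5, 0.5, max_step=10, bounds=(10, 100, 4), rng=rng))
```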

Next, the fitness of the original solution and that of the mutated one are compared. If the fitness of the mutated solution is less than or equal to that of the original solution, the mutated solution replaces the original one.

As the last part of the evolutionary process, a SelfAdapting step is performed to improve the performance of the evolutionary process [43]. Particularly, if the fitness of the mutated solution improves on the original one, then the mut_element_p and mut_length_p values are multiplied by 1.5; otherwise, these probabilities are divided by 4 [43]. In other words, if we are not improving, we narrow the local search space. On the contrary, while the solutions are improving (in terms of the fitness), we widen the local search space.
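The self-adapting rule amounts to a few lines of arithmetic. In the sketch below, capping the probabilities at 1.0 is an added assumption (the text does not say how values above 1 are handled), and `self_adapt` is a hypothetical name.

```python
from typing import Tuple


def self_adapt(mut_element_p: float, mut_length_p: float,
               improved: bool) -> Tuple[float, float]:
    """Widen the search after an improvement (x1.5), narrow it otherwise (/4)."""
    factor = 1.5 if improved else 1.0 / 4.0
    # Assumption: probabilities are capped at 1.0 after scaling.
    return (min(1.0, mut_element_p * factor), min(1.0, mut_length_p * factor))


print(self_adapt(0.2, 0.4, improved=True))
print(self_adapt(0.4, 0.8, improved=False))
```

Note the asymmetry: a single failure shrinks the probabilities far more than a single success enlarges them, so the search becomes conservative quickly when no progress is made.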

Finally, the evolved solution is evaluated (using final_epochs to feed the number of epochs of the training process) and returned.
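Putting the steps of this subsection together, the overall loop can be sketched as below. The operators are passed in as functions so that the skeleton stays independent of how a network is trained; the toy stand-ins at the bottom are hypothetical and train nothing, they only let the loop run on integer vectors.

```python
import random
from typing import Callable, List, Tuple

Solution = List[int]
Probs = Tuple[float, float]  # (mut_element_p, mut_length_p)


def evolve(initialize: Callable[[], Solution],
           evaluate: Callable[[Solution, int], float],
           mutate: Callable[[Solution, Probs], Solution],
           self_adapt: Callable[[Probs, bool], Probs],
           probs: Probs, max_evaluations: int,
           evaluation_epochs: int, final_epochs: int) -> Tuple[Solution, float]:
    """(1+1)-style loop: keep one solution, replace it if the mutant is not worse."""
    solution = initialize()
    fitness = evaluate(solution, evaluation_epochs)
    evaluations = 1
    while evaluations <= max_evaluations:
        mutated = mutate(solution, probs)
        mutated_fitness = evaluate(mutated, evaluation_epochs)
        evaluations += 1
        improved = mutated_fitness < fitness
        if mutated_fitness <= fitness:
            solution, fitness = mutated, mutated_fitness
        probs = self_adapt(probs, improved)
    # Final evaluation with a (typically larger) number of training epochs.
    return solution, evaluate(solution, final_epochs)


# Toy stand-ins (hypothetical): the "fitness" is a distance to a fixed vector.
rng = random.Random(0)
TARGET = [7, 50]


def toy_evaluate(s: Solution, epochs: int) -> float:
    return float(sum(abs(a - b) for a, b in zip(s, TARGET)))  # epochs unused


def toy_mutate(s: Solution, probs: Probs) -> Solution:
    return [v + rng.randint(-3, 3) if rng.random() < probs[0] else v for v in s]


def toy_adapt(probs: Probs, improved: bool) -> Probs:
    factor = 1.5 if improved else 0.25
    return (min(1.0, probs[0] * factor), min(1.0, probs[1] * factor))


start = [1, 10]
best, best_fitness = evolve(lambda: list(start), toy_evaluate, toy_mutate,
                            toy_adapt, (0.5, 0.5), 50, 1, 1)
print(best, best_fitness)
```

Because a mutant is accepted only when it is not worse, the fitness of the retained solution never increases during the search.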

5. Experimental study

We implemented our proposal in Python 3, using the DL optimization library dlopt [44], and the DL frameworks keras [45] and tensorflow [46]. Then, we (i) selected a data set to test our proposal, (ii) optimized an RNN to tackle the referred problem, (iii) compared our predictions against the state-of-the-art of urban waste containers filling level prediction, and (iv) studied the suitability of the solutions found to predict under uncertainty.

5.1 Data set: filling level of containers

The data set analyzed in this article is the one used in [17, 28], a real case study of an Andalusian city (Spain), where we highlight the benefits of our approach, being effective and realistic at the same time. Our case study considers 217 paper containers from the metropolitan area of a city. An instance of recyclable waste (paper) is more attractive than organic waste collection for showing the quality of our approach, because most paper containers do not need to be collected every day like the organic waste, so they have a high variability in collection frequency.

In order to study the reliability of our approach under uncertainty, we propose a synthetic benchmark of instances derived from the original data set. We selected a percentage p of random days on which the filling data of all containers have errors, which may come from (a) sensor errors or (b) the loss of the data. From these two sources of errors, we generate two types of instances. To represent the former source of errors (a), we replace the data with random values between 0 and 100; we call this type random. For the latter (b), we use zeros to represent the loss of data, so we call this type zeros. Combining the percentage of days with errors (p = 5, 10, 20) and the type of errors (zeros or random), we generate 6 synthetic instances.
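The construction of the synthetic instances can be sketched as follows. The function `corrupt` is hypothetical; rounding the number of affected days and drawing the random values from a uniform distribution are assumptions, since the text only states that the values lie between 0 and 100.

```python
import random
from typing import List


def corrupt(data: List[List[float]], p: float, kind: str,
            rng: random.Random) -> List[List[float]]:
    """Return a copy of `data` (days x containers) with p% of the days corrupted."""
    if kind not in ("zeros", "random"):
        raise ValueError("kind must be 'zeros' or 'random'")
    n_bad = round(len(data) * p / 100)  # assumption: nearest whole number of days
    bad_days = set(rng.sample(range(len(data)), n_bad))
    out = []
    for day, row in enumerate(data):
        if day not in bad_days:
            out.append(list(row))                              # untouched day
        elif kind == "zeros":
            out.append([0.0] * len(row))                       # lost data
        else:
            out.append([rng.uniform(0, 100) for _ in row])     # faulty sensors
    return out


# Toy data: 20 days, 3 containers, constant level (made up for illustration).
clean = [[50.0, 50.0, 50.0] for _ in range(20)]
noisy = corrupt(clean, p=10, kind="zeros", rng=random.Random(0))
print(sum(row == [0.0, 0.0, 0.0] for row in noisy))  # number of corrupted days
```

Calling the function for p in (5, 10, 20) and both kinds yields the six instance types described above.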

5.2 RNN optimization

We executed our deep neuroevolutionary algorithm 30 independent times, considering the combinatorial search space defined in Table 1, using the data set described above, the parameters defined in Table 2, and a fixed dropout equal to 0.5. We used 80% of the data to train the networks and the remaining data to test their performance (i.e., to compute the fitness).

Table 1. RNN optimization search space

The initial setup of the algorithm is taken from the related literature [14]. Considering that our proposal performs a self-adapting step, we did not perform a tuning of the parameters of the algorithm.

Table 3 summarizes the results obtained. The MAE, the mean squared error (MSE), the total number of LSTM cells, the look back, and the number of recurrent layers correspond to the statistics computed over the final solutions (30 trained RNNs). We will refer to the solution returned by the algorithm as the solution. The time corresponds to the statistics computed over the total time, i.e., the sum of the computation time of all the architectures evaluated, including the solution. The time is presented in minutes.

Table 2. ES parameters configuration

The results show that the algorithm is robust with regard to the MAE (and the MSE); however, there is a noticeable variation in the architectures and in the time needed to compute a solution. We analyzed the solutions and all the architectures evaluated during the optimization to gain insights into the relation between the architecture and the error. Figure 1a presents the architectures (number of LSTM cells and layers) of the solutions along with their respective MAE, and Figure 1b shows the same for all architectures evaluated. A small MAE (a darker dot) is desirable. It is important to remark that the MAE values presented in the two figures are not comparable, because the number of training epochs differs between them; therefore, the results are expected to differ (at least in their magnitude).

It is quite interesting that the solutions are very diverse (see Figure 1a) and that most of them use fewer than 500 LSTM cells. This is even more interesting considering that the maximum number of LSTM cells allowed by the problem restrictions (see Table 1) is 2400 and that many of the architectures evaluated have more than 500 LSTM cells (see Figure 1b).

To continue with our analysis, we ranked all the architectures evaluated (excluding the solutions) into deciles and selected the top one (i.e., the best architectures evaluated). Then, we plotted the density distribution of the number of recurrent layers (see Figure 2a) and of the total number of LSTM cells (see Figure 2b). We also plotted the density distribution of the solutions in both figures. The results show that both densities are relatively similar; therefore, we suspect that there is an archetype that suits the problem better. However, further analysis is required to validate this intuition.

5.3 Prediction benchmark

In order to continue with the evaluation of our proposal, we benchmark the predictions made by the RNN against the results published in [17, 28]. In order to compare the approaches, we compute the “mean absolute error in the filling predictions of the next month” (MM) using the solutions given by our algorithm, i.e., we predict a whole month using an RNN and sum up the predictions per container, and then we compute the mean absolute difference between the predicted values and the ground truth. Table 4 summarizes the results of the MM computed using the solutions. Note that the MM results are better than the MAE (see Table 3).
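One possible reading of the MM definition is sketched below: daily predictions are summed per container over the month, and the absolute differences between predicted and observed totals are averaged over containers. Whether the published metric applies any further normalization is not stated, so this is an interpretation; `monthly_mae` and the toy values are hypothetical.

```python
from typing import List


def monthly_mae(predicted: List[List[float]], observed: List[List[float]]) -> float:
    """MM sketch: inputs are lists of days, each a list of per-container values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("predicted and observed must cover the same, non-empty period")
    n_containers = len(observed[0])
    # Sum the month per container, then compare the totals.
    pred_total = [sum(day[c] for day in predicted) for c in range(n_containers)]
    obs_total = [sum(day[c] for day in observed) for c in range(n_containers)]
    return sum(abs(p - o) for p, o in zip(pred_total, obs_total)) / n_containers


# Toy example: 2 days, 2 containers (values are made up).
pred = [[10, 20], [30, 40]]
obs = [[12, 18], [25, 45]]
print(monthly_mae(pred, obs))
```

Under this reading, daily errors of opposite sign partially cancel within a container when they are summed over the month.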

We selected the median solution (in regard to the MM) and compared the results against the ones presented in [17, 28]. Table 5 presents the benchmark in terms of the prediction error. In that previous work, the authors proposed three time series algorithms used for forecasting the fill level for all containers. Particularly, they used techniques based on Linear Regression (LR), Gaussian Processes (GP), and Support Vector Machines for Regression called SMReg.

The results indicate that our proposal outperforms its competitors. Moreover, we performed a non-parametric Friedman’s two-way analysis of variance by ranks test, which revealed the RNN as the best algorithm, followed by the algorithm based on GP, the LR, and SMReg as the last algorithm in the comparison. Regarding statistically significant differences, the values have been adjusted by the Bonferroni correction for multiple comparisons. There are significant differences between each pair of algorithms, except for the particular comparison between LR and SMReg. Thus, the RNN is significantly the most competitive method according to the MM.

Finally, to relate the results presented in this subsection (see Table 4) to the ones presented in the previous subsection (see Table 3), we plotted the relation between the MAE and the MM (please refer to Figure 3). The figure also includes the architecture of the solutions (number of LSTM cells and number of recurrent layers). Something that caught our attention is that there is no apparent linear relation between the two metrics presented in the plot; however, the summarized results presented for both metrics (see Tables 3 and 4) are robust with regard to the referred error measurement.

5.4 Prediction under uncertainty

Following up with our experimentation, we studied the reliability of the solutions found. Particularly, we re-trained (Section 5.2) the solutions found (30 RNNs) using the synthetic dataset described previously.

Table 3. ES-based RNN optimization results

Figure 1. Architectures evaluated during the optimization process

Figure 2. The best solutions evaluated (fitness) compared to the final solutions

Table 4. MM statistics computed for the RNN solutions

Table 5. Prediction error of the compared methods

Figure 3. Relation between the MAE and the MM

Tables 6, 7 and 8 summarize the results of the reliability benchmark. At a glance, we notice that the overall results worsen as the uncertainty increases, as expected. Basically, the quality of the data has a direct impact on the accuracy of the predictions.

We also observed that losing data (i.e., replacing measurements with a zero) is less harmful than having random errors. In other words, it is preferable to have missing data (a non-functioning sensor) than an imprecise measurement (a faulty sensor). This particular insight presents a new challenge (or problem) to real waste management companies, because a non-functioning sensor is easy to find, whereas a faulty one might be hard to detect.

Table 6. Reliability benchmark (5%)

Table 7. Reliability benchmark (10%)

In order to compare the predictions under uncertainty against the predictions made by our competitors (Table 5), we selected, for each combination (random/zeros and percentage), a solution whose MAE is equal to the median, and we computed its MM. Table 9 presents the described results. As expected, the results show that adding uncertainty to the data has a negative impact on the MM. Moreover, a missing datum (zeros) has less impact than a random noisy datum. On the other hand, if the uncertain data represent less than 5% of the data, the RNN still beats all its competitors (Table 5). Note that the results shown in Table 5 do not consider uncertain data.

Table 8. Reliability benchmark (20%)

In order to gain insights into the relation between the performance and the architecture, especially with regard to the variation of the uncertainty, we computed the Pearson correlation between the MAE and the architecture definition of each solution. Particularly, we computed the correlation between each architecture descriptor (the total number of LSTM cells, the look back, and the number of recurrent layers) and the MAE obtained with the original data, with 5% of missing data (zeros), and with 5% of faulty data (random). Table 10 presents the correlations computed. The results show that there is a small correlation between the variables. Therefore, further analysis is needed to conclude that there is a relation between the performance and the architecture in this case (adding uncertainty to the dataset). Please refer to Table 11 in the Appendix for a detailed version of the results.
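For reference, the Pearson correlation used in this analysis can be computed as in the sketch below. The input values are made up for illustration and are not results from the study; `pearson` is a hypothetical helper.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (dev_x * dev_y)


# Made-up example: total LSTM cells vs. error of four hypothetical networks.
cells = [100, 200, 300, 400]
errors = [0.08, 0.07, 0.09, 0.08]
print(round(pearson(cells, errors), 3))
```

A coefficient close to zero, as reported for the studied solutions, indicates that no linear relation between the architecture descriptor and the error is apparent.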

Table 9. Reliability benchmark (up to 20% uncertainty)

Table 10. Architecture and MAE correlation

6. Conclusions and future work

Deep neuroevolution has emerged as a promising field of study and is growing rapidly. Particularly, the use of Evolutionary Algorithms to tackle the hyper-parametrization optimization problem is showing unprecedented results, not only in terms of the performance of the designed networks, but also in terms of the reduction of the computational resources needed (e.g., the configurations are evaluated using a heuristic, therefore not all configurations are actually trained [47, 48]).

In this study, we present a deep neuroevolutionary algorithm to optimize the architecture of an RNN (given a problem). We test our proposal using the filling level of 217 waste containers located in Andalusia, Spain, recorded over a whole year, and we benchmark our results against the state-of-the-art of filling level prediction. Our experimental results show that an “appropriate” selection of the architecture improves the performance (in terms of the error) of an RNN and that our predictions outperform those of all its competitors.

With regard to the quality of the predictions under uncertainty, the results show that the quality gets worse as the percentage of missing or faulty data increases. Nevertheless, the median RNN (not the best) is able to outperform all its competitors (which use correct data) even when the RNN uses an instance with 10% of missing data. In addition, by analyzing in detail the RNN results under uncertainty, we conclude that it is preferable to have missing data than imprecise data coming from a faulty sensor. This fact should be considered when we receive an outlier from a sensor.

As future work, we propose to explore train-free approaches for evaluating a network configuration. Specifically, we propose to study the use of MAE random sampling [47, 48] to compare RNN architectures, aiming to reduce the computational power and the time needed to find an appropriate architecture.

7. Acknowledgements

This research was partially funded by Ministerio de Economía, Industria y Competitividad, Gobierno de España, and European Regional Development Fund grant numbers TIN2016-81766-REDT (http://cirti.es), and TIN2017-88213-R (http://6city.lcc.uma.es). European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 799078. Universidad de Málaga, Campus Internacional de Excelencia Andalucía TECH.

References

[1] T. Bakici, E. Almirall, and J. Wareham, “A smart city initiative: the case of Barcelona,” Journal of the Knowledge Economy, vol. 4, no. 2, pp. 135-148, 2013.

[2] P. Ghisellini, C. Cialani, and S. Ulgiati, “A review on circular economy: The expected transition to a balanced interplay of environmental and economic systems,” Journal of Cleaner Production, vol. 114, pp. 11-32, 2016.

[3] A. Tukker, “Product services for a resource-efficient and circular economy - a review,” Journal of Cleaner Production, vol. 97, pp. 76-91, 2015.

[4] J. Teixeira, A. P. Antunes, and J. P. de Sousa, “Recyclable waste collection planning--a case study,” European Journal of Operational Research, vol. 158, no. 3, pp. 543-554, nov 2004.

[5] V. K. Ojha, A. Abraham, and V. Snášel, “Metaheuristic design of feedforward neural networks: A review of two decades of research,” Engineering Applications of Artificial Intelligence, vol. 60, pp. 97-116, 2017.

[6] T. Back, Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. Oxford University Press, 1995.

[7] J. Ferrer, J. García, E. Alba, and F. Chicano, “Intelligent testing of traffic light programs: Validation in smart mobility scenarios,” Mathematical Problems in Engineering, vol. 2016, pp. 1-19, 2016. [Online]. Available: http://www.hindawi.com/journals/mpe/2016/3871046/

[8] J. García, J. Ferrer, and E. Alba, “Optimising traffic lights with metaheuristics: Reduction of car emissions and consumption,” in International Joint Conference on Neural Networks, 2014, pp. 48-54.

[9] R. Massobrio, J. Toutouh, S. Nesmachnow, and E. Alba, “Infrastructure deployment in vehicular communication networks using a parallel multiobjective evolutionary algorithm,” International Journal of Intelligent Systems, vol. 32, no. 8, pp. 801-829, 2017.

[10] S. Nesmachnow, D. Rossit, and J. Toutouh, “Comparison of multiobjective evolutionary algorithms for prioritized urban waste collection in Montevideo, Uruguay,” Electronic Notes in Discrete Mathematics, 2018.

[11] J. Toutouh, D. Rossit, and S. Nesmachnow, “Computational intelligence for locating garbage accumulation points in urban scenarios,” in International Conference on Learning and Intelligent Optimization, LION 12, 2018, pp. 1-15.

[12] D. G. Rossit, S. Nesmachnow, and J. Toutouh, “Municipal solid waste management in smart cities: facility location of community bins,” in Congreso Iberoamericano de Ciudades Inteligentes (ICSC-CITIES 2018), 2018, pp. 1-14.

[13] A. Camero, J. Arellano, and E. Alba, “Road map partitioning for routing by using a micro steady state evolutionary algorithm,” Engineering Applications of Artificial Intelligence, vol. 71, pp. 155-165, 2018.

[14] A. Camero, J. Toutouh, D. H. Stolfi, and E. Alba, “Evolutionary deep learning for car park occupancy prediction in smart cities,” in Learning and Intelligent OptimizatioN Conference LION, 2018.

[15] Y. Jin and J. Branke, “Evolutionary optimization in uncertain environments,” IEEE Transactions on Evolutionary Computation, vol. 9, no. 5, pp. 303-317, 2005.

[16] A. Camero, J. Toutouh, J. Ferrer, and E. Alba, “Waste generation prediction in smart cities through deep neuroevolution,” in Smart Cities. ICSC-CITIES 2018, vol. 978, 2019, pp. 192-204.

[17] J. Ferrer and E. Alba, “(bin-ct): Urban waste collection based in predicting the container fill level,” jul 2018. [Online]. Available: http://arxiv.org/abs/1807.01603

[18] B. J. Garvin, M. Cohen, and M. B. Dwyer, “Evaluating improvements to a meta-heuristic search for constrained interaction testing,” Empirical Software Engineering, vol. 16, no. 1, pp. 61-102, 2011.

[19] S. Sahoo, S. Kim, B. I. Kim, B. Kraas, and A. Popov Jr., “Routing optimization for waste management,” Interfaces, vol. 35, no. 1, pp. 24-36, 2005.

[20] L. Q. Dat, D. T. Truc, S. Y. Chou, and V. F. Yu, “Optimizing reverse logistic costs for recycling end-of-life electrical and electronic products,” Expert Systems with Applications, vol. 39, no. 7, pp. 6380-6387, 2012.

[21] A. Z. Alagöz and G. Kocasoy, “Improvement and modification of the routing system for the health-care waste collection and transportation in Istanbul,” Waste Management, vol. 28, no. 8, pp. 1461-1471, 2008.

[22] J. Beliën, L. De Boeck, and J. Van Ackere, “Municipal solid waste collection and management problems: A literature review,” Transportation Science, vol. 48, no. 1, pp. 78-102, feb 2014.

[23] L. Xu, P. Gao, S. Cui, and C. Liu, “A hybrid procedure for MSW generation forecasting at multiple time scales in Xiamen City, China,” Waste Management, vol. 33, no. 6, pp. 1324-31, jun 2013.

[24] C. Cole, M. Quddus, A. Wheatley, M. Osmani, and K. Kay, “The impact of local authorities’ interventions on household waste collection: a case study approach using time series modelling,” Waste Management, vol. 34, no. 2, pp. 266-72, feb 2014.

[25] D. V. Tung and A. Pinnoi, “Vehicle routing-scheduling for waste collection in Hanoi,” European Journal of Operational Research, vol. 125, no. 3, pp. 449-468, 2000.

[26] J. Sniezek and L. Bodin, “Using mixed integer programming for solving the capacitated arc routing problem with vehicle/site dependencies with an application to the routing of residential sanitation collection vehicles,” Annals of Operations Research, vol. 144, no. 1, pp. 33-58, apr 2006.

[27] L. Bodin, A. Mingozzi, R. Baldacci, and M. Ball, “The rollon-rolloff vehicle routing problem,” Transportation Science, vol. 34, no. 3, pp. 271-288, 2000.

[28] J. Ferrer and E. Alba, “Bin-ct: sistema inteligente para la gestión de la recogida de residuos urbanos,” in International Greencities Congress, 2018, pp. 117-128.

[29] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436, 2015.

[30] S. Haykin, Neural networks and learning machines. Pearson, 2009, vol. 3.

[31] D. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propagation,” California: Univ San Diego La Jolla Inst for Cognitive Science, Tech. Rep. No. ICS-8506, 1985.

[32] H. Jaeger, Tutorial on training recurrent neural networks, covering BPPT, RTRL, EKF and the echo state network approach. GMD, 2002, vol. 5.

[33] R. Reed, R. Marks, and S. Oh, “Similarities of error regularization, sigmoid gain scaling, target smoothing, and training with jitter,” IEEE Transactions on Neural Networks, vol. 6, no. 3, pp. 529-538, 1995.

[34] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.

[35] J. Bergstra, D. Yamins, and D. Cox, “Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures,” in International Conference on Machine Learning, 2013, pp. 115-123.

[36] R. Jozefowicz, W. Zaremba, and I. Sutskever, “An empirical exploration of recurrent network architectures,” in International Conference on Machine Learning, 2015, pp. 2342-2350.

[37] E. Alba and R. Martí, Metaheuristic Procedures for Training Neural Networks. Springer Science & Business Media, 2006.

[38] X. Yao, “Evolving artificial neural networks,” Proceedings of the IEEE, vol. 87, no. 9, pp. 1423-1447, 1999.

[39] R. Miikkulainen et al., “Evolving deep neural networks,” arXiv preprint arXiv:1703.00548, 2017. [Online]. Available: http://arxiv.org/abs/1703.00548

[40] G. Morse and K. O. Stanley, “Simple evolutionary optimization can rival stochastic gradient descent in neural networks,” in Proc. of the Genetic and Evolutionary Computation Conf. 2016, 2016, pp. 477-484.

[41] X. Su, X. Yan, and C. L. Tsai, “Linear regression,” Wiley Interdisciplinary Reviews: Computational Statistics, vol. 4, no. 3, pp. 275-294, 2012.

[42] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.

[43] C. Doerr, “Non-static parameter choices in evolutionary computation,” in Genetic and Evolutionary Computation Conference Companion, 2017.

[44] A. Camero, J. Toutouh, and E. Alba, “(dlopt): Deep learning optimization library,” arXiv preprint arXiv:1807.03523, july 2018.

[45] F. Chollet et al., “Keras,” https://keras.io, 2015.

[46] M. Abadi et al., “TensorFlow: A system for large-scale machine learning,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265-283.

[47] A. Camero, J. Toutouh, and E. Alba, “Comparing deep recurrent networks based on the MAE random sampling, a first approach,” in Conference of the Spanish Association for Artificial Intelligence (CAEPIA) 2018, 2018, pp. 1-10.

[48] A. Camero, J. Toutouh, and E. Alba, “Low-cost recurrent neural network expected performance evaluation,” arXiv preprint arXiv:1805.07159, may 2018.

Appendices

Table 11 presents the detailed results of the experimentation. In particular, LSTM denotes the total number of LSTM cells, LB the look-back, and RL the number of stacked recurrent layers.

Table 11. Detailed results of the experimentation (table image not reproduced)