
Energy Conversion and Management 44 (2003) 3207–3226

www.elsevier.com/locate/enconman

Architecture and performance of neural networks for efficient A/C control in buildings
Mohamed A. Mahmoud *, Abdullatif E. Ben-Nakhi
Department of Mechanical Engineering, College of Technological Studies, P.O. Box 33145, Rumaithya 25562, Kuwait
Received 10 November 2002; accepted 12 May 2003

Abstract

The feasibility of using neural networks (NNs) for optimizing air conditioning (AC) setback scheduling in public buildings was investigated. The main focus is on optimizing the network architecture in order to achieve the best performance. To save energy, the temperature inside public buildings is allowed to rise after business hours by setting back the thermostat. The objective is to predict the time of the end of thermostat setback (EoS) such that the design temperature inside the building is restored in time for the start of business hours. State of the art building simulation software, ESP-r, was used to generate a database that covered the years 1995–1999. The software was used to calculate the EoS for two office buildings using the climate records in Kuwait. The EoS data for 1995 and 1996 were used for training and testing the NNs. The robustness of the trained NNs was tested by applying them to a production data set (1997–1999), which the networks had never seen before. For each of the six different NN architectures evaluated, parametric studies were performed to determine the network parameters that best predict the EoS. External hourly temperature readings were used as network inputs, and the thermostat end of setback (EoS) is the output. The NN predictions were improved by developing a neural control scheme (NC). This scheme is based on using the temperature readings as they become available. For each NN architecture considered, six NNs were designed and trained for this purpose. The performance of the NN analysis was evaluated using a statistical indicator (the coefficient of multiple determination) and by statistical analysis of the error patterns, including ANOVA (analysis of variance). The results show that the NC, when used with a properly designed NN, is a powerful instrument for optimizing AC setback scheduling based only on external temperature records. © 2003 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +965-563-0013; fax: +965-534-9253/481-1753. E-mail address: drmamm@yahoo.com (M.A. Mahmoud).

0196-8904/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/S0196-8904(03)00105-5


M.A. Mahmoud, A.E. Ben-Nakhi / Energy Conversion and Management 44 (2003) 3207–3226

Keywords: Neural networks; Energy conservation; Air conditioning; Control; General regression; Building simulation; Polynomial nets

1. Introduction

Energy conservation in buildings is important for both economic and environmental reasons. Improving building energy efficiency simultaneously reduces conventional fuel consumption, building energy cost and the release of global warming gases to the atmosphere. This has been highlighted by the recent trend toward more effective and efficient heating, ventilation and air conditioning (HVAC) control methodologies. In countries with extremely hot weather conditions, energy conservation in air conditioning (AC) of public and office buildings is of particular interest, since most of these buildings are used only for part of every work day. One of the most promising strategies in this respect is off hours thermostat setback, i.e. allowing the temperature to rise inside the building when it is not in use, leading to energy savings. This is achieved by setting back the thermostat temperature after work hours, then resetting it early enough before the start of the work day such that the desired temperature in the building is restored in time for the actual start of work. The end of setback (denoted herein as EoS) depends, for a given building, on the weather conditions, which are not known a priori. This requires an advanced tool to predict the EoS based on past weather history, and artificial neural networks (NNs) offer an attractive and powerful option for this purpose.

NNs have been employed in a wide range of HVAC applications, such as design, operation and fault detection. Yeh and Wong [1] simulated the design process for sizing fluid systems in HVAC by using a NN. Teeter and Chow [2] used artificial NNs to emulate the HVAC plant dynamics in order to estimate future plant outputs and obtain plant input/output sensitivity information for online neural control adaptation. Chen and Chen [3] discussed a NN based system identification technique to determine the z-transfer function coefficients of a building envelope from experimental data. These coefficients were then used in a z-transfer function technique, which is used in calculations for HVAC design and building energy consumption.

Among the many researchers who have addressed the issue of increasing indoor thermal comfort, Egilegor et al. [4] implemented a neural-fuzzy control system in order to optimize the value of the predicted mean vote (PMV) index by tuning zone temperature according to the humidity level. Rock and Wu [5] introduced a CO2 based, demand controlled ventilation scheme by using an artificial NN control algorithm. Ahmed et al. [6] studied the development of a controller for temperature during the cooling and heating sequence. Morel et al. [7] developed an adaptive heating controller algorithm by using artificial NNs to accommodate the non-linearities of real buildings. Jeannette et al. [8] improved the performance of an unstable hot water system in an air handling unit by applying a predictive neural network (PNN) controller. Fargus and Chapman [9] developed a hybrid PI (Proportional and Integral)/neural network controller for commercial application in HVAC control systems. Kasahara et al. [10] applied a so-called preview control, which is a linear quadratic Gaussian optimal control with feed forward compensation, to control process variables, such as indoor temperature and indoor humidity. Saboksayr et al. [11]


designed a NN based decentralized controller to improve the operation of a multi-zone space heating system.

Two approaches have been reported for employing predictive controls in HVAC systems, namely predicting future weather and predicting future thermal load. Alessandri et al. [12] presented an environmental temperature forecasting model based on NNs. On the other hand, most of the studies applying NN based predictive control to HVAC systems have been performed on thermal storage systems. Kawashima et al. [13] studied improving the performance of a partial ice storage system by employing a controller that predicts the load by NNs. Similarly, Massie et al. [14] developed adaptive and predictive NN models for a chiller and ice thermal storage tank of a central plant HVAC system. Aside from these, NNs were employed for fault detection and diagnosis in HVAC systems [15,16].

In this paper, the feasibility of using NNs to optimize HVAC setback scheduling is demonstrated. Attention is focused on energy conservation in AC of public and office buildings in which off hours thermostat setback is used to allow the temperature to rise inside the building when it is not in use, leading to energy savings. The temperature in Kuwait in summer can exceed 50 °C in the shade during the day and can exceed 37 °C at dawn. With such extreme conditions, there is great potential for energy conservation in AC in these public buildings. As mentioned above, one successful strategy in this respect is to allow the temperature to rise within the building when it is not in use. One of the keys to the success of this strategy is being able to predict accurately the time (EoS) when full AC power needs to be restored. This prediction of the EoS is important because if the AC units start too late, the desired temperature within the building will not be reached in time for the start of business. On the other hand, if the AC units start too early, some potential savings in AC energy are lost. The problem with accurate prediction of the EoS is that it depends on the weather conditions, which are not known ahead of time. This requires an advanced tool to predict the EoS based on past weather history, and artificial NNs offer an attractive and powerful option for this purpose. The architecture and parameters of the NNs are the main focus of this study, with the aim of determining the optimum design for best network performance.

2. Building simulation software for calculation of EoS

In the office buildings considered in this study, the temperature is to be maintained at 24 °C within each building from 8 AM to 5 PM. Then, the thermostat is set back to 30 °C until at least 2 AM. It is required to determine the end of setback (EoS), i.e. the time at which the thermostat is reset to 24 °C, such that the temperature within the building is restored to 24 °C at 8 AM. Building energy simulation programs may be used to predict the EoS only when the weather conditions are known. This was done in the present study to prepare a database of past history to be used with the NNs to predict the EoS for new situations before the weather conditions are known. One of the most powerful building simulation codes, ESP-r [17], was used to calculate the EoS for this study. This state of the art, whole building simulation software has been evolved and applied at several research centers throughout Europe since its selection by the European Commission as a reference program for building energy simulation. This software is based on using integrated dynamic simulation in which the thermal performance of the building is systemic, i.e. different heat transfer mechanisms (such as the effect of wind velocity on external heat transfer


coefficient) interact in a complex manner. Besides conduction and convection, all significant heat flow paths are considered. These include internal and external long wave and short wave radiation and radiation absorption by transparent materials. The weather inputs to the program include diffuse solar radiation on the horizontal, external dry bulb temperature, direct normal solar intensity, prevailing wind speed, wind direction and relative humidity. Weather data in Kuwait were obtained from the Kuwait Institute for Scientific Research for the five years 1995–1999. The weather data for 1995 and 1996 were used as the base. Two office buildings are considered in this study. The first, denoted EW, is a two story structure 155 m long, 24 m wide and 7.2 m high. The long side of the building faces east. The eastern wall contains a double glazing transparent area of 328 m², while the western wall contains a double glazing transparent area of 270 m². This building is considered one of the extreme cases that can be studied, since solar radiation through the windows contributes to the cooling load inside the building after dawn, i.e. during the critical hours in which the NNs are predicting the EoS. Note in this respect that dawn is as early as 3:13 AM in Kuwait City in June and is earlier than 4:00 AM for more than 130 days between April 15 and August 27. During those days, the building is exposed to radiation (both direct and indirect) for more than 4 h before the start of work at 8:00 AM. Note also that the NNs in this study use only the external temperature as input to predict the EoS. Therefore, reasonably accurate predictions of the EoS by NNs for this building serve to demonstrate the robustness of the NN approach. The building materials used in the numerical models are similar to the common building materials in Kuwait and consist mainly of cement blocks, sand-lime bricks, cement mortar and insulation. 
The thermophysical and optical properties of these materials, together with those of the double glazing glass, were used as part of the input data to ESP-r to predict the EoS. The second building considered, denoted SN, was the same building but rotated 90°, i.e. the northern facade contains a double glazing transparent area of 270 m² and the southern wall contains a double glazing transparent area of 328 m². In this case, the effect of solar radiation through the windows is expected to be minimal during the period of interest (between dawn and 8 AM). It is expected, therefore, that the NN analysis should give better predictions for this building than for the first one. The site latitude and longitude for both buildings were assumed to be those of Kuwait City (i.e. 29.3° N latitude and 47.9° E longitude). Fig. 1 shows how the EoS, calculated using ESP-r, varies from day to day and from year to year during the period covered in this study for building EW. It is evident from the figure that a schedule for the EoS based on the date can lead to errors as large as 4 h during the hottest days of the year. Examination of the EoS plotted against the external temperature at 2 AM during the five year period for the same building revealed no correlation between the EoS and that temperature, which rules out any simple way to predict the EoS from a single temperature reading. An advanced tool that accounts for the temperature variation over several hours is needed, and artificial NNs offer a powerful tool for this purpose. An artificial NN is usually defined as a network composed of a large number of processors (neurons) that are massively interconnected, operate in parallel and learn from experience (examples). Backpropagation networks are known for their ability to generalize well on a wide variety of problems. That is why they are used for the vast majority of working NN applications. 
Two other NN types are also evaluated in this study, namely general regression and polynomial NNs. A brief summary of these network architectures is presented in Table 1 and a brief description of each network is given in the next section.

Fig. 1. Daily variation of the EoS, building EW. [Plot: EoS versus day of year, one series per year for 1995–1999.]

Table 1. Types of NNs used in this study
Abbreviation | Network | Brief description
SBP | Standard backpropagation NN (Fig. 2) | A three layer network in which each layer connects only to the next layer
3Slab | Predictive three slab NN (Fig. 3c) | Backpropagation network with the hidden layer divided into three slabs
2Slab | Predictive two-slab NN (Fig. 3a) | Backpropagation network with the hidden layer divided into two slabs
2SlabB | Predictive two-slab NN (Fig. 3b) | Backpropagation network with the hidden layer divided into two slabs and a connection to the output layer
GRNN | General regression NN (similar to Fig. 2) | A three layer network that contains one hidden neuron for each training pattern and converges to the underlying regression surface
GMDH | Polynomial NN (similar to Fig. 2, but with multiple hidden layers) | Works by building successive layers with links that are polynomial terms

3. Neural network architectures

The six architectures of NNs used in this study are summarized in Table 1. The standard type of the well known backpropagation NN (SBP) is one in which every layer is connected or linked to


the immediately previous layer. Three layers, namely the input, hidden and output layers (shown schematically in Fig. 2), are used for this basic standard type. The principal advantages of this network are its quick learning and fast convergence to an optimal regression surface as the number of samples becomes large. Sometimes more than one hidden layer is used; however, this may result in a dramatic increase in training time. Three more versatile types of backpropagation NNs were also evaluated in this study in addition to the SBP. These NNs, shown schematically in Fig. 3, are based on dividing the neurons in the middle layer into 2 or 3 sets (slabs); different activation functions applied to the hidden layer slabs detect different features in a pattern processed through a network. For example, a network design may use a Gaussian activation function on one hidden slab to detect features in the midrange of the data and use a Gaussian complement function in another hidden slab to detect features from the upper and lower extremes of the data. Combining the two feature sets in the output layer may lead to a better prediction. Thus, the output layer will get different views of the data [20]. General regression neural networks (GRNN) were also evaluated in this study. The block diagram of a GRNN is essentially similar to Fig. 2. A GRNN is a feedforward network that can be used to estimate a vector Y from a measured vector X. The input units are merely distribution

Fig. 2. Block diagram of a NN. [Diagram: inputs, hidden layer, outputs.]

Fig. 3. Block diagrams showing the architecture of the NNs used in the analysis: (a) two hidden slabs (2Slab), (b) two hidden slabs with a direct connection to output layer (2SlabB) and (c) three hidden slabs (3Slab).


units, which provide the (scaled) measured variables X to all of the neurons on the hidden layer that contains the pattern units. Each pattern unit (neuron) is dedicated to one exemplar (pattern) or one cluster center. When a new vector X is entered into the network, it is subtracted from the stored vector representing each cluster center. The squares of the differences are summed and fed into a non-linear activation function. The activation function used herein is logistic, of the form $f(x) = 1/(1 + e^{-x})$, where x is the input. This function is the most popular and has been found useful for most network applications [18]. The output of the pattern units is passed on to the summation units. Details of the GRNN paradigm were provided by Specht [19]. The GRNN learns by adjusting the interconnection weights between layers. The answers the network is producing are repeatedly compared with the correct answers, and each time, the connecting weights are adjusted slightly in the direction of the correct answers. Eventually, if the problem is learned, a stable set of weights adaptively evolves, which will provide good answers for all of the sample predictions. The real test of NNs occurs when the trained network is able to produce good results for new data.

A third class of NNs also evaluated in this study is the group method of data handling (GMDH) networks, also known as polynomial nets. These NNs work by building successive layers with complex links (or connections) that are the individual terms of a polynomial. These polynomial terms are created by using linear and non-linear regression. The initial layer is simply the input layer. The first layer created is made by computing regressions of the input variables and then choosing the best ones. The second layer is created by computing regressions of the values in the first layer along with the input variables. (Note that the process is essentially building polynomials of polynomials.) Again, only the best are chosen by the algorithm. These are called survivors. This process continues until the network stops getting better (according to a prespecified selection criterion). More technical details of the GMDH design and limitations are given by Farlow [23].
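The GRNN estimate described above can be illustrated with a short Python sketch that returns a kernel-weighted average of the stored training targets, in the general form described by Specht [19]. This is a minimal sketch under stated assumptions, not the software used in this study: the names `city_block`, `grnn_predict` and `sigma` are hypothetical, an exponential kernel is used in place of the logistic activation mentioned in the text, and the smoothing factor is fixed by hand rather than found by a genetic adaptive search.

```python
import math


def city_block(a, b):
    """City Block (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def grnn_predict(x, train_inputs, train_targets, sigma=1.0):
    """Kernel-weighted average of the stored targets.

    Each training pattern acts as one pattern unit; its weight decays
    exponentially with its City Block distance from the query x.
    sigma is a hypothetical smoothing factor (assumption: fixed here).
    """
    weights = [math.exp(-city_block(x, p) / sigma) for p in train_inputs]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed: fall back to the mean of the targets.
        return sum(train_targets) / len(train_targets)
    return sum(w * t for w, t in zip(weights, train_targets)) / total
```

With a small `sigma` the estimate follows the nearest stored pattern closely; with a large `sigma` it tends toward the mean of all training targets, which is why the smoothing factor has to be tuned on test data.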

4. Neural network results and discussion

One of the objectives of this study was to investigate the feasibility of using NNs to estimate the EoS, since no simple solution to this problem is available. In order to use NNs, the building simulation program ESP-r was used to generate a database for the years 1995–1999. The data used in the NN analysis covered 233 days (Julian days 89–321) of each year. The data for 1995 and 1996 (466 patterns) were used for training and testing the NNs. The patterns in this database were divided into two sets. The first set consisted of 373 patterns and was used for training the networks. The second set consisted of the remaining 93 patterns, selected randomly, and was used for testing the trained networks. To evaluate the usefulness of the networks, the trained networks were applied to a production data set (699 patterns for the years 1997–1999) which the networks had never seen before. The input layer to the GRNN network N2 consisted of 19 neurons to which temperature readings (recorded every hour) between 8 AM (of the previous day) and 2 AM (of the day being considered) were fed. The hidden layer must contain a minimum of one neuron for each data pattern; the number was set to 466. The number of neurons in the output layer is 1, which corresponds to the output (the EoS). The statistical indicator used to evaluate the closeness of fit (for


Table 2. Details of the six basic GRNNs and the NC for the training set (1995–1996) and building EW
Network | Input neurons | Input-data range | R² | Useful range of network prediction
N2 | 19 | 8 AM–2 AM | 0.9856 | ≥2 AM
N3 | 20 | 8 AM–3 AM | 0.9877 | ≥3 AM
N4 | 21 | 8 AM–4 AM | 0.9899 | ≥4 AM
N5 | 22 | 8 AM–5 AM | 0.9910 | ≥5 AM
N6 | 23 | 8 AM–6 AM | 0.9927 | ≥6 AM
N7 | 24 | 8 AM–7 AM | 0.9939 | ≥7 AM
NC |  |  | 0.9909 | 2–8 AM

all network architectures considered herein) is the coefficient of multiple determination R², which can be defined as [20]: $R^2 = 1 - \sum (y - y_p)^2 / \sum (y - y_m)^2$, where y is the actual value, y_p is the predicted value of y, and y_m is the mean of the y values. The coefficient of multiple determination R² compares the accuracy of the model to the accuracy of a trivial benchmark model wherein the prediction is simply the mean of all of the samples. A perfect fit would result in an R² value of 1 and a very good fit in a value near 1. The quality of fit decreases as R² decreases. Table 2 shows that R² for this network is 0.9856, which indicates a very good fit. Since the temperature (in addition to other weather data needed for the simulation program ESP-r) is read and recorded every hour, by 3 AM more weather information becomes available that might improve the prediction of the EoS. To investigate this notion, another NN (denoted N3) was designed and trained on the temperature records between 8 AM and 3 AM, i.e. the input layer for this network contains 20 neurons. Table 2 shows that R² for this network is 0.9877, which is higher than the R² for N2. However, it has to be borne in mind that the predictions of N3 are useful only for predicted values of the EoS > 3 AM. Similarly, four more NNs (denoted N4–N7) were designed and trained. The input range and R² for each are shown in Table 2. The table shows steady improvement in R² from N2 to N7, i.e. with the increase in the number of inputs. This led us to implement a neural control scheme (denoted NC), as explained by the flow chart in Fig. 4. For any given day, at 2 AM, the network N2 is applied to the hourly temperature record between 8 AM (of the previous day) and 2 AM (of the day being considered) to predict an EoS value denoted N2. If N2 < 3 AM, then EoS = N2. If N2 > 3 AM, then at 3 AM the network N3 is applied to the hourly temperature record between 8 AM (of the previous day) and 3 AM (of the day being considered) to predict another EoS value denoted N3. Then EoS = N3 if 3 AM < N3 < 4 AM. If N3 < 3 AM, then EoS = 3 AM, but if N3 > 4 AM, then the network N4 is used at 4 AM to predict the EoS. The procedure is repeated for N5–N7 as appropriate. From Table 2 (and other results presented in the remainder of this paper), it is evident that the NC gives better prediction of the EoS than does N2. A comparison of the NC predictions (GRNN networks) and the actual values of the EoS (calculated using the simulation software ESP-r) is shown in Fig. 5a and b for 1995 and 1996, respectively. The figure shows that the NC very closely predicts the EoS. The real test of NN analysis is when it is applied to production data, i.e. data the networks have never seen. The six trained GRNNs, N2–N7, were applied to the temperature records of the years 1997–1999 to predict the EoS for these years. The procedure was repeated for each of the six


Table 3. R² using the NC and the network N2 for the production set (1997–1999) and the building EW for different network designs
Network | 1997, using N2 | 1997, using NC | 1998, using N2 | 1998, using NC | 1999, using N2 | 1999, using NC
GRNN; linear scaling; genetic adaptive; City Block distance metric | 0.828 | 0.863 | 0.875 | 0.882 | 0.899 | 0.899
3Slab; learning rate and momentum are 0.03, 0.7 on one link, and 0.1, 0.1 on all others; vanilla; rotation. Linear scaling function on input; activation functions are Gauss, Gauss-comp., and logistic | 0.74 | 0.845 | 0.856 | 0.907 | 0.842 | 0.904
2Slab; learning rate and momentum are 0.1, 0.1 on all links; vanilla; rotation. Linear scaling function on input; activation functions are Gauss, Gauss-comp., and logistic | 0.794 | 0.859 | 0.829 | 0.856 | 0.871 | 0.894
2Slab-B; learning rate and momentum are 0.1, 0.1 on all links; vanilla; rotation. Linear scaling function on input; activation functions are Gauss, Gauss-comp., and logistic | 0.797 | 0.858 | 0.828 | 0.842 | 0.871 | 0.882
GMDH; linear scaling function on input | 0.789 | 0.851 | 0.836 | 0.853 | 0.88 | 0.901
SBP; learning rate and momentum are 0.1, 0.1 on all links; vanilla; rotation. Linear scaling function on input; activation functions are Gauss-comp. and logistic | 0.817 | 0.847 | 0.865 | 0.871 | 0.882 | 0.892

network architectures listed in Table 1, i.e. for each architecture, networks N2–N7 were designed, trained on 1995–1996 data, incorporated in the NC scheme and then applied to the production data 1997–1999. It is not possible to include all the results in this paper because of size limitations. Therefore, sample results and a summary of the most significant findings are presented in the following. For building EW, Table 3 shows the R² for each year using both N2 and NC for all six NN types. From the table, the NC gives prediction at least as good as the network N2 in every case, and better in all but one. The table also includes a definition of the parameters of each of the NN types. For instance, the results presented in the table for the GRNN were obtained using a genetic adaptive algorithm, a so-called City Block distance metric and a linear scaling factor for the input data. This network design was selected as a result of a parametric study that was intended to determine quantitatively the GRNN design that best predicts the production data (i.e. for years 1997–1999). The variables investigated included (a) different scaling functions (linear between [−1,1], linear between [0,1], logistic and hyperbolic tangent tanh), (b) two possible ways to measure the distance between patterns, namely the City Block distance metric [19] and the Euclidean distance metric [22] and (c) the genetic adaptive algorithm as opposed to an iterative


Fig. 4. Flow diagram of the NC. [Flow chart: Start; if N2 > 3 is false, EoS = N2 and end; otherwise set i = 3. If Ni < i, EoS = i and end. Otherwise, if Ni > i + 1, set i = i + 1 and repeat; if not, EoS = Ni and end.]

approach. The genetic approach is much slower but is expected to generalize much better than the iterative procedure. Similar parametric investigations were conducted on the other network types listed in Table 3 to determine the effect of each network variable on the ability of the network to predict the EoS accurately. Table 4 shows partial results of such a parametric study for the 3Slab networks. It includes the rate of learning and momentum on each link of the network, the method of weight updates (Vanilla/Momentum), and the method of pattern selection (random/rotation). These and other network parameters are described briefly in Appendix A for completeness. The results presented in the table confirm that the NC gives better prediction than the network N2 for all cases. Also, it is evident from the table that the best 3Slab design among those considered is design A. The types of scaling function and activation functions for each slab in the network (as listed in the table caption) were also selected as a result of a similar parametric study. Recall that each time a value for NC is determined, six NNs (N2–N7) have to be designed and trained. Therefore, significant amounts of time and effort were spent in performing these parametric studies. Based on these studies, the best design of each of the NNs for the present application is presented in Table 3. These best designs were used in the remainder of this paper for building EW (Figs. 5–11) and building SN (Figs. 12–15).
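Since every NC value relies on the six networks N2–N7 being consulted in sequence, it may help to see the decision logic of Fig. 4 written out. The Python sketch below is illustrative only and rests on stated assumptions: the names `input_window` and `neural_control` are hypothetical, each trained network is represented by a plain callable that maps a temperature window to a predicted EoS (in hours AM), and when the last network (N7) still predicts an EoS more than one hour ahead, its prediction is returned as is, a case the text does not spell out.

```python
from typing import Callable, Dict, Sequence

Predictor = Callable[[Sequence[float]], float]


def input_window(temps_from_8am: Sequence[float], hour: int) -> Sequence[float]:
    """Hourly readings from 8 AM (previous day) through `hour` AM, inclusive.

    8 AM to 2 AM gives 19 readings (network N2); 8 AM to 7 AM gives 24 (N7).
    """
    return temps_from_8am[: 17 + hour]


def neural_control(networks: Dict[int, Predictor],
                   temps_from_8am: Sequence[float]) -> float:
    """Return the EoS chosen by the NC scheme.

    `networks` maps the hour at which a network is applied (2..7) to that
    network's prediction function (assumption: stand-ins for trained NNs).
    """
    hours = sorted(networks)
    first, last = hours[0], hours[-1]

    # At 2 AM: accept N2 unless it predicts an EoS later than 3 AM.
    estimate = networks[first](input_window(temps_from_8am, first))
    if estimate <= first + 1:
        return estimate

    for hour in hours[1:]:
        estimate = networks[hour](input_window(temps_from_8am, hour))
        if estimate < hour:
            # The predicted time has already passed: end the setback now.
            return float(hour)
        if estimate <= hour + 1 or hour == last:
            # Prediction falls within the next hour, or no later network
            # exists (assumption for the N7 case).
            return estimate
    return estimate
```

For example, if N2 predicts 4.2 but N3, with one more hourly reading, predicts 2.8, the scheme returns 3.0, i.e. the setback ends immediately at 3 AM.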

Fig. 5. Comparison of actual (from ESP-r) and neural (NC) EoS, building EW, training and testing data: (a) for 1995 and (b) for 1996. [Plots: EoS versus day, series ESP-r and NC.]

For building EW, Figs. 6–8 show a comparison of the NC predictions (3Slab NNs) and the actual values (predicted by ESP-r). The figures show close agreement between the NC predictions and the actual values. Plots of the differences (errors) at the top of each of Figs. 6–8 show a random distribution of these errors. Further analysis of the correlation between the actual values (from ESP-r) and the NN predictions was conducted. Fig. 9 shows another plot of the data of Fig. 7. Ideally, the data should fall on a line of slope 1. A least squares fit through the data showed that the line of best fit has a slope b of 0.971. The 95% confidence interval on b lies between 0.936 and 1.007. Since this interval contains 1, there is no reason to reject the null hypothesis that b = 1, i.e. a 1:1


Table 4. Effect of some 3Slab network parameters on R² using the NC and the network N2 for the production set (1997–1999) and the building EW (linear scaling function; activation functions for hidden layer: Gaussian, Gaussian complement, logistic; and for third layer: logistic)
Design # | Network variables (3Slab) | 1997, using N2 | 1997, using NC | 1998, using N2 | 1998, using NC | 1999, using N2 | 1999, using NC
A | Vanilla; rotation; learning rate and momentum are 0.03 and 0.7, respectively, on the link to the logistic hidden slab; 0.1 and 0.1 on all other links | 0.740 | 0.845 | 0.856 | 0.907 | 0.842 | 0.904
B | Momentum; random; learning rate and momentum are 0.03 and 0.7, respectively, on the link to the logistic hidden slab; 0.1 and 0.1 on all other links | 0.788 | 0.830 | 0.842 | 0.850 | 0.876 | 0.893
C | Momentum; random; learning rate and momentum are 0.1 and 0.1, respectively, on all links | 0.782 | 0.848 | 0.848 | 0.868 | 0.871 | 0.899
D | Vanilla; random; learning rate and momentum are 0.03 and 0.7, respectively, on the link to the logistic hidden slab; 0.1 and 0.1 on all other links | 0.751 | 0.828 | 0.848 | 0.855 | 0.852 | 0.881

Fig. 6. Comparison of actual (from ESP-r) and neural (NC, 3Slab Networks) EoS, 1997, building EW. The differences (errors) are randomly distributed as shown on the top of the figure. [Plot: EoS (hour) and error versus day, series ESP-r and NC.]

Fig. 7. Comparison of actual (from ESP-r) and neural (NC, 3Slab Networks) EoS, 1998, building EW. The differences (errors) are randomly distributed as shown on the top of the figure. [Plot: EoS (hour) and error versus day, series ESP-r and NC.]

Fig. 8. Comparison of actual (from ESP-r) and neural (NC, 3Slab Networks) EoS, 1999, building EW. The differences (errors) are randomly distributed as shown on the top of the figure. [Plot: EoS (hour) and error versus day, series ESP-r and NC.]

correlation between ESP-r and NC. This statistical analysis was performed for all six networks (listed in Table 3), and the results are plotted in Fig. 10 for the years 1997–1999. The figure shows that, for all the networks considered and for the three-year period, one may accept the hypothesis of a 1:1 correlation between ESP-r and each of the NNs considered. Fig. 11 presents sample histogram plots of the errors (difference between the NN predictions and the line of best fit), which show a close-to-normal distribution with a mean close to 0.

Fig. 9. Comparison of EoS as predicted from ESP-r and the 3Slab (NC) networks for building EW, 1998. [Scatter plot: 3Slab EoS versus ESP-r EoS.]

Fig. 10. The 95% confidence intervals and point estimates of the slope of the line of best fit, building EW. [Plot: slope b versus network type (GRNN, 3Slab, 2Slab, 2SlabB, GMDH, SBP) for 1997, 1998 and 1999.]

Statistical analysis of variance (ANOVA) showed that, at the 95% confidence level, there is no reason to reject the null hypothesis that the predictions of the EoS by the six network architectures listed in Table 3 are essentially similar. In light of this, there is no statistical basis to prefer any network type over the others, except as may be based on the R² values. From Table 3, it may be inferred that the 3Slab architecture is marginally better than the others.
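The ANOVA comparison of network architectures can be illustrated with a minimal sketch of the one-way F statistic. The paper does not list the exact quantities entered into its ANOVA, so this sketch assumes that each group holds the EoS predictions of one architecture; the function name and the sample numbers are hypothetical.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists; each inner list is assumed here to hold
    the EoS predictions (hours) of one network architecture.
    """
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = fmean(all_values)
    group_means = [fmean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical EoS predictions (hours) from three architectures.
    predictions = [[4.1, 5.0, 6.2, 5.5], [4.0, 5.2, 6.0, 5.6], [4.3, 4.9, 6.1, 5.4]]
    f_stat, df_b, df_w = one_way_anova_f(predictions)
    print(f"F = {f_stat:.3f} with ({df_b}, {df_w}) degrees of freedom")
```

The null hypothesis of similar predictions is retained when the computed F is below the tabulated critical value for the same degrees of freedom at the chosen significance level.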


Fig. 11. Histogram distribution of errors between some network predictions and lines of best fit, building EW, 1998. [Histograms: frequency versus error (hr); panels: GRNN, 3Slab, SBP.]

Fig. 12. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1997, building SN. The differences (errors) are randomly distributed, as shown at the top of the figure. [Plot: EoS (hour) and error versus day; series: ESP-r, NC.]

For the building SN, the same procedure used to generate Figs. 6–11 was used to predict the EoS. This involved (a) using ESP-r to predict the EoS for the years 1995–1999, (b) training six new networks, N2–N7, on the 1995 and 1996 temperature data using the six network designs described in Table 3 and (c) applying these trained networks to the temperature records of the production years 1997–1999. Figs. 12–14 show a sample comparison of the NC predictions and the actual values, and Table 5 shows the R² for each year using both N2 and NC. The table shows that the NC gives better prediction than the network N2. The figures show close agreement between the NC predictions and the actual values. Plots of the errors (at the top of each of Figs. 12–14) show the random distribution of these errors. The NN predictions for this building, as expected, are better than the predictions for the EW building.

Fig. 13. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1998, building SN. The differences (errors) are randomly distributed, as shown at the top of the figure. [Plot: EoS (hour) and error versus day; series: ESP-r, NC.]

Fig. 14. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1999, building SN. The differences (errors) are randomly distributed, as shown at the top of the figure. [Plot: EoS (hour) and error versus day; series: ESP-r, NC.]

Fig. 15 presents the 95% confidence intervals and point estimates of the slopes of the lines of best fit, which again show that, for all the networks considered and for the three-year period, one may accept the hypothesis of a 1:1 correlation between ESP-r and each of the NNs considered.
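The 1:1 slope check can be sketched as follows: fit a least-squares line of NN predictions against ESP-r values, then test whether the 95% confidence interval of the slope b contains 1. This is a minimal sketch, not the authors' procedure. It assumes a normal approximation to the Student t quantile, which is reasonable only for samples of a few hundred patterns such as a full production year, and all names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def slope_confidence_interval(x, y, level=0.95):
    """Return (b, lower, upper) for the slope of the least-squares line y = a + b*x.

    x: reference values (e.g. ESP-r EoS), y: predictions (e.g. NC EoS).
    Assumption: the t quantile is replaced by the normal quantile (large n).
    """
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b = sxy / sxx                # point estimate of the slope
    a = y_bar - b * x_bar        # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = sqrt(sse / (n - 2) / sxx)   # standard error of the slope

    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return b, b - z * se_b, b + z * se_b


if __name__ == "__main__":
    # Hypothetical data: predictions equal the reference plus a small alternating error.
    esp_r = [float(i) for i in range(10)]
    nc = [v + (0.1 if i % 2 == 0 else -0.1) for i, v in enumerate(esp_r)]
    b, lo, hi = slope_confidence_interval(esp_r, nc)
    verdict = "accepted" if lo <= 1.0 <= hi else "rejected"
    print(f"b = {b:.4f}, 95% CI = [{lo:.4f}, {hi:.4f}], 1:1 hypothesis {verdict}")
```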

Fig. 15. The 95% confidence intervals and point estimates of the slope of the line of best fit, building SN. [Plot: slope b versus network type (GRNN, 3Slab, 2Slab, 2SlabB, GMDH, SBP) for 1997, 1998 and 1999.]

Table 5
R² using the NC and the network N2 for the production set (1997–1999) and the building SN for different network architectures

Network   1997              1998              1999
          N2       NC       N2       NC       N2       NC
GRNN      0.921    0.942    0.919    0.928    0.936    0.934
3Slab     0.916    0.943    0.923    0.932    0.936    0.937
2Slab     0.922    0.928    0.922    0.935    0.930    0.937
2Slab-B   0.921    0.942    0.922    0.938    0.925    0.940
GMDH      0.908    0.934    0.902    0.919    0.916    0.926
SBP       0.918    0.943    0.920    0.933    0.928    0.938
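For reference, the R² indicator reported in the tables can be computed as in the following minimal sketch. It assumes the usual definition of the coefficient of multiple determination, one minus the ratio of the residual sum of squares to the total sum of squares about the mean; the function name and the sample values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of multiple determination, R^2 = 1 - SS_res / SS_tot.

    actual: reference values (e.g. EoS from ESP-r)
    predicted: network outputs for the same patterns
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical EoS values in hours.
    esp_r = [1.0, 2.0, 3.0, 4.0]
    network = [1.5, 2.0, 3.0, 3.5]
    print(f"R^2 = {r_squared(esp_r, network):.3f}")
```

A perfect prediction gives R² = 1, and a network that always outputs the mean of the actual values gives R² = 0.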

Again, ANOVA showed that, at the 95% confidence level, there is no reason to reject the null hypothesis that the predictions of the EoS by the six network architectures listed in Table 5 are essentially similar. In light of this, there is no basis to prefer any network type over the others.

5. Conclusion

Six types of NNs were designed and trained to investigate the feasibility of using this technology for HVAC setback control. A state-of-the-art whole-building simulation software, the ESP-r system, was used to prepare a database of past history to be used with the NNs to predict the EoS for new situations before the weather conditions are known. Parametric studies were conducted for each NN type to select the network parameters that best predict the EoS. The NN predictions were improved by developing an NC. This scheme is based on using the temperature readings as they become available. Six NNs were designed and trained for this purpose (for each network type considered). The performance of the NN analysis was evaluated using a statistical indicator (the coefficient of multiple determination, R²) and by statistical analysis of the error patterns.


To evaluate the usefulness of the NNs, the trained networks were applied to a production data set (699 patterns for the years 1997–1999) that the networks had never seen before. The results of applying the technique to the data for these three years show good prediction when the NN is properly designed (R² values are given in Tables 3 and 5). The results also confirm that the NC gives better prediction than the network N2 for all cases. The success of the NN in accurately predicting the EoS is significant for two reasons. The first is that the NN can predict the EoS before the weather conditions are known. The second is that building simulation software requires many more weather inputs than the NNs do. ESP-r, for instance, requires as inputs the diffuse solar radiation on the horizontal, external dry bulb temperature, direct normal solar intensity, prevailing wind speed, wind direction and relative humidity. The NN predictions, on the other hand, use the external temperature as the only input. Temperature is perhaps the easiest and most reliable weather variable to measure. Based on this fact, one expects that controllers based on the NC approach will be simple to make and reliable to use.

Appendix A. Brief discussion of some NN parameters and design aspects

As neurons pass values from one layer of the network to the next layer in backpropagation networks, the values are modified by a weight value in the link that represents connection strengths between the neurons. The weights begin as random numbers that fall within a range specified at the start of training. As each pattern passes through the network, the weight is either raised to reinforce a connection positively, or lowered to inhibit the connection. A link is the connection or set of weights between the slabs or groups of neurons in a network. Each link can have an individual learning rate and momentum.

Each time a pattern is presented to the network, the weights leading to an output node are modified slightly during learning in the direction required to produce a smaller error the next time the same pattern is presented. The amount of weight modification is the learning rate times the error. Large learning rates often lead to oscillation of weight changes, so that learning never completes or the model converges to a solution that is not optimum. One way to allow faster learning without oscillation is to make the weight change a function of the previous weight change to provide a smoothing effect. The momentum factor determines the proportion of the last weight change that is added into the new weight change.

Two techniques for weight updates were evaluated, namely Vanilla and Momentum. The Vanilla algorithm applies the learning rate without a momentum term. In the Momentum algorithm, the weight updates include not only the change dictated by the learning rate but also a portion of the last weight change. Like momentum in physics, a high momentum term will keep the network generally going in the direction it has been going. In other words, weight fluctuations will tend to be dampened by a high momentum term.
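The difference between the Vanilla and Momentum updates can be illustrated with a minimal sketch for a single linear neuron. This is not the NeuroShell implementation. The function name, the zero initial weights (chosen for reproducibility in place of the random initial weights described above) and the training data are assumptions made for illustration only.

```python
def train_linear_neuron(patterns, learning_rate, momentum, epochs):
    """Train y = weight * x + bias with the delta rule.

    Each weight change is the learning rate times the error signal plus
    `momentum` times the previous weight change. Setting momentum = 0.0
    gives the plain (Vanilla) update. Patterns are presented in the order
    in which they are listed.
    """
    weight, bias = 0.0, 0.0          # assumption: fixed start instead of random
    prev_dw, prev_db = 0.0, 0.0      # last weight changes, used by the momentum term
    for _ in range(epochs):
        for x, target in patterns:
            error = target - (weight * x + bias)
            dw = learning_rate * error * x + momentum * prev_dw
            db = learning_rate * error + momentum * prev_db
            weight += dw
            bias += db
            prev_dw, prev_db = dw, db
    return weight, bias


if __name__ == "__main__":
    # Hypothetical noise-free patterns generated from y = 2x + 1.
    data = [(x, 2.0 * x + 1.0) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
    print("Vanilla: ", train_linear_neuron(data, 0.05, 0.0, 2000))
    print("Momentum:", train_linear_neuron(data, 0.05, 0.5, 2000))
```

In this sketch the momentum term reuses a fraction of the previous step, which is what smooths successive weight changes.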
The Momentum algorithm is expected to be useful for extremely noisy data, or when a high learning rate is necessary.

Two methods for selection of patterns from the training data set are Rotation and Random. The Rotation scheme selects the patterns in the order of appearance in the training set and is expected to be useful when similar training patterns are dispersed evenly in the data set. The Random scheme selects the patterns randomly and is useful for training sets that contain cyclical variations or when it is desirable to obtain answers that are independent of clustered information.

One of the pitfalls of using NNs is overtraining, i.e. the network memorizes the input patterns and does not generalize well for other data. In this study, overtraining of the GRNN was prevented by using the so-called Net-Perfect algorithm [20]. This algorithm optimizes the network by applying the current network to an independent test set during training. The algorithm finds the optimum network for the data in the test set (which means that the network is able to generalize well and give good results on new data). The algorithm optimizes the smoothing factor based upon the values in the test set. It does this by trying different smoothing factors and choosing the one that minimizes the mean squared error between the actual and predicted answers. Overtraining of the backpropagation networks and the GMDH networks used in the present study was prevented using similar algorithms.

When variables are loaded into an NN, they must be scaled from their numeric range into the numeric range that the NN deals with efficiently. There are two main numeric ranges the networks commonly operate in: zero to one, denoted [0, 1], and minus one to one, denoted [−1, 1]. One choice is the use of linear scaling functions for this purpose. Possible alternatives to these linear scaling functions include two non-linear scaling functions: logistic and tanh. The logistic function scales data to (0, 1) according to the following formula: $f(x) = 1/(1 + \exp(-(x - x_m)/s))$, where $x_m$ is the average of all of the values of that variable in the pattern file, and $s$ is the standard deviation of those values. The hyperbolic tangent function (tanh) scales data to (−1, 1) according to: $f(x) = \tanh((x - x_m)/s)$. As detailed in the main body of the paper, parametric studies were conducted to select the best scaling function for the present application.
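The scaling functions described above can be sketched as follows. The logistic and tanh forms follow the formulas in the text. The linear function is assumed here to be a min-max mapping, and the standard deviation is assumed to be the population value, since the text specifies neither. All function names and the sample temperatures are hypothetical.

```python
from math import exp, tanh
from statistics import fmean, pstdev


def linear_scale(values, low=0.0, high=1.0):
    """Map the minimum of `values` to `low` and the maximum to `high` (assumed min-max form)."""
    v_min, v_max = min(values), max(values)
    return [low + (high - low) * (v - v_min) / (v_max - v_min) for v in values]


def logistic_scale(values):
    """f(x) = 1 / (1 + exp(-(x - x_m) / s)), giving values in (0, 1)."""
    x_m, s = fmean(values), pstdev(values)   # assumption: population standard deviation
    return [1.0 / (1.0 + exp(-(v - x_m) / s)) for v in values]


def tanh_scale(values):
    """f(x) = tanh((x - x_m) / s), giving values in (-1, 1)."""
    x_m, s = fmean(values), pstdev(values)   # assumption: population standard deviation
    return [tanh((v - x_m) / s) for v in values]


if __name__ == "__main__":
    # Hypothetical external temperature readings in degrees Celsius.
    temperatures = [28.0, 31.5, 35.0, 41.0, 44.5]
    print(linear_scale(temperatures))              # linear, range [0, 1]
    print(linear_scale(temperatures, -1.0, 1.0))   # linear, range [-1, 1]
    print(logistic_scale(temperatures))
    print(tanh_scale(temperatures))
```

Unlike the linear mapping, the two non-linear functions compress values far from the mean, so an outlying reading cannot leave the network's working range.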
GRNNs work by measuring how far a given sample pattern is from patterns in the training set in N-dimensional space, where N is the number of inputs in the problem. In this study, the method of measuring the distance between patterns was the so-called City Block distance metric, which is the sum of the absolute values of the differences in all dimensions between the pattern and the weight vector for that neuron [19]. The GRNN used in this study was genetic adaptive, i.e. it uses a genetic algorithm to find an input smoothing factor adjustment. This is used to adapt the overall smoothing factor to provide a new value for each input. Genetic algorithms use a fitness measure to determine which of the individuals in the population survive and reproduce [21]. The fitness for GRNN is the mean squared error of the outputs for the entire data set. The genetic adaptive algorithm seeks to minimize the fitness.
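A GRNN prediction with the City Block metric, together with a test-set search for the smoothing factor, can be sketched as follows. The sketch assumes the kernel form given by Specht [19] for the City Block distance, in which each training pattern is weighted by exp(−C/σ). It replaces the genetic adaptive search and the Net-Perfect procedure with a plain search over a list of candidate smoothing factors, and it uses one overall smoothing factor in place of a separate adjustment for each input. All names and data are hypothetical.

```python
from math import exp


def city_block(a, b):
    """Sum of absolute differences between two patterns in all dimensions."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def grnn_predict(x, train_inputs, train_targets, sigma):
    """Weighted average of training targets, with weights exp(-C_i / sigma)."""
    dists = [city_block(x, p) for p in train_inputs]
    d_min = min(dists)   # shift for numerical stability; it cancels in the ratio
    weights = [exp(-(d - d_min) / sigma) for d in dists]
    return sum(w * t for w, t in zip(weights, train_targets)) / sum(weights)


def choose_sigma(candidates, train_inputs, train_targets, test_inputs, test_targets):
    """Return the candidate smoothing factor with the lowest test-set mean squared error."""
    def mse(sigma):
        errors = [
            (grnn_predict(x, train_inputs, train_targets, sigma) - t) ** 2
            for x, t in zip(test_inputs, test_targets)
        ]
        return sum(errors) / len(errors)
    return min(candidates, key=mse)


if __name__ == "__main__":
    # Hypothetical patterns: two scaled temperature readings -> EoS (hours).
    train_x = [[0.1, 0.2], [0.4, 0.5], [0.8, 0.9]]
    train_y = [6.0, 5.0, 3.5]
    test_x = [[0.2, 0.3], [0.7, 0.8]]
    test_y = [5.7, 3.9]
    sigma = choose_sigma([0.05, 0.1, 0.2, 0.5], train_x, train_y, test_x, test_y)
    print("chosen smoothing factor:", sigma)
    print("prediction:", grnn_predict([0.5, 0.6], train_x, train_y, sigma))
```

Scoring each candidate on patterns held out of training is what guards against overtraining here, since a smoothing factor that only reproduces the training patterns tends to give a larger error on the test set.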

References
[1] Yeh S, Wong K. HVAC pipe/duct sizing using artificial neural networks. Int J Modell Simulat 1999;19(3):2826.
[2] Teeter J, Chow M. Application of functional link neural network to HVAC thermal dynamic system identification. IEEE Trans Ind Electron 1998;45(1):70176.
[3] Chen Y, Chen Z. Neural-network-based experimental technique for determining z-transfer function coefficients of a building envelope. Building Environ 2000;35(3):1819.
[4] Egilegor B, Uribe JP, Arregi G, Pradilla E, Susperregi L. A fuzzy control adapted by a neural network to maintain a dwelling within thermal comfort. In: Proceedings of Building Simulation 97, vol. II. pp. 8794.
[5] Rock BA, Wu Ch-T. Performance of fixed, air-side economizer, and neural network demand-controlled ventilation in CAV systems. ASHRAE Trans 1998;104(2):23445.
[6] Ahmed O, Mitchell J, Klein S. Feedforward-feedback controller using general regression neural network (GRNN) for laboratory HVAC system: Part II - temperature control - cooling. ASHRAE Trans 1998;104(2):62634.
[7] Morel N, Bauer M, El-Khoury M, Krauss J. Neurobat, a predictive and adaptive heating control system using artificial neural networks. Int J Solar Energy 2001;21(23):161202.
[8] Jeannette E, Assawamartbunlue K, Curtiss P, Kreider J. Experimental results of a predictive neural network HVAC controller. ASHRAE Trans 1998;104(2):1927.
[9] Fargus RS, Chapman C. Commercial PI-neural controller for the control of building services plant, IEE Conference Publication no. 455 2, IEE, Stevenage, England, 1998. p. 168893.
[10] Kasahara M, Matsuba T, Hashimoto Y, Murasawa I, Kimbara A. Optimal preview control for HVAC system. ASHRAE Trans 1998;104(Pt 1A):50213.
[11] Saboksayr S, Patel R, Zaheer-uddin M. Energy-efficient operation of HVAC systems using neural network based decentralized controllers. In: Proceedings of the American Control Conference, vol. 6, 1995. p. 43215.
[12] Alessandri A, Verona F, Parisini T, Torrini A. Neural approximation for the optimal control of heating systems. In: Proceedings of the IEEE Conference on Control Applications, vol. 3, 1994. p. 16138.
[13] Kawashima M, Dorgan C, Mitchell J. Optimizing system control with load prediction by neural networks for an ice-storage system. ASHRAE Trans 1996;102(1):116978.
[14] Massie D, Curtiss P, Kreider J. Predicting central plant HVAC equipment performance using neural networks laboratory system test results. ASHRAE Trans 1998;104(1A):2218.
[15] Peitsman H, Bakker V. Application of black-box models to HVAC systems for fault detection. ASHRAE Trans 1996;102(1):62840.
[16] Li X, Vaezi-Nejad H, Visier J. Development of a fault diagnosis method for heating systems using neural networks. ASHRAE Trans 1996;102(1):60714.
[17] Clarke JA. Energy simulation in building design. Bristol: Adam Hilger Ltd.; 1985.
[18] Hammerstrom D. Working with neural networks. IEEE Spectrum 1993:4653.
[19] Specht DF. A general regression neural network. IEEE Trans Neural Networks 1991;2(6):56876.
[20] Neuroshell 2 Manual, Ward Systems Group Inc., Frederick, MA, 1996.
[21] Goldberg DE. Genetic algorithms in search optimization, and machine learning. Reading, MA: Addison-Wesley; 1989.
[22] Kreyszig E. Advanced engineering mathematics. seventh ed. New York: Wiley; 1993.
[23] Farlow SJ, editor. Self-organizing method in modeling: GMDH type algorithms. Statistics: Textbooks and Monographs, 1984. p. 54.
