In response to the growing challenge of energy and power management caused by the increasing
adoption of volatile sustainable energy sources, a case study of forecasting building
electric loads using statistical and machine learning methods is conducted for a multi-purpose
LEED-certified institutional building on the UC Irvine campus. Four data-driven methods,
which require neither detailed building information nor strong building energy knowledge, are
employed and compared: polynomial regression, Autoregressive Integrated
Moving Average (ARIMA), TBATS (Trigonometric seasonal formulation, Box-Cox
transformation, ARMA errors, Trend, and Seasonal components), and backpropagation Artificial
Neural Networks (ANN). These models are investigated to satisfy the ASHRAE standards and
optimize the prediction performance. A full year of hourly electric load and meteorological data
of the building in 2019 was obtained using the existing meters for this data-driven study. Root
Mean Square Error (RMSE), Coefficient of Variation (CV), Mean Absolute Percentage Error
(MAPE), and R² are calculated as evaluation criteria to compare the performance of these
data-driven methods in terms of prediction accuracy. Akaike’s Information Criterion (AIC) is
introduced as a guideline to select the optimal model for several of the prediction methods.
The polynomial regression is performed using MATLAB and is shown capable only of data fitting
rather than forecasting when the total hour count is used as the independent variable. With
daily and weekly data, the polynomial regression method also fails at forecasting. For the
whole-month data, the ARIMA, TBATS, and ANN methods are used to predict the hourly power load in the
next month using Python. The ARIMA model shows relatively low accuracy, indicating that it is
unable to handle the multiple seasonalities in the data. TBATS shows substantially improved
accuracy and satisfactory predictions. A backpropagation ANN is also applied, with its
configuration (inputs, number of hidden layers and neurons, optimizer, and activation
functions) optimized after extensive testing. Different sets of training data are examined for both
TBATS and ANN. The ANN’s forecasting accuracy is found to be about 5% to 20% better than
that of TBATS when only one month of data is used for training. The residuals of these forecasting
methods suggest that some information remains uncaptured by the models. It is speculated that
operation and activity schedules could serve as additional inputs for the ANN to achieve better
forecasting accuracy.
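
As an illustration of the evaluation criteria named above, the following minimal Python sketch computes RMSE, CV, MAPE, and R² for a pair of measured and predicted load series. It is not the study's code. The function name evaluate_forecast is hypothetical, the sample values are made up for demonstration, and CV is assumed here to mean the coefficient of variation of the RMSE (RMSE divided by the mean measured load), which is the convention commonly used with ASHRAE guidelines.

```python
import numpy as np


def evaluate_forecast(actual, predicted):
    """Return RMSE, CV (%), MAPE (%), and R^2 for two equal-length series.

    Assumption: CV is computed as RMSE / mean(actual), i.e. CV(RMSE).
    MAPE is undefined if any measured value is zero.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = actual - predicted

    rmse = np.sqrt(np.mean(residuals ** 2))
    cv = 100.0 * rmse / np.mean(actual)
    mape = 100.0 * np.mean(np.abs(residuals / actual))

    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # total variation
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "cv_percent": cv, "mape_percent": mape, "r2": r2}


# Illustrative hourly loads in kW (invented values, not data from the study).
measured = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 310.0, 390.0]
metrics = evaluate_forecast(measured, forecast)
print(metrics)
```

Reporting a scale-free measure such as CV or MAPE alongside RMSE allows results to be compared across buildings or periods with different load magnitudes, while R² indicates how much of the variation in the measured load the forecast explains.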