Time Series Analysis 9 | Exponential Smoothing Techniques and Univariate FB Prophet Model

Series: Time Series Analysis

  1. Holt-Winters Exponential Smoothing Techniques (ETS)

(1) Recall: SARIMA Model

The SARIMA (i.e. Seasonal ARIMA) model is a general model for time series analysis and an extension of the ARIMA model. On top of the non-seasonal orders (p, d, q), it adds three new hyperparameters to specify the autoregression (P), differencing (D), and moving average (Q) for the seasonal component of the series, as well as an additional hyperparameter for the period of the seasonality (m).

(2) Advantages of SARIMA

  • Generality: The SARIMA model can be applied to data collected at any sampling frequency, including daily data, weekly data, monthly data, etc.
  • Flexibility: The SARIMA model provides flexibility for hyperparameter tuning so that we can easily make changes to fit different trends and seasonalities.

(3) Disadvantages of SARIMA

Although SARIMA has advantages in generality and flexibility, it has several drawbacks,

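  • Order selection is difficult: seven hyperparameters, (p, d, q), (P, D, Q), and m, have to be chosen
  • Long-term prediction performance is often poor: as the horizon grows, the forecast loses detail and eventually becomes flat (or just repeats a fixed seasonal pattern)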

(4) SARIMA Vs. ETS

The SARIMA model is an extension of ARIMA, and the linear (additive) ETS models can be written as special cases of ARIMA models. The main difference between SARIMA and ETS is in the prediction part.

a. SARIMA:

  • Estimation is based on the ARIMA model.
  • Prediction is a weighted sum of past observations.
  • Prediction weights on the past observations are determined by the fitted AR and MA coefficients, so they differ from one dataset to another and follow no fixed pattern.

b. ETS: Error, Trend, Seasonal Model

  • Estimation is based on the ARIMA model.
  • Prediction is a weighted sum of past observations.
  • Prediction weights on the past observations are exponentially decaying.

(5) ETS Evaluation

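  • Linear ETS models are less general than ARIMA models: each of them has an ARIMA equivalent, but not the other way around
  • ETS models don’t require order selection; only the component types (trend, seasonality) and the smoothing parameters are chosen
  • ETS is fast to fit, which makes it a good quick-and-dirty baseline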

(6) Different types of Exponential Smoothing Models

  • Single/Simple Exponential Smoothing (SES): For stationary time-series data without trend and seasonality
  • Double Exponential Smoothing (DES): For time-series data with a trend but no seasonality
  • Triple Exponential Smoothing (TES): For time-series data with both trend and seasonality

(7) The Definition of Level

In time series models, the level is one more component in addition to trend, seasonality, and noise, and it is defined as the average value in the series.

(8) Single/Simple Exponential Smoothing

In ETS terms, the SES model corresponds to a model with additive errors, no trend, and no seasonality. It requires a single parameter α (alpha), called the smoothing factor or smoothing coefficient, with 0 ≤ α ≤ 1. An α value close to 1 indicates fast learning (that is, only the most recent values influence the forecasts), whereas a value close to 0 indicates slow learning (past observations have a large influence on forecasts).

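The level component, which is the only component of an SES model, is expressed as,

$$ l_t = \alpha Y_t + (1 - \alpha) l_{t-1} $$

where,

$$ 0 \le \alpha \le 1 $$

Based on this definition, the estimate Y_{t+1}-hat should be,

$$ \hat{Y}_{t+1} = l_t $$

which is, as defined,

$$ \hat{Y}_{t+1} = \alpha Y_t + (1 - \alpha) l_{t-1} $$

Continuing to substitute for l_{t-1}, l_{t-2}, and so on, we will have,

$$ \hat{Y}_{t+1} = \alpha Y_t + \alpha (1 - \alpha) Y_{t-1} + \alpha (1 - \alpha)^2 Y_{t-2} + \cdots = \sum_{j=0}^{\infty} w_j Y_{t-j} $$

where,

$$ w_j = \alpha (1 - \alpha)^j $$

so the weights on the past observations decay exponentially. Based on this result, we can conclude that,

$$ \text{SES} \equiv \text{ARIMA}(0, 1, 1) $$

where the MA(1) process has a parameter θ in its generating function and, writing the model as Y_t - Y_{t-1} = ε_t + θ ε_{t-1},

$$ \alpha = 1 + \theta $$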
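As a minimal sketch of the SES recursion, the following pure-Python snippet computes the level step by step and then checks that the resulting one-step forecast is the same exponentially weighted sum of past observations. The function names, the toy data, and the choice of the first observation as the initial level are assumptions made for illustration.

```python
def simple_exp_smoothing(y, alpha, level0=None):
    """Return the levels l_1..l_n of the series y.

    Recursion: l_t = alpha * y_t + (1 - alpha) * l_{t-1}.
    The one-step forecast of the next value is the last level.
    level0 defaults to the first observation (an assumption).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    level = y[0] if level0 is None else level0
    levels = []
    for obs in y:
        level = alpha * obs + (1 - alpha) * level
        levels.append(level)
    return levels


def ses_weights(alpha, n):
    """Weights alpha * (1 - alpha)**j on y_t, y_{t-1}, ..., y_{t-n+1}."""
    return [alpha * (1 - alpha) ** j for j in range(n)]


y = [12.0, 15.0, 14.0, 16.0, 19.0]  # toy data
alpha = 0.5
levels = simple_exp_smoothing(y, alpha)
forecast = levels[-1]  # forecast of the next, unobserved value

# The same forecast as an explicit weighted sum of past observations,
# plus the leftover weight (1 - alpha)**n on the initial level.
weights = ses_weights(alpha, len(y))
weighted = sum(w * obs for w, obs in zip(weights, reversed(y)))
weighted += (1 - alpha) ** len(y) * y[0]

print(forecast, weighted)
```

With a finite history, the weight that the infinite sum would put on older observations falls on the initial level instead, which is why the last line adds that term.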

(9) Double Exponential Smoothing

Double Exponential Smoothing (i.e. DES) is an extension to SES that explicitly adds support for trends in the univariate time series. In addition to the α parameter for controlling the smoothing factor for the level, an additional smoothing factor β is added to control the decay of the influence of the change in trend.

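The level component is defined as,

$$ l_t = \alpha Y_t + (1 - \alpha)(l_{t-1} + b_{t-1}) $$

And the trend component is defined as,

$$ b_t = \beta (l_t - l_{t-1}) + (1 - \beta) b_{t-1} $$

where,

$$ 0 \le \alpha \le 1, \quad 0 \le \beta \le 1 $$

Based on this definition, the estimate Y_{t+h}-hat should be,

$$ \hat{Y}_{t+h} = l_t + h b_t $$

From this result, we can see that the forecast keeps going up or down linearly as the horizon h grows, because it contains the term h * b_t.

Similar to SES, the DES is equivalent to,

$$ \text{DES} \equiv \text{ARIMA}(0, 2, 2) $$

where the MA(2) process has parameters θ_1 and θ_2 in its generating function. Writing the model as (1 - B)^2 Y_t = ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2}, with B the backshift operator, the smoothing factor for level is,

$$ \alpha = 1 - \theta_2 $$

And the smoothing factor for the trend is,

$$ \beta = \frac{1 + \theta_1 + \theta_2}{1 - \theta_2} $$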
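The DES recursions can be sketched in a few lines of pure Python. This is an illustrative implementation of the usual level and trend updates (Holt's linear method); the function name and the simple initialization (first observation as the level, first difference as the trend) are assumptions, and libraries normally estimate the initial state instead.

```python
def holt_linear(y, alpha, beta, horizon):
    """Double exponential smoothing forecast for h = 1..horizon.

    level: l_t = alpha * y_t + (1 - alpha) * (l_{t-1} + b_{t-1})
    trend: b_t = beta * (l_t - l_{t-1}) + (1 - beta) * b_{t-1}
    forecast: l_t + h * b_t
    """
    if len(y) < 2:
        raise ValueError("need at least two observations")
    # Assumed initialization: level = y[0], trend = first difference.
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


y = [3.0, 5.0, 7.0, 9.0, 11.0]  # toy data on a straight line with slope 2
forecasts = holt_linear(y, alpha=0.8, beta=0.2, horizon=3)
print(forecasts)  # approximately [13, 15, 17]: the line is extended
```

Because the forecast is l_t + h * b_t, a series that lies exactly on a line is extrapolated along that line for every horizon.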

(10) Triple Exponential Smoothing

Triple Exponential Smoothing (i.e. TES) is an extension of Exponential Smoothing that explicitly adds support for seasonality to the univariate time series. This method is also called Holt-Winters Exponential Smoothing, named for two contributors to the method, Charles Holt and Peter Winters. In addition to the α and β smoothing factors, a new parameter γ is added that controls the smoothing of the seasonal component.

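The level component is defined as (for additive seasonality),

$$ l_t = \alpha (Y_t - s_{t-m}) + (1 - \alpha)(l_{t-1} + b_{t-1}) $$

And the trend component is defined as,

$$ b_t = \beta (l_t - l_{t-1}) + (1 - \beta) b_{t-1} $$

Then the seasonal component with seasonality lag m is defined as,

$$ s_t = \gamma (Y_t - l_{t-1} - b_{t-1}) + (1 - \gamma) s_{t-m} $$

For multiplicative seasonality, the subtraction of the seasonal term is replaced by a division, i.e. Y_t - s_{t-m} becomes Y_t / s_{t-m} in the level and Y_t - l_{t-1} - b_{t-1} becomes Y_t / (l_{t-1} + b_{t-1}) in the seasonal component.

Based on this definition, the estimate Y_{t+h}-hat should be,

Additive model result:

$$ \hat{Y}_{t+h} = l_t + h b_t + s_{t+h-m(k+1)} $$

Multiplicative model result:

$$ \hat{Y}_{t+h} = (l_t + h b_t) \, s_{t+h-m(k+1)} $$

where k is the integer part of (h - 1) / m, so that the seasonal term is always taken from the last observed season.

Similar to SES and DES, the additive TES is equivalent to,

$$ \text{TES} \equiv \text{SARIMA}(0, 1, m+1)(0, 1, 0)_m $$

with restrictions on the moving-average parameters. The multiplicative TES has no ARIMA equivalent.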
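To make the three recursions concrete, here is a small pure-Python sketch of the additive Holt-Winters method in its common textbook form. The function name, the toy data, and the initialization (mean of the first season as the level, average change between the first two seasons as the trend, first-season deviations as the seasonal terms) are assumptions chosen for simplicity, not the only possible choices.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing with seasonal period m.

    level:  l_t = alpha * (y_t - s_{t-m}) + (1 - alpha) * (l_{t-1} + b_{t-1})
    trend:  b_t = beta * (l_t - l_{t-1}) + (1 - beta) * b_{t-1}
    season: s_t = gamma * (y_t - l_{t-1} - b_{t-1}) + (1 - gamma) * s_{t-m}
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons")
    first, second = y[:m], y[m:2 * m]
    # Assumed, deliberately simple initialization from the first two seasons.
    level = sum(first) / m
    trend = (sum(second) - sum(first)) / (m * m)
    season = [obs - level for obs in first]  # seasonal terms of season one

    for t in range(m, len(y)):
        prev_level, prev_trend = level, trend
        level = alpha * (y[t] - season[t - m]) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season.append(
            gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * season[t - m]
        )

    n = len(y)
    # Reuse the seasonal terms of the last observed season for every horizon.
    return [
        level + h * trend + season[n - m + (h - 1) % m]
        for h in range(1, horizon + 1)
    ]


# Toy data: constant level 10 plus a repeating pattern of period 4.
y = [12.0, 9.0, 7.0, 12.0] * 3
forecasts = holt_winters_additive(y, m=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
print(forecasts)  # approximately [12, 9, 7, 12]: the pattern is repeated
```

On a series that is purely seasonal, the level stays at the mean, the trend stays at zero, and the forecast repeats the seasonal pattern.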

(11) ETS Application in Python

In practice, when the SARIMA model with the selected order doesn’t meet expectations, we can run a fast ETS as an alternative. For an exponential smoothing method in Python (for example, the ExponentialSmoothing class in statsmodels, whose arguments have these names), we have to choose the parameters as follows,

  • trend: "additive" if we have a linear trend, "multiplicative" if we have an exponential trend
  • damped_trend: set to True to damp the trend, so that it flattens out at long horizons instead of growing without bound. This is often a sensible choice whenever a trend is included.
  • seasonal: "additive" if the size of the seasonal swing is roughly constant over time, "multiplicative" if the size of the seasonal swing grows or shrinks with the level of the series

  2. Univariate Facebook’s Prophet Model

(1) The Definition of Prophet Model

The Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.

The model is,

$$ y(t) = g(t) + s(t) + h(t) + \epsilon_t $$

where ε_t is the error term and the other components are,

  • Trend g(t): The trend can take one of two forms, a piecewise linear model (the default) or a non-linear saturating growth model, in which the growth slows down as the series approaches a capacity.

For the piecewise linear model, a set of changepoints has to be given. The growth rate is constant between changepoints and is allowed to change only at them, so the trend itself stays continuous while its slope changes in steps. Here we will not go deeply into its mathematical details.

Business data with non-linear saturating growth usually shows continuous growth within a capacity limit. So the logistic growth model can be written as,

$$ g(t) = \frac{C}{1 + \exp(-k (t - m))} $$

where C is the carrying capacity, k is the growth rate, and m is an offset (adjustment) parameter.

For the case with both saturating growth and changepoints, the two ideas are combined: the trend keeps the logistic form with capacity C, while the growth rate k is adjusted at each changepoint.
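The two trend shapes can be sketched as plain functions. The snippet below assumes the parameterization used in the Prophet paper (base rate k, offset m, rate changes δ_j at changepoints s_j, carrying capacity C); the function and argument names are hypothetical and are not part of the Prophet API.

```python
import math


def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Continuous piecewise linear trend g(t).

    k is the base growth rate, m the offset, and deltas[j] is the change
    in growth rate that takes effect at time changepoints[j].
    """
    rate, offset = k, m
    for s_j, delta_j in zip(changepoints, deltas):
        if t >= s_j:
            rate += delta_j
            offset -= s_j * delta_j  # keeps g(t) continuous at s_j
    return rate * t + offset


def logistic_trend(t, capacity, k, m):
    """Saturating growth g(t) = C / (1 + exp(-k * (t - m)))."""
    return capacity / (1.0 + math.exp(-k * (t - m)))


# Slope 1 until t = 10, then slope 2 (rate change of +1 at the changepoint).
linear_values = [
    piecewise_linear_trend(t, 1.0, 0.0, [10.0], [1.0]) for t in (5.0, 10.0, 12.0)
]
print(linear_values)  # [5.0, 10.0, 14.0]

# Halfway to the capacity at t = m, and close to the capacity for large t.
half = logistic_trend(0.0, capacity=100.0, k=0.5, m=0.0)
late = logistic_trend(40.0, capacity=100.0, k=0.5, m=0.0)
print(half, late)
```

The offset correction inside the loop is what keeps the piecewise linear trend free of jumps: only the slope changes at a changepoint, never the value.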

  • Seasonality s(t): Seasonality can be modeled at several frequencies at once. Weekly and yearly seasonalities are available by default, and we can also add monthly, quarterly, or any other custom seasonality.

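The weekly seasonality is calculated from the daily data: each day of the week gets its own effect, selected by an indicator I_{dayofweek} that equals 1 on that day of the week and 0 otherwise, and the effect can be thought of as the average value for that day of the week. Note that the released Prophet package fits the weekly component with a low-order Fourier series as well.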

The yearly and monthly seasonalities use a partial Fourier sum, which can approximate an arbitrary periodic signal. The number of terms in the partial sum is a parameter that determines how quickly the seasonality can change. This is a very classical way of modeling waves.

$$ s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2 \pi n t}{P} + b_n \sin\frac{2 \pi n t}{P} \right) $$

where P is the period (for example, 365.25 days for yearly seasonality), N is the number of terms, and a_n and b_n are the Fourier coefficients, estimated from the data, that scale each wave.

  • Holiday h(t): We have to provide a list/data frame of the customized holidays or events. We can also extend the effect of a holiday to the days around it by assigning lower_window and upper_window. There is also a built-in method called add_country_holidays() which can be used to add country-specific holidays.
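The partial Fourier sum used for the seasonality s(t) can be sketched as follows. The helper names and the coefficient values are made up for illustration; in Prophet the coefficients are estimated during fitting rather than set by hand.

```python
import math


def fourier_features(t, period, order):
    """Return [cos_1, sin_1, ..., cos_N, sin_N] evaluated at time t.

    The n-th pair uses the angle 2 * pi * n * t / period.
    """
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        features.extend([math.cos(angle), math.sin(angle)])
    return features


def seasonal_value(t, period, coeffs):
    """Evaluate s(t) for coeffs = [(a_1, b_1), ..., (a_N, b_N)]."""
    feats = fourier_features(t, period, len(coeffs))
    return sum(
        a * feats[2 * i] + b * feats[2 * i + 1] for i, (a, b) in enumerate(coeffs)
    )


# A yearly component with N = 10 terms needs 2 * N = 20 features per time point.
yearly = fourier_features(3.0, 365.25, 10)
print(len(yearly))  # 20

# Illustrative coefficients for a weekly pattern with N = 2 terms.
coeffs = [(1.0, 0.5), (0.2, -0.3)]
s_now = seasonal_value(10.0, 7.0, coeffs)
s_next = seasonal_value(17.0, 7.0, coeffs)  # exactly one period later
print(s_now, s_next)  # equal up to rounding, since s(t) is periodic
```

A larger N lets the seasonal curve change more quickly within one period, at the cost of more coefficients to estimate.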

(2) Evaluation of Prophet Model

The advantages of using a Prophet model are,

  • The piecewise linear growth trend captures the overall shape of the trend
  • It adds different seasonalities instead of just one seasonality
  • It is easy to use and provides more flexibility
  • In the evaluation reported in the Prophet paper, the model shows better long-term performance than the methods it is compared with
  • Accurate: According to the Prophet documentation, it is used in many applications across Facebook for producing reliable forecasts for planning and goal setting, and its authors report that it performs better than any other approach in the majority of cases.
  • Fully automatic: Prophet is robust to outliers, missing data, and dramatic changes in your time series.
  • Tunable: The parameters are human-interpretable, so domain knowledge can be used to adjust the forecast.

However, there are also some drawbacks,

  • It is designed mainly for daily or sub-daily data and tends to work less well on coarser series
  • It handles only univariate data rather than multivariate data. Note that, as with the SARIMA model, when the size of the seasonal variation changes over time, we should first apply a log transformation, or the more general Box-Cox transformation, to our time series Y_t.
  • It gives up some important inferential advantages of using a generative model such as ARIMA.
  • It performs well mainly for time series with the business features listed below
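The log and Box-Cox transformations mentioned in the drawbacks above are simple enough to write out. This is a minimal pure-Python sketch with hypothetical function names; it assumes strictly positive data, and in practice the parameter lam is usually estimated from the series rather than fixed by hand.

```python
import math


def box_cox(value, lam):
    """Box-Cox transform; lam = 0 gives the natural log."""
    if value <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return math.log(value)
    return (value ** lam - 1.0) / lam


def inv_box_cox(z, lam):
    """Inverse transform, used to bring forecasts back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Toy series that doubles every step: multiplicative growth.
series = [10.0, 20.0, 40.0, 80.0]
logged = [box_cox(v, 0.0) for v in series]
steps = [b - a for a, b in zip(logged, logged[1:])]
print(steps)  # every step is log(2): the growth is additive on the log scale

restored = [inv_box_cox(z, 0.0) for z in logged]
print(restored)  # the original series, up to rounding
```

The model is fitted to the transformed series, and the forecasts are passed through the inverse transform afterwards.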

(3) Business Time Series Features

  • Multiple strong seasonalities
  • Trend change that’s impacted by the time of the year or product release
  • Holiday effects or any change caused by human actions
  • Outliers

(4) Change Points Selection

If we want to use the piecewise trend for modeling, the selection of the changepoints can be vital. Typically, we have two approaches for changepoint selection,

  • They can be specified manually with known special dates
  • They can be selected automatically: a large number of candidate changepoints is supplied (for example, one per month for a several-year history) together with a sparse prior distribution on the rate changes δ, so that most candidates end up with a change close to zero
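For the automatic approach, the placement of the candidate changepoints can be sketched as follows. The helper is hypothetical and only mimics the idea of spreading candidates evenly; its defaults of 25 candidates in the first 80% of the history follow the defaults described in the Prophet documentation.

```python
def candidate_changepoints(n_obs, n_changepoints=25, changepoint_range=0.8):
    """Indices of evenly spaced candidate changepoints.

    Candidates are placed in the first `changepoint_range` fraction of the
    history and never on the very first observation.
    """
    hist_size = int(n_obs * changepoint_range)
    # Never ask for more candidates than there are usable observations.
    n_changepoints = min(n_changepoints, hist_size - 1)
    if n_changepoints <= 0:
        return []
    step = (hist_size - 1) / n_changepoints
    return [round(i * step) for i in range(1, n_changepoints + 1)]


indices = candidate_changepoints(1000)
print(len(indices), indices[0], indices[-1])  # 25 32 799
```

The sparse prior on δ then decides which of these candidates actually carry a rate change; leaving out the last 20% of the history avoids fitting trend changes to the most recent fluctuations.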