"Fossies" - the Fresh Open Source Software Archive  

Source code changes of the file "docs/_docs/diagnostics.md" between
prophet-0.7.tar.gz and prophet-1.0.tar.gz

About: Prophet is a tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

diagnostics.md  (prophet-0.7):diagnostics.md  (prophet-1.0)
---
layout: docs
docid: "diagnostics"
title: "Diagnostics"
permalink: /docs/diagnostics.html
subsections:
  - title: Cross validation
    id: cross-validation
  - title: Parallelizing cross validation
    id: parallelizing-cross-validation
  - title: Hyperparameter tuning
    id: hyperparameter-tuning
---
<a id="cross-validation"> </a>
### Cross validation
Prophet includes functionality for time series cross validation to measure forecast error using historical data. This is done by selecting cutoff points in the history, and for each of them fitting the model using data only up to that cutoff point. We can then compare the forecasted values to the actual values. This figure illustrates a simulated historical forecast on the Peyton Manning dataset, where the model was fit to an initial history of 5 years, and a forecast was made on a one year horizon.
![png](/prophet/static/diagnostics_files/diagnostics_4_0.png)
[The Prophet paper](https://peerj.com/preprints/3190.pdf) gives further description of simulated historical forecasts.
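To make the procedure concrete, here is a minimal sketch of a single simulated historical forecast done by hand, before using the built-in utility. The history dataframe `df` (with columns `ds` and `y`) and the particular cutoff date are assumptions for illustration:

```python
# Python
import pandas as pd
from prophet import Prophet

# Hypothetical history df with columns ds, y; cutoff chosen for illustration
cutoff = pd.Timestamp('2015-01-21')
horizon = pd.Timedelta('365 days')

train = df[df['ds'] <= cutoff]  # fit only on data up to the cutoff
m_cut = Prophet().fit(train)

# Forecast every observed point between cutoff and cutoff + horizon
future = df[(df['ds'] > cutoff) & (df['ds'] <= cutoff + horizon)][['ds']]
forecast = m_cut.predict(future)

# Compare out-of-sample yhat to the actual y
comparison = forecast[['ds', 'yhat']].merge(df, on='ds')
```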
This cross validation procedure can be done automatically for a range of historical cutoffs using the `cross_validation` function. We specify the forecast horizon (`horizon`), and then optionally the size of the initial training period (`initial`) and the spacing between cutoff dates (`period`). By default, the initial training period is set to three times the horizon, and cutoffs are made every half a horizon.
The output of `cross_validation` is a dataframe with the true values `y` and the out-of-sample forecast values `yhat`, at each simulated forecast date and for each cutoff date. In particular, a forecast is made for every observed point between `cutoff` and `cutoff + horizon`. This dataframe can then be used to compute error measures of `yhat` vs. `y`.
Here we do cross-validation to assess prediction performance on a horizon of 365 days, starting with 730 days of training data in the first cutoff and then making predictions every 180 days. On this 8 year time series, this corresponds to 11 total forecasts.
```R
# R
df.cv <- cross_validation(m, initial = 730, period = 180, horizon = 365, units = 'days')
head(df.cv)
```
```python
# Python
from prophet.diagnostics import cross_validation
df_cv = cross_validation(m, initial='730 days', period='180 days', horizon='365 days')
```
```python
# Python
df_cv.head()
```
|   | ds         | yhat     | yhat_lower | yhat_upper | y        | cutoff     |
|---|------------|----------|------------|------------|----------|------------|
| 0 | 2010-02-16 | 8.959678 | 8.470035   | 9.451618   | 8.242493 | 2010-02-15 |
| 1 | 2010-02-17 | 8.726195 | 8.236734   | 9.219616   | 8.008033 | 2010-02-15 |
| 2 | 2010-02-18 | 8.610011 | 8.104834   | 9.125484   | 8.045268 | 2010-02-15 |
| 3 | 2010-02-19 | 8.532004 | 7.985031   | 9.041575   | 7.928766 | 2010-02-15 |
| 4 | 2010-02-20 | 8.274090 | 7.779034   | 8.745627   | 7.745003 | 2010-02-15 |
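Because `df_cv` pairs each out-of-sample `yhat` with the realized `y`, custom error measures are easy to compute directly. For example, a sketch of the mean absolute error grouped by cutoff, using the `df_cv` produced above:

```python
# Python
# Mean absolute error of yhat vs. y for each cutoff (sketch)
mae_per_cutoff = (
    df_cv.assign(abs_err=(df_cv['y'] - df_cv['yhat']).abs())
         .groupby('cutoff')['abs_err']
         .mean()
)
print(mae_per_cutoff)
```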
In R, the argument `units` must be a type accepted by `as.difftime`, which is weeks or shorter. In Python, the string for `initial`, `period`, and `horizon` should be in the format used by Pandas Timedelta, which accepts units of days or shorter.
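For instance, any string that `pd.Timedelta` can parse will work (a quick sketch of valid values):

```python
# Python
import pandas as pd

pd.Timedelta('365 days')  # days are accepted
pd.Timedelta('12 hours')  # so are shorter units like hours or minutes
```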
Custom cutoffs can also be supplied as a list of dates to the `cutoffs` keyword in the `cross_validation` function in Python and R. For example, three cutoffs six months apart would need to be passed to the `cutoffs` argument in a date format like:
```R
# R
cutoffs <- as.Date(c('2013-02-15', '2013-08-15', '2014-02-15'))
df.cv2 <- cross_validation(m, cutoffs = cutoffs, horizon = 365, units = 'days')
```
```python
# Python
cutoffs = pd.to_datetime(['2013-02-15', '2013-08-15', '2014-02-15'])
df_cv2 = cross_validation(m, cutoffs=cutoffs, horizon='365 days')
```
The `performance_metrics` utility can be used to compute some useful statistics of the prediction performance (`yhat`, `yhat_lower`, and `yhat_upper` compared to `y`), as a function of the distance from the cutoff (how far into the future the prediction was). The statistics computed are mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), median absolute percent error (MDAPE), symmetric mean absolute percent error (SMAPE), and coverage of the `yhat_lower` and `yhat_upper` estimates. These are computed on a rolling window of the predictions in `df_cv` after sorting by horizon (`ds` minus `cutoff`). By default 10% of the predictions will be included in each window, but this can be changed with the `rolling_window` argument.
```R
# R
df.p <- performance_metrics(df.cv)
head(df.p)
```
```python
# Python
from prophet.diagnostics import performance_metrics
df_p = performance_metrics(df_cv)
df_p.head()
```
|   | horizon | mse      | rmse     | mae      | mape     | mdape    | smape    | coverage |
|---|---------|----------|----------|----------|----------|----------|----------|----------|
| 0 | 37 days | 0.493764 | 0.702683 | 0.504754 | 0.058485 | 0.049922 | 0.058774 | 0.674052 |
| 1 | 38 days | 0.499522 | 0.706769 | 0.509723 | 0.059060 | 0.049389 | 0.059409 | 0.672910 |
| 2 | 39 days | 0.521614 | 0.722229 | 0.515793 | 0.059657 | 0.049540 | 0.060131 | 0.670169 |
| 3 | 40 days | 0.528760 | 0.727159 | 0.518634 | 0.059961 | 0.049232 | 0.060504 | 0.671311 |
| 4 | 41 days | 0.536078 | 0.732174 | 0.519585 | 0.060036 | 0.049389 | 0.060641 | 0.678849 |
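The rolling window used for these metrics can be adjusted without re-running cross validation. For instance, `rolling_window=1` computes a single set of metrics over all of the predictions at once, and the optional `metrics` argument restricts which statistics are computed; a short sketch:

```python
# Python
# One overall row of metrics instead of per-horizon rolling values
df_p_all = performance_metrics(df_cv, rolling_window=1, metrics=['rmse', 'mape'])
print(df_p_all)
```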
Cross validation performance metrics can be visualized with `plot_cross_validation_metric`, here shown for MAPE. Dots show the absolute percent error for each prediction in `df_cv`. The blue line shows the MAPE, where the mean is taken over a rolling window of the dots. We see for this forecast that errors around 5% are typical for predictions one month into the future, and that errors increase up to around 11% for predictions that are a year out.
```R
# R
plot_cross_validation_metric(df.cv, metric = 'mape')
```
```python
# Python
from prophet.plot import plot_cross_validation_metric
fig = plot_cross_validation_metric(df_cv, metric='mape')
```
![png](/prophet/static/diagnostics_files/diagnostics_17_0.png)
The size of the rolling window in the figure can be changed with the optional argument `rolling_window`, which specifies the proportion of forecasts to use in each rolling window. The default is 0.1, corresponding to 10% of rows from `df_cv` included in each window; increasing this will lead to a smoother average curve in the figure, as in the sketch below. The `initial` period should be long enough to capture all of the components of the model, in particular seasonalities and extra regressors: at least a year for yearly seasonality, at least a week for weekly seasonality, etc.
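For example, a sketch that doubles the window for a smoother curve (the 0.2 value here is just an illustrative choice):

```python
# Python
# Each rolling window now covers 20% of the forecasts instead of the default 10%
fig = plot_cross_validation_metric(df_cv, metric='mape', rolling_window=0.2)
```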
<a id="parallelizing-cross-validation"> </a> <a id="parallelizing-cross-validation"> </a>
### Parallelizing cross validation ### Parallelizing cross validation
Cross-validation can also be run in parallel mode in Python, by specifying the `parallel` keyword. Four modes are supported:
* `parallel=None` (Default, no parallelization)
* `parallel="processes"`
* `parallel="threads"`
* `parallel="dask"`
For problems that aren't too big, we recommend using `parallel="processes"`. It will achieve the highest performance when the parallel cross validation can be done on a single machine. For large problems, a [Dask](https://dask.org) cluster can be used to do the cross validation on many machines. You will need to [install Dask](https://docs.dask.org/en/latest/install.html) separately, as it will not be installed with `prophet`.
```python
from dask.distributed import Client

client = Client()  # connect to the cluster
df_cv = cross_validation(m, initial='730 days', period='180 days', horizon='365 days',
                         parallel="dask")
```
<a id="hyperparameter-tuning"> </a>
### Hyperparameter tuning
Cross-validation can also be used for tuning hyperparameters of the model, such as `changepoint_prior_scale` and `seasonality_prior_scale`. Below is a sketch of a grid search over a 4x4 grid of these two parameters, evaluating each combination by the RMSE from parallel cross validation (with the `cutoffs` defined earlier):
```python
# Python
import itertools
import numpy as np
import pandas as pd

param_grid = {
    'changepoint_prior_scale': [0.001, 0.01, 0.1, 0.5],
    'seasonality_prior_scale': [0.01, 0.1, 1.0, 10.0],
}

# Generate all combinations of parameters
all_params = [dict(zip(param_grid.keys(), v)) for v in itertools.product(*param_grid.values())]
rmses = []  # Store the RMSE for each parameter combination here

# Use cross validation to evaluate all parameters
for params in all_params:
    m = Prophet(**params).fit(df)  # Fit model with given params
    df_cv = cross_validation(m, cutoffs=cutoffs, horizon='30 days', parallel="processes")
    df_p = performance_metrics(df_cv, rolling_window=1)
    rmses.append(df_p['rmse'].values[0])

# Find the best parameters
tuning_results = pd.DataFrame(all_params)
tuning_results['rmse'] = rmses
print(tuning_results)
```
        changepoint_prior_scale  seasonality_prior_scale      rmse
    0                     0.001                     0.01  0.757694
    1                     0.001                     0.10  0.743399
    2                     0.001                     1.00  0.753387
    3                     0.001                    10.00  0.762890
    4                     0.010                     0.01  0.542315
    5                     0.010                     0.10  0.535546
    6                     0.010                     1.00  0.527008
    7                     0.010                    10.00  0.541544
    8                     0.100                     0.01  0.524835
    9                     0.100                     0.10  0.516061
    10                    0.100                     1.00  0.521406
    11                    0.100                    10.00  0.518580
    12                    0.500                     0.01  0.532140
    13                    0.500                     0.10  0.524668
    14                    0.500                     1.00  0.521130
    15                    0.500                    10.00  0.522980
```python
# Python
best_params = all_params[np.argmin(rmses)]
print(best_params)
```
    {'changepoint_prior_scale': 0.1, 'seasonality_prior_scale': 0.1}
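With the winning combination in hand, the natural next step is to refit on the full history; a one-line sketch (assuming the same `df` used throughout):

```python
# Python
# Refit the final model on the full history using the best parameters found above
m = Prophet(**best_params).fit(df)
```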
Alternatively, parallelization could be done across parameter combinations by parallelizing the loop above.
The Prophet model has a number of input parameters that one might consider tuning. Here are some general recommendations for hyperparameter tuning that may be a good starting place.
**Parameters that can be tuned**
- `changepoint_prior_scale`: This is probably the most impactful parameter. It determines the flexibility of the trend, and in particular how much the trend changes at the trend changepoints. As described in this documentation, if it is too small, the trend will be underfit and variance that should have been modeled with trend changes will instead end up being handled with the noise term. If it is too large, the trend will overfit and in the most extreme case you can end up with the trend capturing yearly seasonality. The default of 0.05 works for many time series, but this could be tuned; a range of [0.001, 0.5] would likely be about right. Parameters like this (regularization penalties; this is effectively a lasso penalty) are often tuned on a log scale.
- `seasonality_prior_scale`: This parameter controls the flexibility of the seasonality. Similarly, a large value allows the seasonality to fit large fluctuations, a small value shrinks the magnitude of the seasonality. The default is 10.0, which applies basically no regularization. That is because we very rarely see overfitting here (there's inherent regularization with the fact that it is being modeled with a truncated Fourier series, so it's essentially low-pass filtered). A reasonable range for tuning it would probably be [0.01, 10]; when set to 0.01 you should find that the magnitude of seasonality is forced to be very small. This likely also makes sense on a log scale, since it is effectively an L2 penalty like in ridge regression (see the grid sketch below).
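Since both prior scales are typically explored on a log scale, a log-spaced grid over the ranges suggested above is a natural starting point. A minimal sketch (the `num=4` grid density is just an illustrative choice):

```python
# Python
import numpy as np

# Log-spaced candidates over the suggested ranges: [0.001, 0.5] for the
# changepoint prior scale and [0.01, 10] for the seasonality prior scale
cps_grid = np.logspace(np.log10(0.001), np.log10(0.5), num=4)
sps_grid = np.logspace(np.log10(0.01), np.log10(10.0), num=4)
param_grid = {
    'changepoint_prior_scale': list(cps_grid),
    'seasonality_prior_scale': list(sps_grid),
}
```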