Intro to time series in R

Applications to resilience assesments

Juan C. Rocha

Time series

  • Observations or measurements that are indexed according to time (instead of Xi you have Xt)
  • Ordering matters: not exchangeable!
  • Time can stand for other unobserved variables
  • The assumption of independence does not apply
  • One can use the time index to decompose time series into different time scales! (Fourier transform or spectral analysis)

Time series task force

Packages

  • ts: time series object of the base R package
    • matrix of observations indexed by a chronological identifier.
    • Descriptive attributes need to be numeric.
    • ts does not recognize irregularly spaced time series.

      Source: Foley, 2021

# creating a ts:
ts(1:10, frequency = 4, start = c(1959, 2)) 
     Qtr1 Qtr2 Qtr3 Qtr4
1959         1    2    3
1960    4    5    6    7
1961    8    9   10     
nhtemp
Time Series:
Start = 1912 
End = 1971 
Frequency = 1 
 [1] 49.9 52.3 49.4 51.1 49.4 47.9 49.8 50.9 49.3 51.9 50.8 49.6 49.3 50.6 48.4
[16] 50.7 50.9 50.6 51.5 52.8 51.8 51.1 49.8 50.2 50.4 51.6 51.8 50.9 48.8 51.7
[31] 51.0 50.6 51.7 51.5 52.1 51.3 51.0 54.0 51.4 52.7 53.1 54.6 52.0 52.0 50.9
[46] 52.6 50.2 52.6 51.6 51.9 50.5 50.9 51.7 51.4 51.7 50.8 51.9 51.8 51.9 53.0

Packages

  • zoo: supports irregular time series (Zeileis’s ordered observations)
    • matrix of values with index attribute
    • partial support for factors, but mainly numeric data
    • implement methods to handle NAs and rolling windows
    • introduced in 2005.

      Source: Foley, 2021

library(zoo)
## simple creation and plotting
x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 9, 14) - 1
x <- zoo(rnorm(5), x.Date)
plot(x)

time(x) # extract time vector
[1] "2003-02-01" "2003-02-03" "2003-02-07" "2003-02-09" "2003-02-14"

Packages

  • xts: extends zoo.
    • matrix of values with a time object as index
    • allow duplicates but not missing values in index
    • allow meta-data and other attributes
    • allow fancy filtering in indexing
    • introduced in 2008.

      Source: Foley, 2021

library(xts)
data(sample_matrix)
sample.xts <- as.xts(sample_matrix, descr='my new xts object')
str(sample.xts)
An 'xts' object on 2007-01-02/2007-06-30 containing:
  Data: num [1:180, 1:4] 50 50.2 50.4 50.4 50.2 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:4] "Open" "High" "Low" "Close"
  Indexed by objects of class: [POSIXct,POSIXt] TZ: 
  xts Attributes:  
List of 1
 $ descr: chr "my new xts object"
sample.xts['2007-03/']  # March 2007 to the end of the data set
               Open     High      Low    Close
2007-03-01 50.81620 50.81620 50.56451 50.57075
2007-03-02 50.60980 50.72061 50.50808 50.61559
2007-03-03 50.73241 50.73241 50.40929 50.41033
2007-03-04 50.39273 50.40881 50.24922 50.32636
2007-03-05 50.26501 50.34050 50.26501 50.29567
2007-03-06 50.27464 50.32019 50.16380 50.16380
2007-03-07 50.14458 50.20278 49.91381 49.91381
2007-03-08 49.93149 50.00364 49.84893 49.91839
2007-03-09 49.92377 49.92377 49.74242 49.80712
2007-03-10 49.79370 49.88984 49.70385 49.88698
2007-03-11 49.83062 49.88295 49.76031 49.78806
2007-03-12 49.82763 49.90311 49.67049 49.74033
2007-03-13 49.69628 49.70863 49.37924 49.37924
2007-03-14 49.36270 49.53735 49.30746 49.53735
2007-03-15 49.57374 49.62310 49.39876 49.49600
2007-03-16 49.44900 49.65285 49.42416 49.59500
2007-03-17 49.55666 49.55666 49.33564 49.34714
2007-03-18 49.29778 49.67857 49.29778 49.65463
2007-03-19 49.62747 49.65407 49.51604 49.54590
2007-03-20 49.59529 49.62003 49.42321 49.50690
2007-03-21 49.49765 49.53961 49.41610 49.51807
2007-03-22 49.42306 49.42306 49.31184 49.39687
2007-03-23 49.27281 49.27281 48.93095 48.93095
2007-03-24 48.86635 48.86635 48.52684 48.52684
2007-03-25 48.50649 48.50649 48.33409 48.33973
2007-03-26 48.34210 48.44637 48.28969 48.28969
2007-03-27 48.25248 48.41572 48.23648 48.30851
2007-03-28 48.33090 48.53595 48.33090 48.53595
2007-03-29 48.59236 48.69988 48.57432 48.69988
2007-03-30 48.74562 49.00218 48.74562 48.93546
2007-03-31 48.95616 49.09728 48.95616 48.97490
2007-04-01 48.94407 48.97816 48.80962 48.87032
2007-04-02 48.90488 49.08400 48.90488 49.06316
2007-04-03 49.06071 49.24525 48.96928 49.24525
2007-04-04 49.22579 49.37335 49.19913 49.34736
2007-04-05 49.41435 49.41435 49.30641 49.33776
2007-04-06 49.33621 49.41900 49.33621 49.41900
2007-04-07 49.45170 49.60950 49.45170 49.53819
2007-04-08 49.54338 49.58968 49.41806 49.41806
2007-04-09 49.44429 49.50234 49.33828 49.50234
2007-04-10 49.55704 49.78776 49.55704 49.76984
2007-04-11 49.74550 49.81925 49.74550 49.74623
2007-04-12 49.75079 49.75470 49.61732 49.72996
2007-04-13 49.70708 49.85332 49.69245 49.73339
2007-04-14 49.74154 49.77340 49.70159 49.75552
2007-04-15 49.74707 49.79341 49.66299 49.70942
2007-04-16 49.74915 49.86289 49.71091 49.83886
2007-04-17 49.84698 49.95456 49.77754 49.95456
2007-04-18 49.93794 50.07208 49.92484 50.07208
2007-04-19 50.02441 50.02991 49.83945 49.83945
2007-04-20 49.76042 49.92847 49.69808 49.91103
2007-04-21 49.98954 50.20123 49.98954 50.20123
2007-04-22 50.31203 50.33781 50.24788 50.32556
2007-04-23 50.32009 50.32009 49.87574 49.88539
2007-04-24 49.87340 49.90184 49.72769 49.72769
2007-04-25 49.73385 49.88622 49.73385 49.88472
2007-04-26 49.89064 49.89064 49.74899 49.79201
2007-04-27 49.80530 49.80530 49.50814 49.50814
2007-04-28 49.54688 49.55497 49.29186 49.29186
2007-04-29 49.30289 49.30289 49.05676 49.13529
2007-04-30 49.13825 49.33974 49.11500 49.33974
2007-05-01 49.34572 49.52635 49.34572 49.47138
2007-05-02 49.47062 49.47062 49.34261 49.38521
2007-05-03 49.46328 49.69097 49.46328 49.58677
2007-05-04 49.59963 49.59963 49.41375 49.41375
2007-05-05 49.38428 49.40266 49.10310 49.10310
2007-05-06 49.16606 49.45999 49.16606 49.45999
2007-05-07 49.49188 49.49188 49.13572 49.13572
2007-05-08 49.13282 49.25507 49.13282 49.18930
2007-05-09 49.17739 49.17739 48.72708 48.72708
2007-05-10 48.83479 48.84549 48.38001 48.38001
2007-05-11 48.25456 48.25456 47.96904 47.96904
2007-05-12 47.96813 48.03286 47.89262 48.01935
2007-05-13 48.05550 48.05550 47.66209 47.66209
2007-05-14 47.64469 47.72505 47.58212 47.65930
2007-05-15 47.60647 47.74053 47.51796 47.72686
2007-05-16 47.72065 47.90717 47.70913 47.86683
2007-05-17 47.79430 47.79430 47.55140 47.62938
2007-05-18 47.65013 47.75117 47.65013 47.68423
2007-05-19 47.65552 47.77986 47.60536 47.60536
2007-05-20 47.56210 47.93085 47.56210 47.93085
2007-05-21 47.96582 48.02903 47.78072 47.78072
2007-05-22 47.81830 47.94825 47.81155 47.82946
2007-05-23 47.93593 48.08242 47.88763 47.90068
2007-05-24 47.89041 48.03077 47.88413 48.01130
2007-05-25 47.98234 48.17543 47.94507 48.16058
2007-05-26 48.14521 48.14521 47.92649 47.99613
2007-05-27 48.01018 48.02166 47.90193 47.90193
2007-05-28 47.90142 47.93398 47.64718 47.64718
2007-05-29 47.65665 47.89342 47.65446 47.87252
2007-05-30 47.78866 47.93267 47.78866 47.83291
2007-05-31 47.82845 47.84044 47.73780 47.73780
2007-06-01 47.74432 47.74432 47.54820 47.65123
2007-06-02 47.60223 47.74542 47.56796 47.72569
2007-06-03 47.71215 47.71215 47.50198 47.50198
2007-06-04 47.51516 47.53545 47.32342 47.37642
2007-06-05 47.41090 47.48217 47.21116 47.22930
2007-06-06 47.36581 47.41233 47.23306 47.40048
2007-06-07 47.42099 47.50637 47.35320 47.45262
2007-06-08 47.48449 47.53089 47.42814 47.48360
2007-06-09 47.38669 47.74770 47.38669 47.74770
2007-06-10 47.74899 47.74899 47.28685 47.28685
2007-06-11 47.27807 47.30884 47.14660 47.14660
2007-06-12 47.19411 47.41834 47.18153 47.41834
2007-06-13 47.46135 47.52004 47.43083 47.43083
2007-06-14 47.43279 47.43279 47.33490 47.34884
2007-06-15 47.33306 47.40490 47.26157 47.36779
2007-06-16 47.36452 47.40463 47.26056 47.26056
2007-06-17 47.24783 47.47249 47.24783 47.39521
2007-06-18 47.43470 47.56336 47.36424 47.36424
2007-06-19 47.46055 47.73353 47.46055 47.67220
2007-06-20 47.71126 47.81759 47.66843 47.66843
2007-06-21 47.71012 47.71012 47.61106 47.62921
2007-06-22 47.56849 47.59266 47.32549 47.32549
2007-06-23 47.22873 47.24771 47.09144 47.24771
2007-06-24 47.23996 47.30287 47.20932 47.22764
2007-06-25 47.20471 47.42772 47.13405 47.42772
2007-06-26 47.44300 47.61611 47.44300 47.61611
2007-06-27 47.62323 47.71673 47.60015 47.62769
2007-06-28 47.67604 47.70460 47.57241 47.60716
2007-06-29 47.63629 47.77563 47.61733 47.66471
2007-06-30 47.67468 47.94127 47.67468 47.76719

Packages

  • tsibble: is a time series tibble.
    • preserves time indices and make heterogeneous data structures possible
    • compatible with tidyverse family of packages
    • contains a key and index columns
    • the key can be multivariate
    • works nicely with lubridate: accept multiple time formatting
    • introduced in 2020.

      Wang, Cook and Hyndman 2020

tsibble

Create time series data

From scratch

library(tsibble)
y <- tsibble(
  Year = 2015:2019,
  Observation = c(123, 39, 78, 52, 110),
  index = Year
)
y
# A tsibble: 5 x 2 [1Y]
   Year Observation
  <int>       <dbl>
1  2015         123
2  2016          39
3  2017          78
4  2018          52
5  2019         110

Or from an existing data frame

y <- tibble(
  Year = 2015:2019,
  Observation = c(123, 39, 78, 52, 110)
)

y <- y |> as_tsibble(index = Year)
y
# A tsibble: 5 x 2 [1Y]
   Year Observation
  <int>       <dbl>
1  2015         123
2  2016          39
3  2017          78
4  2018          52
5  2019         110

tsibble

From a csv file. For a tsibble to be valid, it requires a unique index for each combination of keys. The tsibble() or as_tsibble() function will return an error if this is not true.

prison <- readr::read_csv("https://OTexts.com/fpp3/extrafiles/prison_population.csv")
prison <- prison %>%
    # you need to declare the time object
    mutate(Quarter = yearquarter(Date)) %>%
    select(-Date) %>%
    # note that the key can be multivariate
    as_tsibble(key = c(State, Gender, Legal, Indigenous),
               index = Quarter)

prison
# A tsibble: 3,072 x 6 [1Q]
# Key:       State, Gender, Legal, Indigenous [64]
   State Gender Legal    Indigenous Count Quarter
   <chr> <chr>  <chr>    <chr>      <dbl>   <qtr>
 1 ACT   Female Remanded ATSI           0 2005 Q1
 2 ACT   Female Remanded ATSI           1 2005 Q2
 3 ACT   Female Remanded ATSI           0 2005 Q3
 4 ACT   Female Remanded ATSI           0 2005 Q4
 5 ACT   Female Remanded ATSI           1 2006 Q1
 6 ACT   Female Remanded ATSI           1 2006 Q2
 7 ACT   Female Remanded ATSI           1 2006 Q3
 8 ACT   Female Remanded ATSI           0 2006 Q4
 9 ACT   Female Remanded ATSI           0 2007 Q1
10 ACT   Female Remanded ATSI           1 2007 Q2
# ℹ 3,062 more rows

Plotting

library(tsibbledata)
olympic_running |> 
    ggplot(aes(Year, Time)) + 
    geom_line(aes(color = Sex)) + 
    facet_wrap(~Length, 
               scales = "free_y", 
               ncol = 4) +
    theme_light(base_size = 12) +
    theme(legend.position = c(0.9,0.1))

plot(sample.xts)

Common tasks

  • Trend long-term increase or decrease (change of direction) in the data
  • Seasonal patterns repeat by a fixed time (periodic)
  • Cyclic patterns do not have fixed frequency

Example: energy demand

library(tsibbledata)
library(fpp3)
library(patchwork)
library(feasts)
a <- vic_elec %>% gg_season(Demand, period = "week") +
    scale_colour_gradient(breaks = NULL, labels = NULL) +
  theme(legend.position = "none") +
  labs(y="MWh", title="Daily")

b <- vic_elec %>% gg_season(Demand, period = "week") +
    scale_colour_gradient(breaks = NULL, labels = NULL) +
  theme(legend.position = "none") +
  labs(y="MWh", title="Weekly")

c <- vic_elec %>% gg_season(Demand, period = "year") +
    scale_colour_gradient(breaks = NULL, labels = NULL) +
  labs(y="MWh", title="Annual")

(a+b+c) + plot_layout(guides= "collect") & theme_light(base_size = 14)  

Lags

recent_production <- aus_production %>%
  filter(year(Quarter) >= 2000) 
recent_production |> 
    ggplot(aes(Quarter, Beer)) +
    geom_line() +
    theme_grey(base_size = 15)

recent_production %>%
  gg_lag(Beer, geom = "point") +
  labs(x = "lag(Beer, k)") +
    theme_grey(base_size = 15)

Autocorrelation

Measures the linear relationship between lagged values of time series

recent_production %>% ACF(Beer, lag_max = 9)
# A tsibble: 9 x 2 [1Q]
       lag      acf
  <cf_lag>    <dbl>
1       1Q -0.0530 
2       2Q -0.758  
3       3Q -0.0262 
4       4Q  0.802  
5       5Q -0.0775 
6       6Q -0.657  
7       7Q  0.00119
8       8Q  0.707  
9       9Q -0.0888 

ACF is the autocorrelation function and it is useful to study the autocorrelation structure of the data

recent_production %>%
  ACF(Beer) %>%
  autoplot() + labs(title="Australian beer production") + theme_light(base_size = 15)

De-trending

Decompose a time series into its seasonal, trend, and remainder parts:

yt=St+Tt+Rt or yt=St∗Tt∗Rt

Choice depends of case, but note that:

yt=St∗Tt∗Rt is equivalent to logyt=logSt+logTt+logRt

us_retail_employment <- us_employment %>%
  filter(year(Month) >= 1990, Title == "Retail Trade") %>%
  select(-Series_ID)
autoplot(us_retail_employment, Employed) +
  labs(y = "Persons (thousands)",
       title = "Total employment in US retail") + theme_light(base_size = 15)

De-trending

Decompose a time series into its seasonal, trend, and remainder parts:

dcmp <- us_retail_employment %>%
    # seasonal trend decomposition using loess
    model(stl = STL(Employed))
components(dcmp)
# A dable: 357 x 7 [1M]
# Key:     .model [1]
# :        Employed = trend + season_year + remainder
   .model    Month Employed  trend season_year remainder season_adjust
   <chr>     <mth>    <dbl>  <dbl>       <dbl>     <dbl>         <dbl>
 1 stl    1990 Jan   13256. 13288.      -33.0      0.836        13289.
 2 stl    1990 Feb   12966. 13269.     -258.     -44.6          13224.
 3 stl    1990 Mar   12938. 13250.     -290.     -22.1          13228.
 4 stl    1990 Apr   13012. 13231.     -220.       1.05         13232.
 5 stl    1990 May   13108. 13211.     -114.      11.3          13223.
 6 stl    1990 Jun   13183. 13192.      -24.3     15.5          13207.
 7 stl    1990 Jul   13170. 13172.      -23.2     21.6          13193.
 8 stl    1990 Aug   13160. 13151.       -9.52    17.8          13169.
 9 stl    1990 Sep   13113. 13131.      -39.5     22.0          13153.
10 stl    1990 Oct   13185. 13110.       61.6     13.2          13124.
# ℹ 347 more rows
components(dcmp) %>% autoplot()

There are multiple ways of detrending time series, your choice will depend on the problem and data at hand, and the relevant time scale to your question.

Rolling windows

swe_exports <- global_economy |> 
    filter(Country == "Sweden") |> 
    mutate(
        `5-MA` = slider::slide_dbl(Exports, mean,
                .before = 2, .after=2, .complete = TRUE),
        `7-MA` = slider::slide_dbl(Exports, mean,
                .before = 3, .after=3, .complete = TRUE),
        `9-MA` = slider::slide_dbl(Exports, mean,
                .before = 4, .after=4, .complete = TRUE)
    )
swe_exports
# A tsibble: 58 x 12 [1Y]
# Key:       Country [1]
   Country Code   Year        GDP Growth   CPI Imports Exports Population `5-MA`
   <fct>   <fct> <dbl>      <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>  <dbl>
 1 Sweden  SWE    1960    1.48e10  NA     9.21    23.4    23.0    7484656   NA  
 2 Sweden  SWE    1961    1.61e10   5.68  9.41    21.7    22.3    7519998   NA  
 3 Sweden  SWE    1962    1.75e10   4.26  9.86    21.4    21.9    7561588   22.3
 4 Sweden  SWE    1963    1.90e10   5.33 10.1     21.5    21.9    7604328   22.1
 5 Sweden  SWE    1964    2.11e10   6.82 10.5     21.9    22.3    7661354   21.9
 6 Sweden  SWE    1965    2.33e10   3.82 11.0     22.5    21.9    7733853   21.7
 7 Sweden  SWE    1966    2.53e10   2.09 11.7     21.9    21.4    7807797   21.7
 8 Sweden  SWE    1967    2.75e10   3.37 12.2     21.0    21.1    7867931   21.8
 9 Sweden  SWE    1968    2.91e10   3.64 12.5     21.6    21.6    7912273   21.9
10 Sweden  SWE    1969    3.16e10   5.01 12.8     23.0    22.8    7968072   22.2
# ℹ 48 more rows
# ℹ 2 more variables: `7-MA` <dbl>, `9-MA` <dbl>
swe_exports |> 
    pivot_longer(cols = ends_with("MA"), names_to = "window", values_to = "MA") |> 
    ggplot(aes(Year, Exports)) +
    geom_line(col="black") +
    geom_line(aes(Year, MA), color = "orange") +
    facet_wrap(~window, ncol=1) + theme_grey(base_size = 15)

Case study

Where are regime shifts likely to occur?

Depends on our ability to observe and measure resilience

Data & methods

  • Terrestrial:
    - Gross primary productivity (2001:2018) - Ecosystem respiration (2001:2018) - Leaf area index (1994:2017)
  • Marine:
    - Chlorophyll A (1998:2018)
  • >1M pixels, weekly obs, 0.25 degree grid resolution

Critical slowing down

  • ↑ Variance and autocorrelation
  • Δ skewness and kurtosis

Dakos et al. 2012. PLoS ONE; Kéfi et al. 2014. PLoS ONE

Critical speeding up

  • ↓ Variance and autocorrelation

Titus & Watson 2020 J Theor Ecol

Fractal dimension

  • ↑ adaptive capacity
  • Measure of self-similarity across scales

West, Geoffrey. 2017. Scale; Gneiting et al. 2012. Statistical Science.

Analysis: one pixel

The generic resilience indicators do not necessarily align with critical slowing down or speeding up theories

Rocha, JC. 2022. Ecosystems are showing symptoms of resilience loss. ERL

Detection

In the absence of ground truth, if Δ is > 95% or < 5% of the distribution is considered a signal of resilience loss

~30% of ecosystem show symptoms of resilience loss, boreal forest and tundra particularly strong signals

~25% of ecosystem show symptoms of resilience loss, Easter Indo-Pacific and Tropical Eastern Pacific Oceans particularly strong signals

Advanced topics

  • Time series regression models
  • Panel models (econometrics)
  • ARIMA and friends
  • Dynamic regression models (MARSS)
  • Forecasting
  • Machine learning for time series (recursive neural networks)
  • Spectral analysis
  • State space models

Further reading

  • Hyndman, R & Athanasopoulos G. 2021. Forecasting: principles and practice. OTexts [*]

  • Peng, R. 2020. A very short course on time series analysis

  • Foley, M. 2021. Time Sries Analysis

Exercises

Vector exercises

Low-hanging fruit

Challenge

  1. Use your PhD data to calculate early warning indicators of resilience loss
  2. De-trend the data
  3. Use a rolling window to calculate changes in variance and autocorrelation
  4. What do you think, is your system lossing or gaining resilience?

https://juanrocha.se/presentations/r_timeseries

Intro to time series in R Applications to resilience assesments Juan C. Rocha

  1. Slides

  2. Tools

  3. Close
  • Intro to time series in R
  • Time series
  • Time series task force
  • Packages
  • Packages
  • Packages
  • Packages
  • tsibble
  • tsibble
  • Plotting
  • Common tasks
  • Example: energy demand
  • Lags
  • Autocorrelation
  • De-trending
  • De-trending
  • Rolling windows
  • Case study
  • Where are regime shifts likely to occur?
  • Data & methods
  • Analysis: one pixel
  • Detection
  • ~30% of ecosystem...
  • ~25% of ecosystem...
  • Advanced topics
  • Further reading
  • Exercises
  • Vector exercises
  • f Fullscreen
  • s Speaker View
  • o Slide Overview
  • e PDF Export Mode
  • b Toggle Chalkboard
  • c Toggle Notes Canvas
  • d Download Drawings
  • ? Keyboard Help