## Trend assessment – technical information

This webpage provides a summary of how we determined the methodologies for assessing the trends presented in *Environmental indicators Te taiao Aotearoa*. It is intended for a technical audience familiar with statistical trend assessments. It outlines the key methods we used and the characteristics of the trends.

Stats NZ calculated most of the trend assessments on our highest quality data best suited to this type of analysis. Data providers analysed the trends for river, lake, and groundwater quality; coastal sea-level rise; and primary productivity. See the technical reports in the Ministry for the Environment’s data service for details of these trend assessments. The data providers used methods that are generally comparable with those used by Stats NZ.

## Selecting the methodologies

There is no ‘one size fits all’ approach to trend assessment, so we used a range of techniques.

### Inspecting the data

For all indicators, we first visually inspected the data to assess their distribution. This involved fitting a variety of parametric models and non-parametric descriptors to assess linearity, autocorrelation, residual normality, and variance structure. The techniques we used for visual interpretation included non-parametric slope estimation (Theil-Sen estimation), ordinary least squares (OLS) regression, piecewise regression, locally weighted scatterplot smoothing (LOESS), and moving average time series decomposition. We assessed OLS model assumptions (eg normality, autocorrelation, and heteroscedasticity) to help us choose between parametric and non-parametric statistical trend tests. We reviewed the literature and consulted externally to help us decide which test might be appropriate for a particular indicator.

Where necessary, we imputed missing values and applied appropriate transformations to satisfy OLS model assumptions. For example, where we applied OLS regression to a regular time series featuring a seasonal pattern, we first de-seasonalised the series using classical seasonal decomposition by moving averages (see figure 1). This removed extraneous seasonal variability that would have otherwise reduced our ability to recover the trend signal.
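As an illustration of the de-seasonalising step, the sketch below applies classical decomposition by moving averages to a regular series. It assumes an additive seasonal pattern and a known number of observations per year (`period`); both are assumptions made for this example, and the function names are hypothetical rather than taken from the analysis code.

```python
from statistics import mean


def centred_moving_average(series, period):
    """Centred moving average; None where the window does not fit."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight each
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[i] = total / period
    return trend


def deseasonalise(series, period):
    """Remove an additive seasonal component (assumed, not confirmed)."""
    trend = centred_moving_average(series, period)
    # Collect the detrended values for each position in the seasonal cycle
    by_season = [[] for _ in range(period)]
    for i, (value, level) in enumerate(zip(series, trend)):
        if level is not None:
            by_season[i % period].append(value - level)
    seasonal = [mean(values) for values in by_season]
    # Centre the seasonal effects so that they sum to zero over one cycle
    centre = mean(seasonal)
    seasonal = [s - centre for s in seasonal]
    return [value - seasonal[i % period] for i, value in enumerate(series)]


# Quarterly series: a linear trend plus a repeating seasonal pattern
pattern = [1, -1, 2, -2]
quarterly = [0.5 * i + pattern[i % 4] for i in range(16)]
adjusted = deseasonalise(quarterly, period=4)
```

In this constructed example the adjusted series recovers the underlying linear trend, which can then be passed to a trend test.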

### Determining trend direction

We used one of two main statistical approaches to determine the direction of the trend:

- non-parametric: ideal for shorter time series and avoids distributional assumptions (eg Mann-Kendall test, seasonal and non-seasonal Theil-Sen test)
- parametric: preferred for longer time series with checks for distribution (eg OLS, piecewise regression).

The method we applied to each indicator depended on these factors:

- number of data points available
- frequency of the data available
- extent to which seasonality was a likely factor influencing the trend
- validity of model assumptions.

We sought to determine the overall trend direction, and in cases where the trend was linear, the rate of change over time. We therefore based our trend conclusions on methods with easily interpretable results, such as the Theil-Sen test and OLS regression analysis. Figure 1 shows the general process we used for determining the trend method.

#### Figure 1

Notes:

1. Maximum missing value tolerance set at 25 percent.
2. Locally weighted scatterplot smoothing (LOESS) was also fitted to aid interpretation and to determine whether the assumption of linearity could be applied.

Trends may be monotonic and linear (eg OLS regression), monotonic and non-linear (eg Mann-Kendall), or non-monotonic and non-linear. Linearity tests check whether the change over time is consistent across all years. Monotonic trends can show variable rates of change but the direction of change is (statistically) the same.

### Level of application

Where possible, we assessed trends at the national level. Where trends were assessed on a site-by-site basis, we determined whether the indicator was showing an overall national change by using the exact binomial test. This tests whether the number of sites that show an increasing trend, as a percentage of the total number of sites surveyed, is further from 50 percent than would be expected by chance.
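The exact binomial test can be sketched as follows. This is a minimal two-sided version that assumes a null proportion of 50 percent, as described above; the function name and the site counts are hypothetical.

```python
from math import comb


def exact_binomial_p(increasing, total, p=0.5):
    """Two-sided exact binomial P-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed count under the null proportion p.
    """
    probs = [comb(total, k) * p**k * (1 - p) ** (total - k)
             for k in range(total + 1)]
    observed = probs[increasing]
    # Small relative tolerance guards against floating-point ties
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-7)))


# Hypothetical example: 8 of 10 sites show an increasing trend
p_value = exact_binomial_p(8, 10)
```

A small P-value suggests the share of sites with an increasing trend is further from 50 percent than chance alone would explain.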

We have presented trend findings for the longest available time frame to ensure the finding is as robust as possible, and to detect the long-term trend rather than short-term random or cyclical fluctuations.

We did not assess trends for indicators where:

- there were insufficient data points (eg land cover)
- the time series data were pooled to reflect the long-term average level (eg river flows)
- data quality issues (eg excessive non-sampling errors) may have affected the trend.

### Level of confidence

#### Before October 2017

We used 95 percent confidence intervals to establish an increasing, decreasing, or indeterminate trend. The upper and lower bounds of the interval represent the range within which we are 95 percent confident the actual trend lies. If the lower bound of the confidence interval is greater than zero, then we are confident (at the 95 percent level) that the trend is increasing. If the upper bound is less than zero, then we are confident that the trend is decreasing. If the confidence interval includes the value of zero, then we cannot determine the direction of the trend with 95 percent confidence.
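The decision rule can be sketched as below. The sketch assumes a slope estimate with a standard error and a normal approximation for the interval, and it treats a trend as decreasing when the whole interval lies below zero. These are assumptions for illustration; intervals in practice may come from a t-distribution or from bootstrap resampling, and the function names are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(slope, std_error, level=0.95):
    """Normal-approximation interval for a slope (an assumption here)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope - z * std_error, slope + z * std_error


def classify_trend(lower, upper):
    """Classify a trend from the bounds of its confidence interval."""
    if lower > 0:
        return "increasing"      # whole interval above zero
    if upper < 0:
        return "decreasing"      # whole interval below zero
    return "indeterminate"       # interval includes zero


# Hypothetical slope estimate and standard error
lower, upper = confidence_interval(slope=1.0, std_error=0.4)
direction = classify_trend(lower, upper)
```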

A statistic that is closely related to the confidence level is the P-value. The P-value is the probability of recording an increase or decrease at least as large as the one observed if there were no underlying trend. For example, a P-value less than 0.05 indicates that the trend is statistically significant at the 95 percent confidence level.

#### After October 2017

To determine trend direction, we employed the two one-sided test (TOST) procedure, recommended by McBride et al (2014), using the 95 percent confidence level (α = 0.05). This procedure results in one of the following outcomes:

- confidence that the trend is increasing
- confidence that the trend is decreasing
- there are insufficient data to be confident about trend direction.
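One plausible reading of the three outcomes above is sketched below. It assumes a slope estimate with a standard error and a normal approximation for the test statistic; it is not a reproduction of the procedure in McBride et al (2014), and the function name is hypothetical.

```python
from statistics import NormalDist


def tost_direction(slope, std_error, alpha=0.05):
    """Two one-sided tests on a slope (normal approximation assumed)."""
    z = slope / std_error
    # Test 1: null hypothesis that the slope is zero or negative
    p_increasing = 1 - NormalDist().cdf(z)
    # Test 2: null hypothesis that the slope is zero or positive
    p_decreasing = NormalDist().cdf(z)
    if p_increasing < alpha:
        return "increasing"
    if p_decreasing < alpha:
        return "decreasing"
    return "insufficient data"


# Hypothetical slope estimates and standard errors
outcomes = [tost_direction(1.0, 0.5),
            tost_direction(-1.0, 0.5),
            tost_direction(0.5, 0.5)]
```

Under this reading, failing both tests is not evidence of "no trend"; it only means the data cannot establish a direction at the chosen level.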

### Interpreting trend assessments

The trend assessments have these characteristics.

- They are statistically based. The use of 95 percent confidence levels may differ from those used in environmental management.
- They show the rate of change without reference to the condition/grade of the environment (ie good/fair/poor). This means the trends do not necessarily reflect biologically/ecologically significant changes occurring in the system, for example, that an environmental state has reached a ‘tipping point’.
- They do not always control for other factors which may be influencing the data, for example the effect of weather on PM10 concentrations. An exception to this is for river water quality trends in which measurements are flow adjusted. In some instances, seasonal or longer-term cyclical effects, such as the El Niño Southern Oscillation, may be influencing the trend estimate. In these and other cases, more data are required to assess the effects of multiple correlated factors.
- They may be affected by data limitations. For example, we can only perform the trend assessment on the data available to us. These data may be in the form of aggregated annual means or shorter than a longer-term cycle (see previous point).
- They relate to historical patterns in the data and do not necessarily imply that the trend will continue at the same rate of change.
- They assume there is no time-varying uncertainty (ie time-dependent bias) in the data (eg we assume that sea-surface temperature data are not affected by the replacement of an obsolete satellite by a new model).

## Key methods

This section summarises the methods we used to determine the trends presented in *Environmental indicators Te taiao Aotearoa*.

### Theil-Sen estimator

The Theil-Sen estimator is calculated as the median of the slopes generated by comparing all possible pairs of points in the data. For time-series data where seasonal patterns were present, we de-seasonalised or compared only pairs of points collected from the same time of year (ie the seasonal Theil-Sen estimator). P-values are calculated by resampling the data many times with replacement (‘bootstrap’ sampling; we used 1,000 bootstrap iterations in our analyses).

The Theil-Sen estimator is an unbiased estimator of the true slope in simple OLS regression, but is less sensitive to outliers. We made no assumptions regarding the distribution of the data. When only the sign of the Theil-Sen estimator is considered, the test is equivalent to the Mann-Kendall test for trend direction. Hence, in statements regarding trend assessments, ‘Mann-Kendall’ and ‘Theil-Sen’ (where slopes are not reported) are used interchangeably.

We only carried out the seasonal Theil-Sen test if we had at least three years of sub-annual data points, and the non-seasonal Theil-Sen test if at least 10 consecutive data points were available. The Mann-Kendall test was only carried out if there were at least six consecutive data points available.

Inference of Theil-Sen estimates may be affected by the presence of autocorrelation, and consensus is required on how to make such adjustments.
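The estimator described in this section can be sketched in a few lines. The code below computes the non-seasonal and seasonal slopes and a simple bootstrap summary. The bootstrap scheme (resampling whole observations and reporting the share of resampled slopes that are zero or negative) is an assumption for illustration, since the exact resampling scheme is not described here, and all function names are hypothetical. No adjustment for autocorrelation is attempted.

```python
import random
from itertools import combinations
from statistics import median


def theil_sen_slope(times, values):
    """Median of the slopes between all pairs of points."""
    points = list(zip(times, values))
    slopes = [(y2 - y1) / (t2 - t1)
              for (t1, y1), (t2, y2) in combinations(points, 2)
              if t2 != t1]
    return median(slopes)


def seasonal_theil_sen_slope(times, values, seasons):
    """As above, but only pairs from the same season are compared."""
    points = list(zip(times, values, seasons))
    slopes = [(y2 - y1) / (t2 - t1)
              for (t1, y1, s1), (t2, y2, s2) in combinations(points, 2)
              if s1 == s2 and t2 != t1]
    return median(slopes)


def bootstrap_share_non_positive(times, values, iterations=1000, seed=0):
    """Share of bootstrap slopes that are <= 0 (illustrative scheme)."""
    rng = random.Random(seed)
    points = list(zip(times, values))
    slopes = []
    for _ in range(iterations):
        sample = [rng.choice(points) for _ in points]
        ts, ys = zip(*sample)
        if len(set(ts)) > 1:  # need at least two distinct times
            slopes.append(theil_sen_slope(ts, ys))
    return sum(s <= 0 for s in slopes) / len(slopes)


# Ten annual points on a unit slope, with one large outlier at the end
years = list(range(10))
observed = [float(t) for t in years]
observed[-1] = 100.0
robust_slope = theil_sen_slope(years, observed)

# Three years of quarterly data with a repeating seasonal pattern
pattern = [3, -1, 0, -2]
quarter_times = [i / 4 for i in range(12)]
quarter_values = [2 * t + pattern[i % 4] for i, t in enumerate(quarter_times)]
seasonal_slope = seasonal_theil_sen_slope(
    quarter_times, quarter_values, [i % 4 for i in range(12)])
```

In the first example the single outlier leaves the median slope unchanged, which illustrates why the estimator is less sensitive to outliers than OLS.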

### OLS regression

This parametric test assesses the significance of the slope coefficient in a regression of the indicator of interest on time. To be valid, OLS regression requires normally distributed residuals, no heteroscedasticity or autocorrelation, and a linear relationship. The assumption of no autocorrelation is frequently violated in time series analysed using OLS regression, which affects statistical inference through underestimating the standard errors, and hence the confidence intervals. We required at least 10 consecutive data points to perform this test.
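The sketch below fits a simple OLS trend line and reports the slope, its standard error, and the Durbin-Watson statistic. The Durbin-Watson statistic is used here as one common check for first-order autocorrelation in the residuals; it is an illustrative choice rather than a diagnostic named in this document, and the function name is hypothetical.

```python
from math import sqrt


def ols_trend(times, values):
    """Slope, standard error, and Durbin-Watson statistic for y ~ time."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    residuals = [y - (intercept + slope * t) for t, y in zip(times, values)]
    rss = sum(e * e for e in residuals)
    # Standard error of the slope; valid only if the OLS assumptions hold
    std_error = sqrt(rss / (n - 2) / sxx)
    # Near 2 suggests little first-order autocorrelation; values towards
    # 0 or 4 suggest positive or negative autocorrelation respectively
    durbin_watson = sum((b - a) ** 2
                        for a, b in zip(residuals, residuals[1:])) / rss
    return slope, std_error, durbin_watson


# Ten consecutive points: a unit trend plus alternating noise
years = list(range(10))
observed = [t + (1 if t % 2 == 0 else -1) for t in years]
slope, std_error, durbin_watson = ols_trend(years, observed)
```

In this constructed example the alternating noise produces a Durbin-Watson statistic well above 2, the kind of result that would call the reported standard error into question.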

## References

McBride, G, Cole, RG, Westbrooke, I, & Jowett, I (2014). Assessing environmentally significant effects: a better strength-of-evidence than a single P value? *Environmental monitoring and assessment, 186*(5), 2729–2740.

Updated 27 October 2017