Iwi estimates using small domain estimation – technical documentation

The sections below summarise the methodology we used to produce iwi estimates from Te Kupenga, using small domain estimation.

What small domain estimates are and why we produce them

Te Kupenga surveyed 5,549 Māori in 2013 and the results now provide a wide range of measures around cultural well-being.

See Survey Methodology in Te Kupenga 2013 for details of the sample design for the survey.

Te Kupenga can produce useful estimates for large groups, but the survey is not large enough to produce reliable estimates for many smaller iwi, using our standard methods. We therefore used a technique called ‘small domain estimation’ (SDE) to produce these estimates.

SDE works by using a statistical model to estimate a relationship between a variable of interest in Te Kupenga (eg ability in te reo Māori) and the detailed information collected in the census about each iwi (eg the areas people from that iwi live, demographics of the iwi, and the proportion who can hold an everyday conversation in te reo). We use this relationship to produce estimates for an iwi with fewer than 180 respondents. We have already released detailed estimates from Te Kupenga for eight iwi with more than 180 respondents, and for some iwi groupings.

See Iwi statistics from Te Kupenga 2013 – tables for more details.

While we took great care when producing the model and its resulting estimates, there is always a chance that an iwi will have a different relationship between the census and Te Kupenga variables from the rest of the population – we won’t identify this because of the small sample size for that iwi. For example, an iwi may have implemented a te reo education initiative aimed at lifting te reo ability of people with good to moderate ability to a higher level. If this initiative was very successful, and the sample size in Te Kupenga was small, the increase in the number with good to very good te reo ability may not be picked up. However, these estimates represent our best estimate of iwi information, which we provide by combining Te Kupenga and census information. We hope they are useful.

As we do for our standard survey estimates, we give confidence intervals that estimate a ‘credible range’ in which we are 95 percent confident the true value will lie. The confidence interval assumes the underlying model is correct. It takes into account uncertainty that’s due to the number of people in the sample and the variability in the data.
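As an illustration only, an interval of this kind can be read off a set of posterior simulations as the range between the 2.5th and 97.5th percentiles. The sketch below assumes an equal-tailed interval with linear interpolation between sorted draws; the function name `credible_interval` is hypothetical and is not taken from the Te Kupenga production code.

```python
def credible_interval(draws, level=0.95):
    """Equal-tailed interval from a list of posterior draws (illustrative)."""
    xs = sorted(draws)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one draw")

    def quantile(q):
        # Linear interpolation between the two nearest sorted draws.
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        return xs[lo] * (1 - frac) + xs[hi] * frac

    tail = (1 - level) / 2
    return quantile(tail), quantile(1 - tail)
```

For example, `credible_interval(draws)` applied to the simulated values of an iwi proportion would return the lower and upper limits of a 95 percent interval, conditional on the model being correct.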

The sections below give further detail on how we produce the SDEs.

An overview of the model we use

We’ve produced small domain estimates (SDEs) for four cultural measures:

  • importance of being engaged in Māori culture
  • te reo Māori proficiency
  • whether visited ancestral marae in previous 12 months
  • ease of getting support with Māori cultural practices.

Each measure has discrete response options that have a natural ordering (eg rating te reo ability from low to high). We used an ordered multinomial logit model to model each measure. A separate model is fit for each outcome, but we used the same explanatory variables and model structure for each outcome.

The model assumes there is a continuous unobserved measure (η). This measure is turned into the discrete categories we are trying to estimate by defining cut-points, which define the range of η that corresponds to a particular response category.
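A minimal sketch of how cut-points turn η into category probabilities, assuming the common parametrisation in which the probability of being at or below category k is the inverse logit of (cut-point k minus η). The function names and the example cut-points are hypothetical, and this parametrisation is an assumption rather than a statement of the exact form used for Te Kupenga.

```python
import math


def inv_logit(x):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(eta, cutpoints):
    """Probabilities of K ordered categories given eta and K-1 increasing cut-points."""
    # Cumulative probabilities P(y <= k); the last category always has cumulative 1.
    cumulative = [inv_logit(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)  # P(y = k) = P(y <= k) - P(y <= k-1)
        previous = cum
    return probs
```

A larger η moves probability towards the higher categories, while the cut-points stay fixed across people.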

We implemented the model within a Hierarchical Bayesian framework using the statistical package Stan (retrieved 3 February 2016). This modelling framework allows us to pool information across iwi, so that estimates for each iwi are informed by the general pattern of variation in the outcome over variables such as age and sex, as well as by each iwi's own data.

We carried out inference using Markov chain Monte Carlo (MCMC) methods. The MCMC sampling comprised four chains. Each chain consisted of 5,000 simulations for the burn-in and produced 5,000 simulations for inference. We then thinned the resulting simulations to 1,000.
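The bookkeeping for the simulations can be sketched as follows. The text does not say whether the 1,000 thinned simulations are per chain or in total; this sketch assumes 1,000 in total, pooled evenly across chains, and the function name `thin_chains` is hypothetical.

```python
def thin_chains(chains, burn_in, target_total):
    """Drop burn-in from each chain, then keep evenly spaced draws."""
    kept = [chain[burn_in:] for chain in chains]
    pooled = sum(len(chain) for chain in kept)
    step = max(1, pooled // target_total)
    thinned = []
    for chain in kept:
        # Keep every `step`-th post-burn-in draw from this chain.
        thinned.extend(chain[step - 1::step])
    return thinned
```

With four chains of 10,000 simulations each and a burn-in of 5,000, the thinning step under this assumption is 20, which leaves 250 simulations per chain.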

Model specification

We produced iwi estimates in two steps.

  1. We fitted a model that predicts the probability of each response type, given a set of characteristics such as a person's age and the iwi they affiliate with.
  2. We weighted up the model’s predictions for each combination of the characteristics used in the model, by the proportion that each combination makes up for a particular iwi. We used counts from the 2013 Census to calculate the proportions. Because young adults (aged 15–30 years) tend to be under-represented in census counts, we adjusted the census counts by five-year age groups to bring the census proportions in line with the estimated resident population counts.
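Step two can be sketched as a weighted average of model predictions over the cells formed by the model characteristics. This is an illustration under stated assumptions: the cell keys, the function names, and the ratio-style age adjustment (scaling each cell by the estimated resident population total over the census total for its age group) are hypothetical and may differ from the adjustment actually applied.

```python
def adjust_counts_to_erp(cell_counts, census_age_totals, erp_age_totals):
    """Scale census cell counts so age-group totals match ERP (assumed ratio method)."""
    # Each cell key is a tuple whose first element is the age group.
    return {
        cell: count * erp_age_totals[cell[0]] / census_age_totals[cell[0]]
        for cell, count in cell_counts.items()
    }


def poststratify(cell_probs, cell_counts):
    """Weight per-cell category probabilities by each cell's share of the iwi."""
    total = sum(cell_counts.values())
    n_categories = len(next(iter(cell_probs.values())))
    estimate = [0.0] * n_categories
    for cell, count in cell_counts.items():
        weight = count / total  # proportion of the iwi in this cell
        for k, p in enumerate(cell_probs[cell]):
            estimate[k] += weight * p
    return estimate
```

In a Bayesian fit, a calculation like `poststratify` would be repeated for each posterior simulation, giving a distribution of iwi estimates rather than a single value.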

In step one, the probability of response category k (out of K possible responses) is modelled as:

Image, figure 1.


  • for a probability, Image, figure 2.
  • η is a continuous latent variable modelled by Image, figure 3.
  • Image, figure 4. represents the iwi random effect for iwi j. We give these effects a Student-t prior distribution with the degrees of freedom Image, figure 5. estimated from the data: Image, figure 6. Image, figure 7. We use this distribution to allow for situations where a small number of iwi are markedly different from the majority of iwi.
  • Image, figure 8. represents the PSU random effect for PSU p. We give these effects a normal prior distribution, centred on the mean of the stratum to which PSU p belongs: Image, figure 9. Image, figure 10.
  • We model stratum means with a normal distribution Image, figure 11.
  • X are person-level attributes from the census. These are sex, age, the number of people in the person’s census night dwelling who were eligible for selection (given an upper limit of 6), and whether the person indicated on their census form that they could hold an everyday conversation in te reo Māori. We give each of these four attributes an independent distribution of Image, figure 12. , g=1 to 4.
  • The parameters Image, figure 13. , and Image, figure 14. are given non-informative prior distributions of Uniform(0,3). Image, figure 15. is given a prior distribution of Uniform(0,20).
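Because the formulas in this section are only available as images, the block below is one plausible reading of the structure described in the bullets, not a transcription. All symbols are assumptions introduced for illustration: β for the coefficients on X, u for iwi effects, v for PSU effects, μ for stratum means, ν for the degrees of freedom, and the σ terms for scale parameters. The locations and scales shown are also assumptions.

```latex
\begin{aligned}
\eta_i &= x_i^{\top}\beta + u_{j[i]} + v_{p[i]}
  && \text{latent measure for person } i \text{ in iwi } j \text{ and PSU } p \\
u_j &\sim t_{\nu}(0,\ \sigma_u)
  && \text{iwi effects, Student-t with } \nu \text{ estimated from the data} \\
v_p &\sim \mathrm{Normal}(\mu_{s[p]},\ \sigma_v)
  && \text{PSU effects centred on the mean of stratum } s \\
\mu_s &\sim \mathrm{Normal}(0,\ \sigma_{\mu})
  && \text{stratum means}
\end{aligned}
```

Under this reading, the heavier tails of the Student-t distribution let a few iwi sit far from the others without inflating the spread assumed for the majority.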

We used 41 strata, which reflect the strata used in the sample design of Te Kupenga. The strata are defined by regional council area, whether the primary sampling unit (PSU) is urban or rural, and whether the PSU has a high or low proportion of Māori.

The number of people eligible for selection within a census night dwelling had six as an upper limit, due to the small number of dwellings with more than six eligible people. We used the following age groups: 15–19 years, 20–29, 30–39, 40–49, 50–59, 60–69, and 70 years and over (70+).
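The two recodes described above can be sketched as follows. The function names and the string labels for age groups are hypothetical; only the age bands and the upper limit of six come from the text.

```python
AGE_BANDS = [(15, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]


def age_group(age):
    """Map an age in years to the age groups used in the model."""
    if age < 15:
        raise ValueError("the age groups start at 15 years")
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "70+"


def capped_eligible(n_eligible, cap=6):
    """Apply the upper limit to the number of eligible people in a dwelling."""
    return min(n_eligible, cap)
```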

When there are K=2 response options, the formulation above reduces to logistic regression.
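To see the reduction, assume the common parametrisation in which the probability of being at or below category k is the inverse logit of a cut-point minus η (the cut-point symbol c is introduced here for illustration). With K=2 there is a single cut-point, and:

```latex
\begin{aligned}
\Pr(y = 2) &= 1 - \operatorname{logit}^{-1}(c_1 - \eta)
            = \operatorname{logit}^{-1}(\eta - c_1), \\
\operatorname{logit}\,\Pr(y = 2) &= \eta - c_1 .
\end{aligned}
```

This is a logistic regression in which the negative of the cut-point plays the role of the intercept.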

Model selection

We included age group, stratum, sex, number of eligible household members, and PSU in the model because they have a strong relationship to either the chance a person was selected into the sample, or the chance they responded. We included whether a person indicated they could hold an everyday conversation in te reo because of its strong relationship to the measures we are estimating. This variable, and the strata, are the most significant factors in the model. Strata include region, whether the person lives in a rural or urban area, and whether this area has a high or low proportion of Māori living there. In our preliminary analyses we looked at interaction terms and census income variables, using the lme package in the statistical software R, but we did not include these in the final models because they did not improve the model’s fit.

Model checking

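The primary model checks we completed were a comparison with standard survey estimates for large domains, posterior predictive checks, and a sensitivity analysis.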


We compared the estimates produced by the SDE model with the standard survey estimates for large domains. For large domains the estimates and confidence intervals should be very similar between the two methods. We did these checks by age group, sex, strata, region, and whether the person indicated (in the census) they could speak te reo. This check is useful to ensure we haven’t missed an important aspect of the weighting methodology, and to identify any parts of the population that the model may not fit well. The confidence intervals produced by the two methods were similar for the large domains we looked at. Generally the model reproduced the point estimates very well, though it reproduced the estimates for the 60+ age groups less well than those for other age groups.

Posterior predictive checks

We used repeated draws from the posterior distribution to simulate the observed sample. We compared the distribution of simulated iwi means against the observed sample. No systematic issues, such as over-shrinkage (where the estimates look to be pulled too close to the overall average to support the variation observed in the sample) were identified through this stage of checking.
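A simplified sketch of this kind of check for a single iwi, assuming a yes/no outcome rather than the ordered outcomes used in the actual models. The function name and its arguments are hypothetical. For each posterior draw of the iwi's probability, it simulates a sample of the same size as the observed one and records how often the simulated count is at least as large as the observed count.

```python
import random


def posterior_predictive_pvalue(observed_count, n, prob_draws, rng):
    """Share of simulated samples whose count is >= the observed count."""
    at_least = 0
    for p in prob_draws:
        # Simulate n yes/no responses using this posterior draw of the probability.
        simulated = sum(1 for _ in range(n) if rng.random() < p)
        if simulated >= observed_count:
            at_least += 1
    return at_least / len(prob_draws)
```

Values very close to 0 or 1 would suggest the model has trouble reproducing that iwi's sample, for example because its estimate has been pulled too far towards the overall average.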

Sensitivity analysis

We investigated the sensitivity of estimates to aspects of model specification. Sensitivity testing had three investigations.

  • Modelling iwi effects with a normal distribution, and with a Student-t distribution with the degrees of freedom set to 3. A small number of iwi had large iwi effects estimated – the purpose of these tests was to see if the results changed meaningfully as we changed the model assumptions. These changes had no significant impact on the estimates.
  • Fitting a wider prior distribution of Uniform(0,100) for Image, figure 16. , and Image, figure 17. . This slowed convergence of the MCMC simulation substantially but had little impact on the resulting estimates.
  • Fitting fixed effects for strata and age groups rather than the random effects specified above. Initially we were going to explore alternative age groupings but this was not required due to the small size of the age effects we estimated.

Ability to access cultural support

The response options for the measure 'Ability to access cultural support' differed slightly from the other measures – it had four ordinal response options (very easy to very difficult), but also the response option 'I do not need support'. This last response option is not clearly ordinal; around 4 percent of respondents used it. An alternative to modelling the data as ordinal would have been to first model the probability that an individual needed support, and then model ‘ability to access cultural support’ for those who did need support. However, this would have complicated the model substantially. Diagnostics on the simpler model, where we treated the additional response option as the highest ordinal category (ie above ‘very easy’), showed our model was performing reasonably well without introducing this additional step.

Multiple iwi

Of Te Kupenga respondents who affiliate with at least one iwi: 71 percent identified with only one iwi, 19 percent with two iwi, and 10 percent with three or more iwi. Because people can identify with more than one iwi, we looked at two options for fitting the model in step one (see ‘model specification’ section).

  • Using the first iwi each person listed on their census form.
  • Allowing each person to be repeated for each iwi they affiliated with. This increases the number of records by around 35 percent and we need an adjustment to ensure the resulting confidence intervals are not too narrow.

The estimates from these options were very similar, with any differences well within the confidence intervals. Therefore we used the first option, due to its simplicity. Under either option we use all iwi listed for each person, from the census, when weighting the model up for a particular iwi (step two in ‘model specification’).
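The two ways of preparing records for step one can be sketched as below. The record layout (a dictionary with an `id` and an ordered list of `iwi`) and the function names are hypothetical.

```python
def first_iwi_records(people):
    """Option one: one record per person, using the first iwi listed."""
    return [(person["id"], person["iwi"][0]) for person in people if person["iwi"]]


def repeated_records(people):
    """Option two: one record per person per iwi they affiliate with."""
    return [(person["id"], iwi) for person in people for iwi in person["iwi"]]
```

Under option two the same person contributes several records, so the records are not independent. This is why an adjustment would be needed to stop the confidence intervals from being too narrow.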

How we take the sample design into account

Two important aspects need to be taken into account when forming the SDEs.

  • Some population groups are over-represented by the respondents in the sample.
  • The different stages of sample selection.

Over- (or under-) representation in the sample

Over- (or under-) representation is due to the sample design being more heavily targeted at some parts of the population (eg those from smaller regions), and to some parts of the selected sample being harder to contact and therefore less likely to respond to the survey.

To ensure the SDEs represent the population, we include the variables used to target the sample design in the model, and key variables associated with how likely a person is to respond to the survey (eg age, sex, strata). The model predictions are weighted up, using data from the 2013 Census on the population characteristics of each iwi.

Stages of sample selection

Another aspect of the sample design is that individuals are not sampled directly, but go through different selection stages. The first stage consists of selecting areas that contain around 60 dwellings. These are the primary sampling units (PSUs). Individuals are then selected from within the chosen PSUs; finally we remove some individuals from selection if more than one person was selected from the same household.

Because people who live within the same PSU tend to have similar characteristics, this type of design leads to larger sampling errors than if people were selected directly (selecting areas first makes the survey more cost effective). We need to capture this aspect of the sample design in the model, otherwise the model will overstate the precision of the estimate. We do this by including a random effect for PSUs in the model. This parameter is then integrated out of the posterior distribution to generate the marginal effects of the other characteristics in the model. These steps result in the width of the confidence intervals increasing by around 30 percent.
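Integrating out a PSU effect can be illustrated with a simple Monte Carlo average for a yes/no outcome. This is a sketch under assumptions: the stratum mean is folded into the fixed part of η, the PSU effect is drawn from a normal distribution with a known standard deviation, and the function names are hypothetical. In the full model this averaging would be repeated for each posterior simulation.

```python
import math
import random


def inv_logit(x):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_probability(eta_fixed, psu_sd, n_draws, rng):
    """Average the predicted probability over simulated PSU effects."""
    total = 0.0
    for _ in range(n_draws):
        psu_effect = rng.gauss(0.0, psu_sd)  # a PSU not seen in the sample
        total += inv_logit(eta_fixed + psu_effect)
    return total / n_draws
```

Because the inverse logit is non-linear, the averaged probability generally differs from the probability evaluated at a PSU effect of zero, and the gap grows with the spread of the PSU effects.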

A third stage of selection consists of subsampling people selected within the same household. Because of this subsampling, people in larger households have a smaller chance of being selected. We therefore include the number of people eligible for the sample in the model. No attempt is made to adjust for any sampling efficiency gained due to household members being similar to each other. This may cause the resulting confidence intervals to be wider than necessary, but we expect the effect to be small.
