Four variables have been identified by the Census Transformation programme as essential census information requirements specific to Māori: Māori descent, Māori ethnicity, iwi, and te reo Māori (the Māori language). This paper summarises the availability and quality of administrative data sources for these critical census information needs. While the potential for a future census based on administrative sources provides the context for this investigation, the findings are relevant for other uses of the administrative data.

These four variables are collected by a number of government agencies. We have examined the statistical properties of government administrative data sources by considering three aspects: the concepts underlying the data collection, coverage, and measurement error.

The relevance of the data in the census context is determined by how close the concepts, definitions, and questions are to the statistical concept we wish to understand. Coverage tells us how much of the population we are able to obtain information for. Measurement error looks at the accuracy of the data that is collected.

Consistency with statistical standards

We considered consistency with the concepts, definitions, and guidelines of the statistical standards, and the use of the standard classifications for each variable. For ethnicity, descent, and iwi, we found that government agencies do now largely collect data for these variables in a way that is consistent with the key concepts of the standards. These developments are relatively recent and coincide with the development of the current standards from the mid-1990s. Older data is often not fit for purpose, and it is important that data collected before certain dates can be excluded from analysis.

However, we found marked variation in the questions used. Some agencies adhere closely to the question guidance given in the standard, and the question used by the census, while other agencies use a wide variety of form types and questions.


Coverage

The census provides information for people of all ages. Agencies, in contrast, collect information only from people who interact with their particular service. Only the Ministry of Health achieves high coverage of New Zealand residents across all ages; the coverage of other agencies is limited by the nature of their service and by the year from which data is available. Using linked data (as we have in the Integrated Data Infrastructure, or IDI) means that coverage gaps in one data source may be filled by other sources. Only ethnicity currently shows almost full coverage from the available administrative sources. A system of linked data requires that unique individuals can be identified in the source data and that linkages are accurate.
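
As a rough illustration of how linked sources can fill each other's coverage gaps, the Python sketch below computes coverage for each source separately and for all sources combined. The person identifiers, source names, and values are hypothetical and are not drawn from the IDI; the sketch also assumes that linkage has already produced one record per unique individual.

```python
# Hypothetical linked records: person id -> source -> recorded value.
# None means the source holds no value for that person.
# Source names and values are illustrative only, not real IDI data.
linked = {
    "p1": {"health": "Māori", "education": None, "births": None},
    "p2": {"health": None, "education": "Māori", "births": None},
    "p3": {"health": "European", "education": "European", "births": None},
    "p4": {"health": None, "education": None, "births": None},
}


def coverage_by_source(records):
    """Share of people with a non-missing value in each source."""
    sources = sorted({s for values in records.values() for s in values})
    total = len(records)
    return {
        s: sum(1 for values in records.values() if values.get(s) is not None) / total
        for s in sources
    }


def combined_coverage(records):
    """Share of people with a non-missing value in at least one source."""
    covered = sum(
        1
        for values in records.values()
        if any(v is not None for v in values.values())
    )
    return covered / len(records)


per_source = coverage_by_source(linked)
overall = combined_coverage(linked)
print(per_source)
print(overall)
```

In this toy data no single source covers more than half of the people, but the combined coverage is higher because the sources cover different individuals.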

Better methods need to be developed to deal with inconsistent or conflicting values when data from different sources is combined.
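
To make the problem concrete, the sketch below shows one very simple rule for conflicting values: take the value from the highest-ranked source that holds one, and flag the record if the sources disagree. This is only an illustrative baseline, not the method used or proposed in the paper; the source names, the ranking, and the yes/no values are assumptions.

```python
# Assumed ranking of sources, most trusted first.
# The order is illustrative only and would need to be justified by quality analysis.
SOURCE_RANK = ["births", "health", "education"]


def resolve(values, rank=SOURCE_RANK):
    """Return (value, source) from the highest-ranked source with a record.

    Returns (None, None) when no source holds a value.
    """
    for source in rank:
        value = values.get(source)
        if value is not None:
            return value, source
    return None, None


def has_conflict(values):
    """True if the non-missing values across sources are not all the same."""
    observed = {v for v in values.values() if v is not None}
    return len(observed) > 1


# Hypothetical record for a yes/no variable such as Māori descent.
person = {"births": None, "health": "Yes", "education": "No"}
chosen = resolve(person)
flagged = has_conflict(person)
print(chosen, flagged)
```

A ranking rule like this discards information from lower-ranked sources, and for a multiple-response variable such as ethnicity it could drop valid responses, which is one reason better methods are needed.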

Measurement error

Measurement error is considered by comparing the 2013 Census with the administrative sources, based on the linked Census-IDI dataset. We compare the aggregate results that would be obtained from the administrative sources with those from the census. We also compare the values recorded for an individual in the administrative sources against those recorded for the same individual in the census. This analysis produces a number of measures of consistency between the administrative data and the census. Strong consistency across all these measures suggests that data in both sources are accurate. Where there are differences, conclusions are less obvious. In some cases causes may be clear – for example: where multiple responses for ethnicity are not retained, response patterns are very different; the use of different questions seems likely to be a significant cause of different responses; large amounts of missing data suggest underlying problems with data collection.
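
The two kinds of comparison described above, aggregate and individual-level, can be sketched as follows. The linked pairs are hypothetical, and the measures shown (counts, agreement rate, missing rate) are simple stand-ins for the consistency measures used in the analysis, which are not specified here.

```python
# Hypothetical linked census-admin pairs for a yes/no variable such as Māori descent.
# Each tuple is (census_value, admin_value); None marks missing data.
pairs = [
    ("Yes", "Yes"),
    ("Yes", "No"),
    ("No", "No"),
    ("No", "No"),
    ("Yes", None),
    ("No", "Yes"),
]


def aggregate_counts(pairs, value="Yes"):
    """Aggregate comparison: count of `value` in the census and in the admin source."""
    census = sum(1 for c, _ in pairs if c == value)
    admin = sum(1 for _, a in pairs if a == value)
    return census, admin


def agreement_rate(pairs):
    """Individual comparison: share of people whose two values match,
    among those with a value in both sources."""
    both = [(c, a) for c, a in pairs if c is not None and a is not None]
    if not both:
        return None
    return sum(1 for c, a in both if c == a) / len(both)


def missing_rate(pairs):
    """Share of linked people with no value in the admin source."""
    return sum(1 for _, a in pairs if a is None) / len(pairs)


print(aggregate_counts(pairs))
print(agreement_rate(pairs))
print(missing_rate(pairs))
```

Looking at both levels matters because aggregate totals can be close even when many individuals are recorded differently in the two sources, since disagreements in opposite directions cancel out.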

However, no two sources will ever provide exactly the same responses. There is an underlying variation in people’s responses over time, and response depends on context. This is a feature of ethnicity data collection but is also seen in what might be considered more stable attributes such as descent. Inevitably, some errors are introduced during subsequent processing for both surveys and administrative sources, including data linkages. Of all the data sources examined, DIA’s birth registrations data since 1998 shows the highest consistency with the census. The level of agreement seen here may be as high as can be expected anywhere across the system.

Māori organisations' collection of statistical data

Statistics NZ will need to work in close partnership with iwi and other Māori organisations towards improving iwi information collected within government, and to establish a potential role for iwi registers in contributing to statistical information for and about Māori. McNally and Gleisner (2015) provide more detail on these issues.

Government collection of statistical data

There is a question about what kind of environment government agencies need to operate within in order to collect good statistical information. Registration of a birth is a legal requirement, and DIA has a centralised system of data collection. DIA collects ethnicity and descent for statistical purposes and has worked closely with Statistics NZ to ensure the statistical standard is followed and the quality of data is maintained over time. The registration questions are almost identical to the census questions.

In contrast, the challenges of gathering information from a large and widely dispersed collection system are revealed in the high levels of missing data and low consistency of some data sources compared with the census. This is despite the provision of good guidelines and the use of the standard classifications.

A more coordinated approach by government might rationalise data collection, concentrating resources to achieve high quality in a small number of agencies and allowing core demographic information to be shared between agencies. Collection of data about indigenous populations will only succeed where there is a genuine partnership between Māori organisations and government agencies.


While some administrative sources provide very good information for and about Māori, the lack of completeness and lower quality of other sources mean that administrative data cannot at present meet the essential information needs now met by the survey-based census. There is some promise that ethnicity and Māori descent could be provided in future censuses through linked administrative data sources, though for iwi information this is more uncertain. Key requirements are:

  • improvements to the quality of ethnicity data collected by government
  • a source of Māori descent information for adults (possibly through access to electoral roll data)
  • government to work in partnership with iwi to develop iwi information sources.

Te reo proficiency is not suitable for collection through administrative sources. Therefore te reo information will require continued survey collection. It seems likely that information about iwi will also need to be obtained through surveys for some time to come.
