Data and methods

Data source: the Integrated Data Infrastructure

Statistics NZ developed the Integrated Data Infrastructure (IDI) as an environment in which to link multiple data sources in a systematic and secure way. It was developed to produce official statistics outputs and to allow Statistics NZ staff and external researchers to conduct policy evaluation and research on people’s transitions and outcomes. The IDI contains administrative and survey datasets, linked at the individual level. The IDI continues to change as new datasets are added.

This section describes the structure and content of the IDI in May 2015.

See figure 1 for the basic structure of the IDI.

Figure 1: Diagram showing the structure of the Integrated Data Infrastructure in May 2015.


Identifying a resident population in IDI

The IDI spine contains more than 9 million individuals, far more than the current New Zealand usual resident population of approximately 4.5 million. Many individuals in the IDI spine are former usual residents of New Zealand who have since left or died. It is therefore necessary to restrict the IDI spine population to the subset of individuals who are usual residents of New Zealand at a given date (for the purposes of this paper, the date was census day, 5 March 2013).

The method used to select the usually resident population relies on the identification of activity in New Zealand administrative systems that would indicate an individual’s presence in New Zealand over a period prior to the reference date. Individuals who leave the population by death or outmigration prior to the reference date are removed.

Specifically, the method used to identify the IDI resident population (IDI-ERP) for census day (5 March 2013) was as follows:

  • For ages 5 and over, the spine population was restricted to those individuals who had activity in one of the following IDI datasets in the 12 months prior to census day:
    • ACC claims
    • Inland Revenue tax (employer monthly summary of tax paid at source, or annual tax return data; receipt of taxable benefit payments is included)
    • Health (pharmaceutical prescriptions, GP enrolment, hospital admissions, non-admission hospital visits)
    • Education (school enrolment, tertiary enrolment or attainment)
  • For ages under 5, having a record in the spine was sufficient for inclusion in the population. There was no additional requirement of activity in the 12 months prior to census day.
  • Linked death records were used to identify individuals with a date of death prior to census day. These individuals were removed from the population.
  • Linked migration data were used to identify individuals who had moved overseas. These individuals were removed from the population. Individuals were classified as having moved overseas if they were overseas on the reference date and the total length of time they spent overseas was at least 10 of the 12 months spanning census day (that is, the six months either side of census day).
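
The rules above can be read as a single inclusion test per individual. The following Python sketch is illustrative only and is not the code used to build the IDI-ERP. The `Person` record and its field names are hypothetical, a full date of birth is assumed (the IDI holds month and year only), the 12-month window is approximated as 365 days, and activity on census day itself is assumed not to count.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional

CENSUS_DAY = date(2013, 3, 5)  # reference date used in this paper


@dataclass
class Person:
    """Hypothetical flattened view of one spine individual."""
    birth_date: date  # assumption: the IDI itself holds month and year only
    # Dates of any ACC, Inland Revenue, health, or education activity.
    activity_dates: List[date] = field(default_factory=list)
    death_date: Optional[date] = None
    overseas_on_ref_date: bool = False
    # Months spent overseas in the 12 months spanning the reference date.
    months_overseas_in_window: float = 0.0


def age_on(ref: date, birth: date) -> int:
    """Age in completed years on the reference date."""
    return ref.year - birth.year - ((ref.month, ref.day) < (birth.month, birth.day))


def is_resident(p: Person, ref: date = CENSUS_DAY) -> bool:
    # Remove individuals who died before the reference date.
    if p.death_date is not None and p.death_date < ref:
        return False
    # Remove individuals classified as having moved overseas.
    if p.overseas_on_ref_date and p.months_overseas_in_window >= 10:
        return False
    # Under 5: a spine record is sufficient, no activity test.
    if age_on(ref, p.birth_date) < 5:
        return True
    # Ages 5 and over: require activity in the 12 months before the reference date.
    window_start = ref - timedelta(days=365)
    return any(window_start <= d < ref for d in p.activity_dates)
```

For example, `is_resident(Person(date(1980, 6, 1), activity_dates=[date(2012, 10, 1)]))` passes the activity test, while the same person with no recorded activity does not.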


Location information in IDI

In May 2015 there were five data sources in the IDI that contained information about where individuals live. Not all addresses in these data sources represent where an individual actually lives. In some cases, addresses may represent postal or contact addresses provided to the administrative data supplier (Statistics NZ, 2013). For all of the data sources, address information was geocoded to a meshblock where possible. For Primary Health Organisation (PHO) data, address updates were available up until the end of November 2012. For all other sources, address updates were available at least up until census night (5 March 2013).

The five sources of address information were:

  • Inland Revenue (IR): The address history table is a summary table in IDI containing a history of address updates (coded to meshblock level) for each individual in IDI. Almost all meshblocks in the address history table (99 percent) come from Inland Revenue data (Gibb & Shrosbree, 2014) and represent the meshblock for the contact address supplied to Inland Revenue. The remaining meshblocks (less than 1 percent) come from postcodes in student loan data, with a meshblock being randomly assigned from within the postcode.
  • National Health Index (NHI): meshblock of residence as recorded when visiting hospital or outpatient clinic.
  • Primary Health Organisation (PHO): meshblock of residence as recorded when visiting a general practitioner.
  • Ministry of Social Development (MSD): meshblock of residence as reported when applying for a working age benefit (not including superannuation).
  • Ministry of Education (MOE): meshblock of residence as reported when enrolling at primary or secondary school (but not tertiary education).

Timestamps in each of these sources were used to select the most recently updated meshblock from each source prior to the reference date. Meshblocks were converted to area units and territorial authorities using standard meshblock concordances from Statistics New Zealand.
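
As an illustration of this selection step, the sketch below keeps, for each source, the most recently updated meshblock dated before the reference date, and then looks up higher geographies in a concordance. It is a simplified example and not the IDI implementation. The function names, the tuple layout, and the concordance dictionary are hypothetical, and treating "prior to" as strictly before the reference date is an assumption.

```python
from datetime import date

REFERENCE_DATE = date(2013, 3, 5)  # census night


def latest_meshblock_per_source(updates, reference_date=REFERENCE_DATE):
    """Return {source: meshblock} using the latest update before the reference date.

    `updates` is an iterable of (source, updated_on, meshblock) tuples
    for a single individual.
    """
    latest = {}
    for source, updated_on, meshblock in updates:
        # Assumption: updates on or after the reference date are ignored.
        if updated_on >= reference_date:
            continue
        if source not in latest or updated_on > latest[source][0]:
            latest[source] = (updated_on, meshblock)
    return {source: mb for source, (_, mb) in latest.items()}


def to_higher_geography(meshblock, concordance):
    """Look up (area unit, territorial authority) for a meshblock.

    `concordance` stands in for a standard meshblock concordance;
    returns None when the meshblock is not listed.
    """
    return concordance.get(meshblock)


# Made-up meshblock codes, for illustration only.
example_updates = [
    ("IR", date(2012, 1, 10), "MB-A"),
    ("IR", date(2013, 2, 1), "MB-B"),
    ("IR", date(2013, 4, 1), "MB-C"),  # after census night, so ignored
    ("MOE", date(2011, 2, 1), "MB-D"),
]
example_result = latest_meshblock_per_source(example_updates)
```

Keeping one meshblock per source, rather than one per individual, preserves the ability to compare each source against the census separately.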


Linking census data and the IDI

The Census of Population and Dwellings is the official count of people and dwellings in New Zealand. It provides a snapshot of New Zealand at a point in time, and measures social and economic change in New Zealand. The latest census was held in March 2013.

The census aims to count everyone who is in New Zealand on census night. Overseas visitors are included in the census, while New Zealand residents who are not in New Zealand on census night are not included.

To enable individual-level comparisons between the geographic information in the IDI and the geographic information in the census, the census must be linked to the IDI at the individual level. This link was created by Census Transformation in May 2015 (Statistics NZ, 2014c). The linking was done for the purpose of better understanding the coverage and quality of census information in the IDI, and the linked data was only available to approved Statistics NZ staff working on the Census Transformation programme. The linking method used for this paper differed slightly from that being used to link the 2013 Census to the IDI spine for the September 2015 IDI refresh.

The census was linked to the spine of the IDI in the May 2015 refresh. Linking was completed in Quality Stage using probabilistic matching techniques. The variables used in the linkage process were full name, date of birth, sex, meshblock of usual residence, and country of birth. Overall, 94 percent of census records were linked to the IDI. The match rate was higher for NZ usual residents than for overseas visitors, and much higher for individuals who had used e-forms (98 percent linked) than for those who had used paper forms (93 percent linked).
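
To give a sense of how probabilistic matching scores a candidate pair, the sketch below uses one common formulation, Fellegi-Sunter agreement weights, over the five linking variables named above. It is not the configuration used for this linkage, which is not described here. The m and u probabilities, the dictionary keys, and the function name are hypothetical values chosen purely for illustration.

```python
import math

# Hypothetical (m, u) probabilities per linking variable:
#   m = P(values agree | records belong to the same person)
#   u = P(values agree | records belong to different people)
FIELD_PROBABILITIES = {
    "full_name": (0.95, 0.001),
    "date_of_birth": (0.97, 0.003),
    "sex": (0.99, 0.5),
    "meshblock": (0.80, 0.0001),
    "country_of_birth": (0.95, 0.3),
}


def match_weight(record_a, record_b):
    """Sum of log2 agreement/disagreement weights over the linking variables.

    Variables missing from either record contribute nothing to the score.
    """
    total = 0.0
    for name, (m, u) in FIELD_PROBABILITIES.items():
        a, b = record_a.get(name), record_b.get(name)
        if a is None or b is None:
            continue
        if a == b:
            total += math.log2(m / u)  # agreement adds evidence for a match
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total
```

In this formulation, agreement on a highly discriminating variable such as meshblock adds far more weight than agreement on sex, and pairs are accepted as links when the total weight exceeds a chosen cutoff.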

Populations for comparison

To enable better comparison of the IDI-ERP and census populations, the following adjustments were made to the populations:

  • overseas visitors were removed from the census population 
  • residents temporarily overseas on census night (RTOs) were removed from the IDI-ERP population 
  • babies born in March 2013 were removed from both populations (only month and year of birth were available in IDI, so it was not possible to distinguish babies born before 5 March from those born after).

Within these adjusted populations, individual-level comparisons can only be made for individuals in the IDI-ERP who could be linked to their census record and who had a meshblock recorded in both the census and the IDI (n=3,787,700). The population available for individual-level comparisons represented 91 percent of the census population and 86 percent of the IDI-ERP. Unless otherwise stated, all comparisons in this paper are based on the linked population. The reference date used for individual-level comparisons was census night, 5 March 2013.
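
The two coverage percentages imply approximate sizes for the adjusted census and IDI-ERP populations. The sketch below back-calculates a plausible range for each, assuming the published percentages were rounded to the nearest whole percent. The function name is hypothetical, and the results are rough implied bounds, not published totals.

```python
LINKED_POPULATION = 3_787_700  # linked population with a meshblock in both sources


def implied_total_range(linked, rounded_percent):
    """Range of totals consistent with a coverage percentage.

    Assumes the percentage was rounded to the nearest whole percent,
    so the true share lies within half a percentage point of it.
    """
    low_share = (rounded_percent - 0.5) / 100
    high_share = (rounded_percent + 0.5) / 100
    # A larger share implies a smaller total, and vice versa.
    return linked / high_share, linked / low_share


census_low, census_high = implied_total_range(LINKED_POPULATION, 91)
erp_low, erp_high = implied_total_range(LINKED_POPULATION, 86)
```

The lower coverage of the IDI-ERP (86 percent against 91 percent) implies that the adjusted IDI-ERP is the larger of the two comparison populations.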
