Results

Results for each step of the construction of the IDI-ERP are shown in table 1. Almost half the members of the spine had no apparent activity in the 12 months prior to June 2013. Around 190,000 of those with some activity (40,300 deaths and 149,000 outmigrants) were then removed from the population because they had died or left the country. The number of deaths removed is larger than the approximately 30,000 deaths that typically occur in a calendar year. This is at least partly because some people with activity in 2012/13 are recorded as having died before that period. There may be genuine reasons for this apparent anomaly.

Table 1

Results for construction of the IDI-ERP

Number in IDI spine          9,074,000
Retained through activity    4,722,600
Deaths removed                  40,300
Outmigration removed           149,000
Total IDI-ERP                4,533,200
ERP                          4,442,100

Aggregate comparison against the ERP

The total national population obtained using the activity-based method described above was 4,533,200 as at 30 June 2013. This represents 102 percent of the ERP for the same date; in other words, the IDI-ERP is 2 percent higher than the ERP. This is outside the quality standard, which specifies that the total national population should be within 0.5 percent of the ERP.
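The comparison above is simple arithmetic and can be reproduced in a few lines. The following is a minimal sketch using the totals from table 1; the function and variable names are illustrative assumptions and are not part of the original method.

```python
# Totals taken from table 1 (as at 30 June 2013).
IDI_ERP_TOTAL = 4_533_200
ERP_TOTAL = 4_442_100

# Quality standard for the national total, in percent of the ERP.
NATIONAL_TOLERANCE_PCT = 0.5


def relative_difference_pct(estimate: int, benchmark: int) -> float:
    """Return the percentage by which `estimate` differs from `benchmark`."""
    return (estimate - benchmark) / benchmark * 100


diff_pct = relative_difference_pct(IDI_ERP_TOTAL, ERP_TOTAL)
meets_standard = abs(diff_pct) <= NATIONAL_TOLERANCE_PCT

print(f"IDI-ERP differs from the ERP by {diff_pct:.2f} percent")
print(f"Within the {NATIONAL_TOLERANCE_PCT} percent standard: {meets_standard}")
```

The difference works out to roughly 2 percent, which is about four times the 0.5 percent tolerance.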

The national IDI-ERP age-sex distribution pattern is largely similar to the ERP (see figure 3), suggesting that overall the approach to extracting the resident population from the administrative sources is working reasonably well.

Figure 3
National population distribution for IDI-ERP and ERP
By single year of age and sex
At 30 June 2013 

Graphs showing national population distribution for IDI-ERP and ERP, by single year of age, for males and for females, at 30 June 2013.

However, this presentation may conceal important differences between the two sources. The quality standards are expressed in terms of relative difference from the official ERP. Figure 4 shows the IDI-ERP as a percentage of the ERP, by five-year age group and sex. The figure also shows the quality standards: 90 percent of the estimates should be within 1.5 percent of the ERP (the dark grey shaded area) and all should be within 5 percent (the lighter grey area).

While some parts of the population have good coverage (for example, females aged 30 to 69), others have coverage that is outside the quality standards. Overall, 44 percent of the age-sex groups were within 1.5 percent of the ERP and 83 percent were within 5 percent of the ERP.
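The two-part quality standard described above (90 percent of estimates within 1.5 percent of the ERP, and all within 5 percent) could be checked along the following lines. This is a sketch only: the function name, the data layout, and the example values are assumptions made for illustration, and the example values are not the published figures.

```python
def assess_coverage(ratios_pct, tight=1.5, wide=5.0, tight_share=0.90):
    """Check age-sex estimates against the two-part quality standard.

    `ratios_pct` maps an (age_group, sex) key to the IDI-ERP expressed
    as a percentage of the ERP (100.0 means an exact match).
    """
    deviations = [abs(ratio - 100.0) for ratio in ratios_pct.values()]
    n = len(deviations)
    share_tight = sum(d <= tight for d in deviations) / n
    share_wide = sum(d <= wide for d in deviations) / n
    return {
        "share_within_tight": share_tight,
        "share_within_wide": share_wide,
        # Both conditions must hold for the standard to be met.
        "meets_standard": share_tight >= tight_share
        and all(d <= wide for d in deviations),
    }


# Hypothetical values for illustration only, not the published results.
example = {
    ("20-24", "M"): 108.0,
    ("20-24", "F"): 104.5,
    ("30-34", "F"): 100.8,
    ("65-69", "M"): 99.0,
}
result = assess_coverage(example)
print(result)
```

Applied to the real age-sex groups, the same check would report the 44 percent and 83 percent shares quoted above, both short of the standard.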

Where coverage was outside the quality standards, it was mostly overcoverage (where the IDI-ERP count is higher than the ERP) rather than undercoverage. The figure shows that overcoverage is greatest in the early adult ages (20–34 years), particularly for males. Possible reasons for this overcoverage are considered in the discussion section.

Figure 4 

Graph showing IDI-ERP as a percentage of ERP, by five-year age group and sex, at 30 June 2013.

Individual-level comparisons against census

The net coverage estimates from the aggregate comparison in figure 3 may conceal areas of overcoverage and undercoverage at the individual level. This section presents findings from the individual-level analysis of coverage in the IDI-ERP using the linked Census-IDI dataset.

Figure 5 shows the overlap between the census, IDI, and IDI-ERP populations. The figure is to scale and the size of the areas represents the relative size of the populations (Micallef & Rodgers, 2014).

A total of 3,805,700 individuals were in both the IDI-ERP and census populations (the overlapping striped and shaded grey areas in figure 5). This represents 86.8 percent of the IDI-ERP population and 93.5 percent of the census population.

Some individuals in the census population were not found in the IDI-ERP (the non-overlapping grey areas in figure 5). Of the individuals who were in the census population, 6.5 percent (n=264,200) were not found in the IDI-ERP. As these individuals were identified as usual residents in the census, but were not included in the IDI-ERP, they can be thought of as potential undercoverage in the IDI-ERP population. Some of these individuals (1.9 percent, n=77,300) were in the IDI, but had not been selected into the IDI-ERP population using the rules for defining a resident population from the IDI. The remainder (4.6 percent, n=186,900) were not found in the IDI at all.

There were also individuals who were in the IDI-ERP, but were not found in the census (the non-overlapping striped areas in figure 5). Of the individuals who were in the IDI-ERP, 13.2 percent (n=578,600) were not found in the census. As these individuals were included in the IDI-ERP population, but were not identified as usual residents in the census population, they can be thought of as potential overcoverage in the IDI-ERP population.
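The percentages quoted above follow from the three reported counts. The sketch below reproduces them, on the assumption that each population total is the sum of the overlap and the relevant non-overlapping group; the variable names are illustrative. Note that the IDI-ERP total implied this way differs from the 30 June total in table 1.

```python
# Counts reported in the text.
in_both = 3_805_700        # in both the IDI-ERP and the census
census_only = 264_200      # in the census, not found in the IDI-ERP
idi_erp_only = 578_600     # in the IDI-ERP, not found in the census

# Assumption: totals are the overlap plus the non-overlapping group.
census_total = in_both + census_only
idi_erp_total = in_both + idi_erp_only


def pct(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`, rounded to one decimal."""
    return round(part / whole * 100, 1)


overlap_of_idi_erp = pct(in_both, idi_erp_total)
overlap_of_census = pct(in_both, census_total)
undercoverage_pct = pct(census_only, census_total)
overcoverage_pct = pct(idi_erp_only, idi_erp_total)

print(overlap_of_idi_erp, overlap_of_census)
print(undercoverage_pct, overcoverage_pct)
```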

Figure 5
Overlap between the IDI, IDI-ERP, and census populations

Diagram showing the overlap between the IDI, IDI-ERP, and census populations.

This analysis suggests a first estimate of 6.5 percent undercoverage and 13.2 percent overcoverage in the IDI-ERP. However, we should be cautious about these estimates, as there are several reasons why an individual may be in the IDI-ERP but not in the census, or vice versa. Not all of these are ‘true’ undercoverage or overcoverage in the IDI-ERP. Other reasons for apparent coverage error include non-response in the census and linkage errors in the Census-IDI link.

Impact of census non-response

Census non-response contributes to apparent overcoverage in the IDI-ERP. Individuals who are part of the usual resident population but did not fill out a census form may appear in the IDI-ERP but not in the census. As these individuals are part of the resident population, their inclusion in the IDI-ERP is correct. Total census non-response due to substitutes and undercount is 7.1 percent (Statistics NZ, 2014b).

Census overcount contributes to apparent undercoverage in the IDI-ERP. Some individuals may be counted more than once in the census, or counted when they were not usual residents. Overcount was estimated to be less than 1 percent in the 2013 Census.

Impact of Census-IDI linkage errors

Linkage errors in the Census-IDI link are problematic because they inflate estimates of overcoverage and undercoverage in the IDI-ERP. If the records for an individual are not linked when they should have been (false negative link), the records for that individual will appear as two unlinked records – one in the census, and one in the IDI-ERP. The unlinked IDI-ERP record will be counted as IDI-ERP overcoverage (because it was in the IDI-ERP but not in the census). The unlinked record in the census will be counted as IDI-ERP undercoverage (because it was in the census but not in the IDI-ERP). Thus the false negative link will contribute both to apparent undercoverage and overcoverage in the IDI-ERP.

Conversely, if records for two different individuals are linked when they should not be (false positive link), the records will not be counted towards IDI-ERP undercoverage or overcoverage totals, when in fact they should have been. False positive and false negative linkage error may affect some population groups more than others.
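The effect of the two kinds of linkage error can be illustrated with a toy example. Everything here is hypothetical: the person identifiers, the link pairs, and the function name are invented for illustration and do not describe the actual Census-IDI linkage process.

```python
def apparent_coverage(census_ids, idi_erp_ids, links):
    """Return (apparent undercoverage, apparent overcoverage) as sets.

    `links` is a set of (census_id, idi_erp_id) pairs produced by linkage.
    Census records without a link count as apparent undercoverage;
    IDI-ERP records without a link count as apparent overcoverage.
    """
    linked_census = {census_id for census_id, _ in links}
    linked_idi_erp = {idi_id for _, idi_id in links}
    return census_ids - linked_census, idi_erp_ids - linked_idi_erp


# Hypothetical people: A, B, C are in both sources; D is only in the
# census and E is only in the IDI-ERP.
census = {"A", "B", "C", "D"}
idi_erp = {"A", "B", "C", "E"}
true_links = {("A", "A"), ("B", "B"), ("C", "C")}

# Perfect linkage: one genuine case of each kind.
perfect = apparent_coverage(census, idi_erp, true_links)

# False negative: the link for C is missed, so C is counted on both sides.
false_negative = apparent_coverage(census, idi_erp, true_links - {("C", "C")})

# False positive: D and E are wrongly linked, hiding both genuine cases.
false_positive = apparent_coverage(census, idi_erp, true_links | {("D", "E")})

print(perfect, false_negative, false_positive)
```

In this toy case, a single missed link adds one record to each of the undercoverage and overcoverage counts, while a single spurious link removes one from each.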

While the false positive linkage rate is estimated at less than 1 percent, it is difficult to distinguish between people in the census who were incorrectly missed in the linkage process and those genuinely not found in the IDI. The presence of linkage error in the Census-IDI link may lead us to incorrect conclusions about coverage of the IDI-ERP. Further work is needed to adjust for the effect of Census-IDI linkage errors before we have a better understanding of the undercoverage and overcoverage of the IDI-ERP.

An example of IDI-ERP undercoverage

One subset of the IDI-ERP undercoverage population is of particular interest and is less affected by these problems. Around 77,000 individuals were in the census population and were linked to the IDI, but were not included in the IDI-ERP because they did not meet the rules for defining the resident population. These individuals are likely to represent genuine undercoverage, as they were identified as residents in the census but were not included in the IDI-ERP.

Figure 6 shows, by five-year age group, the percentage of the linked Census-IDI population (the group of people who were found in both the IDI and the census) that was not included in the IDI-ERP. The figure shows that the percentage of people who were in the IDI but not in the IDI-ERP increases throughout adulthood. It reaches a peak at the 60–64-year age group, before declining.

For most age groups, the percentages are similar for males and females. The exceptions are the oldest age groups, where females are less likely than males to be selected into the IDI-ERP.

Figure 6

Graph showing percentage of census population linked to the IDI not in the IDI-ERP, by five-year age group and sex, at 5 March 2013.
