Results

Coverage of geographic information in the IDI

Table 1 shows the coverage of geographic information in the IDI. For each source of geographic information, the table shows the ages best covered by the data source, and the percentage of individuals in the IDI-ERP within those target ages who had a meshblock recorded in that data source.

Coverage of geographic information varied between different data sources. Health and Inland Revenue sources had the highest coverage, as most of the population has had some contact with these agencies and has, at some point, had an address recorded.

While most individuals in the target ages have had contact with the Ministry of Education (via a school enrolment), not all individuals have an address recorded, so coverage was low. The MSD benefits source had low coverage, because only a small proportion of the target population have had contact with MSD. Overall, when the data sources were combined, they had very good coverage of the population, with 99 percent of people having a meshblock recorded in at least one of the data sources.
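The coverage calculation described above can be sketched as follows. This is a minimal illustration, not the method Statistics New Zealand used: the record layout, the source keys (nhi, moe, msd), and the sample people are all hypothetical.

```python
# Hypothetical person records: age plus the meshblock (or None) held by each source.
people = [
    {"age": 8,  "meshblocks": {"nhi": "MB001", "moe": None}},
    {"age": 12, "meshblocks": {"nhi": None, "moe": "MB002"}},
    {"age": 40, "meshblocks": {"nhi": "MB003", "msd": None}},
    {"age": 70, "meshblocks": {"nhi": None}},
]


def coverage(people, source, min_age=0, max_age=200):
    """Percentage of people in the target ages with a meshblock from `source`."""
    target = [p for p in people if min_age <= p["age"] <= max_age]
    if not target:
        return None
    covered = sum(p["meshblocks"].get(source) is not None for p in target)
    return 100.0 * covered / len(target)


def any_source_coverage(people):
    """Percentage of people with a meshblock in at least one source."""
    covered = sum(
        any(m is not None for m in p["meshblocks"].values()) for p in people
    )
    return 100.0 * covered / len(people)


print(coverage(people, "nhi"))           # all ages
print(coverage(people, "moe", 6, 15))    # school ages only
print(coverage(people, "msd", 18, 64))   # working ages only
print(any_source_coverage(people))
```

Restricting the denominator to the target ages is what makes the education and benefit figures comparable with the all-ages sources.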

Table 1

Coverage of meshblock information in IDI data sources

Source                Ages (years)    % with meshblock information
Health (NHI)          all             89
Health (PHO)          all             83
Inland Revenue        all             93
MOE(1) (education)    6–15            48
MSD(2) (benefits)     18–64           14
Any source            all             99

1. Ministry of Education.
2. Ministry of Social Development.
Source: Statistics New Zealand


Comparison of IDI and census meshblocks

Using the linked IDI and census dataset it was possible to compare the meshblock recorded for an individual in the IDI (as at census day, 5 March 2013) with the meshblock recorded for that same individual in the census. Census meshblocks have a high level of accuracy so can provide a reliable indication of the quality of the IDI meshblocks.

Table 2 shows the percentage of people who had the same geographic information recorded in IDI and census for three geographic levels: meshblock; area unit; and territorial authority (TA). The table shows that different sources had different levels of agreement with the census. The health sources (NHI and PHO) had the highest levels of agreement, with more than 70 percent of IDI meshblocks and more than 90 percent of territorial authorities being the same as in the census. The lowest levels of agreement were for MSD benefits, with 57 percent of IDI meshblocks and 85 percent of territorial authorities being the same as in the census. The different levels of agreement for different data sources may be due to differences in the frequency of contact and address updating procedures at different agencies. For example, many individuals do not have regular contact with Inland Revenue, so they may not update Inland Revenue when their address changes.

Table 2

Percentage of people with the same geographic information in IDI and census

                      % of non-missing that are same as census
Source                Meshblock    Area unit    TA(1)
Health (NHI)          75           78           92
Health (PHO)          71           74           90
MOE(2) (education)    67           73           92
Inland Revenue        63           68           89
MSD(3) (benefits)     57           62           85

1. Territorial authority.
2. Ministry of Education.
3. Ministry of Social Development.
Source: Statistics New Zealand

For all sources there was greater agreement between the census and the IDI at the TA level than at the area unit or meshblock level. This is because, although some individuals move house and do not update their address in administrative data sources, they often remain in the same TA, but not in the same meshblock or area unit.
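The agreement rates at the three geographic levels can be sketched as below. The concordance from meshblock to area unit and territorial authority, and the sample pairs, are hypothetical; the sketch assumes each meshblock belongs to exactly one area unit and one TA, which is why agreement can only rise as the geography gets coarser.

```python
# Hypothetical concordance: meshblock -> (area unit, territorial authority).
concordance = {
    "MB001": ("AU10", "TA1"),
    "MB002": ("AU10", "TA1"),
    "MB003": ("AU20", "TA1"),
    "MB004": ("AU30", "TA2"),
}

# Hypothetical (IDI meshblock, census meshblock) pairs for linked individuals.
pairs = [
    ("MB001", "MB001"),  # same meshblock
    ("MB001", "MB002"),  # moved within the same area unit
    ("MB001", "MB003"),  # moved within the same TA
    ("MB001", "MB004"),  # moved to a different TA
    (None, "MB001"),     # no IDI meshblock: excluded from the denominator
]


def agreement(pairs, concordance):
    """Percentage of non-missing pairs that match at each geographic level."""
    non_missing = [(i, c) for i, c in pairs if i is not None and c is not None]
    if not non_missing:
        return {}
    counts = {"meshblock": 0, "area_unit": 0, "ta": 0}
    for idi_mb, census_mb in non_missing:
        idi_au, idi_ta = concordance[idi_mb]
        census_au, census_ta = concordance[census_mb]
        counts["meshblock"] += idi_mb == census_mb
        counts["area_unit"] += idi_au == census_au
        counts["ta"] += idi_ta == census_ta
    n = len(non_missing)
    return {level: 100.0 * k / n for level, k in counts.items()}


rates = agreement(pairs, concordance)
print(rates)
```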

A given individual may have a meshblock recorded in several different IDI data sources. These meshblocks may differ, for example if an individual has updated their address with some agencies but not others. It is therefore necessary to find a way to combine the geographic information from different sources and select the ‘best’ meshblock for each individual at any given date.

Table 3 shows the agreement between IDI and census geographic information for two simple methods of combining the information from different sources. The first is a ‘prioritised’ method in which the meshblock sources are ranked according to their agreement with census and then the meshblock from the highest-ranked available source is selected. The second is a ‘most recent’ method in which the meshblock that was updated most recently is selected.
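The two selection methods can be sketched as follows. This is an illustration under stated assumptions, not the production rules: the record layout and source keys are hypothetical, and the priority order shown is assumed from the meshblock agreement rates in Table 2, since the exact ranking is not given here.

```python
from datetime import date

# Assumed ranking, highest agreement with census first (order taken from Table 2).
PRIORITY = ["nhi", "pho", "moe", "ird", "msd"]

# Hypothetical address records for one individual.
records = [
    {"source": "ird", "meshblock": "MB200", "updated": date(2012, 11, 1)},
    {"source": "nhi", "meshblock": "MB100", "updated": date(2010, 6, 15)},
    {"source": "msd", "meshblock": "MB300", "updated": date(2013, 6, 1)},
]


def available(records, as_at):
    """Records with a meshblock that were already on file at the reference date."""
    return [
        r for r in records
        if r["meshblock"] is not None and r["updated"] <= as_at
    ]


def select_prioritised(records, as_at, priority=PRIORITY):
    """Meshblock from the highest-ranked source that has one."""
    rank = {source: i for i, source in enumerate(priority)}
    candidates = [r for r in available(records, as_at) if r["source"] in rank]
    if not candidates:
        return None
    return min(candidates, key=lambda r: rank[r["source"]])["meshblock"]


def select_most_recent(records, as_at):
    """Meshblock from whichever source was updated most recently."""
    candidates = available(records, as_at)
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["updated"])["meshblock"]


census_day = date(2013, 3, 5)
print(select_prioritised(records, census_day))
print(select_most_recent(records, census_day))
```

In this example the two methods disagree: the prioritised method returns the older NHI meshblock, while the most recent method returns the Inland Revenue meshblock updated a few months before census day. The MSD record is ignored by both because it was updated after the reference date.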

Table 3 shows that the meshblocks selected using the ‘most recent’ method were more likely to agree with census meshblocks than those selected with the ‘prioritised’ method. Almost 80 percent of IDI meshblocks and 94 percent of IDI territorial authorities were the same as in census when the ‘most recent’ method was used, compared to 70 percent and 90 percent using the prioritised method. When meshblocks were selected using the ‘most recent’ meshblock method, around 46 percent of the meshblocks selected came from the NHI health data, 35 percent from Inland Revenue, 13 percent from PHO health, 5 percent from Education, and 2 percent from MSD working age benefits.

Table 3

Percentage of people with the same geographic information in the IDI and the census
By method used to combine meshblocks from different sources

                                    % of non-missing that are same as census
Method for combining information    Meshblock    Area unit    TA(1)
Prioritised                         70           73           90
Most recent                         79           82           94

1. Territorial authority.
Source: Statistics New Zealand

Further analysis revealed that, overall, 84 percent of individuals had a meshblock recorded in at least one IDI data source that was the same as their census meshblock. This provides an upper limit for the agreement rate that could be obtained by using a set of rules to select the ‘best’ meshblock in the IDI. The upper limit for area units was 86 percent and for territorial authorities it was 95 percent. For territorial authorities, the results from the ‘most recent’ selection method (94 percent) are close to the upper limit, suggesting that refining the method for selecting the ‘best’ meshblock would improve territorial authority agreement by only a small amount. For area units and meshblocks, it may be possible to get a slightly larger increase in agreement (up to 5 percentage points) by refining the selection method.
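The upper limit can be sketched as the share of people for whom any source holds the census meshblock, compared with the share for whom the selected meshblock matches. The record layout and sample values are hypothetical, and the sketch assumes every person has a census meshblock.

```python
# Hypothetical linked records: census meshblock, every IDI source meshblock,
# and the one meshblock chosen by some selection rule.
people = [
    {"census": "MB1", "idi": {"nhi": "MB1", "ird": "MB9"}, "selected": "MB9"},
    {"census": "MB2", "idi": {"nhi": "MB2"}, "selected": "MB2"},
    {"census": "MB3", "idi": {"ird": "MB7"}, "selected": "MB7"},
    {"census": "MB4", "idi": {"nhi": "MB4", "msd": "MB4"}, "selected": "MB4"},
]


def upper_limit(people):
    """Percentage with the census meshblock present in at least one IDI source."""
    hits = sum(p["census"] in p["idi"].values() for p in people)
    return 100.0 * hits / len(people)


def achieved(people):
    """Percentage whose selected meshblock equals the census meshblock."""
    hits = sum(p["selected"] == p["census"] for p in people)
    return 100.0 * hits / len(people)


# Headroom, in percentage points, left for a better selection rule.
headroom = upper_limit(people) - achieved(people)
print(upper_limit(people), achieved(people), headroom)
```

No selection rule can exceed the upper limit, because a person with no matching meshblock in any source cannot be matched whichever source is chosen.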

Figure 2 shows the proportion of individuals in the IDI-ERP who have the same meshblocks recorded in the IDI and the census, by age and sex. The ‘most recent meshblock’ method has been used to select a meshblock for these individuals. The figure shows that agreement between IDI and census meshblocks is lowest in the young adult ages (approximately ages 15–30) compared to other ages. Young adults are more mobile than other age groups and therefore may be more likely to have an administrative address that is out of date. Overall, agreement between IDI and census meshblocks is lower for males than for females at most ages, with the exception of ages under 15 and over 75, where levels of agreement are similar for males and females.

Figure 2 

Graph showing percentage of linked IDI-census population with the same meshblocks in IDI and census.

Using geographic information to create households

An additional use of address information in the IDI is to create households. Individuals who are living at the same address can be grouped together to form a household. Creating households is a more demanding test of the quality of address data as a ‘correct’ household requires that all individuals in the household are registered at the correct address, and that no additional individuals are incorrectly registered to that address.

We took several steps to create households in the IDI. First, a single address was allocated to each individual in the IDI-ERP using the ‘most recent’ method described previously. Second, the geocoded address identifiers associated with these addresses were used to group individuals into households. Individuals with the same address identifier were considered to be in the same household.
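The grouping step can be sketched as below, assuming each individual has already been allocated a single address identifier. The person and address identifiers are hypothetical.

```python
from collections import defaultdict


def build_households(person_address):
    """Group people who share an address identifier into one household.

    `person_address` maps a person id to an address id, or to None when
    no address could be allocated; those people are left out.
    """
    households = defaultdict(set)
    for person, address in person_address.items():
        if address is not None:
            households[address].add(person)
    return dict(households)


# Hypothetical allocation of one address per person.
person_address = {"p1": "A1", "p2": "A1", "p3": "A2", "p4": None}
households = build_households(person_address)
print(households)
```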

To examine the quality of household information in the IDI we compared the set of individuals living in an IDI-ERP household with the set of individuals living in a census household. All of the households (addresses) identified in the IDI-ERP were also identified in the census.

Table 4 shows, for each IDI-ERP household size, the percentage of IDI-ERP households of that size that had the same household size in census, and the percentage that contained the exact same individuals in census. The population used for the household analysis in Table 4 was restricted to households where all household members (as specified in the IDI) had records in the census and the IDI, and those records were linked together. Households where one or more individuals were away from home on census night, or did not have an IDI address recorded, were excluded. Visitors (people who were in a household on census night but do not usually live there) were excluded from household counts. Census dwellings that were non-private (such as rest homes, boarding houses, university accommodation) were excluded from the analysis, as they are not considered to be ‘households’ in the census and as such they do not have a household size available.

Table 4 also shows that, overall, 55 percent of IDI-ERP households had the same household size in census, and 48 percent contained the same set of household members in census. Agreement between IDI-ERP and census household sizes was generally better for smaller households than for larger households.
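The household comparison can be sketched as follows: for each IDI-ERP household, look up the census household at the same address and check whether the size and the exact membership agree, tallied by IDI-ERP household size. The data structures and identifiers are hypothetical, and the exclusions described above (visitors, non-private dwellings, unlinked members) are assumed to have been applied already.

```python
def compare_households(idi, census):
    """Tally size and membership agreement by IDI household size.

    `idi` and `census` map an address id to the set of person ids there.
    """
    by_size = {}
    for address, members in idi.items():
        census_members = census.get(address, set())
        row = by_size.setdefault(
            len(members), {"n": 0, "same_size": 0, "same_members": 0}
        )
        row["n"] += 1
        row["same_size"] += len(members) == len(census_members)
        row["same_members"] += members == census_members
    return by_size


# Hypothetical households keyed by address id.
idi = {"A1": {"p1", "p2"}, "A2": {"p3"}, "A3": {"p4", "p5"}}
census = {"A1": {"p1", "p2"}, "A2": {"p3", "p6"}, "A3": {"p4", "p7"}}

result = compare_households(idi, census)
for size, row in sorted(result.items()):
    print(size, row)
```

Address A3 shows why the second measure is stricter: the household has two people in both sources, but not the same two people.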

Table 4

Comparison of household size and composition in the IDI and the census

IDI-ERP(1) household size    Number of IDI-ERP(1)    % with same household    % with same household
(number of people)           households              size in the census       members in the census
1                            261,300                 69                       64
2                            328,200                 70                       62
3                            212,900                 40                       32
4                            170,300                 50                       42
5                            90,000                  36                       29
6                            40,800                  24                       17
7                            18,900                  16                       10
8+                           20,600                  9                        5
Total                        1,142,900               55                       48

1. Estimated resident population, see method used to identify the IDI resident population.
Source: Statistics New Zealand

If household sizes in the IDI-ERP are not correct, this will have an impact on the household size distribution.

Table 5 shows the distribution of household size in the IDI-ERP compared to census. The population used for the IDI-ERP distribution in Table 5 was the full IDI-ERP, not the restricted population used in Table 4.

Table 5 also shows that, compared to the census, the IDI-ERP contained substantially fewer two-person households (416,100 in the IDI-ERP compared to 527,700 in the census). Compared to the census, the IDI-ERP contained substantially more large households (six people or more). This may be due to non-private dwellings (such as university accommodation, rest homes, boarding houses) being included in the IDI-ERP household count, but not in the census household count. 
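The ratio column in Table 5 is the IDI-ERP household count expressed as a percentage of the census count. A minimal sketch, using the two-person, 8+, and total figures reported in this section; the dictionary layout and function name are illustrative only.

```python
# Household counts reported in this section, keyed by household size.
idi_counts = {"2": 416_100, "8+": 33_300, "Total": 1_508_600}
census_counts = {"2": 527_700, "8+": 13_800, "Total": 1_549_900}


def as_percent_of_census(idi, census):
    """IDI-ERP households as a rounded percentage of census households."""
    return {
        size: round(100 * idi[size] / census[size])
        for size in idi
        if census.get(size)
    }


ratios = as_percent_of_census(idi_counts, census_counts)
print(ratios)
```

A value below 100 means the IDI-ERP has fewer households of that size than the census, and a value above 100 means it has more.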

Table 5

Household size distribution in the IDI and the census

Household size        Number of IDI-ERP(1)    Number of census    Number of IDI-ERP(1)
(number of people)    households              households          households as % of census
1                     368,600                 355,300             104
2                     416,100                 527,700             79
3                     273,800                 252,900             108
4                     216,200                 235,300             92
5                     117,700                 106,300             111
6                     55,700                  41,200              135
7                     27,200                  15,400              177
8+                    33,300                  13,800              241
Total                 1,508,600               1,549,900           97

1. Estimated resident population, see method used to identify the IDI resident population.
Source: Statistics New Zealand

 
