Using census data

How is census data different from survey data?

Census data is different from other survey data in a number of ways. The most important difference is that a census sets out to include information from every person in the country. Therefore, it is not subject to sampling errors that occur in other methods (see What are the possible sources of error?).

The census includes a broad range of topics providing good contextual information for individuals, families, and households, unlike other surveys, which have a narrower focus. However, in order to cover such a broad range of topics and maximise response rate, census questions are quick and simple and may not gather information in as much detail or in as much depth as other methods.

The population coverage of census means information is available for much smaller geographic areas – down to the meshblock and area unit levels (see Geographic definitions) – and for small population groups, for example ethnic groups. Sample surveys only cover a small proportion of the population (as outlined in chapter 3).

Respondents fill in the forms themselves. As with other self-administered questionnaires, this can lead to more truthful responses, because an interviewer cannot influence the respondent. However, respondents might not fill in the form correctly, which may lead to issues or errors with the data.

We advise you to understand the strengths and limitations of census data compared with other survey methods before deciding which to use.

To what geographic levels can I get census data?

Census data is available in two main ways:

  • as standard published outputs available from the Statistics NZ website
  • as customised data available on request from Statistics NZ's Client Services team.

The geographic levels available from these two sources are summarised in the table below.

Geographic levels available from census data
Geographic level                Standard published output   Customised request
Meshblock(1)                    X                           X
Area unit                       X                           X
Ward                                                        X
Territorial authority area      X                           X
Regional council area           X                           X
Urban area                                                  X
Statistical area                                            X
Regional council constituency                               X
Community board                                             X
Auckland local board area       X                           X
General electoral district      X                           X
Māori electoral district        X                           X
District health boards          X                           X
User defined(2)                                             X

1. The Meshblock dataset will be available in March 2014 for selected variables from the 2013, 2006, and 2001 Censuses, rebased to 2013 Census boundaries. These counts are at the highest level of each variable's classification. The Meshblock dataset also contains counts for area units, wards, territorial authority areas, and regional council areas.

2. Such as police districts, a radius from a specific point, or any combination of standard geographies.

In addition to standard products and customised requests, we may provide accredited researchers with access to microdata. Microdata is unit-record-level data, that is, data corresponding to information at the respondent level. We present all statistical data in a way that does not identify the particulars of a person, dwelling, or household. This means that the microdata is anonymised for use in the Data Laboratory (Data Lab) facilities in Statistics NZ's offices in Auckland, Wellington, and Christchurch, or through a secure remote access system. Applicants for Data Lab access follow an application process with strict eligibility criteria based on the requirements of the Statistics Act 1975. Find out more about access to the Data Lab and associated costs.

Which population count should I use?

For some output variables, data about individuals/people can be reported in two ways:

  • census usually resident population count
  • census night population count.

Most often, the census usually resident population count is used. This is the count of all people who usually live in an area of New Zealand and are present in New Zealand on census night. This count excludes visitors from overseas and residents who are temporarily overseas on census night. New Zealand residents who are away from their usual address on census night are allocated back to the area where they usually live and form part of the census usually resident population count of that area.

The census night population count is a count of all people present in a given area of New Zealand on census night. This count includes visitors from overseas who are in New Zealand on census night and people who usually live elsewhere in New Zealand, but excludes New Zealand residents who are temporarily overseas on census night.
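
The difference between the two counts can be sketched in a few lines of Python. This is only an illustration: the record layout (`usual_area`, `census_night_area`) and the five sample people are hypothetical and are not the census data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # Area of usual residence in New Zealand; None for visitors from overseas.
    usual_area: Optional[str]
    # Area where the person was on census night; None if they were overseas.
    census_night_area: Optional[str]


def usually_resident_count(people, area):
    """People who usually live in `area` and were in New Zealand on census night."""
    return sum(
        1 for p in people
        if p.usual_area == area and p.census_night_area is not None
    )


def census_night_count(people, area):
    """Everyone present in `area` on census night, wherever they usually live."""
    return sum(1 for p in people if p.census_night_area == area)


# Hypothetical records, for illustration only.
people = [
    Person("Wellington", "Wellington"),  # resident at home
    Person("Wellington", "Auckland"),    # resident away elsewhere in New Zealand
    Person("Wellington", None),          # resident temporarily overseas
    Person(None, "Wellington"),          # visitor from overseas
    Person("Auckland", "Wellington"),    # visitor from elsewhere in New Zealand
]

print(usually_resident_count(people, "Wellington"))  # 2
print(census_night_count(people, "Wellington"))      # 3
```

The resident who was in Auckland on census night is allocated back to Wellington in the usually resident count, while both visitors appear only in Wellington's census night count.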

What are the impacts of the longer-than-usual time period between the 2006 and 2013 Censuses?

There was a seven-year gap between the 2013 Census and the 2006 Census because the planned 2011 Census was cancelled due to the Canterbury earthquakes. Generally, New Zealand has had a five-year gap between censuses. For example, censuses were held every five years from 1951 to 2006, although on different dates in either March or April. Before 1951, the census was held on different dates throughout the year, and there were longer gaps than five years between 1926, 1936, 1945, and 1951.

History of the census in New Zealand has more about previous censuses.

You need to be aware of the irregular time period when interpreting results. Issues to be aware of include:

  • an inconsistent time series that means care is needed in comparing trends over time
  • the ‘usual residence five years ago’ indicator remains the same and so, unlike with previous censuses, does not align with the last census
  • there may appear to be greater changes in data due to the longer intercensal period.

Annualising intercensal changes (for example average annual change) is one approach to comparing periods of different length. You should consider adding a footnote to any tables or graphs you produce comparing the 2013 data to earlier periods to ensure your readers are aware of the longer gap.
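
As a rough sketch, annualising could look like the Python below. The counts are hypothetical, and the compound growth rate is one common way to annualise a percentage change; we do not state here which method was used in published tables.

```python
def average_annual_change(earlier_count, later_count, years):
    """Average change in the count per year over the period."""
    return (later_count - earlier_count) / years


def average_annual_percent_change(earlier_count, later_count, years):
    """Compound average annual growth rate, as a percentage (an assumed method)."""
    return ((later_count / earlier_count) ** (1 / years) - 1) * 100


# Hypothetical counts for one area at three censuses.
count_2001, count_2006, count_2013 = 1000, 1100, 1240

# The raw changes differ (100 versus 140), but the periods differ in length too.
print(average_annual_change(count_2001, count_2006, 5))  # 20.0
print(average_annual_change(count_2006, count_2013, 7))  # 20.0
```

In this made-up case the larger change between 2006 and 2013 comes entirely from the longer period: the average annual change is the same in both periods.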

2013 Census products, including graphs and tables with time series data, will have a footnote highlighting the seven-year gap between data points. Some 2013 Census tables include annualised change.

What is the difference between census counts, population estimates, and projections?

Between censuses, we prepare population estimates by age and sex to show changes annually. The estimates use the latest census data with adjustments for net census undercount, residents temporarily overseas on census night, and births, deaths, and migration since the last census. Population estimates are usually higher than the census usually resident population count because of the adjustments for net census undercount and residents who are temporarily overseas at the time of the census.
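
The arithmetic behind these adjustments can be sketched as follows. This is a simplified illustration with hypothetical figures; the function and parameter names are our own, and the actual estimation method involves more detail than is shown here.

```python
def estimated_resident_population(
    census_usually_resident,
    net_census_undercount,
    residents_temporarily_overseas,
    births,
    deaths,
    net_migration,
):
    """Combine the components named in the text into a population estimate."""
    # Adjust the census count for people the census did not count
    # and for residents who were overseas on census night.
    base = (
        census_usually_resident
        + net_census_undercount
        + residents_temporarily_overseas
    )
    # Update the base for population change since the census.
    return base + births - deaths + net_migration


# Hypothetical figures, for illustration only.
estimate = estimated_resident_population(
    census_usually_resident=100_000,
    net_census_undercount=2_400,
    residents_temporarily_overseas=1_800,
    births=1_500,
    deaths=700,
    net_migration=300,
)
print(estimate)  # 105300
```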

National population estimates are produced quarterly (reference dates at 31 March, 30 June, 30 September and 31 December) and provisional results are available within six weeks of the reference date.

Subnational population estimates are produced annually (reference date at 30 June) and are available in October. After each census, the population estimates for the preceding intercensal period are revised. For example, following the release of results from the 2013 Census and the 2013 Post-enumeration Survey, a new estimated resident population at 30 June 2013 is calculated. This new base is used to revise the previously published population estimates for the period 2006–13.

We also update and release population projections every two to three years to give an indication of the future size and composition of the population. Population projections are available at national and subnational levels by age, sex, and major ethnic group. We produce multiple projection series by using different combinations of assumptions about future births (fertility), deaths (mortality), and migration. Population projections use population estimates as a starting point or base.

Find out more about population measures including estimates and projections.

Why is the subject population important?

The subject population is the individuals, families, households, or dwellings to which variables apply.

When interpreting census data, it is important for users to know what subject population the data is based on, so that any inferences drawn from that data are restricted only to that population group and not generalised outside that population group. Census variables / topics by subject population lists the subject population(s) for each census variable.

2013 Census information by variable has more on variables and their subject populations.

What is the difference between a dwelling and a household?

A dwelling is any building or structure – or its parts – that is used, or intended to be used, for human habitation. Dwellings can be permanent or temporary and include structures such as houses, motels, hotels, prisons, motor homes, huts, and tents.

There can be more than one dwelling within a building; for example, in an apartment building each separate apartment or unit is considered a dwelling.

There are two types of dwellings:

  • private (for example houses, flats, or apartments)
  • non-private (for example hotels, hospitals, prisons).

'Dwellings under construction' includes all houses, flats, groups, or blocks of flats being built.

A household is either one person who usually resides alone, or two or more people who usually reside together and share facilities (such as for eating, cooking, or a living area; and bathroom and toilet) in a private dwelling. Included are people who were absent on census night but usually live in a particular dwelling and are members of that household, as long as they were reported as being absent by the reference person on the dwelling form.

Census collects information on families and households in private occupied dwellings. No family and household data is collected for non-private dwellings.

How are occupied and unoccupied dwellings defined?

A dwelling is classed as occupied if it was occupied at midnight on the night of the census, or at any time during the 12 hours after midnight on the night of the census.

A dwelling is defined as unoccupied if it was unoccupied at all times during the 12 hours after midnight on the night of the census and was suitable for habitation.

The count of occupied dwellings includes:

  • private occupied dwellings such as houses, flats, and apartments
  • non-private occupied dwellings such as hotels and hospitals.

What is an absentee?

An absentee is identified on the census dwelling form as someone who usually lives in a particular dwelling, but has not completed a census individual form there – because the person was elsewhere in New Zealand or overseas on census night. Such a person may have completed a census individual form elsewhere in New Zealand.

Included as absentees in the census are children away at boarding school, people away on business or holiday, in hospital, and so on.

Excluded are long-term hospital patients and tertiary (including university) students who live away from the dwelling for most of the year.

Absentees are only recorded in dwellings where a dwelling form was completed; therefore, there are no absentees recorded for unoccupied dwellings.

Statistics NZ uses information on absentees to work out the composition of households, for example the total number of absentees for each household and whether they were elsewhere in New Zealand or overseas.

The census can also tell us how long an absentee who is overseas on census night is away from New Zealand. A 'New Zealand resident temporarily overseas' is an absentee who is overseas and away from New Zealand for less than 12 months.

What are derived variables?

Some census output variables are created from responses to individual questions or from a combination of responses given to two or more questions on the census forms. These are called derived variables. For example:

  • age is derived from the census question on date of birth
  • years since arrival in New Zealand is derived from month and year first arrived in New Zealand
  • household composition is derived from the relationship to the reference person, absentees’ relationship to the reference person, and living arrangements.
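
As an illustration of a derived variable, age in completed years can be worked out from a date of birth and a reference date. This is a minimal sketch: the function is our own, and the reference date of 5 March 2013 (census day) is an assumption that does not appear in the text above.

```python
from datetime import date


def age_in_years(date_of_birth, reference_date):
    """Age in completed years on the reference date."""
    # Has the birthday already occurred (or is it today) in the reference year?
    had_birthday = (reference_date.month, reference_date.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return reference_date.year - date_of_birth.year - (0 if had_birthday else 1)


CENSUS_DAY = date(2013, 3, 5)  # assumed reference date

print(age_in_years(date(1998, 3, 5), CENSUS_DAY))  # 15
print(age_in_years(date(1998, 3, 6), CENSUS_DAY))  # 14
```

A one-day difference in the stated date of birth moves a person across the age-15 boundary, which shows how an error in an input variable carries through to the derived variable.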

Total family and household incomes are also derived. The following section details how these variables are worked out.

Derived variables are dependent on the quality of the input variables. Any errors or issues with the input variables are likely to affect the data quality of the derived variable, and the effect may be greater when two or more census questions feed into the derived variable. We comment on any issues or errors with derived variables in 2013 Census information by variable.

How are total family and household incomes worked out?

Total personal income received is the before-tax income of a person in the 12 months ended 31 March 2013. The information is collected as income bands rather than in actual dollars.

'Total family income' is derived by aggregating the total personal income of all members of the family nucleus who are aged 15 years and over. To calculate total family income, a representative income is worked out for each total personal income band. The representative value for each band is the median income (half are above and half below) of people in that band in the more detailed Household Economic Survey (HES). These representative values are then added together for the family members.

Household income is calculated in a similar way to family income, except that all people in the household who are aged 15 years and over are included in the calculation.
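
The derivation can be sketched as below. The band labels and representative values are hypothetical, not actual HES medians, and the sketch does not cover how family members with no stated income are handled.

```python
# Hypothetical representative (median) income for each band, in dollars.
REPRESENTATIVE_INCOME = {
    "$10,001-$15,000": 12_500,
    "$30,001-$35,000": 32_400,
    "$50,001-$60,000": 54_800,
}


def total_family_income(members):
    """Sum the representative incomes of members aged 15 years and over.

    `members` is an iterable of (age, income_band) pairs.
    """
    return sum(
        REPRESENTATIVE_INCOME[band]
        for age, band in members
        if age >= 15  # children under 15 are not included
    )


# Hypothetical family nucleus: two parents, a 16-year-old, and a 9-year-old.
family = [
    (42, "$50,001-$60,000"),
    (40, "$30,001-$35,000"),
    (16, "$10,001-$15,000"),
    (9, None),
]

print(total_family_income(family))  # 99700
```

Household income would use the same function, applied to everyone in the household rather than only the family nucleus.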

Why use income bands?

The census question that asks about the total personal income of individuals provides the respondent with a list of income ranges or bands to choose from. This is because asking respondents to state their actual income is a sensitive issue and will often result in a higher level of non-response to the question.

Other total income variables, such as total household income and total family income, are derived from total personal income.

Why do totals for some geographic areas from previous censuses change after the latest census?

Population changes throughout New Zealand lead to changes in geographic boundaries. This means that totals for geographic areas, for example meshblocks, area units, and local and regional council areas, may change between the censuses.

We produce data from previous census years according to the current census's geographic boundaries to maintain comparability and allow time-series analysis of census data – this is called rebasing.

In the process of rebasing, each dwelling and individual within a meshblock or area unit split since the previous census is identified and allocated to the new meshblock pattern.

This allows users to compare people and dwellings in the same area between different censuses.
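
In outline, rebasing amounts to re-tabulating earlier records against the new boundary pattern. The sketch below assumes a lookup from each dwelling in a split meshblock to its new meshblock; the identifiers and the lookup are hypothetical, and the real process is more involved.

```python
from collections import Counter


def rebase_counts(records, new_meshblock_for):
    """Count dwellings by meshblock under the new boundaries.

    `records` is an iterable of (dwelling_id, old_meshblock) pairs.
    `new_meshblock_for` maps dwelling_id to its new meshblock, for
    dwellings in meshblocks that were split. Other dwellings keep
    their old meshblock.
    """
    counts = Counter()
    for dwelling_id, old_meshblock in records:
        counts[new_meshblock_for.get(dwelling_id, old_meshblock)] += 1
    return counts


# Hypothetical example: meshblock MB100 was split into MB100A and MB100B.
previous_census = [
    ("d1", "MB100"),
    ("d2", "MB100"),
    ("d3", "MB100"),
    ("d4", "MB200"),
]
lookup = {"d1": "MB100A", "d2": "MB100A", "d3": "MB100B"}

rebased = rebase_counts(previous_census, lookup)
print(dict(rebased))  # {'MB100A': 2, 'MB100B': 1, 'MB200': 1}
```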

How should I calculate percentages?

When you calculate percentages using census data, it is important to follow these steps:

1. Ensure that the data reflects the correct subject population. For example, when calculating the percentage of regular cigarette smokers, the data needs to refer to the census usually resident population count aged 15 years and over, as this is the correct subject population for this variable.
2. Use the total stated population as the denominator for the calculation – this excludes residual categories (‘not stated’, ‘refused to answer’, ‘don't know’, ‘response outside scope’, ‘response unidentifiable’ and ‘not elsewhere included’).
3. Where a ‘total stated’ population specifically appears in the census table, we advise you to use this ‘total stated’ population as the denominator.
4. Where a ‘total stated’ population does not appear in the table, we advise you to calculate the total population to use as the denominator (as in point 2), by subtracting the residual categories from the total mentioned in the table.
5. A number of variables have categories that are valid responses and should not be excluded from the total population, such as:

  • number of children born – 'object to answer' is a valid response and is part of the 'total stated' population (it is a tick-box option on the form)
  • Māori descent – 'don't know' is a valid response and is part of the 'total stated' population (it is a tick-box option on the form)
  • iwi affiliation – 'don't know' is a valid response and is part of the 'total stated' population (it is a tick-box option on the form)
  • religious affiliation – 'no religion' and 'object to answer' are valid responses and are part of the 'total stated' population (they are tick-box options on the form).

6. Exclude 'not further defined' categories from the total population when they are used for cases where the information of interest was not provided. For example, if calculating the percentage of households who own the dwelling they live in with a mortgage, the 'dwelling owned or partly owned, mortgage arrangements not further defined' category is excluded from the calculation. The calculation is:

  percentage = owned with a mortgage / (owned with a mortgage + owned without a mortgage) x 100

7. When calculating percentage change over time use the following formula:

  percentage change = (latest year figure - base year census figure) / base year census figure x 100

Note that in published census data, percentages are usually rounded to one decimal place. When percentages are calculated for categories within total response variables (variables for which there can be more than one valid response), they will most likely add to more than 100 percent.
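
The steps above can be sketched in Python as follows. The category names and counts are hypothetical, and the helper functions are our own. For variables where a category such as 'don't know' is a valid response, pass a different set of residual categories.

```python
RESIDUAL_CATEGORIES = {
    "not stated",
    "refused to answer",
    "don't know",
    "response outside scope",
    "response unidentifiable",
    "not elsewhere included",
}


def total_stated(counts, residuals=RESIDUAL_CATEGORIES):
    """Sum the counts of all categories that are not residual categories.

    `counts` maps category name to count and must not include a total row.
    """
    return sum(n for category, n in counts.items() if category not in residuals)


def percentage(count, denominator):
    """Percentage rounded to one decimal place."""
    return round(count / denominator * 100, 1)


def percentage_change(base_year_figure, latest_year_figure):
    """Percentage change from the base year, rounded to one decimal place."""
    return round(
        (latest_year_figure - base_year_figure) / base_year_figure * 100, 1
    )


# Hypothetical counts for people aged 15 years and over.
smoking = {
    "regular smoker": 150,
    "ex-smoker": 250,
    "never smoked regularly": 600,
    "not stated": 80,
}

denominator = total_stated(smoking)
print(denominator)                                        # 1000
print(percentage(smoking["regular smoker"], denominator))  # 15.0

# Tenure example: the 'not further defined' category is left out entirely.
print(percentage(300, 300 + 200))     # 60.0

print(percentage_change(1100, 1240))  # 12.7
```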

What is total response data?

Several questions give individuals the option to provide more than one response. We work out the total response count or percentage by counting each response given, for example each ethnic group stated. This means that the total response count may add up to more than the count of the subject population for that variable. When calculating percentages for categories within these variables, they will most likely add to more than 100 percent.

Variables that may be output on the basis of total responses are:

  • ethnic group
  • languages spoken
  • iwi
  • religious affiliation
  • sources of personal income
  • job search methods
  • unpaid activities
  • sources of family income
  • sources of extended family income
  • sources of household income
  • fuel type used to heat dwellings
  • access to telecommunication systems.

Total response variables can also be output as single and combined data, so that individuals or dwellings count once in the category that applies to them. For example, for ethnic group the categories may be combined to be European only, European/Māori, or Māori/Pacific peoples. This means that the total population will be equal to the usual subject population for that variable, as we count individuals once only.
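
The two ways of counting can be compared in a short Python sketch. The four people and their responses are hypothetical, and the labelling helper is our own.

```python
from collections import Counter

# Each person's set of stated ethnic groups (hypothetical responses).
people = [
    {"European"},
    {"European", "Māori"},
    {"Māori", "Pacific peoples"},
    {"European"},
]


def combination_label(groups):
    """Label a person once, by their single group or combination of groups."""
    ordered = sorted(groups)
    return f"{ordered[0]} only" if len(ordered) == 1 else "/".join(ordered)


# Total response: every response is counted, so a person may count more than once.
total_response = Counter(group for person in people for group in person)

# Single and combination: every person is counted exactly once.
single_combination = Counter(combination_label(person) for person in people)

print(sum(total_response.values()))      # 6, more than the 4 people
print(sum(single_combination.values()))  # 4, equal to the subject population

percentages = {g: n / len(people) * 100 for g, n in total_response.items()}
print(sum(percentages.values()))         # 150.0
```

With total response counting, the percentages for European (75), Māori (50), and Pacific peoples (25) add to 150 percent, because two of the four people gave two responses each.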

Examples of variables that can be output on the basis of single and combination categories are:

  • ethnic group 
  • languages spoken
  • fuel type used to heat dwellings.