2013 Census confidentiality rules – summary of changes from 2006

This page gives you information on the confidentiality rules and procedures that Statistics NZ uses for 2013 Census data, and changes from the 2006 Census.

Statistics NZ has reviewed the confidentiality rules used in the 2006 Census and produced a Confidentiality Standard that includes the confidentiality procedures for 2013 Census data. The goals for these procedures are:

  • utility – fulfilling Statistics NZ's aim to maximise the use of data
  • safety – maintaining sufficient levels of confidentiality protection
  • simplicity – providing procedures that can be efficiently described, understood, automated, and implemented.

The standard also contains a statement of the principles for confidentiality of census data.

For details on how we apply the eight 2013 rules to census data before we release it, see 2013 Census confidentiality rules and how they are applied.

Summary of main changes

We extensively investigated how the confidentiality rules could be changed and simplified, looking at how various combinations would affect the data. A few minor changes to the rules were shown to significantly improve both utility and simplicity while maintaining the established level of safety. Users can now see counts ≥ 6 in any normal table. The changes are summarised below.

Census confidentiality rules

2006 rules

Rule 1: Meshblock rule

Rule 2: Income rule

Rule 3: Mean cell size rule – 2007 amendment: large cells rescued by threshold.

Rule 4: Random rounding rule

Rule 5: Derivation rule

Release under licence

2013 rules

Rule 1: Meshblock rule – has been simplified, and more data can now be released at meshblock level.

Income rule has been removed.

Rule 2: Mean cell size rule – reference to income information has been removed.

Rule 3: Threshold rule – ‘rescue by threshold’ has been extended and has become a rule in its own right.

Rule 4: Random rounding rule – no change to this rule.

The original derivation rule (2006 Rule 5) has been refined and expanded to provide confidentiality protection for proportions and measures from counts. This matches the process that previously existed in practice, and has been split over four rules:

  • Rule 5: Proportions from counts rule
  • Rule 6: Suppression for measures rule
  • Rule 7: Use of rounded counts for calculating measures rule
  • Rule 8: Rounding of measures rule.

Release under licence has been extended.

Details of changes

Meshblock rule

2006

Rule 1: Meshblock rule
Meshblock data may only be disaggregated (broken down) by one variable. The categories for the selected variable must be at the highest level of aggregation (least detailed) for that classification.

Exemptions to Rule 1
Standard groupings of meshblocks are exempt from Rule 1.
Customised groupings of more than 10 meshblocks are exempt from Rule 1.

2013

Rule 1: Meshblock rule
If the geographic unit for the table is a meshblock, and if the table is 'complex', as defined below, then the table is deemed to be sensitive. Then Rule 3, the threshold rule, will apply to the table.

For this rule, a 'complex' table is one that uses a variable that is at a level more detailed than its highest level, or one that uses two or more variables.

For the income variable, the highest level is grouped income with 7 categories.

Non-standard groupings of meshblocks are treated as if the geographic unit were a meshblock.
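The 2013 meshblock rule can be read as a simple predicate. The sketch below is illustrative only: the `Variable` class, its `level` field (assumed here to be 1 for the highest, least detailed level), the geographic unit labels, and the function names are all hypothetical and are not taken from the Confidentiality Standard.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Variable:
    """A classification variable used in a table (hypothetical structure)."""
    name: str
    level: int  # assumption: 1 = highest (least detailed) level


def is_complex(variables: Sequence[Variable]) -> bool:
    """A table is 'complex' if it uses two or more variables, or any
    variable at a level more detailed than its highest level."""
    return len(variables) >= 2 or any(v.level > 1 for v in variables)


def sensitive_under_rule_1(geo_unit: str, variables: Sequence[Variable]) -> bool:
    """Rule 1: a complex table at meshblock level is deemed sensitive.

    Non-standard groupings of meshblocks are treated as meshblocks.
    The two string labels below are assumptions made for this example.
    """
    meshblock_like = geo_unit in {"meshblock", "non-standard meshblock grouping"}
    return meshblock_like and is_complex(variables)
```

A table flagged by this check is not withheld; it is passed on to Rule 3, the threshold rule.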

Explanation of changes

The change to Rule 1 permits more data to be released at meshblock level. A cross-tabulation using the classification for a variable below the highest level of aggregation is now possible, as is a cross-tabulation using more than one variable. Users can therefore obtain any normal table, even one deemed to be complex. In a complex table only counts ≥ 6 remain visible; counts that are less than 6 are suppressed with a ‘..C’ suppression symbol. The part of the 2013 rule referring to income corresponds to part of the 2006 income Rule 2.

Income rule

2006

Rule 2: Income rule
(Part 1)
No income data can be released for any geographic area if the total unrounded subject population is less than 40 individuals or 20 families/dwellings/households.

(Part 2)
The standard 15-category income classification cannot be used for meshblocks or user-defined combinations.

2013

No income rule.

Explanation of changes

Part 1 is removed and part 2 is covered by the 2013 Rule 1 (meshblock rule), which is less restrictive than the 2006 Rule 1. For example, the release of a table at meshblock level with the 15-category income classification is now possible, although only counts ≥ 6 are seen.

Mean cell size rule and new threshold rule

2006

Rule 3: Mean cell size rule

Image, 2006 calculation for mean cell size rule.

The mean cell size (MCS) for individual geographic areas must be greater than two. Income information is still subject to Rule 2.

Some tables which contain only a geography and the population total for that geography are excluded from the mean cell size rule.

The finest geographic variable will be considered the geographic area, and the other geographic variable will become the category variable.

Exemption to Rule 3 for large cells
When the mean cell size rule suppresses large cells, or the rule cannot be applied to output, a threshold may be applied instead. Cells at or below the threshold value of 5 are suppressed, and higher counts are released.

2013

Rule 2: Mean cell size rule
The rule is applied separately for each geographic unit. For any table for a geographic unit, the mean cell size is calculated as follows:

Image, 2013 calculation for mean cell size rule.

If the mean cell size is less than or equal to two then the table is deemed to be sensitive. Then Rule 3, the threshold rule, will apply to the table.

A table may contain the total count for its subject population by geographic unit only. Such a table is treated as not sensitive and therefore is not suppressed.

A table may use a second geographic classification as one of its categorical variables. Such a table is deemed to be sensitive, with Rule 3 applied to the table (see below).

Rule 3: Threshold rule
When the table for a geographic unit is deemed sensitive by either Rule 1 or Rule 2, all counts less than 6 are suppressed.
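As a minimal sketch of how Rules 2 and 3 fit together for one geographic unit: the mean cell size calculation appears on this page only as an image, so the example takes the mean cell size as an input rather than guessing the formula. The function names are hypothetical, the ‘..C’ marker is the suppression symbol mentioned elsewhere on this page, and the sketch does not address whether the threshold is compared against rounded or unrounded counts.

```python
SUPPRESSED = "..C"  # suppression symbol shown in place of a suppressed count
THRESHOLD = 6       # in a sensitive table, counts less than 6 are suppressed


def sensitive_under_rule_2(mean_cell_size):
    """Rule 2: a mean cell size of two or less makes the table sensitive."""
    return mean_cell_size <= 2


def apply_threshold_rule(counts, sensitive):
    """Rule 3: return the counts for one geographic unit after suppression.

    `sensitive` is True when Rule 1 or Rule 2 has deemed the table sensitive.
    """
    if not sensitive:
        return list(counts)
    return [c if c >= THRESHOLD else SUPPRESSED for c in counts]
```

For example, `apply_threshold_rule([0, 3, 6, 12], sensitive=True)` keeps the 6 and the 12 and replaces the two smaller counts with the suppression symbol.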

Explanation of changes

There are some changes to how the mean cell size is calculated:

  • The reference to income information is removed.
  • Subject population totals are treated in the same way (not suppressed). A subject population is the people, families, households, or dwellings to whom a variable applies (eg the subject population for the occupation variable is the employed census usually resident population aged 15 years and over).
  • Sensitive geographic tables are treated by threshold. The benefits of the 2013 Rule 3 (threshold rule) are that it 'rescues' larger cells from the meshblock and mean cell size rules; cells with small counts are suppressed but overall more data is released. Therefore, counts ≥ 6 will always be shown, regardless of whether the table is deemed sensitive or not.

Random rounding rule

2006

Rule 4: Random rounding rule
All counts for individuals, families, households and dwellings must be randomly rounded to base 3.

2013

Rule 4: Random rounding rule
All counts for individuals, families, households and dwellings must be randomly rounded to base 3.

Explanation of changes

No change to this rule.
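The rule itself states only that counts are randomly rounded to base 3. The sketch below shows one common unbiased way to do this, and its probabilities are an assumption rather than something stated on this page: a count that is already a multiple of 3 is left unchanged, and any other count goes to the nearer multiple of 3 with probability 2/3 and to the other adjacent multiple with probability 1/3. The function name and the `rng` parameter are hypothetical.

```python
import random


def random_round_base3(count, rng=random):
    """Randomly round a non-negative integer count to a multiple of 3."""
    remainder = count % 3
    if remainder == 0:
        return count  # multiples of 3 are left unchanged
    lower = count - remainder
    upper = lower + 3
    # Assumed scheme: round up with probability remainder/3, so the nearer
    # multiple is chosen 2 times in 3 and the expected value equals the count.
    return upper if rng.random() < remainder / 3 else lower
```

Passing a seeded `random.Random` instance as `rng` makes the rounding reproducible, which is convenient for testing.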

Derivation rule, new proportions from counts rule, and three new rules for measures

Measures are medians, means, quartiles, quintiles, and deciles.

2006

Rule 5: Derivation rule
All derivations from counts must be derived from the randomly rounded counts. This excludes totals and subtotals, as these are independently randomly rounded. Once a derivation has been calculated, there will be no further random rounding applied to the derived data.

2013

Rule 5: Proportions from counts rule
All proportions, percentages, and ratios that come from counts must use data that is fully confidentialised, using rules 1–4 above. Where either of the two counts in the proportion is suppressed, then the proportion is suppressed.
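A minimal sketch of the proportions rule, assuming both counts have already been through rules 1–4. The function name is hypothetical, and two choices are assumptions not covered by the rule text: a suppressed result is marked with the same ‘..C’ symbol used for counts, and a zero denominator returns `None`.

```python
SUPPRESSED = "..C"  # assumption: the count suppression symbol also marks suppressed proportions


def confidentialised_percentage(numerator, denominator):
    """Percentage from two counts that are already fully confidentialised."""
    if numerator == SUPPRESSED or denominator == SUPPRESSED:
        return SUPPRESSED  # either count suppressed, so the proportion is suppressed
    if denominator == 0:
        return None  # assumption: undefined result; not addressed by the rule
    return 100 * numerator / denominator
```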

Rule 6: Suppression for measures rule
For each measure there is a threshold value:

  • for medians and means it is 6
  • for quartiles it is 12
  • for quintiles it is 15
  • for deciles it is 30
  • if other measures are requested, the confidentiality assessment team will decide this value.

If the unrounded count of individuals in a cell is less than the threshold value for the measure, then the measure for that cell is suppressed.
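The thresholds above lend themselves to a lookup table. This is a sketch only; the dictionary keys and the function name are hypothetical, and measures not listed are rejected because their threshold is decided case by case by the confidentiality assessment team.

```python
MEASURE_THRESHOLDS = {
    "median": 6,
    "mean": 6,
    "quartile": 12,
    "quintile": 15,
    "decile": 30,
}


def measure_is_suppressed(measure, unrounded_count):
    """Rule 6: suppress a measure when the cell's unrounded count of
    individuals is less than the threshold value for that measure."""
    if measure not in MEASURE_THRESHOLDS:
        raise ValueError(f"no fixed threshold for measure {measure!r}")
    return unrounded_count < MEASURE_THRESHOLDS[measure]
```

Note that the comparison uses the unrounded count, unlike the calculation of the measure itself, which uses randomly rounded counts (Rule 7).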

Rule 7: Use of rounded counts for calculating measures rule
All measures need to be calculated from randomly rounded counts.

Rule 8: Rounding of measures rule
All measures need to be rounded:

  • measures from annual income are rounded to the nearest $100
  • measures from weekly rent paid are rounded to the nearest $10
  • measures for age are rounded to one decimal place
  • measures from whole number count variables are rounded to one decimal place.
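The rounding units above can be sketched as follows. The variable type labels and the function name are hypothetical, and the rule does not say how ties are broken, so rounding halves upward is an assumption made for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rounding unit for each kind of measure (keys are hypothetical labels).
ROUNDING_UNITS = {
    "annual_income": Decimal("100"),      # nearest $100
    "weekly_rent": Decimal("10"),         # nearest $10
    "age": Decimal("0.1"),                # one decimal place
    "whole_number_count": Decimal("0.1"), # one decimal place
}


def round_measure(value, variable_type):
    """Rule 8: round a measure to the unit for its variable type."""
    unit = ROUNDING_UNITS[variable_type]
    # Express the value in multiples of the unit, round to a whole number
    # of units (ties rounded up, an assumption), then scale back.
    steps = (Decimal(str(value)) / unit).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return steps * unit
```

`Decimal` is used rather than the built-in `round` so that the tie-breaking behaviour is explicit and values such as 37.25 are not affected by binary floating-point representation.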

Explanation of changes

These rules for proportions and measures, which previously existed in practice, are now properly covered by the 2013 rules.

Processes for deciding how to confidentialise data

The following flowcharts show you the processes for deciding how to confidentialise 2013 Census data.

Decision process for confidentialising tables of counts

Flowchart, Decision process for confidentialising tables of counts.

Decision process for confidentialising output containing proportions, percentages or ratios

Flowchart, Decision process for confidentialising output containing proportions, percentages or ratios.

Decision process for producing a confidentialised measure from the distribution of a variable

Flowchart, Decision process for producing a confidentialised measure from the distribution of a variable.

Release under licence of less-confidentialised counts

We have extended the licensed release of less-confidentialised counts. We may offer licence agreements to institutions for specific projects, meaning that:

  • the institution agrees to work within the licence agreement
  • all individual users within the institution sign the user undertaking
  • our count data is released with only random rounding applied to the counts.

A Statistics NZ confidentiality assessment team assesses a request for such an agreement. The team retains the right to decide whether the project justifies the release of less-confidentialised tables. The final decision rests with the General Manager Census.

In a release under licence agreement, the confidentiality protection shifts from suppressing data (as done for public release) to the context in which the data is used. The tables will be released, but only under licence conditions, and strictly for internal use only within the institution.

Licence agreements enable fuller use of census data by key users, for statistical purposes only. The new agreement expands on the one created for 2006 Census data, and is consistent with that used for basic confidentialised unit record files (CURFs).

How we reviewed the confidentiality rules

We consulted external users at the end of 2008 to gather feedback on their views on the existing (2006) rules. Workshops were held in Christchurch, Wellington, and Auckland, with representatives from central and local government and the private sector. We also consulted internally across Statistics NZ, which involved several stages of feedback and revision.

The intention was to improve the existing rules. User feedback generally showed that the 2006 rules had a negative impact on data and on users’ ability to produce key outputs. However, users also acknowledged that useful data could be obtained and that disclosure control is necessary.

The following are examples of points raised about the 2006 rules:

  • Confidentiality rules affect the utility and complexity of tables, especially when tables are for research purposes.
  • External users would like access to more data when their work is for central and local government.
  • The meshblock rule can be too restrictive and inflexible.
  • Useful data can be obtained under the threshold rule and the suppression to 5 or less is a huge improvement.
  • Random rounding is acceptable and users are accustomed to it. Useful trends can still be seen in data with random rounding applied.

More information

2013 Census confidentiality rules and how they are applied

Read more about the privacy, security, and confidentiality of information supplied to Statistics NZ 

Contact our Information Centre for specific queries

Request customised data

Use NZ.Stat to explore our data 

Use DataInfo+ to explore our metadata

Published 30 August 2013
