Our activities include the acquisition, documentation, anonymization, dissemination, and preservation of microdata and related metadata.
The National Data Archive was primarily established to archive survey and census microdata produced by the National Statistics Office and other official data producers. On a selected basis, the National Data Archive also serves as a repository for non-official datasets. Data producers interested in depositing data in the National Data Archive are invited to contact us.
Data documentation serves several important functions. It helps data producers build institutional memory, and helps researchers to:
- Find the data they are interested in.
- Locate the datasets and variables that meet their research requirements.
- Understand what the data are measuring and how the data have been created, and assess their quality.
- Understand the survey design and the methods used when collecting and processing the data, thereby reducing the risk that data will be misunderstood or misused.
The National Data Archive has adopted the Data Documentation Initiative (DDI) and the Dublin Core (DCMI) international metadata standards.
Statistical agencies are charged with legal and ethical obligations to protect the confidentiality of survey respondents. The National Data Archive protects the confidentiality of the data by:
- Restricting access to data that present a potential disclosure risk, so that only vetted users can obtain them, and only under formal conditions.
- Anonymizing data when necessary, by altering or suppressing variables that could potentially identify a natural or legal person. This may make the data less useful for analysts. The National Data Archive seeks to minimize the information loss while ensuring an acceptable level of disclosure risk. The principles and methods applied to measure the risk and to anonymize data are those provided or recommended by the International Household Survey Network.
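As an illustration only, the idea of measuring disclosure risk and then suppressing values can be sketched in a few lines of Python. This is a minimal sketch of a simple k-anonymity check followed by local suppression, not the procedure used by the National Data Archive or recommended by the International Household Survey Network. The function names, the variable names (`region`, `age_group`, `occupation`), the toy records and the threshold `k` are all assumptions made up for the example.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=3):
    """Return the indices of records whose combination of quasi-identifier
    values is shared by fewer than k records (a simple k-anonymity check)."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [i for i, key in enumerate(keys) if counts[key] < k]


def suppress(records, indices, variable):
    """Return a copy of the records in which `variable` is set to None
    for the given indices. The input records are left unchanged."""
    flagged = set(indices)
    return [
        {**r, variable: None} if i in flagged else dict(r)
        for i, r in enumerate(records)
    ]


# Hypothetical toy data: the last record is unique on (region, age_group).
records = [
    {"region": "North", "age_group": "30-39", "occupation": "teacher"},
    {"region": "North", "age_group": "30-39", "occupation": "farmer"},
    {"region": "North", "age_group": "30-39", "occupation": "teacher"},
    {"region": "South", "age_group": "70-79", "occupation": "judge"},
]

flagged = risky_records(records, ["region", "age_group"], k=2)
anonymized = suppress(records, flagged, "age_group")

# A crude measure of information loss: the number of suppressed cells.
cells_suppressed = sum(1 for r in anonymized if r["age_group"] is None)
```

In this sketch the trade-off described above is visible directly: a higher `k` flags more records and lowers the disclosure risk, but it also increases the number of suppressed cells. A real workflow would re-measure the risk after suppression and would normally rely on dedicated statistical disclosure control software.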
Data dissemination increases the quality, use and potential impact of data, by:
- Making it possible for analytical work to be replicated, a critical step in good science;
- Creating the potential to use old data to test new ideas;
- Reducing the costs of data collection and the burden on respondents, by avoiding the need for researchers to undertake their own surveys;
- Demonstrating transparency and credibility in data production, which are at the heart of good governance; and
- Improving the relevance and quality of data by incorporating users' feedback in future data collection.
Making microdata available also has downsides. It exposes data producers to criticism, it increases the risk of a breach of confidentiality, and it can result in conflicting outputs being generated. Having faith in the ethical conduct of data users and in their willingness to contribute to the quality and usefulness of the data, the National Data Archive considers that the benefits outweigh the disadvantages. We insist, however, that access to microdata must not be seen as a right. Access will only be permitted to bona fide users, and for statistical and research purposes only.
Microdata sets can be damaged or lost because of human error, because of technical problems, or because of disasters such as fire or flood. Advances in hardware or software can also render old data unreadable. The National Data Archive is implementing standard procedures for ensuring the physical security and long-term usability of its resources, together with associated backup arrangements for minimizing the impact of adverse events.
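One common building block of such backup arrangements is a fixity check, which compares the checksum of a backup copy with that of the original file to detect silent damage. The following is a minimal sketch using only the Python standard library. It is not a description of the Archive's actual procedures, and the file names and helper functions (`sha256_of`, `backup_is_intact`) are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks
    so that large data files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(original, backup):
    """Return True if the backup has the same checksum as the original."""
    return sha256_of(original) == sha256_of(backup)


# Demonstration in a temporary directory with a hypothetical data file.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "survey.csv"
    backup = Path(tmp) / "survey_backup.csv"
    original.write_text("id,region\n1,North\n")

    shutil.copyfile(original, backup)
    intact_before = backup_is_intact(original, backup)

    # Simulate damage to the backup copy.
    backup.write_text("id,region\n1,South\n")
    intact_after = backup_is_intact(original, backup)
```

In practice the checksum of each file would be recorded when the file is deposited and compared against later recomputations, so that damage to the original itself can also be detected.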