Understanding ISO 19157 – Geographic Data Quality Standard

Overview of ISO 19157 Standard

ISO 19157:2013 provides the definitive international framework for describing and managing the quality of geographic data. By establishing common principles for evaluating a dataset’s fitness for purpose, it ensures that quality reports are comparable and understandable regardless of the data’s origin.

The standard’s scope is comprehensive. It not only defines the fundamental components of data quality, such as accuracy and completeness, but also outlines the general procedures for their evaluation.

As a key part of the broader ISO 19100 family of standards, ISO 19157 works hand-in-hand with ISO 19115, the standard for metadata. While ISO 19115 provides the overall structure for documenting a dataset (its origin, extent, and other properties), ISO 19157 provides the rules and vocabulary for the ‘data quality’ section of that documentation. This integration ensures that critical quality information is captured in a structured and meaningful way.

The standard is built on two key concepts: Data Quality Elements and Data Quality Measures. The elements are the fundamental aspects used to describe quality: completeness, logical consistency, thematic accuracy, temporal quality, positional accuracy, and usability. The measures define standardized ways of quantifying each element, so that evaluation results are comparable across datasets and producers.

Key Components of Data Quality

To provide a concrete, measurable assessment of data quality, ISO 19157 breaks down data quality into six distinct, evaluable components. These elements provide a comprehensive framework for understanding a dataset’s strengths and weaknesses. By examining each component, you can determine if the data is fit for a specific purpose, whether that’s for urban planning, environmental modeling, or navigation.

The six core components, or data quality elements, defined by the standard are:

  • Completeness: This addresses whether the dataset is whole. It looks for missing data (omission) and the inclusion of data that shouldn’t be there (commission).

  • Logical Consistency: This checks if the data adheres to a set of logical rules. It ensures the correctness of the data’s structure, format, and relationships between different features.

  • Thematic Accuracy: This measures the accuracy of the attributes or classifications assigned to features. For example, is a feature labeled as a ‘hospital’ actually a hospital?

  • Temporal Quality: This evaluates the data’s accuracy with respect to time, including its currency, validity, and how well it represents the real world at a specific point or period.

  • Positional Accuracy: This assesses how closely the coordinates of features in the dataset align with their true positions on the Earth’s surface.

  • Usability: This element evaluates how well the data meets the requirements of a specific application or user group, often based on the other quality elements.

Together, these components provide a comprehensive view of a dataset’s quality. A dataset might have excellent positional accuracy but poor completeness, making it unsuitable for an emergency response system that requires all roads to be mapped. The evaluation of these elements can be quantitative (e.g., “95% of road positions are accurate to within 5 meters”) or qualitative (e.g., “The land use classification is suitable for regional planning”), giving data users the detailed information they need to make confident decisions.
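To make this concrete, here is a minimal Python sketch modeling the six elements alongside a result that may be quantitative or qualitative. The class and field names are illustrative, not taken from the standard’s schemas:

```python
from dataclasses import dataclass
from enum import Enum


class QualityElement(Enum):
    """The six data quality elements defined by ISO 19157."""
    COMPLETENESS = "completeness"
    LOGICAL_CONSISTENCY = "logical consistency"
    THEMATIC_ACCURACY = "thematic accuracy"
    TEMPORAL_QUALITY = "temporal quality"
    POSITIONAL_ACCURACY = "positional accuracy"
    USABILITY = "usability"


@dataclass
class EvaluationResult:
    element: QualityElement
    quantitative: float | None = None  # e.g. a pass rate or an error statistic
    qualitative: str | None = None     # e.g. a free-text suitability statement


# Quantitative: 95% of road positions fall within a 5 m tolerance.
within_tolerance = EvaluationResult(QualityElement.POSITIONAL_ACCURACY, quantitative=0.95)

# Qualitative: a narrative judgment of thematic suitability.
suitability = EvaluationResult(
    QualityElement.THEMATIC_ACCURACY,
    qualitative="The land use classification is suitable for regional planning.",
)
```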

Completeness and Logical Consistency

Completeness addresses a fundamental question: is all the necessary data present? This component assesses two potential failings: omission, where required features or attributes are missing, and commission, where extra or incorrect data is included. For example, a map of national parks is incomplete if it omits a newly designated park (an omission error) or includes one that has been declassified (a commission error).
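As a rough illustration, if feature identifiers can be compared against an authoritative reference list (an assumption that does not always hold in practice), omission and commission rates can be computed directly:

```python
def completeness_rates(dataset_ids: set[str], reference_ids: set[str]) -> dict[str, float]:
    """Compare a dataset against a reference 'universe' of features.

    Omission: required features missing from the dataset.
    Commission: features present in the dataset but absent from the reference.
    """
    omitted = reference_ids - dataset_ids
    committed = dataset_ids - reference_ids
    return {
        "omission_rate": len(omitted) / len(reference_ids),
        "commission_rate": len(committed) / len(dataset_ids),
    }


# Example: one park missing (omission), one declassified park still present (commission).
print(completeness_rates({"P1", "P2", "P9"}, {"P1", "P2", "P3"}))
# {'omission_rate': 0.333..., 'commission_rate': 0.333...}
```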

While completeness ensures the right features are present, logical consistency guarantees they make sense together. This element evaluates the data’s adherence to defined rules governing its structure, format, and the relationships between features—whether conceptual (e.g., a river cannot cross a watershed divide), topological (e.g., road centerlines must connect at intersections), or attribute-based (e.g., a ‘road type’ field must only contain approved values). Data that fails these checks is unreliable and can trigger significant errors in automated analysis, GIS processing, and AI applications.
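A domain-consistency rule of the kind just described is straightforward to automate. In this sketch the field name and the approved value list are hypothetical:

```python
ALLOWED_ROAD_TYPES = {"motorway", "primary", "secondary", "residential"}


def check_domain_consistency(features: list[dict]) -> list[str]:
    """Return the IDs of features whose 'road_type' attribute violates the domain rule."""
    return [
        f["id"]
        for f in features
        if f.get("road_type") not in ALLOWED_ROAD_TYPES
    ]


roads = [
    {"id": "R1", "road_type": "primary"},
    {"id": "R2", "road_type": "hoverlane"},  # not in the approved value list
]
print(check_domain_consistency(roads))  # ['R2']
```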

Thematic and Positional Accuracy

Once data is confirmed to be complete and logically sound, the focus shifts to its correctness. This is the domain of thematic accuracy, which evaluates the truthfulness of attribute values. It assesses how well the labels, classifications, or quantitative measurements attached to a feature reflect reality. For instance, is a parcel of land correctly classified as ‘commercial,’ or is the recorded population for a city block accurate? Poor thematic accuracy directly undermines the reliability of any analysis, leading to incorrect conclusions.
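A common way to quantify thematic accuracy is the proportion of features whose assigned class matches an independently verified reference label. A minimal sketch, assuming such reference labels are available:

```python
def classification_correctness(assigned: dict[str, str], reference: dict[str, str]) -> float:
    """Fraction of features whose assigned class matches the verified reference class."""
    matches = sum(1 for fid, cls in assigned.items() if reference.get(fid) == cls)
    return matches / len(assigned)


assigned = {"B1": "hospital", "B2": "school", "B3": "commercial"}
reference = {"B1": "hospital", "B2": "hospital", "B3": "commercial"}
print(classification_correctness(assigned, reference))  # 0.666...
```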

Equally important is the accuracy of where features are located. Positional accuracy measures the discrepancy between the coordinates of a feature in a dataset and its true position on the Earth’s surface. This component is vital for applications where precise location is critical, such as emergency response navigation, utility infrastructure management, or autonomous vehicle guidance. It answers the fundamental question: is this feature really where the data says it is?
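One widely used positional accuracy statistic, and one of the kinds of measures the standard catalogues, is the root mean square error (RMSE) between recorded coordinates and surveyed checkpoint positions:

```python
import math


def horizontal_rmse(pairs: list[tuple[tuple[float, float], tuple[float, float]]]) -> float:
    """RMSE of horizontal error between dataset coordinates and surveyed positions."""
    squared_errors = [
        (x1 - x2) ** 2 + (y1 - y2) ** 2
        for (x1, y1), (x2, y2) in pairs
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# (dataset position, surveyed position) pairs, in projected metres
checkpoints = [((100.0, 200.0), (103.0, 204.0)), ((500.0, 640.0), (500.0, 645.0))]
print(horizontal_rmse(checkpoints))  # 5.0
```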

These two forms of accuracy are fundamental to data quality, and a failure in one can render a dataset useless, even if the other is perfect. A dataset with perfect positional accuracy but poor thematic accuracy might guide you to the exact location of a building but incorrectly label it as a hospital instead of a school. Conversely, low positional accuracy can result in misaligned map layers and flawed spatial analyses, even if all the attributes are correct. Together, they ensure that a geographic dataset is a trustworthy digital twin of the real world.

Data Quality Reporting Principles

Evaluating data quality is only the first step; the results must be communicated clearly for users to assess a dataset’s suitability. The reporting principles of ISO 19157 provide a standardized framework for documenting and sharing these assessments, enabling informed decisions.

The core objective of these reporting principles is to support the concept of “fitness for use.” Instead of labeling a dataset as simply “good” or “bad,” the standard encourages a more nuanced approach. A quality report should provide detailed, transparent information that allows a user to determine if the data’s accuracy, completeness, and consistency meet the requirements of their specific application. For example, data with a positional accuracy of 5 meters might be perfectly acceptable for regional environmental analysis but entirely unsuitable for mapping urban utility lines.
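In code, a fitness-for-use decision reduces to comparing an evaluated quality result against the requirement of the intended application; the threshold values below are illustrative:

```python
def fit_for_use(evaluated_rmse_m: float, required_rmse_m: float) -> bool:
    """True if the dataset's positional accuracy meets the application's requirement."""
    return evaluated_rmse_m <= required_rmse_m


dataset_rmse = 5.0  # metres, taken from the quality report
print(fit_for_use(dataset_rmse, required_rmse_m=10.0))  # regional analysis: True
print(fit_for_use(dataset_rmse, required_rmse_m=0.5))   # urban utility mapping: False
```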

A comprehensive data quality report under ISO 19157 typically includes the following (a minimal code sketch follows the list):

  • The scope of the evaluation (e.g., the entire dataset or a specific subset).

  • The quality elements that were measured.

  • The evaluation methods used.

  • The results, which can be quantitative (e.g., a statistical measure of positional error) or qualitative (e.g., a description of identified logical inconsistencies).
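These contents can be mirrored in a simple data structure. The sketch below is illustrative only; the normative machine-readable encoding is the XML schema defined in ISO 19157-2:

```python
from dataclasses import dataclass, field


@dataclass
class QualityEvaluation:
    element: str   # e.g. "positional accuracy"
    measure: str   # e.g. "horizontal RMSE"
    method: str    # e.g. "comparison against surveyed checkpoints"
    result: str    # quantitative or qualitative result, as text


@dataclass
class QualityReport:
    scope: str                                      # entire dataset or a named subset
    evaluations: list[QualityEvaluation] = field(default_factory=list)


report = QualityReport(
    scope="Road network layer, 2024 edition",
    evaluations=[
        QualityEvaluation(
            element="positional accuracy",
            measure="horizontal RMSE",
            method="comparison against surveyed checkpoints",
            result="5.0 m",
        )
    ],
)
```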

By standardizing how this information is presented (often within metadata), ISO 19157 makes data quality an integral and transparent part of the data itself. This approach fosters trust and helps prevent the misuse of geographic information.

Machine-Readable Implementations

Beyond human-readable reports, ISO 19157 enables automated data validation through machine-readable implementations. This approach transforms quality information from static documentation into dynamic, actionable metadata that software can process directly.

To achieve this, the standard specifies the process for establishing and maintaining a formal register of data quality measures. This register, which must comply with ISO 19135-1 (Geographic information — Procedures for item registration), provides a standardized content structure for defining each quality measure. Because every measure has a centralized, unambiguous definition, different systems and organizations can exchange and interpret quality results consistently, which ensures interoperability.
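Conceptually, a register entry binds a stable identifier to an unambiguous definition that any system can resolve. The sketch below is hypothetical; the fields and the identifier are invented for illustration, and real measure identifiers are assigned by the register itself:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasureRegisterEntry:
    measure_id: int   # identifier assigned by the register (illustrative here)
    name: str
    element: str      # the quality element the measure applies to
    definition: str
    value_type: str   # e.g. "real", "boolean", "percentage"


# A toy in-memory register; a real one is maintained per ISO 19135-1 procedures.
REGISTER = {
    28: MeasureRegisterEntry(
        measure_id=28,
        name="horizontal RMSE",
        element="positional accuracy",
        definition="Root mean square error of planimetric coordinates.",
        value_type="real",
    ),
}


def lookup(measure_id: int) -> MeasureRegisterEntry:
    """Resolve a measure identifier to its registered, unambiguous definition."""
    return REGISTER[measure_id]
```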

The benefits of this machine-readable approach are substantial, including:

  • Streamlined Workflows: Automated quality checks can streamline data ingestion pipelines, flagging or rejecting datasets that fail to meet predefined thresholds (see the sketch after this list).

  • Cost Savings: It reduces the need for manual inspection and minimizes the risk of using flawed data, preventing costly reworks or poor decisions.

  • Improved Governance: The approach fosters better process control, improves documentation, and encourages continuous improvement in data management.
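For instance, an ingestion pipeline might gate datasets on values drawn from a machine-readable quality report; the measure names and threshold values here are illustrative:

```python
def ingestion_gate(results: dict[str, float], thresholds: dict[str, float]) -> bool:
    """Accept a dataset only if every reported quality value meets its threshold.

    `results` maps a measure name to the evaluated value; `thresholds` maps the
    same names to the maximum acceptable value (lower is better here).
    """
    return all(results[name] <= limit for name, limit in thresholds.items())


report_values = {"horizontal_rmse_m": 5.0, "omission_rate": 0.02}
limits = {"horizontal_rmse_m": 10.0, "omission_rate": 0.01}
print(ingestion_gate(report_values, limits))  # False: too many missing features
```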

Updates and Amendments to ISO 19157

Like any effective technical framework, ISO 19157 is not static. To remain relevant as technology evolves, the standard is periodically reviewed and updated by a designated Maintenance Agency or Registration Authority, which oversees changes to meet the needs of the geospatial community.

The publication of ISO 19157-1:2023, which supersedes the original 2013 edition, marked a major revision. The standard now has a modular structure, divided into multiple parts that give more focused guidance on different aspects of data quality management.

The updated structure includes:

  • ISO 19157-1:2023 – Part 1: General principles: This part establishes the fundamental concepts, principles, and framework for describing and managing geographic data quality.

  • ISO 19157-2:2016 – Part 2: XML schema implementation: This part provides the technical XML schemas for implementing the data quality concepts, facilitating machine-readable reporting.

  • ISO/AWI 19157-3 – Part 3: Monitoring (Under development): This forthcoming part will focus on the principles and methods for ongoing data quality monitoring.

These amendments reflect the growing complexity of geospatial data and the increasing demand for more dynamic and automated quality assurance processes. By separating general principles from specific implementations like XML schemas and future monitoring guidelines, the ISO 19100 series provides a clearer, more adaptable framework for achieving and maintaining high-quality geographic information.

Conclusion: The Importance of ISO 19157

ISO 19157 is the global benchmark for geographic data quality, providing a universal language and a structured framework for its assessment. It establishes clear principles for describing, evaluating, and reporting on the fitness of geospatial data, ensuring that producers and consumers share a common understanding of its reliability and suitability for any given purpose.

The standard’s significance extends beyond mere technical compliance. By defining core quality components and outlining systematic evaluation procedures, it fosters trust and confidence in geographic information. This reliability is essential for critical applications ranging from emergency response and urban planning to environmental science and autonomous navigation. Ultimately, ISO 19157 helps organizations transform raw spatial data into reliable information for decision-making.

As part of the broader ISO 19100 series, this framework provides a comprehensive system for managing the entire lifecycle of geographic information. While designed for digital data, its principles are versatile enough to apply to traditional maps and charts. Adhering to ISO 19157 is not just about following rules; it is a commitment to quality that supports sound decision-making and innovation in the geospatial field.
