Exploring subset profile and validation procedures of geographical markup language (GML) for 3D areal plan information
Original version
International Multidisciplinary Scientific GeoConference SGEM .... 2017, 17 (21), 883-894. 10.5593/sgem2017/21/S08.112
Abstract
In this paper we describe our experience developing a GML subset profile for Norwegian Areal Plan data sets, and show how this profile, together with requirements from the UML-based conceptual application schema, can be used for data validation. GML is about to become the preferred exchange language for geographical information in Norway. One challenge that slows down the implementation of GML in Norway is its complexity, e.g. the possibility of representing the same information in multiple ways. Automated validation is important in all digital information handling. Data validation requires both the digital data and the requirements on the data to be available in suitable languages/formats. ISO 19136 defines the “ISO-certified” version of GML, and Annex G of ISO 19136 gives guidelines for defining GML subset profiles. With a GML subset profile, the unneeded complex parts and alternative solutions in GML can be removed while the data remain conformant to the full GML. We investigate selected activities that consider GML profiles: INSPIRE, CityGML and OGC GML SimpleFeatureProfile. The surprising finding from this investigation is the lack of formal GML profiles that can be used for data validation. In all three activities, subsets of GML are explained only in text for human reading, to some degree pointing to recommended UML modelling practice. None of the three activities has made available official XSD schema files for validating information against its text-based GML subset profile. For the XML-based GML, the natural starting point for data validation is XSD-based validation of the data structure against the formal XSD application schema. Validation of geometry (e.g. closure of polygons and solids) needs requirements in addition to the “pure” XSD data structure rules. Some geometry requirements can be derived directly from the GML semantics, e.g. a GML Ring should be a closed curve and a GML CompositeCurve should have only connected curve subparts.
Other geometry requirements are connected to the semantics of the feature types defined in the user application schema, e.g. the spatial extent of a county should be inside the spatial extent of the country. For geographic information described using UML and based on the ISO 19109 General Feature Model, UML classes representing feature types are important modelling elements. We have added geometry requirements to UML-based feature types and made them available for validation at the dataset (GML) level. The geometry requirements are defined using the ISO 19157 Data Quality / Data Quality Measure (DQM) principles. In the context of this paper, only DQMs in the quality category Logical Consistency are relevant. We describe our experience with defining and validating rather general geometry rules for polygon geometry, polygon/polygon relationships, and polygon set/tessellation rules. The paper ends with conclusions and recommendations for further work.
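As an illustration of the two geometry requirements the abstract derives from GML semantics (a Ring is a closed curve; a CompositeCurve has only connected subparts), the following is a minimal sketch in Python. It is not the validation procedure used in the paper. It assumes each curve member has already been extracted from the GML into a list of coordinate tuples, and the function names and the tolerance parameter are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]      # 2D or 3D position
Curve = Sequence[Point]        # ordered positions of one curve member


def same_position(a: Point, b: Point, tol: float = 0.0) -> bool:
    """True when two positions have equal dimension and coincide within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def is_connected_composite(members: Sequence[Curve], tol: float = 0.0) -> bool:
    """Each member must start where the previous member ends."""
    if not members or any(len(m) < 2 for m in members):
        return False
    return all(
        same_position(prev[-1], nxt[0], tol)
        for prev, nxt in zip(members, members[1:])
    )


def is_closed_ring(members: Sequence[Curve], tol: float = 0.0) -> bool:
    """A ring is a connected sequence of members that ends where it starts."""
    return (
        is_connected_composite(members, tol)
        and same_position(members[-1][-1], members[0][0], tol)
    )


# Hypothetical sample data: three 3D curve members forming a triangle.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
c = [(1.0, 1.0, 0.0), (0.0, 0.0, 0.0)]
ring_ok = is_closed_ring([a, b, c])
open_chain = is_closed_ring([a, b])
```

Checks of this kind complement XSD validation, which verifies the data structure only and cannot express that coordinates of adjacent members coincide.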