Loc-I Data Profile

IRI
https://linked.data.gov.au/def/loci-dp
Title
Loc-I Data Profile
Definition
The Loc-I Data Profile is a profile of well-known Semantic Web models for spatial and temporal data, cataloguing and general-purpose use. It defines a set of requirements that datasets can meet to ensure they can be used for cataloguing and spatial querying.
Version
2.2
Dates
Created: 2021-04-01
Issued: 2021-04-03
Modified: 2025-06-04
History

2025-06 2.2: fixed Dataset class targeting & allowed more temporal object types

2025-06: Added a requirement for geometry representation as WKT; removed ID requirements

2025-05: Update for maintenance by DCCEEW

2022: Initial release for the Loc-I project

Agents
Publisher: Geoscience Australia
Creator: KurrawongAI
Profile Resources
https://linked.data.gov.au/def/loci-dp/validator
Code Repository
https://github.com/dcceew-bdr/loci-data-profile

Abstract

The Loc-I Data Profile is a profile of several Semantic Web standards that define data cataloguing, spatial data and vocabulary objects.

The purpose of this profile is to allow for the validation of datasets against a minimum set of metadata and structural requirements which, when met, ensure that the datasets can be used in tasks such as cataloguing and spatial reasoning.

Namespaces

This document refers to elements of various ontologies by short codes using namespace prefixes. The prefixes and their corresponding namespaces' URIs are:

locidp
https://linked.data.gov.au/def/loci-dp/
ex
http://example.com/ - a non-resolving namespace used for examples
dcterms
http://purl.org/dc/terms/
prof
http://www.w3.org/ns/dx/prof/
prov
http://www.w3.org/ns/prov#
schema
https://schema.org/
skos
http://www.w3.org/2004/02/skos/core#
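rdfs
http://www.w3.org/2000/01/rdf-schema#
geo
http://www.opengis.net/ont/geosparql#
time
http://www.w3.org/2006/time#
odrl
http://www.w3.org/ns/odrl/2/
xsd
http://www.w3.org/2001/XMLSchema#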
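
As an illustration of how the short codes used in this document map to full IRIs, the following is a minimal sketch in Python using prefixes taken from the list above. The function name `expand` and the error handling are assumptions made for this example and are not part of the profile.

```python
# Prefix-to-namespace pairs taken from the Namespaces list in this document.
PREFIXES = {
    "locidp": "https://linked.data.gov.au/def/loci-dp/",
    "ex": "http://example.com/",
    "dcterms": "http://purl.org/dc/terms/",
    "prof": "http://www.w3.org/ns/dx/prof/",
    "prov": "http://www.w3.org/ns/prov#",
    "schema": "https://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(curie: str) -> str:
    """Expand a short code such as 'schema:name' into a full IRI.

    Hypothetical helper for illustration only.
    """
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


if __name__ == "__main__":
    print(expand("schema:name"))
    print(expand("prov:wasDerivedFrom"))
```

For example, `expand("schema:name")` yields the schema.org IRI for the `name` property, and a short code with an unlisted prefix raises a `ValueError`.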

1. Introduction

From 2018 to 2021, the Location Index (Loc-I) Project demonstrated methods for spatial data integration across institutions using Semantic Web patterns of data representation and Linked Data methods of data access.

This data profile is the Loc-I Project's basic data model for spatial datasets and is used to ensure a minimum set of requirements for metadata and structure are met by all datasets wanting to be "Loc-I compatible".

It is expected that this profile will be a base on which other projects and domains will build their own specialised profiles for their own purposes, be they spatial or otherwise.

Formally, a profile is:

A specification that constrains, extends, combines, or provides guidance or explanation about the usage of other specifications.
    - The W3C's Profiles Vocabulary

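Here, the other specifications being profiled include GeoSPARQL, OWL TIME and schema.org, which the requirements below draw on; full citations are given in Section 3.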

In the following section, this document describes the individual requirements this profile imposes on datasets and their elements. These requirements are mapped to SHACL "Shapes" which are executable validation rules stored in this profile's validator.

2. Requirements

This specification's requirements are listed here and indicated in red text. The ID of each requirement is reused in this profile's validator file at https://linked.data.gov.au/def/loci-dp/validator.

This profile requires basic DCAT-style catalogue metadata for Datasets, expressed using schema.org predicates, to enable simple cataloguing. If a dataset contains spatio-temporal objects, and Loc-I Profile Datasets are expected to, then it must group them using GeoSPARQL's FeatureCollection class, and each Feature Collection must be indicated as being part of the Dataset.

2.1 Dataset

D-exists: Each graph MUST contain at least one Dataset instance

D-name: Each Dataset MUST have one and only one name which is to be a text literal, indicated using the schema:name predicate

schema:name is to be used rather than rdfs:label or dcterms:title or other, similar, titling properties.

D-desc: Each Dataset MUST have one and only one description which is to be a text literal, indicated using the schema:description predicate

schema:description is to be used rather than rdfs:comment or dcterms:description or other, similar, description properties.
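
The D-exists, D-name and D-desc requirements can be sketched as simple cardinality checks. The example below is a minimal illustration only, assuming a graph held in memory as a list of `(subject, predicate, object)` tuples; the `Literal` type and the function name `check_dataset_text` are hypothetical and do not come from this profile's validator, which expresses these rules in SHACL.

```python
from typing import NamedTuple

SDO = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF literal (assumption for this sketch)."""
    value: str
    datatype: str = XSD_STRING


def check_dataset_text(triples):
    """Return messages for D-exists, D-name and D-desc violations."""
    errors = []
    datasets = sorted(
        {s for s, p, o in triples if p == RDF_TYPE and o == SDO + "Dataset"}
    )
    if not datasets:
        errors.append("D-exists: graph contains no schema:Dataset instance")
    for ds in datasets:
        for req, pred in (("D-name", "name"), ("D-desc", "description")):
            values = [o for s, p, o in triples if s == ds and p == SDO + pred]
            # Exactly one value is allowed, and it must be a literal, not an IRI.
            if len(values) != 1 or not isinstance(values[0], Literal):
                errors.append(
                    f"{req}: {ds} needs exactly one text literal for schema:{pred}"
                )
    return errors


if __name__ == "__main__":
    DS = "http://example.com/ds"
    graph = [
        (DS, RDF_TYPE, SDO + "Dataset"),
        (DS, SDO + "name", Literal("Australian state boundaries")),
    ]
    for message in check_dataset_text(graph):
        print(message)
```

In this usage example the Dataset has a name but no description, so only a D-desc message is reported.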

D-created: Each Dataset MUST have exactly one created date indicated using the schema:dateCreated predicate with a literal value of either xsd:date, xsd:dateTime, xsd:dateTimeStamp or xsd:gYear type

D-modified: Each Dataset MUST have exactly one modified date indicated using the schema:dateModified predicate with a literal value of either xsd:date, xsd:dateTime or xsd:dateTimeStamp type

Created and modified dates are relevant to the content of the dataset, not the things that the data is about. For example, a dataset showing the boundaries of Australian states in 1815 might only have been created in 2023 and last modified on 2025-05-02. The dates of the things that the data is about should be indicated per spatio(/temporal) object instead.
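
The difference between the datatypes allowed for D-created and D-modified can be sketched as follows. This is an illustrative check only: it assumes each date value is available as a `(lexical form, datatype IRI)` pair, the function name `check_date` is hypothetical, and the lexical form itself is not validated.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# D-created allows xsd:gYear; D-modified, as worded above, does not.
CREATED_TYPES = {XSD + t for t in ("date", "dateTime", "dateTimeStamp", "gYear")}
MODIFIED_TYPES = CREATED_TYPES - {XSD + "gYear"}


def check_date(req_id, values, allowed):
    """Return an error message, or None if the requirement is met.

    values: list of (lexical form, datatype IRI) pairs found for the predicate.
    """
    if len(values) != 1:
        return f"{req_id}: expected exactly one value, found {len(values)}"
    _lexical, datatype = values[0]
    if datatype not in allowed:
        return f"{req_id}: datatype {datatype} is not allowed"
    return None


if __name__ == "__main__":
    # The example from the text: created in 2023, last modified on 2025-05-02.
    print(check_date("D-created", [("2023", XSD + "gYear")], CREATED_TYPES))
    print(check_date("D-modified", [("2025-05-02", XSD + "date")], MODIFIED_TYPES))
    print(check_date("D-modified", [("2025", XSD + "gYear")], MODIFIED_TYPES))
```

The first two calls return `None` because both values satisfy their requirement, while the third returns a message because a year-only value is not among the types listed for D-modified.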

D-creator: Each Dataset MUST indicate one or more agents with the schema:creator predicate, each typed as a schema:Person or schema:Organization

D-publisher: Each Dataset MUST indicate one or more agents with the schema:publisher predicate, each typed as a schema:Person or schema:Organization
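
A minimal sketch of the D-creator and D-publisher checks is given below, assuming a graph held as `(subject, predicate, object)` tuples of IRI strings. The function name `check_agents` is hypothetical and the sketch is not the profile's SHACL validator.

```python
SDO = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
AGENT_CLASSES = {SDO + "Person", SDO + "Organization"}


def check_agents(triples, dataset):
    """Return messages for D-creator and D-publisher violations on one Dataset."""
    errors = []
    for req, pred in (("D-creator", "creator"), ("D-publisher", "publisher")):
        agents = [o for s, p, o in triples if s == dataset and p == SDO + pred]
        if not agents:
            errors.append(f"{req}: no schema:{pred} given")
        for agent in agents:
            # Each agent needs at least one of the two allowed schema.org types.
            types = {o for s, p, o in triples if s == agent and p == RDF_TYPE}
            if not types & AGENT_CLASSES:
                errors.append(
                    f"{req}: {agent} is not typed as a Person or Organization"
                )
    return errors


if __name__ == "__main__":
    EX = "http://example.com/"
    graph = [
        (EX + "ds", SDO + "creator", EX + "org"),
        (EX + "org", RDF_TYPE, SDO + "Organization"),
        (EX + "ds", SDO + "publisher", EX + "pub"),  # agent left untyped
    ]
    for message in check_agents(graph, EX + "ds"):
        print(message)
```

Here the creator passes, while the untyped publisher produces a D-publisher message.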

D-history: Each Dataset SHOULD indicate how it was produced or its origin by use of one of the following predicates: skos:historyNote, schema:citation, prov:wasDerivedFrom

The first is to be used for a narrative history, the second for a link to a non-semantic source or an academic-style citation and the third for a link to a semantic resource from which this dataset was derived.

D-license: Each Dataset MUST indicate a license that defines conditions for its use with the schema:license predicate, with the value either coming from the RDF Licenses dataset or being a locally defined odrl:Policy instance

The most commonly-used license for open government spatial data in Australia is the Creative Commons BY 4.0 license:

  • <http://purl.org/NET/rdflicense/cc-by4.0>
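
The D-license requirement can be sketched as a check on each license value. The example below is illustrative only. It assumes that membership of the RDF Licenses dataset can be approximated by the IRI prefix of the license shown above, which is a simplification, and that a local policy is recognised by an explicit `odrl:Policy` type. The function name `check_license` is hypothetical.

```python
SDO = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ODRL_POLICY = "http://www.w3.org/ns/odrl/2/Policy"
# Assumption: RDF Licenses dataset IRIs share the prefix of the example above.
RDF_LICENSE_NS = "http://purl.org/NET/rdflicense/"


def check_license(triples, dataset):
    """Return messages for D-license violations on one Dataset."""
    licenses = [o for s, p, o in triples if s == dataset and p == SDO + "license"]
    if not licenses:
        return ["D-license: no schema:license given"]
    errors = []
    for lic in licenses:
        from_rdf_licenses = isinstance(lic, str) and lic.startswith(RDF_LICENSE_NS)
        is_local_policy = (lic, RDF_TYPE, ODRL_POLICY) in triples
        if not (from_rdf_licenses or is_local_policy):
            errors.append(
                f"D-license: {lic} is neither an RDF Licenses IRI nor an odrl:Policy"
            )
    return errors


if __name__ == "__main__":
    EX = "http://example.com/"
    graph = {(EX + "ds", SDO + "license", RDF_LICENSE_NS + "cc-by4.0")}
    print(check_license(graph, EX + "ds"))
```

With the Creative Commons BY 4.0 IRI shown above, the check returns an empty list.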

D-spatial: A Dataset MAY indicate the total region that the features within it occupy by using the geo:hasGeometry predicate or a sub property of it such as geo:hasBoundingBox and, if it does, it must do so with a geo:Geometry object

D-temporal: A Dataset MAY indicate the total temporal range that the features within it occupy by using the time:hasTime predicate

2.2 Feature Collection

This profile matches the OGC API - Features method of arranging multiple feature instances within a dataset by grouping them into feature collections. GeoSPARQL's geo:FeatureCollection instances are not spatial objects themselves but are administrative (managed) collections of geo:Feature instances. They require basic metadata and may also have spatial summary properties such as bounding boxes and temporal extents.

FC-part-of: Each Feature Collection MUST indicate that it is part of exactly one schema:Dataset instance with use of the schema:isPartOf predicate

FC-name: Each Feature Collection MUST have one and only one name which is to be a text literal, indicated using the schema:name predicate

FC-spatial: A Feature Collection MAY indicate the total area that the features within it cover by using the geo:hasGeometry predicate or a sub property of it such as geo:hasBoundingBox

FC-temporal: A Feature Collection MAY indicate the total temporal range that the features within it cover by using the time:hasTime predicate
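
The two mandatory Feature Collection requirements, FC-part-of and FC-name, can be sketched as below. This is a minimal illustration assuming a graph of `(subject, predicate, object)` tuples in which names are plain Python strings; it only counts values and does not check that a name is a text literal. The function name `check_feature_collections` is hypothetical.

```python
SDO = "https://schema.org/"
GEO = "http://www.opengis.net/ont/geosparql#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def check_feature_collections(triples):
    """Return messages for FC-part-of and FC-name violations."""
    types = {(s, o) for s, p, o in triples if p == RDF_TYPE}
    errors = []
    collections = sorted(s for s, o in types if o == GEO + "FeatureCollection")
    for fc in collections:
        # Count only isPartOf targets that are themselves typed as a Dataset.
        parents = [
            o for s, p, o in triples
            if s == fc and p == SDO + "isPartOf" and (o, SDO + "Dataset") in types
        ]
        if len(parents) != 1:
            errors.append(
                f"FC-part-of: {fc} is part of {len(parents)} Datasets, expected 1"
            )
        names = [o for s, p, o in triples if s == fc and p == SDO + "name"]
        if len(names) != 1:
            errors.append(f"FC-name: {fc} has {len(names)} names, expected 1")
    return errors


if __name__ == "__main__":
    EX = "http://example.com/"
    graph = [
        (EX + "ds", RDF_TYPE, SDO + "Dataset"),
        (EX + "fc1", RDF_TYPE, GEO + "FeatureCollection"),
        (EX + "fc1", SDO + "isPartOf", EX + "ds"),
        (EX + "fc1", SDO + "name", "States"),
        (EX + "fc2", RDF_TYPE, GEO + "FeatureCollection"),  # no parent, no name
    ]
    for message in check_feature_collections(graph):
        print(message)
```

In this usage example the first collection passes and the second is reported for both requirements.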

2.3 Feature

Features are the core objects of Loc-I Data Profile datasets. If they have spatial information, they must represent it according to GeoSPARQL and if they have temporal information, they must represent it according to OWL TIME. Features must be indicated as being part of a Feature Collection.

F-part-of: Each Feature MUST indicate that it is part of exactly one geo:FeatureCollection instance with use of the schema:isPartOf predicate

F-geometry: If a Feature indicates a geometry, it MUST do so using the geo:hasGeometry predicate or sub properties of it
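
The Feature requirements can be sketched as below, again assuming a graph of `(subject, predicate, object)` tuples of IRI strings. The set of geometry predicates is a partial, hand-written list of `geo:hasGeometry` and some of its GeoSPARQL 1.1 sub properties, included here as an assumption; a real validator would take sub properties from the ontology. The function name `check_features` is hypothetical.

```python
SDO = "https://schema.org/"
GEO = "http://www.opengis.net/ont/geosparql#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumption: a partial list of geo:hasGeometry and its sub properties.
GEOMETRY_PREDICATES = {
    GEO + name
    for name in ("hasGeometry", "hasDefaultGeometry", "hasBoundingBox", "hasCentroid")
}


def check_features(triples):
    """Return messages for F-part-of and F-geometry violations."""
    types = {(s, o) for s, p, o in triples if p == RDF_TYPE}
    errors = []
    for feature in sorted(s for s, o in types if o == GEO + "Feature"):
        parents = [
            o for s, p, o in triples
            if s == feature
            and p == SDO + "isPartOf"
            and (o, GEO + "FeatureCollection") in types
        ]
        if len(parents) != 1:
            errors.append(
                f"F-part-of: {feature} is part of {len(parents)} "
                "Feature Collections, expected 1"
            )
        # Any link from the Feature to a geo:Geometry must use an allowed predicate.
        for s, p, o in triples:
            if s == feature and (o, GEO + "Geometry") in types:
                if p not in GEOMETRY_PREDICATES:
                    errors.append(f"F-geometry: {feature} links a geometry with {p}")
    return errors


if __name__ == "__main__":
    EX = "http://example.com/"
    graph = [
        (EX + "fc", RDF_TYPE, GEO + "FeatureCollection"),
        (EX + "f1", RDF_TYPE, GEO + "Feature"),
        (EX + "f1", SDO + "isPartOf", EX + "fc"),
        (EX + "g1", RDF_TYPE, GEO + "Geometry"),
        (EX + "f1", EX + "shape", EX + "g1"),  # not a GeoSPARQL predicate
    ]
    for message in check_features(graph):
        print(message)
```

Here the Feature is correctly placed in a Feature Collection, but its geometry is linked with a non-GeoSPARQL predicate and is therefore reported under F-geometry.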

2.4 Geometry

While GeoSPARQL offers many geometry representations, this profile settles on Well-Known Text (WKT) only, because all representations are functionally equivalent for topological SPARQL queries and WKT is widely understood by spatial indexing software.

G-representation: Geometry instances MUST indicate their spatial extent with exactly one geo:asWKT predicate with a geo:wktLiteral datatype
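
A minimal sketch of the G-representation check follows. It assumes the `geo:asWKT` values of one geometry are available as `(lexical form, datatype IRI)` pairs, and the function name `check_geometry` is hypothetical. The regular expression is only a shallow test that the value starts like WKT, optionally preceded by a coordinate reference system IRI in angle brackets; it is not a WKT parser.

```python
import re

GEO = "http://www.opengis.net/ont/geosparql#"

# Shallow shape test: optional "<crs-iri> " prefix, then a WKT geometry keyword.
WKT_START = re.compile(
    r"^(<[^>]+>\s+)?"
    r"(POINT|LINESTRING|POLYGON|MULTIPOINT|MULTILINESTRING|MULTIPOLYGON"
    r"|GEOMETRYCOLLECTION)\b",
    re.IGNORECASE,
)


def check_geometry(values):
    """Return an error message, or None if G-representation is met.

    values: list of (lexical form, datatype IRI) pairs for geo:asWKT.
    """
    if len(values) != 1:
        return f"G-representation: expected exactly one geo:asWKT, found {len(values)}"
    lexical, datatype = values[0]
    if datatype != GEO + "wktLiteral":
        return "G-representation: value must have the geo:wktLiteral datatype"
    if not WKT_START.match(lexical.strip()):
        return "G-representation: value does not look like WKT"
    return None


if __name__ == "__main__":
    square = "POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))"
    print(check_geometry([(square, GEO + "wktLiteral")]))
    print(check_geometry([(square, "http://www.w3.org/2001/XMLSchema#string")]))
```

The first call returns `None` and the second returns a message about the datatype.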

3. References

GeoSPARQL
Nicholas J. Car, Timo Homburg, Matthew Perry, Frans Knibbe, Simon J.D. Cox, Joseph Abhayaratna, Mathias Bonduel, Paul J. Cripps & Krzysztof Janowicz. OGC GeoSPARQL - A Geographic Query Language for RDF Data. 29 January 2024. Open Geospatial Consortium standard. URL: http://www.opengis.net/doc/IS/geosparql/1.1
OGC-API
Clemens Portele, Panagiotis (Peter) A. Vretanos, Charles Heazel (eds.). OGC API - Features - Part 1: Core. 14 October 2019. Open Geospatial Consortium standard. URL: http://www.opengis.net/doc/IS/ogcapi-features-1/1.0
PROF
Rob Atkinson; Nicholas J. Car (eds.). The Profiles Vocabulary. 18 December 2019. W3C Working Group Note. URL: https://www.w3.org/TR/dx-prof/
PROV
Timothy Lebo, Satya Sahoo, Deborah McGuinness (eds.). PROV-O: The PROV Ontology. 2013. W3C Recommendation. URL: http://www.w3.org/TR/prov-o/
OWL-TIME
Simon Cox, Chris Little (eds.). Time Ontology in OWL. 15 November 2022. W3C Candidate Recommendation Draft. URL: https://www.w3.org/TR/owl-time/
schema.org
Schema.org community. schema.org Data model and website. Schema.org community, 2025. URL: https://schema.org.