A 'Supermodel' is a set of data models designed to integrate data. This Supermodel has been built for DCCEEW to integrate Environmental Information Australia datasets.
Note: This Supermodel is demonstrated through the EIA Test Catalogue, which delivers all parts of this Supermodel and demonstration data created to show integration. A Scenario Demonstration is also provided to test querying the data.
1. Metadata
| Property | Value |
|---|---|
| IRI | |
| Title | EIA Supermodel |
| Description | This is a demonstration enterprise data model - a Supermodel - that has part models to represent specialised datasets yet works together to integrate data across all of Environmental Information Australia’s scope. |
| Created | 2025-02-15 |
| Modified | 2025-11-16 |
| Issued | 2025-08-30 |
| Version | 1 |
| Creator | KurrawongAI, for DCCEEW |
| Publisher | Department of Climate Change, Energy, the Environment and Water (DCCEEW) |
| License | |
| Code Repository | |
2. Introduction
2.1. What is a Supermodel
A "supermodel" is a set of models that have been designed to allow for the integration of data from different datasets. They achieve this by ensuring the models for each dataset implement the same patterns for common elements, such as spatiality, and they contain reference data, such as controlled vocabularies of values, that the datasets refer to, providing join points.
This Supermodel follows the generic Supermodel Model defined at https://linked.data.gov.au/def/supermodel.
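The join-point idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two datasets, their records and every `https://example.org/...` IRI are hypothetical, and plain Python dictionaries stand in for real RDF data.

```python
# Hypothetical reference data: a controlled vocabulary shared by both datasets.
SPECIES_VOCAB = {
    "https://example.org/species/koala": "Phascolarctos cinereus",
    "https://example.org/species/emu": "Dromaius novaehollandiae",
}

# Two hypothetical datasets, created separately, that both refer to the
# vocabulary by IRI rather than by free-text name.
observations = [
    {"id": "obs-1", "species": "https://example.org/species/koala", "count": 3},
    {"id": "obs-2", "species": "https://example.org/species/emu", "count": 1},
]
traits = [
    {"species": "https://example.org/species/koala", "trait": "body mass", "value": "8 kg"},
]


def join_on_reference(left, right, key):
    """Pair records from two datasets that point at the same reference IRI."""
    index = {}
    for record in right:
        index.setdefault(record[key], []).append(record)
    return [(a, b) for a in left for b in index.get(a[key], [])]


joined = join_on_reference(observations, traits, "species")
for obs, trait in joined:
    print(SPECIES_VOCAB[obs["species"]], obs["count"], trait["trait"], trait["value"])
```

Because both datasets use the same IRI for the same species, no name matching or manual alignment is needed before the join.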
2.2. This Supermodel’s Origins
Like all large and long-term data holders, the Department of Climate Change, Energy, the Environment and Water (DCCEEW) has many datasets that, while conceptually related within the environment domain, were created separately and without interoperability as a priority. As a result, analysts must put considerable effort into aligning data before using multiple datasets together, effort that is likely duplicated by analysts unaware of others' work.
In early 2025, DCCEEW conducted a demonstration project called the EIA Supermodel Demonstrator that aimed to show how integration-ready data might appear and be used.
The target datasets for this work were within DCCEEW’s Environmental Information Australia initiative, and DCCEEW’s then-new Biodiversity Data Repository (BDR) was used as the reference dataset: the one whose form other datasets would emulate. This was because the BDR had been designed specifically for cross-dataset integration.
2.3. Dataset Scope
The scope of this Supermodel was set within its establishment project to initially cover data from 6 DCCEEW datasets:
- BDR
- AusTraits
- ANSIS
- NVIS
- SPRAT
- HCAS
Additionally, the National Species List (NSL) was included as a seventh dataset, as it is already referenced by the BDR and species information is used in much DCCEEW data:
- National Species List
An eighth dataset, Reference Areas, was established too. This is a general-purpose, background dataset of reference spatial areas, such as Ramsar wetlands and other named areas of ecological significance, that are often used by DCCEEW for reporting:
- Reference Areas
This Supermodel contains Foreground Models for each of these datasets, as well as 20+ supporting assets - vocabularies, data validators - formulated according to the Supermodel Model specification. These are detailed below.
Together, the Foreground Models, Background Models and other Supermodel elements allow all parts of these DCCEEW datasets to be integrated. For example, the spatial parts of any two datasets may be overlaid, the observations from datasets that contain them can be used together, and reference values for common attributes are aligned.
2.4. Modelling System
All the models in this Supermodel are implemented using the Web Ontology Language (OWL). OWL is a very widely used, standardised, formal modelling language. Unlike UML models, OWL models natively have machine-readable forms, allowing data made according to them to be automatically validated and processed into databases.
The BDR is natively modelled in OWL using the ABIS, and AusTraits is too, using the OBOE, ETS and Darwin Core models. There is also a pre-existing ANSIS OWL ontology, into which it’s easy to transform data delivering service responses. The other 3 datasets - NVIS, SPRAT and HCAS - have had models, or partial models, created for them in OWL for the first time within the project that generated this Supermodel.
The Supermodel’s constituent models are related to one another using The Profiles Vocabulary (PROF), which provides properties to indicate when and how one model reuses - "profiles" - another. PROF also links models to the validators created to test data claiming conformance to them.
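As a rough sketch of how such profile relations can be followed, the example below stores `prof:isProfileOf` links and role-tagged resources (`prof:hasResource` / `prof:hasRole`) in plain Python dictionaries. The BDR-PR to ABIS link comes from this document; the ABIS to GeoSPARQL link and all artifact file names are assumptions made purely for illustration.

```python
# prof:isProfileOf relations: each model maps to the specifications it profiles.
# "bdr-pr" -> "abis" is stated in this document; "abis" -> "geosparql" is an
# assumption added here to show a multi-step chain.
IS_PROFILE_OF = {
    "bdr-pr": ["abis"],
    "abis": ["geosparql"],
    "geosparql": [],
}

# Resources attached to each model, as (role, artifact) pairs.
# All artifact file names are hypothetical.
RESOURCES = {
    "bdr-pr": [("validation", "bdr-pr-validator.ttl")],
    "abis": [("validation", "abis-validator.ttl"), ("specification", "abis.html")],
    "geosparql": [("validation", "geosparql-validator.ttl")],
}


def transitive_bases(model):
    """Return every specification the model profiles, directly or indirectly."""
    seen = []
    stack = list(IS_PROFILE_OF.get(model, []))
    while stack:
        base = stack.pop()
        if base not in seen:
            seen.append(base)
            stack.extend(IS_PROFILE_OF.get(base, []))
    return seen


def validators_for(model):
    """Collect validation artifacts for a model and everything it profiles."""
    found = []
    for name in [model] + transitive_bases(model):
        for role, artifact in RESOURCES.get(name, []):
            if role == "validation":
                found.append(artifact)
    return found


print(validators_for("bdr-pr"))
```

The point of the sketch is that data claiming conformance to a profile can be tested against the validators of every model that profile builds on, not only its own.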
Data modelled in OWL is represented in RDF, which is a graph ('node-edge-node') data structure. RDF allows models, and data made according to those models, to be stored in the same kinds of files or databases. RDF is also open-ended: new statements can always be added, allowing not just models but also schemas to change and grow easily.
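The 'node-edge-node' structure can be illustrated with a minimal sketch that treats a graph as a set of subject-predicate-object tuples. This is not an RDF library, only a stand-in; the `ex:` names are hypothetical, while `owl:Class`, `rdfs:subClassOf`, `rdfs:label`, `rdf:type` and `geo:Feature` are real terms written as prefixed strings.

```python
# A graph is just a set of (subject, predicate, object) triples.
graph = set()

# Model statements: a hypothetical class defined as a kind of GeoSPARQL Feature.
graph.add(("ex:Site", "rdf:type", "owl:Class"))
graph.add(("ex:Site", "rdfs:subClassOf", "geo:Feature"))

# Data statements live in the same structure as the model.
graph.add(("ex:site-1", "rdf:type", "ex:Site"))
graph.add(("ex:site-1", "rdfs:label", "Demonstration site"))


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Extending the data needs no schema migration: just add another triple.
graph.add(("ex:site-1", "ex:elevationMetres", "212"))

for triple in match(graph, s="ex:site-1"):
    print(triple)
```

Model and data sit side by side in one structure, and adding a new property is simply adding a new edge.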
2.5. Patterning
Integrated use of multiple Foreground Models depends on each of them implementing, or mapping to, parts of the Background Models referred to as patterns. The patterns of relevance to this EIA Supermodel do not form a finite list, as patterns can exist within and overlap other patterns. However, the Patterns section lists a set that covers the scoped datasets' implementation at least once over.
2.6. Validation
For demonstrable interoperability, this Supermodel contains data validators for some of its Background and Foreground Models. These test data claiming conformance to those models against identified patterns. The validators are executable model specifications.
Most of the validators for the models within this Supermodel were created by the original model implementors, e.g. GeoSPARQL & ABIS, but one validator in particular was created for the EIA Demonstrator project: the EIA Data Profile. This validator is used to ensure datasets have basic metadata required to implement patterns listed in the Patterns section.
The Validators section below lists all the validators relevant to this Supermodel and indicates their dependencies. It also describes how to validate data.
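To give a feel for what an "executable model specification" does, here is a minimal Python stand-in for a minimum-cardinality check of the kind a SHACL shape expresses with `sh:minCount 1`. The required properties listed are assumptions chosen for illustration; they are not the actual rules of the EIA Data Profile, and the dataset records are hypothetical.

```python
# Assumed, illustrative list of required metadata properties.
REQUIRED_PROPERTIES = ["dcterms:title", "dcterms:description", "dcterms:created"]


def validate_dataset(metadata):
    """Return a list of violations; an empty list means the check passed."""
    violations = []
    for prop in REQUIRED_PROPERTIES:
        values = metadata.get(prop, [])
        if len(values) < 1:  # analogous to sh:minCount 1
            violations.append(f"missing required property {prop}")
    return violations


# Hypothetical dataset metadata: property name -> list of values.
complete = {
    "dcterms:title": ["Demonstration dataset"],
    "dcterms:description": ["Data created to show integration"],
    "dcterms:created": ["2025-02-15"],
}
incomplete = {
    "dcterms:title": ["Untitled"],
    "dcterms:description": ["No creation date recorded"],
}

print(validate_dataset(complete))
print(validate_dataset(incomplete))
```

A real validator would be written in SHACL and run by a SHACL engine; the sketch only mirrors the shape of the result, a list of violations per dataset.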
2.7. Definitions
Here is a list of terms and acronyms used in this document.
- Background Model
-
A role within a Supermodel for low level or generic models that some, but not necessarily all, of the Foreground Models reuse and extend, depending on the patterns of data they contain.
- Foreground Model
-
A role within a Supermodel for the models of individual datasets within the set aiming for interoperability. Foreground Models must reuse and extend the Background Models.
- Feature
-
The class of object for "Anything spatial (being or having a shape, position or an extent)", according to GeoSPARQL
- IRI
-
Internationalized Resource Identifiers (IRIs) are Internet protocol standard identifiers used to identify, and often to link to representations of, resources. IRIs add internationalisation (use of different character sets) to Uniform Resource Identifiers (URIs), which are a superset of Uniform Resource Locators (URLs). Where URLs - web addresses - must link to resources, URIs often do but need not. [ref]
- pattern
-
In the context of a Supermodel, a pattern is a small data model. Background Models implement many patterns within them, either implicitly or explicitly.
- profile
-
"A specification that constrains, extends, combines, or provides guidance or explanation about the usage of other specifications" according to The Profiles Vocabulary.
- Supermodel
-
A set of integrated data models, each with a defined role, used to make multiple datasets interoperable.
- Unified Modelling Language, UML
-
A general-purpose visual modeling language that is intended to provide a standard way to visualize the design of a system. [ref]
- Vocabulary
-
A controlled set of defined terms. Within Supermodel contexts, all vocabularies reuse and extend the SKOS vocabulary model.
- Web Ontology Language, OWL
-
A widely used international standard modelling language that allows for machine-readability of models.
3. Patterns
Modelling patterns are conventional, or even standardised, ways of modelling particular data scenarios. For example, within the OWL modelling world - see Modelling System above for context - if you want to associate something with something else, but also want to include additional information in that association, such as the time period during which a relationship between two people held, you can use the Qualified Relations pattern, which looks like this:
Figure: Top: a direct relationship - marriedTo - between two people. Bottom: a qualified relationship between people where the start and end dates are indicated.
The Qualified Relations pattern shown above is not defined by any particular model within this Supermodel but is implemented by many of them. It is a fundamental graph modelling pattern used by modelling systems such as OWL.
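The difference between the direct and the qualified form can also be sketched as triples. Only `marriedTo` and the start and end dates come from the example above; the people, the intermediate node, the other property names and the date values are hypothetical.

```python
# Direct relationship: one edge, with no place to hang extra detail.
direct = [("ex:alice", "ex:marriedTo", "ex:bob")]

# Qualified relationship: an intermediate node stands for the relationship
# itself, so further properties such as dates can be attached to it.
qualified = [
    ("ex:alice", "ex:hasMarriage", "ex:marriage-1"),
    ("ex:marriage-1", "ex:spouse", "ex:bob"),
    ("ex:marriage-1", "ex:startDate", "2001-05-12"),
    ("ex:marriage-1", "ex:endDate", "2010-09-30"),
]


def describe(triples, node):
    """Return the properties attached to one node as a dictionary."""
    return {p: o for s, p, o in triples if s == node}


def to_direct(triples):
    """Collapse each qualified relationship to a plain marriedTo edge."""
    edges = []
    for s, p, node in triples:
        if p == "ex:hasMarriage":
            for s2, p2, o2 in triples:
                if s2 == node and p2 == "ex:spouse":
                    edges.append((s, "ex:marriedTo", o2))
    return edges


print(describe(qualified, "ex:marriage-1"))
print(to_direct(qualified))
```

The qualified form carries strictly more information: the direct edge can always be derived from it, but not the other way round.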
There is no one definitive set of patterns that can be extracted from this Supermodel’s models, as patterns overlap both across and sometimes within models. However, patterns that are deemed important are often formalised with validation rules that look for them in data; see the Validators section.
The following patterns are taken from this Supermodel’s models and validators. They cover major aspects of data that need to be modelled in particular ways to achieve certain interoperability outcomes.
| Pattern | Purpose | Implementation |
|---|---|---|
| Basic Dataset Metadata | To allow simple listing and filtering of datasets in catalogues | |
| Spatial Dataset Structure | To provide a conventional structure for spatial datasets to allow their use in spatial (GIS) tools that require fixed structures | Implemented by: ANSIS Dataset, Validated by: |
| Environmental Dataset Domain Categorisation | To provide a high-level thematic categorisation of all datasets within the EIA Supermodel scope | Datasets categorised according to the EIA Data Kinds Vocabulary |
| Object Spatiality | To provide a conventional way to indicate spatiality of objects | Defined by the GeoSPARQL Ontology |
| Classification Term Definition | To provide a conventional way to present defined terms | Defined by the VocPub Profile |
| Classification Vocabulary Structure | To provide a conventional way to present collections of defined terms | Defined by the VocPub Profile |
| Model Classes | To provide a conventional way to define classes of data objects within models | Defined by the OntPub Profile |
| Model Relationships | To allow the dependencies between models to be known | Defined by the Profiles Vocabulary |
| Observations | To place observations, results, methods used etc. in relation to one another | Defined by the SOSA Ontology within the SSN Ontology |
| Domain-wide Observable Properties | To provide a single, master, listing of Observable Properties across all EIA datasets allowing for simple cross-querying | The list is presented as a vocabulary in the EIA Observable Properties Vocabulary |
The above list of patterns is not exhaustive - no list could be as patterns overlap other patterns - and should be curated and grown as new valuable patterns are determined.
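As an illustration of how the Observations and Domain-wide Observable Properties patterns combine, the sketch below records observations from two hypothetical datasets using keys named after SOSA properties, then selects them by a shared Observable Property IRI. The dataset names, IRIs and values are invented; only the SOSA property names in the comments are real.

```python
# Hypothetical Observable Property IRIs from a shared vocabulary.
SOIL_PH = "https://example.org/op/soil-ph"
CANOPY_COVER = "https://example.org/op/canopy-cover"

# Each observation relates a feature, a property, a procedure and a result.
observations = [
    {
        "dataset": "dataset-a",
        "feature_of_interest": "ex:site-1",  # sosa:hasFeatureOfInterest
        "observed_property": SOIL_PH,        # sosa:observedProperty
        "procedure": "ex:method-1",          # sosa:usedProcedure
        "result": 6.4,                       # sosa:hasSimpleResult
    },
    {
        "dataset": "dataset-b",
        "feature_of_interest": "ex:site-7",
        "observed_property": SOIL_PH,
        "procedure": "ex:method-2",
        "result": 5.9,
    },
    {
        "dataset": "dataset-b",
        "feature_of_interest": "ex:site-7",
        "observed_property": CANOPY_COVER,
        "procedure": "ex:method-3",
        "result": 0.35,
    },
]


def by_property(observations, property_iri):
    """Select observations of one Observable Property, whatever their dataset."""
    return [o for o in observations if o["observed_property"] == property_iri]


for obs in by_property(observations, SOIL_PH):
    print(obs["dataset"], obs["feature_of_interest"], obs["result"])
```

Because both datasets point at the same Observable Property IRI, one filter retrieves comparable observations from each without any per-dataset mapping.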
4. Models
4.1. Foreground Models
Of the 8 datasets in this Supermodel’s scope, 7 are bound into it as follows:
- AusTraits
-
follows an international domain ontology, the Ecological Trait-data Standard, which is compatible with this Supermodel
- BDR
-
implements a profile of the Australian Semantic Web data exchange standard, ABIS, which this Supermodel incorporates
- ANSIS & NSL
-
have purpose-built ontologies, the ANSIS Ontology & NSL Model, which are compatible with ABIS and this Supermodel
- NVIS & HCAS
-
don’t need Foreground Models as their concerns can be represented using this Supermodel’s Background Models
- Reference Areas
-
conforms to GeoSPARQL, the background spatial model
Only one dataset, SPRAT, is not yet bound into the Supermodel by one of the methods applied to the other 7. This is planned to happen soon.
The AusTraits to NSL Mapping Linkset doesn’t need a Foreground Model either IF a single predicate can be added to either AusTraits.
Here are details of each of the four Foreground Models:
Listing
The listing of all Foreground Models used in this Supermodel is given in the EIA Test Models Catalogue where the models are indicated as having the role Foreground Model.
4.2. Background Models
Background Models are models used to implement the patterns listed above that are not Foreground Models, that is, they do not aim to model the entire contents of one of the Supermodel's datasets.
Some of these Background Models are several steps removed from the Foreground Models: they are Background Models for other Background Models that the Foreground Models in turn reuse.
Many of these Background Models are well-known international standard models.
Listing
The listing of all Background Models used in this Supermodel is given in the EIA Test Models Catalogue where the models are indicated as having the role Background Model.
5. Vocabularies
Some of the vocabularies within this Supermodel are relevant to all datasets within its scope; others are relevant to particular ones.
Those relevant to all datasets are required for use by the EIA Data Profile and are:
- EnvThes
-
"a set of terms in order to describe in a harmonised way data resulting from observations and measurements of ecosystem processes across different domain specific sciences"
- EIA Data Kinds Vocabulary
-
"A vocabulary of the kinds of data within Environmental Information Australia"
- EIA Observable Properties Vocabulary
-
"A vocabulary of all the Observable Properties within the Environmental Information Australia Supermodel"
Each dataset claiming conformance to this Supermodel must be classified according to at least one EnvThes term, as per requirements encoded in the EIA Data Profile.
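The classification requirement amounts to a membership test against the vocabulary. The sketch below assumes, purely for illustration, that the classification is carried by `dcat:theme` and uses invented term IRIs in place of real EnvThes terms; the property actually required is set by the EIA Data Profile.

```python
# Hypothetical stand-ins for EnvThes term IRIs.
ENVTHES_TERMS = {
    "https://example.org/envthes/soil",
    "https://example.org/envthes/vegetation",
}


def is_classified(dataset, vocabulary=ENVTHES_TERMS, prop="dcat:theme"):
    """True if at least one value of the classifying property is a vocabulary term."""
    return any(term in vocabulary for term in dataset.get(prop, []))


# Hypothetical dataset metadata: property name -> list of values.
soil_dataset = {"dcat:theme": ["https://example.org/envthes/soil"]}
untagged_dataset = {"dcat:theme": ["https://example.org/other/finance"]}

print(is_classified(soil_dataset))
print(is_classified(untagged_dataset))
```

Unlike a simple "property is present" check, this test fails when a dataset is tagged only with terms from outside the required vocabulary.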
The vocabularies relevant to individual Foreground Models and Background Models are defined, or indicated for use, by those models.
Listing
The listing of all vocabularies used in this Supermodel is given in the EIA Test Vocabulary Catalogue.
6. Validators
The validators relevant to this Supermodel come from a range of sources and some derive from, and extend upon, others. The following table lists all of them with derivation notes.
| Validator | Derived From | Relevant Models | Notes |
|---|---|---|---|
Basic Catalogue Profile |
DCAT |
Validates Datasets' metadata contains basic annotations and relations to Agents |
|
DCAT, GeoSPARQL |
DCAT, schema.org |
Requires minimum metadata for catalogued datasets containing spatial data |
|
Loc-I Data Profile |
DCAT, schema.org |
Requires minimum environmental domain dataset metadata |
|
GeoSPARQL |
GeoSPARQL |
Basic GeoSPARQL validation |
|
ABIS & BDR-PR |
Validates all aspects of ABIS, and BDR Profile of ABIS, data |
||
DCAT, OWL |
DCAT, OWL |
Requires minimum cataloguing metadata, structural composition and element annotations for OWL Ontologies |
|
DCAT, SKOS |
DCAT, SKOS |
Requires minimum cataloguing metadata and structural composition for SKOS vocabularies |
All the validators used within this Supermodel are implemented in SHACL, an RDF data validation language. Validation of data may be carried out as per the ABIS' Performing Validation section.
Listing
The listing of all validators used in this Supermodel is given in the EIA Test Validators Catalogue.
7. References
- ABIS
-
Australian Biodiversity Information Governance Group, Australian Biodiversity Information Standard. Australian government data standard (4 December 2023). https://linked.data.gov.au/def/abis
- ANSISO
-
Megan Wong & Simon JD Cox, ANSIS Ontology. Community proposed data standard (21 July 2022). https://anzsoil.org/def/au/domain
- BDRPR
-
DCCEEW, BDR Profile of ABIS. System data model (2025). https://linked.data.gov.au/def/bdr-pr
- DCAT
-
World Wide Web Consortium, Data Catalog Vocabulary (DCAT) - Version 3, W3C Recommendation (22 August 2024). https://www.w3.org/TR/vocab-dcat-3/
- EIADP
-
Department of Climate Change, Energy, the Environment and Water, EIA Data Profile, Community proposed data standard (15 May 2025). https://linked.data.gov.au/def/eia-dp
- ETS
-
Schneider, F.D., Jochum, M., Le Provost, G., Ostrowski, A., Penone, C. and Simons, N.K., Ecological Trait-data Standard v0.10, Community data standard (28 March 2019) https://doi.org/10.5281/zenodo.2605377
- GeoSPARQL
-
Open Geospatial Consortium, OGC GeoSPARQL - A Geographic Query Language for RDF Data, Version 1.1, OGC® Implementation Specification (2024). http://www.opengis.net/doc/IS/geosparql/1.1
- LOCIDP
-
Geoscience Australia, Loc-I Data Profile, Semantic Web profile (3 April 2021). https://linked.data.gov.au/def/loci-dp
- NSLMODEL
-
Centre for Australian National Biodiversity Research, National Species List - Semantic Web Model (15 October 2023). https://linked.data.gov.au/def/nsl
- ONTPUB
-
Australian Government Linked Data Working Group, Ontology Publication Profile of OWL, Australian community standard (2020). https://linked.data.gov.au/def/ontpub
- OWL
-
World Wide Web Consortium, OWL 2 Web Ontology Language Document Overview (Second Edition), W3C Recommendation (11 December 2012). https://www.w3.org/TR/owl2-overview/
- Profiles Vocabulary
-
World Wide Web Consortium, The Profiles Vocabulary, W3C Working Group Note (18 December 2019). https://www.w3.org/TR/dx-prof/
- RDF
-
World Wide Web Consortium, RDF 1.1 Concepts and Abstract Syntax, W3C Recommendation (25 February 2014). https://www.w3.org/TR/rdf11-concepts/
- schema.org
-
W3C Schema.org Community Group, schema.org, Semantic Web model (2015). https://schema.org
- SHACL
-
World Wide Web Consortium, Shapes Constraint Language (SHACL), W3C Recommendation (20 July 2017). https://www.w3.org/TR/shacl/
- SKOS
-
World Wide Web Consortium, SKOS Simple Knowledge Organization System, W3C Recommendation (18 August 2009). Semantic Web model. https://www.w3.org/TR/skos-reference/
- VOCPUB
-
Australian Government Linked Data Working Group, VocPub Profile of SKOS, Australian community standard (2020). https://linked.data.gov.au/def/vocpub