Reusing Mapping suite packages

The structure of the package evolves through the different phases of the mapping development process. Three of these phases are described in this section.

Initial Phase

Audience: Semantic Engineers

In the initial phase, when the Semantic Engineers start working on a new mapping suite, they should set up a package folder structure similar to the one described below.

One package per form number is considered the optimal solution when naming and organising the various mapping suites.

An example mapping package folder structure is presented below:

/package_Fxx
    /transformation
        conceptual_mappings.xlsx
        /mappings
            *.rml.ttl
        /resources
            *.json, *.xml, *.csv
    /test_data
        *.xml

/package_Fxx root folder of the mapping suite

/transformation/conceptual_mappings.xlsx manually created

/transformation/resources additional resources possibly needed by the transformation rules.
The content of this folder should be generated automatically by a mapping package processor, based on the "Resources" sheet of conceptual_mappings.xlsx

IMPORTANT: In the transformation rules, the source XML is always referenced as data/source.xml, which corresponds to the ../../data/source.xml file that is copied (and renamed) from the test_data folder when the mapping is executed.

/test_data manually and carefully selected test data, possibly grouped in subfolders, e.g. /test_data/batch-D1/*.xml

technical_mappings.yarrrml.yaml (optional) manually created; it was used in the early days of the mapping development, but is currently not used
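
The initial layout described above could be scaffolded with a few lines of Python. This is only a minimal sketch and not part of the mapping suite tooling: the helper name `create_initial_package` and the example form number `F03` are hypothetical. Only folders are created, since conceptual_mappings.xlsx is written manually.

```python
from pathlib import Path

# Sub-folders of the initial package, relative to the package root.
# Taken from the example layout; conceptual_mappings.xlsx is created
# manually and is therefore not generated here.
INITIAL_FOLDERS = (
    "transformation/mappings",
    "transformation/resources",
    "test_data",
)


def create_initial_package(parent: Path, form_number: str) -> Path:
    """Create the empty folder skeleton for one form number (hypothetical helper)."""
    package_root = parent / f"package_{form_number}"
    for folder in INITIAL_FOLDERS:
        # parents=True also creates the package root and /transformation
        (package_root / folder).mkdir(parents=True, exist_ok=True)
    return package_root
```

Calling `create_initial_package(Path("."), "F03")` would create `package_F03` in the current directory; running it again leaves existing folders untouched.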

Mapping suite package description for the Software Engineers

A package provided by the semantic engineers (SE) is enriched with additional artefacts that are generated automatically using the package expanding tools which take as input the artefacts provided by the SE. Here are some examples of these additional artefacts that are being generated:

  • Metadata describing the parameters for selecting the notices that the mappings can be applied to, various version information, etc.

  • SPARQL queries that can be used to validate and/or test the generated outputs

  • SHACL shapes that can be used to validate the structure of the generated outputs

  • Other artefacts that may have been added since this document was written

After the package processing/expansion, the structure of the example mapping package presented in the previous subsection would look like this:

/package_Fxx
    metadata.json
    /transformation
        conceptual_mappings.xlsx
        /mappings
            *.rml.ttl
        /resources
            *.json, *.xml, *.csv
    /data
        source.xml
    /output
        *.rdf
    /validation
        /sparql
            /cm_assertions
                *.rq
        /shacl # this is a constant, when the SHACL is known (currently unknown)
            *.shacl.ttl # data shape file(s)
    /test_data # manually and carefully selected test data
        *.xml

    /metadata.json - automatically generated from the "Metadata" sheet of conceptual_mappings.xlsx

    /data - this is a placeholder created at runtime to process the inputs. It serves only when the mapping suite is being tested, or executed by some script.

    /data/source.xml - this file is generated at runtime by copying a given test data file

    /output - this is a placeholder created at runtime to store outputs. It serves only when the mapping suite is being tested, or executed by some script.

    /validation/sparql/cm_assertions - SPARQL queries automatically generated from the conceptual mapping
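
The runtime placeholders described above could be prepared as follows. This is a minimal sketch under stated assumptions, not the actual behaviour of the package processing tools: the helper name `prepare_run` is hypothetical, and the sketch simply overwrites any previous data/source.xml.

```python
import shutil
from pathlib import Path


def prepare_run(package_root: Path, test_file: Path) -> Path:
    """Copy one test data file to data/source.xml and ensure output/ exists."""
    data_dir = package_root / "data"
    data_dir.mkdir(exist_ok=True)
    (package_root / "output").mkdir(exist_ok=True)

    source = data_dir / "source.xml"
    # The mapping rules always read data/source.xml, so the selected
    # test file is copied and renamed, replacing any previous copy.
    shutil.copyfile(test_file, source)
    return source
```

Because every run replaces the same data/source.xml, a script using this approach would process the test files one at a time.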

Mapping suite package description for the Semantic Engineers after the expansion

After the “execution” of a mapping, the mapping package will be further enriched, and will contain additional files, as a result of running the mapping suite on the included test data.

/package_Fxx
    metadata.json
    /transformation
        conceptual_mappings.xlsx
        /mappings
            *.rml.ttl
        /resources
            *.json, *.xml, *.csv
    /data
        source.xml
    /output
        /<notice_file1>
            <notice_file1>.ttl
            /test_suite_report
                *.ttl, *.html, *.json # e.g. sparql_cm_assertions.html, shacl_epo.html, xml_coverage.html
        /<notice_file2>
            ...
        /<notice_file3>
            ...
    /validation
        /sparql
            /cm_assertions
                *.rq
        /shacl
            /epo
                ePO_shacl_shapes.rdf
            shacl_result_query.rq
    /test_data
        <notice_file1>.xml
        <notice_file2>.xml
        <notice_file3>.xml
        *.xml

/output/<notice_file1> for each example file a folder is created that contains all the generated artefacts for that sample file

/output/test_suite_report validation reports summarising all individual reports

/output/<notice_file1>/<notice_file1>.ttl the output of the transformation
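
The naming convention for the output folders can be illustrated with a small sketch that derives, for every test data file, the path where its transformation output is expected. The helper name `expected_outputs` is hypothetical, and the sketch assumes that files grouped in subfolders of /test_data still get an output folder directly under /output, which the description above does not state.

```python
from pathlib import Path
from typing import Dict


def expected_outputs(package_root: Path) -> Dict[Path, Path]:
    """Map each test data file to the RDF file expected under /output."""
    test_data = package_root / "test_data"
    outputs: Dict[Path, Path] = {}
    # rglob also finds files grouped in subfolders such as test_data/batch-D1
    for xml_file in sorted(test_data.rglob("*.xml")):
        notice = xml_file.stem  # file name without the .xml extension
        # Assumption: one folder per notice, directly under /output
        outputs[xml_file] = package_root / "output" / notice / f"{notice}.ttl"
    return outputs
```

A check that compares these expected paths with the files actually present under /output would show which test notices have not been transformed yet.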

Code list mappings

Data samples

Versioning

