Codelists in eForms

Vocabularies

The Publications Office of the European Union (OP) has defined and published many vocabularies for use across a wide range of sectors on the EU Vocabularies website.

A subset of these vocabularies is relevant to the domain of eProcurement. Most of them are flat lists of terms and their definitions; some, such as NUTS (Nomenclature of Territorial Units for Statistics), are hierarchical taxonomies. They are listed on the Authority tables and taxonomies used in eProcurement page.

Codelists

A codelist implements a vocabulary: a list of language-independent codes that represent an agreed set of values for a particular data context.

Codelists in eForms

Codelists are used in eForms notices to restrict the values of some data items. Each code has a short label. The majority of codelists also have descriptions in each of the 24 official EU languages for every code. Codelists whose codes are language- or geography-dependent, such as the NUTS codes, do not have such translations.

Format of codelists

EU Vocabularies provides the codelists in a number of different formats for use in different applications. The format chosen for use in the eForms SDK is the OASIS Genericode XML format, which was designed for the interchange of machine-readable codelists.

Information on codelists

The file named codelists.json provides information on each codelist.

Structure of codelists.json
{
  "ublVersion" : "2.3", (1)
  "sdkVersion" : "1.0.0", (2)
  "metadataDatabase" : { (3)
    "version" : "1.0.0",
    "createdOn" : "2022-08-05T10:24:40"
  },
  "codelists" : [
    {
      "id" : "accelerated-procedure", (4)
      "parentId" : "indicator", (5)
      "filename" : "indicator_accelerated-procedure.gc", (6)
      "description" : "List of codes to specify whether a procedure is accelerated or not.", (7)
      "_label" : "codelist|name|accelerated-procedure" (8)
    },
    ...
  ]
}
1 Version of the UBL standard used.
2 Version of the eForms SDK the file belongs to.
3 Version number and date of the data used to create this file.
4 Identifier of the codelist.
5 Identifier of the parent codelist, if the codelist is a tailored codelist.
6 Name of the file containing the codelist.
7 Description of the codelist.
8 Identifier of the label for the name of the codelist. The label is available in the translations folder.
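
As a rough illustration of how this metadata might be consumed, the Python sketch below parses a trimmed sample in the shape shown above and separates tailored codelists from the others. The second entry ("indicator") with its filename and label, and the helper name index_codelists, are assumptions made for the example; they are not taken from the SDK.

```python
import json

# Trimmed sample in the shape of codelists.json. The "indicator" entry is a
# hypothetical addition so that the parent lookup has something to find.
SAMPLE = """
{
  "ublVersion": "2.3",
  "sdkVersion": "1.0.0",
  "codelists": [
    {
      "id": "accelerated-procedure",
      "parentId": "indicator",
      "filename": "indicator_accelerated-procedure.gc",
      "_label": "codelist|name|accelerated-procedure"
    },
    {
      "id": "indicator",
      "filename": "indicator.gc",
      "_label": "codelist|name|indicator"
    }
  ]
}
"""


def index_codelists(metadata):
    """Return a dict mapping each codelist id to its metadata entry."""
    return {entry["id"]: entry for entry in metadata["codelists"]}


metadata = json.loads(SAMPLE)
by_id = index_codelists(metadata)

# Assumption: a codelist is tailored exactly when it names a parent codelist.
tailored = sorted(cid for cid, entry in by_id.items() if entry.get("parentId"))

print(tailored)
print(by_id["accelerated-procedure"]["filename"])
```

Using entry.get("parentId") treats a missing, null, or empty parent identifier the same way, which keeps the sketch tolerant of either representation.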

In eForms, codelists are referenced using their identifier. The identifier is also given in the codelist Genericode files, in the "LongName" element that has no attributes:

<LongName>accelerated-procedure</LongName>
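
A minimal sketch of reading that identifier with Python's standard library follows. The sample fragment is an assumption: it is heavily trimmed, it combines the element above with a hypothetical attributed "LongName" for contrast, and it assumes the elements carry no XML namespace. The function name codelist_id is likewise illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed fragment: one LongName with an attribute and one
# without. Real genericode files contain more elements than shown here.
SAMPLE = """
<Identification>
  <LongName Identifier="eFormsParentId">indicator</LongName>
  <LongName>accelerated-procedure</LongName>
</Identification>
"""


def codelist_id(identification):
    """Return the text of the first LongName element that has no attributes."""
    for long_name in identification.iter("LongName"):
        if not long_name.attrib:
            return (long_name.text or "").strip()
    return None


identification = ET.fromstring(SAMPLE.strip())
print(codelist_id(identification))
```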

Types of codelists

There are three types of codelist used in eForms, based on their source and applicability:

  • Codelists derived from the EU Vocabularies site

  • Tailored codelists

  • Technical codelists

Codelists from EU Vocabularies

These are eProcurement codelists published and maintained on the Authority tables and taxonomies used in eProcurement page of EU Vocabularies.

The SDK includes only the codes that are relevant to eForms. For example, currencies that are no longer in use are not included. The relevant codes are marked with the "EFORMS" use context, which is recorded in the SKOS files available on EU Vocabularies:

<lemon:context rdf:resource="http://publications.europa.eu/resource/authority/use-context/EFORMS"/>

As NUTS and CPV have no use context, the SDK takes the complete current version of these codelists.
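
The filtering by use context could be sketched in Python roughly as follows. Everything about the sample document is an assumption made for illustration: the nesting of the context element inside a skos:Concept, the namespace URI bound to the lemon prefix, and the two currency concepts. Only the use-context URI is taken from the element shown above. To avoid depending on the assumed lemon namespace, the code matches that element by its local name.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
EFORMS_CONTEXT = "http://publications.europa.eu/resource/authority/use-context/EFORMS"

# Hypothetical sample: structure and lemon namespace URI are assumptions.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:lemon="http://lemon-model.net/lemon#">
  <skos:Concept rdf:about="http://publications.europa.eu/resource/authority/currency/EUR">
    <lemon:context rdf:resource="http://publications.europa.eu/resource/authority/use-context/EFORMS"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://publications.europa.eu/resource/authority/currency/FRF"/>
</rdf:RDF>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def eforms_concepts(root):
    """Return the URIs of concepts that carry the EFORMS use context."""
    selected = []
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        for child in concept:
            resource = child.get(f"{{{RDF}}}resource")
            if local_name(child.tag) == "context" and resource == EFORMS_CONTEXT:
                selected.append(concept.get(f"{{{RDF}}}about"))
                break
    return selected


root = ET.fromstring(SAMPLE.strip())
print(eforms_concepts(root))
```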

Hierarchical codelists

Some codelists have a tree-like structure, with some codes corresponding to subdivisions of other codes. This is the case for the NUTS and CPV codelists.

In order to provide information on this structure, the parent code is indicated in the genericode file, in an additional column called "ParentCode". The parent code is optional, as top level elements do not have a parent.

<ColumnSet>
  ...
  <Column Id="parentCode" Use="optional"> (1)
    <ShortName>ParentCode</ShortName>
    <Data Lang="eng" Type="normalizedString"/>
  </Column>
  ...
</ColumnSet>
...
<Row>
  <Value ColumnRef="code">
    <SimpleValue>FR</SimpleValue> (2)
  </Value>
  <Value ColumnRef="Name">
    <SimpleValue>France</SimpleValue>
  </Value>
  <Value ColumnRef="eng_label">
    <SimpleValue>France</SimpleValue>
  </Value>
  ...
</Row>
<Row>
  <Value ColumnRef="code">
    <SimpleValue>FRD</SimpleValue> (3)
  </Value>
  <Value ColumnRef="Name">
    <SimpleValue>Normandie</SimpleValue>
  </Value>
  <Value ColumnRef="parentCode">
    <SimpleValue>FR</SimpleValue> (4)
  </Value>
  <Value ColumnRef="eng_label">
    <SimpleValue>Normandie</SimpleValue>
  </Value>
  ...
</Row>
...
1 Column definition for the optional parent code
2 The code "FR" representing France
3 The code "FRD" representing Normandie
4 The parent code of "FRD" is "FR", linking Normandie to France

The parent code is not indicated in tailored codelists based on hierarchical codelists.
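
To show how the "parentCode" column can be turned back into a tree, here is a small Python sketch that groups codes by their parent. The sample reuses the two rows above, trimmed to the relevant columns; the wrapping SimpleCodeList element, the absence of XML namespaces, and the function names are assumptions made for the example.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# The two rows from the example above, reduced to code, name and parent.
SAMPLE = """
<SimpleCodeList>
  <Row>
    <Value ColumnRef="code"><SimpleValue>FR</SimpleValue></Value>
    <Value ColumnRef="Name"><SimpleValue>France</SimpleValue></Value>
  </Row>
  <Row>
    <Value ColumnRef="code"><SimpleValue>FRD</SimpleValue></Value>
    <Value ColumnRef="Name"><SimpleValue>Normandie</SimpleValue></Value>
    <Value ColumnRef="parentCode"><SimpleValue>FR</SimpleValue></Value>
  </Row>
</SimpleCodeList>
"""


def row_values(row):
    """Map each ColumnRef in a Row to the text of its SimpleValue."""
    return {
        value.get("ColumnRef"): (value.findtext("SimpleValue") or "").strip()
        for value in row.findall("Value")
    }


def children_by_parent(code_list):
    """Group codes under their parent code; top-level codes go under None."""
    children = defaultdict(list)
    for row in code_list.iter("Row"):
        values = row_values(row)
        children[values.get("parentCode")].append(values["code"])
    return dict(children)


tree = children_by_parent(ET.fromstring(SAMPLE.strip()))
print(tree)
```

Because the parent column is optional, rows without it are collected under the key None, which gives the roots of the hierarchy.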

Tailored codelists

Where only a subset of the codes in an eProcurement codelist is allowed or applicable in a particular eForms context, a tailored codelist has been created containing only those codes. The labels and translations for the codes in a tailored codelist are the same as in its "parent" codelist. Since tailored codelists have no relevance or use outside the context of eForms, they are not published on the EU Vocabularies website, but are published as part of the eForms SDK.

For example, the "Language" codelist has codes for thousands of languages. The business term BT-702 "Notice Official Language" designates an official EU language in which the notice is officially available, so its value must be one of the 24 official EU languages. A tailored codelist named "EU Official Language" has therefore been created. It contains the entries for these 24 languages, copied from the parent "Language" codelist.

As another example, the business terms BT-10 "Activity Authority" and BT-610 "Activity Entity" each use a different subset of codes from the same eProcurement "Main activity" codelist. Two new tailored codelists have been created, "Authority Activity" and "Entity Activity", which each contain the relevant codes.

Filenames of tailored codelists

The filenames of the tailored codelists are composed of two parts, separated by an underscore character "_":

  • the name of the parent codelist, from which the codes are copied,

  • the name of the tailored codelist.

Each part follows the same convention as used for the eProcurement codelists: the name of the codelist, with spaces replaced by hyphens, and all in lower case.

Parent codelist name | Parent codelist filename | Tailored codelist name | Tailored codelist filename
Language             | language.gc              | EU Official Language   | language_eu-official-language.gc
Main activity        | main-activity.gc         | Authority Activity     | main-activity_authority-activity.gc
Main activity        | main-activity.gc         | Entity Activity        | main-activity_entity-activity.gc
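
The naming convention described above can be expressed as a short Python sketch. The function names are illustrative, and the sketch assumes that codelist names contain no underscores, so the first underscore in a filename always separates the two parts.

```python
def file_part(name):
    """Lower-case a codelist name and replace spaces with hyphens."""
    return name.strip().lower().replace(" ", "-")


def tailored_filename(parent_name, tailored_name):
    """Compose '<parent>_<tailored>.gc' from the two codelist names."""
    return f"{file_part(parent_name)}_{file_part(tailored_name)}.gc"


def split_tailored_filename(filename):
    """Split a tailored codelist filename into its parent and tailored parts."""
    stem = filename[:-3] if filename.endswith(".gc") else filename
    parent, _, tailored = stem.partition("_")
    return parent, tailored


print(tailored_filename("Language", "EU Official Language"))
print(split_tailored_filename("main-activity_authority-activity.gc"))
```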

Additional information in genericode files

Some new elements have been added within the <Identification> element in the genericode files for tailored codelists.

<LongName Identifier="listId">http://publications.europa.eu/resource/authority/main-activity</LongName> (1)
<LongName Identifier="eFormsParentId">main-activity</LongName> (2)
<Version>0.2.19</Version> (3)
1 URI of the parent codelist
2 Identifier of the parent codelist
3 Version number of the data used to create this file
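
One way to read these additional elements is sketched below in Python. The three source elements are wrapped in an Identification element as described; the attribute-less "LongName" with the value "authority-activity" is a hypothetical addition to show that it is skipped, and the sketch assumes the elements carry no XML namespace. The function name parent_info is illustrative.

```python
import xml.etree.ElementTree as ET

# The elements shown above, plus a hypothetical attribute-less LongName
# standing in for the tailored codelist's own identifier.
SAMPLE = """
<Identification>
  <LongName Identifier="listId">http://publications.europa.eu/resource/authority/main-activity</LongName>
  <LongName Identifier="eFormsParentId">main-activity</LongName>
  <LongName>authority-activity</LongName>
  <Version>0.2.19</Version>
</Identification>
"""


def parent_info(identification):
    """Collect the LongName values keyed by Identifier, plus the version."""
    info = {}
    for long_name in identification.findall("LongName"):
        key = long_name.get("Identifier")
        if key is not None:  # skip the LongName that holds the codelist id
            info[key] = (long_name.text or "").strip()
    info["version"] = identification.findtext("Version")
    return info


info = parent_info(ET.fromstring(SAMPLE.strip()))
print(info["eFormsParentId"], info["version"])
```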

Technical codelists

The UBL schema was chosen to define the XML structure for eForms notices due to its wide use for representing business documents across many domains, and its very close match to eForms data requirements. However, there are some contexts where the UBL elements available are not sufficient to represent the Business Terms needed by the context. In these cases new codelists have been created to implement the required Business Terms. These technical codelists are also published in the eForms SDK.

Codelists in eForms notice XML

In eForms XML, codes and codelists are mostly referenced using elements designed for that purpose. The elements have names which end in "Code", and have the attribute "listName". This attribute is used to hold the identifier of the codelist, and the code value required is set as the content of the element. In eForms, only codes from one codelist are allowed for any specific element.

The example below uses the code value "supplies" from the "contract-nature" codelist.

<cbc:ProcurementTypeCode listName="contract-nature">supplies</cbc:ProcurementTypeCode>

Validation of codes and codelists

The Schematron rules included in the eForms SDK validate the correct use of codelists in a notice XML file. For each element that should reference a code, the rules check that:

  • the correct codelist is named in the "listName" attribute

  • the content of the element is one of the codes from that codelist
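
The two checks listed above can be mimicked outside Schematron with a short Python sketch. This is not how the SDK performs validation; it is only an analogue under stated assumptions. The EXPECTED mapping, the "Notice" wrapper element, and the codes "services", "works" and "catering" are hypothetical; only "supplies", "contract-nature" and the ProcurementTypeCode element come from the example earlier in this document. The sketch also keys expectations on the element name alone, which is a simplification.

```python
import xml.etree.ElementTree as ET

CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"

# Hypothetical expectations: element name -> (codelist id, allowed codes).
EXPECTED = {
    "ProcurementTypeCode": ("contract-nature", {"supplies", "services", "works"}),
}

# Hypothetical notice fragment: one valid element and two invalid ones.
NOTICE = """
<Notice xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
  <cbc:ProcurementTypeCode listName="contract-nature">supplies</cbc:ProcurementTypeCode>
  <cbc:ProcurementTypeCode listName="contract-nature">catering</cbc:ProcurementTypeCode>
  <cbc:ProcurementTypeCode listName="currency">works</cbc:ProcurementTypeCode>
</Notice>
"""


def check_codes(root):
    """Return one message per element whose listName or code is not as expected."""
    problems = []
    for name, (codelist_id, allowed) in EXPECTED.items():
        for element in root.iter(f"{{{CBC}}}{name}"):
            list_name = element.get("listName")
            code = (element.text or "").strip()
            if list_name != codelist_id:
                problems.append(f"{name}: expected listName {codelist_id}, found {list_name}")
            elif code not in allowed:
                problems.append(f"{name}: code {code} is not in codelist {codelist_id}")
    return problems


problems = check_codes(ET.fromstring(NOTICE.strip()))
for message in problems:
    print(message)
```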