As part of its Open Government Initiative, the Ontario Government has issued an Open Data Directive that requires provincial Ministries and the Treasury Board Secretariat to establish and publish an Inventory and Catalogue of government-wide data.
Metadata Elements of the Data Inventory
On September 1, 2017, the Ontario Government’s Data Inventory was available for download as a Comma-Separated Values (CSV) file at https://files.ontario.ca/opendata/ontariodatainventory_31.csv.
Table 1 presents the twenty metadata elements used to describe datasets in the Data Inventory:
|Public Title||The name given to a dataset. This must be unique and restricted to 200 characters or less. E.g. Consular offices in Ontario.|
|Short Description||Limited description to 200 characters or less. E.g. List of consular offices operating in Ontario.|
|Long Description||Limited description to 600 characters or less (approximately 100 words). E.g. The data contained includes the name of the Consul General or Honourary Consul General, office address, telephone and fax numbers, email and web addresses.|
|Other Title||Internal name of the dataset. Must be unique. Limited to 200 characters.|
|Data Custodian Branch||E.g. Office of International Relations and Protocol.|
|Date Range – Start||Date format yyyy-mm-dd.|
|Date Range – End||Date format yyyy-mm-dd.|
|Date Created||Date format yyyy-mm-dd.|
|Date Published||Date format yyyy-mm-dd. E.g. 2016-04-20.|
|Contains Geographic Markers||True if this dataset contains geographic markers, false otherwise. E.g. TRUE.|
|Publisher||Ministry or agency. Selected from a drop-down menu. E.g. Intergovernmental Affairs.|
|Update Frequency||The frequency of update of the dataset. Selected from a drop-down menu. E.g. On demand.|
|Access Level||Whether this data can be targeted for public release, is restricted from public release, or needs more assessment. Selected from a drop-down menu. E.g. Under review.|
|Exemption||Choose the specific exemption that applies to the restricted dataset. Selected from a drop-down menu.|
|Dataset URL||(Optional) A web page that can be navigated to to gain access to the dataset. E.g. https://www.ontario.ca/page/consular-offices.|
|File Types (extensions)||Comma separated list of file types. Selected from a drop-down menu. E.g. DOCX.|
|Additional Comments||Please provide any other information about your dataset that would help us to assess its suitability for open data.|
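The constraints in Table 1 (length limits and the yyyy-mm-dd date format) lend themselves to a simple automated check. The following is a minimal sketch, not the Ontario Government's own validation: it assumes each row of the CSV has been read into a dictionary whose keys match the element names in Table 1 exactly, and the function name `validate_record` is hypothetical.

```python
import re
from datetime import datetime

# Length limits (in characters) taken from Table 1.
MAX_LENGTHS = {
    "Public Title": 200,
    "Short Description": 200,
    "Long Description": 600,
    "Other Title": 200,
}

# Elements that Table 1 describes as using the yyyy-mm-dd format.
# Assumption: the CSV column headers match these names exactly.
DATE_FIELDS = [
    "Date Range – Start",
    "Date Range – End",
    "Date Created",
    "Date Published",
]

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_record(record):
    """Return a list of problems found in one inventory row (a dict)."""
    problems = []
    for field, limit in MAX_LENGTHS.items():
        value = record.get(field, "")
        if len(value) > limit:
            problems.append(f"{field}: {len(value)} characters exceeds {limit}")
    for field in DATE_FIELDS:
        value = record.get(field, "").strip()
        if not value:
            continue  # Assumption: an empty date is acceptable.
        try:
            if not DATE_PATTERN.fullmatch(value):
                raise ValueError(value)
            datetime.strptime(value, "%Y-%m-%d")  # Rejects dates like 2016-13-40.
        except ValueError:
            problems.append(f"{field}: {value!r} is not a valid yyyy-mm-dd date")
    return problems
```

Rows could be supplied by `csv.DictReader` from the standard library. The uniqueness requirement on Public Title is not covered here because it needs the whole file rather than a single row.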
The Evolution of the Data Inventory
Table 2 presents the dates of publication and download links to every release of the Ontario Government’s Data Inventory between July 2016 and August 2017.¹ Notice that, on several days (column B), the Ontario Government published two (08/05/2016, 09/20/2016, 10/21/2016, 11/18/2016, 01/25/2017, 02/27/2017, 03/01/2017, 04/04/2017, 05/19/2017, and 08/03/2017), three (01/09/2017), four (02/13/2017), or even five (10/06/2016) releases of its Data Inventory.
We’re going to ignore any discrepancies among same-day releases of the Data Inventory, and focus instead on the sole or final release of the day (column C). Upon inspection, we determine that these fourteen releases all differ from one another, and so may be regarded as distinct versions of the Data Inventory.
For ease of reference and access, Table 2 (column D) provides download links for the fourteen distinct versions of the Data Inventory, designated OntGDI_00_EN.csv, OntGDI_01_EN.csv, OntGDI_02_EN.csv, … OntGDI_13_EN.csv.
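Picking the sole or final release of each day can be done mechanically from the file names and timestamps in Table 2. This is a minimal sketch under stated assumptions: the timestamps are in the month/day/year and 24-hour format shown in the table, and the function name `final_release_per_day` is hypothetical.

```python
from datetime import datetime


def final_release_per_day(releases):
    """Return the file name of the latest release for each publication day.

    `releases` is a list of (filename, timestamp) pairs, where the timestamp
    looks like "10/06/2016 – 10:30" (date, en dash, time), as in Table 2.
    """
    latest = {}
    for filename, stamp in releases:
        date_part, time_part = stamp.split(" – ")
        published = datetime.strptime(f"{date_part} {time_part}", "%m/%d/%Y %H:%M")
        day = published.date()
        # Keep only the most recent release seen so far for this day.
        if day not in latest or published > latest[day][1]:
            latest[day] = (filename, published)
    # Report the chosen files in chronological order.
    return [latest[day][0] for day in sorted(latest)]


releases = [
    ("ontariodatainventory_7.csv", "10/06/2016 – 10:28"),
    ("ontariodatainventory_8.csv", "10/06/2016 – 10:30"),
    ("ontariodatainventory_9.csv", "10/21/2016 – 15:49"),
    ("ontariodatainventory_10.csv", "10/21/2016 – 15:50"),
]
print(final_release_per_day(releases))
```

The sketch only selects files by timestamp. Confirming that the selected releases actually differ from one another still requires comparing their contents.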
The Ontario Government’s Data Inventory exhibits the sort of data quality issues (e.g. typos, variant spellings, inconsistent use of punctuation, abbreviations, acronyms, etc.) that are expected with Open Data initiatives, especially in their early stages.
A few discrepancies in the terminology/orthography used to designate a dataset’s Access Level also appear in different versions of the Data Inventory. These discrepancies are potentially quite troublesome for data analysis – though they are easily remedied (e.g. we replace designations, like “Will be made open/public,” “To be opened,” etc. with the most common designation, “To Be Opened”).
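A lookup table is one simple way to apply this kind of replacement. The sketch below is illustrative only: the mapping contains just the two variants quoted above, a complete list would have to be compiled from the inventory files themselves, and the names `ACCESS_LEVEL_VARIANTS` and `normalize_access_level` are hypothetical.

```python
# Variant designations mapped to the most common one.
# Only the two variants quoted in the text are included here.
ACCESS_LEVEL_VARIANTS = {
    "will be made open/public": "To Be Opened",
    "to be opened": "To Be Opened",
}


def normalize_access_level(value):
    """Return the canonical Access Level for a known variant.

    Matching ignores letter case and repeated or surrounding whitespace.
    Unknown values are returned unchanged apart from trimmed whitespace.
    """
    key = " ".join(value.split()).lower()
    return ACCESS_LEVEL_VARIANTS.get(key, value.strip())
```

Returning unknown values unchanged keeps the function safe to run over every row: designations such as "Under review" pass through untouched.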
A more vexing problem arises when the same Public Title is assigned mistakenly to more than one dataset (e.g. the Public Title “Tax collections: client clearances” is assigned to datasets #350 and #371 in ontariodatainventory_8.csv). Resolving this problem requires:
- identifying every duplicate use of any Public Title across different versions of the Data Inventory;
- generating a Unique Title for every dataset, including those with the same Public Title (e.g. generating “Tax collections: client clearances_01” for dataset #350 and “Tax collections: client clearances_02” for dataset #371 in ontariodatainventory_8.csv); and
- assigning the same Unique Title to every dataset across different versions of the Data Inventory.
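The first two steps can be sketched for a single version of the inventory as follows. This is an assumption-laden illustration rather than the procedure actually used to produce the cleaned files: it numbers duplicates in row order, leaves non-duplicated titles unchanged, and uses the hypothetical function name `add_unique_titles`.

```python
from collections import Counter


def add_unique_titles(rows):
    """Return copies of `rows` with a "Unique Title" key added.

    `rows` is a list of dicts, each with a "Public Title" key.
    Titles used more than once get a two-digit suffix (_01, _02, ...)
    in order of appearance. Titles used once are copied as-is.
    """
    counts = Counter(row["Public Title"] for row in rows)
    seen = Counter()
    result = []
    for row in rows:
        title = row["Public Title"]
        new_row = dict(row)  # Do not modify the caller's data.
        if counts[title] > 1:
            seen[title] += 1
            new_row["Unique Title"] = f"{title}_{seen[title]:02d}"
        else:
            new_row["Unique Title"] = title
        result.append(new_row)
    return result
```

The third step, keeping each Unique Title attached to the same dataset across versions, is not handled here. It needs a rule for recognizing the same dataset in different files (for example, by comparing other metadata elements), and that rule is a judgment call.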
Both of these data quality issues with the original ontariodatainventory_nn.csv files (column A) are addressed in the OntGDI_nn_EN.csv files (column D) provided in Table 2.
|Original Release (A)||Date and Time of Publication (B)||Sole/Final Release of the Day (C)||Distinct Version (D)|
|ontariodatainventory.csv||07/29/2016 – 14:23||07/29/2016 – 14:23||OntGDI_00_EN.csv|
|ontariodatainventory_0.csv||08/05/2016 – 10:18|
|ontariodatainventory_1.csv||08/05/2016 – 10:19||08/05/2016 – 10:19||OntGDI_01_EN.csv|
|ontariodatainventory_2.csv||09/20/2016 – 10:29|
|ontariodatainventory_3.csv||09/20/2016 – 10:32||09/20/2016 – 10:32||OntGDI_02_EN.csv|
|ontariodatainventory_4.csv||10/06/2016 – 09:51|
|ontariodatainventory_5.csv||10/06/2016 – 10:02|
|ontariodatainventory_6.csv||10/06/2016 – 10:09|
|ontariodatainventory_7.csv||10/06/2016 – 10:28|
|ontariodatainventory_8.csv||10/06/2016 – 10:30||10/06/2016 – 10:30||OntGDI_03_EN.csv|
|ontariodatainventory_9.csv||10/21/2016 – 15:49|
|ontariodatainventory_10.csv||10/21/2016 – 15:50||10/21/2016 – 15:50||OntGDI_04_EN.csv|
|ontariodatainventory_11.csv||11/18/2016 – 10:21|
|ontariodatainventory_12.csv||11/18/2016 – 10:24||11/18/2016 – 10:24||OntGDI_05_EN.csv|
|ontariodatainventory_13.csv||01/09/2017 – 13:19|
|ontariodatainventory_14.csv||01/09/2017 – 13:22|
|ontariodatainventory_15.csv||01/09/2017 – 13:27||01/09/2017 – 13:27||OntGDI_06_EN.csv|
|ontariodatainventory_16.csv||01/25/2017 – 11:06|
|ontariodatainventory_17.csv||01/25/2017 – 11:08||01/25/2017 – 11:08||OntGDI_07_EN.csv|
|ontariodatainventory_18.csv||02/13/2017 – 13:04|
|ontariodatainventory_19.csv||02/13/2017 – 13:06|
|ontariodatainventory_20.csv||02/13/2017 – 13:56|
|ontariodatainventory_21.csv||02/13/2017 – 13:57||02/13/2017 – 13:57||OntGDI_08_EN.csv|
|ontariodatainventory_22.csv||02/27/2017 – 15:25|
|ontariodatainventory_23.csv||02/27/2017 – 15:28||02/27/2017 – 15:28||OntGDI_09_EN.csv|
|ontariodatainventory_24.csv||03/01/2017 – 15:22|
|ontariodatainventory_25.csv||03/01/2017 – 15:24||03/01/2017 – 15:24||OntGDI_10_EN.csv|
|ontariodatainventory_26.csv||04/04/2017 – 13:25|
|ontariodatainventory_27.csv||04/04/2017 – 13:26||04/04/2017 – 13:26||OntGDI_11_EN.csv|
|ontariodatainventory_28.csv||05/19/2017 – 11:38|
|ontariodatainventory_29.csv||05/19/2017 – 11:39||05/19/2017 – 11:39||OntGDI_12_EN.csv|
|ontariodatainventory_30.csv||08/03/2017 – 12:47|
|ontariodatainventory_31.csv||08/03/2017 – 12:48||08/03/2017 – 12:48||OntGDI_13_EN.csv|
1. Personal correspondence, Dawn Edmonds, Team Lead, Policy and Partnerships, Open Government Office, Treasury Board Secretariat, October 11, 2017.