Introduction

For the pros and cons of this library, see the ReadMe.

The following is an overview of concepts and use.

Metadata

The purpose of this project is to make it as easy as possible to catalog metadata. The primary use case is “open data”, but it can be adapted to any schema that takes a similar shape. Files can also be stored and published alongside the metadata. Services such as Carto can be used to provide an API for interacting with files, or what open data catalogs refer to as a DataStore.

A document store is a good fit for storing metadata because it limits the surface area for receiving (JSON objects), storing and editing (JSON objects with references), and publishing (JSON objects) metadata.
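
As a rough illustration of the difference between a stored document (a JSON object with references) and a published one (a plain JSON object), here is a minimal sketch. The field names, the `reference` key, and the `dereference` helper are assumptions made for this example; they are not taken from the project's schemas or models.

```javascript
// Hypothetical documents; field names are illustrative, not the project's schema.
const organizations = {
  'parks-dept': { identifier: 'parks-dept', name: 'Parks Department' },
};

// Stored form: the dataset points at its publisher by reference.
const storedDataset = {
  identifier: 'tree-inventory',
  title: 'Tree Inventory',
  publisher: { reference: 'parks-dept' }, // assumed reference shape
};

// Published form: each top-level reference is replaced by the doc it points to.
function dereference(doc, lookup) {
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    const ref = value && typeof value === 'object' ? value.reference : undefined;
    out[key] = ref !== undefined ? lookup[ref] : value;
  }
  return out;
}

const published = dereference(storedDataset, organizations);
console.log(JSON.stringify(published, null, 2));
```

The stored object stays small and editable, while the published object is self-contained and can be served as a static file.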

By limiting the scope and functionality of a metadata catalog, this project is designed to make it easier to interact with outside services. For example, the integration with ElasticSearch consists of only two methods.

Collections and Docs

Content in this catalog is divided into collections and documents, similar to MongoDB or other document stores. Collections are types of content, such as a “dataset” or “organization”, though they can be anything defined by a schema. Docs are the individual content items.

The content model contains FileStorage and MongoDB subclasses, which are the storage options. The FileStorage class treats the local file system as a document database, storing and retrieving results from disk. Using files to store metadata is a primary advantage of this project; however, a Mongo option is offered since it is necessary for Interra Catalog Admin, and the methods for interacting with the data (e.g. InsertOne) are identical. Note that the Mongo methods are not fully supported yet.
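
To make the idea of treating the file system as a document database concrete, here is a minimal sketch of a file-backed store with one directory per collection and one JSON file per doc. The class name `FileStoreSketch`, its method signatures, and the on-disk layout are assumptions for illustration only; the project's own FileStorage class may be organized differently.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sketch of a file-backed document store; not the project's FileStorage class.
class FileStoreSketch {
  constructor(rootDir) {
    this.rootDir = rootDir;
  }

  // Assumed layout: one directory per collection, one JSON file per doc.
  docPath(collection, id) {
    return path.join(this.rootDir, collection, `${id}.json`);
  }

  // Write a doc to disk, creating the collection directory if needed.
  insertOne(collection, id, doc) {
    fs.mkdirSync(path.join(this.rootDir, collection), { recursive: true });
    fs.writeFileSync(this.docPath(collection, id), JSON.stringify(doc, null, 2));
  }

  // Read a doc back, or return null when it does not exist.
  findOne(collection, id) {
    const file = this.docPath(collection, id);
    return fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, 'utf8')) : null;
  }

  // List the ids of every doc in a collection, sorted by name.
  list(collection) {
    const dir = path.join(this.rootDir, collection);
    if (!fs.existsSync(dir)) return [];
    return fs.readdirSync(dir)
      .filter((name) => name.endsWith('.json'))
      .map((name) => path.basename(name, '.json'))
      .sort();
  }
}

// Usage: store docs in a temporary directory and read them back.
const store = new FileStoreSketch(fs.mkdtempSync(path.join(os.tmpdir(), 'catalog-')));
store.insertOne('dataset', 'tree-inventory', { title: 'Tree Inventory' });
store.insertOne('organization', 'parks-dept', { name: 'Parks Department' });
console.log(store.findOne('dataset', 'tree-inventory'));
console.log(store.list('organization'));
```

Because the interface is just insert, find, and list over JSON objects, a database-backed class exposing the same methods could be swapped in without changing callers.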

Structure

This project consists of the following:

config.yml
models/
schemas/
build/
sites/
app/
cli.js
plopfile.js

The rest of the files and folders are artifacts of react boilerplate, which drives app/.

config.yml

Contains variables for the locations of the sites/, build/, and schemas/ directories and for the storage mechanism. The default storage option is FileStorage; Mongo is not fully supported yet.

models/

Contains classes for creating, storing and building catalogs.

schemas/

Contains the base schemas for the catalog. To add new schemas, it is recommended to change the schema directory in the base config.yml file.

build/

Contains the fully-built sites separated by site name. Each site consists of an export of the collection data of the site as well as the built version of the app. The site folder is the production version of the site.

sites/

Contains the data for each site separated by site name. Each site contains a config.yml file, media files, collection data, and harvest sources.

app/

Contains the react app that builds and renders the catalog.

cli.js

A CLI for performing site tasks, built using Caporal. Run node cli.js to see a list of available commands:

validate-site-contents site
validate-site site
build-collection-data site
build-collection-data-item site collection interraId
build-config site
build-datajson site
build-routes site
build-schema site
build-search site
build-swagger site
build-apis site
build-site site
run-dev site
run-dev-dll
run-dev-tunnel site
harvest-cache site
harvest-run site
load-doc site collection interraId

React App Front-End

The front-end app is built using react boilerplate. This will likely be separated from the models at some point. Data for each site is exported to the build/ folder, with content data exported to build/SITE-NAME/api/v1. Each site also publishes swagger-based documentation of its APIs at /api, which is exported to build/SITE-NAME/api/v1/swagger.json.
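
The export locations described above can be sketched as a small path helper. The function `buildPaths` and its `buildDir` parameter are hypothetical and exist only to show how SITE-NAME maps onto the folder layout; the project's own build code may compute these paths differently.

```javascript
const path = require('path');

// Hypothetical helper: maps a site name to the export locations described above.
function buildPaths(siteName, buildDir = 'build') {
  // Content data lives under build/SITE-NAME/api/v1.
  const api = path.posix.join(buildDir, siteName, 'api', 'v1');
  return {
    api,
    // The swagger document sits alongside the exported content data.
    swagger: path.posix.join(api, 'swagger.json'),
  };
}

console.log(buildPaths('my-site'));
```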

Development

Thanks to react boilerplate, this project includes a hot-reloading dev server. To run it, first build the DLLs with node cli.js run-dev-dlls SITE-NAME, then start the dev server with node cli.js run-dev SITE-NAME.