Kurator-Web User Documentation

Using Kurator-Web


Register to obtain an account on a Kurator-web application deployment. An administrator will need to activate your account.

Once registered and activated, you can log in and use the web application.

Screenshot kurator web login.png

Available Canned Workflows

Once logged in, you can navigate from the home page to the workflows page to see a list of available pre-built workflows.

Screenshot kurator web workflows run.png

Each workflow has a name and a brief description. The "i" (Info) icon links out to a page that documents the workflow.

Click the Run link to run a workflow. This link will take you to a page where you can provide the workflow with a data set to run on and any other parameters that it needs.

Picking a Workflow

The current set of workflows can be divided in two ways: by what you want to accomplish, and by what form of input you have.

Most workflow names start with "CSV" or "Darwin Core Archive". CSV workflows run on CSV files, usually in flat Darwin Core form, that you upload to the Kurator-Web environment. Darwin Core Archive workflows run on a Darwin Core Archive downloaded from a URI that you provide (usually a Darwin Core Archive produced by an IPT instance or by Symbiota).
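
As a rough illustration of what "flat Darwin Core form" means for a CSV file, the sketch below splits the column names of a header row into those that match Darwin Core terms and those that do not. The term set is a small illustrative subset rather than the full standard, the function name `split_header` is hypothetical, and this is not how Kurator-Web itself inspects uploads.

```python
import csv
import io

# Illustrative subset of Darwin Core term names (assumption: not the full list).
DWC_TERMS = {
    "occurrenceID", "basisOfRecord", "scientificName", "eventDate",
    "country", "stateProvince", "decimalLatitude", "decimalLongitude",
}

def split_header(csv_text):
    """Return (recognized, unrecognized) column names from the first CSV row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    recognized = [name for name in header if name in DWC_TERMS]
    unrecognized = [name for name in header if name not in DWC_TERMS]
    return recognized, unrecognized

# Made-up sample data: three Darwin Core columns and one local column.
sample = (
    "occurrenceID,scientificName,country,Collector Name\n"
    "1,Quercus alba,USA,J. Doe\n"
)
recognized, unrecognized = split_header(sample)
print("Darwin Core columns:", recognized)
print("Other columns:", unrecognized)
```

Columns reported as "other" are candidates for renaming to Darwin Core terms before the file is run through a CSV workflow.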

Some of the workflows (in particular some of the CSV workflows) are intended for preparing data for publishing to aggregators (in an IPT instance). Other workflows check the quality of your data and report on compliance or non-compliance with standard vocabularies or tests. Some of the data quality workflows (e.g. CSV/Darwin Core Archive Geography Assessor) simply report on how much of your data conforms with expectations; others (e.g. CSV/Darwin Core Archive Geography Cleaner) propose amendments that could be made to your data to improve its quality for particular uses.

The "Validator" workflows (e.g. CSV File Georeference Validator or Darwin Core Archive Date Validator) provide reports on data quality in terms of a developing TDWG framework for reporting on data fitness for use. Their output includes an .xls spreadsheet containing the input data, the input data with amendments applied, and sets of data quality Measures, Validations, and Amendments on those data.

The "Counter", "Assessor", and "Cleaner" workflows provide reports on distinct values of terms (e.g. higher geography names) found in the input data. The Assessor and Cleaner workflows compare those distinct values to controlled vocabularies.
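
The idea behind counting distinct values and then assessing them against a vocabulary can be sketched in a few lines. This is a simplified illustration only: the function names, the sample records, and the one-entry vocabulary are all hypothetical, and the actual workflows are more involved.

```python
from collections import Counter

def count_distinct(records, term):
    """Count non-empty values of one term across a list of record dicts."""
    values = (record.get(term, "").strip() for record in records)
    return Counter(value for value in values if value)

def assess(counts, vocabulary):
    """Map each distinct value to True if it appears in the vocabulary."""
    return {value: value in vocabulary for value in counts}

# Made-up records; the last one has an empty country and is not counted.
records = [
    {"country": "USA"},
    {"country": "United States"},
    {"country": "USA"},
    {"country": ""},
]
# Hypothetical controlled vocabulary with a single preferred form.
vocabulary = {"United States"}

counts = count_distinct(records, "country")
report = assess(counts, vocabulary)
print(counts)
print(report)
```

A Counter-style report corresponds to `counts` alone. An Assessor-style report adds the comparison in `report`. A Cleaner-style workflow would go one step further and propose a replacement for each value that is not in the vocabulary.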

Some of the workflows, in particular the CSV File Darwinizer and the CSV File Aggregator, are intended for preparing data for publication to aggregators, but in general, the workflows could be run at any point in the data life cycle.

Detailed descriptions of each of the canned workflows can be found at https://github.com/kurator-org/kurator-validation/wiki. The "i" button in front of each workflow links to the documentation for that workflow on the Kurator wiki on GitHub.

Running a Workflow

Sample Data Set (192 records)

When you click the Run link on the list of workflows, you will be taken to a dialog where you can enter the information needed by that workflow. This may be the URI at which a Darwin Core Archive can be found, or a data file to upload. Some workflows may take other configuration parameters as well.
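
Before giving a workflow the URI of a Darwin Core Archive, it can help to confirm that the file at that URI really is an archive. A Darwin Core Archive is a zip file that usually contains a `meta.xml` descriptor alongside one or more data files. The sketch below checks for that descriptor in archive bytes that have already been downloaded. The function names and the tiny in-memory example archive are hypothetical, and this check is not part of Kurator-Web.

```python
import io
import zipfile

def inspect_archive(archive_bytes):
    """List the files in a zip archive and note whether meta.xml is present."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        names = archive.namelist()
    return {"has_meta_xml": "meta.xml" in names, "files": names}

def make_example_archive():
    """Build a tiny stand-in archive in memory (made-up contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", "<archive/>")
        archive.writestr("occurrence.txt", "occurrenceID\n1\n")
    return buffer.getvalue()

result = inspect_archive(make_example_archive())
print(result)
```

For a real archive, the bytes could be fetched first, for example with `urllib.request.urlopen(uri).read()`, and then passed to `inspect_archive`.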

Screenshot kurator web dwca geo assessor workflowsetup.png

When you have provided the information the workflow needs, click the Run Workflow button. This will start the workflow. After some period of time, your browser will redirect to the home page, where you will now see the workflow you just launched in the list of running workflows. You can also click Home to go to the home page and see the running workflows, or you can navigate elsewhere; your workflow will keep running in the background on the server.

Depending on the nature of the workflow and the size of your dataset, the workflow may take some time to complete.

Workflow Results

Below the list of available workflows on the home page is a list of your workflow runs.

Screenshot kurator web workflowruns.png

Workflows may be running, may have completed successfully, or may have failed with an error.

For a workflow which has completed successfully, click the Document icon for a popup dialog from which you can download some or all of the result files produced by the workflow.

Screenshot kurator web workflows resultpopup.png

Click on the Download Archive link to obtain a zip archive containing all of the workflow results, or click on the links to download individual files. Click on the "Download YAML Config" link to obtain the workflow definition file (which you can edit and run locally with an installation of Kurator-validation).

Workflow Designer
