Roadmap

Kurator-Web: Admin and error reporting

Includes changes to the Admin section to let us answer questions about user activity: who is active, how many workflows they are running, whether their workflows have run to completion, etc.

  • Report a bug/errors - Add a feature that lets users report an issue (via a link on the navbar) or submit a report about errors in the output data for a workflow run. The error report would include information about the user's browser, the input file, and the parameters used to run the workflow, and can be configured to send mail to an admin user.
  • View runs as user - Need a way to view a user's workflow runs to investigate the source of errors. Fix the view as user feature in the web app.
  • Pagination of user management table - Implement pagination of the list of users on the admin page. (commit 02a070a)
  • Run count badges - Badges on the admin page showing, for each user, a count of runs by status. (commit cc60432)
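
The pagination and run count items above could be sketched roughly as follows. This is a minimal illustration, not the Kurator-Web implementation: the function names, the `user` and `status` field names, and the status values are all assumptions made for the example.

```python
from collections import Counter


def paginate(items, page, per_page=10):
    """Return the slice of `items` shown on the 1-based `page`."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    return items[start:start + per_page]


def run_counts_by_status(runs):
    """Map each user to a Counter of run statuses (one badge per status)."""
    counts = {}
    for run in runs:
        counts.setdefault(run["user"], Counter())[run["status"]] += 1
    return counts


# Hypothetical sample data; field names and status values are assumptions.
runs = [
    {"user": "alice", "status": "SUCCESS"},
    {"user": "alice", "status": "ERRORS"},
    {"user": "alice", "status": "SUCCESS"},
    {"user": "bob", "status": "RUNNING"},
]

badges = run_counts_by_status(runs)
users = sorted(badges)
first_page = paginate(users, page=1, per_page=1)
```

A page past the end of the list comes back empty rather than raising, which keeps the admin table simple to render when the user count changes between requests.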

Kurator-Web: Workflows page UI

Adding structured metadata to workflow documents to allow for user interface features such as faceted search (e.g. limit the list of workflows to just those that take a CSV file as input).

  • Metadata in config - Define metadata in the workflows.conf file describing workflow input and output types (CSV, DwCA). Also add support for listing the set of information elements that the workflow is relevant to.
  • Filtering workflows list view - Add the ability to search workflows by input and output file types, or to enter a list of terms to filter by information element (e.g. find all workflows that act on dwc:decimalLatitude and dwc:decimalLongitude).
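
As a rough sketch of the filtering described above, assuming each workflow's metadata has already been loaded into a dictionary: the keys `input_types`, `output_types` and `information_elements`, the workflow names, and the function itself are hypothetical, and the actual layout of workflows.conf is not shown here.

```python
def filter_workflows(workflows, input_type=None, output_type=None, terms=()):
    """Return workflows matching the requested file types and all given terms."""
    wanted = {t.lower() for t in terms}
    matches = []
    for wf in workflows:
        inputs = {t.lower() for t in wf.get("input_types", [])}
        outputs = {t.lower() for t in wf.get("output_types", [])}
        elements = {e.lower() for e in wf.get("information_elements", [])}
        if input_type and input_type.lower() not in inputs:
            continue
        if output_type and output_type.lower() not in outputs:
            continue
        # Every requested information element must be covered by the workflow.
        if not wanted <= elements:
            continue
        matches.append(wf)
    return matches


# Hypothetical metadata; names and keys are assumptions for illustration.
workflows = [
    {"name": "geo_validator", "input_types": ["CSV"], "output_types": ["CSV"],
     "information_elements": ["dwc:decimalLatitude", "dwc:decimalLongitude"]},
    {"name": "date_validator", "input_types": ["CSV", "DwCA"], "output_types": ["CSV"],
     "information_elements": ["dwc:eventDate"]},
]

csv_geo = filter_workflows(
    workflows,
    input_type="csv",
    terms=["dwc:decimalLatitude", "dwc:decimalLongitude"],
)
```

Matching is case-insensitive and requires all listed terms, so a search on both latitude and longitude returns only workflows that act on both.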

FFDQ for DQIG TG2 meeting

Refactoring the fitness for use framework for describing what quality control tests do, what they report, and how they are composed into suites of tests to fit use cases. This sets up the presentation of a standardized machine-readable metadata format for a suite of standard tests at the TDWG DQIG TG2 meeting in January. With that format in place, we should be able to render the results of tests described in terms of the fitness for use framework in multiple ways, including the current spreadsheet of test result formats, web pages with graphical summaries of the results, and unique error reports.

  • RDF Model and Java annotations - Update documentation for use of the new RDF bean objects for producing serializable instances of FFDQ concepts. Also update documentation of annotated Java classes for use in TestRunner.
  • Test spreadsheet utility - Utilities for producing an FFDQ RDF document from a spreadsheet of standardized tests, automatic generation of new annotated Java classes implementing tests, and support for appending new tests to an existing DQ class.
  • Postprocessor - Refactor the XLSX postprocessor to use the new RDF beans model in place of the old JSON format. Generate simple reports via SPARQL queries and postprocess the CSV result file to add formatting.
  • TestRunner and triplestore - Modify TestRunner to use the RDF beans and store assertions in an in-memory triple store. Using the FFDQ Profile as an input argument, the TestRunner will scan the annotated classes for the tests tied to the MeasurementMethod, ValidationMethod and AmendmentMethod concepts from the FFDQ triplestore.
  • kurator-akka actor support - Bring generic FFDQ actor in kurator-akka up to date and invoke an instance of the TestRunner from parameters defined in the YAML config.
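
The profile-driven selection of tests described for the TestRunner can be illustrated with a small stand-in. The real TestRunner scans annotated Java classes; this sketch swaps in a Python decorator in place of Java annotations, and the decorator, the class, the `urn:example:` identifiers, and the profile layout are all hypothetical.

```python
import inspect


def ffdq_method(kind, method_id):
    """Tag a function with a hypothetical FFDQ method type and identifier."""
    def decorate(func):
        func.ffdq = (kind, method_id)
        return func
    return decorate


class DateTests:
    """Hypothetical test class standing in for an annotated DQ class."""

    @staticmethod
    @ffdq_method("ValidationMethod", "urn:example:validation-year-notempty")
    def year_not_empty(record):
        return bool(record.get("dwc:year", "").strip())

    @staticmethod
    @ffdq_method("AmendmentMethod", "urn:example:amendment-year-trim")
    def trim_year(record):
        return {"dwc:year": record.get("dwc:year", "").strip()}

    @staticmethod
    def untagged_helper(record):
        return record


def scan(cls, profile):
    """Collect tagged functions on `cls` whose identifier appears in `profile`."""
    selected = {}
    for _name, member in inspect.getmembers(cls, inspect.isfunction):
        tag = getattr(member, "ffdq", None)
        if tag and tag[1] in profile.get(tag[0], set()):
            selected[tag[1]] = member
    return selected


# Hypothetical profile: method identifiers grouped by FFDQ method type.
profile = {
    "MeasurementMethod": set(),
    "ValidationMethod": {"urn:example:validation-year-notempty"},
    "AmendmentMethod": set(),
}

tests = scan(DateTests, profile)
```

Only the validation listed in the profile is selected; the amendment is tagged but left out because the profile does not name it, and the untagged helper is ignored.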

Other fixes

  • Add native actors JAR for macOS and Linux to Bamboo and documentation (Issue #7)
  • Minify and bundle front-end JavaScript for performance (Issue #8)
  • Fix database schema evolutions for deployment of updates (Issue #6)
  • Delete workspace directory when removing workflow runs (Issue #5)
  • Kill process when removing running workflows (commit ce37158)

Data Quality Tests/Actors

  • Complete refactoring of georeference validation tools from FP-Akka into geo_ref_qc library, see FFDQ Refactoring of FP-CurationServices
  • Refactor scientific name checking tools into sci_name_qc library.
  • Add support for the full TDWG DQIG TG2 core test suite.