Thursday, July 10, 2014

Converting NIEM XML to HTML5

Currently, with Open-XDX, you can persist and retrieve XML information based on a NIEM IEPD. You can also expose information as NIEM JSON through a transformation library, which is useful for building Web and Mobile Web applications using 4th generation client-side frameworks like Angular, Ember and Polymer, as well as older frameworks like jQuery and Dojo.

There are four main information formats used on the World Wide Web:
  1. HTML is ideal for documentation, tables, and open data, because it is easy to publish and forgiving. HTML is fundamental to REST as a way of exposing endpoint documentation.
  2. JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate.
  3. The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web.
  4. Extensible Markup Language (XML) is a simple, flexible text format derived from SGML (ISO 8879), designed to meet the challenges of large-scale electronic publishing, and plays an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.
In addition, several other XML formats are commonly used for document collation and syndication:
  1. DITA (Darwin Information Typing Architecture) and DocBook are used to assemble documentation out of markup. These will probably both be eventually supplanted by HTML5.
  2. Atom and RSS are XML-based syndication formats. JSON-based syndication formats have also been described, although these are less mature.
(This discussion refers back to Jeni Tennison's XML Prague keynote on "chimera", in which she discusses the different formats and the way they handle, for instance, links and URIs differently.)

NIEM currently supports XML-based and JSON-based business cases as a way of quickly and rigorously exposing data for exchange and migration. In addition, the NIEM JSON flavor also supports web and mobile web applications, using the mentioned 4GL frameworks and their like. The quickest way to expose NIEM information, however, is using the HTML information format (most likely HTML5, which is more semantically rich than previous versions).

Basic rules for converting NIEM XML into NIEM HTML:
  1. Create one HTML element per XML element, with the exception of lists.
  2. For node elements, use div.
  3. For leaf elements, use span.
  4. Where makeRepeatable, use ol and li, containing either div or span elements as per above.
  5. For any element, the class attribute represents the datatype (like "string" or "date").
  6. For any element, the id attribute represents the XML element name, including the namespace prefix (like "ncPersonName").
Based on these rules, an XSLT transform can be generated from the OASIS CAM schema representation of the NIEM IEPD, possibly directly by the CAM tooling. This transform can then be applied to the XML exposed by Open-XDX, allowing this information to be quickly exposed using HTML for read-only use and for add/update using forms (XForms? HTML5 forms? Hybrid using JavaScript?).
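
To make the six rules concrete, here is a minimal sketch of the same mapping written in Python rather than as a generated XSLT. It is only an illustration under several assumptions: the namespace URI, the sample document, and the DATATYPES and REPEATABLE tables are all hypothetical stand-ins for what would really be read from the CAM template (datatypes and makeRepeatable flags).

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and prefix; a real IEPD declares its own.
NC = "http://example.com/niem-core"
PREFIXES = {NC: "nc"}

# Hypothetical stand-ins for information that would come from the CAM template.
DATATYPES = {
    "ncPerson": "PersonType",
    "ncPersonName": "PersonNameType",
    "ncPersonGivenName": "string",
    "ncPersonMiddleName": "string",
}
REPEATABLE = {"ncPersonMiddleName"}  # elements flagged makeRepeatable

SAMPLE = """
<nc:Person xmlns:nc="http://example.com/niem-core">
  <nc:PersonName>
    <nc:PersonGivenName>Ada</nc:PersonGivenName>
    <nc:PersonMiddleName>M</nc:PersonMiddleName>
    <nc:PersonMiddleName>K</nc:PersonMiddleName>
  </nc:PersonName>
</nc:Person>
"""


def html_id(tag):
    """Turn ElementTree's '{uri}Local' tag into 'prefixLocal' (rule 6)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return PREFIXES.get(uri, "") + local
    return tag


def convert(elem):
    """Convert one XML element, and its descendants, into HTML elements."""
    name = html_id(elem.tag)
    children = list(elem)
    attrs = {"id": name}
    if name in DATATYPES:
        attrs["class"] = DATATYPES[name]  # rule 5
    # Rules 2 and 3: div for node elements, span for leaf elements.
    out = ET.Element("div" if children else "span", attrs)
    if not children:
        out.text = (elem.text or "").strip()
        return out
    lists = {}  # one <ol> per repeatable child name
    for child in children:
        converted = convert(child)
        child_name = converted.get("id")
        if child_name in REPEATABLE:  # rule 4: wrap in ol/li
            ol = lists.get(child_name)
            if ol is None:
                ol = ET.SubElement(out, "ol")
                lists[child_name] = ol
            ET.SubElement(ol, "li").append(converted)
        else:
            out.append(converted)
    return out


if __name__ == "__main__":
    html = convert(ET.fromstring(SAMPLE))
    print(ET.tostring(html, encoding="unicode", method="html"))
```

One thing the sketch makes visible: applying rule 6 literally gives every repeated element the same id, which HTML expects to be unique, so a real transform would probably need to add an index suffix or move the element name to another attribute.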

In the same way that a full HTML page can be created from NIEM information, it should also be possible to generate partial or natural templating. In essence, this is just a fragment of HTML. This may be required to support platforms like Java-Spring-Thymeleaf, Oracle ADF, or Meteor, which all rely on some sort of direction through attribution. The simplest way to expose information is still to create the entire HTML page, instead of a partial. This is noted here because whenever NIEM JSON is used, there will likely be a requirement to generate a template from the NIEM CAM as well.

Note that NIEM is not currently resource-based; there is no built-in facility to support REST by exposing resource identifiers. However, one of the requirements for REST is to expose documentation at the endpoint, and it should be possible to generate this documentation directly from the IEPD (I think Datypic generates something like this for the NIEM Core). In this case, the IEPD may be sufficient.
