Reference Manual for RNV Tools

David Tolpin

Developer

Thomas Schraitle

Manpage

Table of Contents

rnv - RELAX NG Compact Syntax Validator in C
arx - Automatically determine the type of a document from its name and contents
rvp - RELAX NG Validation Pipe

Name

rnv — RELAX NG Compact Syntax Validator in C

Synopsis

rnv { -q | -n NUM | -p | -c | -s | -v | -h } grammar.rnc document.xml

The options are:

-q

names of files being processed are not printed; in error messages, expected elements and attributes are not listed;

-n NUM

sets the maximum number of reported expected elements and attributes; -q sets this to 0, but the value can be overridden;

-p

copies the input to the output;

-c

if the only argument is a grammar, checks the grammar and exits;

-s

uses less memory and runs slower;

-v

prints version number;

-h

displays usage summary and exits.

Limitations

This tool has the following limitations:

  • RNV assumes that the encoding of the syntax file is UTF-8.

  • Support for XML Schema Part 2: Datatypes is partial.

  • The schema parser does not check that all restrictions are obeyed; in particular, restrictions 7.3 and 7.4 are not checked.

  • RNV for Win32 platforms is a Unix program compiled on Win32. It expects file paths to be written with forward slashes; if a schema is in a different directory and includes or refers to external files, the schema's path must be written in the Unix way for the relative paths to work. For example, under Windows, an rnv run that uses ..\schema\docbook.rnc to validate userguide.dbx should be invoked as

    rnv.exe ../schema/docbook.rnc userguide.dbx
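
When rnv is launched from a script on Windows, the conversion to forward slashes can be automated. The following is a minimal Python sketch, not part of RNV; the helper name rnv_command is hypothetical, and the default executable name assumes that rnv is on the search path.

```python
from pathlib import PureWindowsPath


def rnv_command(grammar, document, executable="rnv"):
    """Build an rnv argument list with forward-slash paths.

    PureWindowsPath accepts both backslashes and forward slashes,
    and as_posix() rewrites the path with forward slashes only.
    """
    return [
        executable,
        PureWindowsPath(grammar).as_posix(),
        PureWindowsPath(document).as_posix(),
    ]


# The list can be handed to subprocess.run(); that call is left out
# here so the sketch does not depend on rnv being installed.
command = rnv_command(r"..\schema\docbook.rnc", "userguide.dbx", "rnv.exe")
print(" ".join(command))
```

The printed command line matches the invocation shown above.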

Name

arx — Automatically determine the type of a document from its name and contents

Synopsis

arx { -n | -v | -h } document.xml arx.conf {arx.conf}

ARX either prints a string corresponding to the document's type or nothing if the type cannot be determined. The options are:

-n

turns off prepending base path of the configuration file to the result, even if it looks like a relative path (useful when the configuration file and the grammars are in separate directories, or for association with something that is not a file);

-v

prints version number;

-h

displays usage summary and exits.

The Configuration File

The configuration file must conform to the following grammar:

arx = grammars route*
grammars = "grammars" "{" type2string+ "}"
type2string = type "=" literal
type = nmtoken
route = match|nomatch|valid|invalid
match = "=~" regexp "=>" type
nomatch = "!~" regexp "=>" type
valid = "valid" "{" rng "}" "=>" type
invalid = "!valid" "{" rng "}" "=>" type

literal=string in '"', '"' inside must be prepended by '\'
regexp=string in '/', '/' inside must be prepended by '\'
rng=Relax NG Compact Syntax
      
Comments start with # and continue until the end of the line.

Rules are processed sequentially; the first matching rule determines the file's type. RELAX NG templates are matched against file contents; regular expressions are applied to file names. The sample below associates documents with grammars for XSLT, DocBook or XSL FO.

grammars {
  docbook="docbook.rnc"
  xslt="xslt.rnc"
  xslfo="fo.rnc"
}

valid {
  start = element (book|article|chapter|reference) {any}
  any = (element * {any}|attribute * {text}|text)*
} => docbook
      
!valid {
  default namespace xsl = "http://www.w3.org/1999/XSL/Transform"
  start = element *-xsl:* {not-xsl}
  not-xsl = (element *-xsl:* {not-xsl}|attribute * {text}|text)*
} => xslt
      
=~/.*\.xsl/ => xslt
=~/.*\.fo/ => xslfo

ARX can also be used to link documents to any type of information or processing.
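
The sequential, first-match-wins processing of the file name rules and the effect of -n can be sketched in Python. This is an illustration only, not ARX's implementation: the names GRAMMARS, ROUTES, route and resolve are hypothetical, Python's regular expression dialect stands in for the one ARX uses, whole-name matching is an assumption, and the valid and !valid rules are left out because they need a RELAX NG validator.

```python
import posixpath
import re

# Mirrors the "grammars" block of the sample configuration.
GRAMMARS = {"docbook": "docbook.rnc", "xslt": "xslt.rnc", "xslfo": "fo.rnc"}

# (negate, pattern, type): negate=False models "=~", negate=True models "!~".
ROUTES = [
    (False, r".*\.xsl", "xslt"),
    (False, r".*\.fo", "xslfo"),
]


def route(filename, routes=ROUTES):
    """Return the type of the first rule that applies, or None."""
    for negate, pattern, doc_type in routes:
        # Assumption: the pattern must match the whole file name.
        matched = re.fullmatch(pattern, filename) is not None
        if matched != negate:
            return doc_type
    return None


def resolve(conf_path, doc_type, grammars=GRAMMARS, no_prefix=False):
    """Map a type to its string, prepending the configuration file's
    directory to relative results unless no_prefix (modelling -n) is set."""
    target = grammars[doc_type]
    if no_prefix or posixpath.isabs(target):
        return target
    return posixpath.join(posixpath.dirname(conf_path), target)


print(resolve("/etc/arx/arx.conf", route("style.xsl")))
```

With the configuration file in /etc/arx, the sketch prints the grammar path relative to that directory; passing no_prefix=True returns the bare literal instead.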


Name

rvp — RELAX NG Validation Pipe

Synopsis

rvp { -q | -s | -v | -h } grammar.rnc

The options are:

-q

returns only error numbers, suppresses messages;

-s

uses less memory and runs slower;

-v

prints version number;

-h

displays usage summary and exits.

RELAX NG Validation Pipe

RVP is an abbreviation for RELAX NG Validation Pipe. It reads validation primitives from the standard input and reports the result to the standard output; its main purpose is to ease embedding of a RELAX NG validator into various languages and environments. An application would launch RVP as a parallel process and use a simple protocol to perform validation. The protocol, in BNF, is:

query ::= (
      quit
    | start
    | start-tag-open
    | attribute
    | start-tag-close
    | text
    | end-tag) z.
  quit ::= "quit".
  start ::= "start" [gramno].
  start-tag-open ::= "start-tag-open" patno name.
  attribute ::= "attribute" patno name value.
  start-tag-close ::= "start-tag-close" patno name.
  text ::= ("text"|"mixed") patno text.
  end-tag ::= "end-tag" patno name.
response ::= (ok | er | error) z.
  ok ::= "ok" patno.
  er ::= "er" patno erno.
  error ::= "error" patno erno error.
z ::= "\0".
      
  • RVP assumes that the last colon in a name separates the local part from the namespace URI (this is what one gets when : is specified as the namespace separator to Expat).

  • Error codes can be obtained from the rvp sources by running grep _ER_ *.h and OR-ing the values with the corresponding masks from erbit.h. Additionally, error 0 is the protocol format error.

  • Either er or error responses are returned, not both; -q selects the concise er form instead of the verbose error form (see the Synopsis above).

  • start passes the index of a grammar (first grammar in the list of command-line arguments has number 0); if the number is omitted, 0 is assumed.

  • quit is not the opposite of start; instead, it quits RVP.
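
The message framing and the name convention described above can be sketched in Python. This is a minimal sketch under stated assumptions, not code taken from RVP or its sample scripts: the function names are hypothetical, the fields of a message are assumed to be separated by single spaces, and patno and erno are assumed to be decimal integers.

```python
def encode_query(*fields):
    """Join the fields with single spaces (an assumption) and append
    the zero byte that terminates every message."""
    return " ".join(str(field) for field in fields).encode("utf-8") + b"\0"


def parse_response(raw):
    """Split one zero-terminated response into its parts."""
    text = raw.rstrip(b"\0").decode("utf-8")
    kind, _, rest = text.partition(" ")
    if kind == "ok":
        return {"kind": "ok", "patno": int(rest)}
    if kind == "er":
        patno, erno = rest.split()
        return {"kind": "er", "patno": int(patno), "erno": int(erno)}
    if kind == "error":
        # The error text may itself contain spaces, so split at most twice.
        patno, erno, message = (rest.split(" ", 2) + [""])[:3]
        return {"kind": "error", "patno": int(patno),
                "erno": int(erno), "message": message}
    raise ValueError("unrecognized response: %r" % text)


def split_name(name):
    """Split a name at its last colon into (namespace URI, local part)."""
    uri, _, local = name.rpartition(":")
    return uri, local


print(encode_query("start-tag-open", 0, "http://example.org/ns:book"))
print(split_name("http://www.w3.org/1999/XSL/Transform:template"))
```

Splitting at the last colon matters because namespace URIs usually contain colons themselves; a name without any colon yields an empty namespace URI.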

To assist embedding RVP, samples in Perl (tools/rvp.pl) and Python (tools/rvp.py) are provided. The scripts use Expat wrappers for each of the languages to parse documents; they take a RELAX NG grammar (in the compact syntax) as the command line argument and read the XML from the standard input. For example, the following commands validate rnv.dbx against docbook.rnc:

perl rvp.pl docbook.rnc < rnv.dbx
python rvp.py docbook.rnc < rnv.dbx

The scripts are kept simple and unobscured to illustrate the technique, rather than being designed as general-purpose modules. Programmers using Perl, Python, Ruby and other languages are encouraged to implement and share reusable RVP-based components for their languages of choice.
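
As a starting point for such a component, the pipe handling can be sketched in Python as follows. This is not the code of tools/rvp.py; the function names are hypothetical, and the sketch assumes that the rvp executable is on the search path and that RVP writes each response as soon as it has processed a query.

```python
import subprocess


def launch_rvp(grammar, executable="rvp"):
    """Start RVP as a parallel process with pipes on stdin and stdout."""
    return subprocess.Popen(
        [executable, grammar],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        bufsize=0,  # unbuffered, so each query reaches RVP immediately
    )


def send_query(stream, query):
    """Write one query followed by the terminating zero byte."""
    stream.write(query.encode("utf-8") + b"\0")
    stream.flush()


def read_response(stream):
    """Read bytes up to, but not including, the next zero byte."""
    chunk = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("pipe closed before the response was complete")
        if byte == b"\0":
            return chunk.decode("utf-8")
        chunk += byte


# Possible use, left commented out because it needs rvp and a grammar:
#   proc = launch_rvp("docbook.rnc")
#   send_query(proc.stdin, "start")
#   print(read_response(proc.stdout))
#   send_query(proc.stdin, "quit")
#   proc.wait()
```

Reading one byte at a time keeps the sketch short; a reusable module would more likely read larger chunks and buffer anything that follows the zero byte.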