org.cyberneko.html.filters
public class Purifier extends DefaultFilter
Illegal characters in XML names are converted to the character sequence "_u####_", where "####" is the hexadecimal value of the Unicode character. Illegal characters appearing in document content are converted to the character sequence "\\u####".
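The name conversion described above can be sketched as follows. This is a minimal illustration, not the filter's actual code: the class `NameEscapeSketch`, its methods, and the ASCII-only legality check are hypothetical stand-ins (the real XML Name production allows many more characters), and the four-digit lowercase hex padding is an assumption.

```java
// Hypothetical sketch; not part of org.cyberneko.html.filters.
final class NameEscapeSketch {

    // Simplified stand-in for an XML name check (assumption): ASCII letters
    // and '_' may start a name; digits, '-' and '.' may follow.
    static boolean isAllowed(char c, boolean first) {
        boolean startChar = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
        if (first) {
            return startChar;
        }
        return startChar || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }

    // Replaces each disallowed character with "_u####_", where #### is the
    // character's value in hexadecimal, zero-padded to four digits.
    static String escapeName(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (isAllowed(c, i == 0)) {
                out.append(c);
            } else {
                out.append("_u").append(String.format("%04x", (int) c)).append('_');
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, `NameEscapeSketch.escapeName("a b")` yields `a_u0020_b`, since the space (U+0020) is not allowed in a name.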
In comments, the character '-' is replaced by the character sequence "- " to prevent "--" from ever appearing in the comment content. For CDATA sections, the character ']' is replaced by the character sequence "] " to prevent "]]" from appearing.
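The two replacements above amount to simple string substitutions. The sketch below works on plain `String` values for brevity; the class and method names are hypothetical, and the real filter operates on `XMLString` buffers rather than strings.

```java
// Hypothetical sketch; not part of org.cyberneko.html.filters.
final class CommentCdataSketch {

    // Every '-' becomes "- ", so "--" can never appear in comment content.
    static String purifyComment(String text) {
        return text.replace("-", "- ");
    }

    // Every ']' becomes "] ", so "]]" can never appear in a CDATA section.
    static String purifyCdata(String text) {
        return text.replace("]", "] ");
    }
}
```

For example, `purifyComment("a--b")` gives `a- - b`, and `purifyCdata("x]]>y")` gives `x] ] >y`.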
The URI used for synthesized namespace bindings is "http://cyberneko.org/html/ns/synthesized/number" where number is generated to ensure uniqueness.
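One way to generate such unique URIs is to append a running counter to a fixed prefix. The sketch below assumes that approach; the class name, the `nextUri` method, and the counter starting at 0 are illustrative assumptions, not details confirmed by this page.

```java
// Hypothetical sketch; not part of org.cyberneko.html.filters.
final class SynthesizedNamespaceSketch {

    // Prefix taken from the URI pattern documented above.
    static final String PREFIX = "http://cyberneko.org/html/ns/synthesized/";

    // Running count of synthesized bindings (start value is an assumption).
    private int count = 0;

    // Returns a URI that differs from every URI returned earlier by this instance.
    String nextUri() {
        return PREFIX + (count++);
    }
}
```

Each call to `nextUri()` on the same instance returns a distinct URI, which is enough to keep synthesized bindings from colliding within one document.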
Version: $Id: Purifier.java,v 1.5 2005/02/14 03:56:54 andyc Exp $
Field Summary | |
---|---|
protected static String | AUGMENTATIONS Include infoset augmentations. |
protected boolean | fAugmentations Augmentations. |
protected boolean | fInCDATASection True if inside a CDATA section. |
protected NamespaceContext | fNamespaceContext Namespace information. |
protected boolean | fNamespaces Namespaces. |
protected String | fPublicId Public identifier of doctype declaration. |
protected boolean | fSeenDoctype True if the doctype declaration was seen. |
protected boolean | fSeenRootElement True if root element was seen. |
protected int | fSynthesizedNamespaceCount Synthesized namespace binding count. |
protected String | fSystemId System identifier of doctype declaration. |
protected static String | NAMESPACES Namespaces. |
protected static HTMLEventInfo | SYNTHESIZED_ITEM Synthesized event info item. |
static String | SYNTHESIZED_NAMESPACE_PREFX Synthesized namespace binding prefix. |
Method Summary | |
---|---|
void | characters(XMLString text, Augmentations augs) Characters. |
void | comment(XMLString text, Augmentations augs) Comment. |
void | doctypeDecl(String root, String pubid, String sysid, Augmentations augs) Doctype declaration. |
void | emptyElement(QName element, XMLAttributes attrs, Augmentations augs) Empty element. |
void | endCDATA(Augmentations augs) End CDATA section. |
void | endElement(QName element, Augmentations augs) End element. |
protected void | handleStartDocument() Handle start document. |
protected void | handleStartElement(QName element, XMLAttributes attrs) Handle start element. |
void | processingInstruction(String target, XMLString data, Augmentations augs) Processing instruction. |
protected String | purifyName(String name, boolean localpart) Purify name. |
protected QName | purifyQName(QName qname) Purify qualified name. |
protected XMLString | purifyText(XMLString text) Purify content. |
void | reset(XMLComponentManager manager) |
void | startCDATA(Augmentations augs) Start CDATA section. |
void | startDocument(XMLLocator locator, String encoding, Augmentations augs) Start document. |
void | startDocument(XMLLocator locator, String encoding, NamespaceContext nscontext, Augmentations augs) Start document. |
void | startElement(QName element, XMLAttributes attrs, Augmentations augs) Start element. |
protected void | synthesizeBinding(XMLAttributes attrs, String ns) Synthesize namespace binding. |
protected Augmentations | synthesizedAugs() Returns an augmentations object with a synthesized item added. |
protected static String | toHexString(int c, int padlen) Returns a padded hexadecimal string for the given value. |
void | xmlDecl(String version, String encoding, String standalone, Augmentations augs) XML declaration. |
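For reference, a padded hexadecimal helper with the same signature as `toHexString(int c, int padlen)` might look like the sketch below. It is an assumption about the behavior, not the class's source: in particular, the letter case of the output (lowercase here, as produced by `Integer.toHexString`) is not specified on this page.

```java
// Hypothetical sketch; not part of org.cyberneko.html.filters.
final class HexPadSketch {

    // Returns the hexadecimal form of c, left-padded with '0' to at least
    // padlen digits. Longer values are returned unpadded and untruncated.
    static String toHexString(int c, int padlen) {
        StringBuilder sb = new StringBuilder(Integer.toHexString(c));
        while (sb.length() < padlen) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }
}
```

With this sketch, `toHexString(0x20, 4)` returns `0020`, matching the four-digit "####" form used in the escape sequences described at the top of this page.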