Class TagNode

  • All Implemented Interfaces:
    BaseToken, HtmlNode
    Direct Known Subclasses:
    Serializer.HeadlessTagNode

    public class TagNode
    extends TagToken
    implements HtmlNode

    XML node tag: the basic node of the cleaned HTML tree. It also represents a start tag token after the HTML parsing phase and before the cleaning phase. After cleaning, the tree consists of tag nodes (TagNode), content (text nodes, ContentNode), comments (CommentNode) and, optionally, a doctype node (DoctypeToken).

    • Field Detail

      • attributes

        private java.util.Map<java.lang.String,​java.lang.String> attributes
      • children

        private java.util.List children
      • nsDeclarations

        private java.util.Map<java.lang.String,​java.lang.String> nsDeclarations
      • itemsToMove

        private java.util.List<BaseToken> itemsToMove
      • isFormed

        private transient boolean isFormed
    • Constructor Detail

      • TagNode

        public TagNode​(java.lang.String name)
    • Method Detail

      • setName

        public boolean setName​(java.lang.String name)
        Changes the name of the tag.
        Parameters:
        name - New name of the tag
        Returns:
        True if new name is valid, false otherwise
      • getAttributeByName

        public java.lang.String getAttributeByName​(java.lang.String attName)
        Parameters:
        attName - Name of the attribute to look up
        Returns:
        Value of the specified attribute, or null if this tag doesn't contain it.
      • getAttributes

        public java.util.Map<java.lang.String,​java.lang.String> getAttributes()
        Returns:
        Map instance containing all attribute name/value pairs.
      • hasAttribute

        public boolean hasAttribute​(java.lang.String attName)
        Checks existence of the specified attribute.
        Parameters:
        attName - Name of the attribute to check
      • addAttribute

        @Deprecated
        public void addAttribute​(java.lang.String attName,
                                 java.lang.String attValue)
        Deprecated.
        Use setAttribute instead. Adds the specified attribute to this tag or overrides an existing one.
        Parameters:
        attName - Name of the attribute
        attValue - Value of the attribute
      • setAttribute

        public void setAttribute​(java.lang.String attName,
                                 java.lang.String attValue)
        Adds a new attribute or overrides an existing one.
        Specified by:
        setAttribute in class TagToken
        Parameters:
        attName - Name of the attribute
        attValue - Value of the attribute
      • addNamespaceDeclaration

        public void addNamespaceDeclaration​(java.lang.String nsPrefix,
                                            java.lang.String nsURI)
        Adds namespace declaration to the node
        Parameters:
        nsPrefix - Namespace prefix
        nsURI - Namespace URI
      • getNamespaceDeclarations

        public java.util.Map<java.lang.String,​java.lang.String> getNamespaceDeclarations()
        Returns:
        Map of namespace declarations for this node
      • removeAttribute

        public void removeAttribute​(java.lang.String attName)
        Removes the specified attribute from this tag.
        Parameters:
        attName - Name of the attribute to remove
      • getChildren

        public java.util.List getChildren()
        Returns:
        List of child objects. During the cleanup process it may contain different kinds of children; after cleaning there should be only TagNode instances.
      • hasChildren

        public boolean hasChildren()
        Returns:
        Whether this node has child elements or not.
      • setChildren

        void setChildren​(java.util.List children)
      • getChildTagList

        public java.util.List getChildTagList()
      • getChildTags

        public TagNode[] getChildTags()
        Returns:
        An array of child TagNode instances.
      • getText

        public java.lang.StringBuffer getText()
        Returns:
        Text content of this node and its subelements.
      • getParent

        public TagNode getParent()
        Returns:
        Parent of this node, or null if this is the root node.
      • setDocType

        public void setDocType​(DoctypeToken docType)
      • addChild

        public void addChild​(java.lang.Object child)
      • addChildren

        public void addChildren​(java.util.List newChildren)
        Adds all elements from the specified list to this node.
        Parameters:
        newChildren - List of children to add
      • findElement

        private TagNode findElement​(TagNode.ITagNodeCondition condition,
                                    boolean isRecursive)
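        Finds the first element in the tree that satisfies the specified condition.
        Parameters:
        condition - Condition the element must satisfy
        isRecursive - Whether to search nested subelements as well as direct children
        Returns:
        First TagNode found, or null if there is no such element.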
      • getElementList

        private java.util.List getElementList​(TagNode.ITagNodeCondition condition,
                                              boolean isRecursive)
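        Gets all elements in the tree that satisfy the specified condition.
        Parameters:
        condition - Condition the elements must satisfy
        isRecursive - Whether to search nested subelements as well as direct children
        Returns:
        List of TagNode instances that satisfy the specified condition.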
      • getElements

        private TagNode[] getElements​(TagNode.ITagNodeCondition condition,
                                      boolean isRecursive)
        Parameters:
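        condition - Condition the elements must satisfy
        isRecursive - Whether to search nested subelements as well as direct children
        Returns:
        The array of all subelements that satisfy the specified condition.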
      • getAllElementsList

        public java.util.List getAllElementsList​(boolean isRecursive)
      • getAllElements

        public TagNode[] getAllElements​(boolean isRecursive)
      • findElementByName

        public TagNode findElementByName​(java.lang.String findName,
                                         boolean isRecursive)
      • getElementListByName

        public java.util.List getElementListByName​(java.lang.String findName,
                                                   boolean isRecursive)
      • getElementsByName

        public TagNode[] getElementsByName​(java.lang.String findName,
                                           boolean isRecursive)
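
        The find/get methods above all take an isRecursive flag. The sketch below illustrates the difference it is expected to make, using a hypothetical stand-in Node class rather than TagNode itself. The search order (each child is tested before its own subtree is searched) and the exact, case-sensitive name match are assumptions that this page does not state.

```java
import java.util.ArrayList;
import java.util.List;

final class FindByNameSketch {
    /** Hypothetical stand-in for a tag node: a name plus child elements. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        Node add(Node child) { children.add(child); return this; }

        /** First matching element; descends into subtrees only if isRecursive. */
        Node findElementByName(String findName, boolean isRecursive) {
            for (Node child : children) {
                if (child.name.equals(findName)) return child;
                if (isRecursive) {
                    Node hit = child.findElementByName(findName, true);
                    if (hit != null) return hit;
                }
            }
            return null; // no such element
        }

        /** All matching elements, in the order they are encountered. */
        List<Node> getElementListByName(String findName, boolean isRecursive) {
            List<Node> result = new ArrayList<>();
            for (Node child : children) {
                if (child.name.equals(findName)) result.add(child);
                if (isRecursive) result.addAll(child.getElementListByName(findName, true));
            }
            return result;
        }
    }

    /** Builds body > (div > (p, a), a). */
    static Node sample() {
        Node div = new Node("div").add(new Node("p")).add(new Node("a"));
        return new Node("body").add(div).add(new Node("a"));
    }

    public static void main(String[] args) {
        Node body = sample();
        System.out.println(body.findElementByName("p", false));           // p is not a direct child
        System.out.println(body.findElementByName("p", true).name);       // found inside div
        System.out.println(body.getElementListByName("a", false).size()); // direct children only
        System.out.println(body.getElementListByName("a", true).size());  // whole subtree
    }
}
```

        In this model, a non-recursive call inspects direct children only, so it can return null or a shorter list even when a match exists deeper in the tree.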
      • findElementHavingAttribute

        public TagNode findElementHavingAttribute​(java.lang.String attName,
                                                  boolean isRecursive)
      • getElementListHavingAttribute

        public java.util.List getElementListHavingAttribute​(java.lang.String attName,
                                                            boolean isRecursive)
      • getElementsHavingAttribute

        public TagNode[] getElementsHavingAttribute​(java.lang.String attName,
                                                    boolean isRecursive)
      • findElementByAttValue

        public TagNode findElementByAttValue​(java.lang.String attName,
                                             java.lang.String attValue,
                                             boolean isRecursive,
                                             boolean isCaseSensitive)
      • getElementListByAttValue

        public java.util.List getElementListByAttValue​(java.lang.String attName,
                                                       java.lang.String attValue,
                                                       boolean isRecursive,
                                                       boolean isCaseSensitive)
      • getElementsByAttValue

        public TagNode[] getElementsByAttValue​(java.lang.String attName,
                                               java.lang.String attValue,
                                               boolean isRecursive,
                                               boolean isCaseSensitive)
      • evaluateXPath

        public java.lang.Object[] evaluateXPath​(java.lang.String xPathExpression)
                                         throws XPatherException
        Evaluates an XPath expression on the given node.
        This is not a fully featured XPath parser and evaluator. The examples below show the supported elements:
        • //div//a
        • //div//a[@id][@class]
        • /body/*[1]/@type
        • //div[3]//a[@id][@href='r/n4']
        • (//div[last() >= 4]//./div[position() = last()])[position() > 22]//li[2]//a
        • //div[2]/@*[2]
        • data(//div//a[@id][@class])
        • //p/last()
        • //body//div[3][@class]//span[12.2
        • data(//a['v' < @id])
        Parameters:
        xPathExpression - XPath expression to evaluate
        Returns:
        Array of objects produced by evaluating the expression
        Throws:
        XPatherException
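
        To make the meaning of a few of the simpler expressions above concrete, the sketch below evaluates them against a small well-formed document. It deliberately uses the JDK's own javax.xml.xpath evaluator, not evaluateXPath, so it only illustrates what the expressions select in standard XPath 1.0; whether this class's evaluator agrees in every detail is an assumption. The sample markup and the class name XPathMeaningSketch are made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathMeaningSketch {
    // Hypothetical sample: one div holding two links, plus one link outside it.
    static final String SAMPLE =
        "<body>"
      + "<div type=\"main\"><p><a id=\"x\" class=\"c\"/></p><a id=\"y\"/></div>"
      + "<a id=\"z\" class=\"c\"/>"
      + "</body>";

    /** Number of nodes the expression selects in SAMPLE (JDK evaluator). */
    static int count(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(SAMPLE)));
            NodeList hits = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//div//a"));              // links anywhere under a div
        System.out.println(count("//div//a[@id][@class]")); // ...that carry both attributes
        System.out.println(count("/body/*[1]/@type"));      // type attribute of body's first child
    }
}
```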
      • removeFromTree

        public boolean removeFromTree()
        Removes this node from the tree.
        Returns:
        True if the element was removed, which is possible only if it is not the root node.
      • removeChild

        public boolean removeChild​(java.lang.Object child)
        Removes the specified child element from this node.
        Parameters:
        child - Child object to remove
        Returns:
        True if child object existed in the children list.
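
        The return values of removeFromTree and removeChild can be pictured with a small model. The Node class below is a hypothetical stand-in, not TagNode; it assumes that removeFromTree simply asks the parent to remove this node, which matches the documented return values but is not confirmed by this page.

```java
import java.util.ArrayList;
import java.util.List;

final class RemoveSketch {
    /** Hypothetical stand-in for a tag node with a parent link. */
    static final class Node {
        final String name;
        Node parent; // null for the root
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        void addChild(Node child) {
            child.parent = this;
            children.add(child);
        }

        /** True if the child was present in the children list. */
        boolean removeChild(Node child) {
            return children.remove(child);
        }

        /** True if this node was removed; a root has no parent to remove it from. */
        boolean removeFromTree() {
            return parent != null && parent.removeChild(this);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("html");
        Node body = new Node("body");
        root.addChild(body);

        System.out.println(root.removeFromTree()); // root cannot be removed
        System.out.println(body.removeFromTree()); // detached from its parent
        System.out.println(root.removeChild(body)); // already gone
    }
}
```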
      • removeAllChildren

        public void removeAllChildren()
        Removes all children (subelements and text content).
      • replaceChild

        public void replaceChild​(HtmlNode childToReplace,
                                 HtmlNode replacement)
        Replaces specified child node with specified replacement node.
        Parameters:
        childToReplace - Child node to be replaced
        replacement - Replacement node
      • getChildIndex

        public int getChildIndex​(HtmlNode child)
        Parameters:
        child - Child to find index of
        Returns:
        Index of the specified child node inside this node's children, -1 if node is not the child
      • insertChild

        public void insertChild​(int index,
                                HtmlNode childToAdd)
        Inserts the specified node at the specified position in the list of children.
        Parameters:
        index - Position at which to insert the node
        childToAdd - Node to insert
      • insertChildBefore

        public void insertChildBefore​(HtmlNode node,
                                      HtmlNode nodeToInsert)
        Inserts specified node in the list of children before specified child
        Parameters:
        node - Child before which to insert new node
        nodeToInsert - Node to be inserted at specified position
      • insertChildAfter

        public void insertChildAfter​(HtmlNode node,
                                     HtmlNode nodeToInsert)
        Inserts specified node in the list of children after specified child
        Parameters:
        node - Child after which to insert new node
        nodeToInsert - Node to be inserted at specified position
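
        The relationship between getChildIndex, insertChild, insertChildBefore and insertChildAfter can be sketched with a plain list. The class below is a hypothetical model that uses strings in place of HtmlNode children. Doing nothing when the reference child is not found is an assumption; this page does not say what happens in that case.

```java
import java.util.ArrayList;
import java.util.List;

final class ChildInsertSketch {
    private final List<String> children = new ArrayList<>();

    void addChild(String child) { children.add(child); }

    /** Index of the child, or -1 if it is not a child of this node. */
    int getChildIndex(String child) { return children.indexOf(child); }

    void insertChild(int index, String childToAdd) { children.add(index, childToAdd); }

    /** Inserts at the reference child's position, shifting it right. */
    void insertChildBefore(String node, String nodeToInsert) {
        int index = getChildIndex(node);
        if (index >= 0) insertChild(index, nodeToInsert); // assumed no-op otherwise
    }

    /** Inserts at the position just after the reference child. */
    void insertChildAfter(String node, String nodeToInsert) {
        int index = getChildIndex(node);
        if (index >= 0) insertChild(index + 1, nodeToInsert); // assumed no-op otherwise
    }

    List<String> getChildren() { return children; }

    static List<String> demo() {
        ChildInsertSketch parent = new ChildInsertSketch();
        parent.addChild("h1");
        parent.addChild("p");
        parent.insertChildBefore("p", "hr");     // h1, hr, p
        parent.insertChildAfter("p", "footer");  // h1, hr, p, footer
        parent.insertChildAfter("missing", "x"); // reference not found: unchanged
        return parent.getChildren();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```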
      • addItemForMoving

        void addItemForMoving​(BaseToken item)
      • getItemsToMove

        java.util.List<BaseToken> getItemsToMove()
      • setItemsToMove

        void setItemsToMove​(java.util.List<BaseToken> itemsToMove)
      • isFormed

        boolean isFormed()
      • setFormed

        void setFormed​(boolean isFormed)
      • setFormed

        void setFormed()
      • traverse

        public void traverse​(TagNodeVisitor visitor)
        Traverses the tree and performs the visitor's action on each node. Traversal stops when the whole tree has been visited or when the visitor returns false.
        Parameters:
        visitor - TagNodeVisitor implementation
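
        The stopping rule described for traverse can be sketched as follows. Node and Visitor are hypothetical stand-ins for TagNode and TagNodeVisitor, and the depth-first, parent-before-children order is an assumption; this page only states that traversal ends once the visitor returns false.

```java
import java.util.ArrayList;
import java.util.List;

final class TraverseSketch {
    /** Hypothetical visitor: return false to stop the traversal. */
    interface Visitor {
        boolean visit(Node parent, Node node);
    }

    /** Hypothetical stand-in for a tag node. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        Node add(Node child) { children.add(child); return this; }

        void traverse(Visitor visitor) { traverseInternally(null, visitor); }

        /** Returns false as soon as the visitor asks to stop. */
        private boolean traverseInternally(Node parent, Visitor visitor) {
            if (!visitor.visit(parent, this)) return false;
            for (Node child : children) {
                if (!child.traverseInternally(this, visitor)) return false;
            }
            return true;
        }
    }

    /** Names visited before (and including) the node that stops the walk. */
    static List<String> visitedUntil(String stopAt) {
        Node tree = new Node("html")
                .add(new Node("head").add(new Node("title")))
                .add(new Node("body").add(new Node("p")));
        List<String> seen = new ArrayList<>();
        tree.traverse((parent, node) -> {
            seen.add(node.name);
            return !node.name.equals(stopAt); // false ends the traversal
        });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(visitedUntil("title"));
        System.out.println(visitedUntil("no-such-tag"));
    }
}
```

        Returning false from the visitor is a simple way to implement a "find first" search without collecting the whole tree.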
      • traverseInternally

        private boolean traverseInternally​(TagNodeVisitor visitor)
      • collectNamespacePrefixesOnPath

        void collectNamespacePrefixesOnPath​(java.util.Set<java.lang.String> prefixes)
        Collects all prefixes from the namespace declarations found on the path from the specified node up to the document root.
        Parameters:
        prefixes - Set of prefixes to be collected
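
        Collecting prefixes "up the path to the document root" amounts to a walk along parent links. The sketch below models that walk with a hypothetical Node class (not TagNode) in which each node holds its own prefix-to-URI declarations. The prefixes and URIs are sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class NamespacePathSketch {
    /** Hypothetical stand-in for a tag node with namespace declarations. */
    static final class Node {
        final Node parent; // null for the root
        final Map<String, String> nsDeclarations = new LinkedHashMap<>();

        Node(Node parent) { this.parent = parent; }

        void addNamespaceDeclaration(String nsPrefix, String nsURI) {
            nsDeclarations.put(nsPrefix, nsURI);
        }

        /** Adds this node's prefixes, then those of every ancestor. */
        void collectNamespacePrefixesOnPath(Set<String> prefixes) {
            prefixes.addAll(nsDeclarations.keySet());
            if (parent != null) parent.collectNamespacePrefixesOnPath(prefixes);
        }
    }

    /** Prefixes visible from a leaf two levels below the root. */
    static Set<String> prefixesFromLeaf() {
        Node root = new Node(null);
        root.addNamespaceDeclaration("svg", "http://www.w3.org/2000/svg");
        Node middle = new Node(root);
        middle.addNamespaceDeclaration("xlink", "http://www.w3.org/1999/xlink");
        Node leaf = new Node(middle); // declares nothing itself

        Set<String> prefixes = new TreeSet<>();
        leaf.collectNamespacePrefixesOnPath(prefixes);
        return prefixes;
    }

    public static void main(String[] args) {
        System.out.println(prefixesFromLeaf());
    }
}
```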
      • getNamespaceURIOnPath

        java.lang.String getNamespaceURIOnPath​(java.lang.String nsPrefix)
      • serialize

        public void serialize​(Serializer serializer,
                              java.io.Writer writer)
                       throws java.io.IOException
        Specified by:
        serialize in interface BaseToken
        Throws:
        java.io.IOException