Class XmlParser
java.lang.Object
com.pnfsoftware.jeb.util.encoding.xml.XmlParser
A limited, simple, lenient, fast, and read-only XML parser. It can be used as a back-up when the
JDK implementation (typically, Apache Xerces) fails on documents deviating from the XML
specifications.
Features and limitations:
- XML version must be 1.x; encoding must be UTF-8.
- All Unicode characters are accepted as long as there is no parsing ambiguity.
- Multiple root elements are allowed.
- Android-style backslash escapes in attribute values can be supported (see setHandleBackslashAxmlStyle).
- This parser returns XML documents implementing the read-only parts of the standard org.w3c.dom API (refer to the X... classes in this package).
- The following XML node types are NOT supported: DocumentFragment, Entity, EntityReference, Notation, ProcessingInstruction.
- Calls to unsupported methods will raise UnsupportedOperationException.
- Unclosed tags can be allowed (like the famous HTML <br>).
- Unquoted attribute values are allowed.
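To make the "back-up" role concrete, the following is a minimal sketch that only uses the standard JDK DOM parser to detect inputs it rejects, such as documents with several root elements or unquoted attribute values. The class and method names (StrictParseCheck, strictParses) are hypothetical and not part of this package; the point where a lenient parser would take over is only marked by a comment.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper: reports whether the JDK's default DOM parser accepts the input.
class StrictParseCheck {
    static boolean strictParses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // The strict parser gave up; a lenient parser such as XmlParser
            // could be tried on the same input at this point.
            return false;
        }
    }

    public static void main(String[] args) {
        // Well-formed document with a single root element.
        System.out.println(strictParses("<?xml version=\"1.0\" encoding=\"UTF-8\"?><a/>"));
        // Two root elements: not well-formed XML, so the JDK parser rejects it.
        System.out.println(strictParses("<?xml version=\"1.0\" encoding=\"UTF-8\"?><a/><b/>"));
    }
}
```

Note that the JDK parser may also print a diagnostic to the standard error stream when it rejects a document.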
-
Constructor Summary
Constructors:
- XmlParser()
Method Summary
Modifier and Type, Method, Description:
- boolean isAllowMismatchedTags()
- boolean isAllowNoXmlDeclaration()
- boolean isAllowUnclosedTags()
- boolean isAssignParentNodes()
- boolean isHandleBackslashAxmlStyle()
- boolean isSortAttributes()
- parse(byte[] bytes): Parse the provided XML data and return an XMLDocument object.
- parse(String str): Parse the provided XML string and return an XMLDocument object.
- void setAllowMismatchedTags(boolean allowMismatchedTags)
- void setAllowNoXmlDeclaration(boolean allowNoXmlDeclaration)
- void setAllowUnclosedTags(boolean allowUnclosedTags)
- void setAssignParentNodes(boolean assignParentNodes)
- void setHandleBackslashAxmlStyle(boolean handleBackslashAxmlStyle)
- void setSortAttributes(boolean sortAttributes)
-
Constructor Details
-
XmlParser
public XmlParser()
-
-
Method Details
-
setAssignParentNodes
public void setAssignParentNodes(boolean assignParentNodes)
Parameters:
assignParentNodes - if true, the internal parent and/or owner fields are set, allowing the use of methods like Node.getParentNode(), Node.getOwnerDocument(), Attr.getOwnerElement(), etc.
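For reference, the sketch below shows what the three navigation methods named above return on a document whose parent and owner links are populated. It uses the JDK's own DOM parser rather than XmlParser, so it only illustrates the standard org.w3c.dom contract; the class and method names (ParentLinksDemo, parentNameOfChild, ownerLinksAreSet) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo of the org.w3c.dom navigation methods, using the JDK parser.
class ParentLinksDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("input is not well-formed XML", e);
        }
    }

    // Name of the parent of the first <child> element, via Node.getParentNode().
    static String parentNameOfChild(Document doc) {
        Element child = (Element) doc.getElementsByTagName("child").item(0);
        return child.getParentNode().getNodeName();
    }

    // Checks Attr.getOwnerElement() and Node.getOwnerDocument() on the first <child>.
    static boolean ownerLinksAreSet(Document doc) {
        Element child = (Element) doc.getElementsByTagName("child").item(0);
        Attr id = child.getAttributeNode("id");
        return id.getOwnerElement() == child && child.getOwnerDocument() == doc;
    }

    public static void main(String[] args) {
        Document doc = parse("<root><child id=\"1\"/></root>");
        System.out.println(parentNameOfChild(doc));
        System.out.println(ownerLinksAreSet(doc));
    }
}
```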
-
isAssignParentNodes
public boolean isAssignParentNodes()
Returns:
the current setting; the default is false
-
setSortAttributes
public void setSortAttributes(boolean sortAttributes)
Parameters:
sortAttributes - if true, the element attributes are sorted alphabetically by name; if false, the original order is maintained (note: Xerces sorts attributes alphabetically)
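The two orderings can be pictured with plain collections. This is a sketch only, not the parser's implementation: the names (AttributeOrderDemo, attributeNames) are hypothetical, and it assumes "alphabetically" means the natural String ordering, which the documentation does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of the effect of the sortAttributes flag.
class AttributeOrderDemo {
    static List<String> attributeNames(List<String> inDocumentOrder, boolean sortAttributes) {
        List<String> names = new ArrayList<>(inDocumentOrder);
        if (sortAttributes) {
            // Assumption: natural String ordering stands in for "alphabetically".
            Collections.sort(names);
        }
        return names;
    }

    public static void main(String[] args) {
        // Attribute names as they appear in <item name="..." id="..." class="..."/>.
        List<String> source = Arrays.asList("name", "id", "class");
        System.out.println(attributeNames(source, false)); // original order kept
        System.out.println(attributeNames(source, true));  // sorted by name
    }
}
```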
-
isSortAttributes
public boolean isSortAttributes()
Returns:
the current setting; the default is false
-
setHandleBackslashAxmlStyle
public void setHandleBackslashAxmlStyle(boolean handleBackslashAxmlStyle)
Parameters:
handleBackslashAxmlStyle - if true, \n and \t escapes are allowed in attribute values; any other escape (\x) results in the character "x". If false, the normal XML behavior applies: \ is a regular character, meaning \x is literally "\x".
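The escape rule described above can be sketched as a small standalone function. This is an illustration of the documented rule, not the parser's actual code: the names (AxmlBackslashDemo, unescape) are hypothetical, and keeping a lone trailing backslash unchanged is an assumption, since the documentation does not cover that case.

```java
// Hypothetical re-implementation of the documented backslash rule for attribute values.
class AxmlBackslashDemo {
    static String unescape(String raw, boolean handleBackslashAxmlStyle) {
        if (!handleBackslashAxmlStyle) {
            return raw; // normal XML behavior: backslash is a regular character
        }
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '\\' || i + 1 == raw.length()) {
                // Regular character, or a lone trailing backslash (assumption: kept as-is).
                out.append(c);
                continue;
            }
            char next = raw.charAt(++i); // consume the escaped character
            switch (next) {
                case 'n': out.append('\n'); break; // \n becomes a line feed
                case 't': out.append('\t'); break; // \t becomes a tab
                default:  out.append(next); break; // \x becomes "x"
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("a\\tb", true));  // tab between a and b
        System.out.println(unescape("a\\tb", false)); // backslash kept literally
    }
}
```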
-
isHandleBackslashAxmlStyle
public boolean isHandleBackslashAxmlStyle()
Returns:
the current setting; the default is false
-
setAllowUnclosedTags
public void setAllowUnclosedTags(boolean allowUnclosedTags)
Parameters:
allowUnclosedTags - if true, the parser will consider unclosed tags as part of XText
-
isAllowUnclosedTags
public boolean isAllowUnclosedTags()
Returns:
the current setting; the default is false
-
setAllowMismatchedTags
public void setAllowMismatchedTags(boolean allowMismatchedTags)
Parameters:
allowMismatchedTags - if true, the parser will match the tags case-insensitively
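As a rough picture of what case-insensitive matching means for an opening and closing tag pair, consider the sketch below. It is not the parser's code: the names (TagMatchDemo, closes) are hypothetical, and using String.equalsIgnoreCase for the comparison is an assumption.

```java
// Hypothetical illustration of strict versus case-insensitive tag name matching.
class TagMatchDemo {
    static boolean closes(String openTag, String closeTag, boolean allowMismatchedTags) {
        // Assumption: equalsIgnoreCase models the case-insensitive comparison.
        return allowMismatchedTags
                ? openTag.equalsIgnoreCase(closeTag)
                : openTag.equals(closeTag);
    }

    public static void main(String[] args) {
        // <Item> ... </item>
        System.out.println(closes("Item", "item", false)); // strict: names differ
        System.out.println(closes("Item", "item", true));  // lenient: names match
    }
}
```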
-
isAllowMismatchedTags
public boolean isAllowMismatchedTags()
Returns:
the current setting; the default is false
-
isAllowNoXmlDeclaration
public boolean isAllowNoXmlDeclaration()
Returns:
the current setting; the default is false
-
setAllowNoXmlDeclaration
public void setAllowNoXmlDeclaration(boolean allowNoXmlDeclaration)
Parameters:
allowNoXmlDeclaration - if true, the parser will process the input even if no XML declaration (<?xml...) is found
-
parse
Parse the provided XML string and return an XMLDocument object. This method will throw if an error occurs.
Parameters:
str - XML string (the encoding attribute is disregarded)
Returns:
a document object, never null (the method throws on error)
Throws:
ParseException
-
parse
Parse the provided XML data and return an XMLDocument object. This method will throw if an error occurs.
Parameters:
bytes - XML data
Returns:
a document object, never null (the method throws on error)
Throws:
ParseException
-