XML Services in Adaptive Server Enterprise
Chapter 3: XML Language and XML Query Language
The XML query functions support the XML 1.0 standard for XML documents and the XPath 1.0 standard for XML queries. This chapter describes the subsets of those standards that "XML Services in Adaptive Server" supports.
The native XML processor supports only ASCII data in XML documents. It does not support non-ASCII data, such as Shift_JIS-encoded Japanese text or accented characters. The encoding declaration in the XML declaration of a document may specify UTF-8, UTF-16, ISO-10646-UCS-4, ISO-8859-1, or ASCII, or may be omitted, in which case it defaults to UTF-8. Regardless of the encoding that is specified or defaulted, XML documents must contain only ASCII characters.
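Because documents must contain only ASCII characters regardless of the declared encoding, it can be useful to check a document on the client before passing it to the server. The following is a minimal Python sketch of such a pre-check. The function name first_non_ascii and the sample documents are hypothetical and are not part of XML Services.

```python
def first_non_ascii(text):
    """Return the offset of the first non-ASCII character, or None if all are ASCII."""
    for offset, ch in enumerate(text):
        if ord(ch) > 127:  # ASCII covers code points 0 through 127
            return offset
    return None


# Hypothetical sample documents, used only for illustration.
ascii_doc = "<?xml version='1.0' encoding='UTF-8'?><a>plain</a>"
accented_doc = "<a>caf\u00e9</a>"  # contains an accented 'e'

print(first_non_ascii(ascii_doc))
print(first_non_ascii(accented_doc))
```

A document for which the function returns None contains only ASCII characters. Any other result is the position of the first character that would need to be removed or replaced.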
XML documents specify URIs (Uniform Resource Identifiers) in two contexts: as href attributes or document text, and as external references for DTDs, entity definitions, XML schemas, and namespace declarations. There are no restrictions on the use of URIs as href attributes or document text. For external references, XML Services resolves URIs that specify http URLs. External-reference URIs that specify file or ftp URLs, or relative URIs, are not supported.
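The rule for external references can be sketched as a simple scheme check. This Python example is an illustration only. The function name is hypothetical and the URLs are placeholders. Treating every scheme other than http as unsupported (including https, which the text does not mention) is an assumption.

```python
from urllib.parse import urlparse


def is_resolvable_external_reference(uri):
    """Return True only for URIs with the http scheme.

    Assumption: only http is named as supported in the text, so every other
    scheme, and any relative URI (empty scheme), is treated as unsupported.
    """
    return urlparse(uri).scheme == "http"


# Placeholder URIs, used only for illustration.
for uri in ("http://example.com/doc.dtd",
            "ftp://example.com/doc.dtd",
            "file:///tmp/doc.dtd",
            "../dtds/doc.dtd"):
    print(uri, is_resolvable_external_reference(uri))
```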
You can parse and store XML documents with namespace declarations and references with no restriction.
However, when XML element and attribute names that have namespace prefixes are referenced in XML query expressions in xmlextract and xmltest, the namespace prefix and colon are treated as part of the element or attribute name. They are not processed as namespace references.
You can parse and store XML documents with XML schema declarations, subject to the restrictions on external references described in URI support.
You can also query XML documents with XML schema declarations, using xmlextract and xmltest, with the restriction that the XML schemas are ignored. All elements are treated as character data, and no schema validation is performed.
The special characters for quote ("), apostrophe ('), less-than (<), greater-than (>), and ampersand (&) are used for punctuation in XML, and are represented with the predefined entities &quot;, &apos;, &lt;, &gt;, and &amp;. Notice that the semicolon is part of the entity. You cannot use "<" or "&" directly in attributes or elements, as the following series of examples demonstrates.
select xmlparse("<a atr='<'/>")
Msg 14702, Level 16, State 0:
Line 1:
XMLPARSE(): XML parser fatal error
<<A '<' character cannot be used in attribute 'atr', except through &lt;>>
at line 1, offset 14.
select xmlparse("<a atr1='&'>")
Msg 14702, Level 16, State 0:
Line 1:
XMLPARSE(): XML parser fatal error
<<Expected entity name for reference>>
at line 1, offset 11
select xmlparse("<a> < </a>")
Msg 14702, Level 16, State 0:
Line 2:
XMLPARSE(): XML parser fatal error
<<Expected an element name>>
at line 1, offset 6.
select xmlparse("<a> & </a>")
Msg 14702, Level 16, State 0:
Line 1:
XMLPARSE(): XML parser fatal error
<<Expected entity name for reference>>
at line 1, offset 6.
Instead, use the predefined entities &lt; and &amp;, as follows:
select xmlextract("/",
"<a atr='&lt; &amp;'> &lt; &amp; </a>" )
--------------------------------
<a atr="&lt; &amp;"> &lt; &amp; </a>
You can use quotation marks within attributes delimited by apostrophes, and vice versa. These marks are replaced by the predefined entities &quot; or &apos;. In the following examples, notice that the quotation marks or apostrophes surrounding the word 'yes' are doubled to comply with the SQL character literal convention:
select xmlextract("/", "<a atr=' ""yes"" '/> " )
---------------------------------
<a atr=" &quot;yes&quot; "></a>
select xmlextract('/', '<a atr=" ''yes'' "/> ' )
----------------------------
<a atr=" &apos;yes&apos; "></a>
You can use quotation marks and apostrophes within elements. They are replaced by the predefined entities &quot; and &apos;, as the following example shows:
select xmlextract("/", "<a> ""yes"" and 'no' </a>" )
-------------------------------------
<a> &quot;yes&quot; and &apos;no&apos; </a>
You can also use ">" in attributes or elements, and it is replaced by the predefined entity &gt;, as this example demonstrates:
select xmlextract("/", "<a atr='>'> > </a>" )
----------------------------------------------
<a atr="&gt;"> &gt; </a>
When you specify XML queries with character literals that contain the XML special characters, you can write them either as plain characters or as predefined entities. The following example shows two points:
The XML document contains an element "<a>" whose value is the XML special characters &<>", represented by their predefined entities, &amp;&lt;&gt;&quot;.
The XML query specifies a character literal with those same XML special characters, also represented by their predefined entities.
select xmlextract('/a="&amp;&lt;&gt;&quot;"',
"<a>&amp;&lt;&gt;&quot;</a>")
----------------------------------
<a>&amp;&lt;&gt;&quot;</a>
The following example is the same, except that the XML query specifies the character literal with the plain XML special characters. Those XML special characters are replaced by the predefined entities before the query is evaluated.
select xmlextract("/a='&<>""' " ,
"<a>&amp;&lt;&gt;&quot;</a>")
----------------------------------
<a>&amp;&lt;&gt;&quot;</a>
All whitespace is preserved, and is significant in queries.
select xmlextract("/a[@atr=' this or that ' ]",
"<a atr=' this or that '><b> which or what
</b></a>")
-------------------------------------------------
<a atr=" this or that ">
<b> which or what </b></a>
select xmlextract("/a[b=' which or what ']",
"<a atr=' this or that '><b> which or what
</b></a>")
---------------------------------------------
<a atr=' this or that '>
<b> which or what </b></a>
Empty elements that are entered in the style "<a/>" are stored and returned in the style "<a></a>":
select xmlextract("/",
"<doc><a/> <b></b></doc>")
-----------------------------------------
<doc>
<a></a>
<b></b></doc>
XML Services supports a subset of the standard XPath language. That subset is defined by the syntax and tokens in the following section.
XML Services supports the following XPath syntax:
xpath::= or_expr
or_expr::= and_expr | and_expr TOKEN_OR or_expr
and_expr::= union_expr | union_expr TOKEN_AND and_expr
union_expr::= intersect_expr
| intersect_expr TOKEN_UNION union_expr
intersect_expr::= comparison_expr
| comparison_expr TOKEN_INTERSECT intersect_expr
comparison_expr::= range_expr
| range_expr general_comp comparisonRightHandSide
general_comp::= TOKEN_EQUAL | TOKEN_NOTEQUAL
| TOKEN_LESSTHAN | TOKEN_LESSTHANEQUAL
| TOKEN_GREATERTHAN | TOKEN_GREATERTHANEQUAL
range_expr::= unary_expr | unary_expr TOKEN_TO unary_expr
unary_expr::= TOKEN_MINUS path_expr
| TOKEN_PLUS path_expr
| path_expr
comparisonRightHandSide::= literal
path_expr::= relativepath_expr | TOKEN_SLASH
| TOKEN_SLASH relativepath_expr | TOKEN_DOUBLESLASH relativepath_expr
relativepath_expr::= step_expr
| step_expr TOKEN_SLASH relativepath_expr
| step_expr TOKEN_DOUBLESLASH relativepath_expr
step_expr::= forward_step predicates
| primary_expr predicates
| predicates
primary_expr::= literal
forward_step::= abbreviated_forward_step
abbreviated_forward_step::= name_test
| TOKEN_ATRATE name_test
| TOKEN_PERIOD
name_test::= q_name | wild_card | text_test
text_test ::= TOKEN_TEXT TOKEN_LPAREN TOKEN_RPAREN
literal::= numeric_literal | string_literal
wild_card::= TOKEN_ASTERISK
q_name::= TOKEN_ID
string_literal::= TOKEN_STRING
numeric_literal::= TOKEN_INT | TOKEN_FLOATVAL
| TOKEN_MINUS TOKEN_INT
| TOKEN_MINUS TOKEN_FLOATVAL
predicates::=
| TOKEN_LSQUARE expr TOKEN_RSQUARE predicates
| TOKEN_LSQUARE expr TOKEN_RSQUARE
The following tokens are supported by the XML Services subset of XPath:
APOS ::= '''
DIGITS ::= [0-9]+
NONAPOS ::= '^''
NONQUOTE ::= '^"'
NONSTART ::= LETTER | DIGIT | '.' | '-' | '_' | ':'
QUOTE ::= '"'
START ::= LETTER | '_'
TOKEN_AND ::= 'and'
TOKEN_ASTERISK ::= '*'
TOKEN_ATRATE ::= '@'
TOKEN_COMMA ::= ','
TOKEN_DOUBLESLASH ::= '//'
TOKEN_EQUAL ::= '='
TOKEN_GREATERTHAN ::= '>'
TOKEN_GREATERTHANEQUAL ::= '>='
TOKEN_INTERSECT ::= 'intersect'
TOKEN_LESSTHAN ::= '<'
TOKEN_LESSTHANEQUAL ::= '<='
TOKEN_LPAREN ::= '('
TOKEN_LSQUARE ::= '['
TOKEN_MINUS ::= '-'
TOKEN_NOT ::= 'not'
TOKEN_NOTEQUAL ::= '!='
TOKEN_OR ::= 'or'
TOKEN_PERIOD ::= '.'
TOKEN_PLUS ::= '+'
TOKEN_RPAREN ::= ')'
TOKEN_RSQUARE ::= ']'
TOKEN_SLASH ::= '/'
TOKEN_TO ::= 'to'
TOKEN_UNION ::= '|' | 'union'
TOKEN_ID ::= START [NONSTART...]
TOKEN_FLOATVAL ::= DIGITS | '.'DIGITS | DIGITS'.'DIGITS
TOKEN_INT ::= DIGITS
TOKEN_STRING ::=
QUOTE NONQUOTE... QUOTE
| APOS NONAPOS... APOS
TOKEN_TEXT ::= 'text'
This section specifies the XPath subset supported by the XML processor.
XPath basic operators
Table 3-1 shows the supported basic XPath operators.
Operator | Description |
/ | Path (Children): the child operator ('/') selects from immediate children of the left-side collection. |
// | Descendants: the descendant operator ('//') selects from arbitrary descendants of the left-side collection. |
* | Collecting element children: an element can be referenced without using its name by substituting the '*' collection. |
@ | Attribute: attribute names are preceded by the '@' symbol. |
[] | Filter: you can apply constraints and branching to any collection by adding a filter clause '[ ]' to the collection. The filter is analogous to the SQL where clause with "any" semantics. The filter contains a query, called the subquery. If a collection is placed within the filter, a Boolean "true" is generated if the collection contains any members, and a "false" is generated if the collection is empty. |
[n] | Index: an index is mainly used to find a specific node within a set of nodes. Enclose the index within square brackets. The first node is index 1. |
[-n] | Backtrack index: returns the element that is n-1 units from the last element. -1 means the last element, and -2 means the next-to-last element. |
[m to n] | Subscript: returns elements m through n, where m is the first index and n is the last index. |
text() | Selects the text nodes of the current context node. |
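The operators in Table 3-1 can be explored outside Adaptive Server with a small Python sketch. It uses the standard xml.etree.ElementTree module, which implements its own limited XPath subset and is not the XML Services processor, so it serves only as an analogy. ElementTree has no '[-n]' or '[m to n]' forms, so the helpers by_index and by_range (hypothetical names) model those on a plain list of matched nodes. The sample document is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
doc = ET.fromstring(
    "<doc>"
    "<a atr='x'><b>1</b><b>2</b><b>3</b></a>"
    "<a atr='y'><b>4</b></a>"
    "</doc>"
)


def texts(elements):
    return [e.text for e in elements]


# Path operators, evaluated by ElementTree relative to <doc>.
child = texts(doc.findall("a/b"))               # '/'  : b children of a children
descendant = texts(doc.findall(".//b"))         # '//' : b elements at any depth
wildcard = texts(doc.findall("a/*"))            # '*'  : any element child of a
filtered = texts(doc.findall("a[@atr='y']/b"))  # '[]' with '@': filter on an attribute

# Index forms from Table 3-1, modelled on a list of matched nodes.
nodes = texts(doc.findall(".//b"))


def by_index(items, n):
    """[n] counts from 1; [-n] counts back from the end (-1 is the last item)."""
    if n == 0:
        raise ValueError("index 0 is not defined; the first node is index 1")
    return items[n - 1] if n > 0 else items[n]


def by_range(items, m, n):
    """[m to n] returns items m through n, both inclusive and 1-based."""
    return items[m - 1:n]


print(child, filtered)
print(by_index(nodes, 1), by_index(nodes, -1), by_range(nodes, 2, 3))
```

With four matched nodes, index -2 selects the third node, which is 1 unit (n-1) from the last element, as the table describes.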
Table 3-2 shows the supported XPath set operators.
Operator | Description |
union, '|' | Union: the union operator (shortcut '|') returns the combined set of values from the query on the left and the query on the right. Duplicates are filtered out and the resulting list is sorted in document order. |
intersect | Intersection: intersect operator returns the set of elements in common between two sets. |
( ) | Group: you can use parentheses to group collection operators. |
. (dot) | Period: dot term is evaluated with respect to a search context. The term evaluates to a set that contains only the reference node for this search context. |
Boolean operators | Boolean expressions can be used within subqueries. |
and | Boolean "and". |
or | Boolean "or". |
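The union and intersect semantics described above (duplicates removed, results in document order) can be sketched in Python. This is an illustration built on xml.etree.ElementTree, not the XML Services implementation. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
doc = ET.fromstring("<doc><a>1</a><b>2</b><a>3</a><c>4</c></doc>")


def union(root, left, right):
    """Combine two query results, drop duplicates, and sort in document order."""
    # root.iter() walks the tree in document order, so its position is the sort key.
    order = {id(e): i for i, e in enumerate(root.iter())}
    merged = {id(e): e for e in root.findall(left) + root.findall(right)}
    return sorted(merged.values(), key=lambda e: order[id(e)])


def intersect(root, left, right):
    """Return the elements that appear in both query results."""
    right_ids = {id(e) for e in root.findall(right)}
    return [e for e in root.findall(left) if id(e) in right_ids]


print([e.text for e in union(doc, "a", "b")])
print([e.text for e in union(doc, "a", "*")])
print([e.text for e in intersect(doc, "*", "a")])
```

Elements are compared by identity rather than by text, so two distinct elements with the same content are both kept in a union.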
Table 3-3 shows the supported XPath comparison operators.
Operator | Description |
= | equality |
!= | non-equality |
< | less than |
> | greater than |
>= | greater than or equal to |
<= | less than or equal to |
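As a quick check of what each comparison operator means, the following Python sketch maps the operator tokens to the standard comparison functions. It compares plain Python numbers, which is an assumption. How XML Services converts element values before comparing them is not described in this section. The names COMPARISONS and compare are hypothetical.

```python
import operator

# Operator tokens mapped to Python's standard comparison functions.
COMPARISONS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def compare(value, op, literal):
    """Apply the comparison named by op to value (left) and literal (right)."""
    return COMPARISONS[op](value, literal)


print(compare(3, ">=", 3), compare(2, ">=", 3))
print(compare(2, "<=", 3), compare(4, "<=", 3))
```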