| ~~ Licensed to the Apache Software Foundation (ASF) under one or more |
| ~~ contributor license agreements. See the NOTICE file distributed with |
| ~~ this work for additional information regarding copyright ownership. |
| ~~ The ASF licenses this file to You under the Apache License, Version 2.0 |
| ~~ (the "License"); you may not use this file except in compliance with |
| ~~ the License. You may obtain a copy of the License at |
| ~~ |
| ~~ http://www.apache.org/licenses/LICENSE-2.0 |
| ~~ |
| ~~ Unless required by applicable law or agreed to in writing, software |
| ~~ distributed under the License is distributed on an "AS IS" BASIS, |
| ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| ~~ See the License for the specific language governing permissions and |
| ~~ limitations under the License. |
| |
| Developer Data Handling |
| |
| |
| * Hyracks Data Mapping |
| |
| {{{http://hyracks.org}Hyracks}} supports several basic data types stored in |
| byte arrays. The byte arrays can be accessed through objects referred to as |
| pointables. The pointable helps with tracking the bytes stored in a larger |
| storage array. Some pointables support converting the byte array into a |
| desired format such as for numeric type. The most basic pointable has three |
| values stored in the object. |
| |
| * byte array |
| |
| * starting offset |
| |
| * length |
| |
| [] |
| |
| In Apache VXQuery\x99 the TaggedValuePointable is used to read a result from this byte |
| array. The first byte defines the data type and alerts us to what pointable |
| to use for reading the rest of the data. |
| |
| ** Fixed Length Data |
| |
| Fixed length data types can be stored in a set field size. The following |
| outlines the Hyracks data type or custom VXQuery definition with the |
| details about the implementation. |
| |
| *-------------------------+----------------------+---------------+ |
| | <<Data Type>> | <<Pointable Name>> | <<Data Size>> | |
| *-------------------------+----------------------+---------------: |
| | xs:boolean | BooleanPointable | 1 | |
| *-------------------------+----------------------+---------------: |
| | xs:byte | BytePointable | 1 | |
| *-------------------------+----------------------+---------------: |
| | xs:date | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDatePointable.java}XSDatePointable}} | 6 | |
| *-------------------------+----------------------+---------------: |
| | xs:dateTime | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDateTimePointable.java}XSDateTimePointable}} | 12 | |
| *-------------------------+----------------------+---------------: |
| | xs:dayTimeDuration | LongPointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:decimal | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDecimalPointable.java}XSDecimalPointable}} | 9 | |
| *-------------------------+----------------------+---------------: |
| | xs:double | DoublePointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:duration | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDurationPointable.java}XSDurationPointable}} | 12 | |
| *-------------------------+----------------------+---------------: |
| | xs:float | FloatPointable | 4 | |
| *-------------------------+----------------------+---------------: |
| | xs:gDay | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDatePointable.java}XSDatePointable}} | 6 | |
| *-------------------------+----------------------+---------------: |
| | xs:gMonth | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDatePointable.java}XSDatePointable}} | 6 | |
| *-------------------------+----------------------+---------------: |
| | xs:gMonthDay | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDatePointable.java}XSDatePointable}} | 6 | |
| *-------------------------+----------------------+---------------: |
| | xs:gYear | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDatePointable.java}XSDatePointable}} | 6 | |
| *-------------------------+----------------------+---------------: |
| | xs:gYearMonth | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSDatePointable.java}XSDatePointable}} | 6 | |
| *-------------------------+----------------------+---------------: |
| | xs:int | IntegerPointable | 4 | |
| *-------------------------+----------------------+---------------: |
| | xs:integer | LongPointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:negativeInteger | LongPointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:nonNegativeInteger | LongPointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:nonPositiveInteger | LongPointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:positiveInteger | LongPointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:short | ShortPointable | 2 | |
| *-------------------------+----------------------+---------------: |
| | xs:time | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSTimePointable.java}XSTimePointable}} | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:unsignedByte | ShortPointable | 2 | |
| *-------------------------+----------------------+---------------: |
| | xs:unsignedInt | LongPointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:unsignedLong | LongPointable | 8 | |
| *-------------------------+----------------------+---------------: |
| | xs:unsignedShort | IntegerPointable | 4 | |
| *-------------------------+----------------------+---------------: |
| | xs:yearMonthDuration | IntegerPointable | 4 | |
| *-------------------------+----------------------+---------------: |
| |
| ** Variable Length Data |
| |
| Some information can not be stored in a fixed length value. The following |
| data types are stored in variable length values. Because the size varies, |
| the first two bytes are used to store the length of the total value in |
| bytes. QName is one exception to this rule because the QName field has |
| three distinct variable length fields. In this case we basically are |
| storing three strings right after each other. |
| |
| Please note that all strings are stored in UTF8. The UTF8 characters range |
| in size from one to three bytes. UTF8StringWriter supports writing a |
| character sequence into the UTF8StringPointable format. |
| |
| *-------------------------+----------------------+---------------+ |
| | <<Data Type>> | <<Pointable Name>> | <<Data Size>> | |
| *-------------------------+----------------------+---------------: |
| | xs:anyURI | UTF8StringPointable | 2 + length | |
| *-------------------------+----------------------+---------------: |
| | xs:base64Binary | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSBinaryPointable.java}XSBinaryPointable}} | 2 + length | |
| *-------------------------+----------------------+---------------: |
| | xs:hexBinary | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSBinaryPointable.java}XSBinaryPointable}} | 2 + length | |
| *-------------------------+----------------------+---------------: |
| | xs:NOTATION | UTF8StringPointable | 2 + length | |
| *-------------------------+----------------------+---------------: |
| | xs:QName | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/atomic/XSQNamePointable.java}XSQNamePointable}} | 6 + length | |
| *-------------------------+----------------------+---------------: |
| | xs:string | UTF8StringPointable | 2 + length | |
| *-------------------------+----------------------+---------------: |
| |
| |
| * XPath Node Types |
| |
| XML is used as the data source for XQuery and must be parsed into Hyracks data. Each |
| node type defined in XPath and XQuery can be mapped into pointable defined in Apache |
| VXQuery\x99. |
| |
| *-----------------------------------+----------------------+---------------+ |
| | <<Data Type>> | <<Pointable Name>> | <<Data Size>> | |
| *-----------------------------------+----------------------+---------------: |
| | Attribute Nodes(ANP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/nodes/AttributeNodePointable.java}AttributeNodePointable}} | 1 + length | |
| *-----------------------------------+----------------------+---------------: |
| | Document Nodes(DNP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/nodes/DocumentNodePointable.java}DocumentNodePointable}} | 1 + length | |
| *-----------------------------------+----------------------+---------------: |
| | Element Nodes(ENP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/nodes/ElementNodePointable.java}ElementNodePointable}} | 1 + length | |
| *-----------------------------------+----------------------+---------------: |
| | Node Tree(NTP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/nodes/NodeTreePointable.java}NodeTreePointable}} | 1 + length | |
| *-----------------------------------+----------------------+---------------: |
| | Processing Instruction Node(PINP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/nodes/PINodePointable.java}PINodePointable}} | 1 + length | |
| *-----------------------------------+----------------------+---------------: |
| | Comment Node(CNP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/nodes/TextOrCommentNodePointable.java}TextOrCommentNodePointable}} | 1 + length | |
| *-----------------------------------+----------------------+---------------: |
| | Text Node(TNP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/nodes/TextOrCommentNodePointable.java}TextOrCommentNodePointable}} | 1 + length | |
| *-----------------------------------+----------------------+---------------: |
| |
| |
| * JSONiq Data Types |
| |
| Json-items are used as a data source for JSONiq and must be parsed into Hyracks data. Each json-item defined in JSONiq can be mapped into pointable defined in Apache VXQuery\x99. |
| |
| *-------------------------+----------------------+---------------: |
| | Array(AP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/jsonitem/ArrayPointable.java}ArrayPointable}} | 1 + length | |
| *-------------------------+----------------------+---------------: |
| | Object(OP) | {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/datamodel/accessors/jsonitem/ObjectPointable.java}ObjectPointable}} | 1 + length | |
| *-------------------------+----------------------+---------------: |
| | js:null(null) | | 1 | |
| *-------------------------+----------------------+---------------: |
| |
| |
| * String Iterators |
| |
| For many string functions, we have used string iterators to traverse the |
| string. The iterator allows the user to ignore the details about the byte |
| size and number of characters. The iterator returns the next character or |
| an end of string value. Stacking iterators can be used to alter the string |
| into a desired form. |
| |
| * {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/runtime/functions/strings/ICharacterIterator.java}ICharacterIterator}} |
| |
| * {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/runtime/functions/strings/LowerCaseCharacterIterator.java}LowerCaseStringCharacterIterator}} |
| |
| * {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/runtime/functions/strings/SubstringAfterCharacterIterator.java}SubstringAfterStringCharacterIterator}} |
| |
| * {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/runtime/functions/strings/SubstringBeforeCharacterIterator.java}SubstringBeforeStringCharacterIterator}} |
| |
| * {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/runtime/functions/strings/SubstringCharacterIterator.java}SubstringStringCharacterIterator}} |
| |
| * {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/runtime/functions/strings/UTF8StringCharacterIterator.java}UTF8StringCharacterIterator}} |
| |
| * {{{https://git-wip-us.apache.org/repos/asf?p=vxquery.git;a=blob;f=vxquery-core/src/main/java/org/apache/vxquery/runtime/functions/strings/UpperCaseCharacterIterator.java}UpperCaseStringCharacterIterator}} |
| |
| * Array Backed Value Store |
| |
| The array back value store is a key design element of Hyracks. The object |
| is used to manage an output array. The system creates an array large enough |
| to hold your output. Adding to the result, if necessary. The array can be |
| reused and can hold multiple pointable results due to the starting offset |
| parameter in the pointable. |
| |