Posted to notifications@asterixdb.apache.org by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org> on 2016/06/20 23:40:45 UTC

Change in asterixdb[master]: Fix Decoding of byte[] Records

abdullah alamoudi has uploaded a new change for review.

  https://asterix-gerrit.ics.uci.edu/951

Change subject: Fix Decoding of byte[] Records
......................................................................

Fix Decoding of byte[] Records

Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
1 file changed, 9 insertions(+), 8 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/51/951/1

diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
index 6ce5e98..7e1f142 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
@@ -39,17 +39,15 @@
 import com.couchbase.client.deps.io.netty.buffer.ByteBuf;
 import com.couchbase.client.deps.io.netty.util.ReferenceCountUtil;
 
-public class DCPMessageToRecordConverter
-        implements IRecordToRecordWithMetadataAndPKConverter<DCPRequest, char[]> {
+public class DCPMessageToRecordConverter implements IRecordToRecordWithMetadataAndPKConverter<DCPRequest, char[]> {
 
     private final RecordWithMetadataAndPK<char[]> recordWithMetadata;
     private final CharArrayRecord value;
     private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
     private final ByteBuffer bytes = ByteBuffer.allocateDirect(ExternalDataConstants.DEFAULT_BUFFER_SIZE);
     private final CharBuffer chars = CharBuffer.allocate(ExternalDataConstants.DEFAULT_BUFFER_SIZE);
-    private static final IAType[] CB_META_TYPES = new IAType[] { /*ID*/BuiltinType.ASTRING,
-            /*VBID*/BuiltinType.AINT32, /*SEQ*/BuiltinType.AINT64, /*CAS*/BuiltinType.AINT64,
-            /*EXPIRATION*/BuiltinType.AINT32,
+    private static final IAType[] CB_META_TYPES = new IAType[] { /*ID*/BuiltinType.ASTRING, /*VBID*/BuiltinType.AINT32,
+            /*SEQ*/BuiltinType.AINT64, /*CAS*/BuiltinType.AINT64, /*EXPIRATION*/BuiltinType.AINT32,
             /*FLAGS*/BuiltinType.AINT32, /*REV*/BuiltinType.AINT64, /*LOCK*/BuiltinType.AINT32 };
     private static final int[] PK_INDICATOR = { 1 };
     private static final int[] PK_INDEXES = { 0 };
@@ -105,16 +103,19 @@
         int position = content.readerIndex();
         final int limit = content.writerIndex();
         final int contentSize = content.readableBytes();
+        bytes.clear();
+        System.err.println("Using netty decoder: " + content.toString(StandardCharsets.UTF_8));
         while (position < limit) {
-            bytes.clear();
             chars.clear();
-            if ((contentSize - position) < bytes.capacity()) {
+            if ((contentSize - position) < bytes.remaining()) {
                 bytes.limit(contentSize - position);
             }
-            content.getBytes(position, bytes);
+            content.getBytes(position + bytes.position(), bytes);
             position += bytes.position();
             bytes.flip();
             decoder.decode(bytes, chars, false);
+            bytes.compact();
+            position -= bytes.position();
             chars.flip();
             record.append(chars);
         }
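
The loop above keeps the bytes of a multi-byte UTF-8 sequence that was split
across two chunks by compacting the staging buffer instead of clearing it.
What follows is a minimal standalone sketch of that idea, not the code in this
change: it uses only java.nio and a plain byte[] in place of the Netty ByteBuf,
and the class name ChunkedUtf8Decode, the decode helper, and the 8-byte buffer
size are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class ChunkedUtf8Decode {

    // Hypothetical helper: decodes src as UTF-8 through a staging buffer of bufSize bytes.
    static String decode(byte[] src, int bufSize) {
        if (bufSize < 4) {
            // A UTF-8 sequence can be 4 bytes long; a smaller buffer could never hold one.
            throw new IllegalArgumentException("bufSize must be at least 4");
        }
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer bytes = ByteBuffer.allocate(bufSize);
        // UTF-8 never yields more chars than input bytes, so bufSize chars is enough per pass.
        CharBuffer chars = CharBuffer.allocate(bufSize);
        StringBuilder out = new StringBuilder();
        int position = 0;
        while (position < src.length) {
            // Fill whatever space is left after the bytes carried over from the last pass.
            int n = Math.min(bytes.remaining(), src.length - position);
            bytes.put(src, position, n);
            position += n;
            bytes.flip();
            chars.clear();
            boolean endOfInput = position == src.length;
            CoderResult result = decoder.decode(bytes, chars, endOfInput);
            if (result.isError()) {
                throw new IllegalArgumentException("Malformed or unmappable input: " + result);
            }
            chars.flip();
            out.append(chars);
            // Move any undecoded tail (an incomplete sequence) to the front of the buffer
            // so the next pass can complete it; with nothing left this acts like clear().
            bytes.compact();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Mixes 1-, 2-, 3- and 4-byte sequences so that some straddle an 8-byte chunk boundary.
        String input = "Je peux manger du verre, \u00e7a ne me fait pas mal. \u79c1\u306f \uD83D\uDE00";
        String decoded = decode(input.getBytes(StandardCharsets.UTF_8), 8);
        System.out.println(input.equals(decoded));
    }
}
```

A small buffer is used on purpose so that the carry-over path runs many times
on a short input. Clearing the buffer on every pass instead of compacting it
would drop the leading bytes of any character that straddles a chunk boundary.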

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
abdullah alamoudi has submitted this change and it was merged.

Change subject: Fix Decoding of byte[] Records
......................................................................


Fix Decoding of byte[] Records

Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Reviewed-on: https://asterix-gerrit.ics.uci.edu/951
Reviewed-by: Yingyi Bu <bu...@gmail.com>
Reviewed-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/CharArrayRecord.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
A asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/parser/test/ByteBufUTF8DecodeTest.java
A asterixdb/asterix-external-data/src/test/resources/ICanEatGlass.txt
A asterixdb/asterix-external-data/src/test/resources/record.json
5 files changed, 637 insertions(+), 10 deletions(-)

Approvals:
  Yingyi Bu: Looks good to me, approved
  Jenkins: Looks good to me, but someone else must approve; Verified



diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/CharArrayRecord.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/CharArrayRecord.java
index f174962..33f9673 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/CharArrayRecord.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/CharArrayRecord.java
@@ -69,7 +69,7 @@
                 throw new IOException(
                         "Record is too large!. Maximum record size is " + ExternalDataConstants.MAX_RECORD_SIZE);
             }
-            int newSize = Math.min((int)(len * ExternalDataConstants.DEFAULT_BUFFER_INCREMENT_FACTOR),
+            int newSize = Math.min((int) (len * ExternalDataConstants.DEFAULT_BUFFER_INCREMENT_FACTOR),
                     ExternalDataConstants.MAX_RECORD_SIZE);
             value = Arrays.copyOf(value, newSize);
         }
@@ -88,7 +88,7 @@
 
     @Override
     public String toString() {
-        return String.valueOf(value, 0, size);
+        return String.valueOf(value, 0, size == 0 ? 0 : size - 1);
     }
 
     public void endRecord() throws IOException {
diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
index 6ce5e98..01466fd 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
@@ -39,17 +39,15 @@
 import com.couchbase.client.deps.io.netty.buffer.ByteBuf;
 import com.couchbase.client.deps.io.netty.util.ReferenceCountUtil;
 
-public class DCPMessageToRecordConverter
-        implements IRecordToRecordWithMetadataAndPKConverter<DCPRequest, char[]> {
+public class DCPMessageToRecordConverter implements IRecordToRecordWithMetadataAndPKConverter<DCPRequest, char[]> {
 
     private final RecordWithMetadataAndPK<char[]> recordWithMetadata;
     private final CharArrayRecord value;
     private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
-    private final ByteBuffer bytes = ByteBuffer.allocateDirect(ExternalDataConstants.DEFAULT_BUFFER_SIZE);
+    private final ByteBuffer bytes = ByteBuffer.allocate(ExternalDataConstants.DEFAULT_BUFFER_SIZE);
     private final CharBuffer chars = CharBuffer.allocate(ExternalDataConstants.DEFAULT_BUFFER_SIZE);
-    private static final IAType[] CB_META_TYPES = new IAType[] { /*ID*/BuiltinType.ASTRING,
-            /*VBID*/BuiltinType.AINT32, /*SEQ*/BuiltinType.AINT64, /*CAS*/BuiltinType.AINT64,
-            /*EXPIRATION*/BuiltinType.AINT32,
+    private static final IAType[] CB_META_TYPES = new IAType[] { /*ID*/BuiltinType.ASTRING, /*VBID*/BuiltinType.AINT32,
+            /*SEQ*/BuiltinType.AINT64, /*CAS*/BuiltinType.AINT64, /*EXPIRATION*/BuiltinType.AINT32,
             /*FLAGS*/BuiltinType.AINT32, /*REV*/BuiltinType.AINT64, /*LOCK*/BuiltinType.AINT32 };
     private static final int[] PK_INDICATOR = { 1 };
     private static final int[] PK_INDEXES = { 0 };
@@ -105,16 +103,22 @@
         int position = content.readerIndex();
         final int limit = content.writerIndex();
         final int contentSize = content.readableBytes();
+        bytes.clear();
         while (position < limit) {
-            bytes.clear();
             chars.clear();
             if ((contentSize - position) < bytes.capacity()) {
                 bytes.limit(contentSize - position);
             }
-            content.getBytes(position, bytes);
+            content.getBytes(position + bytes.position(), bytes);
             position += bytes.position();
             bytes.flip();
             decoder.decode(bytes, chars, false);
+            if (bytes.hasRemaining()) {
+                bytes.compact();
+                position -= bytes.position();
+            } else {
+                bytes.clear();
+            }
             chars.flip();
             record.append(chars);
         }
diff --git a/asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/parser/test/ByteBufUTF8DecodeTest.java b/asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/parser/test/ByteBufUTF8DecodeTest.java
new file mode 100644
index 0000000..c238f1c
--- /dev/null
+++ b/asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/parser/test/ByteBufUTF8DecodeTest.java
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.asterix.external.parser.test;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.net.URISyntaxException;
+import java.nio.ByteBuffer;
+import java.nio.CharBuffer;
+import java.nio.charset.CharsetDecoder;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.asterix.external.api.IRawRecord;
+import org.apache.asterix.external.input.record.CharArrayRecord;
+import org.apache.asterix.external.input.record.converter.DCPMessageToRecordConverter;
+import org.apache.asterix.external.input.record.reader.stream.SemiStructuredRecordReader;
+import org.apache.asterix.external.input.stream.LocalFSInputStream;
+import org.apache.asterix.external.util.FileSystemWatcher;
+import org.junit.Assert;
+import org.junit.Test;
+
+import com.couchbase.client.deps.io.netty.buffer.ByteBuf;
+import com.couchbase.client.deps.io.netty.buffer.UnpooledByteBufAllocator;
+
+public class ByteBufUTF8DecodeTest {
+
+    private final int BUFFER_SIZE = 8; // Small buffer size to ensure multiple loop execution in the decode call
+    private final int KB32 = 32768;
+    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
+    private final ByteBuffer bytes = ByteBuffer.allocate(BUFFER_SIZE);
+    private final CharBuffer chars = CharBuffer.allocate(BUFFER_SIZE);
+    private final CharArrayRecord value = new CharArrayRecord();
+    private final ByteBuf nettyBuffer = UnpooledByteBufAllocator.DEFAULT.heapBuffer(KB32, Integer.MAX_VALUE);
+
+    @Test
+    public void eatGlass() {
+        try {
+            String fileName = getClass().getResource("/ICanEatGlass.txt").toURI().getPath();
+            try (BufferedReader br = new BufferedReader(new FileReader(new File(fileName)))) {
+                for (String line; (line = br.readLine()) != null;) {
+                    process(line);
+                }
+            }
+        } catch (Throwable e) {
+            e.printStackTrace();
+            Assert.fail(e.getMessage());
+        }
+    }
+
+    @Test
+    public void testDecodingJsonRecords() throws URISyntaxException, IOException {
+        String jsonFileName = "/record.json";
+        List<Path> paths = new ArrayList<>();
+        paths.add(Paths.get(getClass().getResource(jsonFileName).toURI()));
+        FileSystemWatcher watcher = new FileSystemWatcher(paths, null, false);
+        LocalFSInputStream in = new LocalFSInputStream(watcher);
+        try (SemiStructuredRecordReader recordReader = new SemiStructuredRecordReader(in, "{", "}")) {
+            while (recordReader.hasNext()) {
+                try {
+                    IRawRecord<char[]> record = recordReader.next();
+                    process(record.toString());
+                } catch (Throwable th) {
+                    th.printStackTrace();
+                    Assert.fail(th.getMessage());
+                }
+            }
+        }
+    }
+
+    private void process(String input) throws IOException {
+        value.reset();
+        nettyBuffer.clear();
+        nettyBuffer.writeBytes(input.getBytes(StandardCharsets.UTF_8));
+        DCPMessageToRecordConverter.set(nettyBuffer, decoder, bytes, chars, value);
+        Assert.assertEquals(input, value.toString());
+    }
+}
diff --git a/asterixdb/asterix-external-data/src/test/resources/ICanEatGlass.txt b/asterixdb/asterix-external-data/src/test/resources/ICanEatGlass.txt
new file mode 100644
index 0000000..a3d9ca6
--- /dev/null
+++ b/asterixdb/asterix-external-data/src/test/resources/ICanEatGlass.txt
@@ -0,0 +1,149 @@
+Sanskrit: \ufeff\u0915\u093e\u091a\u0902 \u0936\u0915\u094d\u0928\u094b\u092e\u094d\u092f\u0924\u094d\u0924\u0941\u092e\u094d \u0964 \u0928\u094b\u092a\u0939\u093f\u0928\u0938\u094d\u0924\u093f \u092e\u093e\u092e\u094d \u0965
+Sanskrit (standard transcription): k\u0101ca\u1e43 \u015baknomyattum; nopahinasti m\u0101m.
+Classical Greek: \u1f55\u03b1\u03bb\u03bf\u03bd \u03d5\u03b1\u03b3\u03b5\u1fd6\u03bd \u03b4\u1f7b\u03bd\u03b1\u03bc\u03b1\u03b9\u0387 \u03c4\u03bf\u1fe6\u03c4\u03bf \u03bf\u1f54 \u03bc\u03b5 \u03b2\u03bb\u1f71\u03c0\u03c4\u03b5\u03b9.
+Greek (monotonic): \u039c\u03c0\u03bf\u03c1\u03ce \u03bd\u03b1 \u03c6\u03ac\u03c9 \u03c3\u03c0\u03b1\u03c3\u03bc\u03ad\u03bd\u03b1 \u03b3\u03c5\u03b1\u03bb\u03b9\u03ac \u03c7\u03c9\u03c1\u03af\u03c2 \u03bd\u03b1 \u03c0\u03ac\u03b8\u03c9 \u03c4\u03af\u03c0\u03bf\u03c4\u03b1.
+Greek (polytonic): \u039c\u03c0\u03bf\u03c1\u1ff6 \u03bd\u1f70 \u03c6\u03ac\u03c9 \u03c3\u03c0\u03b1\u03c3\u03bc\u03ad\u03bd\u03b1 \u03b3\u03c5\u03b1\u03bb\u03b9\u1f70 \u03c7\u03c9\u03c1\u1f76\u03c2 \u03bd\u1f70 \u03c0\u03ac\u03b8\u03c9 \u03c4\u03af\u03c0\u03bf\u03c4\u03b1.
+Latin: Vitrum edere possum; mihi non nocet.
+Old French: Je puis mangier del voirre. Ne me nuit.
+French: Je peux manger du verre, ça ne me fait pas mal.
+Provençal / Occitan: Pòdi manjar de veire, me nafrariá pas.
+Québécois: J'peux manger d'la vitre, ça m'fa pas mal.
+Walloon: Dji pou magnî do vêre, çoula m' freut nén må.
+Picard: Ch'peux mingi du verre, cha m'foé mie n'ma.
+Kreyòl Ayisyen (Haitï): Mwen kap manje vè, li pa blese'm.
+Basque: Kristala jan dezaket, ez dit minik ematen.
+Catalan / Català: Puc menjar vidre, que no em fa mal.
+Spanish: Puedo comer vidrio, no me hace daño.
+Aragonés: Puedo minchar beire, no me'n fa mal.
+Galician: Eu podo xantar cristais e non cortarme.
+European Portuguese: Posso comer vidro, não me faz mal.
+Brazilian Portuguese: Posso comer vidro, não me machuca.
+Caboverdiano/Kabuverdianu (Cape Verde): M' podê cumê vidru, ca ta maguâ-m'.
+Papiamentu: Ami por kome glas anto e no ta hasimi daño.
+Italian: Posso mangiare il vetro e non mi fa male.
+Milanese: Sôn bôn de magnà el véder, el me fa minga mal.
+Roman: Me posso magna' er vetro, e nun me fa male.
+Napoletano: M' pozz magna' o'vetr, e nun m' fa mal.
+Venetian: Mi posso magnare el vetro, no'l me fa mae.
+Zeneise (Genovese): Pòsso mangiâ o veddro e o no me fà mâ.
+Sicilian: Puotsu mangiari u vitru, nun mi fa mali.
+Romansch (Grischun): Jau sai mangiar vaider, senza che quai fa donn a mai.
+Romanian: Pot s\u0103 m\u0103n�nc sticl\u0103 \u0219i ea nu m\u0103 r\u0103ne\u0219te.
+Esperanto: Mi povas man\u011di vitron, \u011di ne dama\u011das min.
+Cornish: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
+Welsh: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi.
+Manx Gaelic: Foddym gee glonney agh cha jean eh gortaghey mee.
+Old Irish (Ogham): \u169b\u169b\u1689\u1691\u1685\u1694\u1689\u1689\u1694\u168b\u1680\u1694\u1688\u1694\u1680\u168d\u1682\u1690\u1685\u1691\u1680\u1685\u1694\u168b\u168c\u1693\u1685\u1690\u169c
+Old Irish (Latin): Con·iccim ithi nglano. Ním·géna.
+Irish: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom.
+Ulster Gaelic: Ithim-sa gloine agus ní miste damh é.
+Scottish Gaelic: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
+Anglo-Saxon (Runes): \u16c1\u16b3\u16eb\u16d7\u16a8\u16b7\u16eb\u16b7\u16da\u16a8\u16cb\u16eb\u16d6\u16a9\u16cf\u16aa\u16be\u16eb\u16a9\u16be\u16de\u16eb\u16bb\u16c1\u16cf\u16eb\u16be\u16d6\u16eb\u16bb\u16d6\u16aa\u16b1\u16d7\u16c1\u16aa\u16a7\u16eb\u16d7\u16d6\u16ec
+Anglo-Saxon (Latin): Ic mæg glæs eotan ond hit ne hearmiað me.
+Middle English: Ich canne glas eten and hit hirti� me nou\u021dt.
+English: I can eat glass and it doesn't hurt me.
+English (IPA): [a\u026a k�n i\u02d0t gl\u0251\u02d0s �nd \u026at d\u0250z n\u0252t h\u025c\u02d0t mi\u02d0] (Received Pronunciation)
+English (Braille): \u280a\u2800\u2809\u2801\u281d\u2800\u2811\u2801\u281e\u2800\u281b\u2807\u2801\u280e\u280e\u2800\u2801\u281d\u2819\u2800\u280a\u281e\u2800\u2819\u2815\u2811\u280e\u281d\u281e\u2800\u2813\u2825\u2817\u281e\u2800\u280d\u2811
+Jamaican: Mi kian niam glas han i neba hot mi.
+Lalland Scots / Doric: Ah can eat gless, it disnae hurt us.
+Gothic: \u040c\u040c\u040c \u040c\u040c\u040c\u040d \u040c\u0308\u040d\u040c\u040c, \u040c\u040c \u040c\u040c\u040d \u040d\u040c \u040c\u040c\u040c\u040c \u040c\u040d\u040c\u040c\u040c\u040c\u040c.
+Old Norse (Runes): \u16d6\u16b4 \u16b7\u16d6\u16cf \u16d6\u16cf\u16c1 \u16a7 \u16b7\u16da\u16d6\u16b1 \u16d8\u16be \u16a6\u16d6\u16cb\u16cb \u16a8\u16a7 \u16a1\u16d6 \u16b1\u16a7\u16a8 \u16cb\u16a8\u16b1
+Old Norse (Latin): Ek get etið gler án þess að verða sár.
+Norsk / Norwegian (Nynorsk): Eg kan eta glas utan å skada meg.
+Norsk / Norwegian (Bokmål): Jeg kan spise glass uten å skade meg.
+Føroyskt / Faroese: Eg kann eta glas, skaðaleysur.
+Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.
+Svenska / Swedish: Jag kan äta glas utan att skada mig.
+Dansk / Danish: Jeg kan spise glas, det gør ikke ondt på mig.
+Sønderjysk: Æ ka æe glass uhen at det go mæ naue.
+Frysk / Frisian: Ik kin glês ite, it docht me net sear.
+Nederlands / Dutch: Ik kan glas eten, het doet m\u0133 geen kwaad.
+Kirchröadsj/Bôchesserplat: Iech ken glaas èèse, mer 't deet miech jing pieng.
+Afrikaans: Ek kan glas eet, maar dit doen my nie skade nie.
+Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir nët wei.
+Deutsch / German: Ich kann Glas essen, ohne mir zu schaden.
+Ruhrdeutsch: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut.
+Langenfelder Platt: Isch kann Jlaas kimmeln, uuhne datt mich datt weh dääd.
+Lausitzer Mundart ("Lusatian"): Ich koann Gloos assn und doas dudd merr ni wii.
+Odenwälderisch: Iech konn glaasch voschbachteln ohne dass es mir ebbs daun doun dud.
+Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue.
+Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
+Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix!
+Deutsch (Voralberg): I ka glas eassa, ohne dass mar weh tuat.
+Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei.
+Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
+Schwyzerdütsch (Zürich): Ich chan Glaas ässe, das schadt mir nöd.
+Schwyzerdütsch (Luzern): Ech cha Glâs ässe, das schadt mer ned.
+Hungarian: Meg tudom enni az �veget, nem lesz t\u0151le bajom.
+Suomi / Finnish: Voin syödä lasia, se ei vahingoita minua.
+Sami (Northern): S�ht�n borrat l�sa, dat ii leat b�v\u010d\u010das.
+Erzian: \u041c\u043e\u043d \u044f\u0440\u0441\u0430\u043d \u0441\u0443\u043b\u0438\u043a\u0430\u0434\u043e, \u0434\u044b \u0437\u044b\u044f\u043d \u044d\u0439\u0441\u0442\u044d\u043d\u0437\u044d \u0430 \u0443\u043b\u0438.
+Northern Karelian: Mie voin syvvä lasie ta minla ei ole kipie.
+Southern Karelian: Minä voin syvvä st'oklua dai minule ei ole kibie.
+Estonian: Ma võin klaasi süüa, see ei tee mulle midagi.
+Latvian: Es varu \u0113st stiklu, tas man nekait\u0113.
+Lithuanian: A\u0161 galiu valgyti stikl\u0105 ir jis man\u0119s ne\u017eeid\u017eia.
+Czech: Mohu j�st sklo, neubl�\u017e� mi.
+Slovak: M�\u017eem jes\u0165 sklo. Nezran� ma.
+Polska / Polish: Mog\u0119 je\u015b\u0107 szk\u0142o i mi nie szkodzi.
+Slovenian: Lahko jem steklo, ne da bi mi \u0161kodovalo.
+Bosnian, Croatian, Montenegrin and Serbian (Latin): Ja mogu jesti staklo, i to mi ne \u0161teti.
+Bosnian, Montenegrin and Serbian (Cyrillic): \u0408\u0430 \u043c\u043e\u0433\u0443 \u0458\u0435\u0441\u0442\u0438 \u0441\u0442\u0430\u043a\u043b\u043e, \u0438 \u0442\u043e \u043c\u0438 \u043d\u0435 \u0448\u0442\u0435\u0442\u0438.
+Macedonian: \u041c\u043e\u0436\u0430\u043c \u0434\u0430 \u0458\u0430\u0434\u0430\u043c \u0441\u0442\u0430\u043a\u043b\u043e, \u0430 \u043d\u0435 \u043c\u0435 \u0448\u0442\u0435\u0442\u0430.
+Russian: \u042f \u043c\u043e\u0433\u0443 \u0435\u0441\u0442\u044c \u0441\u0442\u0435\u043a\u043b\u043e, \u043e\u043d\u043e \u043c\u043d\u0435 \u043d\u0435 \u0432\u0440\u0435\u0434\u0438\u0442.
+Belarusian (Cyrillic): \u042f \u043c\u0430\u0433\u0443 \u0435\u0441\u0446\u0456 \u0448\u043a\u043b\u043e, \u044f\u043d\u043e \u043c\u043d\u0435 \u043d\u0435 \u0448\u043a\u043e\u0434\u0437\u0456\u0446\u044c.
+Belarusian (Lacinka): Ja mahu je\u015bci \u0161k\u0142o, jano mne ne \u0161kodzi\u0107.
+Ukrainian: \u042f \u043c\u043e\u0436\u0443 \u0457\u0441\u0442\u0438 \u0441\u043a\u043b\u043e, \u0456 \u0432\u043e\u043d\u043e \u043c\u0435\u043d\u0456 \u043d\u0435 \u0437\u0430\u0448\u043a\u043e\u0434\u0438\u0442\u044c.
+Bulgarian: \u041c\u043e\u0433\u0430 \u0434\u0430 \u044f\u043c \u0441\u0442\u044a\u043a\u043b\u043e, \u0442\u043e \u043d\u0435 \u043c\u0438 \u0432\u0440\u0435\u0434\u0438.
+Georgian: \u10db\u10d8\u10dc\u10d0\u10e1 \u10d5\u10ed\u10d0\u10db \u10d3\u10d0 \u10d0\u10e0\u10d0 \u10db\u10e2\u10d9\u10d8\u10d5\u10d0.
+Armenian: \u053f\u0580\u0576\u0561\u0574 \u0561\u057a\u0561\u056f\u056b \u0578\u0582\u057f\u0565\u056c \u0587 \u056b\u0576\u056e\u056b \u0561\u0576\u0570\u0561\u0576\u0563\u056b\u057d\u057f \u0579\u0568\u0576\u0565\u0580\u0589.
+Albanian: Unë mund të ha qelq dhe nuk më gjen gjë.
+Turkish: Cam yiyebilirim, bana zarar\u0131 dokunmaz.
+Turkish (Ottoman): \u062c\u0627\u0645 \u064a\u064a\u0647 \u0628\u0644\u0648\u0631\u0645 \u0628\u06ad\u0627 \u0636\u0631\u0631\u0649 \u0637\u0648\u0642\u0648\u0646\u0645\u0632
+Bangla / Bengali: \u0986\u09ae\u09bf \u0995\u09be\u0981\u099a \u0996\u09c7\u09a4\u09c7 \u09aa\u09be\u09b0\u09bf, \u09a4\u09be\u09a4\u09c7 \u0986\u09ae\u09be\u09b0 \u0995\u09cb\u09a8\u09cb \u0995\u09cd\u09b7\u09a4\u09bf \u09b9\u09df \u09a8\u09be\u0964
+Marathi: \u092e\u0940 \u0915\u093e\u091a \u0916\u093e\u090a \u0936\u0915\u0924\u094b, \u092e\u0932\u093e \u0924\u0947 \u0926\u0941\u0916\u0924 \u0928\u093e\u0939\u0940.
+Kannada: \u0ca8\u0ca8\u0c97\u0cc6 \u0cb9\u0cbe\u0ca8\u0cbf \u0c86\u0c97\u0ca6\u0cc6, \u0ca8\u0cbe\u0ca8\u0cc1 \u0c97\u0c9c\u0ca8\u0ccd\u0ca8\u0cc1 \u0ca4\u0cbf\u0ca8\u0cac\u0cb9\u0cc1\u0ca6\u0cc1.
+Hindi: \u092e\u0948\u0902 \u0915\u093e\u0901\u091a \u0916\u093e \u0938\u0915\u0924\u093e \u0939\u0942\u0901 \u0914\u0930 \u092e\u0941\u091d\u0947 \u0909\u0938\u0938\u0947 \u0915\u094b\u0908 \u091a\u094b\u091f \u0928\u0939\u0940\u0902 \u092a\u0939\u0941\u0902\u091a\u0924\u0940.
+Tamil: \u0ba8\u0bbe\u0ba9\u0bcd \u0b95\u0ba3\u0bcd\u0ba3\u0bbe\u0b9f\u0bbf \u0b9a\u0bbe\u0baa\u0bcd\u0baa\u0bbf\u0b9f\u0bc1\u0bb5\u0bc7\u0ba9\u0bcd, \u0b85\u0ba4\u0ba9\u0bbe\u0bb2\u0bcd \u0b8e\u0ba9\u0b95\u0bcd\u0b95\u0bc1 \u0b92\u0bb0\u0bc1 \u0b95\u0bc7\u0b9f\u0bc1\u0bae\u0bcd \u0bb5\u0bb0\u0bbe\u0ba4\u0bc1.
+Telugu: \u0c28\u0c47\u0c28\u0c41 \u0c17\u0c3e\u0c1c\u0c41 \u0c24\u0c3f\u0c28\u0c17\u0c32\u0c28\u0c41 \u0c2e\u0c30\u0c3f\u0c2f\u0c41 \u0c05\u0c32\u0c3e \u0c1a\u0c47\u0c38\u0c3f\u0c28\u0c3e \u0c28\u0c3e\u0c15\u0c41 \u0c0f\u0c2e\u0c3f \u0c07\u0c2c\u0c4d\u0c2c\u0c02\u0c26\u0c3f \u0c32\u0c47\u0c26\u0c41.
+Sinhalese: \u0db8\u0da7 \u0dc0\u0dd3\u0daf\u0dd4\u0dbb\u0dd4 \u0d9a\u0dd1\u0db8\u0da7 \u0dc4\u0dd0\u0d9a\u0dd2\u0dba\u0dd2. \u0d91\u0dba\u0dd2\u0db1\u0dca \u0db8\u0da7 \u0d9a\u0dd2\u0dc3\u0dd2 \u0dc4\u0dcf\u0db1\u0dd2\u0dba\u0d9a\u0dca \u0dc3\u0dd2\u0daf\u0dd4 \u0db1\u0ddc\u0dc0\u0dda.
+Urdu: \u0645\u06cc\u06ba \u06a9\u0627\u0646\u0686 \u06a9\u06be\u0627 \u0633\u06a9\u062a\u0627 \u06c1\u0648\u06ba \u0627\u0648\u0631 \u0645\u062c\u06be\u06d2 \u062a\u06a9\u0644\u06cc\u0641 \u0646\u06c1\u06cc\u06ba \u06c1\u0648\u062a\u06cc \u06d4
+Pashto: \u0632\u0647 \u0634\u064a\u0634\u0647 \u062e\u0648\u0693\u0644\u06d0 \u0634\u0645\u060c \u0647\u063a\u0647 \u0645\u0627 \u0646\u0647 \u062e\u0648\u0696\u0648\u064a
+Farsi: .\u0645\u0646 \u0645\u06cc \u062a\u0648\u0627\u0646\u0645 \u0628\u062f\u0648\u0646\u0650 \u0627\u062d\u0633\u0627\u0633 \u062f\u0631\u062f \u0634\u064a\u0634\u0647 \u0628\u062e\u0648\u0631\u0645
+Arabic: \u0623\u0646\u0627 \u0642\u0627\u062f\u0631 \u0639\u0644\u0649 \u0623\u0643\u0644 \u0627\u0644\u0632\u062c\u0627\u062c \u0648 \u0647\u0630\u0627 \u0644\u0627 \u064a\u0624\u0644\u0645\u0646\u064a.
+Maltese: Nista' niekol il-\u0127\u0121ie\u0121 u ma jag\u0127milli xejn.
+Hebrew: \u05d0\u05e0\u05d9 \u05d9\u05db\u05d5\u05dc \u05dc\u05d0\u05db\u05d5\u05dc \u05d6\u05db\u05d5\u05db\u05d9\u05ea \u05d5\u05d6\u05d4 \u05dc\u05d0 \u05de\u05d6\u05d9\u05e7 \u05dc\u05d9.
+Yiddish: \u05d0\u05d9\u05da \u05e7\u05e2\u05df \u05e2\u05e1\u05df \u05d2\u05dc\u05d0\u05b8\u05d6 \u05d0\u05d5\u05df \u05e2\u05e1 \u05d8\u05d5\u05d8 \u05de\u05d9\u05e8 \u05e0\u05d9\u05e9\u05d8 \u05f0\u05f2.
+Twi: Metumi awe tumpan, \u025cny\u025c me hwee.
+Hausa (Latin): Ina\u0304 iya taunar gila\u0304shi kuma in gama\u0304 la\u0304fiya\u0304.
+Hausa (Ajami): \u0625\u0650\u0646\u0627 \u0625\u0650\u0649\u064e \u062a\u064e\u0648\u0646\u064e\u0631 \u063a\u0650\u0644\u064e\u0627\u0634\u0650 \u0643\u064f\u0645\u064e \u0625\u0650\u0646 \u063a\u064e\u0645\u064e\u0627 \u0644\u064e\u0627\u0641\u0650\u0649\u064e\u0627
+Yoruba: Mo l� je\u0329 d�g�, k� n� pa m� l�ra.
+Lingala: Nakoki\u0301 koli\u0301ya bite\u0301ni bya milungi, ekosa\u0301la nga\u0301i\u0301 mabe\u0301 t\u025b\u0301.
+(Ki)Swahili: Naweza kula bilauri na sikunyui.
+Malay: Saya boleh makan kaca dan ia tidak mencederakan saya.
+Tagalog: Kaya kong kumain nang bubog at hindi ako masaktan.
+Chamorro: Siña yo' chumocho krestat, ti ha na'lalamen yo'.
+Fijian: Au rawa ni kana iloilo, ia au sega ni vakacacani kina.
+Javanese: Aku isa mangan beling tanpa lara.
+Burmese: \u1000\u1039\u101a\u1039\u101d\u1014\u1039\u200c\u1010\u1031\u102c\u1039\u200c\u104a\u1000\u1039\u101a\u1039\u101d\u1014\u1039\u200c\u1019 \u1019\u1039\u101a\u1000\u1039\u200c\u1005\u102c\u1038\u1014\u102f\u102d\u1004\u1039\u200c\u101e\u100a\u1039\u200c\u104b \u104e\u1000\u1039\u101b\u1031\u102c\u1004\u1039\u200c\u1037 \u1011\u102d\u1001\u102f\u102d\u1000\u1039\u200c\u1019\u1039\u101f\u102f \u1019\u101b\u1039\u101f\u102d\u1015\u102c\u104b.
+Vietnamese (qu\u1ed1c ng\u1eef): T�i c� th\u1ec3 \u0103n th\u1ee7y tinh m� kh�ng h\u1ea1i g�.
+Vietnamese (n�m): \u4e9b \u08ce \u4e16 \u54b9 \u6c34 \u6676 \u0993 \u7a7a \u08ce \u5bb3 \u54a6.
+Khmer: \u1781\u17d2\u1789\u17bb\u17c6\u17a2\u17b6\u1785\u1789\u17bb\u17c6\u1780\u1789\u17d2\u1785\u1780\u17cb\u1794\u17b6\u1793 \u178a\u17c4\u1799\u1782\u17d2\u1798\u17b6\u1793\u1794\u1789\u17d2\u17a0\u17b6\u179a.
+Lao: \u0e82\u0ead\u0ec9\u0e8d\u0e81\u0eb4\u0e99\u0ec1\u0e81\u0ec9\u0ea7\u0ec4\u0e94\u0ec9\u0ec2\u0e94\u0e8d\u0e97\u0eb5\u0ec8\u0ea1\u0eb1\u0e99\u0e9a\u0ecd\u0ec8\u0ec4\u0e94\u0ec9\u0ec0\u0eae\u0eb1\u0e94\u0ec3\u0eab\u0ec9\u0e82\u0ead\u0ec9\u0e8d\u0ec0\u0e88\u0eb1\u0e9a.
+Thai: \u0e09\u0e31\u0e19\u0e01\u0e34\u0e19\u0e01\u0e23\u0e30\u0e08\u0e01\u0e44\u0e14\u0e49 \u0e41\u0e15\u0e48\u0e21\u0e31\u0e19\u0e44\u0e21\u0e48\u0e17\u0e33\u0e43\u0e2b\u0e49\u0e09\u0e31\u0e19\u0e40\u0e08\u0e47\u0e1a.
+Mongolian (Cyrillic): \u0411\u0438 \u0448\u0438\u043b \u0438\u0434\u044d\u0439 \u0447\u0430\u0434\u043d\u0430, \u043d\u0430\u0434\u0430\u0434 \u0445\u043e\u0440\u0442\u043e\u0439 \u0431\u0438\u0448.
+Mongolian (Classic): \u182a\u1822 \u1830\u1822\u182f\u1822 \u1822\u1833\u1821\u1836\u1826 \u1834\u1822\u1833\u1820\u1828\u1820 \u1802 \u1828\u1820\u1833\u1824\u1837 \u182c\u1823\u1824\u1837\u1820\u1833\u1820\u1822 \u182a\u1822\u1830\u1822.
+Nepali: \ufeff\u092e \u0915\u093e\u0901\u091a \u0916\u093e\u0928 \u0938\u0915\u094d\u091b\u0942 \u0930 \u092e\u0932\u093e\u0908 \u0915\u0947\u0939\u093f \u0928\u0940 \u0939\u0941\u0928\u094d\u200d\u0928\u094d \u0964.
+Tibetan: \u0f64\u0f7a\u0f63\u0f0b\u0f66\u0f92\u0f7c\u0f0b\u0f5f\u0f0b\u0f53\u0f66\u0f0b\u0f44\u0f0b\u0f53\u0f0b\u0f42\u0f72\u0f0b\u0f58\u0f0b\u0f62\u0f7a\u0f51\u0f0d.
+Chinese: \u6211\u80fd\u541e\u4e0b\u73bb\u7483\u800c\u4e0d\u4f24\u8eab\u4f53\u3002.
+Taiwanese: G�a \u0113-t�ng chia\u030dh po-l�, m\u0101 b\u0113 tio\u030dh-siong.
+Japanese: \u79c1\u306f\u30ac\u30e9\u30b9\u3092\u98df\u3079\u3089\u308c\u307e\u3059\u3002\u305d\u308c\u306f\u79c1\u3092\u50b7\u3064\u3051\u307e\u305b\u3093\u3002.
+Korean: \ub098\ub294 \uc720\ub9ac\ub97c \uba39\uc744 \uc218 \uc788\uc5b4\uc694. \uadf8\ub798\ub3c4 \uc544\ud504\uc9c0 \uc54a\uc544\uc694.
+Bislama: Mi save kakae glas, hemi no save katem mi.
+Hawaiian: Hiki ia\u02bbu ke \u02bbai i ke aniani; \u02bba\u02bbole n\u014d l\u0101 au e \u02bbeha.
+Marquesan: E ko\u02bbana e kai i te karahi, mea \u02bb\u0101, \u02bba\u02bbe hauhau.
+Inuktitut: \u140a\u14d5\u148d\u1585 \u14c2\u1546\u152d\u154c\u1593\u1483\u146f \u14f1\u154b\u1671\u1466\u1450\u14d0\u14c7\u1585\u1450\u1593.
+Chinook Jargon: Naika m\u0259km\u0259k kaksh\u0259t labutay, pi weyk ukuk munk-sik nay.
+Navajo: Ts�s\u01eb\u02bc yish\u0105\u0301\u0105go b��n�shghah d�� doo shi\u0142 neezgai da.
+Lojban: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi.
+N�rdicg: Lj\u0153r ye caudran cr�ne� � jor c\u1e83ran.
\ No newline at end of file
diff --git a/asterixdb/asterix-external-data/src/test/resources/record.json b/asterixdb/asterix-external-data/src/test/resources/record.json
new file mode 100644
index 0000000..9b32a5d
--- /dev/null
+++ b/asterixdb/asterix-external-data/src/test/resources/record.json
@@ -0,0 +1,375 @@
+{
+  "quoted_status": {
+    "in_reply_to_status_id_str": null,
+    "in_reply_to_status_id": null,
+    "possibly_sensitive": false,
+    "coordinates": null,
+    "created_at": "Wed Sep 02 07:24:48 +0000 2015",
+    "truncated": false,
+    "in_reply_to_user_id_str": null,
+    "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
+    "retweet_count": 0,
+    "retweeted": false,
+    "geo": null,
+    "filter_level": "low",
+    "in_reply_to_screen_name": null,
+    "entities": {
+      "urls": [
+        {
+          "expanded_url": "http://www.bigdata-insider.de/infrastruktur/articles/498946/?cmp=sm-tw-swyn&utm_source=twitter&utm_medium=sm&utm_campaign=twitter-swyn",
+          "display_url": "bigdata-insider.de/infrastruktur/\u2026",
+          "indices": [
+            54,
+            76
+          ],
+          "url": "http://t.co/8inseWDWIE"
+        }
+      ],
+      "hashtags": [
+        {
+          "indices": [
+            16,
+            22
+          ],
+          "text": "NoSQL"
+        },
+        {
+          "indices": [
+            24,
+            36
+          ],
+          "text": "Datenbanken"
+        }
+      ],
+      "user_mentions": [
+        {
+          "name": "EnterpriseDB_DE",
+          "indices": [
+            77,
+            93
+          ],
+          "id": 1219531897,
+          "screen_name": "EnterpriseDB_DE",
+          "id_str": "1219531897"
+        }
+      ],
+      "trends": [],
+      "symbols": []
+    },
+    "id_str": "638975848138285056",
+    "in_reply_to_user_id": null,
+    "favorite_count": 0,
+    "id": 638975848138285000,
+    "text": "Relationale und #NoSQL- #Datenbanken wachsen zusammen http://t.co/8inseWDWIE @EnterpriseDB_DE",
+    "place": null,
+    "contributors": null,
+    "lang": "de",
+    "user": {
+      "utc_offset": null,
+      "friends_count": 1440,
+      "profile_image_url_https": "https://pbs.twimg.com/profile_images/494807363572875265/EUm9CELG_normal.jpeg",
+      "listed_count": 54,
+      "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
+      "default_profile_image": false,
+      "favourites_count": 11,
+      "description": "BigData-Insider.de \u2013 Entscheiderwissen für Big Data Professionals",
+      "created_at": "Mon Jun 30 10:40:17 +0000 2014",
+      "is_translator": false,
+      "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
+      "protected": false,
+      "screen_name": "bigdata_insider",
+      "id_str": "2596163432",
+      "profile_link_color": "177535",
+      "id": 2596163432,
+      "geo_enabled": false,
+      "profile_background_color": "55965F",
+      "lang": "de",
+      "profile_sidebar_border_color": "FFFFFF",
+      "profile_text_color": "333333",
+      "verified": false,
+      "profile_image_url": "http://pbs.twimg.com/profile_images/494807363572875265/EUm9CELG_normal.jpeg",
+      "time_zone": null,
+      "url": "http://www.bigdata-insider.de",
+      "contributors_enabled": false,
+      "profile_background_tile": false,
+      "profile_banner_url": "https://pbs.twimg.com/profile_banners/2596163432/1405605723",
+      "statuses_count": 325,
+      "follow_request_sent": null,
+      "followers_count": 817,
+      "profile_use_background_image": false,
+      "default_profile": false,
+      "following": null,
+      "name": "BigData-Insider",
+      "location": "Augsburg, Germany",
+      "profile_sidebar_fill_color": "DDEEF6",
+      "notifications": null
+    },
+    "favorited": false
+  },
+  "in_reply_to_status_id_str": null,
+  "in_reply_to_status_id": null,
+  "created_at": "Wed Sep 02 08:17:29 +0000 2015",
+  "in_reply_to_user_id_str": null,
+  "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
+  "quoted_status_id": 638975848138285000,
+  "retweet_count": 0,
+  "retweeted": false,
+  "geo": null,
+  "filter_level": "low",
+  "in_reply_to_screen_name": null,
+  "id_str": "638989106882736128",
+  "in_reply_to_user_id": null,
+  "favorite_count": 0,
+  "id": 638989106882736100,
+  "text": "RT: Datenbanken im IoT-Zeitalter - mehr lesen auf @bigdata_insider  https://t.co/Yt0Pzij3tK",
+  "place": null,
+  "lang": "de",
+  "favorited": false,
+  "possibly_sensitive": false,
+  "coordinates": null,
+  "truncated": false,
+  "timestamp_ms": "1441181849581",
+  "entities": {
+    "urls": [
+      {
+        "expanded_url": "https://twitter.com/bigdata_insider/status/638975848138285056",
+        "display_url": "twitter.com/bigdata_inside\u2026",
+        "indices": [
+          68,
+          91
+        ],
+        "url": "https://t.co/Yt0Pzij3tK"
+      }
+    ],
+    "hashtags": [],
+    "user_mentions": [
+      {
+        "name": "BigData-Insider",
+        "indices": [
+          50,
+          66
+        ],
+        "id": 2596163432,
+        "screen_name": "bigdata_insider",
+        "id_str": "2596163432"
+      }
+    ],
+    "trends": [],
+    "symbols": []
+  },
+  "quoted_status_id_str": "638975848138285056",
+  "contributors": null,
+  "user": {
+    "utc_offset": 7200,
+    "friends_count": 382,
+    "profile_image_url_https": "https://pbs.twimg.com/profile_images/600331462982946816/IzBC43SR_normal.png",
+    "listed_count": 22,
+    "profile_background_image_url": "http://abs.twimg.com/images/themes/theme14/bg.gif",
+    "default_profile_image": false,
+    "favourites_count": 56,
+    "description": "EnterpriseDB ist weltgrößter und führender Anbieter von Enterprise Lösungen und Services basierend auf PostgreSQL, die fortschrittlichste Open Source Datenbank.",
+    "created_at": "Mon Feb 25 18:37:11 +0000 2013",
+    "is_translator": false,
+    "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme14/bg.gif",
+    "protected": false,
+    "screen_name": "EnterpriseDB_DE",
+    "id_str": "1219531897",
+    "profile_link_color": "EC7224",
+    "id": 1219531897,
+    "geo_enabled": false,
+    "profile_background_color": "EC7224",
+    "lang": "de",
+    "profile_sidebar_border_color": "FFFFFF",
+    "profile_text_color": "333333",
+    "verified": false,
+    "profile_image_url": "http://pbs.twimg.com/profile_images/600331462982946816/IzBC43SR_normal.png",
+    "time_zone": "Berlin",
+    "url": "http://www.enterprisedb.com",
+    "contributors_enabled": false,
+    "profile_background_tile": false,
+    "statuses_count": 941,
+    "follow_request_sent": null,
+    "followers_count": 336,
+    "profile_use_background_image": true,
+    "default_profile": false,
+    "following": null,
+    "name": "EnterpriseDB_DE",
+    "location": "Berlin, Germany",
+    "profile_sidebar_fill_color": "DDEEF6",
+    "notifications": null
+  }
+}
+{
+  "in_reply_to_status_id_str": null,
+  "in_reply_to_status_id": null,
+  "created_at": "Fri May 06 12:36:44 +0000 2016",
+  "in_reply_to_user_id_str": null,
+  "source": "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
+  "retweeted_status": {
+    "in_reply_to_status_id_str": null,
+    "in_reply_to_status_id": null,
+    "created_at": "Fri May 06 11:09:20 +0000 2016",
+    "in_reply_to_user_id_str": null,
+    "source": "<a href=\"http://jp.techcrunch.com/\" rel=\"nofollow\">TC Japan RTbot</a>",
+    "retweet_count": 4,
+    "retweeted": false,
+    "geo": null,
+    "filter_level": "low",
+    "in_reply_to_screen_name": null,
+    "is_quote_status": false,
+    "id_str": "728542158676852736",
+    "in_reply_to_user_id": null,
+    "favorite_count": 3,
+    "id": 728542158676852700,
+    "text": "16shares: Basho\u304c\u6642\u7cfb\u5217\u30c7\u30fc\u30bf\u5c02\u7528NoSQL\u30c7\u30fc\u30bf\u30d9\u30fc\u30b9Riak TS\u3092\u30aa\u30fc\u30d7\u30f3\u30bd\u30fc\u30b9\u5316\u3057\u3066IoT\u3078\u306e\u6d78\u900f\u3092\u306d\u3089\u3046 https://t.co/vYi3iI3XkZ",
+    "place": null,
+    "lang": "ja",
+    "favorited": false,
+    "possibly_sensitive": false,
+    "coordinates": null,
+    "truncated": false,
+    "entities": {
+      "urls": [
+        {
+          "expanded_url": "http://jp.techcrunch.com/2016/05/06/20160505basho-open-sources-its-riak-ts-database-for-the-internet-of-things/",
+          "display_url": "jp.techcrunch.com/2016/05/06/201\u2026",
+          "indices": [
+            65,
+            88
+          ],
+          "url": "https://t.co/vYi3iI3XkZ"
+        }
+      ],
+      "hashtags": [],
+      "user_mentions": [],
+      "symbols": []
+    },
+    "contributors": null,
+    "user": {
+      "utc_offset": 32400,
+      "friends_count": 456,
+      "profile_image_url_https": "https://pbs.twimg.com/profile_images/542903207098212352/S02CeC4c_normal.png",
+      "listed_count": 4277,
+      "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
+      "default_profile_image": false,
+      "favourites_count": 420,
+      "description": "TechCrunch Japan\u306e\u516c\u5f0f\u30a2\u30ab\u30a6\u30f3\u30c8\u3067\u3059",
+      "created_at": "Fri Apr 22 10:46:18 +0000 2011",
+      "is_translator": false,
+      "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
+      "protected": false,
+      "screen_name": "jptechcrunch",
+      "id_str": "286106104",
+      "profile_link_color": "0A9E01",
+      "id": 286106104,
+      "geo_enabled": false,
+      "profile_background_color": "FFFFFF",
+      "lang": "ja",
+      "profile_sidebar_border_color": "C0DEED",
+      "profile_text_color": "333333",
+      "verified": false,
+      "profile_image_url": "http://pbs.twimg.com/profile_images/542903207098212352/S02CeC4c_normal.png",
+      "time_zone": "Tokyo",
+      "url": "http://jp.techcrunch.com",
+      "contributors_enabled": false,
+      "profile_background_tile": false,
+      "profile_banner_url": "https://pbs.twimg.com/profile_banners/286106104/1427898894",
+      "statuses_count": 24997,
+      "follow_request_sent": null,
+      "followers_count": 58290,
+      "profile_use_background_image": true,
+      "default_profile": false,
+      "following": null,
+      "name": "TechCrunch Japan",
+      "location": "Tokyo",
+      "profile_sidebar_fill_color": "DDEEF6",
+      "notifications": null
+    }
+  },
+  "retweet_count": 0,
+  "retweeted": false,
+  "geo": null,
+  "filter_level": "low",
+  "in_reply_to_screen_name": null,
+  "is_quote_status": false,
+  "id_str": "728564152130658304",
+  "in_reply_to_user_id": null,
+  "favorite_count": 0,
+  "id": 728564152130658300,
+  "text": "RT @jptechcrunch: 16shares: Basho\u304c\u6642\u7cfb\u5217\u30c7\u30fc\u30bf\u5c02\u7528NoSQL\u30c7\u30fc\u30bf\u30d9\u30fc\u30b9Riak TS\u3092\u30aa\u30fc\u30d7\u30f3\u30bd\u30fc\u30b9\u5316\u3057\u3066IoT\u3078\u306e\u6d78\u900f\u3092\u306d\u3089\u3046 https://t.co/vYi3iI3XkZ",
+  "place": null,
+  "lang": "ja",
+  "favorited": false,
+  "possibly_sensitive": false,
+  "coordinates": null,
+  "truncated": false,
+  "timestamp_ms": "1462538204592",
+  "entities": {
+    "urls": [
+      {
+        "expanded_url": "http://jp.techcrunch.com/2016/05/06/20160505basho-open-sources-its-riak-ts-database-for-the-internet-of-things/",
+        "display_url": "jp.techcrunch.com/2016/05/06/201\u2026",
+        "indices": [
+          83,
+          106
+        ],
+        "url": "https://t.co/vYi3iI3XkZ"
+      }
+    ],
+    "hashtags": [],
+    "user_mentions": [
+      {
+        "name": "TechCrunch Japan",
+        "indices": [
+          3,
+          16
+        ],
+        "id": 286106104,
+        "screen_name": "jptechcrunch",
+        "id_str": "286106104"
+      }
+    ],
+    "symbols": []
+  },
+  "contributors": null,
+  "user": {
+    "utc_offset": -25200,
+    "friends_count": 184,
+    "profile_image_url_https": "https://pbs.twimg.com/profile_images/615865274592432128/fYOAh2iR_normal.jpg",
+    "listed_count": 10,
+    "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
+    "default_profile_image": false,
+    "favourites_count": 1523,
+    "description": "\u81ea\u4f5c\u7cfb\u3001\u30e9\u30f3\u30cb\u30f3\u30b0\u3001\u7b4b\u30c8\u30ec\u3092\u4e3b\u3068\u3059\u308b\u751f\u614b\u7cfb \u3002\u5b97\u6559\u4e0a\u306e\u7406\u7531\u3067ASRock\u3001nVIDIA\u3092\u5d07\u62dd\u3002\u6c34\u51b7\u5316\u306b\u5411\u3051\u3066\u5039\u7d04\u4e2d\u306e\u8eab\u3002\u70ad\u9178\u98f2\u6599\u306f\u8840\u6db2\u3002\u4eca\u5f8c\u3068\u3082\u3088\u308d\u3057\u304f\u2026\u2026",
+    "created_at": "Fri Jun 26 19:56:49 +0000 2015",
+    "is_translator": false,
+    "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
+    "protected": false,
+    "screen_name": "774Inside_X79",
+    "id_str": "3257040751",
+    "profile_link_color": "0084B4",
+    "id": 3257040751,
+    "geo_enabled": true,
+    "profile_background_color": "C0DEED",
+    "lang": "ja",
+    "profile_sidebar_border_color": "C0DEED",
+    "profile_text_color": "333333",
+    "verified": false,
+    "profile_image_url": "http://pbs.twimg.com/profile_images/615865274592432128/fYOAh2iR_normal.jpg",
+    "time_zone": "Pacific Time (US & Canada)",
+    "url": "http://twpf.jp/774Inside_X79",
+    "contributors_enabled": false,
+    "profile_background_tile": false,
+    "profile_banner_url": "https://pbs.twimg.com/profile_banners/3257040751/1458988346",
+    "statuses_count": 3694,
+    "follow_request_sent": null,
+    "followers_count": 144,
+    "profile_use_background_image": true,
+    "default_profile": true,
+    "following": null,
+    "name": "\uff97\uff6b",
+    "location": "\u80cc\u5f8c",
+    "profile_sidebar_fill_color": "DDEEF6",
+    "notifications": null
+  }
+}

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 8
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 7:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/1724/

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/1687/

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Yingyi Bu, Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/951

to look at the new patch set (#7).

Change subject: Fix Decoding of byte[] Records
......................................................................

Fix Decoding of byte[] Records

Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/CharArrayRecord.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
A asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/parser/test/ByteBufUTF8DecodeTest.java
A asterixdb/asterix-external-data/src/test/resources/ICanEatGlass.txt
A asterixdb/asterix-external-data/src/test/resources/record.json
5 files changed, 637 insertions(+), 10 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/51/951/7

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/951

to look at the new patch set (#4).

Change subject: Fix Decoding of byte[] Records
......................................................................

Fix Decoding of byte[] Records

Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
1 file changed, 11 insertions(+), 7 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/51/951/4

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/951

to look at the new patch set (#5).

Change subject: Fix Decoding of byte[] Records
......................................................................

Fix Decoding of byte[] Records

Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/CharArrayRecord.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
A asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/parser/test/ByteBufUTF8DecodeTest.java
A asterixdb/asterix-external-data/src/test/resources/ICanEatGlass.txt
A asterixdb/asterix-external-data/src/test/resources/record.json
5 files changed, 637 insertions(+), 10 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/51/951/5

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
abdullah alamoudi has uploaded a new patch set (#2).

Change subject: Fix Decoding of byte[] Records
......................................................................

Fix Decoding of byte[] Records

Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
1 file changed, 8 insertions(+), 8 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/51/951/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Yingyi Bu (Code Review)" <do...@asterixdb.incubator.apache.org>.
Yingyi Bu has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 7: Code-Review+2

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 3:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/1705/

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 3
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Yingyi Bu (Code Review)" <do...@asterixdb.incubator.apache.org>.
Yingyi Bu has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 2:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/951/2/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java:

Line 116:             bytes.compact();
> Once per record. However, this is mostly a no op. it is only an op if the d
That's going to be a lot of copies, if I understand correctly. You end up copying half of a frame k times, where k is the number of records within that frame; "half of a frame" is the expected amount per copy.

For example, with 100 records in a 32KB frame, you'll copy about 1.6MB of data per frame: 100 * (32KB/2). Am I correct?

How many frames will flow through this set call? Is it proportional to the data volume?
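
The estimate above can be checked with a few lines of Java. This is only a sketch of the arithmetic, under the comment's assumption that every record triggers a copy of half a frame on average; the class and method names are hypothetical and not part of the change under review.

```java
class CopyCostEstimate {

    // Expected bytes copied per frame, assuming every record in the frame
    // triggers a copy that moves, on average, half of the frame.
    static long expectedBytesCopied(int recordsPerFrame, int frameSizeBytes) {
        return (long) recordsPerFrame * (frameSizeBytes / 2);
    }

    public static void main(String[] args) {
        // 100 records in a 32KB frame, as in the example above.
        System.out.println(expectedBytesCopied(100, 32 * 1024));
    }
}
```

With the numbers from the example, this gives 1,638,400 bytes, which is roughly the 1.6MB quoted.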


Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: Yes

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Yingyi Bu (Code Review)" <do...@asterixdb.incubator.apache.org>.
Yingyi Bu has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 2:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/951/2/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java:

Line 116:             bytes.compact();
> That's going to be a lot of copies if I understand correctly...  You end up
Oh, I see. There is an if-block before that. So compact() is not called that frequently, only for boundary records?


Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: Yes

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 2:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/1688/

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/951

to look at the new patch set (#6).

Change subject: Fix Decoding of byte[] Records
......................................................................

Fix Decoding of byte[] Records

Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/CharArrayRecord.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
A asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/parser/test/ByteBufUTF8DecodeTest.java
A asterixdb/asterix-external-data/src/test/resources/ICanEatGlass.txt
A asterixdb/asterix-external-data/src/test/resources/record.json
5 files changed, 637 insertions(+), 10 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/51/951/6

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
abdullah alamoudi has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 2:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/951/2/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java:

Line 116:             bytes.compact();
> Oh, I see.  There is a if-block before that.  Therefore, the compact() is c
Yingyi,
I just looked at the implementation details of compact(), and I think I had better do the check here myself.

I will push a new patch. compact() should only be called for records that contain non-English (multi-byte) Unicode characters and that the decoder failed to decode completely.
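
A guard of the kind described here might look like the following. This is a minimal sketch, not the code from the new patch set: the class `CompactGuard` and the method `prepareForNextRecord` are hypothetical names, and only `ByteBuffer.hasRemaining()`, `compact()`, and `clear()` are real java.nio calls.

```java
import java.nio.ByteBuffer;

class CompactGuard {

    // Prepares a buffer (currently in read mode) for the next write.
    // Returns true only when undecoded bytes were left and had to be
    // moved to the front of the buffer.
    static boolean prepareForNextRecord(ByteBuffer bytes) {
        if (bytes.hasRemaining()) {
            // The decoder stopped before the end, e.g. in the middle of a
            // multi-byte character: keep the tail for the next round.
            bytes.compact();
            return true;
        }
        // Everything was decoded: reset position and limit without copying.
        bytes.clear();
        return false;
    }

    public static void main(String[] args) {
        ByteBuffer bytes = ByteBuffer.allocate(8);
        bytes.put((byte) 0x61).put((byte) 0xE2);
        bytes.flip();
        bytes.get(); // consume 'a'; 0xE2 starts an incomplete sequence
        System.out.println(prepareForNextRecord(bytes));
    }
}
```

Either branch leaves the buffer ready for the next put(); the check only makes the common fully decoded case explicit.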


Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: Yes

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 5:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/1713/

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
abdullah alamoudi has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 2:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/951/2/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java:

Line 116:             bytes.compact();
> How many copies are done for each buffer, roughly?
Once per record. However, this is mostly a no-op. It only does work if the decoder couldn't decode all the bytes, in which case we need to move the leftover bytes to the beginning of the buffer and correct the position, which is what compact() does for you.
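
The leftover handling described above can be sketched with plain java.nio. This is an illustration under stated assumptions, not the converter's actual code: the class `ChunkedUtf8Decode`, the method `decodeChunks`, and the buffer size are hypothetical, and the input is simulated with byte arrays rather than a Netty buffer. When a chunk ends in the middle of a multi-byte character, the decoder leaves the incomplete bytes unread and compact() moves them to the front so the next chunk can complete them.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ChunkedUtf8Decode {

    // Hypothetical helper: decodes UTF-8 bytes that arrive in chunks.
    // Assumes each chunk plus at most 3 leftover bytes fits in bufferSize.
    static String decodeChunks(byte[][] chunks, int bufferSize) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer bytes = ByteBuffer.allocate(bufferSize);
        CharBuffer chars = CharBuffer.allocate(bufferSize);
        StringBuilder out = new StringBuilder();

        for (byte[] chunk : chunks) {
            bytes.put(chunk); // append after any leftover bytes
            bytes.flip();     // switch the buffer to read mode
            CoderResult result = decoder.decode(bytes, chars, false);
            if (result.isError()) {
                throw new IllegalArgumentException("malformed UTF-8 input");
            }
            drain(chars, out);
            // Move an incomplete trailing sequence (if any) to the front and
            // leave the position right after it, ready for the next put().
            bytes.compact();
        }

        // Signal end of input; a dangling partial sequence is an error here.
        bytes.flip();
        CoderResult end = decoder.decode(bytes, chars, true);
        if (end.isError()) {
            throw new IllegalArgumentException("truncated or malformed UTF-8 input");
        }
        decoder.flush(chars);
        drain(chars, out);
        return out.toString();
    }

    private static void drain(CharBuffer chars, StringBuilder out) {
        chars.flip();
        out.append(chars);
        chars.clear();
    }

    public static void main(String[] args) {
        String text = "a\u20acb"; // U+20AC takes three bytes in UTF-8
        byte[] all = text.getBytes(StandardCharsets.UTF_8);
        // Split inside the three-byte sequence to force a leftover byte.
        byte[][] chunks = {
            Arrays.copyOfRange(all, 0, 2),
            Arrays.copyOfRange(all, 2, all.length)
        };
        System.out.println(decodeChunks(chunks, 16).equals(text));
    }
}
```

For ASCII-only input the decoder consumes every byte, so compact() has nothing to move, which matches the "mostly a no-op" description.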


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: Yes

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
abdullah alamoudi has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 2:

(2 comments)

https://asterix-gerrit.ics.uci.edu/#/c/951/2/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java:

Line 42: public class DCPMessageToRecordConverter implements IRecordToRecordWithMetadataAndPKConverter<DCPRequest, char[]> {
> Is the code style correct?
If by style you mean the formatting, then I think so.

If by style you mean interfaces and class design, Till and I have been discussing going over the interfaces at some point.
For now, this is correct and it means that you give this converter a DCPRequest (a DCPRequest is a DCP message. I don't know why they named the class DCPRequest and I brought it up with them). The char[] means that the returned record object has a char[] which can be parsed.


Line 116:             bytes.compact();
> A lot of memory copies here?
The SDK uses the netty byte buffer (ByteBuf), which is different from the Java ByteBuffer. The decoder only deals with ByteBuffer and CharBuffer, and the parser can only deal with char[].
Hence all of this data movement.
For now, I think this is okay. At least I am not creating any objects, and all the copying is done in bulk operations, which are relatively cheap.
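
The data movement described above can be sketched roughly as follows. This is an illustrative sketch only, not the converter from this change: the class ChunkedDecodeSketch, its methods, and the buffer sizes are hypothetical, and a plain byte[] stands in for the netty ByteBuf so that the example needs nothing beyond the JDK. It reuses one ByteBuffer, one CharBuffer, and one char[] across records, and every copy is a bulk put/get.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ChunkedDecodeSketch {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    private final ByteBuffer bytes = ByteBuffer.allocate(8); // small on purpose
    private final CharBuffer chars = CharBuffer.allocate(8);
    private char[] record = new char[16];                    // grows only when needed
    private int recordLength;

    /** Decodes a UTF-8 payload into the reusable char[]; returns the char count. */
    int decode(byte[] payload) throws CharacterCodingException {
        decoder.reset();
        bytes.clear();
        chars.clear();
        recordLength = 0;
        int offset = 0;
        do {
            int n = Math.min(bytes.remaining(), payload.length - offset);
            bytes.put(payload, offset, n);      // bulk copy: source -> ByteBuffer
            offset += n;
            bytes.flip();
            boolean last = offset == payload.length;
            CoderResult result;
            do {
                result = decoder.decode(bytes, chars, last);
                if (result.isError()) {
                    result.throwException();    // malformed or unmappable input
                }
                drain();
            } while (result.isOverflow());
            bytes.compact();                    // keep any incomplete sequence
        } while (offset < payload.length);
        decoder.flush(chars);
        drain();
        return recordLength;
    }

    /** Bulk copy: CharBuffer -> char[], then make the CharBuffer writable again. */
    private void drain() {
        chars.flip();
        int n = chars.remaining();
        if (recordLength + n > record.length) {
            record = Arrays.copyOf(record, Math.max(record.length * 2, recordLength + n));
        }
        chars.get(record, recordLength, n);
        recordLength += n;
        chars.clear();
    }

    String decodeToString(byte[] payload) throws CharacterCodingException {
        int length = decode(payload);
        return new String(record, 0, length);
    }

    public static void main(String[] args) throws CharacterCodingException {
        ChunkedDecodeSketch sketch = new ChunkedDecodeSketch();
        String text = "id-42: caf\u00e9 \u20ac5";
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(sketch.decodeToString(payload).equals(text));
    }
}
```

The only allocation on the steady-state path is the occasional growth of the char[]; the String in decodeToString is created purely for the demonstration.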


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: Yes

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "abdullah alamoudi (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/951

to look at the new patch set (#3).

Change subject: Fix Decoding of byte[] Records
......................................................................

Fix Decoding of byte[] Records

Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
1 file changed, 12 insertions(+), 8 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/51/951/3
-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 3
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 6:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/1717/

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 4:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/1711/

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Yingyi Bu (Code Review)" <do...@asterixdb.incubator.apache.org>.
Yingyi Bu has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 2:

(2 comments)

https://asterix-gerrit.ics.uci.edu/#/c/951/2/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java:

Line 42: public class DCPMessageToRecordConverter implements IRecordToRecordWithMetadataAndPKConverter<DCPRequest, char[]> {
> If by Style, you mean the formatting? then I think so.
I mean formatting. Right, I double-checked. It's correct.


Line 116:             bytes.compact();
> the SDK use netty byte buffer which id different from java ByteBuffer. the 
How many copies are done for each buffer, roughly?
I.e., how many times is compact() called for each frame?


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: Yes

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Yingyi Bu (Code Review)" <do...@asterixdb.incubator.apache.org>.
Yingyi Bu has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 2:

(2 comments)

https://asterix-gerrit.ics.uci.edu/#/c/951/2/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/converter/DCPMessageToRecordConverter.java:

Line 42: public class DCPMessageToRecordConverter implements IRecordToRecordWithMetadataAndPKConverter<DCPRequest, char[]> {
Is the code style correct?


Line 116:             bytes.compact();
A lot of memory copies here?


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-HasComments: Yes

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 6:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/1714/

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix Decoding of byte[] Records

Posted by "Yingyi Bu (Code Review)" <do...@asterixdb.incubator.apache.org>.
Yingyi Bu has posted comments on this change.

Change subject: Fix Decoding of byte[] Records
......................................................................


Patch Set 6: Code-Review+2

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/951
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71c3d8b8dfa5a98123725f139247d2b5ce10012e
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <ba...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No