Posted to notifications@asterixdb.apache.org by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org> on 2016/11/03 23:18:12 UTC

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Xikui Wang has uploaded a new change for review.

  https://asterix-gerrit.ics.uci.edu/1339

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................

Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

1. For ASTERIXDB-1609, add UNION type check in writeField, and add one
more case for orderedList.
2. For OrderedList bug, change UnorderedListBuilder to
OrderedListBuilder.
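
The two points above can be illustrated with a minimal, self-contained
sketch. It is not the TweetParser code: Tag, Type, dispatchTag and toInt64
are hypothetical stand-ins for the AsterixDB classes, used only to show the
idea of unwrapping an optional (UNION) type before switching on its tag, and
of widening an Integer to a long for an INT64 field.

```java
import java.util.Objects;

/** Hypothetical stand-ins for the AsterixDB type classes; not the real API. */
class UnionUnwrapSketch {
    enum Tag { STRING, INT64, ORDEREDLIST, UNION }

    /** A declared type: a plain tag, or a UNION wrapping the actual (optional) type. */
    static final class Type {
        final Tag tag;
        final Type actual; // non-null only when tag == UNION

        private Type(Tag tag, Type actual) {
            this.tag = tag;
            this.actual = actual;
        }

        static Type of(Tag tag) {
            return new Type(tag, null);
        }

        static Type optional(Type actual) {
            return new Type(Tag.UNION, Objects.requireNonNull(actual));
        }
    }

    /** Returns the tag to switch on: the wrapped type's tag for a UNION, else the type's own tag. */
    static Tag dispatchTag(Type declared) {
        return declared.tag == Tag.UNION ? declared.actual.tag : declared.tag;
    }

    /** Widens an integral value to long; small JSON numbers may arrive as Integer. */
    static long toInt64(Object value) {
        if (value instanceof Integer) {
            return ((Integer) value).longValue();
        }
        if (value instanceof Long) {
            return (Long) value;
        }
        throw new IllegalArgumentException("not an integral value: " + value);
    }

    public static void main(String[] args) {
        Type optionalInt64 = Type.optional(Type.of(Tag.INT64));
        System.out.println(dispatchTag(optionalInt64));
        System.out.println(toInt64(Integer.valueOf(42)));
    }
}
```

Without the unwrapping step, a switch on the declared tag of an optional
field would see UNION and fall through to the default case.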

Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
1 file changed, 43 insertions(+), 23 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/39/1339/1

diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
index 8d483dc..fc69d27 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
@@ -22,14 +22,16 @@
 import org.apache.asterix.builders.IARecordBuilder;
 import org.apache.asterix.builders.IAsterixListBuilder;
 import org.apache.asterix.builders.ListBuilderFactory;
+import org.apache.asterix.builders.OrderedListBuilder;
 import org.apache.asterix.builders.RecordBuilderFactory;
-import org.apache.asterix.builders.UnorderedListBuilder;
 import org.apache.asterix.external.api.IRawRecord;
 import org.apache.asterix.external.api.IRecordDataParser;
 import org.apache.asterix.om.base.AMutablePoint;
 import org.apache.asterix.om.base.ANull;
+import org.apache.asterix.om.types.AOrderedListType;
 import org.apache.asterix.om.types.ARecordType;
 import org.apache.asterix.om.types.ATypeTag;
+import org.apache.asterix.om.types.AUnionType;
 import org.apache.asterix.om.types.BuiltinType;
 import org.apache.asterix.om.types.IAType;
 import org.apache.asterix.om.util.container.IObjectPool;
@@ -60,46 +62,69 @@
         aPoint = new AMutablePoint(0, 0);
     }
 
-    private void parseUnorderedList(JSONArray jArray, DataOutput output) throws IOException, JSONException {
+    private void parseJSONArray(JSONArray jArray, DataOutput output, AOrderedListType orderedListType)
+            throws IOException, JSONException {
         ArrayBackedValueStorage itemBuffer = getTempBuffer();
-        UnorderedListBuilder unorderedListBuilder = (UnorderedListBuilder) getUnorderedListBuilder();
+        OrderedListBuilder orderedList = (OrderedListBuilder) getOrderedListBuilder();
 
-        unorderedListBuilder.reset(null);
+        orderedList.reset(orderedListType);
         for (int iter1 = 0; iter1 < jArray.length(); iter1++) {
             itemBuffer.reset();
-            if (writeField(jArray.get(iter1), null, itemBuffer.getDataOutput())) {
-                unorderedListBuilder.addItem(itemBuffer);
+            if (writeField(jArray.get(iter1), orderedListType.getItemType(), itemBuffer.getDataOutput())) {
+                orderedList.addItem(itemBuffer);
             }
         }
-        unorderedListBuilder.write(output, true);
+        orderedList.write(output, true);
     }
 
     private boolean writeField(Object fieldObj, IAType fieldType, DataOutput out) throws IOException, JSONException {
         boolean writeResult = true;
-        if (fieldType != null) {
-            switch (fieldType.getTypeTag()) {
+        IAType chkFieldType;
+        if (fieldType instanceof AUnionType) {
+            chkFieldType = ((AUnionType) fieldType).getActualType();
+        } else {
+            chkFieldType = fieldType;
+        }
+        if (chkFieldType != null) {
+            switch (chkFieldType.getTypeTag()) {
                 case STRING:
-                    out.write(BuiltinType.ASTRING.getTypeTag().serialize());
+                    out.write(fieldType.getTypeTag().serialize());
                     utf8Writer.writeUTF8(fieldObj.toString(), out);
                     break;
                 case INT64:
-                    aInt64.setValue((long) fieldObj);
+                    out.write(fieldType.getTypeTag().serialize());
+                    if (fieldObj instanceof Integer) {
+                        out.writeLong(((Integer) fieldObj).longValue());
+                    } else {
+                        out.writeLong((Long) fieldObj);
+                    }
                     int64Serde.serialize(aInt64, out);
                     break;
                 case INT32:
-                    out.write(BuiltinType.AINT32.getTypeTag().serialize());
+                    out.write(fieldType.getTypeTag().serialize());
                     out.writeInt((Integer) fieldObj);
                     break;
                 case DOUBLE:
-                    out.write(BuiltinType.ADOUBLE.getTypeTag().serialize());
+                    out.write(fieldType.getTypeTag().serialize());
                     out.writeDouble((Double) fieldObj);
                     break;
                 case BOOLEAN:
-                    out.write(BuiltinType.ABOOLEAN.getTypeTag().serialize());
+                    out.write(fieldType.getTypeTag().serialize());
                     out.writeBoolean((Boolean) fieldObj);
                     break;
                 case RECORD:
-                    writeRecord((JSONObject) fieldObj, out, (ARecordType) fieldType);
+                    if (((JSONObject) fieldObj).length() != 0) {
+                        writeRecord((JSONObject) fieldObj, out, (ARecordType) chkFieldType);
+                    } else {
+                        writeResult = false;
+                    }
+                    break;
+                case ORDEREDLIST:
+                    if (((JSONArray) fieldObj).length() != 0) {
+                        parseJSONArray((JSONArray) fieldObj, out, ((AOrderedListType) chkFieldType));
+                    } else {
+                        writeResult = false;
+                    }
                     break;
                 default:
                     writeResult = false;
@@ -124,7 +149,7 @@
                 utf8Writer.writeUTF8((String) fieldObj, out);
             } else if (fieldObj instanceof JSONArray) {
                 if (((JSONArray) fieldObj).length() != 0) {
-                    parseUnorderedList((JSONArray) fieldObj, out);
+                    parseJSONArray((JSONArray) fieldObj, out, null);
                 } else {
                     writeResult = false;
                 }
@@ -190,7 +215,6 @@
             }
         } else {
             //open record type
-            int closedFieldCount = 0;
             IAType curFieldType = null;
             for (String attrName : JSONObject.getNames(obj)) {
                 if (obj.isNull(attrName) || obj.length() == 0) {
@@ -210,12 +234,8 @@
                         recBuilder.addField(fieldNameBuffer, fieldValueBuffer);
                     } else {
                         recBuilder.addField(attrIdx, fieldValueBuffer);
-                        closedFieldCount++;
                     }
                 }
-            }
-            if (curRecType != null && closedFieldCount < curFNames.length) {
-                throw new HyracksDataException("Non-null field is null");
             }
         }
         recBuilder.write(out, true);
@@ -225,8 +245,8 @@
         return recordBuilderPool.allocate(ATypeTag.RECORD);
     }
 
-    private IAsterixListBuilder getUnorderedListBuilder() {
-        return listBuilderPool.allocate(ATypeTag.UNORDEREDLIST);
+    private IAsterixListBuilder getOrderedListBuilder() {
+        return listBuilderPool.allocate(ATypeTag.ORDEREDLIST);
     }
 
     private ArrayBackedValueStorage getTempBuffer() {

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1339
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/1339

to look at the new patch set (#3).

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................

Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

1. For ASTERIXDB-1609, add UNION type check in writeField, and add one
more case for orderedList.
2. For OrderedList bug, change UnorderedListBuilder to
OrderedListBuilder.

Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
1 file changed, 67 insertions(+), 28 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/39/1339/3

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 3
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 6:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3295/

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 4:

Integration Tests Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1099/

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 2:

Integration Tests Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1089/

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Xikui Wang has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 6:

@Wail,

Yes. You are right. I misunderstood it. :) 

Fixed it in the updated patch. Sorry for the delay.

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 2: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1089/ : SUCCESS

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Xikui Wang has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 4:

Hi Wail,

Thanks for your comments! My replies are as follows:

Line 220: The incoming data is checked first. If the incoming data does not have this attribute and the attribute is not optional, an exception is thrown. For an optional attribute, if the incoming value is null, this check is false because the field is not a closed field.

Line 213 - 230: Actually, it is necessary. I found that assigning an extra attribute to a closed datatype causes a problem on insert. So for a closed datatype I use a cherry-pick-like method, and for an open datatype I just pull whatever is in that record.

Line 236: Yes, there is. The weird part of incoming tweets is that certain attributes have only a name but no content. The length == 0 case is when obj is a JSONArray.

Line 258: Fixed.
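
To make the closed/open distinction in the replies above concrete, here is a
simplified sketch. It is not the TweetParser code: it assumes plain Java maps
in place of JSONObject and ARecordType, and pickClosed/pickOpen are
hypothetical names. The declared map records, per field name, whether the
field is optional.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified illustration with plain maps; not the AsterixDB record builder. */
class ClosedOpenSketch {
    /**
     * Closed type: walk only the declared fields ("cherry-pick"), so undeclared
     * attributes on the incoming object are ignored. A missing or null value is
     * an error unless the field is declared optional.
     */
    static Map<String, Object> pickClosed(Map<String, Object> incoming, Map<String, Boolean> declared) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> field : declared.entrySet()) {
            Object value = incoming.get(field.getKey());
            if (value == null) {
                if (!field.getValue()) {
                    throw new IllegalStateException("Closed field " + field.getKey() + " has null value.");
                }
                continue; // optional field that is absent or null: skip it
            }
            out.put(field.getKey(), value);
        }
        return out;
    }

    /** Open type: take every non-null attribute present on the incoming object. */
    static Map<String, Object> pickOpen(Map<String, Object> incoming) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> attr : incoming.entrySet()) {
            if (attr.getValue() != null) {
                out.put(attr.getKey(), attr.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> tweet = new LinkedHashMap<>();
        tweet.put("id", 1L);
        tweet.put("extra", "not declared");
        Map<String, Boolean> declared = new LinkedHashMap<>();
        declared.put("id", false);  // required
        declared.put("lang", true); // optional
        System.out.println(pickClosed(tweet, declared));
        System.out.println(pickOpen(tweet));
    }
}
```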

Thanks for your help!

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Xikui Wang has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 3:

Hi Wail,

Some tweets come with 'entities' like this:

{"urls":[],"hashtags":[],"user_mentions":[],"symbols":[]}

It's fixed now. Please let me know if you run into further issues. Thanks for your help!
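
As a rough illustration of how such an 'entities' value can be handled, the
sketch below drops empty lists and empty objects recursively, so a record
whose fields are all empty is itself skipped. It is an assumption-laden
sketch, not the TweetParser code: plain Java List and Map stand in for
JSONArray and JSONObject, and prune is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration with plain collections; not the AsterixDB parser. */
class EmptyPruneSketch {
    /** Returns a copy of value with empty lists and maps removed recursively, or null if nothing is left. */
    static Object prune(Object value) {
        if (value instanceof List) {
            List<Object> kept = new ArrayList<>();
            for (Object item : (List<?>) value) {
                Object pruned = prune(item);
                if (pruned != null) {
                    kept.add(pruned);
                }
            }
            return kept.isEmpty() ? null : kept;
        }
        if (value instanceof Map) {
            Map<String, Object> kept = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                Object pruned = prune(entry.getValue());
                if (pruned != null) {
                    kept.put(String.valueOf(entry.getKey()), pruned);
                }
            }
            return kept.isEmpty() ? null : kept;
        }
        return value; // scalars (and null) pass through unchanged
    }

    public static void main(String[] args) {
        Map<String, Object> entities = new LinkedHashMap<>();
        entities.put("urls", new ArrayList<Object>());
        entities.put("hashtags", new ArrayList<Object>());
        Map<String, Object> tweet = new LinkedHashMap<>();
        tweet.put("text", "hello");
        tweet.put("entities", entities);
        System.out.println(prune(tweet));
    }
}
```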

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 3
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Steven Jacobs (Code Review)" <do...@asterixdb.incubator.apache.org>.
Steven Jacobs has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 7: Code-Review+2

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Steven Jacobs <sj...@ucr.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 5:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3268/

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 7:

Integration Tests Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1129/

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 3:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3258/

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 3
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 5: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1106/ : SUCCESS

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 1: Integration-Tests-1

Integration Tests Failed

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1088/ : UNSTABLE

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Steven Jacobs (Code Review)" <do...@asterixdb.incubator.apache.org>.
Steven Jacobs has submitted this change and it was merged.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

1. For ASTERIXDB-1609, add UNION type check in writeField, and add one
more case for orderedList.
2. For OrderedList bug, change UnorderedListBuilder to
OrderedListBuilder.

Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Reviewed-on: https://asterix-gerrit.ics.uci.edu/1339
Sonar-Qube: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Wail Alkowaileet <wa...@gmail.com>
Reviewed-by: Steven Jacobs <sj...@ucr.edu>
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
1 file changed, 70 insertions(+), 29 deletions(-)

Approvals:
  Steven Jacobs: Looks good to me, approved
  Wail Alkowaileet: Looks good to me, but someone else must approve
  Jenkins: Verified; No violations found; Verified



diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
index 8d483dc..d23d490 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
@@ -22,14 +22,16 @@
 import org.apache.asterix.builders.IARecordBuilder;
 import org.apache.asterix.builders.IAsterixListBuilder;
 import org.apache.asterix.builders.ListBuilderFactory;
+import org.apache.asterix.builders.OrderedListBuilder;
 import org.apache.asterix.builders.RecordBuilderFactory;
-import org.apache.asterix.builders.UnorderedListBuilder;
 import org.apache.asterix.external.api.IRawRecord;
 import org.apache.asterix.external.api.IRecordDataParser;
 import org.apache.asterix.om.base.AMutablePoint;
 import org.apache.asterix.om.base.ANull;
+import org.apache.asterix.om.types.AOrderedListType;
 import org.apache.asterix.om.types.ARecordType;
 import org.apache.asterix.om.types.ATypeTag;
+import org.apache.asterix.om.types.AUnionType;
 import org.apache.asterix.om.types.BuiltinType;
 import org.apache.asterix.om.types.IAType;
 import org.apache.asterix.om.util.container.IObjectPool;
@@ -60,51 +62,80 @@
         aPoint = new AMutablePoint(0, 0);
     }
 
-    private void parseUnorderedList(JSONArray jArray, DataOutput output) throws IOException, JSONException {
+    private void parseJSONArray(JSONArray jArray, DataOutput output, AOrderedListType orderedListType)
+            throws IOException, JSONException {
         ArrayBackedValueStorage itemBuffer = getTempBuffer();
-        UnorderedListBuilder unorderedListBuilder = (UnorderedListBuilder) getUnorderedListBuilder();
+        OrderedListBuilder orderedList = (OrderedListBuilder) getOrderedListBuilder();
 
-        unorderedListBuilder.reset(null);
+        orderedList.reset(orderedListType);
         for (int iter1 = 0; iter1 < jArray.length(); iter1++) {
             itemBuffer.reset();
-            if (writeField(jArray.get(iter1), null, itemBuffer.getDataOutput())) {
-                unorderedListBuilder.addItem(itemBuffer);
+            if (writeField(jArray.get(iter1), orderedListType == null ? null : orderedListType.getItemType(),
+                    itemBuffer.getDataOutput())) {
+                orderedList.addItem(itemBuffer);
             }
         }
-        unorderedListBuilder.write(output, true);
+        orderedList.write(output, true);
     }
 
-    private boolean writeField(Object fieldObj, IAType fieldType, DataOutput out) throws IOException, JSONException {
+    private boolean writeFieldWithFieldType(Object fieldObj, IAType fieldType, DataOutput out)
+            throws HyracksDataException {
         boolean writeResult = true;
-        if (fieldType != null) {
-            switch (fieldType.getTypeTag()) {
+        IAType chkFieldType;
+        chkFieldType = fieldType instanceof AUnionType ? ((AUnionType) fieldType).getActualType() : fieldType;
+        try {
+            switch (chkFieldType.getTypeTag()) {
                 case STRING:
-                    out.write(BuiltinType.ASTRING.getTypeTag().serialize());
+                    out.write(fieldType.getTypeTag().serialize());
                     utf8Writer.writeUTF8(fieldObj.toString(), out);
                     break;
                 case INT64:
-                    aInt64.setValue((long) fieldObj);
+                    out.write(fieldType.getTypeTag().serialize());
+                    if (fieldObj instanceof Integer) {
+                        out.writeLong(((Integer) fieldObj).longValue());
+                    } else {
+                        out.writeLong((Long) fieldObj);
+                    }
                     int64Serde.serialize(aInt64, out);
                     break;
                 case INT32:
-                    out.write(BuiltinType.AINT32.getTypeTag().serialize());
+                    out.write(fieldType.getTypeTag().serialize());
                     out.writeInt((Integer) fieldObj);
                     break;
                 case DOUBLE:
-                    out.write(BuiltinType.ADOUBLE.getTypeTag().serialize());
+                    out.write(fieldType.getTypeTag().serialize());
                     out.writeDouble((Double) fieldObj);
                     break;
                 case BOOLEAN:
-                    out.write(BuiltinType.ABOOLEAN.getTypeTag().serialize());
+                    out.write(fieldType.getTypeTag().serialize());
                     out.writeBoolean((Boolean) fieldObj);
                     break;
                 case RECORD:
-                    writeRecord((JSONObject) fieldObj, out, (ARecordType) fieldType);
+                    if (((JSONObject) fieldObj).length() != 0) {
+                        writeResult = writeRecord((JSONObject) fieldObj, out, (ARecordType) chkFieldType);
+                    } else {
+                        writeResult = false;
+                    }
+                    break;
+                case ORDEREDLIST:
+                    if (((JSONArray) fieldObj).length() != 0) {
+                        parseJSONArray((JSONArray) fieldObj, out, (AOrderedListType) chkFieldType);
+                    } else {
+                        writeResult = false;
+                    }
                     break;
                 default:
                     writeResult = false;
             }
-        } else {
+        } catch (IOException | JSONException e) {
+            throw new HyracksDataException(e);
+        }
+        return writeResult;
+    }
+
+    private boolean writeFieldWithoutFieldType(Object fieldObj, DataOutput out) throws HyracksDataException {
+        boolean writeResult = true;
+        try {
             if (fieldObj == JSONObject.NULL) {
                 nullSerde.serialize(ANull.NULL, out);
             } else if (fieldObj instanceof Integer) {
@@ -124,19 +155,26 @@
                 utf8Writer.writeUTF8((String) fieldObj, out);
             } else if (fieldObj instanceof JSONArray) {
                 if (((JSONArray) fieldObj).length() != 0) {
-                    parseUnorderedList((JSONArray) fieldObj, out);
+                    parseJSONArray((JSONArray) fieldObj, out, null);
                 } else {
                     writeResult = false;
                 }
             } else if (fieldObj instanceof JSONObject) {
                 if (((JSONObject) fieldObj).length() != 0) {
-                    writeRecord((JSONObject) fieldObj, out, null);
+                    writeResult = writeRecord((JSONObject) fieldObj, out, null);
                 } else {
                     writeResult = false;
                 }
             }
+        } catch (IOException | JSONException e) {
+            throw new HyracksDataException(e);
         }
         return writeResult;
+    }
+
+    private boolean writeField(Object fieldObj, IAType fieldType, DataOutput out) throws HyracksDataException {
+        return fieldType == null ? writeFieldWithoutFieldType(fieldObj, out)
+                : writeFieldWithFieldType(fieldObj, fieldType, out);
     }
 
     private int checkAttrNameIdx(String[] nameList, String name) {
@@ -152,11 +190,13 @@
         return -1;
     }
 
-    public void writeRecord(JSONObject obj, DataOutput out, ARecordType curRecType) throws IOException, JSONException {
+    public boolean writeRecord(JSONObject obj, DataOutput out, ARecordType curRecType)
+            throws IOException, JSONException {
         IAType[] curTypes = null;
         String[] curFNames = null;
         int fieldN;
         int attrIdx;
+        boolean writeRecord = false;
 
         ArrayBackedValueStorage fieldValueBuffer = getTempBuffer();
         ArrayBackedValueStorage fieldNameBuffer = getTempBuffer();
@@ -177,7 +217,8 @@
                 fieldValueBuffer.reset();
                 DataOutput fieldOutput = fieldValueBuffer.getDataOutput();
                 if (obj.isNull(curFNames[iter1])) {
-                    if (curRecType.isClosedField(curFNames[iter1])) {
+                    if (curRecType.getFieldType(curFNames[iter1]) != null
+                            && !(curRecType.getFieldType(curFNames[iter1]) instanceof AUnionType)) {
                         throw new HyracksDataException("Closed field " + curFNames[iter1] + " has null value.");
                     } else {
                         continue;
@@ -185,12 +226,12 @@
                 } else {
                     if (writeField(obj.get(curFNames[iter1]), curTypes[iter1], fieldOutput)) {
                         recBuilder.addField(iter1, fieldValueBuffer);
+                        writeRecord = true;
                     }
                 }
             }
         } else {
             //open record type
-            int closedFieldCount = 0;
             IAType curFieldType = null;
             for (String attrName : JSONObject.getNames(obj)) {
                 if (obj.isNull(attrName) || obj.length() == 0) {
@@ -204,29 +245,29 @@
                 fieldNameBuffer.reset();
                 DataOutput fieldOutput = fieldValueBuffer.getDataOutput();
                 if (writeField(obj.get(attrName), curFieldType, fieldOutput)) {
+                    writeRecord = true;
                     if (attrIdx == -1) {
                         aString.setValue(attrName);
                         stringSerde.serialize(aString, fieldNameBuffer.getDataOutput());
                         recBuilder.addField(fieldNameBuffer, fieldValueBuffer);
                     } else {
                         recBuilder.addField(attrIdx, fieldValueBuffer);
-                        closedFieldCount++;
                     }
                 }
             }
-            if (curRecType != null && closedFieldCount < curFNames.length) {
-                throw new HyracksDataException("Non-null field is null");
-            }
         }
-        recBuilder.write(out, true);
+        if (writeRecord) {
+            recBuilder.write(out, true);
+        }
+        return writeRecord;
     }
 
     private IARecordBuilder getRecordBuilder() {
         return recordBuilderPool.allocate(ATypeTag.RECORD);
     }
 
-    private IAsterixListBuilder getUnorderedListBuilder() {
-        return listBuilderPool.allocate(ATypeTag.UNORDEREDLIST);
+    private IAsterixListBuilder getOrderedListBuilder() {
+        return listBuilderPool.allocate(ATypeTag.ORDEREDLIST);
     }
 
     private ArrayBackedValueStorage getTempBuffer() {

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1339
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 8
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Steven Jacobs <sj...@ucr.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/1339

to look at the new patch set (#5).

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................

Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

1. For ASTERIXDB-1609, add UNION type check in writeField, and add one
more case for orderedList.
2. For OrderedList bug, change UnorderedListBuilder to
OrderedListBuilder.

Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
1 file changed, 68 insertions(+), 28 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/39/1339/5
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 4: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1099/ : SUCCESS

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 7: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1129/ : SUCCESS

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 7:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3296/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/1339

to look at the new patch set (#6).

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................

Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

1. For ASTERIXDB-1609, add UNION type check in writeField, and add one
more case for orderedList.
2. For OrderedList bug, change UnorderedListBuilder to
OrderedListBuilder.

Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
1 file changed, 70 insertions(+), 29 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/39/1339/6
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/1339

to look at the new patch set (#2).

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................

Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

1. For ASTERIXDB-1609, add UNION type check in writeField, and add one
more case for orderedList.
2. For OrderedList bug, change UnorderedListBuilder to
OrderedListBuilder.

Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
1 file changed, 57 insertions(+), 25 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/39/1339/2
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 1:

Integration Tests Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1088/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/1339

to look at the new patch set (#7).

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................

Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

1. For ASTERIXDB-1609, add UNION type check in writeField, and add one
more case for orderedList.
2. For OrderedList bug, change UnorderedListBuilder to
OrderedListBuilder.

Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
1 file changed, 70 insertions(+), 29 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/39/1339/7
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Wail Alkowaileet (Code Review)" <do...@asterixdb.incubator.apache.org>.
Wail Alkowaileet has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 4:

For some reason, I cannot post inline comments, so I will use this format:
{line# : code
comment}

220 : if (curRecType.isClosedField(curFNames[iter1])) 
Should this check whether the field is optional rather than closed? A field can be optional, and an optional field allows null.

We can probably get rid of the first part (213 - 230) and instead use:
recType = curRecType == null ? ARecordType.FULLY_OPEN_RECORD_TYPE : curRecType;

236 : if (obj.isNull(attrName) || obj.length() == 0) 
Is obj.length() ever going to equal zero here? obj is the parent of attrName.

258 : if (writeRecord == true) 
No need for "== true".

Great work.. Thanks!
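
The suggestion above to treat a missing record type as a fully open one can be sketched roughly as follows. This is only an illustration under stated assumptions: RecType, FULLY_OPEN, indexOf and classify are hypothetical stand-ins, not the AsterixDB ARecordType API, and null handling for declared fields is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OpenTypeDefaultSketch {
    /** Hypothetical stand-in for a record type; not the AsterixDB ARecordType API. */
    static final class RecType {
        /** Models a fully open type: no declared fields at all. */
        static final RecType FULLY_OPEN = new RecType(new String[0]);
        final String[] fieldNames;

        RecType(String[] fieldNames) {
            this.fieldNames = fieldNames;
        }

        /** Returns the declared index of name, or -1 if it is not declared. */
        int indexOf(String name) {
            for (int i = 0; i < fieldNames.length; i++) {
                if (fieldNames[i].equals(name)) {
                    return i;
                }
            }
            return -1;
        }
    }

    /** Labels each attribute as declared (closed part) or undeclared (open part) on one code path. */
    static List<String> classify(RecType curRecType, List<String> attrNames) {
        // A null type is replaced once, so the loop below needs no separate branch for it.
        RecType recType = curRecType == null ? RecType.FULLY_OPEN : curRecType;
        List<String> result = new ArrayList<>();
        for (String name : attrNames) {
            int idx = recType.indexOf(name);
            result.add(idx == -1 ? name + ":open" : name + ":closed#" + idx);
        }
        return result;
    }

    public static void main(String[] args) {
        RecType tweet = new RecType(new String[] { "id", "text" });
        System.out.println(classify(null, Arrays.asList("id", "lang")));
        System.out.println(classify(tweet, Arrays.asList("id", "lang")));
    }
}
```

With a single path like this, the check that rejects null values for non-optional declared fields would still have to be done separately.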

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Wail Alkowaileet (Code Review)" <do...@asterixdb.incubator.apache.org>.
Wail Alkowaileet has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 7: Code-Review+1

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 4:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3259/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Xikui Wang (Code Review)" <do...@asterixdb.incubator.apache.org>.
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/1339

to look at the new patch set (#4).

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................

Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

1. For ASTERIXDB-1609, add UNION type check in writeField, and add one
more case for orderedList.
2. For OrderedList bug, change UnorderedListBuilder to
OrderedListBuilder.

Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/TweetParser.java
1 file changed, 68 insertions(+), 28 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/39/1339/4
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 3:

Integration Tests Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1098/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 3
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 2:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3243/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 5:

Integration Tests Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1106/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Wail Alkowaileet (Code Review)" <do...@asterixdb.incubator.apache.org>.
Wail Alkowaileet has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 2:

Hi Xikui,

I tried to see the output of the new parser.
I noticed something different:
{ "id": 795240956979126272, "user": { "screen_name": "77mono7", "lang": "ja", "friends_count": 65, "statuses_count": 712 }, "in_reply_to_status_id_str": "795234196201426946", "in_reply_to_status_id": 795234196201426946, "created_at": "Sun Nov 06 12:26:33 +0000 2016", "in_reply_to_user_id_str": "762203399202799616", "source": "Twitter for iPhone", "retweet_count": 0, "retweeted": false, "filter_level": "low", "in_reply_to_screen_name": "77mono7", "is_quote_status": false, "id_str": "795240956979126272", "in_reply_to_user_id": 762203399202799616, "favorite_count": 0, "text": "\u4ef2\u826f\u304f\u3057\u3066\u306d", "lang": "ja", "favorited": false, "truncated": false, "timestamp_ms": "1478435193663", "entities": {  } }

Note that 'entities' is written as an empty record. I'm not sure how that is mapped in AsterixDB; it is neither null nor missing.
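
For comparison, the merged diff earlier in this thread has writeRecord return false when no field was written, and the caller then skips that field. The sketch below models that skip-if-empty behavior with plain java.util maps and lists standing in for JSONObject and JSONArray. The class and method names are hypothetical, and skipping null values here is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EmptyRecordSketch {
    /**
     * Returns a copy of fields without null values, empty arrays, and nested
     * records in which nothing would be written. An empty result plays the
     * role of "writeRecord returned false".
     */
    static Map<String, Object> pruneRecord(Map<String, ?> fields) {
        Map<String, Object> written = new LinkedHashMap<>();
        for (Map.Entry<String, ?> field : fields.entrySet()) {
            Object value = field.getValue();
            if (value == null) {
                continue; // assumption: null values are skipped
            }
            if (value instanceof List && ((List<?>) value).isEmpty()) {
                continue; // empty array: nothing to write
            }
            if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = pruneRecord((Map<String, ?>) value);
                if (nested.isEmpty()) {
                    continue; // nested record wrote no field, so the field is dropped
                }
                value = nested;
            }
            written.put(field.getKey(), value);
        }
        return written;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("screen_name", "77mono7");

        Map<String, Object> tweet = new LinkedHashMap<>();
        tweet.put("id", 795240956979126272L);
        tweet.put("user", user);
        tweet.put("entities", new LinkedHashMap<String, Object>()); // empty record
        tweet.put("hashtags", new ArrayList<Object>());             // empty array

        // 'entities' and 'hashtags' do not appear in the printed result.
        System.out.println(pruneRecord(tweet));
    }
}
```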

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <ba...@gmail.com>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3242/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser

Posted by "Wail Alkowaileet (Code Review)" <do...@asterixdb.incubator.apache.org>.
Wail Alkowaileet has posted comments on this change.

Change subject: Fix ASTERIXDB-1609 and OrderedList bug in TweetParser
......................................................................


Patch Set 5:

isClosed says nothing about whether a field is optional. It only means that the field is declared in the type, regardless of what its type is. An optional field has to be checked as a union type of (null, missing, actual-type).

To reproduce the exception, use the following DDL (put a breakpoint on the exception line, 221):

drop dataverse feeds if exists;
create dataverse feeds;
use dataverse feeds;

create type Geo as closed {
    coordinates: [double],
    'type':string
}

create type Tweet as closed {
    id: int64,
    text:string,
    geo:Geo?  
}
create dataset Tweets (Tweet)
primary key id
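
The distinction can be sketched with a small model of the Tweet type above. This is a hedged illustration only: FieldType and rejectsNull are hypothetical names, and the boolean flag stands in for "the declared type is a union that admits null", which is what the instanceof AUnionType test in the merged diff checks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NullFieldCheckSketch {
    /** Hypothetical stand-in for a declared field type; not the AsterixDB IAType API. */
    static final class FieldType {
        final String name;
        final boolean optional; // true models "Geo?", a union that admits null or missing

        FieldType(String name, boolean optional) {
            this.name = name;
            this.optional = optional;
        }
    }

    /** Declared fields of the Tweet type from the DDL above. */
    static Map<String, FieldType> tweetType() {
        Map<String, FieldType> fields = new LinkedHashMap<>();
        fields.put("id", new FieldType("int64", false));
        fields.put("text", new FieldType("string", false));
        fields.put("geo", new FieldType("Geo", true));
        return fields;
    }

    /**
     * A null value is rejected only when the field is declared and its type is
     * not optional. Being declared ("closed") is not enough on its own.
     */
    static boolean rejectsNull(Map<String, FieldType> declared, String fieldName) {
        FieldType type = declared.get(fieldName);
        return type != null && !type.optional;
    }

    public static void main(String[] args) {
        Map<String, FieldType> tweet = tweetType();
        System.out.println("geo rejects null:  " + rejectsNull(tweet, "geo"));
        System.out.println("text rejects null: " + rejectsNull(tweet, "text"));
    }
}
```

Under this model a tweet with "geo": null is accepted even though geo is a declared field, while "text": null is still an error.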

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia27148cb10206b93dabf7655aed68f3004f96dfd
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xk...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Wail Alkowaileet <wa...@gmail.com>
Gerrit-Reviewer: Xikui Wang <xk...@gmail.com>
Gerrit-HasComments: No