You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hawq.apache.org by kavinderd <gi...@git.apache.org> on 2016/10/20 01:35:25 UTC

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

GitHub user kavinderd opened a pull request:

    https://github.com/apache/incubator-hawq/pull/969

    HAWQ-1100. Support for new filter string

    - constant data type is now of the form "c<DATATYPE>s<LENGTH>d<DATA>"

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kavinderd/incubator-hawq HAWQ-1100

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hawq/pull/969.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #969
    
----
commit ff189586841f27813f2b94f06d708dcbe12197b4
Author: Kavinder Dhaliwal <ka...@gmail.com>
Date:   2016-10-13T22:53:39Z

    HAWQ-1100. Support for new filter string
    
    - constant data type is now of the form "c<DATATYPE>s<LENGTH>d<DATA>"

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84745915
  
    --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveFilterBuilder.java ---
    @@ -61,8 +61,11 @@ public HiveFilterBuilder(InputData input) {
          *             filter or list of basic filters
          */
         public Object getFilterObject(String filterString) throws Exception {
    +        if (filterString == null)
    +            return null;
    +
             FilterParser parser = new FilterParser(this);
    -        Object result = parser.parse(filterString);
    +        Object result = parser.parse(filterString.getBytes());
    --- End diff --
    
    and here ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by sansanichfb <gi...@git.apache.org>.
Github user sansanichfb commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84352763
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +282,108 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != 's') {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, DataType dataType) throws Exception {
    +        String data = new String(byteData);
    +        try {
    +            switch (dataType) {
    --- End diff --
    
    Should we also include TIME and BYTEA types?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84548461
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -356,8 +442,8 @@ private String parseString() throws Exception {
             int i;
     
             // starting from index + 1 to skip leading "
    -        for (i = index + 1; i < filterString.length(); ++i) {
    -            char chr = filterString.charAt(i);
    +        for (i = index + 1; i < filterByteArr.length; ++i) {
    +            char chr = (char) filterByteArr[i];
    --- End diff --
    
    do we still have quotes around string values ? since we are passing length, it is no longer necessary


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84549783
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +282,108 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != 's') {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, DataType dataType) throws Exception {
    +        String data = new String(byteData);
    --- End diff --
    
    assuming bytes come from outside, the charset should be the same as was used during encoding. incidentally, we convert String to byte[] here ourselves for some unknown to me purpose, that's a different comment. If we really need to do that, it is better to be explicit about charset used for conversions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84549151
  
    --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveFilterBuilder.java ---
    @@ -61,8 +61,11 @@ public HiveFilterBuilder(InputData input) {
          *             filter or list of basic filters
          */
         public Object getFilterObject(String filterString) throws Exception {
    +        if (filterString == null)
    --- End diff --
    
    style : curly brackets missing ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84548811
  
    --- Diff: pxf/pxf-hbase/src/main/java/org/apache/hawq/pxf/plugins/hbase/HBaseFilterBuilder.java ---
    @@ -89,12 +90,15 @@ private boolean filterNotOpPresent(String filterString) {
          * @throws Exception if parsing failed
          */
         public Filter getFilterObject(String filterString) throws Exception {
    +        if (filterString == null)
    +            return null;
    +
             // First check for NOT, HBase does not support this
             if (filterNotOpPresent(filterString))
                 return null;
     
             FilterParser parser = new FilterParser(this);
    -        Object result = parser.parse(filterString);
    +        Object result = parser.parse(filterString.getBytes());
    --- End diff --
    
    this is confusing ... we already have a string here, then we convert it to byte[] to send to parser which converts each byte to char ... why ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by kavinderd <gi...@git.apache.org>.
Github user kavinderd commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84744935
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +296,110 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != CONST_LEN) {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, int start, int end, DataType dataType) throws Exception {
    +        String data = new String(byteData, start, end-start);
    --- End diff --
    
    Sure, I was under the assumption that the default charset is UTF-8 but it makes sense to be explicit


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84548319
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +282,108 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != 's') {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, DataType dataType) throws Exception {
    +        String data = new String(byteData);
    --- End diff --
    
    it would use the platform's default charset. @kavinderd is that the interned behavior ?
    If not, use "UTF-8"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84550548
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -165,17 +168,17 @@ public FilterParser(FilterBuilder eval) {
          * @return the parsed filter
          * @throws Exception if the filter string had wrong syntax
          */
    -    public Object parse(String filter) throws Exception {
    +    public Object parse(byte[] filter) throws Exception {
             index = 0;
    -        filterString = filter;
    +        filterByteArr = filter;
             int opNumber;
     
             if (filter == null) {
                 throw new FilterStringSyntaxException("filter parsing ended with no result");
             }
     
    -        while (index < filterString.length()) {
    -            char op = filterString.charAt(index);
    +        while (index < filterByteArr.length) {
    +            char op = (char) filterByteArr[index];
    --- End diff --
    
    Instead of two lines just use  char op = (char) filterByteArr[index++];


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84563396
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +282,108 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != 's') {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, DataType dataType) throws Exception {
    +        String data = new String(byteData);
    +        try {
    +            switch (dataType) {
    +                case BIGINT:
    +                    return Long.parseLong(data);
    +                case INTEGER:
    +                case SMALLINT:
    +                    return Integer.parseInt(data);
    +                case FLOAT8:
    +                case REAL:
    +                    return Float.parseFloat(data);
    +                case NUMERIC:
    +                    return Double.parseDouble(data);
    +                case TEXT:
    +                case VARCHAR:
    +                case BPCHAR:
    +                    return data;
    +                case BOOLEAN:
    +                    return Boolean.parseBoolean(data);
    +                case DATE:
    +                    return Date.valueOf(data);
    +                case TIMESTAMP:
    +                    return Timestamp.valueOf(data);
    +                default:
    +                    throw new FilterStringSyntaxException("DataType " + dataType.toString() + " unsupported");
    +            }
    +        } catch (NumberFormatException nfe) {
    +            throw new FilterStringSyntaxException("failed to parse number data type starting at " + index);
    +        }
    +    }
         /**
          * Parses either a number or a string.
          */
         private Object parseParameter() throws Exception {
    -        if (index == filterString.length()) {
    +        if (index == filterByteArr.length) {
                 throw new FilterStringSyntaxException("argument should follow at " + index);
             }
     
    -        return senseString()
    -                ? parseString()
    -                : parseNumber();
    -    }
    +        DataType dataType = DataType.get(parseConstDataType());
    +        if (dataType == DataType.UNSUPPORTED_TYPE) {
    +            throw new FilterStringSyntaxException("invalid DataType OID at " + (index - 1));
    +        }
    +
    +        int dataLength = parseDataLength();
    +        if (dataLength < 1) {
    --- End diff --
    
    For the case when TEXT field is checked with = against "", bridge uses s0do5 in the expression string.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84527217
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +282,108 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != 's') {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, DataType dataType) throws Exception {
    +        String data = new String(byteData);
    +        try {
    +            switch (dataType) {
    +                case BIGINT:
    +                    return Long.parseLong(data);
    +                case INTEGER:
    +                case SMALLINT:
    +                    return Integer.parseInt(data);
    +                case FLOAT8:
    +                case REAL:
    +                    return Float.parseFloat(data);
    +                case NUMERIC:
    +                    return Double.parseDouble(data);
    +                case TEXT:
    +                case VARCHAR:
    +                case BPCHAR:
    +                    return data;
    +                case BOOLEAN:
    +                    return Boolean.parseBoolean(data);
    +                case DATE:
    +                    return Date.valueOf(data);
    +                case TIMESTAMP:
    +                    return Timestamp.valueOf(data);
    +                default:
    +                    throw new FilterStringSyntaxException("DataType " + dataType.toString() + " unsupported");
    +            }
    +        } catch (NumberFormatException nfe) {
    +            throw new FilterStringSyntaxException("failed to parse number data type starting at " + index);
    +        }
    +    }
         /**
    --- End diff --
    
    Provide more explanation in the comment section on how we go about parsing (mention what you expect). Would be useful for future reference.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by sansanichfb <gi...@git.apache.org>.
Github user sansanichfb commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84376927
  
    --- Diff: pxf/pxf-hbase/src/main/java/org/apache/hawq/pxf/plugins/hbase/HBaseFilterBuilder.java ---
    @@ -229,8 +233,20 @@ private ByteArrayComparable getComparator(int type, Object data) throws Exceptio
                     break;
                 case SMALLINT:
                 case INTEGER:
    +                result = new HBaseIntegerComparator(((Integer) data).longValue());
    +                break;
                 case BIGINT:
    -                result = new HBaseIntegerComparator((Long) data);
    +                if (data instanceof Long) {
    +                    result = new HBaseIntegerComparator((Long) data);
    +                } else if (data instanceof Integer) {
    +                    result = new HBaseIntegerComparator(((Integer) data).longValue());
    +                } else {
    +                    result = null;
    +                }
    +                break;
    +            case FLOAT8:
    --- End diff --
    
    Float8 has 8 bytes but Java float has only 4 bytes. Should we cast Float8 to double instead?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84526911
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -165,17 +168,17 @@ public FilterParser(FilterBuilder eval) {
          * @return the parsed filter
          * @throws Exception if the filter string had wrong syntax
          */
    -    public Object parse(String filter) throws Exception {
    +    public Object parse(byte[] filter) throws Exception {
             index = 0;
    -        filterString = filter;
    +        filterByteArr = filter;
             int opNumber;
     
             if (filter == null) {
                 throw new FilterStringSyntaxException("filter parsing ended with no result");
             }
     
    -        while (index < filterString.length()) {
    -            char op = filterString.charAt(index);
    +        while (index < filterByteArr.length) {
    +            char op = (char) filterByteArr[index];
                 ++index; // skip op character
                 switch (op) {
                     case 'a':
    --- End diff --
    
    Define static final characters for each. Java Enums are not optimal for characters.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84548203
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -338,7 +424,7 @@ private String parseDigits() throws Exception {
                 throw new FilterStringSyntaxException("numeric argument expected at " + index);
             }
     
    -        result = filterString.substring(index, i);
    +        result = new String(Arrays.copyOfRange(filterByteArr, index, i));
    --- End diff --
    
    use constructor that doesn't require a copy: http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#String(byte[],%20int,%20int)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by sansanichfb <gi...@git.apache.org>.
Github user sansanichfb commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84354092
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -165,17 +168,17 @@ public FilterParser(FilterBuilder eval) {
          * @return the parsed filter
          * @throws Exception if the filter string had wrong syntax
          */
    -    public Object parse(String filter) throws Exception {
    +    public Object parse(byte[] filter) throws Exception {
             index = 0;
    -        filterString = filter;
    +        filterByteArr = filter;
             int opNumber;
     
             if (filter == null) {
                 throw new FilterStringSyntaxException("filter parsing ended with no result");
             }
     
    -        while (index < filterString.length()) {
    -            char op = filterString.charAt(index);
    +        while (index < filterByteArr.length) {
    +            char op = (char) filterByteArr[index];
                 ++index; // skip op character
                 switch (op) {
                     case 'a':
    --- End diff --
    
    Should we have all conventional characters used in protocol as constants or enum?
    'a','c', 's','d','o','l'


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84744638
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +296,110 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != CONST_LEN) {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, int start, int end, DataType dataType) throws Exception {
    +        String data = new String(byteData, start, end-start);
    --- End diff --
    
    didn't we want to explicitly specify UTF-8 here, since the data was re-encoded with UTF-8 after being parsed from the header ? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by kavinderd <gi...@git.apache.org>.
Github user kavinderd commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84760743
  
    --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveFilterBuilder.java ---
    @@ -61,8 +61,11 @@ public HiveFilterBuilder(InputData input) {
          *             filter or list of basic filters
          */
         public Object getFilterObject(String filterString) throws Exception {
    +        if (filterString == null)
    +            return null;
    +
             FilterParser parser = new FilterParser(this);
    -        Object result = parser.parse(filterString);
    +        Object result = parser.parse(filterString.getBytes());
    --- End diff --
    
    ok, makes sense


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84527705
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +282,108 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != 's') {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, DataType dataType) throws Exception {
    +        String data = new String(byteData);
    +        try {
    +            switch (dataType) {
    +                case BIGINT:
    +                    return Long.parseLong(data);
    +                case INTEGER:
    +                case SMALLINT:
    +                    return Integer.parseInt(data);
    +                case FLOAT8:
    +                case REAL:
    +                    return Float.parseFloat(data);
    +                case NUMERIC:
    +                    return Double.parseDouble(data);
    +                case TEXT:
    +                case VARCHAR:
    +                case BPCHAR:
    +                    return data;
    +                case BOOLEAN:
    +                    return Boolean.parseBoolean(data);
    +                case DATE:
    +                    return Date.valueOf(data);
    +                case TIMESTAMP:
    +                    return Timestamp.valueOf(data);
    +                default:
    +                    throw new FilterStringSyntaxException("DataType " + dataType.toString() + " unsupported");
    +            }
    +        } catch (NumberFormatException nfe) {
    +            throw new FilterStringSyntaxException("failed to parse number data type starting at " + index);
    +        }
    +    }
         /**
          * Parses either a number or a string.
          */
         private Object parseParameter() throws Exception {
    -        if (index == filterString.length()) {
    +        if (index == filterByteArr.length) {
                 throw new FilterStringSyntaxException("argument should follow at " + index);
             }
     
    -        return senseString()
    -                ? parseString()
    -                : parseNumber();
    -    }
    +        DataType dataType = DataType.get(parseConstDataType());
    +        if (dataType == DataType.UNSUPPORTED_TYPE) {
    +            throw new FilterStringSyntaxException("invalid DataType OID at " + (index - 1));
    +        }
    +
    +        int dataLength = parseDataLength();
    +        if (dataLength < 1) {
    --- End diff --
    
    How would an equality check for "" be handled ? What would the data length be ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by kavinderd <gi...@git.apache.org>.
Github user kavinderd commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84528951
  
    --- Diff: pxf/pxf-hbase/src/main/java/org/apache/hawq/pxf/plugins/hbase/HBaseFilterBuilder.java ---
    @@ -229,8 +233,20 @@ private ByteArrayComparable getComparator(int type, Object data) throws Exceptio
                     break;
                 case SMALLINT:
                 case INTEGER:
    +                result = new HBaseIntegerComparator(((Integer) data).longValue());
    +                break;
                 case BIGINT:
    -                result = new HBaseIntegerComparator((Long) data);
    +                if (data instanceof Long) {
    +                    result = new HBaseIntegerComparator((Long) data);
    +                } else if (data instanceof Integer) {
    +                    result = new HBaseIntegerComparator(((Integer) data).longValue());
    +                } else {
    +                    result = null;
    +                }
    +                break;
    +            case FLOAT8:
    --- End diff --
    
    Good point


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84548026
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -318,17 +404,17 @@ private Long parseNumber() throws Exception {
         private String parseDigits() throws Exception {
             String result;
             int i = index;
    -        int filterLength = filterString.length();
    +        int filterLength = filterByteArr.length;
     
             // allow sign
             if (filterLength > 0) {
    -            int chr = filterString.charAt(i);
    +            int chr = (char) filterByteArr[i];
    --- End diff --
    
    if we cast to (char), should the datatype be char as well, not int ? i know it works either way, but for consistency...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by kavinderd <gi...@git.apache.org>.
Github user kavinderd closed the pull request at:

    https://github.com/apache/incubator-hawq/pull/969


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84548997
  
    --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveORCAccessor.java ---
    @@ -162,6 +165,10 @@ private boolean buildArgument(SearchArgument.Builder builder, Object filterObj)
             ColumnDescriptor filterColumn = inputData.getColumn(filterColumnIndex);
             String filterColumnName = filterColumn.columnName();
     
    +        /* Need to convert java.sql.Date to Hive's DateWritable Format */
    +        if (filterValue instanceof Date)
    --- End diff --
    
    Why does the type conversion only need to happen for Date and not for other types ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84745476
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -30,26 +34,40 @@
      * interface with two pop-ed operands.
      * <br>
    --- End diff --
    
    Can you add this 
    The filter string is of the pattern `<attcode><attnum><constcode><constval><constsizecode><constsize><constdata><constvalue><opercode><opernum>`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #969: HAWQ-1100. Support for new filter string

Posted by denalex <gi...@git.apache.org>.
Github user denalex commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/969#discussion_r84547540
  
    --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
    @@ -279,25 +282,108 @@ int safeToInt(Long value) throws FilterStringSyntaxException {
             return value.intValue();
         }
     
    +    private int parseConstDataType() throws Exception {
    +        if (!Character.isDigit((char) filterByteArr[index])) {
    +            throw new FilterStringSyntaxException("datatype OID should follow at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument at " + digits);
    +        }
    +    }
    +
    +    private int parseDataLength() throws Exception {
    +        if (((char) filterByteArr[index]) != 's') {
    +            throw new FilterStringSyntaxException("data length delimiter 's' expected at " +  index);
    +        }
    +
    +        index++;
    +        return parseInt();
    +    }
    +
    +    private int parseInt() throws Exception {
    +        if (index == filterByteArr.length) {
    +            throw new FilterStringSyntaxException("numeric argument expected at " + index);
    +        }
    +
    +        String digits = parseDigits();
    +
    +        try {
    +            return Integer.parseInt(digits);
    +        } catch (NumberFormatException e) {
    +            throw new FilterStringSyntaxException("invalid numeric argument " + digits);
    +        }
    +    }
    +
    +    private Object convertDataType(byte[] byteData, DataType dataType) throws Exception {
    +        String data = new String(byteData);
    +        try {
    +            switch (dataType) {
    +                case BIGINT:
    +                    return Long.parseLong(data);
    +                case INTEGER:
    +                case SMALLINT:
    +                    return Integer.parseInt(data);
    +                case FLOAT8:
    +                case REAL:
    +                    return Float.parseFloat(data);
    +                case NUMERIC:
    +                    return Double.parseDouble(data);
    +                case TEXT:
    +                case VARCHAR:
    +                case BPCHAR:
    +                    return data;
    +                case BOOLEAN:
    +                    return Boolean.parseBoolean(data);
    +                case DATE:
    +                    return Date.valueOf(data);
    +                case TIMESTAMP:
    +                    return Timestamp.valueOf(data);
    +                default:
    +                    throw new FilterStringSyntaxException("DataType " + dataType.toString() + " unsupported");
    +            }
    +        } catch (NumberFormatException nfe) {
    +            throw new FilterStringSyntaxException("failed to parse number data type starting at " + index);
    +        }
    +    }
         /**
          * Parses either a number or a string.
          */
         private Object parseParameter() throws Exception {
    -        if (index == filterString.length()) {
    +        if (index == filterByteArr.length) {
                 throw new FilterStringSyntaxException("argument should follow at " + index);
             }
     
    -        return senseString()
    -                ? parseString()
    -                : parseNumber();
    -    }
    +        DataType dataType = DataType.get(parseConstDataType());
    +        if (dataType == DataType.UNSUPPORTED_TYPE) {
    +            throw new FilterStringSyntaxException("invalid DataType OID at " + (index - 1));
    +        }
    +
    +        int dataLength = parseDataLength();
    +        if (dataLength < 1) {
    +            throw new FilterStringSyntaxException("invalid data size " + dataLength + " at " + (index - 1));
    +        }
    +        if (index + dataLength > filterByteArr.length) {
    +            throw new FilterStringSyntaxException("data size larger than filter string starting at " + index);
    +        }
    +
    +        if (((char) filterByteArr[index]) != 'd') {
    +            throw new FilterStringSyntaxException("data delimiter 'd' expected at " + index);
    +        }
    +
    +        index++;
     
    -    private boolean senseString() {
    -        return filterString.charAt(index) == '"';
    +        Object data = convertDataType(Arrays.copyOfRange(filterByteArr, index, index+dataLength), dataType);
    --- End diff --
    
    why do we copy data once again here ? we can send the original array and indicies and use new String (arr[], start, end) constructor in convertDataType


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---