You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by AakashPradeep <gi...@git.apache.org> on 2015/04/23 04:44:17 UTC

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

GitHub user AakashPradeep opened a pull request:

    https://github.com/apache/phoenix/pull/75

    Support native JSON data type and json_extract_path function PHOENIX-174...

    ...3 and PHOENIX-1710 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/AakashPradeep/phoenix json

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/phoenix/pull/75.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #75
    
----
commit bab8f40c745b0f3cfd28597d6938574f4ba91ea2
Author: Aakash <aa...@salesforce.com>
Date:   2015-04-14T01:32:10Z

    Support native JSON data type and json_extract_path function PHOENIX-1743 and PHOENIX-1710

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28941403
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    Default implementation of isArrayType() returns false, hence I have not modified it. But in ordered list PVarcharArray comes before PJsonDataType and so value is first testified against PVarcharArraay. Now since PVarcharArray is ArrayType it gets inside the if() and type cast the value which throws ClassCastException since value is of type PJsonDataType. After casting the value, in next if (Line 1150) it validates whether this isInstance() which is not correct. Instead before casting value we should check whether it is instance of current type or not to avoid any ClassCastException. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep closed the pull request at:

    https://github.com/apache/phoenix/pull/75


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28935743
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    I will fix the formatting error here. But it is required to check the isInstance() in first IF itself otherwise it throw ClassCastException. For any value whose data type ordinal is less than PhoenixArray will try to be type casted to PhoenixArray at 1148 which is wrong as it is iterated in the order of ordinal where PhoenixArray type will be iterated before PJsonDataType.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28933979
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/expression/function/JsonExtractPathFunction.java ---
    @@ -0,0 +1,110 @@
    +package org.apache.phoenix.expression.function;
    +
    +import java.sql.SQLException;
    +import java.util.List;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.phoenix.compile.KeyPart;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.expression.Expression;
    +import org.apache.phoenix.parse.FunctionParseNode.Argument;
    +import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.schema.tuple.Tuple;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PJsonDataType;
    +import org.apache.phoenix.schema.types.PVarcharArray;
    +import org.apache.phoenix.schema.types.PhoenixArray;
    +import org.apache.phoenix.util.ByteUtil;
    +
    +@BuiltInFunction(name = JsonExtractPathFunction.NAME, args = {
    +        @Argument(allowedTypes = { PJsonDataType.class }),
    +        @Argument(allowedTypes = { PVarcharArray.class }) })
    +public class JsonExtractPathFunction extends ScalarFunction {
    +    public static final String NAME = "JSON_EXTRACT_PATH";
    +
    +    public JsonExtractPathFunction() {
    +        super();
    +    }
    +
    +    public JsonExtractPathFunction(List<Expression> children) {
    +        super(children);
    +    }
    +
    +    @Override
    +    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    +
    +        Expression jsonExpression = this.children.get(0);
    +        if (!jsonExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +        PhoenixJson phoenixJson =
    +                (PhoenixJson) PJsonDataType.INSTANCE.toObject(ptr.get(), ptr.getOffset(),
    +                    ptr.getLength());
    +
    +        Expression jsonPathArrayExpression = children.get(1);
    +        if (!jsonPathArrayExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +
    +        if (ptr.getLength() == 0) {
    +            return false;
    --- End diff --
    
    Same here: return true


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934102
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/expression/function/JsonExtractPathFunction.java ---
    @@ -0,0 +1,110 @@
    +package org.apache.phoenix.expression.function;
    +
    +import java.sql.SQLException;
    +import java.util.List;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.phoenix.compile.KeyPart;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.expression.Expression;
    +import org.apache.phoenix.parse.FunctionParseNode.Argument;
    +import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.schema.tuple.Tuple;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PJsonDataType;
    +import org.apache.phoenix.schema.types.PVarcharArray;
    +import org.apache.phoenix.schema.types.PhoenixArray;
    +import org.apache.phoenix.util.ByteUtil;
    +
    +@BuiltInFunction(name = JsonExtractPathFunction.NAME, args = {
    +        @Argument(allowedTypes = { PJsonDataType.class }),
    +        @Argument(allowedTypes = { PVarcharArray.class }) })
    +public class JsonExtractPathFunction extends ScalarFunction {
    +    public static final String NAME = "JSON_EXTRACT_PATH";
    +
    +    public JsonExtractPathFunction() {
    +        super();
    +    }
    +
    +    public JsonExtractPathFunction(List<Expression> children) {
    +        super(children);
    +    }
    +
    +    @Override
    +    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    +
    +        Expression jsonExpression = this.children.get(0);
    +        if (!jsonExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +        PhoenixJson phoenixJson =
    +                (PhoenixJson) PJsonDataType.INSTANCE.toObject(ptr.get(), ptr.getOffset(),
    +                    ptr.getLength());
    +
    +        Expression jsonPathArrayExpression = children.get(1);
    +        if (!jsonPathArrayExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +
    +        PhoenixArray phoenixArray = (PhoenixArray) PVarcharArray.INSTANCE.toObject(ptr);
    +        try {
    +            String[] jsonPaths = (String[]) phoenixArray.getArray();
    +            PhoenixJson phoenixJson2 = phoenixJson.getNullablePhoenixJson(jsonPaths);
    +            
    +            if (phoenixJson2 == null) {
    +                ptr.set(ByteUtil.EMPTY_BYTE_ARRAY);
    +
    +            } else {
    +                byte[] json = PJsonDataType.INSTANCE.toBytes(phoenixJson2);
    +                ptr.set(json);
    +            }
    +
    +        } catch (SQLException sqe) {
    +            new IllegalDataException(new SQLExceptionInfo.Builder(SQLExceptionCode.ILLEGAL_DATA)
    +                    .setRootCause(sqe).build().buildException());
    +        }
    +        return true;
    +    }
    +
    +    @SuppressWarnings("rawtypes")
    +    @Override
    +    public PDataType getDataType() {
    +        return PJsonDataType.INSTANCE;
    +    }
    +
    +    @Override
    +    public String getName() {
    +        return NAME;
    +    }
    +
    +    @Override
    +    public boolean isNullable() {
    +        return PJsonDataType.INSTANCE.isNullable();
    --- End diff --
    
    Should be based on the children being nullable. Ask yourself, when can this built-in function return null? If either the first or the second child is nullable (based on the evaluate impl). So maybe this instead:
    
        return this.children.get(0).isNullable() || this.children.get(0).isNullable() 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28933941
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/PhoenixJsonE2ETest.java ---
    @@ -0,0 +1,246 @@
    +/**
    + * 
    + */
    +package org.apache.phoenix.end2end;
    +
    +import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES;
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertFalse;
    +import static org.junit.Assert.assertTrue;
    +import static org.junit.Assert.assertNotEquals;
    +import static org.junit.Assert.fail;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.PreparedStatement;
    +import java.sql.ResultSet;
    +import java.sql.SQLException;
    +import java.util.HashMap;
    +import java.util.Map.Entry;
    +import java.util.Properties;
    +import java.util.Set;
    +
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.schema.types.PJsonDataType;
    +import org.apache.phoenix.util.PropertiesUtil;
    +import org.junit.Test;
    +
    +/**
    + * End to end test for JSON data type for {@link PJsonDataType} and
    + * {@link PhoenixJson}.
    + */
    +public class PhoenixJsonE2ETest extends BaseHBaseManagedTimeIT {
    --- End diff --
    
    Rename PhoenixJsonIT instead so it gets run with mvn verify


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934607
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PVarchar.java ---
    @@ -30,134 +31,145 @@
     
     public class PVarchar extends PDataType<String> {
    --- End diff --
    
    Any non whitespace changes here - if not, please revert.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934572
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    --- End diff --
    
    Did you consider deriving from PVarchar?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r29023060
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType, Object value) {
    +		return isCoercibleTo(targetType);
    +	}
    +
    +	@Override
    +	public boolean isSizeCompatible(ImmutableBytesWritable ptr, Object value,
    +			@SuppressWarnings("rawtypes") PDataType srcType, Integer maxLength,
    +			Integer scale, Integer desiredMaxLength, Integer desiredScale) {
    +		if (ptr.getLength() != 0 && maxLength != null
    +				&& desiredMaxLength != null) {
    +			return maxLength <= desiredMaxLength;
    +		}
    +		return true;
    +	}
    +
    +	@Override
    +	public boolean isFixedWidth() {
    +		return false;
    +	}
    +
    +	@Override
    +	public int estimateByteSize(Object o) {
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return phoenixJson.toString().length();
    +	}
    +
    +	@Override
    +	public Integer getByteSize() {
    +		return null;
    +	}
    +
    +	@Override
    +	public int compareTo(Object lhs, Object rhs,
    +			@SuppressWarnings("rawtypes") PDataType rhsType) {
    +		if (PJsonDataType.INSTANCE != rhsType) {
    +			throwConstraintViolationException(rhsType, this);
    +		}
    +		PhoenixJson phoenixJsonLHS = (PhoenixJson) lhs;
    +		PhoenixJson phoenixJsonRHS = (PhoenixJson) rhs;
    +		return PVarchar.INSTANCE.compareTo(phoenixJsonLHS.toString(),
    +				phoenixJsonRHS.toString(), rhsType);
    +	}
    +
    +	@Override
    +	public Object toObject(String value) {
    +		byte[] jsonData = Bytes.toBytes(value);
    +		return getPhoenixJson(jsonData, 0, jsonData.length);
    +	}
    +
    +	@Override
    +	public boolean isBytesComparableWith(
    +			@SuppressWarnings("rawtypes") PDataType otherType) {
    +		return otherType == PJsonDataType.INSTANCE
    +				|| otherType == PVarchar.INSTANCE;
    +	}
    +
    +	@Override
    +	public String toStringLiteral(Object o, Format formatter) {
    +		if (o == null) {
    +			return StringUtil.EMPTY_STRING;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return PVarchar.INSTANCE.toStringLiteral(phoenixJson.toString(),
    +				formatter);
    +	}
    +
    +	@Override
    +	public void coerceBytes(ImmutableBytesWritable ptr, Object o,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			Integer actualMaxLength, Integer actualScale,
    +			SortOrder actualModifier, Integer desiredMaxLength,
    +			Integer desiredScale, SortOrder expectedModifier) {
    --- End diff --
    
    I am ignoring the order of json here since inversion of order will make json invalid.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28982531
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType, Object value) {
    +		return isCoercibleTo(targetType);
    +	}
    +
    +	@Override
    +	public boolean isSizeCompatible(ImmutableBytesWritable ptr, Object value,
    +			@SuppressWarnings("rawtypes") PDataType srcType, Integer maxLength,
    +			Integer scale, Integer desiredMaxLength, Integer desiredScale) {
    +		if (ptr.getLength() != 0 && maxLength != null
    +				&& desiredMaxLength != null) {
    +			return maxLength <= desiredMaxLength;
    +		}
    +		return true;
    +	}
    +
    +	@Override
    +	public boolean isFixedWidth() {
    +		return false;
    +	}
    +
    +	@Override
    +	public int estimateByteSize(Object o) {
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return phoenixJson.toString().length();
    +	}
    +
    +	@Override
    +	public Integer getByteSize() {
    +		return null;
    +	}
    +
    +	@Override
    +	public int compareTo(Object lhs, Object rhs,
    +			@SuppressWarnings("rawtypes") PDataType rhsType) {
    +		if (PJsonDataType.INSTANCE != rhsType) {
    +			throwConstraintViolationException(rhsType, this);
    +		}
    +		PhoenixJson phoenixJsonLHS = (PhoenixJson) lhs;
    +		PhoenixJson phoenixJsonRHS = (PhoenixJson) rhs;
    +		return PVarchar.INSTANCE.compareTo(phoenixJsonLHS.toString(),
    +				phoenixJsonRHS.toString(), rhsType);
    --- End diff --
    
    Doesn't Jackson provide a way of comparing two JSONs? Create a compareTo method on PhoenixJson.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934392
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/json/PhoenixJson.java ---
    @@ -0,0 +1,216 @@
    +package org.apache.phoenix.schema.json;
    +
    +import java.io.IOException;
    +import java.sql.SQLException;
    +
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.codehaus.jackson.JsonFactory;
    +import org.codehaus.jackson.JsonNode;
    +import org.codehaus.jackson.JsonParseException;
    +import org.codehaus.jackson.JsonParser;
    +import org.codehaus.jackson.JsonParser.Feature;
    +import org.codehaus.jackson.map.ObjectMapper;
    +import org.codehaus.jackson.node.ValueNode;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * The {@link PhoenixJson} wraps json and uses Jackson library to parse and traverse the json. It
    + * should be used to represent the JSON data type and also should be used to parse Json data and
    + * read the value from it. It always conside the last value if same key exist more than once.
    + */
    +public class PhoenixJson {
    +    private final JsonNode node;
    +    private String jsonAsString;
    +
    +    /**
    +     * Static Factory method to get an {@link PhoenixJson} object. It also validates the json and
    +     * throws {@link JsonParseException} if it is invalid with line number and character, which
    +     * should be wrapped in {@link SQLException} by caller.
    +     * @param data Buffer that contains data to parse
    +     * @param offset Offset of the first data byte within buffer
    +     * @param length Length of contents to parse within buffer
    +     * @return {@link PhoenixJson}.
    +     * @throws JsonParseException
    +     * @throws IOException
    +     */
    +    public static PhoenixJson getPhoenixJson(byte[] jsonData, int offset, int length)
    +            throws JsonParseException, IOException {
    --- End diff --
    
    Probably best to throw a SQLException here with the JsonParseException or IOException nested inside. Perhaps create an INVALID_JSON_DATA exception in SQLExceptionCode that you can use for the JsonParseException.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934116
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/expression/function/JsonExtractPathFunction.java ---
    @@ -0,0 +1,110 @@
    +package org.apache.phoenix.expression.function;
    +
    +import java.sql.SQLException;
    +import java.util.List;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.phoenix.compile.KeyPart;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.expression.Expression;
    +import org.apache.phoenix.parse.FunctionParseNode.Argument;
    +import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.schema.tuple.Tuple;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PJsonDataType;
    +import org.apache.phoenix.schema.types.PVarcharArray;
    +import org.apache.phoenix.schema.types.PhoenixArray;
    +import org.apache.phoenix.util.ByteUtil;
    +
    +@BuiltInFunction(name = JsonExtractPathFunction.NAME, args = {
    +        @Argument(allowedTypes = { PJsonDataType.class }),
    +        @Argument(allowedTypes = { PVarcharArray.class }) })
    +public class JsonExtractPathFunction extends ScalarFunction {
    +    public static final String NAME = "JSON_EXTRACT_PATH";
    +
    +    public JsonExtractPathFunction() {
    +        super();
    +    }
    +
    +    public JsonExtractPathFunction(List<Expression> children) {
    +        super(children);
    +    }
    +
    +    @Override
    +    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    +
    +        Expression jsonExpression = this.children.get(0);
    +        if (!jsonExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +        PhoenixJson phoenixJson =
    +                (PhoenixJson) PJsonDataType.INSTANCE.toObject(ptr.get(), ptr.getOffset(),
    +                    ptr.getLength());
    +
    +        Expression jsonPathArrayExpression = children.get(1);
    +        if (!jsonPathArrayExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +
    +        PhoenixArray phoenixArray = (PhoenixArray) PVarcharArray.INSTANCE.toObject(ptr);
    +        try {
    +            String[] jsonPaths = (String[]) phoenixArray.getArray();
    +            PhoenixJson phoenixJson2 = phoenixJson.getNullablePhoenixJson(jsonPaths);
    +            
    +            if (phoenixJson2 == null) {
    +                ptr.set(ByteUtil.EMPTY_BYTE_ARRAY);
    +
    +            } else {
    +                byte[] json = PJsonDataType.INSTANCE.toBytes(phoenixJson2);
    +                ptr.set(json);
    +            }
    +
    +        } catch (SQLException sqe) {
    +            new IllegalDataException(new SQLExceptionInfo.Builder(SQLExceptionCode.ILLEGAL_DATA)
    +                    .setRootCause(sqe).build().buildException());
    +        }
    +        return true;
    +    }
    +
    +    @SuppressWarnings("rawtypes")
    +    @Override
    +    public PDataType getDataType() {
    +        return PJsonDataType.INSTANCE;
    +    }
    +
    +    @Override
    +    public String getName() {
    +        return NAME;
    +    }
    +
    +    @Override
    +    public boolean isNullable() {
    +        return PJsonDataType.INSTANCE.isNullable();
    +    }
    +
    +    @Override
    --- End diff --
    
    Remove these implementations, as this looks like the default impl.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28982704
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType, Object value) {
    +		return isCoercibleTo(targetType);
    +	}
    +
    +	@Override
    +	public boolean isSizeCompatible(ImmutableBytesWritable ptr, Object value,
    +			@SuppressWarnings("rawtypes") PDataType srcType, Integer maxLength,
    +			Integer scale, Integer desiredMaxLength, Integer desiredScale) {
    +		if (ptr.getLength() != 0 && maxLength != null
    +				&& desiredMaxLength != null) {
    +			return maxLength <= desiredMaxLength;
    +		}
    +		return true;
    +	}
    +
    +	@Override
    +	public boolean isFixedWidth() {
    +		return false;
    +	}
    +
    +	@Override
    +	public int estimateByteSize(Object o) {
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return phoenixJson.toString().length();
    +	}
    +
    +	@Override
    +	public Integer getByteSize() {
    +		return null;
    +	}
    +
    +	@Override
    +	public int compareTo(Object lhs, Object rhs,
    +			@SuppressWarnings("rawtypes") PDataType rhsType) {
    +		if (PJsonDataType.INSTANCE != rhsType) {
    +			throwConstraintViolationException(rhsType, this);
    +		}
    +		PhoenixJson phoenixJsonLHS = (PhoenixJson) lhs;
    +		PhoenixJson phoenixJsonRHS = (PhoenixJson) rhs;
    +		return PVarchar.INSTANCE.compareTo(phoenixJsonLHS.toString(),
    +				phoenixJsonRHS.toString(), rhsType);
    +	}
    +
    +	@Override
    +	public Object toObject(String value) {
    +		byte[] jsonData = Bytes.toBytes(value);
    +		return getPhoenixJson(jsonData, 0, jsonData.length);
    +	}
    +
    +	@Override
    +	public boolean isBytesComparableWith(
    +			@SuppressWarnings("rawtypes") PDataType otherType) {
    +		return otherType == PJsonDataType.INSTANCE
    +				|| otherType == PVarchar.INSTANCE;
    +	}
    +
    +	@Override
    +	public String toStringLiteral(Object o, Format formatter) {
    +		if (o == null) {
    +			return StringUtil.EMPTY_STRING;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return PVarchar.INSTANCE.toStringLiteral(phoenixJson.toString(),
    +				formatter);
    +	}
    +
    +	@Override
    +	public void coerceBytes(ImmutableBytesWritable ptr, Object o,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			Integer actualMaxLength, Integer actualScale,
    +			SortOrder actualModifier, Integer desiredMaxLength,
    +			Integer desiredScale, SortOrder expectedModifier) {
    --- End diff --
    
    Same as PVarchar? Or even the default? Don't repeat here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r29023183
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    I'm still not clear what is the root of the issue, so let's start there. If the type.isArrayType() is true, then the data type should be coercible to a PhoenixArray. Why isn't it? Is it because it's a PhoenixJson? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28936075
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    But a JSON type shouldn't pass the type.isArrayType() right? How about modifying the type.isArrayType() instead?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28982969
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    If type.isArrayType() is true, then you should be able to cast it to a PhoenixArray. Is it because a PVarcharArray is storing the PJsonDataType instead of PVarchar? I think we'll need a PJsonArray.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934216
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/expression/function/JsonExtractPathTextFunction.java ---
    @@ -0,0 +1,108 @@
    +package org.apache.phoenix.expression.function;
    +
    +import java.sql.SQLException;
    +import java.util.List;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.phoenix.compile.KeyPart;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.expression.Expression;
    +import org.apache.phoenix.parse.FunctionParseNode.Argument;
    +import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.schema.tuple.Tuple;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PJsonDataType;
    +import org.apache.phoenix.schema.types.PVarchar;
    +import org.apache.phoenix.schema.types.PVarcharArray;
    +import org.apache.phoenix.schema.types.PhoenixArray;
    +import org.apache.phoenix.util.ByteUtil;
    +
    +@BuiltInFunction(name = JsonExtractPathTextFunction.NAME, args = {
    +        @Argument(allowedTypes = { PJsonDataType.class }),
    +        @Argument(allowedTypes = { PVarcharArray.class }) })
    +public class JsonExtractPathTextFunction extends ScalarFunction {
    +    public static final String NAME = "JSON_EXTRACT_PATH_TEXT";
    +
    +    public JsonExtractPathTextFunction() {
    +        super();
    +    }
    +
    +    public JsonExtractPathTextFunction(List<Expression> children) {
    +        super(children);
    +    }
    +
    +    @Override
    +    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    +        Expression jsonExpression = this.children.get(0);
    +        if (!jsonExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +        PhoenixJson phoenixJson =
    +                (PhoenixJson) PJsonDataType.INSTANCE.toObject(ptr.get(), ptr.getOffset(),
    +                    ptr.getLength());
    +
    +        Expression jsonPathArrayExpression = children.get(1);
    +        if (!jsonPathArrayExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +
    +        PhoenixArray phoenixArray = (PhoenixArray) PVarcharArray.INSTANCE.toObject(ptr);
    +        try {
    +            String[] jsonPaths = (String[]) phoenixArray.getArray();
    +            PhoenixJson phoenixJson2 = phoenixJson.getNullablePhoenixJson(jsonPaths);
    +            if (phoenixJson2 == null) {
    +                ptr.set(ByteUtil.EMPTY_BYTE_ARRAY);
    +            } else {
    +                byte[] json = PVarchar.INSTANCE.toBytes(phoenixJson2.serializeToString());
    +                ptr.set(json);
    +            }
    +
    +        } catch (SQLException sqe) {
    +            new IllegalDataException(new SQLExceptionInfo.Builder(SQLExceptionCode.ILLEGAL_DATA)
    --- End diff --
    
    You can just do a throw here with the sqe in the constructor and it'll get un-nested for you (don't forget to throw it, though! :-) ):
        throw new IllegalDataException(sqe);
         


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28935769
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PVarchar.java ---
    @@ -30,134 +31,145 @@
     
     public class PVarchar extends PDataType<String> {
    --- End diff --
    
    I will revert this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r29023368
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType, Object value) {
    +		return isCoercibleTo(targetType);
    +	}
    +
    +	@Override
    +	public boolean isSizeCompatible(ImmutableBytesWritable ptr, Object value,
    +			@SuppressWarnings("rawtypes") PDataType srcType, Integer maxLength,
    +			Integer scale, Integer desiredMaxLength, Integer desiredScale) {
    +		if (ptr.getLength() != 0 && maxLength != null
    +				&& desiredMaxLength != null) {
    +			return maxLength <= desiredMaxLength;
    +		}
    +		return true;
    +	}
    +
    +	@Override
    +	public boolean isFixedWidth() {
    +		return false;
    +	}
    +
    +	@Override
    +	public int estimateByteSize(Object o) {
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return phoenixJson.toString().length();
    +	}
    +
    +	@Override
    +	public Integer getByteSize() {
    +		return null;
    +	}
    +
    +	@Override
    +	public int compareTo(Object lhs, Object rhs,
    +			@SuppressWarnings("rawtypes") PDataType rhsType) {
    +		if (PJsonDataType.INSTANCE != rhsType) {
    +			throwConstraintViolationException(rhsType, this);
    +		}
    +		PhoenixJson phoenixJsonLHS = (PhoenixJson) lhs;
    +		PhoenixJson phoenixJsonRHS = (PhoenixJson) rhs;
    +		return PVarchar.INSTANCE.compareTo(phoenixJsonLHS.toString(),
    +				phoenixJsonRHS.toString(), rhsType);
    +	}
    +
    +	@Override
    +	public Object toObject(String value) {
    +		byte[] jsonData = Bytes.toBytes(value);
    +		return getPhoenixJson(jsonData, 0, jsonData.length);
    +	}
    +
    +	@Override
    +	public boolean isBytesComparableWith(
    +			@SuppressWarnings("rawtypes") PDataType otherType) {
    +		return otherType == PJsonDataType.INSTANCE
    +				|| otherType == PVarchar.INSTANCE;
    +	}
    +
    +	@Override
    +	public String toStringLiteral(Object o, Format formatter) {
    +		if (o == null) {
    +			return StringUtil.EMPTY_STRING;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return PVarchar.INSTANCE.toStringLiteral(phoenixJson.toString(),
    +				formatter);
    +	}
    +
    +	@Override
    +	public void coerceBytes(ImmutableBytesWritable ptr, Object o,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			Integer actualMaxLength, Integer actualScale,
    +			SortOrder actualModifier, Integer desiredMaxLength,
    +			Integer desiredScale, SortOrder expectedModifier) {
    --- End diff --
    
    You can't ignore the sortOrder. It'll be inverted back before any attempts to change it back into JSON. It's the same for other types. For example, an inverted String cannot be converted back until it's re-inverted. Same with Decimal.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28982298
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    --- End diff --
    
    This is the default impl, so no need to repeat this here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28935112
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/expression/function/JsonExtractPathTextFunction.java ---
    @@ -0,0 +1,108 @@
    +package org.apache.phoenix.expression.function;
    +
    +import java.sql.SQLException;
    +import java.util.List;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.phoenix.compile.KeyPart;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.expression.Expression;
    +import org.apache.phoenix.parse.FunctionParseNode.Argument;
    +import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.schema.tuple.Tuple;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PJsonDataType;
    +import org.apache.phoenix.schema.types.PVarchar;
    +import org.apache.phoenix.schema.types.PVarcharArray;
    +import org.apache.phoenix.schema.types.PhoenixArray;
    +import org.apache.phoenix.util.ByteUtil;
    +
    +@BuiltInFunction(name = JsonExtractPathTextFunction.NAME, args = {
    +        @Argument(allowedTypes = { PJsonDataType.class }),
    +        @Argument(allowedTypes = { PVarcharArray.class }) })
    +public class JsonExtractPathTextFunction extends ScalarFunction {
    +    public static final String NAME = "JSON_EXTRACT_PATH_TEXT";
    +
    +    public JsonExtractPathTextFunction() {
    +        super();
    +    }
    +
    +    public JsonExtractPathTextFunction(List<Expression> children) {
    +        super(children);
    +    }
    +
    +    @Override
    +    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    +        Expression jsonExpression = this.children.get(0);
    +        if (!jsonExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +        PhoenixJson phoenixJson =
    +                (PhoenixJson) PJsonDataType.INSTANCE.toObject(ptr.get(), ptr.getOffset(),
    +                    ptr.getLength());
    +
    +        Expression jsonPathArrayExpression = children.get(1);
    +        if (!jsonPathArrayExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +
    +        if (ptr.getLength() == 0) {
    +            return false;
    +        }
    +
    +        PhoenixArray phoenixArray = (PhoenixArray) PVarcharArray.INSTANCE.toObject(ptr);
    +        try {
    +            String[] jsonPaths = (String[]) phoenixArray.getArray();
    +            PhoenixJson phoenixJson2 = phoenixJson.getNullablePhoenixJson(jsonPaths);
    +            if (phoenixJson2 == null) {
    +                ptr.set(ByteUtil.EMPTY_BYTE_ARRAY);
    +            } else {
    +                byte[] json = PVarchar.INSTANCE.toBytes(phoenixJson2.serializeToString());
    +                ptr.set(json);
    +            }
    +
    +        } catch (SQLException sqe) {
    +            new IllegalDataException(new SQLExceptionInfo.Builder(SQLExceptionCode.ILLEGAL_DATA)
    --- End diff --
    
    Yeah I missed that. I will add them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934535
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    Actually, if there are no changes here but whitespace, then please just revert this change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep commented on the pull request:

    https://github.com/apache/phoenix/pull/75#issuecomment-95399238
  
    @JamesRTaylor  Can you please review Phoenix Json.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28982616
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType, Object value) {
    +		return isCoercibleTo(targetType);
    +	}
    +
    +	@Override
    +	public boolean isSizeCompatible(ImmutableBytesWritable ptr, Object value,
    +			@SuppressWarnings("rawtypes") PDataType srcType, Integer maxLength,
    +			Integer scale, Integer desiredMaxLength, Integer desiredScale) {
    +		if (ptr.getLength() != 0 && maxLength != null
    +				&& desiredMaxLength != null) {
    +			return maxLength <= desiredMaxLength;
    +		}
    +		return true;
    +	}
    +
    +	@Override
    +	public boolean isFixedWidth() {
    +		return false;
    +	}
    +
    +	@Override
    +	public int estimateByteSize(Object o) {
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return phoenixJson.toString().length();
    +	}
    +
    +	@Override
    +	public Integer getByteSize() {
    +		return null;
    +	}
    +
    +	@Override
    +	public int compareTo(Object lhs, Object rhs,
    +			@SuppressWarnings("rawtypes") PDataType rhsType) {
    +		if (PJsonDataType.INSTANCE != rhsType) {
    +			throwConstraintViolationException(rhsType, this);
    +		}
    +		PhoenixJson phoenixJsonLHS = (PhoenixJson) lhs;
    +		PhoenixJson phoenixJsonRHS = (PhoenixJson) rhs;
    +		return PVarchar.INSTANCE.compareTo(phoenixJsonLHS.toString(),
    +				phoenixJsonRHS.toString(), rhsType);
    +	}
    +
    +	@Override
    +	public Object toObject(String value) {
    +		byte[] jsonData = Bytes.toBytes(value);
    +		return getPhoenixJson(jsonData, 0, jsonData.length);
    --- End diff --
    
    You don't want to go from String->byte[]->String here. Create a getPhoenixJson(String) method.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28933972
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/expression/function/JsonExtractPathFunction.java ---
    @@ -0,0 +1,110 @@
    +package org.apache.phoenix.expression.function;
    +
    +import java.sql.SQLException;
    +import java.util.List;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.phoenix.compile.KeyPart;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.expression.Expression;
    +import org.apache.phoenix.parse.FunctionParseNode.Argument;
    +import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.schema.tuple.Tuple;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PJsonDataType;
    +import org.apache.phoenix.schema.types.PVarcharArray;
    +import org.apache.phoenix.schema.types.PhoenixArray;
    +import org.apache.phoenix.util.ByteUtil;
    +
    +@BuiltInFunction(name = JsonExtractPathFunction.NAME, args = {
    +        @Argument(allowedTypes = { PJsonDataType.class }),
    +        @Argument(allowedTypes = { PVarcharArray.class }) })
    +public class JsonExtractPathFunction extends ScalarFunction {
    +    public static final String NAME = "JSON_EXTRACT_PATH";
    +
    +    public JsonExtractPathFunction() {
    +        super();
    +    }
    +
    +    public JsonExtractPathFunction(List<Expression> children) {
    +        super(children);
    +    }
    +
    +    @Override
    +    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    +
    +        Expression jsonExpression = this.children.get(0);
    +        if (!jsonExpression.evaluate(tuple, ptr)) {
    +            return false;
    +        }
    +        if (ptr.getLength() == 0) {
    +            return false;
    --- End diff --
    
    Return true here. Only return false if the child expression evaluation returns false.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r29022695
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    IMHO Array is a complex data type and so does the json, we should not allow array of json as we dont allow array of array.  @twdsilva  whats your take on this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28982446
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType, Object value) {
    +		return isCoercibleTo(targetType);
    +	}
    +
    +	@Override
    +	public boolean isSizeCompatible(ImmutableBytesWritable ptr, Object value,
    +			@SuppressWarnings("rawtypes") PDataType srcType, Integer maxLength,
    +			Integer scale, Integer desiredMaxLength, Integer desiredScale) {
    +		if (ptr.getLength() != 0 && maxLength != null
    +				&& desiredMaxLength != null) {
    +			return maxLength <= desiredMaxLength;
    +		}
    +		return true;
    +	}
    +
    +	@Override
    +	public boolean isFixedWidth() {
    +		return false;
    +	}
    +
    +	@Override
    +	public int estimateByteSize(Object o) {
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return phoenixJson.toString().length();
    --- End diff --
    
    Add a estimateByteSize() on PhoenixJson as you don't want to have to string-ify the JSON just to get an estimate of the size.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934484
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/json/PhoenixJson.java ---
    @@ -0,0 +1,216 @@
    +package org.apache.phoenix.schema.json;
    +
    +import java.io.IOException;
    +import java.sql.SQLException;
    +
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.codehaus.jackson.JsonFactory;
    +import org.codehaus.jackson.JsonNode;
    +import org.codehaus.jackson.JsonParseException;
    +import org.codehaus.jackson.JsonParser;
    +import org.codehaus.jackson.JsonParser.Feature;
    +import org.codehaus.jackson.map.ObjectMapper;
    +import org.codehaus.jackson.node.ValueNode;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * The {@link PhoenixJson} wraps json and uses Jackson library to parse and traverse the json. It
    + * should be used to represent the JSON data type and also should be used to parse Json data and
    + * read the value from it. It always conside the last value if same key exist more than once.
    + */
    +public class PhoenixJson {
    +    private final JsonNode node;
    +    private String jsonAsString;
    +
    +    /**
    +     * Static Factory method to get an {@link PhoenixJson} object. It also validates the json and
    +     * throws {@link JsonParseException} if it is invalid with line number and character, which
    +     * should be wrapped in {@link SQLException} by caller.
    +     * @param data Buffer that contains data to parse
    +     * @param offset Offset of the first data byte within buffer
    +     * @param length Length of contents to parse within buffer
    +     * @return {@link PhoenixJson}.
    +     * @throws JsonParseException
    +     * @throws IOException
    +     */
    +    public static PhoenixJson getPhoenixJson(byte[] jsonData, int offset, int length)
    +            throws JsonParseException, IOException {
    +        JsonFactory jsonFactory = new JsonFactory();
    +        JsonParser jsonParser = jsonFactory.createJsonParser(jsonData, offset, length);
    +        jsonParser.configure(Feature.ALLOW_COMMENTS, true);
    +        ObjectMapper objectMapper = new ObjectMapper();
    +        try {
    +            JsonNode rootNode = objectMapper.readTree(jsonParser);
    +            PhoenixJson phoenixJson = new PhoenixJson(rootNode);
    +
    +            /*
    +             * input data has been stored as it is, since some data is lost when json parser runs,
    +             * for example if a JSON object within the value contains the same key more than once
    +             * then only last one is stored rest all of them are ignored, which will defy the
    +             * contract of PJsonDataType of keeping user data as it is.
    +             */
    +            phoenixJson.setJsonAsString(Bytes.toString(jsonData, offset, length));
    +            return phoenixJson;
    +        } finally {
    +            jsonParser.close();
    +        }
    +
    +    }
    +
    +    /* Default for unit testing */PhoenixJson(final JsonNode node) {
    +        Preconditions.checkNotNull(node, "root node cannot be null for json");
    +        this.node = node;
    +    }
    +
    +    /**
    +     * Get {@link PhoenixJson} for a given json paths. For example :
    +     * <p>
    +     * <code>
    +     * {"f2":{"f3":1},"f4":{"f5":99,"f6":{"f7":"2"}}}'
    +     * </code>
    +     * <p>
    +     * for this source json, if we want to know the json at path {'f4','f6'} it will return
    +     * {@link PhoenixJson} object for json {"f7":"2"}. It always returns the last key if same key
    +     * exist more than once.
    +     * <p>
    +     * If the given path is unreachable then it throws {@link PhoenixJsonException} with the message
    +     * having information about not found path. It is caller responsibility to wrap it in
    +     * {@link SQLException} or catch it and return null to client.
    +     * @param paths {@link String []} of path in the same order as they appear in json.
    +     * @return {@link PhoenixJson} for the json against @paths.
    +     * @throws PhoenixJsonException
    +     */
    +    public PhoenixJson getPhoenixJson(String[] paths) throws PhoenixJsonException {
    +        JsonNode node = this.node;
    +        for (String path : paths) {
    +            JsonNode nodeTemp = null;
    +            if (node.isArray()) {
    +                try {
    +                    int index = Integer.parseInt(path);
    +                    nodeTemp = node.path(index);
    +                } catch (NumberFormatException nfe) {
    +                    throw new PhoenixJsonException("path: " + path + " not found", nfe);
    +                }
    +            } else {
    +                nodeTemp = node.path(path);
    +            }
    +            if (nodeTemp == null || nodeTemp.isMissingNode()) {
    +                throw new PhoenixJsonException("path: " + path + " not found");
    +            }
    +            node = nodeTemp;
    +        }
    +        return new PhoenixJson(node);
    +    }
    +
    +    /**
    +     * Get {@link PhoenixJson} for a given json paths. For example :
    +     * <p>
    +     * <code>
    +     * {"f2":{"f3":1},"f4":{"f5":99,"f6":{"f7":"2"}}}'
    +     * </code>
    +     * <p>
    +     * for this source json, if we want to know the json at path {'f4','f6'} it will return
    +     * {@link PhoenixJson} object for json {"f7":"2"}. It always returns the last key if same key
    +     * exist more than once.
    +     * <p>
    +     * If the given path is unreachable then it return null.
    +     * @param paths {@link String []} of path in the same order as they appear in json.
    +     * @return {@link PhoenixJson} for the json against @paths.
    +     */
    +    public PhoenixJson getNullablePhoenixJson(String[] paths) {
    +        JsonNode node = this.node;
    +        for (String path : paths) {
    +            JsonNode nodeTemp = null;
    +            if (node.isArray()) {
    +                try {
    +                    int index = Integer.parseInt(path);
    +                    nodeTemp = node.path(index);
    +                } catch (NumberFormatException nfe) {
    +                    return null;
    +                }
    +            } else {
    +                nodeTemp = node.path(path);
    +            }
    +            if (nodeTemp == null || nodeTemp.isMissingNode()) {
    +                return null;
    +            }
    +            node = nodeTemp;
    +        }
    +        return new PhoenixJson(node);
    +    }
    +
    +    /**
    +     * Serialize the current {@link PhoenixJson} to String. Its required for
    +     * json_extract_path_text(). If we just return node.toString() it will wrap String value in
    +     * double quote which is not the expectation, hence avoiding calling toString() on
    +     * {@link JsonNode} until PhoenixJson represent a Json Array or container for Json object. If
    +     * PhoenixJson just represent a {@link ValueNode} then it should return value returned from
    +     * objects toString().
    +     */
    +    public String serializeToString() {
    +        if (this.node == null || this.node.isNull()) {
    +            return null;
    +        } else if (this.node.isValueNode()) {
    +
    +            if (this.node.isNumber()) {
    +                return this.node.getNumberValue().toString();
    +            } else if (this.node.isBoolean()) {
    +                return String.valueOf(this.node.getBooleanValue());
    +            } else if (this.node.isTextual()) {
    +                return this.node.getTextValue();
    +            } else {
    +                return getJsonAsString();
    +            }
    +        } else if (this.node.isArray()) {
    +            return getJsonAsString();
    +        } else if (this.node.isContainerNode()) {
    +            return getJsonAsString();
    +        }
    +
    +        return null;
    +
    +    }
    +
    +    @Override
    +    public String toString() {
    +        if (this.jsonAsString == null && this.node != null) {
    --- End diff --
    
    Will this be called often? Not sure if there's a potential threading issue. If not called often, might not want to cache the string.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28983079
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    --- End diff --
    
    Also, let's just call this PJson as we don't append "DataType" to any of our other PDataType concrete types.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on the pull request:

    https://github.com/apache/phoenix/pull/75#issuecomment-95652863
  
    @AakashPradeep - this pull request is associated with the wrong JIRA, as it seems to be adding comments to PHOENIX-174. Can you close this pull and open two new ones? One for PHOENIX-628 and then one for PHOENIX-1710 (for the new built-in functions)? Make sure to use the JIRA in the commit message so it get associated correctly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by twdsilva <gi...@git.apache.org>.
Github user twdsilva commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r29026064
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    @AakashPradeep  You had to make this change because there was a test that was failing can you point us to the test ? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934521
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java ---
    @@ -1144,17 +1144,18 @@ public static PDataType fromLiteral(Object value) {
           return null;
         }
         for (PDataType type : PDataType.values()) {
    -      if (type.isArrayType()) {
    -        PhoenixArray arr = (PhoenixArray) value;
    -        if ((type.getSqlType() == arr.baseType.sqlType + PDataType.ARRAY_TYPE_BASE)
    -            && type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      } else {
    -        if (type.getJavaClass().isInstance(value)) {
    -          return type;
    -        }
    -      }
    +			if (type.isArrayType() && type.getJavaClass().isInstance(value)) {
    --- End diff --
    
    Formatting looks off here (but it's mistakenly using 2 spaces for tabs which isn't correct). You can reformat the entire class if you want.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28982807
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType, Object value) {
    +		return isCoercibleTo(targetType);
    +	}
    +
    +	@Override
    +	public boolean isSizeCompatible(ImmutableBytesWritable ptr, Object value,
    +			@SuppressWarnings("rawtypes") PDataType srcType, Integer maxLength,
    +			Integer scale, Integer desiredMaxLength, Integer desiredScale) {
    +		if (ptr.getLength() != 0 && maxLength != null
    +				&& desiredMaxLength != null) {
    +			return maxLength <= desiredMaxLength;
    +		}
    +		return true;
    +	}
    +
    +	@Override
    +	public boolean isFixedWidth() {
    +		return false;
    +	}
    +
    +	@Override
    +	public int estimateByteSize(Object o) {
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return phoenixJson.toString().length();
    +	}
    +
    +	@Override
    +	public Integer getByteSize() {
    +		return null;
    +	}
    +
    +	@Override
    +	public int compareTo(Object lhs, Object rhs,
    +			@SuppressWarnings("rawtypes") PDataType rhsType) {
    +		if (PJsonDataType.INSTANCE != rhsType) {
    +			throwConstraintViolationException(rhsType, this);
    +		}
    +		PhoenixJson phoenixJsonLHS = (PhoenixJson) lhs;
    +		PhoenixJson phoenixJsonRHS = (PhoenixJson) rhs;
    +		return PVarchar.INSTANCE.compareTo(phoenixJsonLHS.toString(),
    +				phoenixJsonRHS.toString(), rhsType);
    +	}
    +
    +	@Override
    +	public Object toObject(String value) {
    +		byte[] jsonData = Bytes.toBytes(value);
    +		return getPhoenixJson(jsonData, 0, jsonData.length);
    +	}
    +
    +	@Override
    +	public boolean isBytesComparableWith(
    +			@SuppressWarnings("rawtypes") PDataType otherType) {
    +		return otherType == PJsonDataType.INSTANCE
    +				|| otherType == PVarchar.INSTANCE;
    +	}
    +
    +	@Override
    +	public String toStringLiteral(Object o, Format formatter) {
    +		if (o == null) {
    +			return StringUtil.EMPTY_STRING;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) o;
    +		return PVarchar.INSTANCE.toStringLiteral(phoenixJson.toString(),
    +				formatter);
    +	}
    +
    +	@Override
    +	public void coerceBytes(ImmutableBytesWritable ptr, Object o,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			Integer actualMaxLength, Integer actualScale,
    +			SortOrder actualModifier, Integer desiredMaxLength,
    +			Integer desiredScale, SortOrder expectedModifier) {
    +		Preconditions.checkNotNull(actualModifier);
    +		Preconditions.checkNotNull(expectedModifier);
    +		if (ptr.getLength() == 0) {
    +			return;
    +		}
    +		if (this.isBytesComparableWith(actualType)) { // No coerce necessary
    +			return;
    +		}
    +
    +		// Optimization for cases in which we already have the object around
    +		if (o == null) {
    +			o = actualType.toObject(ptr, actualType, actualModifier);
    +		}
    +
    +		o = toObject(o, actualType);
    +		byte[] b = toBytes(o, expectedModifier);
    +		ptr.set(b);
    +	}
    +
    +	@Override
    +	public Object getSampleValue(Integer maxLength, Integer arrayLength) {
    +		Preconditions.checkArgument(maxLength == null || maxLength >= 0);
    +
    +		char[] key = new char[4];
    +		char[] value = new char[4];
    +		int length = maxLength != null ? maxLength : 1;
    +		if (length > (key.length + value.length)) {
    +			key = new char[length + 2];
    +			value = new char[length - key.length];
    +		}
    +		int j = 1;
    +		key[0] = '"';
    +		key[j++] = 'k';
    +		for (int i = 2; i < key.length - 1; i++) {
    +			key[j++] = (char) ('0' + RANDOM.get().nextInt(Byte.MAX_VALUE) % 10);
    +		}
    +		key[j] = '"';
    +
    +		int k = 1;
    +		value[0] = '"';
    +		value[k++] = 'v';
    +		for (int i = 2; i < value.length - 1; i++) {
    +			value[k++] = (char) ('0' + RANDOM.get().nextInt(Byte.MAX_VALUE) % 10);
    +		}
    +		value[k] = '"';
    +		StringBuilder sbr = new StringBuilder();
    +		sbr.append("{").append(key).append(":").append(value).append("}");
    +
    +		byte[] bytes = Bytes.toBytes(sbr.toString());
    +		return getPhoenixJson(bytes, 0, bytes.length);
    +	}
    +
    +	private Object getJsonFromVarchar(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		String s = (String) object;
    --- End diff --
    
    Same here - don't go from String->byte[]->String.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by AakashPradeep <gi...@git.apache.org>.
Github user AakashPradeep commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28935794
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/json/PhoenixJson.java ---
    @@ -0,0 +1,216 @@
    +package org.apache.phoenix.schema.json;
    +
    +import java.io.IOException;
    +import java.sql.SQLException;
    +
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.codehaus.jackson.JsonFactory;
    +import org.codehaus.jackson.JsonNode;
    +import org.codehaus.jackson.JsonParseException;
    +import org.codehaus.jackson.JsonParser;
    +import org.codehaus.jackson.JsonParser.Feature;
    +import org.codehaus.jackson.map.ObjectMapper;
    +import org.codehaus.jackson.node.ValueNode;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * The {@link PhoenixJson} wraps json and uses Jackson library to parse and traverse the json. It
    + * should be used to represent the JSON data type and also should be used to parse Json data and
    + * read the value from it. It always conside the last value if same key exist more than once.
    + */
    +public class PhoenixJson {
    +    private final JsonNode node;
    +    private String jsonAsString;
    +
    +    /**
    +     * Static Factory method to get an {@link PhoenixJson} object. It also validates the json and
    +     * throws {@link JsonParseException} if it is invalid with line number and character, which
    +     * should be wrapped in {@link SQLException} by caller.
    +     * @param data Buffer that contains data to parse
    +     * @param offset Offset of the first data byte within buffer
    +     * @param length Length of contents to parse within buffer
    +     * @return {@link PhoenixJson}.
    +     * @throws JsonParseException
    +     * @throws IOException
    +     */
    +    public static PhoenixJson getPhoenixJson(byte[] jsonData, int offset, int length)
    +            throws JsonParseException, IOException {
    --- End diff --
    
    Nice suggestion, will make this change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on the pull request:

    https://github.com/apache/phoenix/pull/75#issuecomment-95414642
  
    Very nice, @AakashPradeep! @twdsilva - would you mind reviewing too? Also, looks like it needs to be rebased.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28982359
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/types/PJsonDataType.java ---
    @@ -0,0 +1,258 @@
    +package org.apache.phoenix.schema.types;
    +
    +import java.io.IOException;
    +import java.sql.Types;
    +import java.text.Format;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.SortOrder;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.util.ByteUtil;
    +import org.apache.phoenix.util.StringUtil;
    +import org.codehaus.jackson.JsonParseException;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * <p>
    + * A Phoenix data type to represent JSON. The json data type stores an exact
    + * copy of the input text, which processing functions must reparse on each
    + * execution. Because the json type stores an exact copy of the input text, it
    + * will preserve semantically-insignificant white space between tokens, as well
    + * as the order of keys within JSON objects. Also, if a JSON object within the
    + * value contains the same key more than once, all the key/value pairs are kept.
    + * It stores the data as string in single column of HBase and it has same data
    + * size limit as Phoenix's Varchar.
    + * <p>
    + * JSON data types are for storing JSON (JavaScript Object Notation) data, as
    + * specified in RFC 7159. Such data can also be stored as text, but the JSON
    + * data types have the advantage of enforcing that each stored value is valid
    + * according to the JSON rules.
    + */
    +public class PJsonDataType extends PDataType<String> {
    +
    +	public static final PJsonDataType INSTANCE = new PJsonDataType();
    +
    +	PJsonDataType() {
    +		super("JSON", Types.OTHER, PhoenixJson.class, null, 48);
    +	}
    +
    +	@Override
    +	public int toBytes(Object object, byte[] bytes, int offset) {
    +
    +		if (object == null) {
    +			return 0;
    +		}
    +		byte[] b = toBytes(object);
    +		System.arraycopy(b, 0, bytes, offset, b.length);
    +		return b.length;
    +
    +	}
    +
    +	@Override
    +	public byte[] toBytes(Object object) {
    +		if (object == null) {
    +			return ByteUtil.EMPTY_BYTE_ARRAY;
    +		}
    +		PhoenixJson phoenixJson = (PhoenixJson) object;
    +		return PVarchar.INSTANCE.toBytes(phoenixJson.toString());
    +	}
    +
    +	@Override
    +	public Object toObject(byte[] bytes, int offset, int length,
    +			@SuppressWarnings("rawtypes") PDataType actualType,
    +			SortOrder sortOrder, Integer maxLength, Integer scale) {
    +
    +		if (!actualType.isCoercibleTo(this)) {
    +			throwConstraintViolationException(actualType, this);
    +		}
    +		if (length == 0) {
    +			return null;
    +		}
    +		return getPhoenixJson(bytes, offset, length);
    +
    +	}
    +
    +	@Override
    +	public Object toObject(Object object,
    +			@SuppressWarnings("rawtypes") PDataType actualType) {
    +		if (object == null) {
    +			return null;
    +		}
    +		if (equalsAny(actualType, PJsonDataType.INSTANCE)) {
    +			return object;
    +		}
    +		if (equalsAny(actualType, PVarchar.INSTANCE)) {
    +			return getJsonFromVarchar(object, actualType);
    +		}
    +		return throwConstraintViolationException(actualType, this);
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType) {
    +		return equalsAny(targetType, this, PVarchar.INSTANCE);
    +
    +	}
    +
    +	@Override
    +	public boolean isCoercibleTo(
    +			@SuppressWarnings("rawtypes") PDataType targetType, Object value) {
    +		return isCoercibleTo(targetType);
    +	}
    +
    +	@Override
    +	public boolean isSizeCompatible(ImmutableBytesWritable ptr, Object value,
    --- End diff --
    
    Call PVarchar.isSizeCompatible() instead.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Support native JSON data type and json_extra...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/75#discussion_r28934224
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/expression/function/JsonExtractPathTextFunction.java ---
    @@ -0,0 +1,108 @@
    +package org.apache.phoenix.expression.function;
    +
    +import java.sql.SQLException;
    +import java.util.List;
    +
    +import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    +import org.apache.phoenix.compile.KeyPart;
    +import org.apache.phoenix.exception.SQLExceptionCode;
    +import org.apache.phoenix.exception.SQLExceptionInfo;
    +import org.apache.phoenix.expression.Expression;
    +import org.apache.phoenix.parse.FunctionParseNode.Argument;
    +import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
    +import org.apache.phoenix.schema.IllegalDataException;
    +import org.apache.phoenix.schema.json.PhoenixJson;
    +import org.apache.phoenix.schema.tuple.Tuple;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PJsonDataType;
    +import org.apache.phoenix.schema.types.PVarchar;
    +import org.apache.phoenix.schema.types.PVarcharArray;
    +import org.apache.phoenix.schema.types.PhoenixArray;
    +import org.apache.phoenix.util.ByteUtil;
    +
    +@BuiltInFunction(name = JsonExtractPathTextFunction.NAME, args = {
    +        @Argument(allowedTypes = { PJsonDataType.class }),
    +        @Argument(allowedTypes = { PVarcharArray.class }) })
    +public class JsonExtractPathTextFunction extends ScalarFunction {
    --- End diff --
    
    Same feedback as with previous built-in


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---