Posted to dev@phoenix.apache.org by nmaillard <gi...@git.apache.org> on 2015/04/23 04:40:39 UTC

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

GitHub user nmaillard opened a pull request:

    https://github.com/apache/phoenix/pull/74

    Phoenix 331- Phoenix-Hive initial commit 

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nmaillard/phoenix PHOENIX-331

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/phoenix/pull/74.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #74
    
----
commit ef53df3f6222e4c6b56258d1825452b5c0977d8f
Author: Nicolas Maillard <nm...@macbook-pro-de-nicolas.local>
Date:   2015-04-23T00:59:21Z

    Phoenix-Hive initial commit
    
    added Phoenix Hive first phase of code

commit 3de9b30e086d29344f3ff9b4e94d865ddd45632f
Author: Nicolas Maillard <nm...@macbook-pro-de-nicolas.local>
Date:   2015-04-23T02:34:52Z

    Adding UTs
    
    First UTs

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29038991
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixOutputCommitter.java ---
    @@ -0,0 +1,65 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.phoenix.mapreduce.PhoenixOutputCommitter;
    +
    +/**
    +* HivePhoenixOutputCommitter
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class HivePhoenixOutputCommitter extends PhoenixOutputCommitter{
    +    public final Log LOG = LogFactory.getLog(HivePhoenixOutputCommitter.class);
    +    
    +    public void commit(Connection connection)
    --- End diff --
    
    You are correct; this is a relic of the first implementation.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by ddraj <gi...@git.apache.org>.
Github user ddraj commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29026147
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordWriter.java ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.PreparedStatement;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.RecordWriter;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +
    +public class HivePhoenixRecordWriter<T extends DBWritable> implements RecordWriter<NullWritable, T> {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixRecordWriter.class);
    +
    +    private long numRecords = 0L;
    +    private final Connection conn;
    +    private final PreparedStatement statement;
    +    private final long batchSize;
    +
    +    public HivePhoenixRecordWriter(Connection conn, Configuration config) throws SQLException {
    +        this.conn = conn;
    +        this.batchSize = PhoenixConfigurationUtil.getBatchSize(config);
    +        String upsertQuery = PhoenixConfigurationUtil.getUpsertStatement(config);
    +        this.statement = this.conn.prepareStatement(upsertQuery);
    +    }
    +
    +    public void write(NullWritable n, T record) throws IOException {
    +        try {
    +            record.write(this.statement);
    +            this.numRecords += 1L;
    +            this.statement.addBatch();
    +
    +            if (this.numRecords % this.batchSize == 0L) {
    +                LOG.info("log commit called on a batch of size : " + this.batchSize);
    +                this.statement.executeBatch();
    +                //this.conn.commit();
    --- End diff --
    
    Did you mean to do this or not? What happens if tasks fail? The data in the table would be partially written, I guess, right?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29030194
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordWriter.java ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.PreparedStatement;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.RecordWriter;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +
    +public class HivePhoenixRecordWriter<T extends DBWritable> implements RecordWriter<NullWritable, T> {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixRecordWriter.class);
    +
    +    private long numRecords = 0L;
    +    private final Connection conn;
    +    private final PreparedStatement statement;
    +    private final long batchSize;
    +
    +    public HivePhoenixRecordWriter(Connection conn, Configuration config) throws SQLException {
    +        this.conn = conn;
    +        this.batchSize = PhoenixConfigurationUtil.getBatchSize(config);
    +        String upsertQuery = PhoenixConfigurationUtil.getUpsertStatement(config);
    +        this.statement = this.conn.prepareStatement(upsertQuery);
    +    }
    +
    +    public void write(NullWritable n, T record) throws IOException {
    +        try {
    +            record.write(this.statement);
    +            this.numRecords += 1L;
    +            this.statement.addBatch();
    +
    +            if (this.numRecords % this.batchSize == 0L) {
    +                LOG.info("log commit called on a batch of size : " + this.batchSize);
    +                this.statement.executeBatch();
    +                //this.conn.commit();
    --- End diff --
    
    You are correct, I did not mean to do this; I am uncommenting it to make sure we commit each batch.
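
    For reference, a minimal sketch of the write path with the commit restored (reusing the fields from the diff above; the SQLException handling is an assumption, since the diff's catch block is not shown here):

        public void write(NullWritable n, T record) throws IOException {
            try {
                record.write(this.statement);
                this.numRecords += 1L;
                this.statement.addBatch();

                if (this.numRecords % this.batchSize == 0L) {
                    LOG.info("commit called on a batch of size : " + this.batchSize);
                    this.statement.executeBatch();
                    // Commit each batch so a task failure loses at most one batch
                    // instead of everything buffered since the last commit.
                    this.conn.commit();
                }
            } catch (SQLException e) {
                throw new IOException("Exception while writing batch to Phoenix", e);
            }
        }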


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023417
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixInputFormat.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.Statement;
    +import java.util.List;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.io.Text;
    +import org.apache.hadoop.mapred.FileSplit;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapred.JobConf;
    +import org.apache.hadoop.mapred.RecordReader;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.JobContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.hive.util.HiveConnectionUtil;
    +import org.apache.phoenix.jdbc.PhoenixStatement;
    +import org.apache.phoenix.mapreduce.*;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.schema.TableRef;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Lists;
    +
    +
    +/**
    +* HivePhoenixInputFormat
    +* Need to extend the standard PhoenixInputFormat but also implement the mapred inputFormat for Hive compliance
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class HivePhoenixInputFormat<T extends DBWritable> extends org.apache.phoenix.mapreduce.PhoenixInputFormat<T> implements org.apache.hadoop.mapred.InputFormat<NullWritable, T>{
    --- End diff --
    
    +1 to avoid code duplication. We can also have the code be different in our different branches if that helps.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r30006906
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordReader.java ---
    @@ -0,0 +1,182 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.base.Throwables;
    +
    +import java.io.IOException;
    +import java.io.PrintStream;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapreduce.TaskAttemptContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.hadoop.util.ReflectionUtils;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.ScanRanges;
    +import org.apache.phoenix.compile.SequenceManager;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.iterate.ResultIterator;
    +import org.apache.phoenix.iterate.SequenceResultIterator;
    +import org.apache.phoenix.iterate.TableResultIterator;
    +import org.apache.phoenix.jdbc.PhoenixResultSet;
    +import org.apache.phoenix.mapreduce.PhoenixRecordReader;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +/**
    +* HivePhoenixRecordReader
    +* implements mapred as well for hive needs
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +
    +public class HivePhoenixRecordReader<T extends DBWritable> extends PhoenixRecordReader<T> implements
    --- End diff --
    
    I agree, but I don't see any code from PhoenixRecordReader used in this class except for the call to super in the constructor. Am I missing anything?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29030806
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordWriter.java ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.PreparedStatement;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.RecordWriter;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +
    +public class HivePhoenixRecordWriter<T extends DBWritable> implements RecordWriter<NullWritable, T> {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixRecordWriter.class);
    +
    +    private long numRecords = 0L;
    +    private final Connection conn;
    +    private final PreparedStatement statement;
    +    private final long batchSize;
    +
    +    public HivePhoenixRecordWriter(Connection conn, Configuration config) throws SQLException {
    +        this.conn = conn;
    +        this.batchSize = PhoenixConfigurationUtil.getBatchSize(config);
    +        String upsertQuery = PhoenixConfigurationUtil.getUpsertStatement(config);
    +        this.statement = this.conn.prepareStatement(upsertQuery);
    +    }
    +
    +    public void write(NullWritable n, T record) throws IOException {
    +        try {
    +            record.write(this.statement);
    --- End diff --
    
    The type conversion is all done in HiveTypeUtil; HiveType2PdataType does Hive to Phoenix. In PhoenixHiveDBWritable, the write method leverages that information to create the prepared statement. T extends DBWritable and is a PhoenixHiveDBWritable in this instance.
    Some complex types like Array are still missing; that will be follow-up work.
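
    For readers following along, a hedged sketch of the kind of primitive mapping HiveType2PdataType performs (the method name here and the exact mapping are illustrative assumptions, not the PR's code):

        import org.apache.hadoop.hive.serde2.SerDeException;
        import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
        import org.apache.phoenix.schema.types.*;

        // Illustrative only: map a Hive primitive TypeInfo to a Phoenix PDataType.
        static PDataType<?> hiveTypeToPDataType(TypeInfo type) throws SerDeException {
            switch (type.getTypeName()) {
                case "string":  return PVarchar.INSTANCE;
                case "int":     return PInteger.INSTANCE;
                case "bigint":  return PLong.INSTANCE;
                case "double":  return PDouble.INSTANCE;
                case "boolean": return PBoolean.INSTANCE;
                default:
                    throw new SerDeException("Unsupported Hive type: " + type.getTypeName());
            }
        }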


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29259215
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixSerde.java ---
    @@ -0,0 +1,169 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.List;
    +import java.util.Properties;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hive.serde2.SerDe;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.hive.serde2.SerDeStats;
    +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    +import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
    +import org.apache.hadoop.hive.serde2.objectinspector.StructField;
    +import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
    +import org.apache.hadoop.io.Writable;
    +import org.apache.hadoop.mapred.lib.db.DBWritable;
    +import org.apache.phoenix.hive.util.HiveConstants;
    +import org.apache.phoenix.hive.util.HiveTypeUtil;
    +import org.apache.phoenix.schema.types.PDataType;
    +
    +public class PhoenixSerde implements SerDe {
    +    static Log LOG = LogFactory.getLog(PhoenixSerde.class.getName());
    +    private PhoenixHiveDBWritable phrecord;
    +    private List<String> columnNames;
    +    private List<TypeInfo> columnTypes;
    +    private ObjectInspector ObjectInspector;
    +    private int fieldCount;
    +    private List<Object> row;
    +    private List<ObjectInspector> fieldOIs;
    +    
    +    
    +    /**
    +     * This method initializes the Hive SerDe
    +     * incoming hive types.
    +     * @param conf conf job configuration
    +     *  @param tblProps table properties
    +     */
    +    public void initialize(Configuration conf, Properties tblProps) throws SerDeException {
    +        if (conf != null) {
    +            conf.setClass("phoenix.input.class", PhoenixHiveDBWritable.class, DBWritable.class);
    +        }
    +        this.columnNames = Arrays.asList(tblProps.getProperty(HiveConstants.COLUMNS).split(","));
    +        this.columnTypes =
    +                TypeInfoUtils.getTypeInfosFromTypeString(tblProps
    +                        .getProperty(HiveConstants.COLUMNS_TYPES));
    +        LOG.debug("columnNames: " + this.columnNames);
    +        LOG.debug("columnTypes: " + this.columnTypes);
    +        this.fieldCount = this.columnTypes.size();
    +        PDataType[] types = HiveTypeUtil.hiveTypesToSqlTypes(this.columnTypes);
    +        this.phrecord = new PhoenixHiveDBWritable(types);
    +        this.fieldOIs = new ArrayList(this.columnNames.size());
    +
    +        for (TypeInfo typeInfo : this.columnTypes) {
    +            this.fieldOIs.add(TypeInfoUtils
    +                    .getStandardWritableObjectInspectorFromTypeInfo(typeInfo));
    +        }
    +        this.ObjectInspector =
    +                ObjectInspectorFactory.getStandardStructObjectInspector(this.columnNames,
    +                    this.fieldOIs);
    +        this.row = new ArrayList(this.columnNames.size());
    +    }
    +    
    +    
    +    /**
    +     * This Deserializes a result from Phoenix to a Hive result
    +     * @param wr the phoenix writable Object here PhoenixHiveDBWritable
    +     * @return  Object for Hive
    +     */
    +
    +    public Object deserialize(Writable wr) throws SerDeException {
    +        if (!(wr instanceof PhoenixHiveDBWritable)) throw new SerDeException(
    +                "Serialized Object is not of type PhoenixHiveDBWritable");
    +        try {
    +            this.row.clear();
    +            PhoenixHiveDBWritable phdbw = (PhoenixHiveDBWritable) wr;
    +            for (int i = 0; i < this.fieldCount; i++) {
    +                Object value = phdbw.get((String) this.columnNames.get(i));
    +                if (value != null) this.row.add(HiveTypeUtil.SQLType2Writable(
    +                    ((TypeInfo) this.columnTypes.get(i)).getTypeName(), value));
    +                else {
    +                    this.row.add(null);
    +                }
    +            }
    +            return this.row;
    +        } catch (Exception e) {
    +            e.printStackTrace();
    +            throw new SerDeException(e.getCause());
    +        }
    +    }
    +
    +    public ObjectInspector getObjectInspector() throws SerDeException {
    +        return this.ObjectInspector;
    +    }
    +
    +    public SerDeStats getSerDeStats() {
    +        return null;
    +    }
    +
    +    /**
    +     * This is a getter for the  serialized class to use with this SerDE
    +     * @return  The class PhoenixHiveDBWritable
    +     */
    +    
    +    public Class<? extends Writable> getSerializedClass() {
    +        return PhoenixHiveDBWritable.class;
    +    }
    +
    +    
    +    /**
    +     * This serializes a Hive row to a Phoenix entry
    +     * incoming hive types.
    +     * @param row Hive row
    +     * @param inspector inspector for the Hive row
    +     */
    +    
    +    public Writable serialize(Object row, ObjectInspector inspector) throws SerDeException {
    +        final StructObjectInspector structInspector = (StructObjectInspector) inspector;
    +        final List<? extends StructField> fields = structInspector.getAllStructFieldRefs();
    +
    +        if (fields.size() != fieldCount) {
    +            throw new SerDeException(String.format("Required %d columns, received %d.", fieldCount,
    +                fields.size()));
    +        }
    +        phrecord.clear();
    +        for (int i = 0; i < fieldCount; i++) {
    +            StructField structField = fields.get(i);
    +            if (structField != null) {
    +                Object field = structInspector.getStructFieldData(row, structField);
    +                ObjectInspector fieldOI = structField.getFieldObjectInspector();
    +                switch (fieldOI.getCategory()) {
    +                case PRIMITIVE:
    +                    Writable value =
    +                            (Writable) ((PrimitiveObjectInspector) fieldOI)
    +                                    .getPrimitiveWritableObject(field);
    +                    phrecord.add(value);
    --- End diff --
    
    This is data coming from Hive; it writes out the writable value to the PhoenixRecordObject. That object will serialize it to a Phoenix format at write time, in its write method.
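
    To make that flow concrete, a hypothetical sketch of the write-time binding step (the field names and conversion calls are assumptions about the PR's writable class, not its actual code):

        // Hypothetical: values gathered by serialize() are bound to the upsert
        // statement using the Phoenix types computed at initialization.
        public void write(PreparedStatement statement) throws SQLException {
            for (int i = 0; i < values.size(); i++) {
                Object value = values.get(i);
                if (value == null) {
                    statement.setNull(i + 1, types[i].getSqlType());
                } else {
                    statement.setObject(i + 1, types[i].toObject(value.toString()), types[i].getSqlType());
                }
            }
        }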


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023350
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixInputFormat.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.Statement;
    +import java.util.List;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.io.Text;
    +import org.apache.hadoop.mapred.FileSplit;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapred.JobConf;
    +import org.apache.hadoop.mapred.RecordReader;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.JobContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.hive.util.HiveConnectionUtil;
    +import org.apache.phoenix.jdbc.PhoenixStatement;
    +import org.apache.phoenix.mapreduce.*;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.schema.TableRef;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Lists;
    +
    +
    +/**
    +* HivePhoenixInputFormat
    +* Need to extend the standard PhoenixInputFormat but also implement the mapred inputFormat for Hive compliance
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class HivePhoenixInputFormat<T extends DBWritable> extends org.apache.phoenix.mapreduce.PhoenixInputFormat<T> implements org.apache.hadoop.mapred.InputFormat<NullWritable, T>{
    --- End diff --
    
    How about the idea that we have an abstract BasePhoenixInputFormat that provides the implementation for both the old and the new mapreduce APIs, and exposes an abstract method to convert the KeyRange to a PhoenixInputSplit or HivePhoenixInputSplit? On the other hand, if we wanted to completely avoid support for the mapred API, we could push the code in getQueryPlan() and part of generateSplits into a Util class; that way we avoid a lot of code duplication.
    
    @ddraj, @JamesRTaylor: what is your recommendation?
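
    A minimal sketch of the first option (BasePhoenixInputFormat and the toSplit hook are proposals, not existing Phoenix classes):

        import java.util.List;

        import org.apache.hadoop.mapreduce.lib.db.DBWritable;
        import org.apache.phoenix.query.KeyRange;

        import com.google.common.collect.Lists;

        // Shared split generation lives here once; each subclass only decides
        // how to wrap a KeyRange in its API's split type (S).
        public abstract class BasePhoenixInputFormat<T extends DBWritable, S> {

            protected List<S> generateSplits(List<KeyRange> keyRanges) {
                List<S> splits = Lists.newArrayListWithExpectedSize(keyRanges.size());
                for (KeyRange range : keyRanges) {
                    splits.add(toSplit(range));
                }
                return splits;
            }

            // The mapreduce subclass returns a PhoenixInputSplit; the mapred/Hive
            // subclass returns a HivePhoenixInputSplit.
            protected abstract S toSplit(KeyRange keyRange);
        }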


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by ddraj <gi...@git.apache.org>.
Github user ddraj commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29026068
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordWriter.java ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.PreparedStatement;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.RecordWriter;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +
    +public class HivePhoenixRecordWriter<T extends DBWritable> implements RecordWriter<NullWritable, T> {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixRecordWriter.class);
    +
    +    private long numRecords = 0L;
    +    private final Connection conn;
    +    private final PreparedStatement statement;
    +    private final long batchSize;
    +
    +    public HivePhoenixRecordWriter(Connection conn, Configuration config) throws SQLException {
    +        this.conn = conn;
    +        this.batchSize = PhoenixConfigurationUtil.getBatchSize(config);
    +        String upsertQuery = PhoenixConfigurationUtil.getUpsertStatement(config);
    +        this.statement = this.conn.prepareStatement(upsertQuery);
    +    }
    +
    +    public void write(NullWritable n, T record) throws IOException {
    +        try {
    +            record.write(this.statement);
    --- End diff --
    
    Who does the datatype conversion from Hive to Phoenix?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on the pull request:

    https://github.com/apache/phoenix/pull/74#issuecomment-100843474
  
    I have made the changes per the first round of feedback; let me know if there is anything else.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29030493
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java ---
    @@ -0,0 +1,210 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Splitter;
    +import com.google.common.base.Splitter.MapSplitter;
    +
    +import java.sql.Connection;
    +import java.sql.SQLException;
    +import java.util.LinkedHashMap;
    +import java.util.Map;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hive.metastore.HiveMetaHook;
    +import org.apache.hadoop.hive.metastore.TableType;
    +import org.apache.hadoop.hive.metastore.api.FieldSchema;
    +import org.apache.hadoop.hive.metastore.api.MetaException;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.phoenix.hive.util.HiveConnectionUtil;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +import org.apache.phoenix.hive.util.HiveTypeUtil;
    +import org.apache.phoenix.hive.util.PhoenixUtil;
    +
    +/**
    +* PhoenixMetaHook
    +* This class captures all create and delete Hive queries and passes them to phoenix
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class PhoenixMetaHook
    +  implements HiveMetaHook
    +{
    +  static Log LOG = LogFactory.getLog(PhoenixMetaHook.class.getName());
    +
    +  /**
    +   *commitCreateTable creates a Phoenix table after the hive table has been created 
    +   * incoming hive types.
    +   * @param tbl the table properties
    +   * 
    +   */
    +  //Too much logic in this function must revisit and dispatch
    +  public void commitCreateTable(Table tbl)
    --- End diff --
    
    Actually the idea came from the use of the ES connector; I thought it made life a lot easier when creating the 80% of simple tables. Otherwise you need to go to the two systems and create each table; nothing hard, it just adds work for no real value. In a second iteration I wanted to add some more logic here: extract the Phoenix table name from the Hive table name, and extract the PK from the first column or from the partition columns.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on the pull request:

    https://github.com/apache/phoenix/pull/74#issuecomment-100526094
  
    This is looking good, @nmaillard. There is still some copy/paste code that we should aim to clean up. Also, take care that no exceptions are swallowed: wrap checked exceptions in a SQLException if it's declared on the method, and otherwise wrap them in a RuntimeException and throw it. Also, make sure to remove or convert all System.out.println calls to logging instead.
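
    A brief sketch of the exception rule being asked for (the class and method names are illustrative, not from the PR):

        import java.io.IOException;
        import java.sql.SQLException;

        class ExceptionHandlingSketch {
            // The method already declares SQLException: wrap the checked exception in it.
            void loadDriver() throws SQLException {
                try {
                    Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
                } catch (ClassNotFoundException e) {
                    throw new SQLException(e); // never swallow with printStackTrace()
                }
            }

            // No suitable checked exception declared: wrap in an unchecked one.
            void readResource() {
                try {
                    mayFail();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }

            private void mayFail() throws IOException {}
        }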


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29994945
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/PhoenixUtil.java ---
    @@ -0,0 +1,163 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DatabaseMetaData;
    +import java.sql.ResultSet;
    +import java.sql.SQLException;
    +import java.util.ArrayList;
    +import java.util.Iterator;
    +import java.util.LinkedHashMap;
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hive.metastore.api.MetaException;
    +
    +import com.google.common.base.Joiner;
    +import com.google.common.base.Preconditions;
    +
    +public class PhoenixUtil {
    +    static Log LOG = LogFactory.getLog(PhoenixUtil.class.getName());
    +
    +    public static boolean createTable(Connection conn, String TableName,
    +            Map<String, String> fields, String[] pks, boolean addIfNotExists, int salt_buckets,
    +            String compression,int versions_num) throws SQLException, MetaException {
    +        Preconditions.checkNotNull(conn);
    +        if (pks == null || pks.length == 0) {
    +            throw new SQLException("Phoenix Table no Rowkeys specified in "
    +                    + HiveConfigurationUtil.PHOENIX_ROWKEYS);
    +        }
    +        for (String pk : pks) {
    +            String val = fields.get(pk.toLowerCase());
    +            if (val == null) {
    +                throw new MetaException("Phoenix Table rowkey " + pk
    +                        + " does not belong to listed fields ");
    +            }
    +            val += " not null";
    +            fields.put(pk, val);
    +        }
    +
    +        StringBuffer query = new StringBuffer("CREATE TABLE ");
    +        if (addIfNotExists) {
    +            query.append("IF NOT EXISTS ");
    +        }
    +        query.append(TableName + " ( ");
    +        Joiner.MapJoiner mapJoiner = Joiner.on(',').withKeyValueSeparator(" ");
    +        query.append(" " + mapJoiner.join(fields));
    +        if (pks != null && pks.length > 0) {
    +            query.append(" CONSTRAINT pk PRIMARY KEY (");
    +            Joiner joiner = Joiner.on(" , ");
    +            query.append(" " + joiner.join(pks) + " )");
    +        }
    +        query.append(" )");
    +        if (salt_buckets > 0) {
    +            query.append(" SALT_BUCKETS = " + salt_buckets);
    +        }
    +        if (compression != null) {
    +            query.append(" ,COMPRESSION='GZ'");
    --- End diff --
    
    Is there a reason that GZ is the default compression? Why not just let Phoenix default it? Same for VERSIONS below.
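
    One way to let Phoenix default these, sketched as an illustration (appendTableOptions is a hypothetical helper, not code from the PR):

        // Append table options only when explicitly requested, so the Phoenix/HBase
        // defaults apply otherwise; use the passed value rather than hardcoding 'GZ'.
        private static void appendTableOptions(StringBuilder query, String compression, int versionsNum) {
            if (compression != null && !compression.isEmpty()) {
                query.append(" ,COMPRESSION='").append(compression).append("'");
            }
            if (versionsNum > 0) {
                query.append(" ,VERSIONS=").append(versionsNum);
            }
        }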


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29994905
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixStorageHandler.java ---
    @@ -0,0 +1,147 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation
    + *
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + *distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you maynot use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicablelaw or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.sql.SQLException;
    +import java.util.Iterator;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hive.metastore.HiveMetaHook;
    +import org.apache.hadoop.hive.ql.metadata.DefaultStorageHandler;
    +import org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler;
    +import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
    +import org.apache.hadoop.hive.ql.plan.TableDesc;
    +import org.apache.hadoop.hive.serde.Constants;
    +import org.apache.hadoop.hive.serde.serdeConstants;
    +import org.apache.hadoop.hive.serde2.Deserializer;
    +import org.apache.hadoop.hive.serde2.SerDe;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.mapred.InputFormat;
    +import org.apache.hadoop.mapred.JobConf;
    +import org.apache.hadoop.mapred.OutputFormat;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +
    +
    +/**
    +* PhoenixStorageHandler
    +* This class manages all the Phoenix/Hive table initial configurations and SerDe Election
    +*/
    +
    +public class PhoenixStorageHandler extends DefaultStorageHandler implements
    +        HiveStoragePredicateHandler {
    +    static Log LOG = LogFactory.getLog(PhoenixStorageHandler.class.getName());
    +
    +    private Configuration conf = null;
    +
    +    public PhoenixStorageHandler() {
    +    }
    +
    +    @Override
    +    public Configuration getConf() {
    +        return conf;
    +    }
    +
    +    @Override
    +    public void setConf(Configuration conf) {
    +        this.conf = conf;
    +    }
    +
    +    @Override
    +    public HiveMetaHook getMetaHook() {
    +        return new PhoenixMetaHook();
    +    }
    +
    +    @Override
    +    public void configureInputJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
    +        configureJobProperties(tableDesc, jobProperties);
    +    }
    +
    +    @Override
    +    public void
    +            configureOutputJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
    +        configureJobProperties(tableDesc, jobProperties);
    +    }
    +
    +    @Override
    +    public void configureTableJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
    +        configureJobProperties(tableDesc, jobProperties);
    +    }
    +
    +    /**
    +     * Extract all job properties to configure this job 
    +     * parameter tableDesc tabledescription, jobProperties
    +     * TODO this avoids any pushdown must revisit
    +     */
    +    private void configureJobProperties(TableDesc tableDesc, Map<String, String> jobProperties)
    +    {
    +      Properties tblProps = tableDesc.getProperties();
    +      tblProps.getProperty("phoenix.hbase.table.name");
    +      HiveConfigurationUtil.setProperties(tblProps, jobProperties);
    +
    +      //TODO this avoids any pushdown must revisit and extract meaningful parts
    --- End diff --
    
    Can you elaborate on this? What is this code trying to do?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29994928
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveConnectionUtil.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation
    + *
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + *distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you maynot use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicablelaw or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.SQLException;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.phoenix.util.PhoenixRuntime;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * Describe your class here.
    + *
    + * @since 138
    + */
    +public final class HiveConnectionUtil {
    +    private static final Log LOG = LogFactory.getLog(HiveConnectionUtil.class);
    +    private static Connection connection = null;
    +    
    +    /**
    +     * Returns the {#link Connection} from Configuration
    +     * @param configuration
    +     * @return
    +     * @throws SQLException
    +     */
    +    public static Connection getConnection(final Configuration configuration) throws SQLException {
    +        Preconditions.checkNotNull(configuration);
    +        if (connection == null) {
    +            final Properties props = new Properties();
    +            String quorum =
    +                    configuration
    +                            .get(HiveConfigurationUtil.ZOOKEEPER_QUORUM != null ? HiveConfigurationUtil.ZOOKEEPER_QUORUM
    +                                    : configuration.get(HConstants.ZOOKEEPER_QUORUM));
    +            String znode =
    +                    configuration
    +                            .get(HiveConfigurationUtil.ZOOKEEPER_PARENT != null ? HiveConfigurationUtil.ZOOKEEPER_PARENT
    +                                    : configuration.get(HConstants.ZOOKEEPER_ZNODE_PARENT));
    +            String port =
    +                    configuration
    +                            .get(HiveConfigurationUtil.ZOOKEEPER_PORT != null ? HiveConfigurationUtil.ZOOKEEPER_PORT
    +                                    : configuration.get(HConstants.ZOOKEEPER_CLIENT_PORT));
    +            if (!znode.startsWith("/")) {
    +                znode = "/" + znode;
    +            }
    +
    +            try {
    +                // Not necessary shoud pick it up
    +                Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
    +
    +                LOG.info("Connection info: " + PhoenixRuntime.JDBC_PROTOCOL
    +                        + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + quorum
    +                        + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + port
    +                        + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + znode);
    +
    +                final Connection conn =
    +                        DriverManager.getConnection(PhoenixRuntime.JDBC_PROTOCOL
    +                                + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + quorum
    +                                + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + port
    +                                + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + znode);
    +                String autocommit = configuration.get(HiveConfigurationUtil.AUTOCOMMIT);
    +                if (autocommit != null && autocommit.equalsIgnoreCase("true")) {
    +                    conn.setAutoCommit(true);
    +                } else {
    +                    conn.setAutoCommit(false);
    +                }
    +                connection = conn;
    +            } catch (ClassNotFoundException e) {
    +                // TODO Auto-generated catch block
    +                e.printStackTrace();
    +            }
    +        }
    +        return connection;
    +    }
    +    
    +    /**
    +     * Returns the {#link Connection} from Configuration
    +     * @param configuration
    +     * @return
    +     * @throws SQLException
    +     */
    +    //TODO redundant
    +    public static Connection getConnection(final Table tbl) throws SQLException {
    +        Preconditions.checkNotNull(tbl);
    +        Map<String, String> tblParams = tbl.getParameters();
    +        String quorum =
    +                tblParams.get(HiveConfigurationUtil.ZOOKEEPER_QUORUM) != null ? tblParams.get(
    +                    HiveConfigurationUtil.ZOOKEEPER_QUORUM).trim()
    +                        : HiveConfigurationUtil.ZOOKEEPER_QUORUM_DEFAULT;
    +        String port =
    +                tblParams.get(HiveConfigurationUtil.ZOOKEEPER_PORT) != null ? tblParams.get(
    +                    HiveConfigurationUtil.ZOOKEEPER_PORT).trim()
    +                        : HiveConfigurationUtil.ZOOKEEPER_PORT_DEFAULT;
    +        String znode =
    +                tblParams.get(HiveConfigurationUtil.ZOOKEEPER_PARENT) != null ? tblParams.get(
    +                    HiveConfigurationUtil.ZOOKEEPER_PARENT).trim()
    +                        : HiveConfigurationUtil.ZOOKEEPER_PARENT_DEFAULT;
    +        if (!znode.startsWith("/")) {
    +            znode = "/" + znode;
    +        }
    +        try {
    +            Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
    +            final Connection conn =
    +                    DriverManager.getConnection((PhoenixRuntime.JDBC_PROTOCOL
    +                            + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + quorum
    +                            + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + port
    +                            + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + znode));
    +            String autocommit = tblParams.get(HiveConfigurationUtil.AUTOCOMMIT);
    +            if (autocommit != null && autocommit.equalsIgnoreCase("true")) {
    +                conn.setAutoCommit(true);
    +            } else {
    +                conn.setAutoCommit(false);
    +            }
    +            
    +            return conn;
    +        } catch (ClassNotFoundException e) {
    +            e.printStackTrace();
    --- End diff --
    
    Here too. Please make sure there aren't other occurrences of swallowing exceptions. Just throw a SQLException that wraps the ClassNotFoundException.
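    
    A minimal sketch of that fix, wrapping instead of swallowing (the message text is illustrative):
    
        try {
            Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
        } catch (ClassNotFoundException e) {
            // surface the failure to the caller instead of printing the stack trace
            throw new SQLException("Phoenix JDBC driver not found on the classpath", e);
        }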


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29261031
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveTypeUtil.java ---
    @@ -0,0 +1,151 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation Licensed to the Apache Software Foundation (ASF)
    + * under one or more contributor license agreements. See the NOTICE file distributed with this work
    + * for additional information regarding copyright ownership. The ASF licenses this file to you under
    + * the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + * http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in
    + * writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
    + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific
    + * language governing permissions and limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Date;
    +import java.sql.Timestamp;
    +import java.util.List;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hive.common.type.HiveChar;
    +import org.apache.hadoop.hive.common.type.HiveVarchar;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.hive.serde2.io.DateWritable;
    +import org.apache.hadoop.hive.serde2.io.DoubleWritable;
    +import org.apache.hadoop.hive.serde2.io.HiveCharWritable;
    +import org.apache.hadoop.hive.serde2.io.HiveVarcharWritable;
    +import org.apache.hadoop.hive.serde2.io.ShortWritable;
    +import org.apache.hadoop.hive.serde2.io.TimestampWritable;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.hadoop.io.BooleanWritable;
    +import org.apache.hadoop.io.FloatWritable;
    +import org.apache.hadoop.io.IntWritable;
    +import org.apache.hadoop.io.LongWritable;
    +import org.apache.hadoop.io.Text;
    +import org.apache.hadoop.io.Writable;
    +import org.apache.phoenix.schema.types.PBinary;
    +import org.apache.phoenix.schema.types.PBoolean;
    +import org.apache.phoenix.schema.types.PChar;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PDate;
    +import org.apache.phoenix.schema.types.PDouble;
    +import org.apache.phoenix.schema.types.PFloat;
    +import org.apache.phoenix.schema.types.PInteger;
    +import org.apache.phoenix.schema.types.PLong;
    +import org.apache.phoenix.schema.types.PSmallint;
    +import org.apache.phoenix.schema.types.PTime;
    +import org.apache.phoenix.schema.types.PTimestamp;
    +import org.apache.phoenix.schema.types.PVarchar;
    +
    +public class HiveTypeUtil {
    +    private static final Log LOG = LogFactory.getLog(HiveTypeUtil.class);
    +
    +    private HiveTypeUtil() {
    +    }
    +
    +    /**
    +     * This method returns an array of the most appropriate PDataTypes associated with a list
    +     * of incoming Hive types.
    +     * @param columnTypes the list of Hive TypeInfo
    +     * @return an array of PDataType
    +     */
    +    public static PDataType[] hiveTypesToSqlTypes(List<TypeInfo> columnTypes) throws SerDeException {
    +        final PDataType[] result = new PDataType[columnTypes.size()];
    +        for (int i = 0; i < columnTypes.size(); i++) {
    +            result[i] = HiveType2PDataType(columnTypes.get(i));
    +        }
    +        return result;
    +    }
    +
    +    /**
    +     * This method returns the most appropriate PDataType associated with the incoming primitive
    +     * hive type.
    +     * @param hiveType
    +     * @return PDataType
    +     */
    +    public static PDataType HiveType2PDataType(TypeInfo hiveType) throws SerDeException {
    +        switch (hiveType.getCategory()) {
    +        /* Integrate Complex types like Array */
    +        case PRIMITIVE:
    +            return HiveType2PDataType(hiveType.getTypeName());
    +        default:
    +            throw new SerDeException("Phoenix unsupported column type: "
    +                    + hiveType.getCategory().name());
    +        }
    +    }
    +
    +    /**
    +     * This method returns the most appropriate PDataType associated with the incoming hive type
    +     * name.
    +     * @param hiveType
    +     * @return PDataType
    +     */
    +    public static PDataType HiveType2PDataType(String hiveType) throws SerDeException {
    +        final String lctype = hiveType.toLowerCase();
    +        if ("string".equals(lctype)) {
    --- End diff --
    
    yes, correct
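    
    For context, a sketch of the name-based branch being discussed; the cases beyond
    "string" are illustrative and not a transcription of the patch:
    
        public static PDataType HiveType2PDataType(String hiveType) throws SerDeException {
            final String lctype = hiveType.toLowerCase();
            if ("string".equals(lctype)) {
                return PVarchar.INSTANCE;   // Hive string maps naturally to Phoenix VARCHAR
            } else if ("int".equals(lctype)) {
                return PInteger.INSTANCE;
            } else if ("bigint".equals(lctype)) {
                return PLong.INSTANCE;
            }
            throw new SerDeException("Phoenix unsupported column type: " + hiveType);
        }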


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r30007221
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/PhoenixHiveConfiguration.java ---
    @@ -0,0 +1,103 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.SQLException;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.phoenix.jdbc.PhoenixConnection;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.util.QueryUtil;
    +
    +import com.google.common.base.Preconditions;
    +
    +public class PhoenixHiveConfiguration {
    +    private static final Log LOG = LogFactory.getLog(PhoenixHiveConfiguration.class);
    +    private PhoenixHiveConfigurationUtil util;
    +    private final Configuration conf = null;
    +
    +    private String Quorum = HiveConfigurationUtil.ZOOKEEPER_QUORUM_DEFAULT;
    +    private String Port = HiveConfigurationUtil.ZOOKEEPER_PORT_DEFAULT;
    +    private String Parent = HiveConfigurationUtil.ZOOKEEPER_PARENT_DEFAULT;
    +    private String TableName;
    +    private String DbName;
    +    private long BatchSize = PhoenixConfigurationUtil.DEFAULT_UPSERT_BATCH_SIZE;
    +
    +    public PhoenixHiveConfiguration(Configuration conf) {
    +        // this.conf = conf;
    +        this.util = new PhoenixHiveConfigurationUtil();
    +    }
    +
    +    public PhoenixHiveConfiguration(Table tbl) {
    +        Map<String, String> mps = tbl.getParameters();
    +        String quorum =
    +                mps.get(HiveConfigurationUtil.ZOOKEEPER_QUORUM) != null ? mps
    +                        .get(HiveConfigurationUtil.ZOOKEEPER_QUORUM)
    +                        : HiveConfigurationUtil.ZOOKEEPER_QUORUM_DEFAULT;
    +        String port =
    +                mps.get(HiveConfigurationUtil.ZOOKEEPER_PORT) != null ? mps
    +                        .get(HiveConfigurationUtil.ZOOKEEPER_PORT)
    +                        : HiveConfigurationUtil.ZOOKEEPER_PORT_DEFAULT;
    +        String parent =
    +                mps.get(HiveConfigurationUtil.ZOOKEEPER_PARENT) != null ? mps
    +                        .get(HiveConfigurationUtil.ZOOKEEPER_PARENT)
    +                        : HiveConfigurationUtil.ZOOKEEPER_PARENT_DEFAULT;
    +        String pk = mps.get(HiveConfigurationUtil.PHOENIX_ROWKEYS);
    +        if (!parent.startsWith("/")) {
    +            parent = "/" + parent;
    +        }
    +        String tablename =
    +                (mps.get(HiveConfigurationUtil.TABLE_NAME) != null) ? mps
    +                        .get(HiveConfigurationUtil.TABLE_NAME) : tbl.getTableName();
    +
    +        String mapping = mps.get(HiveConfigurationUtil.COLUMN_MAPPING);
    +    }
    +
    +    public void configure(String server, String tableName, long batchSize) {
    +        // configure(server, tableName, batchSize, null);
    +    }
    +
    +    public void configure(String quorum, String port, String parent, long batchSize,
    --- End diff --
    
    Are we using this configure method from any code?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on the pull request:

    https://github.com/apache/phoenix/pull/74#issuecomment-100526103
  
    @mravi  - would you mind giving this a review?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29994860
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordReader.java ---
    @@ -0,0 +1,148 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.base.Throwables;
    +
    +import java.io.IOException;
    +import java.io.PrintStream;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapreduce.TaskAttemptContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.hadoop.util.ReflectionUtils;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.ScanRanges;
    +import org.apache.phoenix.compile.SequenceManager;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.iterate.ResultIterator;
    +import org.apache.phoenix.iterate.SequenceResultIterator;
    +import org.apache.phoenix.iterate.TableResultIterator;
    +import org.apache.phoenix.jdbc.PhoenixResultSet;
    +import org.apache.phoenix.mapreduce.PhoenixRecordReader;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +/**
    +* HivePhoenixRecordReader
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +
    +public class HivePhoenixRecordReader<T extends DBWritable> extends PhoenixRecordReader<T> implements
    +        org.apache.hadoop.mapred.RecordReader<NullWritable, T> {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixRecordReader.class);
    +    private final Configuration  configuration;
    +    private final QueryPlan queryPlan;
    +    private NullWritable key =  NullWritable.get();
    +    private T value = null;
    +    private Class<T> inputClass;
    +    private ResultIterator resultIterator = null;
    +    private PhoenixResultSet resultSet;
    +
    +
    +    public HivePhoenixRecordReader(Class<T> inputClass, Configuration configuration, QueryPlan queryPlan) {
    +        super(inputClass,configuration,queryPlan);
    +        Preconditions.checkNotNull(configuration);
    +        Preconditions.checkNotNull(queryPlan);
    +        this.inputClass = inputClass;
    +        this.configuration = configuration;
    +        this.queryPlan = queryPlan;
    +    }
    +
    +    public float getProgress() {
    +        return 0.0F;
    +    }
    +
    +    public void init(InputSplit split) throws IOException,
    --- End diff --
    
    Why is this code repeated, as it's in the base MR integration already?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by ddraj <gi...@git.apache.org>.
Github user ddraj commented on the pull request:

    https://github.com/apache/phoenix/pull/74#issuecomment-101949365
  
    Ping @JamesRTaylor @mravi... Any more inputs for @nmaillard? Thanks!


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023409
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixOutputCommitter.java ---
    @@ -0,0 +1,65 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.phoenix.mapreduce.PhoenixOutputCommitter;
    +
    +/**
    +* HivePhoenixOutputCommitter
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class HivePhoenixOutputCommitter extends PhoenixOutputCommitter{
    +    public final Log LOG = LogFactory.getLog(HivePhoenixOutputCommitter.class);
    +    
    +    public void commit(Connection connection)
    --- End diff --
    
    Are we using this class anywhere? I don't see any reference.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by ddraj <gi...@git.apache.org>.
Github user ddraj commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29018370
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordReader.java ---
    @@ -0,0 +1,182 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.base.Throwables;
    +
    +import java.io.IOException;
    +import java.io.PrintStream;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapreduce.TaskAttemptContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.hadoop.util.ReflectionUtils;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.ScanRanges;
    +import org.apache.phoenix.compile.SequenceManager;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.iterate.ResultIterator;
    +import org.apache.phoenix.iterate.SequenceResultIterator;
    +import org.apache.phoenix.iterate.TableResultIterator;
    +import org.apache.phoenix.jdbc.PhoenixResultSet;
    +import org.apache.phoenix.mapreduce.PhoenixRecordReader;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +/**
    +* HivePhoenixRecordReader
    +* implements the mapred API as well, for Hive's needs
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +
    +public class HivePhoenixRecordReader<T extends DBWritable> extends PhoenixRecordReader<T> implements
    +        org.apache.hadoop.mapred.RecordReader<NullWritable, T> {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixRecordReader.class);
    +    private final Configuration  configuration;
    +    private final QueryPlan queryPlan;
    +    private NullWritable key =  NullWritable.get();
    +    private T value = null;
    +    private Class<T> inputClass;
    +    private ResultIterator resultIterator = null;
    +    private PhoenixResultSet resultSet;
    +
    +
    +    public HivePhoenixRecordReader(Class<T> inputClass, Configuration configuration, QueryPlan queryPlan) {
    +        super(inputClass,configuration,queryPlan);
    +        Preconditions.checkNotNull(configuration);
    +        Preconditions.checkNotNull(queryPlan);
    +        this.inputClass = inputClass;
    +        this.configuration = configuration;
    +        this.queryPlan = queryPlan;
    +    }
    +
    +
    +   /* public NullWritable getCurrentKey() throws IOException, InterruptedException {
    --- End diff --
    
    Please remove all commented out code


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r30006953
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixStorageHandler.java ---
    @@ -0,0 +1,147 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation
    + *
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.sql.SQLException;
    +import java.util.Iterator;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hive.metastore.HiveMetaHook;
    +import org.apache.hadoop.hive.ql.metadata.DefaultStorageHandler;
    +import org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler;
    +import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
    +import org.apache.hadoop.hive.ql.plan.TableDesc;
    +import org.apache.hadoop.hive.serde.Constants;
    +import org.apache.hadoop.hive.serde.serdeConstants;
    +import org.apache.hadoop.hive.serde2.Deserializer;
    +import org.apache.hadoop.hive.serde2.SerDe;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.mapred.InputFormat;
    +import org.apache.hadoop.mapred.JobConf;
    +import org.apache.hadoop.mapred.OutputFormat;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +
    +
    +/**
    +* PhoenixStorageHandler
    +* This class manages all the Phoenix/Hive table initial configuration and SerDe selection
    +*/
    +
    +public class PhoenixStorageHandler extends DefaultStorageHandler implements
    +        HiveStoragePredicateHandler {
    +    static Log LOG = LogFactory.getLog(PhoenixStorageHandler.class.getName());
    +
    +    private Configuration conf = null;
    +
    +    public PhoenixStorageHandler() {
    +    }
    +
    +    @Override
    +    public Configuration getConf() {
    +        return conf;
    +    }
    +
    +    @Override
    +    public void setConf(Configuration conf) {
    +        this.conf = conf;
    +    }
    +
    +    @Override
    +    public HiveMetaHook getMetaHook() {
    +        return new PhoenixMetaHook();
    +    }
    +
    +    @Override
    +    public void configureInputJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
    +        configureJobProperties(tableDesc, jobProperties);
    +    }
    +
    +    @Override
    +    public void
    +            configureOutputJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
    +        configureJobProperties(tableDesc, jobProperties);
    +    }
    +
    +    @Override
    +    public void configureTableJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
    +        configureJobProperties(tableDesc, jobProperties);
    +    }
    +
    +    /**
    +     * Extracts all job properties needed to configure this job.
    +     * @param tableDesc the table description
    +     * @param jobProperties the job properties to populate
    +     * TODO this avoids any pushdown; must revisit
    +     */
    +    private void configureJobProperties(TableDesc tableDesc, Map<String, String> jobProperties)
    +    {
    +      Properties tblProps = tableDesc.getProperties();
    +      HiveConfigurationUtil.setProperties(tblProps, jobProperties);
    +
    +      //TODO this avoids any pushdown; must revisit and extract meaningful parts
    +      jobProperties.put("phoenix.select.stmt", "select * from " + jobProperties.get("phoenix.hbase.table.name"));
    +      LOG.debug("ConfigurationUtil.SELECT_STATEMENT " + jobProperties.get("phoenix.select.stmt"));
    +    }
    +    
    +    /**
    --- End diff --
    
    Can you provide an implementation of HiveStorageHandler#configureJobConf and have HBaseConfiguration passed to the JobConf. This way, the InputFormat classes and RecordReaders have the necessary HBase configurations. 
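    
    A minimal sketch of that suggestion (assuming HBaseConfiguration from the HBase client
    API; the merge policy is illustrative):
    
        @Override
        public void configureJobConf(TableDesc tableDesc, JobConf jobConf) {
            // Merge HBase client settings (quorum, port, znode parent, ...) into the
            // job so the InputFormat classes and RecordReaders can reach the cluster.
            Configuration hbaseConf = HBaseConfiguration.create();
            for (Map.Entry<String, String> entry : hbaseConf) {
                if (jobConf.get(entry.getKey()) == null) {
                    jobConf.set(entry.getKey(), entry.getValue());
                }
            }
        }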


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023418
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixOutputFormat.java ---
    @@ -0,0 +1,81 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.JobConf;
    +import org.apache.hadoop.mapred.RecordWriter;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.hadoop.util.Progressable;
    +import org.apache.phoenix.hive.util.HiveConnectionUtil;
    +
    +/**
    +* HivePhoenixOutputFormat
    +* Extends the standard PhoenixOutputFormat but also implements the mapred OutputFormat for Hive compliance
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +
    +public class HivePhoenixOutputFormat<T extends DBWritable> extends org.apache.phoenix.mapreduce.PhoenixOutputFormat<T> implements
    +org.apache.hadoop.mapred.OutputFormat<NullWritable, T> {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixOutputFormat.class);
    +    private Connection connection;
    +    private Configuration config;
    +
    +    public RecordWriter<NullWritable, T> getRecordWriter(FileSystem ignored, JobConf job,
    +            String name, Progressable progress) throws IOException {
    +        try {
    +            return new HivePhoenixRecordWriter(getConnection(job), job);
    +        } catch (SQLException e) {
    +            throw new IOException(e);
    +        }
    +    }
    +
    +    public void checkOutputSpecs(FileSystem ignored, JobConf job) throws IOException {
    +        LOG.debug("checkOutputSpecs");
    +        
    +    }
    +    
    +    synchronized Connection getConnection(Configuration configuration) throws IOException {
    --- End diff --
    
    Let's push this to HivePhoenixRecordWriter so we can initialize a connection and close it in that class itself.
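    
    A sketch of that shape, with the writer owning the connection lifecycle (constructor
    signature and write body are illustrative):
    
        public class HivePhoenixRecordWriter<T extends DBWritable>
                implements RecordWriter<NullWritable, T> {
            private final Connection conn;
    
            public HivePhoenixRecordWriter(JobConf job) throws SQLException {
                this.conn = HiveConnectionUtil.getConnection(job); // writer opens its own connection
            }
    
            public void write(NullWritable key, T record) throws IOException {
                // upsert the record through conn (omitted)
            }
    
            public void close(Reporter reporter) throws IOException {
                try {
                    conn.commit();  // flush pending upserts ...
                    conn.close();   // ... and release the connection here, not in the OutputFormat
                } catch (SQLException e) {
                    throw new IOException(e);
                }
            }
        }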


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023563
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/PhoenixHiveConfiguration.java ---
    @@ -0,0 +1,106 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.SQLException;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.phoenix.jdbc.PhoenixConnection;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.util.QueryUtil;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * Created by nmaillard on 6/23/14.
    + */
    +public class PhoenixHiveConfiguration {
    +    private static final Log LOG = LogFactory.getLog(PhoenixHiveConfiguration.class);
    --- End diff --
    
    I don't see any reference to this class in the code.  Are we using it ?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on the pull request:

    https://github.com/apache/phoenix/pull/74#issuecomment-95799980
  
    Thanks a lot for the patch @nmaillard. A couple of additional minor fixes:
    1) A couple of classes have your name as the author. Please remove it.
    2) Certain classes have a lot of commented code.
    
    



---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on the pull request:

    https://github.com/apache/phoenix/pull/74#issuecomment-102078845
  
    @ddraj - I left feedback and it was never responded to or implemented.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29044634
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveConnectionUtil.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation
    + *
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.SQLException;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.phoenix.util.PhoenixRuntime;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * Describe your class here.
    + *
    + * @since 138
    + */
    +public final class HiveConnectionUtil {
    --- End diff --
    
    Yes, very true. However, the existing getConnection currently does not allow specifying another znode and always uses /hbase; in my test I had /hbase-unsecure. That being said, I should maybe fix that instead of extending.
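    
    For reference, the JDBC URL shape in question; host names and znode are illustrative:
    
        // jdbc:phoenix:<quorum>:<port>:<znode parent>
        Connection conn = DriverManager.getConnection(
                "jdbc:phoenix:zk1.example.com,zk2.example.com:2181:/hbase-unsecure");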


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r30007211
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/PhoenixHiveConfiguration.java ---
    @@ -0,0 +1,103 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.SQLException;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.phoenix.jdbc.PhoenixConnection;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.util.QueryUtil;
    +
    +import com.google.common.base.Preconditions;
    +
    +public class PhoenixHiveConfiguration {
    +    private static final Log LOG = LogFactory.getLog(PhoenixHiveConfiguration.class);
    +    private PhoenixHiveConfigurationUtil util;
    +    private final Configuration conf = null;
    +
    +    private String Quorum = HiveConfigurationUtil.ZOOKEEPER_QUORUM_DEFAULT;
    +    private String Port = HiveConfigurationUtil.ZOOKEEPER_PORT_DEFAULT;
    +    private String Parent = HiveConfigurationUtil.ZOOKEEPER_PARENT_DEFAULT;
    +    private String TableName;
    +    private String DbName;
    +    private long BatchSize = PhoenixConfigurationUtil.DEFAULT_UPSERT_BATCH_SIZE;
    +
    +    public PhoenixHiveConfiguration(Configuration conf) {
    +        // this.conf = conf;
    +        this.util = new PhoenixHiveConfigurationUtil();
    +    }
    +
    +    public PhoenixHiveConfiguration(Table tbl) {
    +        Map<String, String> mps = tbl.getParameters();
    +        String quorum =
    +                mps.get(HiveConfigurationUtil.ZOOKEEPER_QUORUM) != null ? mps
    +                        .get(HiveConfigurationUtil.ZOOKEEPER_QUORUM)
    +                        : HiveConfigurationUtil.ZOOKEEPER_QUORUM_DEFAULT;
    +        String port =
    +                mps.get(HiveConfigurationUtil.ZOOKEEPER_PORT) != null ? mps
    +                        .get(HiveConfigurationUtil.ZOOKEEPER_PORT)
    +                        : HiveConfigurationUtil.ZOOKEEPER_PORT_DEFAULT;
    +        String parent =
    +                mps.get(HiveConfigurationUtil.ZOOKEEPER_PARENT) != null ? mps
    +                        .get(HiveConfigurationUtil.ZOOKEEPER_PARENT)
    +                        : HiveConfigurationUtil.ZOOKEEPER_PARENT_DEFAULT;
    +        String pk = mps.get(HiveConfigurationUtil.PHOENIX_ROWKEYS);
    +        if (!parent.startsWith("/")) {
    +            parent = "/" + parent;
    +        }
    +        String tablename =
    +                (mps.get(HiveConfigurationUtil.TABLE_NAME) != null) ? mps
    +                        .get(HiveConfigurationUtil.TABLE_NAME) : tbl.getTableName();
    +
    +        String mapping = mps.get(HiveConfigurationUtil.COLUMN_MAPPING);
    +    }
    +
    +    public void configure(String server, String tableName, long batchSize) {
    +        // configure(server, tableName, batchSize, null);
    +    }
    +
    +    public void configure(String quorum, String port, String parent, long batchSize,
    +            String tableName, String dbname, String columns) {
    +        Quorum = quorum != null ? quorum : HiveConfigurationUtil.ZOOKEEPER_QUORUM_DEFAULT;
    +        Port = port != null ? port : HiveConfigurationUtil.ZOOKEEPER_PORT_DEFAULT;
    +        Parent = parent != null ? parent : HiveConfigurationUtil.ZOOKEEPER_PARENT_DEFAULT;
    +        // BatchSize = batchSize!=null?batchSize:ConfigurationUtil.DEFAULT_UPSERT_BATCH_SIZE;
    +    }
    +
    +    static class PhoenixHiveConfigurationUtil {
    +
    +        public Connection getConnection(final Configuration configuration) throws SQLException {
    --- End diff --
    
    Can we remove this? I don't see any caller to this method.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023412
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixOutputFormat.java ---
    @@ -0,0 +1,81 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.JobConf;
    +import org.apache.hadoop.mapred.RecordWriter;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.hadoop.util.Progressable;
    +import org.apache.phoenix.hive.util.HiveConnectionUtil;
    +
    +/**
    +* HivePhoenixOutputFormat
    +* Extends the standard PhoenixOutputFormat but also implements the mapred OutputFormat for Hive compliance
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +
    +public class HivePhoenixOutputFormat<T extends DBWritable> extends org.apache.phoenix.mapreduce.PhoenixOutputFormat<T> implements
    +org.apache.hadoop.mapred.OutputFormat<NullWritable, T> {
    --- End diff --
    
    We can avoid extending PhoenixOutputFormat here as it isn't necessary.
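    
    A sketch of the mapred-only shape suggested here, reusing the writer construction
    already shown in the diff:
    
        public class HivePhoenixOutputFormat<T extends DBWritable>
                implements org.apache.hadoop.mapred.OutputFormat<NullWritable, T> {
    
            public RecordWriter<NullWritable, T> getRecordWriter(FileSystem ignored, JobConf job,
                    String name, Progressable progress) throws IOException {
                try {
                    return new HivePhoenixRecordWriter(HiveConnectionUtil.getConnection(job), job);
                } catch (SQLException e) {
                    throw new IOException(e);
                }
            }
    
            public void checkOutputSpecs(FileSystem ignored, JobConf job) throws IOException {
                // no up-front validation needed
            }
        }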


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29994951
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/PhoenixUtil.java ---
    @@ -0,0 +1,163 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DatabaseMetaData;
    +import java.sql.ResultSet;
    +import java.sql.SQLException;
    +import java.util.ArrayList;
    +import java.util.Iterator;
    +import java.util.LinkedHashMap;
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hive.metastore.api.MetaException;
    +
    +import com.google.common.base.Joiner;
    +import com.google.common.base.Preconditions;
    +
    +public class PhoenixUtil {
    +    static Log LOG = LogFactory.getLog(PhoenixUtil.class.getName());
    +
    +    public static boolean createTable(Connection conn, String TableName,
    +            Map<String, String> fields, String[] pks, boolean addIfNotExists, int salt_buckets,
    +            String compression,int versions_num) throws SQLException, MetaException {
    +        Preconditions.checkNotNull(conn);
    +        if (pks == null || pks.length == 0) {
    +            throw new SQLException("Phoenix Table no Rowkeys specified in "
    +                    + HiveConfigurationUtil.PHOENIX_ROWKEYS);
    +        }
    +        for (String pk : pks) {
    +            String val = fields.get(pk.toLowerCase());
    +            if (val == null) {
    +                throw new MetaException("Phoenix Table rowkey " + pk
    +                        + " does not belong to listed fields ");
    +            }
    +            val += " not null";
    +            fields.put(pk, val);
    +        }
    +
    +        StringBuffer query = new StringBuffer("CREATE TABLE ");
    +        if (addIfNotExists) {
    +            query.append("IF NOT EXISTS ");
    +        }
    +        query.append(TableName + " ( ");
    +        Joiner.MapJoiner mapJoiner = Joiner.on(',').withKeyValueSeparator(" ");
    +        query.append(" " + mapJoiner.join(fields));
    +        if (pks != null && pks.length > 0) {
    +            query.append(" CONSTRAINT pk PRIMARY KEY (");
    +            Joiner joiner = Joiner.on(" , ");
    +            query.append(" " + joiner.join(pks) + " )");
    +        }
    +        query.append(" )");
    +        if (salt_buckets > 0) {
    +            query.append(" SALT_BUCKETS = " + salt_buckets);
    +        }
    +        if (compression != null) {
    +            query.append(" ,COMPRESSION='" + compression + "'");
    +        }
    +        if (versions_num > 0) {
    +            query.append(" ,VERSIONS="+versions_num);
    +        }
    +        System.out.println("CREATED QUERY " +query.toString());
    --- End diff --
    
    Remove all System.out.println calls and replace with logging instead.
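    
    For example, the statement above could become:
    
        if (LOG.isDebugEnabled()) {
            LOG.debug("CREATE TABLE statement: " + query.toString());
        }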


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023545
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java ---
    @@ -0,0 +1,275 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.schema.types.PBinary;
    +import org.apache.phoenix.schema.types.PBoolean;
    +import org.apache.phoenix.schema.types.PChar;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PDate;
    +import org.apache.phoenix.schema.types.PDecimal;
    +import org.apache.phoenix.schema.types.PDouble;
    +import org.apache.phoenix.schema.types.PFloat;
    +import org.apache.phoenix.schema.types.PInteger;
    +import org.apache.phoenix.schema.types.PLong;
    +import org.apache.phoenix.schema.types.PSmallint;
    +import org.apache.phoenix.schema.types.PTime;
    +import org.apache.phoenix.schema.types.PTimestamp;
    +import org.apache.phoenix.schema.types.PTinyint;
    +import org.apache.phoenix.schema.types.PVarchar;
    +import org.apache.phoenix.util.PhoenixRuntime;
    +
    +/**
    + *
    + */
    +public class HiveConfigurationUtil {
    +    static Log LOG = LogFactory.getLog(HiveConfigurationUtil.class.getName());
    +
    +    public static final String TABLE_NAME = "phoenix.hbase.table.name";
    +    public static final String ZOOKEEPER_QUORUM = "phoenix.zookeeper.quorum";
    +    public static final String ZOOKEEPER_PORT = "phoenix.zookeeper.client.port";
    +    public static final String ZOOKEEPER_PARENT = "phoenix.zookeeper.znode.parent";
    +    public static final String ZOOKEEPER_QUORUM_DEFAULT = "localhost";
    +    public static final String ZOOKEEPER_PORT_DEFAULT = "2181";
    +    public static final String ZOOKEEPER_PARENT_DEFAULT = "/hbase-unsecure";
    +
    +    public static final String COLUMN_MAPPING = "phoenix.column.mapping";
    +    public static final String AUTOCREATE = "autocreate";
    +    public static final String AUTODROP = "autodrop";
    +    public static final String AUTOCOMMIT = "autocommit";
    +    public static final String PHOENIX_ROWKEYS = "phoenix.rowkeys";
    +    public static final String SALT_BUCKETS = "saltbuckets";
    +    public static final String COMPRESSION = "compression";
    +    public static final String VERSIONS = "versions";
    +    public static final int VERSIONS_NUM = 5;
    +    public static final String SPLIT = "split";
    +    public static final String REDUCE_SPECULATIVE_EXEC =
    +            "mapred.reduce.tasks.speculative.execution";
    +    public static final String MAP_SPECULATIVE_EXEC = "mapred.map.tasks.speculative.execution";
    +
    +    public static void setProperties(Properties tblProps, Map<String, String> jobProperties) {
    +        String quorum = tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_QUORUM) != null ?
    +                        tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_QUORUM) :
    +                        HiveConfigurationUtil.ZOOKEEPER_QUORUM_DEFAULT;
    +        String znode = tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_PARENT) != null ?
    +                        tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_PARENT) :
    +                        HiveConfigurationUtil.ZOOKEEPER_PARENT_DEFAULT;
    +        String port = tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_PORT) != null ?
    +                        tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_PORT) :
    +                        HiveConfigurationUtil.ZOOKEEPER_PORT_DEFAULT;
    +        if (!znode.startsWith("/")) {
    +            znode = "/" + znode;
    +        }
    +        LOG.debug("quorum:" + quorum);
    +        LOG.debug("port:" + port);
    +        LOG.debug("parent:" +znode);
    +        LOG.debug("table:" + tblProps.getProperty(HiveConfigurationUtil.TABLE_NAME));
    +        LOG.debug("batch:" + tblProps.getProperty(PhoenixConfigurationUtil.UPSERT_BATCH_SIZE));
    +        
    +        jobProperties.put(HiveConfigurationUtil.ZOOKEEPER_QUORUM, quorum);
    +        jobProperties.put(HiveConfigurationUtil.ZOOKEEPER_PORT, port);
    +        jobProperties.put(HiveConfigurationUtil.ZOOKEEPER_PARENT, znode);
    +        String tableName = tblProps.getProperty(HiveConfigurationUtil.TABLE_NAME);
    +        if (tableName == null) {
    +            tableName = tblProps.get("name").toString();
    +            tableName = tableName.split(".")[1];
    +        }
    +        // TODO this is synch with common Phoenix mechanism revisit to make wiser decisions
    +        jobProperties.put(HConstants.ZOOKEEPER_QUORUM,quorum+PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR
    +                + port + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR+znode);
    +        jobProperties.put(HiveConfigurationUtil.TABLE_NAME, tableName);
    +        // TODO this is synch with common Phoenix mechanism revisit to make wiser decisions
    +        jobProperties.put(PhoenixConfigurationUtil.OUTPUT_TABLE_NAME, tableName);
    +        jobProperties.put(PhoenixConfigurationUtil.INPUT_TABLE_NAME, tableName);
    +    }
    +
    +    public static PDataType[] hiveTypesToPDataType(
    +            PrimitiveObjectInspector.PrimitiveCategory[] hiveTypes) throws SerDeException {
    +        final PDataType[] result = new PDataType[hiveTypes.length];
    +        for (int i = 0; i < hiveTypes.length; i++) {
    +            result[i] = hiveTypeToPDataType(hiveTypes[i]);
    +        }
    +        return result;
    +    }
    +
    +    public static PDataType[] hiveTypesToSqlTypes(List<TypeInfo> columnTypes) throws SerDeException {
    +        final PDataType[] result = new PDataType[columnTypes.size()];
    +        for (int i = 0; i < columnTypes.size(); i++) {
    +            result[i] = hiveTypeToPDataType(columnTypes.get(i));
    +        }
    +        return result;
    +    }
    +
    +    /**
    +     * This method returns the most appropriate PDataType associated with the incoming hive type.
    +     * @param hiveType
    +     * @return PDataType
    +     */
    +
    +    public static PDataType hiveTypeToPDataType(TypeInfo hiveType) throws SerDeException {
    +        switch (hiveType.getCategory()) {
    +        case PRIMITIVE:
    +            // return
    +            // hiveTypeToPDataType(((PrimitiveObjectInspector)hiveType).getPrimitiveCategory());
    +            return hiveTypeToPDataType(hiveType.getTypeName());
    +        default:
    +
    +            throw new SerDeException("Unsupported column type: " + hiveType.getCategory().name());
    +        }
    +    }
    +
    +    public static PDataType hiveTypeToPDataType(String hiveType) throws SerDeException {
    +        final String lctype = hiveType.toLowerCase();
    +        if ("string".equals(lctype)) {
    +            return PVarchar.INSTANCE;
    +        } else if ("float".equals(lctype)) {
    +            return PFloat.INSTANCE;
    +        } else if ("double".equals(lctype)) {
    +            return PDouble.INSTANCE;
    +        } else if ("boolean".equals(lctype)) {
    +            return PBoolean.INSTANCE;
    +        } else if ("tinyint".equals(lctype)) {
    +            return PTinyint.INSTANCE;
    +        } else if ("smallint".equals(lctype)) {
    +            return PSmallint.INSTANCE;
    +        } else if ("int".equals(lctype)) {
    +            return PInteger.INSTANCE;
    +        } else if ("bigint".equals(lctype)) {
    +            return PLong.INSTANCE;
    +        } else if ("timestamp".equals(lctype)) {
    +            return PTimestamp.INSTANCE;
    +        } else if ("binary".equals(lctype)) {
    +            return PBinary.INSTANCE;
    +        } else if ("date".equals(lctype)) {
    +            return PDate.INSTANCE;
    +        }
    +
    +        throw new SerDeException("Unrecognized column type: " + hiveType);
    +    }
    +
    +    /**
    +     * This method returns the most appropriate PDataType associated with the incoming hive type.
    +     * @param hiveType
    +     * @return PDataType
    +     */
    +
    +    public static PDataType hiveTypeToPDataType(PrimitiveObjectInspector.PrimitiveCategory hiveType)
    +            throws SerDeException {
    +        /* TODO check backward type compatibility prior to hive 0.12 */
    +
    +        if (hiveType == null) {
    +            return null;
    +        }
    +        switch (hiveType) {
    +        case BOOLEAN:
    +            return PBoolean.INSTANCE;
    +        case BYTE:
    +            return PBinary.INSTANCE;
    +        case DATE:
    +            return PDate.INSTANCE;
    +        case DECIMAL:
    +            return PDecimal.INSTANCE;
    +        case DOUBLE:
    +            return PDouble.INSTANCE;
    +        case FLOAT:
    +            return PFloat.INSTANCE;
    +        case INT:
    +            return PInteger.INSTANCE;
    +        case LONG:
    +            return PLong.INSTANCE;
    +        case SHORT:
    +            return PSmallint.INSTANCE;
    +        case STRING:
    +            return PVarchar.INSTANCE;
    +        case TIMESTAMP:
    +            return PTimestamp.INSTANCE;
    +        case VARCHAR:
    +            return PVarchar.INSTANCE;
    +        case VOID:
    +            return PChar.INSTANCE;
    +        case UNKNOWN:
    +            throw new RuntimeException("Unknown primitive");
    +        default:
    +            new SerDeException("Unrecognized column type: " + hiveType);
    +        }
    +        return null;
    +    }
    +
    +    /**
    +     * This method encodes a value with Phoenix data type. It begins with checking whether an object
    +     * is BINARY and makes a call to {@link #castBytes(Object, PDataType)} to convert bytes to
    +     * targetPhoenixType
    +     * @param o
    +     * @param targetPhoenixType
    +     * @return Object
    +     */
    +    public static Object castHiveTypeToPhoenix(Object o, byte objectType,
    +            PDataType targetPhoenixType) {
    +        /*
    +         * PDataType inferredPType = getType(o, objectType); if (inferredPType == null) { return
    --- End diff --
    
    Please remove the commented-out code.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023552
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveConnectionUtil.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation
    + *
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + *distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you maynot use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicablelaw or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.SQLException;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.phoenix.util.PhoenixRuntime;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * Describe your class here.
    + *
    + * @since 138
    + */
    +public final class HiveConnectionUtil {
    --- End diff --
    
    We can avoid this class by reusing the functionality in ConnectionUtil to get connections.
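
    For illustration, a minimal sketch of that reuse, assuming the job Configuration already carries the quorum/port/znode settings (ConnectionUtil.getInputConnection is the phoenix-core helper; the wrapper class here is hypothetical):
    
        import java.sql.Connection;
        import java.sql.SQLException;
        
        import org.apache.hadoop.conf.Configuration;
        import org.apache.phoenix.mapreduce.util.ConnectionUtil;
        
        public class PhoenixHiveConnections {
            // Delegate to the shared phoenix-core helper rather than keeping a
            // Hive-specific static Connection cache, which is unsafe when
            // several tasks share a JVM.
            public static Connection open(Configuration conf) throws SQLException {
                return ConnectionUtil.getInputConnection(conf);
            }
        }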


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023421
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordReader.java ---
    @@ -0,0 +1,182 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.base.Throwables;
    +
    +import java.io.IOException;
    +import java.io.PrintStream;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapreduce.TaskAttemptContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.hadoop.util.ReflectionUtils;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.ScanRanges;
    +import org.apache.phoenix.compile.SequenceManager;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.iterate.ResultIterator;
    +import org.apache.phoenix.iterate.SequenceResultIterator;
    +import org.apache.phoenix.iterate.TableResultIterator;
    +import org.apache.phoenix.jdbc.PhoenixResultSet;
    +import org.apache.phoenix.mapreduce.PhoenixRecordReader;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +/**
    +* HivePhoenixRecordReader
    +* implements mapred as well for hive needs
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +
    +public class HivePhoenixRecordReader<T extends DBWritable> extends PhoenixRecordReader<T> implements
    --- End diff --
    
    Can you please share your thoughts on why we need to extend PhoenixRecordReader here?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29038931
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixInputFormat.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.Statement;
    +import java.util.List;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.io.Text;
    +import org.apache.hadoop.mapred.FileSplit;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapred.JobConf;
    +import org.apache.hadoop.mapred.RecordReader;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.JobContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.hive.util.HiveConnectionUtil;
    +import org.apache.phoenix.jdbc.PhoenixStatement;
    +import org.apache.phoenix.mapreduce.*;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.schema.TableRef;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Lists;
    +
    +
    +/**
    +* HivePhoenixInputFormat
    +* Need to extend the standard PhoenixInputFormat but also implement the mapred inputFormat for Hive compliance
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class HivePhoenixInputFormat<T extends DBWritable> extends org.apache.phoenix.mapreduce.PhoenixInputFormat<T> implements org.apache.hadoop.mapred.InputFormat<NullWritable, T>{
    --- End diff --
    
    +1 to avoiding code duplication as well. I do believe this is really a Hive issue and should be fixed there by no longer using the mapred API; then we could get rid of this whole class, and a couple of others. That may be a longer battle, though, so I'm open to putting this code wherever seems best.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29994876
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveConnectionUtil.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation
    + *
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + *distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you maynot use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicablelaw or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.SQLException;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.phoenix.util.PhoenixRuntime;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * Describe your class here.
    + *
    + * @since 138
    + */
    +public final class HiveConnectionUtil {
    +    private static final Log LOG = LogFactory.getLog(HiveConnectionUtil.class);
    +    private static Connection connection = null;
    +    
    +    /**
    +     * Returns the {#link Connection} from Configuration
    +     * @param configuration
    +     * @return
    +     * @throws SQLException
    +     */
    +    public static Connection getConnection(final Configuration configuration) throws SQLException {
    +        Preconditions.checkNotNull(configuration);
    +        if (connection == null) {
    +            final Properties props = new Properties();
    +            String quorum =
    +                    configuration
    +                            .get(HiveConfigurationUtil.ZOOKEEPER_QUORUM != null ? HiveConfigurationUtil.ZOOKEEPER_QUORUM
    +                                    : configuration.get(HConstants.ZOOKEEPER_QUORUM));
    +            String znode =
    +                    configuration
    +                            .get(HiveConfigurationUtil.ZOOKEEPER_PARENT != null ? HiveConfigurationUtil.ZOOKEEPER_PARENT
    +                                    : configuration.get(HConstants.ZOOKEEPER_ZNODE_PARENT));
    +            String port =
    +                    configuration
    +                            .get(HiveConfigurationUtil.ZOOKEEPER_PORT != null ? HiveConfigurationUtil.ZOOKEEPER_PORT
    +                                    : configuration.get(HConstants.ZOOKEEPER_CLIENT_PORT));
    +            if (!znode.startsWith("/")) {
    +                znode = "/" + znode;
    +            }
    +
    +            try {
    +                // Not necessary shoud pick it up
    +                Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
    +
    +                LOG.info("Connection info: " + PhoenixRuntime.JDBC_PROTOCOL
    +                        + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + quorum
    +                        + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + port
    +                        + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + znode);
    +
    +                final Connection conn =
    +                        DriverManager.getConnection(PhoenixRuntime.JDBC_PROTOCOL
    +                                + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + quorum
    +                                + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + port
    +                                + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + znode);
    +                String autocommit = configuration.get(HiveConfigurationUtil.AUTOCOMMIT);
    +                if (autocommit != null && autocommit.equalsIgnoreCase("true")) {
    +                    conn.setAutoCommit(true);
    +                } else {
    +                    conn.setAutoCommit(false);
    +                }
    +                connection = conn;
    +            } catch (ClassNotFoundException e) {
    +                // TODO Auto-generated catch block
    +                e.printStackTrace();
    --- End diff --
    
    Throw here - don't swallow this.
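
    For example, a sketch of the suggested fix (not the committed code): rethrow as the SQLException that getConnection already declares, instead of printing and continuing:
    
        try {
            Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
        } catch (ClassNotFoundException e) {
            // Surface the failure to the caller instead of swallowing it and
            // continuing with an uninitialized connection.
            throw new SQLException("Phoenix JDBC driver not found on the classpath", e);
        }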


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29994962
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/PhoenixUtil.java ---
    @@ -0,0 +1,163 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DatabaseMetaData;
    +import java.sql.ResultSet;
    +import java.sql.SQLException;
    +import java.util.ArrayList;
    +import java.util.Iterator;
    +import java.util.LinkedHashMap;
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hive.metastore.api.MetaException;
    +
    +import com.google.common.base.Joiner;
    +import com.google.common.base.Preconditions;
    +
    +public class PhoenixUtil {
    +    static Log LOG = LogFactory.getLog(PhoenixUtil.class.getName());
    +
    +    public static boolean createTable(Connection conn, String TableName,
    +            Map<String, String> fields, String[] pks, boolean addIfNotExists, int salt_buckets,
    +            String compression,int versions_num) throws SQLException, MetaException {
    +        Preconditions.checkNotNull(conn);
    +        if (pks == null || pks.length == 0) {
    +            throw new SQLException("Phoenix Table no Rowkeys specified in "
    +                    + HiveConfigurationUtil.PHOENIX_ROWKEYS);
    +        }
    +        for (String pk : pks) {
    +            String val = fields.get(pk.toLowerCase());
    +            if (val == null) {
    +                throw new MetaException("Phoenix Table rowkey " + pk
    +                        + " does not belong to listed fields ");
    +            }
    +            val += " not null";
    +            fields.put(pk, val);
    +        }
    +
    +        StringBuffer query = new StringBuffer("CREATE TABLE ");
    +        if (addIfNotExists) {
    +            query.append("IF NOT EXISTS ");
    +        }
    +        query.append(TableName + " ( ");
    +        Joiner.MapJoiner mapJoiner = Joiner.on(',').withKeyValueSeparator(" ");
    +        query.append(" " + mapJoiner.join(fields));
    +        if (pks != null && pks.length > 0) {
    +            query.append(" CONSTRAINT pk PRIMARY KEY (");
    +            Joiner joiner = Joiner.on(" , ");
    +            query.append(" " + joiner.join(pks) + " )");
    +        }
    +        query.append(" )");
    +        if (salt_buckets > 0) {
    +            query.append(" SALT_BUCKETS = " + salt_buckets);
    +        }
    +        if (compression != null) {
    +            query.append(" ,COMPRESSION='GZ'");
    +        }
    +        if (versions_num > 0) {
    +            query.append(" ,VERSIONS="+versions_num);
    +        }
    +        System.out.println("CREATED QUERY " +query.toString());
    +        LOG.info("Create table query statement " + query.toString());
    +        return createTable(conn, query.toString());
    +    }
    +
    +    public static boolean createTable(Connection conn, String query) throws SQLException {
    +        Preconditions.checkNotNull(conn);
    +        return conn.createStatement().execute(query);
    +    }
    +
    +    public static boolean findTable(Connection conn, String name) throws SQLException {
    +        Preconditions.checkNotNull(conn);
    +        Preconditions.checkNotNull(name);
    +        DatabaseMetaData dbm = conn.getMetaData();
    +        ResultSet rs = dbm.getTables(null, null, name, null);
    +        LOG.info("looking for table");
    +        if (rs.next()) {
    +            LOG.info("found the table " + rs.getString("TABLE_NAME"));
    +            while (rs.next()) {
    +                LOG.info("found the table " + rs.getString("TABLE_NAME"));
    +            }
    +            return true;
    +        }
    +        return false;
    +    }
    +
    +    public static boolean testTable(Connection conn, String name, Map<String, String> fields)
    --- End diff --
    
    Move out any test-specific methods to test util classes instead of having them in runtime classes.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r30007153
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java ---
    @@ -0,0 +1,111 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.schema.types.PBinary;
    +import org.apache.phoenix.schema.types.PBoolean;
    +import org.apache.phoenix.schema.types.PChar;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PDate;
    +import org.apache.phoenix.schema.types.PDecimal;
    +import org.apache.phoenix.schema.types.PDouble;
    +import org.apache.phoenix.schema.types.PFloat;
    +import org.apache.phoenix.schema.types.PInteger;
    +import org.apache.phoenix.schema.types.PLong;
    +import org.apache.phoenix.schema.types.PSmallint;
    +import org.apache.phoenix.schema.types.PTime;
    +import org.apache.phoenix.schema.types.PTimestamp;
    +import org.apache.phoenix.schema.types.PTinyint;
    +import org.apache.phoenix.schema.types.PVarchar;
    +import org.apache.phoenix.util.PhoenixRuntime;
    +
    +/**
    + *
    + */
    +public class HiveConfigurationUtil {
    +    static Log LOG = LogFactory.getLog(HiveConfigurationUtil.class.getName());
    +
    +    public static final String TABLE_NAME = "phoenix.hbase.table.name";
    +    public static final String ZOOKEEPER_QUORUM = "phoenix.zookeeper.quorum";
    +    public static final String ZOOKEEPER_PORT = "phoenix.zookeeper.client.port";
    +    public static final String ZOOKEEPER_PARENT = "phoenix.zookeeper.znode.parent";
    +    public static final String ZOOKEEPER_QUORUM_DEFAULT = "localhost";
    +    public static final String ZOOKEEPER_PORT_DEFAULT = "2181";
    +    public static final String ZOOKEEPER_PARENT_DEFAULT = "/hbase-unsecure";
    +
    +    public static final String COLUMN_MAPPING = "phoenix.column.mapping";
    +    public static final String AUTOCREATE = "autocreate";
    +    public static final String AUTODROP = "autodrop";
    +    public static final String AUTOCOMMIT = "autocommit";
    +    public static final String PHOENIX_ROWKEYS = "phoenix.rowkeys";
    +    public static final String SALT_BUCKETS = "saltbuckets";
    +    public static final String COMPRESSION = "compression";
    +    public static final String VERSIONS = "versions";
    +    public static final int VERSIONS_NUM = 5;
    +    public static final String SPLIT = "split";
    +    public static final String REDUCE_SPECULATIVE_EXEC =
    +            "mapred.reduce.tasks.speculative.execution";
    +    public static final String MAP_SPECULATIVE_EXEC = "mapred.map.tasks.speculative.execution";
    +
    +    public static void setProperties(Properties tblProps, Map<String, String> jobProperties) {
    --- End diff --
    
    Since this method is being called from PhoenixHiveStorageHandler, can you
    a) ensure the HBase configuration is loaded in PhoenixHiveStorageHandler#setConf:
       this.conf = HBaseConfiguration.create(conf);
    b) pass the configuration object to setProperties() from PhoenixHiveStorageHandler#configureJobProperties.
    
    That way, we avoid setting default values.
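
    A sketch of that wiring; the extra Configuration parameter on setProperties is hypothetical (the PR's version takes only tblProps and jobProperties), and HBaseConfiguration.create is the standard HBase helper:
    
        import java.util.Map;
        
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hive.ql.plan.TableDesc;
        import org.apache.phoenix.hive.util.HiveConfigurationUtil;
        
        public class PhoenixHiveStorageHandlerSketch {
            private Configuration conf;
        
            public void setConf(Configuration conf) {
                // (a) layer hbase-site.xml on top of the incoming job configuration
                this.conf = HBaseConfiguration.create(conf);
            }
        
            public void configureJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
                // (b) pass the HBase-aware configuration along so setProperties can
                // read the real quorum/port/znode instead of hard-coded defaults
                // (hypothetical three-argument overload)
                HiveConfigurationUtil.setProperties(this.conf, tableDesc.getProperties(), jobProperties);
            }
        }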



---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r30006897
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveConnectionUtil.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation
    + *
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + *distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you maynot use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicablelaw or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Connection;
    +import java.sql.DriverManager;
    +import java.sql.SQLException;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.phoenix.util.PhoenixRuntime;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +
    +import com.google.common.base.Preconditions;
    +
    +/**
    + * Describe your class here.
    + *
    + * @since 138
    + */
    +public final class HiveConnectionUtil {
    --- End diff --
    
    If you override HiveStorageHandler#configureJobConf in PhoenixStorageHandler and provide a way to add HBase properties to the configuration, I believe we can avoid this class and leverage ConnectionUtil.
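
    A sketch of the override being described, assuming HBaseConfiguration.addHbaseResources pulls in the client properties that ConnectionUtil needs:
    
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hive.ql.plan.TableDesc;
        import org.apache.hadoop.mapred.JobConf;
        
        public class PhoenixStorageHandlerConfSketch {
            // Merging hbase-site.xml into the JobConf lets tasks build their JDBC
            // URL from the standard HBase keys, so the Hive-specific
            // HiveConnectionUtil becomes unnecessary.
            public void configureJobConf(TableDesc tableDesc, JobConf jobConf) {
                HBaseConfiguration.addHbaseResources(jobConf);
            }
        }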


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by ddraj <gi...@git.apache.org>.
Github user ddraj commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29017940
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixInputSplit.java ---
    @@ -0,0 +1,111 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Preconditions;
    +
    +import java.io.DataInput;
    +import java.io.DataOutput;
    +import java.io.IOException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.io.Text;
    +import org.apache.hadoop.mapred.FileSplit;
    +import org.apache.phoenix.query.KeyRange;
    +
    +
    +/**
    +* HivePhoenixInputSplit
    +* Need to extend Mapred for Hive compliance reasons
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class HivePhoenixInputSplit extends FileSplit {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixInputSplit.class);
    +    private KeyRange keyRange;
    +    private Path path;
    +
    +    public HivePhoenixInputSplit() {
    +        super((Path) null, 0, 0, (String[]) null);
    +    }
    +
    +    public HivePhoenixInputSplit(KeyRange keyRange) {
    +        Preconditions.checkNotNull(keyRange);
    +        this.keyRange = keyRange;
    +    }
    +
    +    public HivePhoenixInputSplit(KeyRange keyRange, Path path) {
    +        Preconditions.checkNotNull(keyRange);
    +        Preconditions.checkNotNull(path);
    +        LOG.debug("path: " + path);
    +
    +        this.keyRange = keyRange;
    +        this.path = path;
    +    }
    +
    +    public void readFields(DataInput input) throws IOException {
    +        this.path = new Path(Text.readString(input));
    +        this.keyRange = new KeyRange();
    +        this.keyRange.readFields(input);
    +    }
    +
    +    public void write(DataOutput output) throws IOException {
    +        Preconditions.checkNotNull(this.keyRange);
    +        Text.writeString(output, path.toString());
    +        this.keyRange.write(output);
    +    }
    +
    +    public long getLength() {
    +        return 0L;
    +    }
    +
    +    public String[] getLocations() {
    +        return new String[0];
    +    }
    +
    +    public KeyRange getKeyRange() {
    +        return this.keyRange;
    +    }
    +
    +    @Override
    +    public Path getPath() {
    +        return this.path;
    +    }
    +
    +    public int hashCode() {
    +        int prime = 31;
    +        int result = 1;
    +        result = 31 * result + (this.keyRange == null ? 0 : this.keyRange.hashCode());
    +        return result;
    +    }
    +
    +    public boolean equals(Object obj) {
    --- End diff --
    
    Shouldn't the equals and hashCode take into account the 'path' field?
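
    A sketch of both methods inside HivePhoenixInputSplit with 'path' folded in (note the original hashCode also declares a 'prime' local it never uses):
    
        @Override
        public int hashCode() {
            final int prime = 31;
            int result = 1;
            result = prime * result + (keyRange == null ? 0 : keyRange.hashCode());
            result = prime * result + (path == null ? 0 : path.hashCode());
            return result;
        }
    
        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof HivePhoenixInputSplit)) return false;
            HivePhoenixInputSplit other = (HivePhoenixInputSplit) obj;
            return (keyRange == null ? other.keyRange == null : keyRange.equals(other.keyRange))
                    && (path == null ? other.path == null : path.equals(other.path));
        }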


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29096976
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveTypeUtil.java ---
    @@ -0,0 +1,151 @@
    +/*
    + * Copyright 2010 The Apache Software Foundation Licensed to the Apache Software Foundation (ASF)
    + * under one or more contributor license agreements. See the NOTICE filedistributed with this work
    + * for additional information regarding copyright ownership. The ASF licenses this file to you under
    + * the Apache License, Version 2.0 (the "License"); you maynot use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + * http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicablelaw or agreed to in
    + * writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
    + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific
    + * language governing permissions and limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.sql.Date;
    +import java.sql.Timestamp;
    +import java.util.List;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hive.common.type.HiveChar;
    +import org.apache.hadoop.hive.common.type.HiveVarchar;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.hive.serde2.io.DateWritable;
    +import org.apache.hadoop.hive.serde2.io.DoubleWritable;
    +import org.apache.hadoop.hive.serde2.io.HiveCharWritable;
    +import org.apache.hadoop.hive.serde2.io.HiveVarcharWritable;
    +import org.apache.hadoop.hive.serde2.io.ShortWritable;
    +import org.apache.hadoop.hive.serde2.io.TimestampWritable;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.hadoop.io.BooleanWritable;
    +import org.apache.hadoop.io.FloatWritable;
    +import org.apache.hadoop.io.IntWritable;
    +import org.apache.hadoop.io.LongWritable;
    +import org.apache.hadoop.io.Text;
    +import org.apache.hadoop.io.Writable;
    +import org.apache.phoenix.schema.types.PBinary;
    +import org.apache.phoenix.schema.types.PBoolean;
    +import org.apache.phoenix.schema.types.PChar;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PDate;
    +import org.apache.phoenix.schema.types.PDouble;
    +import org.apache.phoenix.schema.types.PFloat;
    +import org.apache.phoenix.schema.types.PInteger;
    +import org.apache.phoenix.schema.types.PLong;
    +import org.apache.phoenix.schema.types.PSmallint;
    +import org.apache.phoenix.schema.types.PTime;
    +import org.apache.phoenix.schema.types.PTimestamp;
    +import org.apache.phoenix.schema.types.PVarchar;
    +
    +public class HiveTypeUtil {
    +    private static final Log LOG = LogFactory.getLog(HiveTypeUtil.class);
    +
    +    private HiveTypeUtil() {
    +    }
    +
    +    /**
    +     * This method returns an array of most appropriates PDataType associated with a list of
    +     * incoming hive types.
    +     * @param List of TypeInfo
    +     * @return Array PDataType
    +     */
    +    public static PDataType[] hiveTypesToSqlTypes(List<TypeInfo> columnTypes) throws SerDeException {
    +        final PDataType[] result = new PDataType[columnTypes.size()];
    +        for (int i = 0; i < columnTypes.size(); i++) {
    +            result[i] = HiveType2PDataType(columnTypes.get(i));
    +        }
    +        return result;
    +    }
    +
    +    /**
    +     * This method returns the most appropriate PDataType associated with the incoming primitive
    +     * hive type.
    +     * @param hiveType
    +     * @return PDataType
    +     */
    +    public static PDataType HiveType2PDataType(TypeInfo hiveType) throws SerDeException {
    +        switch (hiveType.getCategory()) {
    +        /* Integrate Complex types like Array */
    +        case PRIMITIVE:
    +            return HiveType2PDataType(hiveType.getTypeName());
    +        default:
    +            throw new SerDeException("Phoenix unsupported column type: "
    +                    + hiveType.getCategory().name());
    +        }
    +    }
    +
    +    /**
    +     * This method returns the most appropriate PDataType associated with the incoming hive type
    +     * name.
    +     * @param hiveType
    +     * @return PDataType
    +     */
    +    public static PDataType HiveType2PDataType(String hiveType) throws SerDeException {
    +        final String lctype = hiveType.toLowerCase();
    +        if ("string".equals(lctype)) {
    --- End diff --
    
    Looks the same as the earlier hiveTypeToPDataType(String) in HiveConfigurationUtil?


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by ddraj <gi...@git.apache.org>.
Github user ddraj commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29012278
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixInputFormat.java ---
    @@ -0,0 +1,161 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.Statement;
    +import java.util.List;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.io.Text;
    +import org.apache.hadoop.mapred.FileSplit;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapred.JobConf;
    +import org.apache.hadoop.mapred.RecordReader;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.JobContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.hive.util.HiveConnectionUtil;
    +import org.apache.phoenix.jdbc.PhoenixStatement;
    +import org.apache.phoenix.mapreduce.*;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.schema.TableRef;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Lists;
    +
    +
    +/**
    +* HivePhoenixInputFormat
    +* Need to extend the standard PhoenixInputFormat but also implement the mapred inputFormat for Hive compliance
    +*
    +* @version 1.0
    --- End diff --
    
    Please remove these tags.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on the pull request:

    https://github.com/apache/phoenix/pull/74#issuecomment-102079613
  
    Thanks everyone; sorry, I have had little connectivity the last couple of days. I will address the new feedback right away.


---

[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29096970
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java ---
    @@ -0,0 +1,275 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive.util;
    +
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Properties;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +import org.apache.phoenix.schema.types.PBinary;
    +import org.apache.phoenix.schema.types.PBoolean;
    +import org.apache.phoenix.schema.types.PChar;
    +import org.apache.phoenix.schema.types.PDataType;
    +import org.apache.phoenix.schema.types.PDate;
    +import org.apache.phoenix.schema.types.PDecimal;
    +import org.apache.phoenix.schema.types.PDouble;
    +import org.apache.phoenix.schema.types.PFloat;
    +import org.apache.phoenix.schema.types.PInteger;
    +import org.apache.phoenix.schema.types.PLong;
    +import org.apache.phoenix.schema.types.PSmallint;
    +import org.apache.phoenix.schema.types.PTime;
    +import org.apache.phoenix.schema.types.PTimestamp;
    +import org.apache.phoenix.schema.types.PTinyint;
    +import org.apache.phoenix.schema.types.PVarchar;
    +import org.apache.phoenix.util.PhoenixRuntime;
    +
    +/**
    + *
    + */
    +public class HiveConfigurationUtil {
    +    static Log LOG = LogFactory.getLog(HiveConfigurationUtil.class.getName());
    +
    +    public static final String TABLE_NAME = "phoenix.hbase.table.name";
    +    public static final String ZOOKEEPER_QUORUM = "phoenix.zookeeper.quorum";
    +    public static final String ZOOKEEPER_PORT = "phoenix.zookeeper.client.port";
    +    public static final String ZOOKEEPER_PARENT = "phoenix.zookeeper.znode.parent";
    +    public static final String ZOOKEEPER_QUORUM_DEFAULT = "localhost";
    +    public static final String ZOOKEEPER_PORT_DEFAULT = "2181";
    +    public static final String ZOOKEEPER_PARENT_DEFAULT = "/hbase-unsecure";
    +
    +    public static final String COLUMN_MAPPING = "phoenix.column.mapping";
    +    public static final String AUTOCREATE = "autocreate";
    +    public static final String AUTODROP = "autodrop";
    +    public static final String AUTOCOMMIT = "autocommit";
    +    public static final String PHOENIX_ROWKEYS = "phoenix.rowkeys";
    +    public static final String SALT_BUCKETS = "saltbuckets";
    +    public static final String COMPRESSION = "compression";
    +    public static final String VERSIONS = "versions";
    +    public static final int VERSIONS_NUM = 5;
    +    public static final String SPLIT = "split";
    +    public static final String REDUCE_SPECULATIVE_EXEC =
    +            "mapred.reduce.tasks.speculative.execution";
    +    public static final String MAP_SPECULATIVE_EXEC = "mapred.map.tasks.speculative.execution";
    +
    +    public static void setProperties(Properties tblProps, Map<String, String> jobProperties) {
    +        String quorum = tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_QUORUM) != null ?
    +                        tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_QUORUM) :
    +                        HiveConfigurationUtil.ZOOKEEPER_QUORUM_DEFAULT;
    +        String znode = tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_PARENT) != null ?
    +                        tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_PARENT) :
    +                        HiveConfigurationUtil.ZOOKEEPER_PARENT_DEFAULT;
    +        String port = tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_PORT) != null ?
    +                        tblProps.getProperty(HiveConfigurationUtil.ZOOKEEPER_PORT) :
    +                        HiveConfigurationUtil.ZOOKEEPER_PORT_DEFAULT;
    +        if (!znode.startsWith("/")) {
    +            znode = "/" + znode;
    +        }
    +        LOG.debug("quorum:" + quorum);
    +        LOG.debug("port:" + port);
    +        LOG.debug("parent:" +znode);
    +        LOG.debug("table:" + tblProps.getProperty(HiveConfigurationUtil.TABLE_NAME));
    +        LOG.debug("batch:" + tblProps.getProperty(PhoenixConfigurationUtil.UPSERT_BATCH_SIZE));
    +        
    +        jobProperties.put(HiveConfigurationUtil.ZOOKEEPER_QUORUM, quorum);
    +        jobProperties.put(HiveConfigurationUtil.ZOOKEEPER_PORT, port);
    +        jobProperties.put(HiveConfigurationUtil.ZOOKEEPER_PARENT, znode);
    +        String tableName = tblProps.getProperty(HiveConfigurationUtil.TABLE_NAME);
    +        if (tableName == null) {
    +            tableName = tblProps.get("name").toString();
    +            tableName = tableName.split(".")[1];
    +        }
    +        // TODO this is synch with common Phoenix mechanism revisit to make wiser decisions
    +        jobProperties.put(HConstants.ZOOKEEPER_QUORUM,quorum+PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR
    +                + port + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR+znode);
    +        jobProperties.put(HiveConfigurationUtil.TABLE_NAME, tableName);
    +        // TODO this is synch with common Phoenix mechanism revisit to make wiser decisions
    +        jobProperties.put(PhoenixConfigurationUtil.OUTPUT_TABLE_NAME, tableName);
    +        jobProperties.put(PhoenixConfigurationUtil.INPUT_TABLE_NAME, tableName);
    +    }
    +
    +    public static PDataType[] hiveTypesToPDataType(
    +            PrimitiveObjectInspector.PrimitiveCategory[] hiveTypes) throws SerDeException {
    +        final PDataType[] result = new PDataType[hiveTypes.length];
    +        for (int i = 0; i < hiveTypes.length; i++) {
    +            result[i] = hiveTypeToPDataType(hiveTypes[i]);
    +        }
    +        return result;
    +    }
    +
    +    public static PDataType[] hiveTypesToSqlTypes(List<TypeInfo> columnTypes) throws SerDeException {
    +        final PDataType[] result = new PDataType[columnTypes.size()];
    +        for (int i = 0; i < columnTypes.size(); i++) {
    +            result[i] = hiveTypeToPDataType(columnTypes.get(i));
    +        }
    +        return result;
    +    }
    +
    +    /**
    +     * This method returns the most appropriate PDataType associated with the incoming hive type.
    +     * @param hiveType
    +     * @return PDataType
    +     */
    +
    +    public static PDataType hiveTypeToPDataType(TypeInfo hiveType) throws SerDeException {
    +        switch (hiveType.getCategory()) {
    +        case PRIMITIVE:
    +            // return
    +            // hiveTypeToPDataType(((PrimitiveObjectInspector)hiveType).getPrimitiveCategory());
    +            return hiveTypeToPDataType(hiveType.getTypeName());
    +        default:
    +
    +            throw new SerDeException("Unsupported column type: " + hiveType.getCategory().name());
    +        }
    +    }
    +
    +    public static PDataType hiveTypeToPDataType(String hiveType) throws SerDeException {
    +        final String lctype = hiveType.toLowerCase();
    --- End diff --
    
    Switch statement? Does Hive have constants for these?
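    
    For reference, Hive ships its type names as constants in
    org.apache.hadoop.hive.serde.serdeConstants, so the mapping could be
    table-driven instead of a chain of string comparisons. A minimal sketch,
    assuming a static lookup map (the class name and map contents are
    illustrative, not the full Phoenix type matrix):
    
        package org.apache.phoenix.hive.util;
    
        import java.util.HashMap;
        import java.util.Map;
    
        import org.apache.hadoop.hive.serde.serdeConstants;
        import org.apache.hadoop.hive.serde2.SerDeException;
        import org.apache.phoenix.schema.types.PBoolean;
        import org.apache.phoenix.schema.types.PDataType;
        import org.apache.phoenix.schema.types.PDouble;
        import org.apache.phoenix.schema.types.PInteger;
        import org.apache.phoenix.schema.types.PLong;
        import org.apache.phoenix.schema.types.PVarchar;
    
        // Hypothetical helper: a table-driven replacement for the if/else chain.
        public class HiveTypeMapping {
            private static final Map<String, PDataType> TYPES = new HashMap<String, PDataType>();
            static {
                // Hive's own constants instead of hand-written literals
                TYPES.put(serdeConstants.STRING_TYPE_NAME, PVarchar.INSTANCE);
                TYPES.put(serdeConstants.INT_TYPE_NAME, PInteger.INSTANCE);
                TYPES.put(serdeConstants.BIGINT_TYPE_NAME, PLong.INSTANCE);
                TYPES.put(serdeConstants.DOUBLE_TYPE_NAME, PDouble.INSTANCE);
                TYPES.put(serdeConstants.BOOLEAN_TYPE_NAME, PBoolean.INSTANCE);
            }
    
            public static PDataType hiveTypeToPDataType(String hiveType) throws SerDeException {
                PDataType type = TYPES.get(hiveType.toLowerCase());
                if (type == null) {
                    throw new SerDeException("Unsupported column type: " + hiveType);
                }
                return type;
            }
        }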



[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29096962
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixSerde.java ---
    @@ -0,0 +1,169 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.List;
    +import java.util.Properties;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hive.serde2.SerDe;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.hadoop.hive.serde2.SerDeStats;
    +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    +import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
    +import org.apache.hadoop.hive.serde2.objectinspector.StructField;
    +import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
    +import org.apache.hadoop.io.Writable;
    +import org.apache.hadoop.mapred.lib.db.DBWritable;
    +import org.apache.phoenix.hive.util.HiveConstants;
    +import org.apache.phoenix.hive.util.HiveTypeUtil;
    +import org.apache.phoenix.schema.types.PDataType;
    +
    +public class PhoenixSerde implements SerDe {
    +    static Log LOG = LogFactory.getLog(PhoenixSerde.class.getName());
    +    private PhoenixHiveDBWritable phrecord;
    +    private List<String> columnNames;
    +    private List<TypeInfo> columnTypes;
    +    private ObjectInspector objectInspector;
    +    private int fieldCount;
    +    private List<Object> row;
    +    private List<ObjectInspector> fieldOIs;
    +    
    +    
    +    /**
    +     * Initializes the Hive SerDe.
    +     * @param conf job configuration
    +     * @param tblProps table properties
    +     */
    +    public void initialize(Configuration conf, Properties tblProps) throws SerDeException {
    +        if (conf != null) {
    +            conf.setClass("phoenix.input.class", PhoenixHiveDBWritable.class, DBWritable.class);
    +        }
    +        this.columnNames = Arrays.asList(tblProps.getProperty(HiveConstants.COLUMNS).split(","));
    +        this.columnTypes =
    +                TypeInfoUtils.getTypeInfosFromTypeString(tblProps
    +                        .getProperty(HiveConstants.COLUMNS_TYPES));
    +        LOG.debug("columnNames: " + this.columnNames);
    +        LOG.debug("columnTypes: " + this.columnTypes);
    +        this.fieldCount = this.columnTypes.size();
    +        PDataType[] types = HiveTypeUtil.hiveTypesToSqlTypes(this.columnTypes);
    +        this.phrecord = new PhoenixHiveDBWritable(types);
    +        this.fieldOIs = new ArrayList<ObjectInspector>(this.columnNames.size());
    +
    +        for (TypeInfo typeInfo : this.columnTypes) {
    +            this.fieldOIs.add(TypeInfoUtils
    +                    .getStandardWritableObjectInspectorFromTypeInfo(typeInfo));
    +        }
    +        this.objectInspector =
    +                ObjectInspectorFactory.getStandardStructObjectInspector(this.columnNames,
    +                    this.fieldOIs);
    +        this.row = new ArrayList<Object>(this.columnNames.size());
    +    }
    +    
    +    
    +    /**
    +     * Deserializes a result from Phoenix into a Hive row.
    +     * @param wr the Phoenix writable object, here a PhoenixHiveDBWritable
    +     * @return the row object for Hive
    +     */
    +
    +    public Object deserialize(Writable wr) throws SerDeException {
    +        if (!(wr instanceof PhoenixHiveDBWritable)) throw new SerDeException(
    +                "Serialized Object is not of type PhoenixHiveDBWritable");
    +        try {
    +            this.row.clear();
    +            PhoenixHiveDBWritable phdbw = (PhoenixHiveDBWritable) wr;
    +            for (int i = 0; i < this.fieldCount; i++) {
    +                Object value = phdbw.get((String) this.columnNames.get(i));
    +                if (value != null) {
    +                    this.row.add(HiveTypeUtil.SQLType2Writable(
    +                        ((TypeInfo) this.columnTypes.get(i)).getTypeName(), value));
    +                } else {
    +                    this.row.add(null);
    +                }
    +            }
    +            return this.row;
    +        } catch (Exception e) {
    +            // wrap the exception itself; e.getCause() may be null
    +            throw new SerDeException(e);
    +        }
    +    }
    +
    +    public ObjectInspector getObjectInspector() throws SerDeException {
    +        return this.objectInspector;
    +    }
    +
    +    public SerDeStats getSerDeStats() {
    +        return null;
    +    }
    +
    +    /**
    +     * Returns the serialized class to use with this SerDe.
    +     * @return the class PhoenixHiveDBWritable
    +     */
    +    
    +    public Class<? extends Writable> getSerializedClass() {
    +        return PhoenixHiveDBWritable.class;
    +    }
    +
    +    
    +    /**
    +     * Serializes a Hive row into a Phoenix entry.
    +     * @param row Hive row
    +     * @param inspector inspector for the Hive row
    +     */
    +    
    +    public Writable serialize(Object row, ObjectInspector inspector) throws SerDeException {
    +        final StructObjectInspector structInspector = (StructObjectInspector) inspector;
    +        final List<? extends StructField> fields = structInspector.getAllStructFieldRefs();
    +
    +        if (fields.size() != fieldCount) {
    +            throw new SerDeException(String.format("Required %d columns, received %d.", fieldCount,
    +                fields.size()));
    +        }
    +        phrecord.clear();
    +        for (int i = 0; i < fieldCount; i++) {
    +            StructField structField = fields.get(i);
    +            if (structField != null) {
    +                Object field = structInspector.getStructFieldData(row, structField);
    +                ObjectInspector fieldOI = structField.getFieldObjectInspector();
    +                switch (fieldOI.getCategory()) {
    +                case PRIMITIVE:
    +                    Writable value =
    +                            (Writable) ((PrimitiveObjectInspector) fieldOI)
    +                                    .getPrimitiveWritableObject(field);
    +                    phrecord.add(value);
    --- End diff --
    
    Will this write the data in a Phoenix serialized format? This is for data coming in from Hive?
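    
    If it helps: assuming PhoenixHiveDBWritable behaves like the sketch below
    (the class itself is not in this hunk, so this is an assumption), no Phoenix
    byte encoding happens in the writer. The Writable only buffers the Hive
    values and binds them to the UPSERT PreparedStatement; the Phoenix JDBC
    driver does the actual serialization when the batch executes.
    
        package org.apache.phoenix.hive;
    
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;
        import java.sql.SQLException;
        import java.util.ArrayList;
        import java.util.List;
    
        import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    
        // Hypothetical sketch of the write path of the real PhoenixHiveDBWritable.
        public class UpsertBufferDBWritable implements DBWritable {
            private final List<Object> values = new ArrayList<Object>();
    
            public void add(Object value) {
                values.add(value);
            }
    
            public void clear() {
                values.clear();
            }
    
            public void write(PreparedStatement statement) throws SQLException {
                // Bind each buffered Hive value by position; the Phoenix JDBC
                // driver performs the byte-level encoding on executeBatch().
                for (int i = 0; i < values.size(); i++) {
                    statement.setObject(i + 1, values.get(i));
                }
            }
    
            public void readFields(ResultSet resultSet) throws SQLException {
                // Not used on the write path.
            }
        }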



[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29044389
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordReader.java ---
    @@ -0,0 +1,182 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.base.Throwables;
    +
    +import java.io.IOException;
    +import java.io.PrintStream;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hbase.client.Scan;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.InputSplit;
    +import org.apache.hadoop.mapreduce.TaskAttemptContext;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.hadoop.util.ReflectionUtils;
    +import org.apache.phoenix.compile.QueryPlan;
    +import org.apache.phoenix.compile.ScanRanges;
    +import org.apache.phoenix.compile.SequenceManager;
    +import org.apache.phoenix.compile.StatementContext;
    +import org.apache.phoenix.iterate.ResultIterator;
    +import org.apache.phoenix.iterate.SequenceResultIterator;
    +import org.apache.phoenix.iterate.TableResultIterator;
    +import org.apache.phoenix.jdbc.PhoenixResultSet;
    +import org.apache.phoenix.mapreduce.PhoenixRecordReader;
    +import org.apache.phoenix.query.KeyRange;
    +import org.apache.phoenix.util.ScanUtil;
    +
    +/**
    +* HivePhoenixRecordReader
    +* implements the mapred API as well, for Hive's needs
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +
    +public class HivePhoenixRecordReader<T extends DBWritable> extends PhoenixRecordReader<T> implements
    --- End diff --
    
    I could extend RecordReader directly; the idea here is to reuse as much as possible and only add Hive-specific parts
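    
    The shim is essentially the usual mapred-over-mapreduce adapter. A rough
    sketch of the pattern, with the class and constructor names invented for
    illustration:
    
        package org.apache.phoenix.hive;
    
        import java.io.IOException;
    
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.io.NullWritable;
        import org.apache.hadoop.io.Writable;
        import org.apache.hadoop.util.ReflectionUtils;
    
        // Hypothetical adapter: exposes a new-API (mapreduce) reader through
        // the old mapred interface that Hive drives.
        public class MapredReaderAdapter<T extends Writable> implements
                org.apache.hadoop.mapred.RecordReader<NullWritable, T> {
    
            private final org.apache.hadoop.mapreduce.RecordReader<NullWritable, T> delegate;
            private final Configuration conf;
            private final Class<T> valueClass;
    
            public MapredReaderAdapter(
                    org.apache.hadoop.mapreduce.RecordReader<NullWritable, T> delegate,
                    Configuration conf, Class<T> valueClass) {
                this.delegate = delegate;
                this.conf = conf;
                this.valueClass = valueClass;
            }
    
            public boolean next(NullWritable key, T value) throws IOException {
                try {
                    // mapred fills caller-supplied objects; mapreduce hands back
                    // its own, so the current value is copied across.
                    if (!delegate.nextKeyValue()) {
                        return false;
                    }
                    ReflectionUtils.copy(conf, delegate.getCurrentValue(), value);
                    return true;
                } catch (InterruptedException e) {
                    throw new IOException(e);
                }
            }
    
            public NullWritable createKey() {
                return NullWritable.get();
            }
    
            public T createValue() {
                return ReflectionUtils.newInstance(valueClass, conf);
            }
    
            public long getPos() {
                return 0L;
            }
    
            public float getProgress() throws IOException {
                try {
                    return delegate.getProgress();
                } catch (InterruptedException e) {
                    throw new IOException(e);
                }
            }
    
            public void close() throws IOException {
                delegate.close();
            }
        }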



[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023429
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixRecordWriter.java ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import java.io.IOException;
    +import java.sql.Connection;
    +import java.sql.PreparedStatement;
    +import java.sql.SQLException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.mapred.RecordWriter;
    +import org.apache.hadoop.mapred.Reporter;
    +import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    +import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
    +
    +public class HivePhoenixRecordWriter<T extends DBWritable> implements RecordWriter<NullWritable, T> {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixRecordWriter.class);
    +
    +    private long numRecords = 0L;
    +    private final Connection conn;
    +    private final PreparedStatement statement;
    +    private final long batchSize;
    +
    +    public HivePhoenixRecordWriter(Connection conn, Configuration config) throws SQLException {
    +        this.conn = conn;
    +        this.batchSize = PhoenixConfigurationUtil.getBatchSize(config);
    +        String upsertQuery = PhoenixConfigurationUtil.getUpsertStatement(config);
    +        this.statement = this.conn.prepareStatement(upsertQuery);
    +    }
    +
    +    public void write(NullWritable n, T record) throws IOException {
    +        try {
    +            record.write(this.statement);
    +            this.numRecords += 1L;
    +            this.statement.addBatch();
    +
    +            if (this.numRecords % this.batchSize == 0L) {
    +                LOG.info("log commit called on a batch of size : " + this.batchSize);
    +                this.statement.executeBatch();
    +                //this.conn.commit();
    --- End diff --
    
    Can you please uncomment this line to ensure we commit each batch?
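    
    Something along these lines, reusing the fields already defined in this
    class (sketch only; the close() behavior is an assumption about how the
    trailing partial batch should be flushed):
    
        public void write(NullWritable n, T record) throws IOException {
            try {
                record.write(this.statement);
                this.numRecords += 1L;
                this.statement.addBatch();
                if (this.numRecords % this.batchSize == 0L) {
                    LOG.info("commit called on a batch of size : " + this.batchSize);
                    this.statement.executeBatch();
                    this.conn.commit(); // make each flushed batch durable
                }
            } catch (SQLException e) {
                throw new IOException("Exception while writing to table", e);
            }
        }
    
        public void close(Reporter reporter) throws IOException {
            try {
                // flush and commit any trailing partial batch
                this.statement.executeBatch();
                this.conn.commit();
            } catch (SQLException e) {
                throw new IOException(e);
            } finally {
                try {
                    this.conn.close();
                } catch (SQLException ignored) {
                }
            }
        }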



[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by nmaillard <gi...@git.apache.org>.
Github user nmaillard commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29038603
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/HivePhoenixInputSplit.java ---
    @@ -0,0 +1,111 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Preconditions;
    +
    +import java.io.DataInput;
    +import java.io.DataOutput;
    +import java.io.IOException;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.io.Text;
    +import org.apache.hadoop.mapred.FileSplit;
    +import org.apache.phoenix.query.KeyRange;
    +
    +
    +/**
    +* HivePhoenixInputSplit
    +* Needs to extend mapred's FileSplit for Hive compatibility reasons
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class HivePhoenixInputSplit extends FileSplit {
    +    private static final Log LOG = LogFactory.getLog(HivePhoenixInputSplit.class);
    +    private KeyRange keyRange;
    +    private Path path;
    +
    +    public HivePhoenixInputSplit() {
    +        super((Path) null, 0, 0, (String[]) null);
    +    }
    +
    +    public HivePhoenixInputSplit(KeyRange keyRange) {
    +        Preconditions.checkNotNull(keyRange);
    +        this.keyRange = keyRange;
    +    }
    +
    +    public HivePhoenixInputSplit(KeyRange keyRange, Path path) {
    +        Preconditions.checkNotNull(keyRange);
    +        Preconditions.checkNotNull(path);
    +        LOG.debug("path: " + path);
    +
    +        this.keyRange = keyRange;
    +        this.path = path;
    +    }
    +
    +    public void readFields(DataInput input) throws IOException {
    +        this.path = new Path(Text.readString(input));
    +        this.keyRange = new KeyRange();
    +        this.keyRange.readFields(input);
    +    }
    +
    +    public void write(DataOutput output) throws IOException {
    +        Preconditions.checkNotNull(this.keyRange);
    +        Text.writeString(output, path.toString());
    +        this.keyRange.write(output);
    +    }
    +
    +    public long getLength() {
    +        return 0L;
    +    }
    +
    +    public String[] getLocations() {
    +        return new String[0];
    +    }
    +
    +    public KeyRange getKeyRange() {
    +        return this.keyRange;
    +    }
    +
    +    @Override
    +    public Path getPath() {
    +        return this.path;
    +    }
    +
    +    public int hashCode() {
    +        final int prime = 31;
    +        int result = 1;
    +        result = prime * result + (this.keyRange == null ? 0 : this.keyRange.hashCode());
    +        return result;
    +    }
    +
    +    public boolean equals(Object obj) {
    --- End diff --
    
    The Path part is a workaround: in the storage handler, Hive expects everything to be HDFS file based
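    
    Concretely, each split has to carry some HDFS path or Hive's file-based
    plumbing rejects it. A rough sketch of how getSplits might stamp the table
    location onto every split (the method name and surrounding code are
    invented for illustration):
    
        import java.util.List;
    
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.mapred.FileInputFormat;
        import org.apache.hadoop.mapred.InputSplit;
        import org.apache.hadoop.mapred.JobConf;
        import org.apache.phoenix.query.KeyRange;
    
        // Hypothetical fragment: wrap each Phoenix KeyRange with the Hive
        // table's warehouse directory, even though no file is ever read.
        public InputSplit[] wrapKeyRanges(JobConf job, List<KeyRange> ranges) {
            Path tableDir = FileInputFormat.getInputPaths(job)[0];
            InputSplit[] splits = new InputSplit[ranges.size()];
            for (int i = 0; i < ranges.size(); i++) {
                splits[i] = new HivePhoenixInputSplit(ranges.get(i), tableDir);
            }
            return splits;
        }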



[GitHub] phoenix pull request: Phoenix 331- Phoenix-Hive initial commit

Posted by mravi <gi...@git.apache.org>.
Github user mravi commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/74#discussion_r29023526
  
    --- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java ---
    @@ -0,0 +1,210 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.phoenix.hive;
    +
    +import com.google.common.base.Splitter;
    +import com.google.common.base.Splitter.MapSplitter;
    +
    +import java.sql.Connection;
    +import java.sql.SQLException;
    +import java.util.LinkedHashMap;
    +import java.util.Map;
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.hive.metastore.HiveMetaHook;
    +import org.apache.hadoop.hive.metastore.TableType;
    +import org.apache.hadoop.hive.metastore.api.FieldSchema;
    +import org.apache.hadoop.hive.metastore.api.MetaException;
    +import org.apache.hadoop.hive.metastore.api.Table;
    +import org.apache.hadoop.hive.serde2.SerDeException;
    +import org.apache.phoenix.hive.util.HiveConnectionUtil;
    +import org.apache.phoenix.hive.util.HiveConfigurationUtil;
    +import org.apache.phoenix.hive.util.HiveTypeUtil;
    +import org.apache.phoenix.hive.util.PhoenixUtil;
    +
    +/**
    +* PhoenixMetaHook
    +* This class captures all create and delete Hive queries and passes them to phoenix
    +*
    +* @version 1.0
    +* @since   2015-02-08 
    +*/
    +
    +public class PhoenixMetaHook implements HiveMetaHook {
    +  static Log LOG = LogFactory.getLog(PhoenixMetaHook.class.getName());
    +
    +  /**
    +   * commitCreateTable creates a Phoenix table after the Hive table has been created.
    +   * @param tbl the table properties
    +   * 
    +   */
    +  // Too much logic in this function; must revisit and dispatch
    +  public void commitCreateTable(Table tbl)
    --- End diff --
    
    Should we support creating Phoenix tables from Hive? My vote would be to restrict the code to validating the existence of the Phoenix table.
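    
    A validation-only hook could stay on plain JDBC metadata and fail the Hive
    DDL when the Phoenix table is missing. A minimal sketch, assuming the
    quorum string is already resolved from the table properties (the method
    name is illustrative):
    
        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.ResultSet;
        import java.sql.SQLException;
    
        import org.apache.hadoop.hive.metastore.api.MetaException;
    
        // Hypothetical validation-only variant of commitCreateTable: verify
        // that the Phoenix table already exists instead of creating it.
        public static void checkPhoenixTableExists(String quorum, String tableName)
                throws MetaException {
            String url = "jdbc:phoenix:" + quorum;
            try (Connection conn = DriverManager.getConnection(url);
                 ResultSet tables = conn.getMetaData().getTables(null, null, tableName, null)) {
                if (!tables.next()) {
                    throw new MetaException("Phoenix table " + tableName
                            + " does not exist; create it in Phoenix first");
                }
            } catch (SQLException e) {
                throw new MetaException("Could not validate Phoenix table: " + e.getMessage());
            }
        }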

