Posted to issues@hawq.apache.org by jiadexin <gi...@git.apache.org> on 2016/10/25 08:14:37 UTC

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

GitHub user jiadexin opened a pull request:

    https://github.com/apache/incubator-hawq/pull/972

    HAWQ-1108 Add JDBC PXF Plugin

    The PXF JDBC plugin reads data stored in traditional relational databases, e.g. MySQL, Oracle, and PostgreSQL.
    For more information, please refer to: https://github.com/inspur-insight/incubator-hawq/blob/HAWQ-1108/pxf/pxf-jdbc/README.md .
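
    A hedged sketch of how such a table could be defined from HAWQ, based on the linked README's parameter names and the tests in this PR; the host, port, profile name and credentials here are illustrative placeholders:

        CREATE EXTERNAL TABLE sales_ext (id int, cdate date, amt float8, grade text)
        LOCATION ('pxf://localhost:51200/sales?PROFILE=JDBC&JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://localhost:3306/demodb&PARTITION_BY=cdate:date&RANGE=2008-01-01:2009-01-01&INTERVAL=1:month')
        FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');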

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/inspur-insight/incubator-hawq HAWQ-1108

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hawq/pull/972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #972
    
----
commit 2cc75e672d01926ef97d7c50485f4979d4866b3c
Author: Devin Jia <ji...@inspur.com>
Date:   2016-10-18T08:08:50Z

    Merge pull request #1 from apache/master
    
    re fork

commit 10f68af5ade550b6c24abe371fff4a40349829b3
Author: Devin Jia <ji...@inspur.com>
Date:   2016-10-25T06:31:07Z

    the first commit

commit 5a814211ecf8f8f1e7d1487bdda33c3b72f1b990
Author: Devin Jia <ji...@inspur.com>
Date:   2016-10-25T07:10:12Z

    modify parent pxf build.gradle

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85473491
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenterTest.java ---
    @@ -0,0 +1,208 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.junit.Test;
    +
    +import java.text.ParseException;
    +import java.util.Calendar;
    +import java.util.List;
    +
    +import static org.junit.Assert.*;
    +import static org.mockito.Mockito.mock;
    +import static org.mockito.Mockito.when;
    +
    +
    +public class JdbcPartitionFragmenterTest {
    +    InputData inputData;
    +
    +    @Test
    +    public void testPartionByDate() throws Exception {
    +        prepareConstruction();
    +        when(inputData.getDataSource()).thenReturn("sales");
    +        when(inputData.getUserProperty("PARTITION_BY")).thenReturn("cdate:date");
    +        when(inputData.getUserProperty("RANGE")).thenReturn("2008-01-01:2009-01-01");
    +        when(inputData.getUserProperty("INTERVAL")).thenReturn("1:month");
    +
    +        JdbcPartitionFragmenter fragment = new JdbcPartitionFragmenter(inputData);
    +        List<Fragment> fragments = fragment.getFragments();
    +        assertEquals(fragments.size(), 12);
    +
    +        //fragment - 1
    +        byte[] frag_meta = fragments.get(0).getMetadata();
    +        byte[][] newb = ByteUtil.splitBytes(frag_meta, 8);
    +        long frag_start = ByteUtil.toLong(newb[0]);
    +        long frag_end = ByteUtil.toLong(newb[1]);
    +        assertDateEquals(frag_start, 2008, 1, 1);
    +        assertDateEquals(frag_end, 2008, 2, 1);
    +
    +        //fragment - 12
    +        frag_meta = fragments.get(11).getMetadata();
    +        newb = ByteUtil.splitBytes(frag_meta, 8);
    +        frag_start = ByteUtil.toLong(newb[0]);
    +        frag_end = ByteUtil.toLong(newb[1]);
    +        assertDateEquals(frag_start, 2008, 12, 1);
    +        assertDateEquals(frag_end, 2009, 1, 1);
    +
    +        //when end_date > start_date
    +        when(inputData.getUserProperty("RANGE")).thenReturn("2008-01-01:2001-01-01");
    +        fragment = new JdbcPartitionFragmenter(inputData);
    +        assertEquals(0,fragment.getFragments().size());
    +    }
    +
    +    @Test
    +    public void testPartionByDate2() throws Exception {
    --- End diff --
    
    typo Partition


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85472961
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcFilterBuilderTest.java ---
    @@ -0,0 +1,81 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * 
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + * 
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +
    +import org.apache.hawq.pxf.api.BasicFilter;
    +import org.apache.hawq.pxf.api.FilterParser.LogicalOperation;
    +import org.apache.hawq.pxf.api.LogicalFilter;
    +import org.junit.Test;
    +
    +import static org.apache.hawq.pxf.api.FilterParser.Operation.*;
    +import static org.junit.Assert.assertEquals;
    +
    +public class JdbcFilterBuilderTest {
    +    @Test
    +    public void parseFilterWithThreeOperations() throws Exception {
    +        //orgin sql => cdate>'2008-02-01' and cdate<'2008-12-01' and amt > 1200
    +        //filterstr="a1c\"first\"o5a2c2o2l0";//col_1=first and col_2=2
    +        String filterstr = "a1c\"2008-02-01\"o2a1c\"2008-12-01\"o1l0a2c1200o2l1"; //col_1>'first' and col_1<'2008-12-01' or col_2 > 1200;
    +        JdbcFilterBuilder builder = new JdbcFilterBuilder();
    +
    +        LogicalFilter filterList = (LogicalFilter) builder.getFilterObject(filterstr);
    +        assertEquals(LogicalOperation.HDOP_OR, filterList.getOperator());
    +        LogicalFilter l1_left = (LogicalFilter) filterList.getFilterList().get(0);
    +        BasicFilter l1_right = (BasicFilter) filterList.getFilterList().get(1);
    +        //column_2 > 1200
    +        assertEquals(2, l1_right.getColumn().index());
    +        assertEquals(HDOP_GT, l1_right.getOperation());
    +        assertEquals(1200L, l1_right.getConstant().constant());
    +
    +        assertEquals(LogicalOperation.HDOP_AND, l1_left.getOperator());
    +        BasicFilter l2_left = (BasicFilter) l1_left.getFilterList().get(0);
    +        BasicFilter l2_right = (BasicFilter) l1_left.getFilterList().get(1);
    +
    +        //column_1 > '2008-02-01'
    +        assertEquals(1, l2_left.getColumn().index());
    +        assertEquals(HDOP_GT, l2_left.getOperation());
    +        assertEquals("2008-02-01", l2_left.getConstant().constant());
    +
    +        //column_2 = 5
    --- End diff --
    
    also this comment


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85677162
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcMySqlExtensionTest.java ---
    @@ -0,0 +1,303 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import com.sun.org.apache.xml.internal.utils.StringComparable;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hawq.pxf.api.FilterParser;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.OneField;
    +import org.apache.hawq.pxf.api.OneRow;
    +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +import org.apache.hawq.pxf.api.io.DataType;
    +import org.junit.After;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import java.sql.SQLException;
    +import java.sql.Statement;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertTrue;
    +import static org.mockito.Mockito.mock;
    +import static org.mockito.Mockito.when;
    +
    +public class JdbcMySqlExtensionTest {
    +    private static final Log LOG = LogFactory.getLog(JdbcMySqlExtensionTest.class);
    +    static String MYSQL_URL = "jdbc:mysql://localhost:3306/demodb";
    --- End diff --
    
    This `test` is used to verify the correctness of the generated SQL, so it is important.
    If we want to keep it in the project, how should we do that?


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85683989
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcMySqlExtensionTest.java ---
    @@ -0,0 +1,303 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import com.sun.org.apache.xml.internal.utils.StringComparable;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hawq.pxf.api.FilterParser;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.OneField;
    +import org.apache.hawq.pxf.api.OneRow;
    +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +import org.apache.hawq.pxf.api.io.DataType;
    +import org.junit.After;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import java.sql.SQLException;
    +import java.sql.Statement;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertTrue;
    +import static org.mockito.Mockito.mock;
    +import static org.mockito.Mockito.when;
    +
    +public class JdbcMySqlExtensionTest {
    +    private static final Log LOG = LogFactory.getLog(JdbcMySqlExtensionTest.class);
    +    static String MYSQL_URL = "jdbc:mysql://localhost:3306/demodb";
    --- End diff --
    
    I renamed JdbcMySqlExtensionTest to SqlBuilderTest.
    It validates the SQL strings generated by the JdbcPartitionFragmenter.buildFragmenterSql and WhereSQLBuilder.buildWhereSQL methods.
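
    A shape such a validation could take; the buildWhereSQL signature and the expected string below are illustrative assumptions, not taken from this PR:

        // cdate > '2008-02-01', i.e. filter string "a1c\"2008-02-01\"o2"
        WhereSQLBuilder builder = new WhereSQLBuilder(inputData);
        StringBuilder sb = new StringBuilder("SELECT id, cdate, amt, grade FROM sales");
        builder.buildWhereSQL("mysql", sb);  // hypothetical signature
        assertEquals("SELECT id, cdate, amt, grade FROM sales WHERE cdate > '2008-02-01'", sb.toString());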


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r86202717
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,297 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.api.UserDataException;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partitionBy = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    --- End diff --
    
    Please define private static final variables for each of these units, e.g. SECONDS_IN_MINUTE, MINUTES_IN_HOUR, HOURS_IN_DAY; or, to keep things simple, just define MILLISECONDS_IN_DAY and derive the other two from it (see the sketch below).
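
    A sketch of what that could look like; it would also drop the stray factor of 30 that crept into the YEAR entry of this file (365 * 30 * 24 * 60 * 60 * 1000 despite the "365 days" comment):

        private static final long MILLISECONDS_IN_DAY = 24L * 60 * 60 * 1000;
        private static final long MILLISECONDS_IN_MONTH = 30 * MILLISECONDS_IN_DAY;   // 30-day approximation
        private static final long MILLISECONDS_IN_YEAR = 365 * MILLISECONDS_IN_DAY;   // 365-day approximation

        static {
            intervals.put(IntervalType.DAY, MILLISECONDS_IN_DAY);
            intervals.put(IntervalType.MONTH, MILLISECONDS_IN_MONTH);
            intervals.put(IntervalType.YEAR, MILLISECONDS_IN_YEAR);
        }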


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r86204956
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,297 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.api.UserDataException;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partitionBy = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        //30 days
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);
    +        //365 days
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws UserDataException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws UserDataException {
    +        super(inConf);
    +        if (inConf.getUserProperty("PARTITION_BY") == null)
    +            return;
    +        try {
    +            partitionBy = inConf.getUserProperty("PARTITION_BY").split(":");
    +            partitionColumn = partitionBy[0];
    +            partitionType = PartitionType.getType(partitionBy[1]);
    --- End diff --
    
    Need to gracefully handle the case where the user provides a partition type that isn't supported, or no partition type at all (see the sketch below).
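
    One way to fail fast with a clear message, sketched against the quoted constructor (UserDataException is already imported by this class):

        try {
            partitionBy = inConf.getUserProperty("PARTITION_BY").split(":");
            partitionColumn = partitionBy[0];
            // valueOf throws IllegalArgumentException for an unsupported type;
            // a missing ":column_type" part throws ArrayIndexOutOfBoundsException
            partitionType = PartitionType.getType(partitionBy[1]);
        } catch (IllegalArgumentException | ArrayIndexOutOfBoundsException e) {
            throw new UserDataException(
                    "PARTITION_BY must look like column_name:column_type, where column_type is one of: date, int, enum");
        }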


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r86278188
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,297 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.api.UserDataException;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partitionBy = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    --- End diff --
    
    At present we support these three commonly used types; other types can be added in the future.


---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    +1


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85473433
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcMySqlExtensionTest.java ---
    @@ -0,0 +1,303 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import com.sun.org.apache.xml.internal.utils.StringComparable;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hawq.pxf.api.FilterParser;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.OneField;
    +import org.apache.hawq.pxf.api.OneRow;
    +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +import org.apache.hawq.pxf.api.io.DataType;
    +import org.junit.After;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import java.sql.SQLException;
    +import java.sql.Statement;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertTrue;
    +import static org.mockito.Mockito.mock;
    +import static org.mockito.Mockito.when;
    +
    +public class JdbcMySqlExtensionTest {
    +    private static final Log LOG = LogFactory.getLog(JdbcMySqlExtensionTest.class);
    +    static String MYSQL_URL = "jdbc:mysql://localhost:3306/demodb";
    +    InputData inputData;
    +
    +    @Before
    +    public void setup() throws Exception {
    +        LOG.info("JdbcMySqlExtensionTest.setup()");
    +        prepareConstruction();
    +        JdbcWriter writer = new JdbcWriter(inputData);
    +        writer.openForWrite();
    +
    +        //create table
    +        writer.executeSQL("CREATE TABLE sales (id int primary key, cdate date, amt decimal(10,2),grade varchar(30))");
    +        //INSERT DEMO data
    +        String[] inserts = {"INSERT INTO sales values (1, DATE('2008-01-01'), 1000,'general')",
    +                "INSERT INTO sales values (2, DATE('2008-02-01'), 900,'bad')",
    +                "INSERT INTO sales values (3, DATE('2008-03-10'), 1200,'good')",
    +                "INSERT INTO sales values (4, DATE('2008-04-10'), 1100,'good')",
    +                "INSERT INTO sales values (5, DATE('2008-05-01'), 1010,'general')",
    +                "INSERT INTO sales values (6, DATE('2008-06-01'), 850,'bad')",
    +                "INSERT INTO sales values (7, DATE('2008-07-01'), 1400,'excellent')",
    +                "INSERT INTO sales values (8, DATE('2008-08-01'), 1500,'excellent')",
    +                "INSERT INTO sales values (9, DATE('2008-09-01'), 1000,'good')",
    +                "INSERT INTO sales values (10, DATE('2008-10-01'), 800,'bad')",
    +                "INSERT INTO sales values (11, DATE('2008-11-01'), 1250,'good')",
    +                "INSERT INTO sales values (12, DATE('2008-12-01'), 1300,'excellent')",
    +                "INSERT INTO sales values (15, DATE('2009-01-01'), 1500,'excellent')",
    +                "INSERT INTO sales values (16, DATE('2009-02-01'), 1340,'excellent')",
    +                "INSERT INTO sales values (13, DATE('2009-03-01'), 1250,'good')",
    +                "INSERT INTO sales values (14, DATE('2009-04-01'), 1300,'excellent')"};
    +        for (String sql : inserts)
    +            writer.executeSQL(sql);
    +
    +        writer.closeWrite();
    +    }
    +
    +    @After
    +    public void cleanup() throws Exception {
    +        LOG.info("JdbcMySqlExtensionTest.cleanup()");
    +        prepareConstruction();
    +        JdbcWriter writer = new JdbcWriter(inputData);
    +        writer.openForWrite();
    +        writer.executeSQL("drop table sales");
    +        writer.closeWrite();
    +    }
    +
    +    @Test
    +    public void testIdFilter() throws Exception {
    +        prepareConstruction();
    +        when(inputData.hasFilter()).thenReturn(true);
    +        when(inputData.getFilterString()).thenReturn("a0c1o5");//id=1
    +        JdbcReadAccessor reader = new JdbcReadAccessor(inputData);
    +        reader.openForRead();
    +        ArrayList<List<OneField>> row_list = readAllRows(reader);
    +        assertEquals(1, row_list.size());
    +
    +        List<OneField> fields = row_list.get(0);
    +        assertEquals(4, fields.size());
    +        assertEquals(1, fields.get(0).val);
    +        assertEquals("2008-01-01", fields.get(1).val.toString());
    +        assertEquals("general", fields.get(3).val);
    +
    +        reader.closeForRead();
    +
    +    }
    +
    +    @Test
    +    public void testDateAndAmtFilter() throws Exception {
    +        prepareConstruction();
    +        when(inputData.hasFilter()).thenReturn(true);
    +        // cdate>'2008-02-01' and cdate<'2008-12-01' and amt > 1200
    +        when(inputData.getFilterString()).thenReturn("a1c\"2008-02-01\"o2a1c\"2008-12-01\"o1l0a2c1200o2l0");
    +        JdbcReadAccessor reader = new JdbcReadAccessor(inputData);
    +        reader.openForRead();
    +
    +        ArrayList<List<OneField>> row_list = readAllRows(reader);
    +
    +        assertEquals(3, row_list.size());
    +
    +        //assert random row
    +        Random random = new Random();
    +        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
    +        List<OneField> fields = row_list.get(random.nextInt(row_list.size()));
    +        assertTrue(((Date)(fields.get(1).val)).after(dateFormat.parse("2008-02-01")));
    +        assertTrue(((Date)(fields.get(1).val)).before(dateFormat.parse("2008-12-01")));
    +        assertTrue(((Double)(fields.get(2).val)) > 1200);
    +
    +        reader.closeForRead();
    +    }
    +
    +    @Test
    +    public void testUnsupportedOperationFilter() throws Exception {
    +        prepareConstruction();
    +        when(inputData.hasFilter()).thenReturn(true);
    +        // grade like 'bad'
    +        when(inputData.getFilterString()).thenReturn("a3c\"bad\"o7");
    +        JdbcReadAccessor reader = new JdbcReadAccessor(inputData);
    +        reader.openForRead();
    +
    +        ArrayList<List<OneField>> row_list = readAllRows(reader);
    +
    +        //return all data
    +        assertEquals(16, row_list.size());
    +
    +        reader.closeForRead();
    +    }
    +
    +    @Test
    +    public void testUnsupportedLogicalFilter() throws Exception {
    +        prepareConstruction();
    +        when(inputData.hasFilter()).thenReturn(true);
    +        // cdate>'2008-02-01' or amt < 1200
    +        when(inputData.getFilterString()).thenReturn("a1c\"2008-02-01\"o2a2c1200o2l1");
    +        JdbcReadAccessor reader = new JdbcReadAccessor(inputData);
    +        reader.openForRead();
    +
    +        ArrayList<List<OneField>> row_list = readAllRows(reader);
    +
    +        //all data
    +        assertEquals(16, row_list.size());
    +
    +        reader.closeForRead();
    +    }
    +
    +    @Test
    +    public void testDatePartition() throws Exception {
    +        prepareConstruction();
    +        when(inputData.hasFilter()).thenReturn(false);
    +        when(inputData.getUserProperty("PARTITION_BY")).thenReturn("cdate:date");
    +        when(inputData.getUserProperty("RANGE")).thenReturn("2008-01-01:2009-01-01");
    +        when(inputData.getUserProperty("INTERVAL")).thenReturn("2:month");
    +        JdbcPartitionFragmenter fragment = new JdbcPartitionFragmenter(inputData);
    +        List<Fragment> fragments = fragment.getFragments();
    +        assertEquals(6, fragments.size());
    +
    +        //partition-1 : cdate>=2008-01-01 and cdate<2008-03-01
    +        when(inputData.getFragmentMetadata()).thenReturn(fragments.get(0).getMetadata());
    +
    +        JdbcReadAccessor reader = new JdbcReadAccessor(inputData);
    +        reader.openForRead();
    +
    +        ArrayList<List<OneField>> row_list = readAllRows(reader);
    +
    +        //all data
    +        assertEquals(2, row_list.size());
    +
    +        //assert random row
    +        Random random = new Random();
    +        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
    +        List<OneField> fields = row_list.get(random.nextInt(row_list.size()));
    +        assertTrue(((Date)(fields.get(1).val)).after(dateFormat.parse("2007-12-31")));
    +        assertTrue(((Date)(fields.get(1).val)).before(dateFormat.parse("2008-03-01")));
    +
    +        reader.closeForRead();
    +    }
    +
    +    @Test
    +    public void testFilterAndPartition() throws Exception {
    +        prepareConstruction();
    +        when(inputData.hasFilter()).thenReturn(true);
    +        when(inputData.getFilterString()).thenReturn("a0c5o2");//id>5
    +        when(inputData.getUserProperty("PARTITION_BY")).thenReturn("grade:enum");
    +        when(inputData.getUserProperty("RANGE")).thenReturn("excellent:good:general:bad");
    +        JdbcPartitionFragmenter fragment = new JdbcPartitionFragmenter(inputData);
    +        List<Fragment> fragments = fragment.getFragments();
    +
    +        //partition-1 : id>5 and grade='excellent'
    +        when(inputData.getFragmentMetadata()).thenReturn(fragments.get(0).getMetadata());
    +
    +        JdbcReadAccessor reader = new JdbcReadAccessor(inputData);
    +        reader.openForRead();
    +
    +        ArrayList<List<OneField>> row_list = readAllRows(reader);
    +
    +        //all data
    +        assertEquals(6, row_list.size());
    +
    +        //assert random row
    +        Random random = new Random();
    +        List<OneField> fields = row_list.get(random.nextInt(row_list.size()));
    +        assertTrue((Integer)(fields.get(0).val) > 5);
    +        assertEquals("excellent",fields.get(3).val);
    +
    +        reader.closeForRead();
    +    }
    +
    +    private ArrayList<List<OneField>> readAllRows(JdbcReadAccessor reader) throws Exception {
    +        JdbcReadResolver resolver = new JdbcReadResolver(inputData);
    +        ArrayList<List<OneField>> row_list = new ArrayList<>();
    +        OneRow row;
    +        do {
    +            row = reader.readNextObject();
    +            if (row != null)
    +                row_list.add(resolver.getFields(row));
    +        } while (row != null);
    +
    +        return row_list;
    +    }
    +
    +
    +    private void prepareConstruction() throws Exception {
    +        inputData = mock(InputData.class);
    +        when(inputData.getUserProperty("JDBC_DRIVER")).thenReturn("com.mysql.jdbc.Driver");
    +        when(inputData.getUserProperty("DB_URL")).thenReturn(MYSQL_URL);
    +        when(inputData.getUserProperty("USER")).thenReturn("root");
    +        when(inputData.getUserProperty("PASS")).thenReturn("root");
    +        when(inputData.getDataSource()).thenReturn("sales");
    +
    +
    +        ArrayList<ColumnDescriptor> columns = new ArrayList<ColumnDescriptor>();
    +        columns.add(new ColumnDescriptor("id", DataType.INTEGER.getOID(), 0, "int4", null));
    +        columns.add(new ColumnDescriptor("cdate", DataType.DATE.getOID(), 1, "date", null));
    +        columns.add(new ColumnDescriptor("amt", DataType.FLOAT8.getOID(), 2, "float8", null));
    +        columns.add(new ColumnDescriptor("grade", DataType.TEXT.getOID(), 3, "text", null));
    +        when(inputData.getTupleDescription()).thenReturn(columns);
    +        when(inputData.getColumn(0)).thenReturn(columns.get(0));
    +        when(inputData.getColumn(1)).thenReturn(columns.get(1));
    +        when(inputData.getColumn(2)).thenReturn(columns.get(2));
    +        when(inputData.getColumn(3)).thenReturn(columns.get(3));
    +        //*/
    --- End diff --
    
    please remove comment


---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    It's merged. Please close the pull request.


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85473415
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcMySqlExtensionTest.java ---
    @@ -0,0 +1,303 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import com.sun.org.apache.xml.internal.utils.StringComparable;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hawq.pxf.api.FilterParser;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.OneField;
    +import org.apache.hawq.pxf.api.OneRow;
    +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +import org.apache.hawq.pxf.api.io.DataType;
    +import org.junit.After;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import java.sql.SQLException;
    +import java.sql.Statement;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertTrue;
    +import static org.mockito.Mockito.mock;
    +import static org.mockito.Mockito.when;
    +
    +public class JdbcMySqlExtensionTest {
    +    private static final Log LOG = LogFactory.getLog(JdbcMySqlExtensionTest.class);
    +    static String MYSQL_URL = "jdbc:mysql://localhost:3306/demodb";
    --- End diff --
    
    Does this test require a MySQL instance running locally? If so, it cannot be here as part of the unit tests, but rather should be moved to the automation suite. The unit tests have to be self-contained.
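
    A minimal sketch (assuming JUnit 4, which this test already uses) of skipping rather than failing when no local MySQL is reachable; an in-memory database such as H2 would be the fully self-contained alternative:

        @Before
        public void assumeMySqlIsReachable() {
            boolean reachable;
            try (java.sql.Connection c =
                    java.sql.DriverManager.getConnection(MYSQL_URL, "root", "root")) {
                reachable = true;
            } catch (java.sql.SQLException e) {
                reachable = false;
            }
            // marks the test as skipped instead of failed when the assumption is false
            org.junit.Assume.assumeTrue("no local MySQL at " + MYSQL_URL, reachable);
        }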


---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    @jiadexin Java 8 has stricter checks that require code comments to be javadoc-compatible. Please fix the above errors pointed out by @edespino.


---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    closes #972


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85867986
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/utils/ByteUtil.java ---
    @@ -0,0 +1,86 @@
    +package org.apache.hawq.pxf.plugins.jdbc.utils;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +
    +/**
    + * A tool class, used to deal with byte array merging, split and other methods.
    + */
    +public class ByteUtil {
    +
    +    public static byte[] mergeBytes(byte[] b1, byte[] b2) {
    --- End diff --
    
    This method is simple; I do not want to pull in a dependency for it.
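
    For reference, a dependency-free version of such a merge is only a few lines (a sketch of the shape, not necessarily the PR's exact body):

        public static byte[] mergeBytes(byte[] b1, byte[] b2) {
            // concatenate b1 and b2 into one new array
            byte[] merged = new byte[b1.length + b2.length];
            System.arraycopy(b1, 0, merged, 0, b1.length);
            System.arraycopy(b2, 0, merged, b1.length, b2.length);
            return merged;
        }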


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85473708
  
    --- Diff: pxf/pxf-jdbc/src/test/resources/core-site.xml ---
    @@ -0,0 +1,28 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    --- End diff --
    
    The test/resources directory contains information about a specific setup (Hadoop & MySQL), and should probably not be included here.



---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85476340
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    --- End diff --
    
    if PARTITION_BY must be defined, we should probably throw an exception here instead of returning.
    I think that  `getFragments()` will throw an exception because `partitionType` will remain null.
    
    Alternatively, if there is no partition defined, the behaviour can be to scan the whole table - and maybe make `getFragments()` return a single fragment with WHERE 1=1 or without a WHERE clause.
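    
    For illustration, the whole-table fallback could look roughly like this (a sketch against the class fields above; `prepareHosts` is assumed to fill in the fragment hosts):
    
    ```
    @Override
    public List<Fragment> getFragments() throws Exception {
        if (partitionType == null) {
            // no PARTITION_BY - expose the whole table as a single fragment,
            // letting the accessor run without a partition predicate
            Fragment fragment = new Fragment(inputData.getDataSource(), null, null, null);
            fragments.add(fragment);
            return prepareHosts(fragments);
        }
        // ... partitioned cases unchanged ...
    }
    ```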


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by edespino <gi...@git.apache.org>.
Github user edespino commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    @shivzone - for future reference, there are two ways a PR can be automatically closed that I rely on regularly:
    
    - You can reference the pull request in the commit description or body of the commit message (e.g. "closes #972", as in the example after this list), and the PR will automatically be closed.
    
    - If the review branch has been recently rebased on master, the sha1 in the review branch and master can be the same after the merge process. If this is the case, the PR will detect the change has been merged and the PR is automatically closed.
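    
    For instance, merging from the command line with a message like:
    
    ```
    git commit -m "HAWQ-1108 Add JDBC PXF Plugin" -m "closes #972"
    ```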


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by sansanichfb <gi...@git.apache.org>.
Github user sansanichfb commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85825494
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,298 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.api.UserDataException;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partitionBy = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        //30 days
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);
    +        //365 days
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws UserDataException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws UserDataException  {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    +            return;
    +        try {
    +            partitionBy = inConf.getUserProperty("PARTITION_BY").split(":");
    +            partitionColumn = partitionBy[0];
    +            partitionType = PartitionType.getType(partitionBy[1]);
    +
    +            range = inConf.getUserProperty("RANGE").split(":");
    +
    +            //parse and validate parameter-INTERVAL
    +            if (inConf.getUserProperty("INTERVAL") != null) {
    +                interval = inConf.getUserProperty("INTERVAL").split(":");
    +                intervalNum = Integer.parseInt(interval[0]);
    +                if (interval.length > 1)
    +                    intervalType = IntervalType.type(interval[1]);
    +            }
    +            if (intervalNum < 1)
    +                throw new UserDataException("The parameter{INTERVAL} must > 1, but actual is '" + intervalNum+"'");
    +        }catch (IllegalArgumentException e1){
    +            throw new UserDataException(e1);
    +        }catch (UserDataException e2){
    +            throw e2;
    +        }
    +    }
    +
    +    /**
    +     * Returns statistics for Jdbc table. Currently it's not implemented.
    +     */
    +    @Override
    +    public FragmentsStats getFragmentsStats() throws Exception {
    +        throw new UnsupportedOperationException("ANALYZE for Jdbc plugin is not supported");
    +    }
    +
    +    /**
    +     * Returns list of fragments containing all of the
    +     * Jdbc table data.
    +     *
    +     * @return a list of fragments
    +     */
    +    @Override
    +    public List<Fragment> getFragments() throws Exception {
    +        if(partitionType == null ) {
    +            byte[] fragmentMetadata = null;
    +            byte[] userData = null;
    +            Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, userData);
    +            fragments.add(fragment);
    +            return prepareHosts(fragments);
    +        }
    +        switch (partitionType) {
    +            case DATE: {
    +                SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    +                Date t_start = df.parse(range[0]);
    +                Date t_end = df.parse(range[1]);
    +                int curr_interval = intervalNum;
    +
    +                Calendar frag_start = Calendar.getInstance();
    --- End diff --
    
    Should we use camel case for variable naming?
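    
    E.g., for the variables above:
    
    ```
    Calendar fragStart = Calendar.getInstance();  // instead of frag_start
    Date tStart = df.parse(range[0]);             // instead of t_start
    ```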


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85471996
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcReadResolver.java ---
    @@ -0,0 +1,116 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hawq.pxf.api.*;
    +import org.apache.hawq.pxf.api.io.DataType;
    +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +import org.apache.hawq.pxf.api.utilities.Plugin;
    +import java.sql.ResultSet;
    +import java.util.ArrayList;
    +import java.util.LinkedList;
    +import java.util.List;
    +
    +/**
    + * Class JdbcReadResolver Read the Jdbc ResultSet, and generates the data type - List <OneField>.
    + */
    +public class JdbcReadResolver extends Plugin implements ReadResolver {
    +    private static final Log LOG = LogFactory.getLog(JdbcReadResolver.class);
    +    //HAWQ Table column definitions
    +    private ArrayList<ColumnDescriptor> columns = null;
    +
    +    public JdbcReadResolver(InputData input) {
    +        super(input);
    +        columns = input.getTupleDescription();
    +    }
    +
    +    @Override
    +    public List<OneField> getFields(OneRow row) throws Exception {
    +        //LOG.info("getFields");
    --- End diff --
    
    please remove all commented code in this function


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
      Can this be merged now?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by edespino <gi...@git.apache.org>.
Github user edespino commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    Shouldn't there be an update of the classpath files located in pxf/pxf-service/src/main/resources with:
    
    ```
    /usr/lib/pxf/pxf-jdbc-*[0-9].jar
    ```
    
    Should the vendor-specific JDBC drivers also be added to these files?
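    
    If so, that would presumably mean globs along these lines (the jar names here are hypothetical - they depend on which drivers get shipped):
    
    ```
    /usr/lib/pxf/mysql-connector-java-*.jar
    /usr/lib/pxf/postgresql-*.jar
    ```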


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85470541
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    +            return;
    +        partition_by = inConf.getUserProperty("PARTITION_BY").split(":");
    --- End diff --
    
    since all of these properties are user-provided, maybe check that they match the expectations as specified in the documentation above, and if not, provide a user-friendly error message. `UserDataException` might be appropriate.
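    
    For example (a rough sketch, reusing the field names from the diff above):
    
    ```
    String partitionByProp = inConf.getUserProperty("PARTITION_BY");
    String[] partitionBy = partitionByProp.split(":");
    if (partitionBy.length != 2) {
        throw new UserDataException("PARTITION_BY must have the form "
                + "<column_name>:<column_type>, but got '" + partitionByProp + "'");
    }
    ```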



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85677094
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcFilterBuilderTest.java ---
    @@ -0,0 +1,81 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * 
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + * 
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +
    +import org.apache.hawq.pxf.api.BasicFilter;
    +import org.apache.hawq.pxf.api.FilterParser.LogicalOperation;
    +import org.apache.hawq.pxf.api.LogicalFilter;
    +import org.junit.Test;
    +
    +import static org.apache.hawq.pxf.api.FilterParser.Operation.*;
    +import static org.junit.Assert.assertEquals;
    +
    +public class JdbcFilterBuilderTest {
    +    @Test
    +    public void parseFilterWithThreeOperations() throws Exception {
    +        //orgin sql => cdate>'2008-02-01' and cdate<'2008-12-01' and amt > 1200
    +        //filterstr="a1c\"first\"o5a2c2o2l0";//col_1=first and col_2=2
    +        String filterstr = "a1c\"2008-02-01\"o2a1c\"2008-12-01\"o1l0a2c1200o2l1"; //col_1>'first' and col_1<'2008-12-01' or col_2 > 1200;
    +        JdbcFilterBuilder builder = new JdbcFilterBuilder();
    +
    +        LogicalFilter filterList = (LogicalFilter) builder.getFilterObject(filterstr);
    +        assertEquals(LogicalOperation.HDOP_OR, filterList.getOperator());
    +        LogicalFilter l1_left = (LogicalFilter) filterList.getFilterList().get(0);
    +        BasicFilter l1_right = (BasicFilter) filterList.getFilterList().get(1);
    +        //column_2 > 1200
    +        assertEquals(2, l1_right.getColumn().index());
    +        assertEquals(HDOP_GT, l1_right.getOperation());
    +        assertEquals(1200L, l1_right.getConstant().constant());
    +
    +        assertEquals(LogicalOperation.HDOP_AND, l1_left.getOperator());
    +        BasicFilter l2_left = (BasicFilter) l1_left.getFilterList().get(0);
    +        BasicFilter l2_right = (BasicFilter) l1_left.getFilterList().get(1);
    +
    +        //column_1 > '2008-02-01'
    +        assertEquals(1, l2_left.getColumn().index());
    +        assertEquals(HDOP_GT, l2_left.getOperation());
    +        assertEquals("2008-02-01", l2_left.getConstant().constant());
    +
    +        //column_2 = 5
    +        assertEquals(1, l2_right.getColumn().index());
    +        assertEquals(HDOP_LT, l2_right.getOperation());
    +        assertEquals("2008-12-01", l2_right.getConstant().constant());
    +
    +    }
    +
    +    @Test
    +    public void parseFilterWithLogicalOperation() throws Exception {
    +        WhereSQLBuilder builder = new WhereSQLBuilder(null);
    --- End diff --
    
    WhereSQLBuilder is used to build SQL statements; its testing is done through JdbcMySqlExtensionTest.
    Are there other ways to test the correctness of the SQL statements?
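    
    One option is a plain unit test that asserts the generated WHERE clause as a string, without a live database - e.g. (the `buildWhereSQL` accessor here is hypothetical, just to illustrate the idea):
    
    ```
    WhereSQLBuilder builder = new WhereSQLBuilder(null);
    // "a2c1200o2" encodes col_2 > 1200, as in the filter tests above;
    // buildWhereSQL() is a hypothetical method returning the rendered clause
    String where = builder.buildWhereSQL("a2c1200o2");
    assertEquals("WHERE col_2 > 1200", where);
    ```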


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85470831
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    --- End diff --
    
    Please change all variables to be camelCase, for example `partitionBy` instead of `partition_by`.
    (The coding conventions can be found here https://cwiki.apache.org/confluence/display/HAWQ/PXF+Coding+Conventions#PXFCodingConventions-Naming)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85867311
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/utils/ByteUtil.java ---
    @@ -0,0 +1,86 @@
    +package org.apache.hawq.pxf.plugins.jdbc.utils;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +
    +/**
    + * A tool class, used to deal with byte array merging, split and other methods.
    + */
    +public class ByteUtil {
    +
    +    public static byte[] mergeBytes(byte[] b1, byte[] b2) {
    --- End diff --
    
    This method is simple; I do not want to import a dependency for it.
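    
    For reference, the dependency-free version is only a few lines anyway - presumably something like:
    
    ```
    public static byte[] mergeBytes(byte[] b1, byte[] b2) {
        // concatenate b1 and b2 into a new array
        byte[] merged = new byte[b1.length + b2.length];
        System.arraycopy(b1, 0, merged, 0, b1.length);
        System.arraycopy(b2, 0, merged, b1.length, b2.length);
        return merged;
    }
    ```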


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85678564
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    +            return;
    +        partition_by = inConf.getUserProperty("PARTITION_BY").split(":");
    --- End diff --
    
    Thank you! Could you please update the documentation above to say that PARTITION_BY is not mandatory?
    Also, please make sure that getFragments will not fail when this parameter is not defined.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85476012
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    +            return;
    +        partition_by = inConf.getUserProperty("PARTITION_BY").split(":");
    +        partitionColumn = partition_by[0];
    +        partitionType = PartitionType.getType(partition_by[1]);
    +
    +        range = inConf.getUserProperty("RANGE").split(":");
    +
    +        //parse and validate parameter-INTERVAL
    +        if (inConf.getUserProperty("INTERVAL") != null) {
    +            interval = inConf.getUserProperty("INTERVAL").split(":");
    +            intervalNum = Integer.parseInt(interval[0]);
    +            if (interval.length > 1)
    +                intervalType = IntervalType.type(interval[1]);
    +        }
    +        if (intervalNum < 1)
    +            throw new JdbcFragmentException("The parameter{INTERVAL} must > 1, but actual is '" + intervalNum+"'");
    +    }
    +    /**
    +     * Returns statistics for Jdbc table. Currently it's not implemented.
    +     */
    +    @Override
    +    public FragmentsStats getFragmentsStats() throws Exception {
    +        throw new UnsupportedOperationException("ANALYZE for Jdbc plugin is not supported");
    +    }
    +
    --- End diff --
    
    please remove extra lines


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85470195
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    --- End diff --
    
    minor - please add a space before the comment.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    @hornn Thank you for your suggestions - the original code was written somewhat carelessly in places.
    I have replied to some of the recommendations; the rest have already been changed in accordance with your recommendations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85470036
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcFilterBuilder.java ---
    @@ -0,0 +1,145 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.BasicFilter;
    +import org.apache.hawq.pxf.api.FilterParser;
    +import org.apache.hawq.pxf.api.LogicalFilter;
    +
    +import java.util.Arrays;
    +import java.util.LinkedList;
    +import java.util.List;
    +import java.util.regex.Matcher;
    +import java.util.regex.Pattern;
    +
    +/**
    + * Uses the filter parser code to build a filter object, either simple - a
    + * single {@link BasicFilter} object or a
    + * compound - a {@link java.util.List} of
    + * {@link BasicFilter} objects.
    + * The subclass {@link org.apache.hawq.pxf.plugins.jdbc.WhereSQLBuilder} will use the filter for
    + * generate WHERE statement.
    + */
    +public class JdbcFilterBuilder implements FilterParser.FilterBuilder  {
    --- End diff --
    
    this class looks very similar to `HiveFilterBuilder.java` - maybe consider inheriting from it or creating a common implementation?
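    
    In spirit (a toy sketch only - names are illustrative, not the actual PXF API):
    
    ```
    // the tree traversal is shared in a base class; each plugin
    // overrides just the leaf rendering (quoting rules etc.)
    abstract class CommonFilterBuilder {
        interface Filter {}
        static class Leaf implements Filter {
            final String column, op, value;
            Leaf(String column, String op, String value) {
                this.column = column; this.op = op; this.value = value;
            }
        }
        static class And implements Filter {
            final Filter left, right;
            And(Filter left, Filter right) { this.left = left; this.right = right; }
        }
    
        // shared recursive walk over the filter tree
        public String render(Filter f) {
            if (f instanceof And) {
                And a = (And) f;
                return "(" + render(a.left) + " AND " + render(a.right) + ")";
            }
            return renderLeaf((Leaf) f);
        }
    
        // plugin-specific part, e.g. Hive vs. JDBC constant quoting
        protected abstract String renderLeaf(Leaf leaf);
    }
    ```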


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by ig-michaelpearce <gi...@git.apache.org>.
Github user ig-michaelpearce commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    @shivzone do other pxf plugins have this? I cannot see it. Could this be merged, with the rest addressed later in a follow-on JIRA? This seems to have stalled.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85471780
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPlugin.java ---
    @@ -0,0 +1,114 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +import org.apache.hawq.pxf.api.utilities.Plugin;
    +
    +import java.sql.*;
    +
    +/**
    + * This class resolves the JDBC connection parameters and manages the opening and closing of the JDBC connection.
    + * Implemented subclasses: {@link JdbcReadAccessor}.
    + *
    + */
    +public class JdbcPlugin extends Plugin {
    +    private static final Log LOG = LogFactory.getLog(JdbcPlugin.class);
    +
    +    //jdbc connection parameters
    +    protected String jdbcDriver = null;
    +    protected String dbUrl = null;
    +    protected String user = null;
    +    protected String pass = null;
    +    protected String tblName = null;
    +    protected int batchSize = 100;
    +
    +    //jdbc connection
    +    protected Connection dbconn = null;
    +    //database type, from DatabaseMetaData.getDatabaseProductName()
    +    protected String dbProduct = null;
    +
    +    /**
    +     * Parses the JDBC connection parameters from the user request.
    +     *
    +     * @param input the input data
    +     */
    +    public JdbcPlugin(InputData input) {
    +        super(input);
    +        jdbcDriver = input.getUserProperty("JDBC_DRIVER");
    +        dbUrl = input.getUserProperty("DB_URL");
    +        //dbUrl = "jdbc:mysql://192.168.200.6:3306/demodb";
    --- End diff --
    
    please remove commented code, or edit to be an example.
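    
    For example (value hypothetical), the removed literal could survive as documentation instead:
    
    ```java
    // DB_URL comes from the user request, e.g.
    //   DB_URL=jdbc:mysql://host:3306/demodb   (illustrative value only)
    dbUrl = input.getUserProperty("DB_URL");
    ```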


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85678810
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcMySqlExtensionTest.java ---
    @@ -0,0 +1,303 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import com.sun.org.apache.xml.internal.utils.StringComparable;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hawq.pxf.api.FilterParser;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.OneField;
    +import org.apache.hawq.pxf.api.OneRow;
    +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +import org.apache.hawq.pxf.api.io.DataType;
    +import org.junit.After;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import java.sql.SQLException;
    +import java.sql.Statement;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertTrue;
    +import static org.mockito.Mockito.mock;
    +import static org.mockito.Mockito.when;
    +
    +public class JdbcMySqlExtensionTest {
    +    private static final Log LOG = LogFactory.getLog(JdbcMySqlExtensionTest.class);
    +    static String MYSQL_URL = "jdbc:mysql://localhost:3306/demodb";
    --- End diff --
    
    I think that the sql builder logic can be tested separately as part of the WhereSQLBuilder tests.
    If you still want to keep the test, one option is to add it to the automation tests of PXF, which as far as I know were not open sourced yet (maybe this is a good reason to do that :)).
    Another option is to make it a separate module - not part of the unit tests, and not something that runs as part of the compilation. Maybe wait for other people to give their opinion on that matter...
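    
    If it stays a unit test for now, a minimal sketch (system property name hypothetical) that keeps the class compiled but skips it when no database is reachable, using JUnit's `Assume`:
    
    ```java
    import org.junit.Assume;
    import org.junit.Before;
    
    public class JdbcMySqlExtensionTest {
        @Before
        public void requireDatabase() {
            // Skip (rather than fail) every test in this class unless a
            // MySQL URL was supplied, e.g. -Dpxf.test.mysql.url=jdbc:mysql://...
            Assume.assumeNotNull(System.getProperty("pxf.test.mysql.url"));
        }
    }
    ```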


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85473219
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcFilterBuilderTest.java ---
    @@ -0,0 +1,81 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * 
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + * 
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +
    +import org.apache.hawq.pxf.api.BasicFilter;
    +import org.apache.hawq.pxf.api.FilterParser.LogicalOperation;
    +import org.apache.hawq.pxf.api.LogicalFilter;
    +import org.junit.Test;
    +
    +import static org.apache.hawq.pxf.api.FilterParser.Operation.*;
    +import static org.junit.Assert.assertEquals;
    +
    +public class JdbcFilterBuilderTest {
    +    @Test
    +    public void parseFilterWithThreeOperations() throws Exception {
    +        //origin sql => cdate>'2008-02-01' and cdate<'2008-12-01' or amt > 1200
    +        //filterstr="a1c\"first\"o5a2c2o2l0";//col_1=first and col_2=2
    +        String filterstr = "a1c\"2008-02-01\"o2a1c\"2008-12-01\"o1l0a2c1200o2l1"; //col_1>'first' and col_1<'2008-12-01' or col_2 > 1200;
    +        JdbcFilterBuilder builder = new JdbcFilterBuilder();
    +
    +        LogicalFilter filterList = (LogicalFilter) builder.getFilterObject(filterstr);
    +        assertEquals(LogicalOperation.HDOP_OR, filterList.getOperator());
    +        LogicalFilter l1_left = (LogicalFilter) filterList.getFilterList().get(0);
    +        BasicFilter l1_right = (BasicFilter) filterList.getFilterList().get(1);
    +        //column_2 > 1200
    +        assertEquals(2, l1_right.getColumn().index());
    +        assertEquals(HDOP_GT, l1_right.getOperation());
    +        assertEquals(1200L, l1_right.getConstant().constant());
    +
    +        assertEquals(LogicalOperation.HDOP_AND, l1_left.getOperator());
    +        BasicFilter l2_left = (BasicFilter) l1_left.getFilterList().get(0);
    +        BasicFilter l2_right = (BasicFilter) l1_left.getFilterList().get(1);
    +
    +        //column_1 > '2008-02-01'
    +        assertEquals(1, l2_left.getColumn().index());
    +        assertEquals(HDOP_GT, l2_left.getOperation());
    +        assertEquals("2008-02-01", l2_left.getConstant().constant());
    +
    +        //column_1 < '2008-12-01'
    +        assertEquals(1, l2_right.getColumn().index());
    +        assertEquals(HDOP_LT, l2_right.getOperation());
    +        assertEquals("2008-12-01", l2_right.getConstant().constant());
    +
    +    }
    +
    +    @Test
    +    public void parseFilterWithLogicalOperation() throws Exception {
    +        WhereSQLBuilder builder = new WhereSQLBuilder(null);
    --- End diff --
    
    could you add tests for `WhereSQLBuilder.buildWhereSQL()`, not just for the parsing?
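    
    For instance (a sketch; the expected string assumes the current `"1=1 AND ..."` prefix and Mockito mocks for the input):
    
    ```java
    // Mock inputData to carry a pure-AND filter, e.g.
    // "a1c\"2008-02-01\"o2a2c1200o2l0", plus TEXT/INTEGER column
    // descriptors named cdate and amt, then assert the rendered clause:
    assertEquals("1=1 AND cdate>'2008-02-01' AND amt>1200",
            builder.buildWhereSQL("MySQL"));
    ```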


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85470171
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    --- End diff --
    
    Great documentation! thanks for providing the examples as well.
    minor - could you please remove the double spaces?


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin closed the pull request at:

    https://github.com/apache/incubator-hawq/pull/972


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r86221613
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (a JDBC database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter patterns</h4>
    + * There are three  parameters;  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter is split by a colon (':'); the currently supported <code>column_type</code> values are <code>date,int,enum</code>.
    + * The date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter is split by a colon (':') and identifies the starting range of each fragment.
    + * The range is left-closed, i.e. <code> '>= start_value AND < end_value' </code>. If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code> is <code>enum</code>, the <code>RANGE</code> parameter can be empty. <p>
    + * The <code>INTERVAL</code> parameter is split by a colon (':') and indicates the interval covered by one fragment.
    + * When <code>column_type</code> is <code>date</code>, this parameter must be split by a colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>, the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * Syntax examples:<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    +            return;
    +        partition_by = inConf.getUserProperty("PARTITION_BY").split(":");
    +        partitionColumn = partition_by[0];
    +        partitionType = PartitionType.getType(partition_by[1]);
    +
    +        range = inConf.getUserProperty("RANGE").split(":");
    +
    +        //parse and validate parameter-INTERVAL
    +        if (inConf.getUserProperty("INTERVAL") != null) {
    +            interval = inConf.getUserProperty("INTERVAL").split(":");
    +            intervalNum = Integer.parseInt(interval[0]);
    +            if (interval.length > 1)
    +                intervalType = IntervalType.type(interval[1]);
    +        }
    +        if (intervalNum < 1)
    +            throw new JdbcFragmentException("The parameter INTERVAL must be >= 1, but the actual value is '" + intervalNum + "'");
    +    }
    +    /**
    +     * Returns statistics for Jdbc table. Currently it's not implemented.
    +     */
    +    @Override
    +    public FragmentsStats getFragmentsStats() throws Exception {
    +        throw new UnsupportedOperationException("ANALYZE for Jdbc plugin is not supported");
    +    }
    +
    +
    +
    +    /**
    +     * Returns list of fragments containing all of the
    +     * Jdbc table data.
    +     *
    +     * @return a list of fragments
    +     */
    +    @Override
    +    public List<Fragment> getFragments() throws Exception {
    +        switch (partitionType) {
    +            case DATE: {
    +                SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    +                Date t_start = df.parse(range[0]);
    +                Date t_end = df.parse(range[1]);
    +                int curr_interval = intervalNum;
    +
    +                Calendar frag_start = Calendar.getInstance();
    +                Calendar c_end = Calendar.getInstance();
    +                frag_start.setTime(t_start);
    +                c_end.setTime(t_end);
    +                while (frag_start.before(c_end)) {//|| frag_start.compareTo(c_end) == 0) {
    +                    Calendar frag_end = (Calendar) frag_start.clone();
    +                    switch (intervalType) {
    +                        case DAY:
    +                            frag_end.add(Calendar.DAY_OF_MONTH, curr_interval);
    +                            break;
    +                        case MONTH:
    +                            frag_end.add(Calendar.MONTH, curr_interval);
    +                            break;
    +                        //case YEAR:
    +                        default:
    +                            frag_end.add(Calendar.YEAR, curr_interval);
    +                            break;
    +                    }
    +                    if (frag_end.after(c_end)) frag_end = (Calendar) c_end.clone();
    +
    +                    //make metadata of this fragment , converts the date to a millisecond,then get bytes.
    +                    byte[] ms_start = ByteUtil.getBytes(frag_start.getTimeInMillis());
    +                    byte[] ms_end = ByteUtil.getBytes(frag_end.getTimeInMillis());
    +                    byte[] fragmentMetadata = ByteUtil.mergeBytes(ms_start, ms_end);
    +
    +                    byte[] userData = new byte[0];
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, userData);
    +                    fragments.add(fragment);
    +
    +                    //continue next fragment.
    +                    frag_start = frag_end;
    +                }
    +                break;
    +            }
    +            case INT: {
    +                int i_start = Integer.parseInt(range[0]);
    +                int i_end = Integer.parseInt(range[1]);
    +                int curr_interval = intervalNum;
    +
    +                //validate : curr_interval > 0
    +                int frag_start = i_start;
    +                while (frag_start < i_end) {
    +                    int frag_end = frag_start + curr_interval;
    +                    if (frag_end > i_end) frag_end = i_end;
    +
    +                    byte[] b_start = ByteUtil.getBytes(frag_start);
    +                    byte[] b_end = ByteUtil.getBytes(frag_end);
    +                    byte[] fragmentMetadata = ByteUtil.mergeBytes(b_start, b_end);
    +
    +                    byte[] userData = new byte[0];
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, userData);
    +                    fragments.add(fragment);
    +
    +                    //continue next fragment.
    +                    frag_start = frag_end;// + 1;
    +                }
    +                break;
    +            }
    +            case ENUM:
    +                for (String frag : range) {
    +                    byte[] fragmentMetadata = frag.getBytes();
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, new byte[0]);
    +                    fragments.add(fragment);
    +                }
    +                break;
    +        }
    +
    +        return prepareHosts(fragments);
    +    }
    +
    +    /**
    +     * For each fragment, assign a host address.
    +     * In the JDBC plugin, 'replicas' is the host address of the PXF engine that is running, not of the database engine.
    +     * Since the other PXF host addresses cannot be probed, only the host name of the current PXF engine is returned.
    --- End diff --
    
    It is a valid concern. Every query will now be routed only through the PXF host configured in the LOCATION clause of the DDL.
    If there are no partitions, why should we invoke the Fragmenter API at all?
    When we do have partitions, it would be nice to go with Noa's approach of assigning a different PXF host to each partition. That way we don't rely on only one JVM to handle potentially multiple requests, where we also have to do data type resolution.
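    
    A sketch of that direction (`PXF_HOSTS` is a hypothetical comma-separated user property; the 4-arg `Fragment` constructor is the one already used above):
    
    ```java
    // Round-robin each fragment's host over a configured PXF host list
    // instead of pinning every fragment to the local PXF instance.
    private String[] nextHost(String[] pxfHosts, int fragmentIndex) {
        return new String[] { pxfHosts[fragmentIndex % pxfHosts.length] };
    }
    
    // at fragment creation time:
    //   String[] pxfHosts = inputData.getUserProperty("PXF_HOSTS").split(",");
    //   Fragment fragment = new Fragment(inputData.getDataSource(),
    //           nextHost(pxfHosts, fragments.size()), fragmentMetadata, userData);
    ```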


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85472163
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/WhereSQLBuilder.java ---
    @@ -0,0 +1,140 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import java.util.ArrayList;
    +import java.util.List;
    +
    +import org.apache.hawq.pxf.api.LogicalFilter;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.api.BasicFilter;
    +import org.apache.hawq.pxf.api.FilterParser;
    +import org.apache.hawq.pxf.api.io.DataType;
    +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +/**
    + * Parses the filter object generated by the parent class {@link org.apache.hawq.pxf.plugins.jdbc.JdbcFilterBuilder}
    + * and builds a WHERE statement.
    + * For multiple filters, currently only HDOP_AND is supported.
    + * An unsupported filter operation or LogicalOperation results in a null statement.
    + *
    + */
    +public class WhereSQLBuilder extends JdbcFilterBuilder {
    +    private InputData inputData;
    +
    +    public WhereSQLBuilder(InputData input) {
    +        inputData = input;
    +    }
    +
    +    /**
    +     * 1. Checks the LogicalOperator; JDBC currently only supports HDOP_AND.
    +     * 2. Converts the filter into a BasicFilter list.
    +     */
    +    private static List<BasicFilter> convertBasicFilterList(Object filter, List<BasicFilter> returnList) throws UnsupportedFilterException {
    +        if (returnList == null)
    +            returnList = new ArrayList<>();
    +        if (filter instanceof BasicFilter) {
    +            returnList.add((BasicFilter) filter);
    +            return returnList;
    +        }
    +        LogicalFilter lfilter = (LogicalFilter) filter;
    +        if (lfilter.getOperator() != FilterParser.LogicalOperation.HDOP_AND)
    +            throw new UnsupportedFilterException("unsupported LogicalOperation : " + lfilter.getOperator());
    +        for (Object f : lfilter.getFilterList()) {
    +            returnList = convertBasicFilterList(f, returnList);
    +        }
    +        return returnList;
    +    }
    +
    +    public String buildWhereSQL(String db_product) throws Exception {
    +        if (!inputData.hasFilter()) return null;
    +        List<BasicFilter> filters = null;
    +        try {
    +            String filterString = inputData.getFilterString();
    +            Object filterobj = getFilterObject(filterString);
    +
    +            filters = convertBasicFilterList(filterobj, filters);
    +            StringBuffer sb = new StringBuffer("1=1");
    +            for (Object obj : filters) {
    +                BasicFilter filter = (BasicFilter) obj;
    +                sb.append(" AND ");
    +
    +                ColumnDescriptor column = inputData.getColumn(filter.getColumn().index());
    +                //the column name of filter
    +                sb.append(column.columnName());
    +
    +                //the operation of filter
    +                FilterParser.Operation op = filter.getOperation();
    +                switch (op) {
    +                    case HDOP_LT:
    +                        sb.append("<");
    +                        break;
    +                    case HDOP_GT:
    +                        sb.append(">");
    +                        break;
    +                    case HDOP_LE:
    +                        sb.append("<=");
    +                        break;
    +                    case HDOP_GE:
    +                        sb.append(">=");
    +                        break;
    +                    case HDOP_EQ:
    +                        sb.append("=");
    +                        break;
    +                    default:
    +                        throw new UnsupportedFilterException("unsupported Filter operation : " + op);
    +                }
    +                //
    +                DbProduct dbProduct = DbProduct.getDbProduct(db_product);
    +                Object val = filter.getConstant().constant();
    +                switch (DataType.get(column.columnTypeCode())) {
    +                    case SMALLINT:
    +                    case INTEGER:
    +                    case BIGINT:
    +                    case FLOAT8:
    +                    case REAL:
    +                    case BOOLEAN:
    +                        sb.append(val.toString());
    +                        break;
    +                    case TEXT:
    +                        sb.append("'").append(val.toString()).append("'");
    +                        break;
    +                    case DATE:
    +                        //Date fields need database-product-specific treatment, delegated to DbProduct.
    +                        sb.append(dbProduct.wrapDate(val));
    +                        break;
    +                    default:
    +                        throw new UnsupportedFilterException("unsupported column type for filtering : " + column.columnTypeCode());
    +                }
    +
    +                sb.append("");
    --- End diff --
    
    why is it needed?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by edespino <gi...@git.apache.org>.
Github user edespino commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r102664824
  
    --- Diff: pxf/build.gradle ---
    @@ -431,6 +431,38 @@ project('pxf-hbase') {
         }
     }
     
    +
    +project('pxf-jdbc') {
    +    dependencies {
    +        compile(project(':pxf-api'))
    +        compile(project(':pxf-service'))
    +        compile "org.apache.hadoop:hadoop-common:$hadoopVersion"
    +        compile "org.apache.hadoop:hadoop-hdfs:$hadoopVersion"
    +        testCompile "mysql:mysql-connector-java:5.1.6"
    +    }
    +    tasks.withType(JavaCompile) {
    +        options.encoding = "UTF-8"
    +    }
    +
    +    ospackage {
    +        packageName = versionedPackageName("${project.name}")
    +        summary = 'HAWQ Extension Framework (PXF), JDBC plugin'
    +        description = 'Querying external data stored in RelationDatabase using JDBC.'
    --- End diff --
    
    "RelationDatabase" should be "Relation Database"


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85469744
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcFilterBuilder.java ---
    @@ -0,0 +1,145 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.BasicFilter;
    +import org.apache.hawq.pxf.api.FilterParser;
    +import org.apache.hawq.pxf.api.LogicalFilter;
    +
    +import java.util.Arrays;
    +import java.util.LinkedList;
    +import java.util.List;
    +import java.util.regex.Matcher;
    +import java.util.regex.Pattern;
    +
    +/**
    + * Uses the filter parser code to build a filter object, either simple - a
    + * single {@link BasicFilter} object or a
    + * compound - a {@link java.util.List} of
    + * {@link BasicFilter} objects.
    + * The subclass {@link org.apache.hawq.pxf.plugins.jdbc.WhereSQLBuilder} uses the filter to
    + * generate a WHERE statement.
    + */
    +public class JdbcFilterBuilder implements FilterParser.FilterBuilder  {
    +    /**
    +     * Translates a filterString into a {@link BasicFilter} or a
    +     * list of such filters.
    +     *
    +     * @param filterString the string representation of the filter
    +     * @return a single {@link BasicFilter}
    +     *         object or a {@link java.util.List} of
    +     *         {@link BasicFilter} objects.
    +     * @throws Exception if parsing the filter failed or filter is not a basic
    +     *             filter or list of basic filters
    +     */
    +    public Object getFilterObject(String filterString) throws Exception {
    +        // First check for LogicalOperator, Jdbc currently only support HDOP_AND.
    +      //  if (filterString.contains("l1") || filterString.contains("l2"))
    --- End diff --
    
    please remove commented code.


---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    @jiadexin the next step is for us to discuss the integration testing. We can discuss this in the JIRA HAWQ-1108.


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85678674
  
    --- Diff: pxf/pxf-jdbc/src/test/java/org/apache/hawq/pxf/plugins/jdbc/JdbcFilterBuilderTest.java ---
    @@ -0,0 +1,81 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * 
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + * 
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +
    +import org.apache.hawq.pxf.api.BasicFilter;
    +import org.apache.hawq.pxf.api.FilterParser.LogicalOperation;
    +import org.apache.hawq.pxf.api.LogicalFilter;
    +import org.junit.Test;
    +
    +import static org.apache.hawq.pxf.api.FilterParser.Operation.*;
    +import static org.junit.Assert.assertEquals;
    +
    +public class JdbcFilterBuilderTest {
    +    @Test
    +    public void parseFilterWithThreeOperations() throws Exception {
    +        //origin sql => cdate>'2008-02-01' and cdate<'2008-12-01' or amt > 1200
    +        //filterstr="a1c\"first\"o5a2c2o2l0";//col_1=first and col_2=2
    +        String filterstr = "a1c\"2008-02-01\"o2a1c\"2008-12-01\"o1l0a2c1200o2l1"; //col_1>'first' and col_1<'2008-12-01' or col_2 > 1200;
    +        JdbcFilterBuilder builder = new JdbcFilterBuilder();
    +
    +        LogicalFilter filterList = (LogicalFilter) builder.getFilterObject(filterstr);
    +        assertEquals(LogicalOperation.HDOP_OR, filterList.getOperator());
    +        LogicalFilter l1_left = (LogicalFilter) filterList.getFilterList().get(0);
    +        BasicFilter l1_right = (BasicFilter) filterList.getFilterList().get(1);
    +        //column_2 > 1200
    +        assertEquals(2, l1_right.getColumn().index());
    +        assertEquals(HDOP_GT, l1_right.getOperation());
    +        assertEquals(1200L, l1_right.getConstant().constant());
    +
    +        assertEquals(LogicalOperation.HDOP_AND, l1_left.getOperator());
    +        BasicFilter l2_left = (BasicFilter) l1_left.getFilterList().get(0);
    +        BasicFilter l2_right = (BasicFilter) l1_left.getFilterList().get(1);
    +
    +        //column_1 > '2008-02-01'
    +        assertEquals(1, l2_left.getColumn().index());
    +        assertEquals(HDOP_GT, l2_left.getOperation());
    +        assertEquals("2008-02-01", l2_left.getConstant().constant());
    +
    +        //column_1 < '2008-12-01'
    +        assertEquals(1, l2_right.getColumn().index());
    +        assertEquals(HDOP_LT, l2_right.getOperation());
    +        assertEquals("2008-12-01", l2_right.getConstant().constant());
    +
    +    }
    +
    +    @Test
    +    public void parseFilterWithLogicalOperation() throws Exception {
    +        WhereSQLBuilder builder = new WhereSQLBuilder(null);
    --- End diff --
    
    Since `buildWhereSQL()` is a public function, it should be easy to test it - create a `WhereSQLBuilder` instance like you do here, mock the input to have a filter string property, and check that calling this function returns the expected sql string.
    That way the test won't rely on any outside resources, and errors in the logic will be easy to identify.
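    
    A sketch of such a test (column names and the expected string are illustrative; `mock`/`when`/`assertEquals` are the statics these tests already import):
    
    ```java
    @Test
    public void buildWhereSQLWithMockedInput() throws Exception {
        // filter: cdate > '2008-02-01' AND amt > 1200
        InputData input = mock(InputData.class);
        when(input.hasFilter()).thenReturn(true);
        when(input.getFilterString()).thenReturn("a1c\"2008-02-01\"o2a2c1200o2l0");
    
        ColumnDescriptor cdate = mock(ColumnDescriptor.class);
        when(cdate.columnName()).thenReturn("cdate");
        when(cdate.columnTypeCode()).thenReturn(DataType.TEXT.getOID());
        when(input.getColumn(1)).thenReturn(cdate);
    
        ColumnDescriptor amt = mock(ColumnDescriptor.class);
        when(amt.columnName()).thenReturn("amt");
        when(amt.columnTypeCode()).thenReturn(DataType.INTEGER.getOID());
        when(input.getColumn(2)).thenReturn(amt);
    
        WhereSQLBuilder builder = new WhereSQLBuilder(input);
        // "MySQL" stands in for DatabaseMetaData.getDatabaseProductName()
        assertEquals("1=1 AND cdate>'2008-02-01' AND amt>1200",
                builder.buildWhereSQL("MySQL"));
    }
    ```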


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85472352
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/utils/DbProduct.java ---
    @@ -0,0 +1,51 @@
    +package org.apache.hawq.pxf.plugins.jdbc.utils;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.UnsupportedTypeException;
    +import org.iq80.leveldb.DB;
    --- End diff --
    
    licensing question - do we need to add this resource to the licenses list?


---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    @shivzone If there are no partitions, there is only one fragment.
    If there are partitions, scheduling-strategy support from HAWQ-MASTER is required.
    
    HDFS is distributed, and pxf-hdfs can allocate hosts for each fragment via the HDFS file metadata.
    The role of pxf-jdbc is mainly to integrate the traditional relational databases an enterprise is already running; these systems are generally stand-alone and may not have a PXF engine deployed.
    In the current HAWQ PXF engine implementation, a PXF instance cannot know all PXF hosts, so only the current host name can be assigned.
    If the hosts were configured in the LOCATION clause of the DDL, the DDL statement would become extremely long, and not flexible, when a large number of PXF hosts is used.
    I think HAWQ-MASTER should support a scheduling strategy: when the fragment host name is null, HAWQ-MASTER automatically assigns a HAWQ segment host.


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85471130
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (a JDBC database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter patterns</h4>
    + * There are three  parameters;  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter is split by a colon (':'); the currently supported <code>column_type</code> values are <code>date,int,enum</code>.
    + * The date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter is split by a colon (':') and identifies the starting range of each fragment.
    + * The range is left-closed, i.e. <code> '>= start_value AND < end_value' </code>. If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code> is <code>enum</code>, the <code>RANGE</code> parameter can be empty. <p>
    + * The <code>INTERVAL</code> parameter is split by a colon (':') and indicates the interval covered by one fragment.
    + * When <code>column_type</code> is <code>date</code>, this parameter must be split by a colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>, the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * Syntax examples:<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    +            return;
    +        partition_by = inConf.getUserProperty("PARTITION_BY").split(":");
    +        partitionColumn = partition_by[0];
    +        partitionType = PartitionType.getType(partition_by[1]);
    +
    +        range = inConf.getUserProperty("RANGE").split(":");
    +
    +        //parse and validate parameter-INTERVAL
    +        if (inConf.getUserProperty("INTERVAL") != null) {
    +            interval = inConf.getUserProperty("INTERVAL").split(":");
    +            intervalNum = Integer.parseInt(interval[0]);
    +            if (interval.length > 1)
    +                intervalType = IntervalType.type(interval[1]);
    +        }
    +        if (intervalNum < 1)
    +            throw new JdbcFragmentException("The parameter INTERVAL must be >= 1, but the actual value is '" + intervalNum + "'");
    +    }
    +    /**
    +     * Returns statistics for Jdbc table. Currently it's not implemented.
    +     */
    +    @Override
    +    public FragmentsStats getFragmentsStats() throws Exception {
    +        throw new UnsupportedOperationException("ANALYZE for Jdbc plugin is not supported");
    +    }
    +
    +
    +
    +    /**
    +     * Returns list of fragments containing all of the
    +     * Jdbc table data.
    +     *
    +     * @return a list of fragments
    +     */
    +    @Override
    +    public List<Fragment> getFragments() throws Exception {
    +        switch (partitionType) {
    +            case DATE: {
    +                SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    +                Date t_start = df.parse(range[0]);
    +                Date t_end = df.parse(range[1]);
    +                int curr_interval = intervalNum;
    +
    +                Calendar frag_start = Calendar.getInstance();
    +                Calendar c_end = Calendar.getInstance();
    +                frag_start.setTime(t_start);
    +                c_end.setTime(t_end);
    +                while (frag_start.before(c_end)) {//|| frag_start.compareTo(c_end) == 0) {
    +                    Calendar frag_end = (Calendar) frag_start.clone();
    +                    switch (intervalType) {
    +                        case DAY:
    +                            frag_end.add(Calendar.DAY_OF_MONTH, curr_interval);
    +                            break;
    +                        case MONTH:
    +                            frag_end.add(Calendar.MONTH, curr_interval);
    +                            break;
    +                        //case YEAR:
    +                        default:
    +                            frag_end.add(Calendar.YEAR, curr_interval);
    +                            break;
    +                    }
    +                    if (frag_end.after(c_end)) frag_end = (Calendar) c_end.clone();
    --- End diff --
    
    for readability, please break this statement into two lines.
    ```
      if ... {
        ...
      }
    ```
    https://cwiki.apache.org/confluence/display/HAWQ/PXF+Coding+Conventions#PXFCodingConventions-OneLinersinBrackets
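    
    Applied here, that would read:
    
    ```java
    if (frag_end.after(c_end)) {
        frag_end = (Calendar) c_end.clone();
    }
    ```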


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by shivzone <gi...@git.apache.org>.
Github user shivzone commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r86202298
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,297 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.api.UserDataException;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (a JDBC database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter patterns</h4>
    + * There are three  parameters;  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter is split by a colon (':'); the currently supported <code>column_type</code> values are <code>date,int,enum</code>.
    + * The date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter is split by a colon (':') and identifies the starting range of each fragment.
    + * The range is left-closed, i.e. <code> '>= start_value AND < end_value' </code>. If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code> is <code>enum</code>, the <code>RANGE</code> parameter can be empty. <p>
    + * The <code>INTERVAL</code> parameter is split by a colon (':') and indicates the interval covered by one fragment.
    + * When <code>column_type</code> is <code>date</code>, this parameter must be split by a colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>, the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * Syntax examples:<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partitionBy = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        //30 days
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);
    +        //365 days
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);
    --- End diff --
    
    What is the factor of 30 for? You are converting a year to 365 days, so why apply an extra multiple of 30?
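    
    For reference, a quick arithmetic check (the corrected constant is hypothetical, not part of the PR):
    
    ```java
    // As committed, the YEAR entry carries a stray factor of 30 and works out
    // to roughly 30 years rather than one:
    long yearAsWritten = (long) 365 * 30 * 24 * 60 * 60 * 1000; // 946,080,000,000 ms ~ 30 years
    
    // Presumably intended: 365 days expressed in milliseconds.
    long yearIntended = (long) 365 * 24 * 60 * 60 * 1000;       // 31,536,000,000 ms ~ 1 year
    ```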


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85676673
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    +            return;
    +        partition_by = inConf.getUserProperty("PARTITION_BY").split(":");
    +        partitionColumn = partition_by[0];
    +        partitionType = PartitionType.getType(partition_by[1]);
    +
    +        range = inConf.getUserProperty("RANGE").split(":");
    +
    +        //parse and validate parameter-INTERVAL
    +        if (inConf.getUserProperty("INTERVAL") != null) {
    +            interval = inConf.getUserProperty("INTERVAL").split(":");
    +            intervalNum = Integer.parseInt(interval[0]);
    +            if (interval.length > 1)
    +                intervalType = IntervalType.type(interval[1]);
    +        }
    +        if (intervalNum < 1)
    +            throw new JdbcFragmentException("The parameter{INTERVAL} must > 1, but actual is '" + intervalNum+"'");
    +    }
    +    /**
    +     * Returns statistics for Jdbc table. Currently it's not implemented.
    +     */
    +    @Override
    +    public FragmentsStats getFragmentsStats() throws Exception {
    +        throw new UnsupportedOperationException("ANALYZE for Jdbc plugin is not supported");
    +    }
    +
    +
    +
    +    /**
    +     * Returns list of fragments containing all of the
    +     * Jdbc table data.
    +     *
    +     * @return a list of fragments
    +     */
    +    @Override
    +    public List<Fragment> getFragments() throws Exception {
    +        switch (partitionType) {
    +            case DATE: {
    +                SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    +                Date t_start = df.parse(range[0]);
    +                Date t_end = df.parse(range[1]);
    +                int curr_interval = intervalNum;
    +
    +                Calendar frag_start = Calendar.getInstance();
    +                Calendar c_end = Calendar.getInstance();
    +                frag_start.setTime(t_start);
    +                c_end.setTime(t_end);
    +                while (frag_start.before(c_end)) {//|| frag_start.compareTo(c_end) == 0) {
    +                    Calendar frag_end = (Calendar) frag_start.clone();
    +                    switch (intervalType) {
    +                        case DAY:
    +                            frag_end.add(Calendar.DAY_OF_MONTH, curr_interval);
    +                            break;
    +                        case MONTH:
    +                            frag_end.add(Calendar.MONTH, curr_interval);
    +                            break;
    +                        //case YEAR:
    +                        default:
    +                            frag_end.add(Calendar.YEAR, curr_interval);
    +                            break;
    +                    }
    +                    if (frag_end.after(c_end)) frag_end = (Calendar) c_end.clone();
    +
    +                    //make metadata of this fragment , converts the date to a millisecond,then get bytes.
    +                    byte[] ms_start = ByteUtil.getBytes(frag_start.getTimeInMillis());
    +                    byte[] ms_end = ByteUtil.getBytes(frag_end.getTimeInMillis());
    +                    byte[] fragmentMetadata = ByteUtil.mergeBytes(ms_start, ms_end);
    +
    +                    byte[] userData = new byte[0];
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, userData);
    +                    fragments.add(fragment);
    +
    +                    //continue next fragment.
    +                    frag_start = frag_end;
    +                }
    +                break;
    +            }
    +            case INT: {
    +                int i_start = Integer.parseInt(range[0]);
    +                int i_end = Integer.parseInt(range[1]);
    +                int curr_interval = intervalNum;
    +
    +                //validate : curr_interval > 0
    +                int frag_start = i_start;
    +                while (frag_start < i_end) {
    +                    int frag_end = frag_start + curr_interval;
    +                    if (frag_end > i_end) frag_end = i_end;
    +
    +                    byte[] b_start = ByteUtil.getBytes(frag_start);
    +                    byte[] b_end = ByteUtil.getBytes(frag_end);
    +                    byte[] fragmentMetadata = ByteUtil.mergeBytes(b_start, b_end);
    +
    +                    byte[] userData = new byte[0];
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, userData);
    +                    fragments.add(fragment);
    +
    +                    //continue next fragment.
    +                    frag_start = frag_end;// + 1;
    +                }
    +                break;
    +            }
    +            case ENUM:
    +                for (String frag : range) {
    +                    byte[] fragmentMetadata = frag.getBytes();
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, new byte[0]);
    +                    fragments.add(fragment);
    +                }
    +                break;
    +        }
    +
    +        return prepareHosts(fragments);
    +    }
    +
    +    /**
    +     * For each fragment , assigned a host address.
    +     * In Jdbc Plugin, 'replicas' is the host address of the PXF engine that is running, not the database engine.
    +     * Since the other PXF host addresses can not be probed, only the host name of the current PXF engine is returned.
    --- End diff --
    
    The master node's postgresql.conf file is configured with the following parameter:
    `pxf_isilon = true`
    Could fragments then also be assigned to multiple PXF instances?
    
    hd_work_mgr.c contains the following code:
    
    ```c
    /*
     * in case of remote storage, the segment host is also where the PXF will be running
     * so we set allocated->host accordingly, instead of the remote storage system - datanode ip.
     */
    if (pxf_isilon)
    {
        pfree(allocated->host);
        allocated->host = pstrdup(host_ip);
    }
    ```


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by hornn <gi...@git.apache.org>.
Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85471415
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    +            return;
    +        partition_by = inConf.getUserProperty("PARTITION_BY").split(":");
    +        partitionColumn = partition_by[0];
    +        partitionType = PartitionType.getType(partition_by[1]);
    +
    +        range = inConf.getUserProperty("RANGE").split(":");
    +
    +        //parse and validate parameter-INTERVAL
    +        if (inConf.getUserProperty("INTERVAL") != null) {
    +            interval = inConf.getUserProperty("INTERVAL").split(":");
    +            intervalNum = Integer.parseInt(interval[0]);
    +            if (interval.length > 1)
    +                intervalType = IntervalType.type(interval[1]);
    +        }
    +        if (intervalNum < 1)
    +            throw new JdbcFragmentException("The parameter{INTERVAL} must > 1, but actual is '" + intervalNum+"'");
    +    }
    +    /**
    +     * Returns statistics for Jdbc table. Currently it's not implemented.
    +     */
    +    @Override
    +    public FragmentsStats getFragmentsStats() throws Exception {
    +        throw new UnsupportedOperationException("ANALYZE for Jdbc plugin is not supported");
    +    }
    +
    +
    +
    +    /**
    +     * Returns list of fragments containing all of the
    +     * Jdbc table data.
    +     *
    +     * @return a list of fragments
    +     */
    +    @Override
    +    public List<Fragment> getFragments() throws Exception {
    +        switch (partitionType) {
    +            case DATE: {
    +                SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    +                Date t_start = df.parse(range[0]);
    +                Date t_end = df.parse(range[1]);
    +                int curr_interval = intervalNum;
    +
    +                Calendar frag_start = Calendar.getInstance();
    +                Calendar c_end = Calendar.getInstance();
    +                frag_start.setTime(t_start);
    +                c_end.setTime(t_end);
    +                while (frag_start.before(c_end)) {//|| frag_start.compareTo(c_end) == 0) {
    +                    Calendar frag_end = (Calendar) frag_start.clone();
    +                    switch (intervalType) {
    +                        case DAY:
    +                            frag_end.add(Calendar.DAY_OF_MONTH, curr_interval);
    +                            break;
    +                        case MONTH:
    +                            frag_end.add(Calendar.MONTH, curr_interval);
    +                            break;
    +                        //case YEAR:
    +                        default:
    +                            frag_end.add(Calendar.YEAR, curr_interval);
    +                            break;
    +                    }
    +                    if (frag_end.after(c_end)) frag_end = (Calendar) c_end.clone();
    +
    +                    //make metadata of this fragment , converts the date to a millisecond,then get bytes.
    +                    byte[] ms_start = ByteUtil.getBytes(frag_start.getTimeInMillis());
    +                    byte[] ms_end = ByteUtil.getBytes(frag_end.getTimeInMillis());
    +                    byte[] fragmentMetadata = ByteUtil.mergeBytes(ms_start, ms_end);
    +
    +                    byte[] userData = new byte[0];
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, userData);
    +                    fragments.add(fragment);
    +
    +                    //continue next fragment.
    +                    frag_start = frag_end;
    +                }
    +                break;
    +            }
    +            case INT: {
    +                int i_start = Integer.parseInt(range[0]);
    +                int i_end = Integer.parseInt(range[1]);
    +                int curr_interval = intervalNum;
    +
    +                //validate : curr_interval > 0
    +                int frag_start = i_start;
    +                while (frag_start < i_end) {
    +                    int frag_end = frag_start + curr_interval;
    +                    if (frag_end > i_end) frag_end = i_end;
    +
    +                    byte[] b_start = ByteUtil.getBytes(frag_start);
    +                    byte[] b_end = ByteUtil.getBytes(frag_end);
    +                    byte[] fragmentMetadata = ByteUtil.mergeBytes(b_start, b_end);
    +
    +                    byte[] userData = new byte[0];
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, userData);
    +                    fragments.add(fragment);
    +
    +                    //continue next fragment.
    +                    frag_start = frag_end;// + 1;
    +                }
    +                break;
    +            }
    +            case ENUM:
    +                for (String frag : range) {
    +                    byte[] fragmentMetadata = frag.getBytes();
    +                    Fragment fragment = new Fragment(inputData.getDataSource(), null, fragmentMetadata, new byte[0]);
    +                    fragments.add(fragment);
    +                }
    +                break;
    +        }
    +
    +        return prepareHosts(fragments);
    +    }
    +
    +    /**
    +     * For each fragment , assigned a host address.
    +     * In Jdbc Plugin, 'replicas' is the host address of the PXF engine that is running, not the database engine.
    +     * Since the other PXF host addresses can not be probed, only the host name of the current PXF engine is returned.
    --- End diff --
    
    This limitation probably has a performance impact, because all of the fragments will be processed by the same PXF instance. In the future we should probably provide a list of available PXF instances, to make sure we get real parallelism.
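    
    For illustration, a minimal sketch of what `prepareHosts` presumably does today, given the javadoc above. Only `InetAddress` is implied by the imports; the `setReplicas` call is an illustrative name, not a confirmed API:
    
    ```java
    // Pin every fragment to the single host this PXF instance runs on,
    // since the other PXF hosts cannot be probed from here.
    private List<Fragment> prepareHosts(List<Fragment> fragments) throws Exception {
        String pxfHost = InetAddress.getLocalHost().getHostAddress();
        String[] hosts = new String[]{pxfHost};
        for (Fragment fragment : fragments)
            fragment.setReplicas(hosts); // illustrative setter, see note above
        return fragments;
    }
    ```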


---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by edespino <gi...@git.apache.org>.
Github user edespino commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    @jiadexin - Thanks, I have confirmed the javadocs are now passing in my Java 8 environment.


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r86278294
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,297 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.api.UserDataException;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partitionBy = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        //30 days
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);
    +        //365 days
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);
    --- End diff --
    
    This is leftover legacy code; it is not used and can be deleted.


---

[GitHub] incubator-hawq pull request #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by jiadexin <gi...@git.apache.org>.
Github user jiadexin commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/972#discussion_r85678597
  
    --- Diff: pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java ---
    @@ -0,0 +1,284 @@
    +package org.apache.hawq.pxf.plugins.jdbc;
    +
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +import org.apache.hawq.pxf.api.Fragmenter;
    +import org.apache.hawq.pxf.api.FragmentsStats;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.DbProduct;
    +import org.apache.hawq.pxf.plugins.jdbc.utils.ByteUtil;
    +import org.apache.hawq.pxf.api.Fragment;
    +import org.apache.hawq.pxf.api.utilities.InputData;
    +
    +import java.net.InetAddress;
    +import java.text.SimpleDateFormat;
    +import java.util.*;
    +
    +
    +/**
    + * Fragmenter class for JDBC data resources.
    + *
    + * Extends the {@link Fragmenter} abstract class, with the purpose of transforming
    + * an input data path  (an JDBC Database table name  and user request parameters)  into a list of regions
    + * that belong to this table.
    + * <p>
    + * <h4>The parameter Patterns </h4>
    + * There are three  parameters,  the format is as follows:<p>
    + * <pre>
    + * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
    + * </pre>
    + * The <code>PARTITION_BY</code> parameter can be split by colon(':'),the <code>column_type</code> current supported : <code>date,int,enum</code> .
    + * The Date format is 'yyyy-MM-dd'. <p>
    + * The <code>RANGE</code> parameter can be split by colon(':') ,used to identify the starting range of each fragment.
    + * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
    + * the <code>end_value</code> can be empty. If the <code>column_type</code>is <code>enum</code>,the parameter <code>RANGE</code> can be empty. <p>
    + * The <code>INTERVAL</code> parameter can be split by colon(':'), indicate the interval value of one fragment.
    + * When <code>column_type</code> is <code>date</code>,this parameter must be split by colon, and <code>interval_unit</code> can be <code>year,month,day</code>.
    + * When <code>column_type</code> is <code>int</code>, the <code>interval_unit</code> can be empty.
    + * When <code>column_type</code> is <code>enum</code>,the <code>INTERVAL</code> parameter can be empty.
    + * </p>
    + * <p>
    + * The syntax examples is :<p>
    + * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
    + * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
    + * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
    + * </p>
    + *
    + */
    +public class JdbcPartitionFragmenter extends Fragmenter {
    +    String[] partition_by = null;
    +    String[] range = null;
    +    String[] interval = null;
    +    PartitionType partitionType = null;
    +    String partitionColumn = null;
    +    IntervalType intervalType = null;
    +    int intervalNum = 1;
    +
    +    enum PartitionType {
    +        DATE,
    +        INT,
    +        ENUM;
    +
    +        public static PartitionType getType(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    enum IntervalType {
    +        DAY,
    +        MONTH,
    +        YEAR;
    +
    +        public static IntervalType type(String str) {
    +            return valueOf(str.toUpperCase());
    +        }
    +    }
    +
    +    //The unit interval, in milliseconds, that is used to estimate the number of slices for the date partition type
    +    static Map<IntervalType, Long> intervals = new HashMap<IntervalType, Long>();
    +
    +    static {
    +        intervals.put(IntervalType.DAY, (long) 24 * 60 * 60 * 1000);
    +        intervals.put(IntervalType.MONTH, (long) 30 * 24 * 60 * 60 * 1000);//30 day
    +        intervals.put(IntervalType.YEAR, (long) 365 * 30 * 24 * 60 * 60 * 1000);//365 day
    +    }
    +
    +    /**
    +     * Constructor for JdbcPartitionFragmenter.
    +     *
    +     * @param inConf input data such as which Jdbc table to scan
    +     * @throws JdbcFragmentException
    +     */
    +    public JdbcPartitionFragmenter(InputData inConf) throws JdbcFragmentException {
    +        super(inConf);
    +        if(inConf.getUserProperty("PARTITION_BY") == null )
    --- End diff --
    
    The PARTITION_BY parameter is not required.
    This has been revised to return a single fragment when it is absent; see the sketch below.
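    
    Roughly, a sketch of the revised behavior (reusing only calls that already appear in the diff above):
    
    ```java
    // No PARTITION_BY given: emit one fragment covering the whole table.
    if (inputData.getUserProperty("PARTITION_BY") == null) {
        fragments.add(new Fragment(inputData.getDataSource(), null, new byte[0], new byte[0]));
        return prepareHosts(fragments);
    }
    ```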


---

[GitHub] incubator-hawq issue #972: HAWQ-1108 Add JDBC PXF Plugin

Posted by edespino <gi...@git.apache.org>.
Github user edespino commented on the issue:

    https://github.com/apache/incubator-hawq/pull/972
  
    I pulled the PR source locally and attempted to build it on my Mac OS X 10.12.3/Java 1.8.0_121.  The pxf-jdbc:javadoc build operation failed with the following output:
    
    ```
    08:50 $ javac -version
    javac 1.8.0_121
    ✔ ~/workspace/HAWQ/incubator-hawq/pxf [PR-972 L|…3] 
    08:50 $ java -version
    java version "1.8.0_121"
    Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
    ✔ ~/workspace/HAWQ/incubator-hawq/pxf [PR-972 L|…3] 
    08:50 $ system_profiler SPSoftwareDataType
    
    Software:
    
        System Software Overview:
    
          System Version: macOS 10.12.3 (16D32)
          Kernel Version: Darwin 16.4.0
          Boot Volume: Macintosh HD
          Boot Mode: Normal
          Computer Name: EdEspino
          User Name: Ed Espino  (eespino)
          Secure Virtual Memory: Enabled
          System Integrity Protection: Disabled
          Time since boot: 3 days 19:44
    
    ✔ ~/workspace/HAWQ/incubator-hawq/pxf [PR-972 L|…3] 
    08:51 $ ✔ ~/workspace/HAWQ/incubator-hawq/pxf [PR-972 L|…3] 
    08:51 $ make
    ./gradlew clean release 
    :clean UP-TO-DATE
    :pxf:clean
    :pxf-api:clean
    :pxf-hbase:clean
    :pxf-hdfs:clean
    :pxf-hive:clean
    :pxf-jdbc:clean
    :pxf-json:clean UP-TO-DATE
    :pxf-service:clean
    :pxf:compileJava UP-TO-DATE
    :pxf:processResources UP-TO-DATE
    :pxf:classes UP-TO-DATE
    :pxf:jar SKIPPED
    :pxf:assemble UP-TO-DATE
    :pxf:compileTestJava UP-TO-DATE
    :pxf:processTestResources UP-TO-DATE
    :pxf:testClasses UP-TO-DATE
    :pxf:test UP-TO-DATE
    :pxf:check UP-TO-DATE
    :pxf:build UP-TO-DATE
    :pxf:buildRpm
    :pxf:distTar UP-TO-DATE
    :pxf:javadoc UP-TO-DATE
    :pxf-api:compileJava
    :pxf-api:processResources UP-TO-DATE
    :pxf-api:classes
    :pxf-api:jar
    :pxf-api:assemble
    :pxf-api:compileTestJavaNote: /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-api/src/test/java/org/apache/hawq/pxf/api/utilities/ColumnDescriptorTest.java uses or overrides a deprecated API.
    Note: Recompile with -Xlint:deprecation for details.
    
    :pxf-api:processTestResources UP-TO-DATE
    :pxf-api:testClasses
    :pxf-api:test
    :pxf-api:check
    :pxf-api:build
    :pxf-api:javadoc
    :pxf-hbase:compileJava
    :pxf-hbase:processResources UP-TO-DATE
    :pxf-hbase:classes
    :pxf-hbase:jar
    :pxf-hbase:assemble
    :pxf-hbase:compileTestJava
    :pxf-hbase:processTestResources UP-TO-DATE
    :pxf-hbase:testClasses
    :pxf-hbase:test
    :pxf-hbase:check
    :pxf-hbase:build
    :pxf-hbase:buildRpm
    :pxf-hbase:distTar
    :pxf-hbase:javadoc
    :pxf-hdfs:compileJava
    :pxf-hdfs:processResources UP-TO-DATE
    :pxf-hdfs:classes
    :pxf-hdfs:jar
    :pxf-hdfs:assemble
    :pxf-hdfs:compileTestJava
    :pxf-hdfs:processTestResources UP-TO-DATE
    :pxf-hdfs:testClasses
    :pxf-hdfs:test
    :pxf-hdfs:check
    :pxf-hdfs:build
    :pxf-hdfs:buildRpm
    :pxf-hdfs:distTar
    :pxf-hdfs:javadoc
    :pxf-service:generateSources
    :pxf-service:compileJava
    :pxf-service:processResources
    :pxf-service:classes
    :pxf-service:jar
    :pxf-hive:compileJava
    :pxf-hive:processResources UP-TO-DATE
    :pxf-hive:classes
    :pxf-hive:jar
    :pxf-hive:assemble
    :pxf-hive:compileTestJavaNote: /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-hive/src/test/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilitiesTest.java uses or overrides a deprecated API.
    Note: Recompile with -Xlint:deprecation for details.
    Note: /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-hive/src/test/java/org/apache/hawq/pxf/plugins/hive/HiveORCAccessorTest.java uses unchecked or unsafe operations.
    Note: Recompile with -Xlint:unchecked for details.
    
    :pxf-hive:processTestResources UP-TO-DATE
    :pxf-hive:testClasses
    :pxf-hive:test
    :pxf-hive:check
    :pxf-hive:build
    :pxf-hive:buildRpm
    :pxf-hive:distTar
    :pxf-service:javadoc UP-TO-DATE
    :pxf-hive:javadoc
    :pxf-jdbc:compileJava
    :pxf-jdbc:processResources UP-TO-DATE
    :pxf-jdbc:classes
    :pxf-jdbc:jar
    :pxf-jdbc:assemble
    :pxf-jdbc:compileTestJava
    :pxf-jdbc:processTestResources UP-TO-DATE
    :pxf-jdbc:testClasses
    :pxf-jdbc:test
    :pxf-jdbc:check
    :pxf-jdbc:build
    :pxf-jdbc:buildRpm
    :pxf-jdbc:distTar
    :pxf-jdbc:javadoc/Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:238: warning: no description for @throws
         * @throws Exception
           ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:40: warning: empty <p> tag
     * <p>
       ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:41: error: header used out of sequence: <H4>
     * <h4>The parameter Patterns </h4>
       ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:42: warning: empty <p> tag
     * There are three  parameters,  the format is as follows:<p>
                                                              ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:44: error: semicolon missing
     * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
                                                 ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:44: error: semicolon missing
     * <code>PARTITION_BY=column_name:column_type&RANGE=start_value[:end_value]&INTERVAL=interval_num[:interval_unit]</code>
                                                                               ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:49: error: bad use of '>'
     * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
                                            ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:49: error: malformed HTML
     * The range is left-closed, ie:<code> '>= start_value AND < end_value' </code>.If the <code>column_type</code> is <code>int</code>,
                                                               ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:58: error: semicolon missing
     * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
                                         ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:58: error: semicolon missing
     * <code>PARTITION_BY=createdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:month'</code> <p>
                                                                     ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:59: error: semicolon missing
     * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
                                  ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:59: error: semicolon missing
     * <code>PARTITION_BY=year:int&RANGE=2008:2010&INTERVAL=1</code> <p>
                                                  ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:60: error: semicolon missing
     * <code>PARTITION_BY=grade:enum&RANGE=excellent:good:general:bad</code>
                                    ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPartitionFragmenter.java:97: warning: no description for @throws
         * @throws UserDataException
           ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcPlugin.java:56: warning: no @throws for org.apache.hawq.pxf.api.UserDataException
        public JdbcPlugin(InputData input) throws UserDataException {
               ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcReadAccessor.java:37: error: unknown tag: OneField
     * the data type - List <OneField> that HAWQ needs.
                            ^
    /Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/src/main/java/org/apache/hawq/pxf/plugins/jdbc/JdbcReadResolver.java:36: error: unknown tag: OneField
     * Class JdbcReadResolver Read the Jdbc ResultSet, and generates the data type - List <OneField>.
                                                                                          ^
    
    12 errors
    5 warnings
    :pxf-jdbc:javadoc FAILED
    
    FAILURE: Build failed with an exception.
    
    * What went wrong:
    Execution failed for task ':pxf-jdbc:javadoc'.
    > Javadoc generation failed. Generated Javadoc options file (useful for troubleshooting): '/Users/eespino/workspace/HAWQ/incubator-hawq/pxf/pxf-jdbc/build/tmp/javadoc/javadoc.options'
    
    * Try:
    Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.
    
    BUILD FAILED
    
    Total time: 55.068 secs
    make: *** [all] Error 1
    ✘-2 ~/workspace/HAWQ/incubator-hawq/pxf [PR-972 L|…3] 
    08:52 $ 
    ```
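    
    Most of these javadoc errors come from unescaped HTML metacharacters (`&`, `<`, `>`) in the class comment. A hedged sketch of the fix (illustrative text, not the committed change):
    
    ```java
    /**
     * <pre>
     * PARTITION_BY=column_name:column_type&amp;RANGE=start_value[:end_value]&amp;INTERVAL=interval_num[:interval_unit]
     * </pre>
     * The range is left-closed, i.e. <code>'&gt;= start_value AND &lt; end_value'</code>.
     * Generates the data type {@code List<OneField>} that HAWQ needs.
     */
    ```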


---