Posted to dev@apex.apache.org by Hitesh Kapoor <hi...@datatorrent.com> on 2016/10/12 15:08:47 UTC

(APEXMALHAR-2290) JDBC operator does not deploy after failure in certain cases

Hi All,

This issue occurs when we try to insert records into a table that already
holds a lot of data.
The setup method of JdbcPOJOInsertOutputOperator generates the metadata of
the columns in the table. To do so, it fires a query of the form "Select *
from tablename" and then extracts the required metadata, such as the column
name, the data type, and whether the column is allowed to be NULL.
When the table has a lot of data, this "Select" query takes a long time
(more than 30 sec) and the operator gets killed.
The fix, as suggested by Sandeep, is straightforward: limit the maximum
number of rows returned by the select query to 1. I am using the JDBC
function setMaxRows() to achieve this.
I will be opening a PR for this. The fix won't have corresponding unit
test cases; I will test the changes externally via an app.

Regards,
Hitesh Kapoor
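
For illustration, the row-limit fix described above might look roughly like the following. This is a minimal sketch, not the operator's actual code: the class name MaxRowsMetadataSketch and the helpers probeQuery and columnNames are hypothetical, and the table name is assumed to be a trusted identifier. Only the standard java.sql calls (Statement.setMaxRows, ResultSet.getMetaData, ResultSetMetaData) are taken as given.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class MaxRowsMetadataSketch {

  /** Builds the probe query; tableName is assumed to be a trusted identifier. */
  public static String probeQuery(String tableName) {
    return "SELECT * FROM " + tableName;
  }

  /** Reads the column names while asking the driver for at most one row. */
  public static List<String> columnNames(Connection conn, String tableName)
      throws SQLException {
    List<String> names = new ArrayList<>();
    try (Statement stmt = conn.createStatement()) {
      stmt.setMaxRows(1); // the ResultSet will hold at most one row
      try (ResultSet rs = stmt.executeQuery(probeQuery(tableName))) {
        ResultSetMetaData md = rs.getMetaData();
        for (int i = 1; i <= md.getColumnCount(); i++) { // JDBC column indexes start at 1
          names.add(md.getColumnName(i));
        }
      }
    }
    return names;
  }
}
```

setMaxRows() only caps what the ResultSet returns; how much work the database still does for the underlying query is likely to depend on the driver and the database.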

Re: (APEXMALHAR-2290) JDBC operator does not deploy after failure in certain cases

Posted by Chinmay Kolhatkar <ch...@apache.org>.
+1

On Thu, Oct 13, 2016 at 7:48 PM, Hitesh Kapoor <hi...@datatorrent.com>
wrote:

> Hi All,
>
> Thank you for your suggestions.
> I will follow the approach suggested by Chinmay in option 1.
> I will make the necessary code changes, test them, and create a PR.
>
> --Hitesh

Re: (APEXMALHAR-2290) JDBC operator does not deploy after failure in certain cases

Posted by Hitesh Kapoor <hi...@datatorrent.com>.
Hi All,

Thank you for your suggestions.
I will follow the approach suggested by Chinmay in option 1.
I will make the necessary code changes, test them, and create a PR.

--Hitesh

On Thu, Oct 13, 2016 at 11:20 AM, Pradeep A. Dalvi <pr...@apache.org> wrote:

> +1 for Option 1 to use conn.getMetaData()
>
> --prad

Re: (APEXMALHAR-2290) JDBC operator does not deploy after failure in certain cases

Posted by "Pradeep A. Dalvi" <pr...@apache.org>.
+1 for Option 1 to use conn.getMetaData()

--prad

On Wed, Oct 12, 2016 at 11:39 PM, Chinmay Kolhatkar <ch...@apache.org>
wrote:

> Hi Hitesh,
>
> Instead of limiting the row count, please use one of the following two
> approaches:
>
> 1. Use DatabaseMetaData.getColumns():
>
>     // conn is an already-open java.sql.Connection
>     DatabaseMetaData meta = conn.getMetaData();
>     try (ResultSet rsColumns = meta.getColumns(null, null, "tablename", null)) {
>       while (rsColumns.next()) {
>         System.out.println(rsColumns.getString("TYPE_NAME"));
>         System.out.println(rsColumns.getString("COLUMN_NAME"));
>       }
>     }
>
> Example given here:
> http://www.java2s.com/Code/Java/Database-SQL-JDBC/GetColumnNameAndTypeForATable.htm
>
> 2. Execute the select statement with an always-false WHERE clause, such as
> "select * from table where 1 = 2".
>
> I would prefer option 1 over option 2.
>
> -Chinmay.

Re: (APEXMALHAR-2290) JDBC operator does not deploy after failure in certain cases

Posted by Chinmay Kolhatkar <ch...@apache.org>.
Hi Hitesh,

Instead of limiting the row count, please use one of the following two
approaches:

1. Use DatabaseMetaData.getColumns():

    // conn is an already-open java.sql.Connection
    DatabaseMetaData meta = conn.getMetaData();
    try (ResultSet rsColumns = meta.getColumns(null, null, "tablename", null)) {
      while (rsColumns.next()) {
        System.out.println(rsColumns.getString("TYPE_NAME"));
        System.out.println(rsColumns.getString("COLUMN_NAME"));
      }
    }

Example given here:
http://www.java2s.com/Code/Java/Database-SQL-JDBC/GetColumnNameAndTypeForATable.htm

2. Execute the select statement with an always-false WHERE clause, such as
"select * from table where 1 = 2".

I would prefer option 1 over option 2.

-Chinmay.
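
To make the two options above concrete, here is a hedged sketch that collects the three facts the original post mentions (column name, data type, nullability) both ways. The class ColumnMetadataSketch, the holder ColumnInfo, and the method names are hypothetical and are not taken from the Malhar code base; the table name is assumed to be a trusted identifier, and a column whose nullability the driver reports as unknown is treated as nullable, which is an assumption of this sketch.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class ColumnMetadataSketch {

  /** Hypothetical holder for the per-column facts the operator needs. */
  public static final class ColumnInfo {
    public final String name;
    public final String typeName;
    public final boolean nullable;

    public ColumnInfo(String name, String typeName, boolean nullable) {
      this.name = name;
      this.typeName = typeName;
      this.nullable = nullable;
    }

    @Override
    public String toString() {
      return name + " " + typeName + (nullable ? " NULL" : " NOT NULL");
    }
  }

  /** Option 1: read catalog metadata instead of selecting from the table. */
  public static List<ColumnInfo> fromDatabaseMetaData(Connection conn, String tableName)
      throws SQLException {
    List<ColumnInfo> columns = new ArrayList<>();
    DatabaseMetaData meta = conn.getMetaData();
    try (ResultSet rs = meta.getColumns(null, null, tableName, null)) {
      while (rs.next()) {
        String name = rs.getString("COLUMN_NAME");
        String type = rs.getString("TYPE_NAME");
        // Only columnNoNulls is treated as NOT NULL; "unknown" counts as nullable here.
        boolean nullable = rs.getInt("NULLABLE") != DatabaseMetaData.columnNoNulls;
        columns.add(new ColumnInfo(name, type, nullable));
      }
    }
    return columns;
  }

  /** Builds the option 2 query; tableName is concatenated, so it must be trusted. */
  public static String emptySelect(String tableName) {
    return "SELECT * FROM " + tableName + " WHERE 1 = 2";
  }

  /** Option 2: run a select that matches no rows and read its ResultSetMetaData. */
  public static List<ColumnInfo> fromEmptySelect(Connection conn, String tableName)
      throws SQLException {
    List<ColumnInfo> columns = new ArrayList<>();
    try (Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(emptySelect(tableName))) {
      ResultSetMetaData md = rs.getMetaData();
      for (int i = 1; i <= md.getColumnCount(); i++) { // JDBC column indexes start at 1
        boolean nullable = md.isNullable(i) != ResultSetMetaData.columnNoNulls;
        columns.add(new ColumnInfo(md.getColumnName(i), md.getColumnTypeName(i), nullable));
      }
    }
    return columns;
  }
}
```

One thing to watch with option 1: the table-name argument of getColumns() is a pattern, and how identifier case is matched varies by database, so the name may need to be passed in the case the database stores it in.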


On Wed, Oct 12, 2016 at 8:38 PM, Hitesh Kapoor <hi...@datatorrent.com>
wrote:

> Hi All,
>
> This issue occurs when we try to insert records into a table that already
> holds a lot of data.
> The setup method of JdbcPOJOInsertOutputOperator generates the metadata of
> the columns in the table. To do so, it fires a query of the form "Select *
> from tablename" and then extracts the required metadata, such as the column
> name, the data type, and whether the column is allowed to be NULL.
> When the table has a lot of data, this "Select" query takes a long time
> (more than 30 sec) and the operator gets killed.
> The fix, as suggested by Sandeep, is straightforward: limit the maximum
> number of rows returned by the select query to 1. I am using the JDBC
> function setMaxRows() to achieve this.
> I will be opening a PR for this. The fix won't have corresponding unit
> test cases; I will test the changes externally via an app.
>
> Regards,
> Hitesh Kapoor
>