Posted to dev@impala.apache.org by Alex Behm <al...@cloudera.com> on 2016/05/04 08:18:01 UTC

Re: Fw: Issues with generating testdata for Impala

Hi Valencia,

I'm sorry you are having so much trouble with our setup. Let's see what we
can do.

There was an infra issue with receiving the logs you sent me. The
email/attachment got rejected on our side. Maybe you can upload the logs
somewhere so I can grab them?

See more responses inline below.

On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Alex,
>
> I was going deeper through the logs. I have some findings and queries:
>
> 1. At the "Invalidating Metadata" step (as mentioned in the mail below), I
> noticed that it is trying to use Kerberos. Perhaps this is preventing the
> testdata generation from proceeding, as we are not using Kerberos.
> How can this be done without involving Kerberos support?
>
Kerberos is certainly not needed to build and run tests.

>
> 2. I executed the FE tests despite the incomplete testdata generation.
> The tests started and, predictably, failed. Many of the failures (null pointer
> exceptions in AuthorzationTests) have a common cause: "tpch database does
> not exist."
> e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
>
> Does the "tpch" database get created after the current blocker step
> "Invalidating Metadata"?
>

Yes, the TPCH database is created and loaded as part of that first phase.
However, the data files are not yet publicly accessible. Let me work on
that from my side, and get back to you soon. One way or the other we'll be
able to provide you with the data.


>
> 3. In the FE test console output log, another error is shown:
> ============================= test session starts ==============================
> platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
> rootdir: /work/, inifile:
> plugins: random, xdist
> ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>
> These are not present/created on my VM. May I know when these get created?
>
> 4. Could you also share the total number of FE tests?
>

I'll privately send you the console output from a successful FE run.
Hopefully that can help.

Cheers,

Alex

>
>
> Looking forward to your reply.
>
> Regards,
> Valencia
>
>
>
> From: Valencia Serrao/Austin/Contr/IBM
> To: dev@impala.incubator.apache.org, Alex Behm <al...@cloudera.com>
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/30/2016 09:05 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Alex,
>
> I've been able to make some progress on testdata generation; however, I
> still face the following issues:
>
>
> *******************************************************************************************************************************************************************
> Invalidating Metadata
>
> (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
> INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
> SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
> float_col, double_col, date_string_col, string_col, timestamp_col, year,
> month
> FROM functional.alltypes
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class 'socket.error'>
> MESSAGE: [Errno 104] Connection reset by peer
> Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
> 41: while [ -n "$*" ]
> Error in /root/nishidha/Impala/buildall.sh at line 368:
> ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
> <<< Y
>
> *************************************************************************************************************************************************************************
>
> I continued with the FE tests as is. Here is the complete output log.
> [attachment "fe_test_output.zip" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Kindly guide me on the same.
>
> Regards,
> Valencia
> ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
> -----
>
> From: Sudarshan Jagadale/Austin/Contr/IBM
> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/29/2016 10:49 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
> FYI
> Thanks and Regards
> Sudarshan Jagadale
> Power Open Source Solutions
> ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
> AM -----
>
> From: Alex Behm <al...@cloudera.com>
> To: dev@impala.incubator.apache.org
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS
> Date: 04/28/2016 09:34 PM
> Subject: Re: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>
> sorry I did not get the attachment. Would you be able to tar.gz and attach
> the whole cluster_logs directory?
>
> Alex
>
> On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*vserrao@us.ibm.com*
> <vs...@us.ibm.com>> wrote:
>
>    Hi Alex,
>
>    I tried building impala again with the following:
>    HDFS CDH 5.7.0 (
>    *http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3*
>    <http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3>
>    )
>    HBASE CDH 5.7.0 SNAPSHOT (
>    *http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz*
>    <http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz> )
>    - this required patching in a fix (
>    *https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch*
>    <https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch>
>    )
>    HIVE CDH 5.8.0 SNAPSHOT
>
>    With the above combination, I'm able to move past the exception and
>    also have the RegionServer service up and running. However, it now gives
>    the error below:
>
>
>    ********************************************************************************************************************
>    (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>    CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
>    d1 DECIMAL,
>    d2 DECIMAL(10, 0),
>    d3 DECIMAL(20, 10),
>    d4 DECIMAL(38, 38),
>    d5 DECIMAL(10, 5))
>    PARTITIONED BY (d6 DECIMAL(9, 0))
>    ROW FORMAT delimited fields terminated by ','
>    STORED AS TEXTFILE
>    LOCATION '/test-warehouse/decimal_tbl'
>
>    (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>    USE functional
>
>    (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>    ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
>
>    Data Loading from Impala failed with error: ImpalaBeeswaxException:
>    INNER EXCEPTION: <class
>    'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
>    MESSAGE:
>    Error: null
>
>    ******************************************************************************************************************
>
>    Here is the complete log for the same. *(See attached file:
>    data-load-functional-exhaustive.log)*
>
>    It would be great if you could guide me on this issue, so I could proceed
>    with the FE tests.
>
>    I'm still awaiting a link to the source code of HDFS CDH 5.8.0.
>
>    Regards,
>    Valencia
>
>
>
>

Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Alex/Casey

Thank you for responding and for sharing the testdata. I'm working on using
the testdata to run the fe tests.

Meanwhile, I've posted the logs onto "Impala Dev" google group. Here's the
link:
https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk

Regards,
Valencia



From:	Alex Behm <al...@cloudera.com>
To:	Casey Ching <ca...@cloudera.com>
Cc:	dev@impala.incubator.apache.org, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/04/2016 12:52 PM
Subject:	Re: Fw: Issues with generating testdata for Impala



Ahh, thanks Casey. Did not know about that.

Valencia, Impala's data loading expects the files to be placed
in IMPALA_HOME/testdata/impala-data
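
A quick way to confirm the files ended up where the data loading looks for them is to inspect that directory before starting a build. The following is only a minimal sketch: the helper names `impala_data_dir` and `describe_data_dir` are hypothetical and not part of Impala; only the `IMPALA_HOME/testdata/impala-data` location comes from the message above.

```python
import os


def impala_data_dir(impala_home=None):
    """Return the directory the thread names as the expected data location.

    Hypothetical helper: falls back to the IMPALA_HOME environment variable
    when no path is passed in.
    """
    home = impala_home or os.environ.get("IMPALA_HOME")
    if not home:
        raise RuntimeError("IMPALA_HOME is not set")
    return os.path.join(home, "testdata", "impala-data")


def describe_data_dir(path):
    """Summarize whether the data directory is missing, empty, or populated."""
    if not os.path.isdir(path):
        return "missing: %s" % path
    entries = sorted(os.listdir(path))
    if not entries:
        return "empty: %s" % path
    return "found %d entries in %s: %s" % (len(entries), path, ", ".join(entries))
```

Calling `print(describe_data_dir(impala_data_dir()))` after sourcing `bin/impala-config.sh` should then report what, if anything, is in place.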

On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com> wrote:
  Comment inline below




  On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:


        Yes, the TPCH database is created and loaded as part of that first
        phase.
        However, the data files are not yet publicly accessible. Let me
        work on
        that from my side, and get back to you soon. One way or the other
        we'll be
        able to provide you with the data.


  The data is at
  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
   . The files are split into 50 MB pieces for git. You can put them back
  together as is done in
  https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile




Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Thanks, Casey!

I will let you know the test status.



From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org
Date:	05/05/2016 01:09 PM
Subject:	Re: Fw: Issues with generating testdata for Impala








On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex,

      I've placed the individual testdata tars in
      IMPALA_HOME/testdata/impala-data. I've already executed steps 1 to 10.
      I have some queries about steps 11 and 12 that I want to clarify:

      1) . bin/impala-config.sh
      2) mkdir -p $IMPALA_HOME/testdata/impala-data
      3) pushd $IMPALA_HOME/testdata/impala-data
      4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
      5) tar -xzf tpch.tar.gz
      6) rm tpch.tar.gz
      7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
      8) tar -xzf tpcds.tar.gz
      9) rm tpcds.tar.gz
      10) popd

      11) ./buildall.sh -notests -noclean -format
      -----Here I've removed the -testdata option.
      The reason to do this is to clear the previously generated partial
      schemas.


I think the -format option is supposed to clear out any old state. The
-testdata flag is probably needed to generate and load the test data.




      12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step
      required? Why?


That is only for Docker. It helps to reduce the image size. You shouldn’t
need to do that or any of the other rm commands.
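
For anyone who prefers to script the reassembly, steps 4 and 5 (and 7 and 8) above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the project's own tooling: the function name `reassemble_and_extract` and its parameters are hypothetical, and only the `/tmp/tpch.tar.gz{0..6}` and `/tmp/tpcds.tar.gz{0..3}` piece naming and the target directory come from this thread.

```python
import os
import shutil
import tarfile


def reassemble_and_extract(parts_dir, base_name, num_parts, dest_dir,
                           remove_archive=False):
    """Join base_name0 .. base_name{num_parts-1} and unpack the result.

    Hypothetical helper mirroring `cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz`
    followed by `tar -xzf tpch.tar.gz`.
    """
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, base_name)
    # Append the pieces in numeric order, as the shell brace expansion does.
    with open(archive_path, "wb") as out:
        for i in range(num_parts):
            part = os.path.join(parts_dir, "%s%d" % (base_name, i))
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    # Unpack the joined archive into dest_dir.
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
    # Deleting the joined archive is optional; it only saves disk space.
    if remove_archive:
        os.remove(archive_path)
    return dest_dir
```

With `data_dir` set to `$IMPALA_HOME/testdata/impala-data`, the two calls would be `reassemble_and_extract("/tmp", "tpch.tar.gz", 7, data_dir)` and `reassemble_and_extract("/tmp", "tpcds.tar.gz", 4, data_dir)`.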




      Could you kindly confirm these steps? If there are any corrections,
      please let me know.

      Regards,
      Valencia




      From: Valencia Serrao/Austin/Contr/IBM
      To: Alex Behm <al...@cloudera.com>
      Cc: Casey Ching <ca...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
      Date: 05/04/2016 04:18 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey

      Thank you for responding and for sharing the testdata. I'm working on
      using the testdata to run the fe tests.

      Meanwhile, I've posted the logs onto "Impala Dev" google group.
      Here's the link:
      https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


      Regards,
      Valencia


       Alex Behm ---05/04/2016 12:52:44 PM---Ahh, thanks Casey. Did not
      know about that. Valencia, Impala's data loading expects the files to
      be

      From: Alex Behm <al...@cloudera.com>
      To: Casey Ching <ca...@cloudera.com>
      Cc: dev@impala.incubator.apache.org, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
      Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/04/2016 12:52 PM
      Subject: Re: Fw: Issues with generating testdata for Impala



      Ahh, thanks Casey. Did not know about that.

      Valencia, Impala's data loading expects the files to be placed in
      IMPALA_HOME/testdata/impala-data

      On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com>
      wrote:
          Comment inline below



          On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com)
          wrote:


                  Hi Valencia,

                  I'm sorry you are having so much trouble with our setup.
                  Let's see what we
                  can do.

                  There was an infra issue with receiving the logs you sent
                  me. The
                  email/attachment got rejected on our side. Maybe you can
                  upload the logs
                  somewhere so I can grab them?

                  See more responses inline below.

                  On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <
                  vserrao@us.ibm.com> wrote:

                  > Hi Alex,
                  >
                  > I was going more deeper through the logs. I have some
                  findings and queries:
                  >
                  > 1. At the "Invalidating Metadata" step (as mentioned in
                  below mail), i
                  > noticed that, it is trying to use kerberos. Perhaps,
                  this is preventing the
                  > testdata generation from proceeding, as we are not
                  using Kerberos.
                  > I need to know how this can be done without involving
                  Kerberos support ?
                  >
                  Kerberos is certainly not needed to build and run tests.

                  >
                  > 2. I had executed the fe tests despite the incomplete
                  testdata generation,
                  > the tests started and surely have failed. Many of these
                  (null pointer
                  > exception in AuthorzationTests) have a common cause:
                  "tpch database does
                  > not exist."
                  > e.g. as shown
                  in .Impala/cluster_logs/query_tests/test-run-workload.log.

                  >
                  > Does the "tpch" database gets created after the current
                  blocker step
                  > "Invalidating Metadata" ?
                  >

                  Yes, the TPCH database is created and loaded as part of
                  that first phase.
                  However, the data files are not yet publicly accessible.
                  Let me work on
                  that from my side, and get back to you soon. One way or
                  the other we'll be
                  able to provide you with the data.

          The data is at
          https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
           . The files are split into 50 MB pieces for git. You can put
          them back together as is done in
          https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile

                  >
                  > 3. In the fe test console output log, another error
                  shown:
                  > ============================= test session starts
                  > ==============================
                  > platform linux2 -- Python 2.7.5 -- py-1.4.30 --
                  pytest-2.7.2
                  > rootdir: /work/, inifile:
                  > plugins: random, xdist
                  > ERROR: file not found:/work/I
                  >
                  mpala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/

                  >
                  > These are not present/created on my vm. May i know when
                  these get created ?
                  >
                  > 4. Could you also share the total number of fe tests ?
                  >

                  I'll privately send you the console output from a
                  successful FE run.
                  Hopefully that can help.

                  Cheers,

                  Alex

                  >
                  >
                  > Looking forward to your reply.
                  >
                  > Regards,
                  > Valencia
                  >
                  >
                  > [image: Inactive hide details for Valencia
                  Serrao---04/30/2016 09:05:54
                  > AM---Hi Alex, I've been able to make some progress on
                  testdata]Valencia
                  > Serrao---04/30/2016 09:05:54 AM---Hi Alex, I've been
                  able to make some
                  > progress on testdata generation, however, i still face
                  the foll
                  >
                  > From: Valencia Serrao/Austin/Contr/IBM
                  > To: dev@impala.incubator.apache.org, Alex Behm <
                  alex.behm@cloudera.com>
                  > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                  > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
                  Serrao/Austin/Contr/IBM@IBMUS
                  > Date: 04/30/2016 09:05 AM
                  > Subject: Fw: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  >
                  > Hi Alex,
                  >
                  > I've been able to make some progress on testdata
                  generation, however, i
                  > still face the following issues:
                  >
                  >
                  >
                  *******************************************************************************************************************************************************************

                  > Invalidating Metadata
                  >
                  >
                  (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

                  > INSERT OVERWRITE TABLE functional_parquet.alltypes
                  partition (year, month)
                  > SELECT id, bool_col, tinyint_col, smallint_col,
                  int_col, bigint_col,
                  > float_col, double_col, date_string_col, string_col,
                  timestamp_col, year,
                  > month
                  > FROM functional.alltypes
                  >
                  > Data Loading from Impala failed with error:
                  ImpalaBeeswaxException:
                  > INNER EXCEPTION: <class 'socket.error'>
                  > MESSAGE: [Errno 104] Connection reset by peer
                  > Error
                  in /root/nishidha/Impala/testdata/bin/create-load-data.sh
                  at line
                  > 41: while [ -n "$*" ]
                  > Error in /root/nishidha/Impala/buildall.sh at line 368:
                  > ${IMPALA_HOME}/testdata/bin/create-load-data.sh $
                  {CREATE_LOAD_DATA_ARGS}
                  > <<< Y
                  >
                  >
                  *************************************************************************************************************************************************************************

                  >
                  > i continued with fe tests as is. Here is the complete
                  output log.
                  > [attachment "fe_test_output.zip" deleted by Valencia
                  > Serrao/Austin/Contr/IBM]
                  >
                  > Cluster logs: [attachment "cluster_logs.7z" deleted by
                  Valencia
                  > Serrao/Austin/Contr/IBM]
                  >
                  > Kindly guide me on the same.
                  >
                  > Regards,
                  > Valencia
                  > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on
                  04/29/2016 10:57 AM
                  > -----
                  >
                  > From: Sudarshan Jagadale/Austin/Contr/IBM
                  > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                  > Date: 04/29/2016 10:49 AM
                  > Subject: Fw: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  > FYI
                  > Thanks and Regards
                  > Sudarshan Jagadale
                  > Power Open Source Solutions
                  > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM
                  on 04/29/2016 10:48
                  > AM -----
                  >
                  > From: Alex Behm <al...@cloudera.com>
                  > To: dev@impala.incubator.apache.org
                  > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                  > Panpaliya/Austin/Contr/IBM@IBMUS
                  > Date: 04/28/2016 09:34 PM
                  > Subject: Re: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  >
                  > Hi Valencia,
                  >
                  > sorry I did not get the attachment. Would you be able
                  to tar.gz and attach
                  > the whole cluster_logs directory?
                  >
                  > Alex
                  >
                  > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*
                  vserrao@us.ibm.com*
                  > <vs...@us.ibm.com>> wrote:
                  >
                  > Hi Alex,
                  >
                  > I tried building Impala again with the following:
                  > HDFS CDH 5.7.0
                  > (http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3)
                  > HBASE CDH 5.7.0 SNAPSHOT
                  > (http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz)
                  > - this required patching in a fix
                  > (https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch)
                  > HIVE CDH 5.8.0 SNAPSHOT
                  >
                  > With the above combination, I'm able to move past the
                  > exception and also have the RegionServer service up and
                  > running. However, it now gives the error below:
                  >
                  > ********************************************************
                  > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                  > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
                  > d1 DECIMAL,
                  > d2 DECIMAL(10, 0),
                  > d3 DECIMAL(20, 10),
                  > d4 DECIMAL(38, 38),
                  > d5 DECIMAL(10, 5))
                  > PARTITIONED BY (d6 DECIMAL(9, 0))
                  > ROW FORMAT delimited fields terminated by ','
                  > STORED AS TEXTFILE
                  > LOCATION '/test-warehouse/decimal_tbl'
                  >
                  > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                  > USE functional
                  >
                  > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                  > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION (d6=1)
                  >
                  > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                  > INNER EXCEPTION: <class 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                  > MESSAGE:
                  > Error: null
                  > ********************************************************
                  >
                  > Here is the complete log for the same. (See attached file:
                  > data-load-functional-exhaustive.log)
                  >
                  > It would be great if you could guide me on this issue, so
                  > I can proceed with the fe tests.
                  >
                  > Still awaiting a link to the source code of HDFS CDH 5.8.0.
                  >
                  > Regards,
                  > Valencia
                  >
                  >
                  >
                  >



Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Alex/Casey,

I tried to run the frontend tests with the data provided. Following is the
result:
	Tests run: 545, Failures: 226, Errors: 77, Skipped: 36    (See
attached file: data-load-functional-exhaustive.zip)


Earlier, the number of "Errors" was 87, so it has now dropped by 10.
However, the "Failures" count is still the same. Most of the failures in
PlannerTest and AuthorizationTest are related to tpch (e.g. "Database
doesn't exist: tpch").

With regard to the directory "impala_data", I've observed that it is not
being accessed or used by any script. Are we missing any configuration?

Kindly guide me on this.

Regards,
Valencia




From:	Valencia Serrao/Austin/Contr/IBM
To:	Casey Ching <ca...@cloudera.com>
Cc:	Alex Behm <al...@cloudera.com>,
            dev@impala.incubator.apache.org, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, David
            Clissold/Austin/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/05/2016 02:21 PM
Subject:	Re: Fw: Issues with generating testdata for Impala


Thanks, Casey!

I will let you know the test status.




From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org
Date:	05/05/2016 01:09 PM
Subject:	Re: Fw: Issues with generating testdata for Impala








On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex,

      I've placed the individual testdata tars in
      IMPALA_HOME/testdata/impala-data. I've already executed steps 1 to 10.
      I have some questions about steps 11 and 12 that I'd like to clarify:

      1) . bin/impala-config.sh
      2) mkdir -p $IMPALA_HOME/testdata/impala-data
      3) pushd $IMPALA_HOME/testdata/impala-data
      4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
      5) tar -xzf tpch.tar.gz
      6) rm tpch.tar.gz
      7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
      8) tar -xzf tpcds.tar.gz
      9) rm tpcds.tar.gz
      10) popd

      11) ./buildall.sh -notests -noclean -format
      (Here I've removed the -testdata option. The reason is to clear the
      previously generated partial schemas.)


I think the -format option is supposed to clear out any old state. The
-testdata flag is probably needed to generate and load the test data.




      12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step
      required? Why?


That is only for Docker. It helps to reduce the image size. You shouldn’t
need to do that or any of the other rm commands.
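
The reassembly in steps 1 to 10 can be sketched as a small shell function.
This is only an illustration under stated assumptions:
reassemble_and_extract is a hypothetical helper name, not part of the
Impala scripts, and it assumes the pieces are named <prefix>0 through
<prefix>N as in the commands above. It checks that every piece is present
and that the joined archive passes gzip -t before unpacking it.

```shell
# Hypothetical helper (not part of Impala): rebuild a .tar.gz from numbered
# pieces and unpack it into a destination directory.
reassemble_and_extract() {
  local prefix="$1"   # pieces are ${prefix}0, ${prefix}1, ... e.g. /tmp/tpch.tar.gz
  local last="$2"     # index of the last piece, e.g. 6
  local dest="$3"     # e.g. $IMPALA_HOME/testdata/impala-data
  local i joined rc

  # Fail early if any piece is missing, before touching the destination.
  for i in $(seq 0 "$last"); do
    if [ ! -f "${prefix}${i}" ]; then
      echo "missing piece: ${prefix}${i}" >&2
      return 1
    fi
  done

  mkdir -p "$dest" || return 1
  joined="$(mktemp)" || return 1

  # Concatenate the pieces in numeric order.
  for i in $(seq 0 "$last"); do
    cat "${prefix}${i}"
  done > "$joined"

  # Verify the gzip stream before extracting anything.
  if ! gzip -t < "$joined"; then
    echo "joined archive is corrupt: ${prefix}" >&2
    rm -f "$joined"
    return 1
  fi

  tar -xzf "$joined" -C "$dest"
  rc=$?
  rm -f "$joined"
  return $rc
}
```

With the file names used in this thread, the calls would be
reassemble_and_extract /tmp/tpch.tar.gz 6 "$IMPALA_HOME/testdata/impala-data"
and reassemble_and_extract /tmp/tpcds.tar.gz 3 "$IMPALA_HOME/testdata/impala-data".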




      Could you kindly confirm these steps? If there are any corrections,
      please let me know.

      Regards,
      Valencia




      From: Valencia Serrao/Austin/Contr/IBM
      To: Alex Behm <al...@cloudera.com>
      Cc: Casey Ching <ca...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
      Date: 05/04/2016 04:18 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey

      Thank you for responding and for sharing the testdata. I'm working on
      using the testdata to run the fe tests.

      Meanwhile, I've posted the logs onto "Impala Dev" google group.
      Here's the link:
      https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


      Regards,
      Valencia



      From: Alex Behm <al...@cloudera.com>
      To: Casey Ching <ca...@cloudera.com>
      Cc: dev@impala.incubator.apache.org, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
      Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/04/2016 12:52 PM
      Subject: Re: Fw: Issues with generating testdata for Impala



      Ahh, thanks Casey. Did not know about that.

      Valencia, Impala's data loading expects the files to be placed in
      IMPALA_HOME/testdata/impala-data

      On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com>
      wrote:
          Comment inline below



          On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com)
          wrote:


                  Hi Valencia,

                  I'm sorry you are having so much trouble with our setup.
                  Let's see what we
                  can do.

                  There was an infra issue with receiving the logs you sent
                  me. The
                  email/attachment got rejected on our side. Maybe you can
                  upload the logs
                  somewhere so I can grab them?

                  See more responses inline below.

                  On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <
                  vserrao@us.ibm.com> wrote:

                  > Hi Alex,
                  >
                  > I was going deeper through the logs. I have some findings
                  > and queries:
                  >
                  > 1. At the "Invalidating Metadata" step (as mentioned in the
                  > mail below), I noticed that it is trying to use Kerberos.
                  > Perhaps this is preventing the testdata generation from
                  > proceeding, as we are not using Kerberos. How can this be
                  > done without involving Kerberos support?
                  >
                  Kerberos is certainly not needed to build and run tests.

                  >
                  > 2. I had executed the fe tests despite the incomplete
                  > testdata generation; the tests started and, unsurprisingly,
                  > failed. Many of these failures (null pointer exceptions in
                  > AuthorizationTest) have a common cause: "tpch database
                  > does not exist",
                  > e.g. as shown in
                  > .Impala/cluster_logs/query_tests/test-run-workload.log.
                  >
                  > Does the "tpch" database get created after the current
                  > blocker step "Invalidating Metadata"?
                  >

                  Yes, the TPCH database is created and loaded as part of
                  that first phase.
                  However, the data files are not yet publicly accessible.
                  Let me work on
                  that from my side, and get back to you soon. One way or
                  the other we'll be
                  able to provide you with the data.

          The data is at
          https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
           . The files are split into 50 MB pieces for git. You can put
          them back together as is done in
          https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile

                  >
                  > 3. In the fe test console output log, another error is shown:
                  > ============================= test session starts ==============================
                  > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
                  > rootdir: /work/, inifile:
                  > plugins: random, xdist
                  > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
                  >
                  > These are not present/created on my VM. May I know when
                  > these get created?
                  >
                  > 4. Could you also share the total number of fe tests?
                  >

                  I'll privately send you the console output from a
                  successful FE run.
                  Hopefully that can help.

                  Cheers,

                  Alex

                  >
                  >
                  > Looking forward to your reply.
                  >
                  > Regards,
                  > Valencia
                  >
                  >
                  > From: Valencia Serrao/Austin/Contr/IBM
                  > To: dev@impala.incubator.apache.org, Alex Behm <
                  alex.behm@cloudera.com>
                  > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                  > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
                  Serrao/Austin/Contr/IBM@IBMUS
                  > Date: 04/30/2016 09:05 AM
                  > Subject: Fw: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  >
                  > Hi Alex,
                  >
                  > I've been able to make some progress on testdata
                  generation, however, i
                  > still face the following issues:
                  >
                  >
                  >
                  > ********************************************************
                  > Invalidating Metadata
                  >
                  > (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
                  > INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
                  > SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
                  > float_col, double_col, date_string_col, string_col, timestamp_col, year,
                  > month
                  > FROM functional.alltypes
                  >
                  > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                  > INNER EXCEPTION: <class 'socket.error'>
                  > MESSAGE: [Errno 104] Connection reset by peer
                  > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
                  > Error in /root/nishidha/Impala/buildall.sh at line 368:
                  > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
                  > ********************************************************
                  >
                  > i continued with fe tests as is. Here is the complete
                  output log.
                  > [attachment "fe_test_output.zip" deleted by Valencia
                  > Serrao/Austin/Contr/IBM]
                  >
                  > Cluster logs: [attachment "cluster_logs.7z" deleted by
                  Valencia
                  > Serrao/Austin/Contr/IBM]
                  >
                  > Kindly guide me on the same.
                  >
                  > Regards,
                  > Valencia



Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Casey,

Thank you for the response.

Yes, we tried to set up the x86 environment, but testdata generation fails
there as well. We are looking more deeply into the ppc and x86 logs, and I
will let you know the findings.

As you suggested, I also ran the data loading step on its own and then
checked through impala-shell whether tpch exists. The tpch database doesn't
exist.
Command used:
[testvm:21000] > describe database tpch;
Result:
Query: describe database tpch
ERROR: AnalysisException: Database does not exist: tpch
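
A scripted version of this check might look like the sketch below.
db_in_list is a hypothetical helper, not an Impala tool. The impala-shell
invocation in the comment assumes an impalad listening on localhost:21000
and uses the -i, -B and -q options to get plain delimited output from
SHOW DATABASES.

```shell
# db_in_list NAME: read a database listing on stdin (one database per line,
# name in the first column) and succeed only if NAME is present exactly.
db_in_list() {
  awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Intended use against a running cluster (host and port are assumptions):
#   impala-shell -i localhost:21000 -B -q 'show databases' | db_in_list tpch \
#     && echo "tpch is loaded" || echo "tpch is missing"
```

Matching on the whole first column avoids a false positive from names such
as tpch_parquet.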

Could you please share the build or test results/logs so we can verify our
setup? For example:
1. Output of:  buildall.sh -noclean -notests -format -testdata
2. The cluster_logs

Looking forward to your reply.

Regards,
Valencia




From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Valencia Serrao/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org, David
            Clissold/Austin/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS
Date:	05/09/2016 10:45 PM
Subject:	Re: Fw: Issues with generating testdata for Impala



Hi Valencia,

Have you tried setting up an x86 environment? That could be useful for
comparing to the ppc environment to see what is/isn’t working and being
able to see what the logs should look like.

If the tpch database isn’t there, that should mean data loading failed and
there should have been an error that caused the data loading to exit early
along with an error message in the logs. Did you see anything like that?
You might want to try only running the data loading step, then verifying
that the tpch database exists afterwards.

Casey
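
One way to look for the error that made data loading exit early is to
search the logs for the failure markers that appear elsewhere in this
thread. The following is a minimal sketch: find_load_errors is a
hypothetical helper, and the patterns are taken from the error output
quoted in this thread rather than from any documented log format.

```shell
# find_load_errors DIR: print up to 20 matches (file:line:text) for the
# failure markers seen in this thread's data-loading output.
find_load_errors() {
  local log_dir="$1"
  grep -rnE 'failed with error|INNER EXCEPTION|Connection reset by peer' \
    "$log_dir" | head -n 20
}
```

For example, find_load_errors "$IMPALA_HOME/cluster_logs", assuming the
cluster logs live in that directory on your machine.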


On May 9, 2016 at 5:27:49 AM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex/Casey,

      I re-ran the fe tests with the testdata you provided, but the result
      is the same as that reported in the earlier mail, with most of the
      failures occurring due to tpch database not existing.

      Steps followed to test are as follows:
      1. copy the testdata to IMPALA_HOME/testdata/impala-data.
      2. ./buildall.sh -notests -noclean -format -testdata
      3. ./bin/run_all_tests.sh

      We also tried the testdata generation on an Ubuntu x86 ppc machine;
      however, it stops at the same "Invalidate Metadata" step with the
      exception.

      Any pointers on these issues will be helpful.

      Regards,
      Valencia


      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <ca...@cloudera.com>
      Cc: Alex Behm <al...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 06:47 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey,

      I tried to run the frontend tests with the data provided. Following
      is the result:
      Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 [attachment
      "data-load-functional-exhaustive.zip" deleted by Valencia
      Serrao/Austin/Contr/IBM]


      Earlier, the number of "Errors" was 87, so it has now dropped by 10.
      However, the "Failures" count is still the same. Most of the
      failures in PlannerTest and AuthorizationTest are related to tpch
      (e.g. "Database doesn't exist: tpch").

      With regard to the directory "impala_data", I've observed that it is
      not being accessed or used by any script. Are we missing any
      configuration?

      Kindly guide me on this.

      Regards,
      Valencia




      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <ca...@cloudera.com>
      Cc: Alex Behm <al...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 02:21 PM
      Subject: Re: Fw: Issues with generating testdata for Impala


      Thanks, Casey!

      I will let you know the test status.



      From: Casey Ching <ca...@cloudera.com>
      To: Alex Behm <al...@cloudera.com>, Valencia
      Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Date: 05/05/2016 01:09 PM
      Subject: Re: Fw: Issues with generating testdata for Impala







      On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com)
      wrote:


              Hi Alex,

              I've placed the individual testdata tars in
              IMPALA_HOME/testdata/impala-data. I've already executed
              steps 1 to 10. I have some questions about steps 11 and 12
              that I'd like to clarify:

              1) . bin/impala-config.sh
              2) mkdir -p $IMPALA_HOME/testdata/impala-data
              3) pushd $IMPALA_HOME/testdata/impala-data
              4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
              5) tar -xzf tpch.tar.gz
              6) rm tpch.tar.gz
              7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
              8) tar -xzf tpcds.tar.gz
              9) rm tpcds.tar.gz
              10) popd

              11) ./buildall.sh -notests -noclean -format
              (Here I've removed the -testdata option. The reason is to
              clear the previously generated partial schemas.)

      I think the -format option is supposed to clear out any old state.
      The -testdata flag is probably needed to generate and load the test
      data.


              12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is
              this step required? Why?

      That is only for Docker. It helps to reduce the image size. You
      shouldn’t need to do that or any of the other rm commands.


              Could you kindly confirm these steps? If there are any
              corrections, please let me know.

              Regards,
              Valencia




              From: Valencia Serrao/Austin/Contr/IBM
              To: Alex Behm <al...@cloudera.com>
              Cc: Casey Ching <ca...@cloudera.com>,
              dev@impala.incubator.apache.org, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, David
              Clissold/Austin/IBM@IBMUS
              Date: 05/04/2016 04:18 PM
              Subject: Re: Fw: Issues with generating testdata for Impala




              Hi Alex/Casey

              Thank you for responding and for sharing the testdata. I'm
              working on using the testdata to run the fe tests.

              Meanwhile, I've posted the logs onto "Impala Dev" google
              group. Here's the link:
              https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


              Regards,
              Valencia



              From: Alex Behm <al...@cloudera.com>
              To: Casey Ching <ca...@cloudera.com>
              Cc: dev@impala.incubator.apache.org, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
              Serrao/Austin/Contr/IBM@IBMUS
              Date: 05/04/2016 12:52 PM
              Subject: Re: Fw: Issues with generating testdata for Impala



              Ahh, thanks Casey. Did not know about that.

              Valencia, Impala's data loading expects the files to be
              placed in IMPALA_HOME/testdata/impala-data

              On Tue, May 3, 2016 at 11:21 PM, Casey Ching <
              casey@cloudera.com> wrote:


                  Comment inline below


                  On May 3, 2016 at 11:18:06 PM, Alex Behm (
                  alex.behm@cloudera.com) wrote:


                              Hi Valencia,

                              I'm sorry you are having so much trouble with
                              our setup. Let's see what we
                              can do.

                              There was an infra issue with receiving the
                              logs you sent me. The
                              email/attachment got rejected on our side.
                              Maybe you can upload the logs
                              somewhere so I can grab them?

                              See more responses inline below.

                              On Sat, Apr 30, 2016 at 5:01 AM, Valencia
                              Serrao <vs...@us.ibm.com> wrote:

                              > Hi Alex,
                              >
                              > I was going deeper through the logs. I have
                              > some findings and queries:
                              >
                              > 1. At the "Invalidating Metadata" step (as
                              > mentioned in the mail below), I noticed that
                              > it is trying to use Kerberos. Perhaps this is
                              > preventing the testdata generation from
                              > proceeding, as we are not using Kerberos.
                              > How can this be done without involving
                              > Kerberos support?
                              >
                              Kerberos is certainly not needed to build and
                              run tests.

                              >
                              > 2. I executed the FE tests despite the incomplete
                              > testdata generation; the tests started and, as
                              > expected, failed. Many of the failures (null pointer
                              > exceptions in AuthorizationTests) have a common cause:
                              > "tpch database does not exist.", e.g. as shown in
                              > .Impala/cluster_logs/query_tests/test-run-workload.log.

                              >
                              > Does the "tpch" database get created after the
                              > current blocker step, "Invalidating Metadata"?
                              >

                              Yes, the TPCH database is created and loaded
                              as part of that first phase.
                              However, the data files are not yet publicly
                              accessible. Let me work on
                              that from my side, and get back to you soon.
                              One way or the other we'll be
                              able to provide you with the data.

                  The data is at
                  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
                   . The files are split into 50 MB pieces for git. You can
                  put them back together as is done in
                  https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
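
For reference, a minimal sketch of that reassembly. It assumes the pieces sit under /tmp and are named tpch.tar.gz0 through tpch.tar.gz6 and tpcds.tar.gz0 through tpcds.tar.gz3, as in the steps quoted later in this thread; those counts may have changed since. The helper name reassemble_archive is hypothetical and is not part of Impala or of the Dockerfile.

```shell
# Sketch only: reassemble_archive is a hypothetical helper.
# Usage: reassemble_archive PREFIX LAST_INDEX DEST_DIR
# Concatenates PREFIX0 .. PREFIX<LAST_INDEX> in numeric order and extracts
# the resulting gzip-compressed tarball into DEST_DIR.
reassemble_archive() {
  prefix=$1
  last=$2
  dest=$3

  # Fail early if any piece is missing; a short archive will not extract.
  i=0
  while [ "$i" -le "$last" ]; do
    if [ ! -f "${prefix}${i}" ]; then
      echo "missing piece: ${prefix}${i}" >&2
      return 1
    fi
    i=$((i + 1))
  done

  mkdir -p "$dest" || return 1

  # Stream the pieces straight into tar, so no intermediate file is left
  # behind and no separate rm step is needed.
  i=0
  while [ "$i" -le "$last" ]; do
    cat "${prefix}${i}"
    i=$((i + 1))
  done | tar -xzf - -C "$dest"
}

# Example (paths and counts taken from this thread, assumed still valid):
#   reassemble_archive /tmp/tpch.tar.gz 6 "$IMPALA_HOME/testdata/impala-data"
#   reassemble_archive /tmp/tpcds.tar.gz 3 "$IMPALA_HOME/testdata/impala-data"
```

An explicit counter is used instead of a shell glob because a glob would sort a piece numbered 10 ahead of piece 2 if the number of pieces ever grows past ten.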


                              >
                              > 3. In the fe test console output log,
                              another error shown:
                              > ============================= test session
                              starts
                              > ==============================
                              > platform linux2 -- Python 2.7.5 --
                              py-1.4.30 -- pytest-2.7.2
                              > rootdir: /work/, inifile:
                              > plugins: random, xdist
                              > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/

                              >
                              > These are not present/created on my VM. May I
                              > know when these get created?
                              >
                              > 4. Could you also share the total number of FE
                              > tests?
                              >

                              I'll privately send you the console output
                              from a successful FE run.
                              Hopefully that can help.

                              Cheers,

                              Alex

                              >
                              >
                              > Looking forward to your reply.
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              >
                              > From: Valencia Serrao/Austin/Contr/IBM
                              > To: dev@impala.incubator.apache.org, Alex
                              Behm <al...@cloudera.com>
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
                              Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/30/2016 09:05 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Alex,
                              >
                              > I've been able to make some progress on
                              testdata generation, however, i
                              > still face the following issues:
                              >
                              >
                              >
                              *******************************************************************************************************************************************************************

                              > Invalidating Metadata
                              >
                              >
                              (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

                              > INSERT OVERWRITE TABLE
                              functional_parquet.alltypes partition (year,
                              month)
                              > SELECT id, bool_col, tinyint_col,
                              smallint_col, int_col, bigint_col,
                              > float_col, double_col, date_string_col,
                              string_col, timestamp_col, year,
                              > month
                              > FROM functional.alltypes
                              >
                              > Data Loading from Impala failed with error:
                              ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class 'socket.error'>
                              > MESSAGE: [Errno 104] Connection reset by
                              peer
                              > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
                              > 41: while [ -n "$*" ]
                              > Error in /root/nishidha/Impala/buildall.sh at line 368:
                              > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
                              > <<< Y
                              >
                              >
                              *************************************************************************************************************************************************************************

                              >
                              > i continued with fe tests as is. Here is
                              the complete output log.
                              > [attachment "fe_test_output.zip" deleted by
                              Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Cluster logs: [attachment "cluster_logs.7z"
                              deleted by Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Kindly guide me on the same.
                              >
                              > Regards,
                              > Valencia
                              > ----- Forwarded by Valencia
                              Serrao/Austin/Contr/IBM on 04/29/2016 10:57
                              AM
                              > -----
                              >
                              > From: Sudarshan Jagadale/Austin/Contr/IBM
                              > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/29/2016 10:49 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              > FYI
                              > Thanks and Regards
                              > Sudarshan Jagadale
                              > Power Open Source Solutions
                              > ----- Forwarded by Sudarshan
                              Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
                              > AM -----
                              >
                              > From: Alex Behm <al...@cloudera.com>
                              > To: dev@impala.incubator.apache.org
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS
                              > Date: 04/28/2016 09:34 PM
                              > Subject: Re: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Valencia,
                              >
                              > sorry I did not get the attachment. Would
                              you be able to tar.gz and attach
                              > the whole cluster_logs directory?
                              >
                              > Alex
                              >
                              > On Thu, Apr 28, 2016 at 6:23 AM, Valencia
                              Serrao <*vserrao@us.ibm.com*
                              > <vs...@us.ibm.com>> wrote:
                              >
                              > Hi Alex,
                              >
                              > I tried building impala again with the
                              following:
                              > HDFS CDH 5.7.0 (
                              > *
                              http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3*

                              > <
                              http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
                              >
                              > )
                              > HBASE CDH 5.7.0 SNAPSHOT (
                              > *
                              http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz*

                              > <
                              http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
                              > )
                              > - this required to patch in a fix (
                              > *
                              https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch*

                              > <
                              https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
                              >
                              > )
                              > HIVE CDH 5.8.0 SNAPSHOT
                              >
                              > With the above combination, i'm able to
                              move past the exception and
                              > also have the RegionServer service up and
                              running. However, it now gives
                              > error as below:
                              >
                              >
                              >
                              ********************************************************************************************************************

                              >
                              (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                              > CREATE EXTERNAL TABLE IF NOT EXISTS
                              functional.decimal_tbl (
                              > d1 DECIMAL,
                              > d2 DECIMAL(10, 0),
                              > d3 DECIMAL(20, 10),
                              > d4 DECIMAL(38, 38),
                              > d5 DECIMAL(10, 5))
                              > PARTITIONED BY (d6 DECIMAL(9, 0))
                              > ROW FORMAT delimited fields terminated by
                              ','
                              > STORED AS TEXTFILE
                              > LOCATION '/test-warehouse/decimal_tbl'
                              >
                              >
                              (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                              > USE functional
                              >
                              >
                              (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                              > ALTER TABLE decimal_tbl ADD IF NOT EXISTS
                              PARTITION(d6=1)
                              >
                              > Data Loading from Impala failed with error:
                              ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class
                              >
                              'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>

                              > MESSAGE:
                              > Error: null
                              >
                              >
                              ******************************************************************************************************************

                              >
                              > Here is the complete log for the same.
                              *(See attached file:
                              > data-load-functional-exhaustive.log)*
                              >
                              > It would be great if you could guide me on this
                              > issue, so I can proceed with the FE tests.
                              >
                              > Still awaiting link to the source code of
                              HDFS CDH 5.8.0
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              >
                              >




Re: Fw: Issues with generating testdata for Impala

Posted by Jim Apple <jb...@cloudera.com>.
I'd suggest that it will be easier to address the cluster test issues one by
one, each in its own email thread or bug report, since each one may have a
different cause.

On Wed, Jun 8, 2016 at 5:30 AM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Casey,
>
> Data loading issues on ppc are resolved. I have been able to successfully
> complete the data loading on ppc for Impala. The FE tests also ran
> successfully, with 545 tests passing and 36 tests skipped.
>
> I also executed the custom cluster tests (tests=41, failures=5, errors=0,
> skipped=0). Please find attached the log for the same. *(See attached file:
> 8June_cc_tests.txt)*
>
> It would be great if you could share any pointers on these issues.
>
> Regards,
> Valencia
>
>
>
>
> From: Casey Ching <ca...@cloudera.com>
> To: Alex Behm <al...@cloudera.com>, Valencia
> Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
> Cc: Valencia Serrao/Austin/Contr/IBM@IBMUS,
> dev@impala.incubator.apache.org, David Clissold/Austin/IBM@IBMUS,
> Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS
> Date: 05/09/2016 10:45 PM
> Subject: Re: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>
> Have you tried setting up an x86 environment? That could be useful for
> comparing to the ppc environment to see what is/isn’t working and being
> able to see what the logs should look like.
>
> If the tpch database isn’t there, that should mean data loading failed and
> there should have been an error that caused the data loading to exit early
> along with an error message in the logs. Did you see anything like that?
> You might want to try only running the data loading step, then verifying
> that the tpch database exists afterwards.
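
One way to script that check, as a rough sketch: list the databases and test whether tpch is among them. The helper name has_database is hypothetical. The usage line assumes a running impalad and an impala-shell on the PATH (in a development checkout the bin/impala-shell.sh wrapper may be the one to use); -q runs a single query and -B prints plain delimited output.

```shell
# Sketch only: has_database is a hypothetical helper.
# Usage: <command listing one database per line> | has_database NAME
# The database name is taken to be the first whitespace-separated field of
# each line, so an optional trailing comment column is ignored.
has_database() {
  awk -v want="$1" '$1 == want { found = 1 } END { exit (found ? 0 : 1) }'
}

# Example (assumes impalad is up and data loading has already been run):
#   impala-shell -B -q 'SHOW DATABASES' | has_database tpch \
#     || echo "tpch is missing: data loading did not finish" >&2
```

The comparison is an exact match on the name, so a database such as tpch_parquet does not count as tpch.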
>
> Casey
>
> On May 9, 2016 at 5:27:49 AM, Valencia Serrao (*vserrao@us.ibm.com*
> <vs...@us.ibm.com>) wrote:
>
>    Hi Alex/Casey,
>
>       I re-ran the fe tests with the testdata you provided, but the
>       result is the same as that reported in the earlier mail, with most of the
>       failures occurring due to tpch database not existing.
>
>       Steps followed to test are as follows:
>       1. copy the testdata to IMPALA_HOME/testdata/impala-data.
>       2. ./buildall.sh -notests -noclean -format -testdata
>       3. ./bin/run_all_tests.sh
>
>       We also tried the testdata generation on an Ubuntu x86 ppc machine;
>       however, it stops at the same "Invalidate Metadata" step with the exception.
>
>       Any pointers on these issues will be helpful.
>
>       Regards,
>       Valencia
>
>       Valencia Serrao---05/05/2016 06:47:59 PM---Hi Alex/Casey, I tried
>       to run the frontend tests with the data provided. Following is the result:
>
>       From: Valencia Serrao/Austin/Contr/IBM
>       To: Casey Ching <ca...@cloudera.com>
>       Cc: Alex Behm <al...@cloudera.com>,
>       dev@impala.incubator.apache.org, Nishidha
>       Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
>       Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
>       Valencia Serrao/Austin/Contr/IBM@IBMUS
>       Date: 05/05/2016 06:47 PM
>       Subject: Re: Fw: Issues with generating testdata for Impala
>
>       ------------------------------
>
>
>       Hi Alex/Casey,
>
>       I tried to run the frontend tests with the data provided. Following
>       is the result:
>       Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 [attachment
>       "data-load-functional-exhaustive.zip" deleted by Valencia
>       Serrao/Austin/Contr/IBM]
>
>
>       Earlier, the number of "Errors" was 87, so it has now dropped by 10.
>       However, the "Failures" count is still the same. Most of the
>       Failures in PlannerTest and AuthorizationTest are related to tpch (e.g.
>       Database doesn't exist: tpch).
>
>       With regard to the directory "impala_data", I've observed that it
>       is not being accessed/used by any script. Are we missing any
>       configuration?
>
>       Kindly guide me on this.
>
>       Regards,
>       Valencia
>
>
>
>       Valencia Serrao---05/05/2016 02:21:56 PM---Thanks, Casey! I will
>       let you know the test status.
>
>       From: Valencia Serrao/Austin/Contr/IBM
>       To: Casey Ching <ca...@cloudera.com>
>       Cc: Alex Behm <al...@cloudera.com>,
>       dev@impala.incubator.apache.org, Nishidha
>       Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
>       Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
>       Valencia Serrao/Austin/Contr/IBM@IBMUS
>       Date: 05/05/2016 02:21 PM
>       Subject: Re: Fw: Issues with generating testdata for Impala
>       ------------------------------
>
>
>       Thanks, Casey!
>
>       I will let you know the test status.
>
>
>       Casey Ching ---05/05/2016 01:09:11 PM---On May 4, 2016 at 11:08:07
>       PM, Valencia Serrao (vserrao@us.ibm.com) wrote: Hi Alex,
>
>       From: Casey Ching <ca...@cloudera.com>
>       To: Alex Behm <al...@cloudera.com>, Valencia
>       Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
>       Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>       Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
>       Date: 05/05/2016 01:09 PM
>       Subject: Re: Fw: Issues with generating testdata for Impala
>       ------------------------------
>
>
>
>
>       On May 4, 2016 at 11:08:07 PM, Valencia Serrao (*vserrao@us.ibm.com*
>       <vs...@us.ibm.com>) wrote:
>       Hi Alex,
>
>                I've placed the individual testdata tars at
>                IMPALA_HOME/testdata/impala-data. I've already executed
>                steps 1 to 10. I have some questions about steps 11 and 12
>                that I want to clarify:
>
>                1) . bin/impala-config.sh
>                2) mkdir -p $IMPALA_HOME/testdata/impala-data
>                3) pushd $IMPALA_HOME/testdata/impala-data
>                4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
>                5) tar -xzf tpch.tar.gz
>                6) rm tpch.tar.gz
>                7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
>                8) tar -xzf tpcds.tar.gz
>                9) rm tpcds.tar.gz
>                10) popd
>
>                11) ./buildall.sh -notests -noclean -format
>                -----Here I've removed the -testdata option.
>                The reason to do this is to clear the previously generated
>                partial schemas.
>             I think the -format option is supposed to clear out any old
>       state. The -testdata flag is probably needed to generate and load the test
>       data.
>
>
>                12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is
>                this step required? Why?
>             That is only for docker. It helps to reduce the image size.
>       You shouldn’t need to do that or any of the other rm commands.
>
>
>                Could you kindly confirm these steps? If there are any
>                corrections, please let me know.
>
>                Regards,
>                Valencia
>
>
>
>                Valencia Serrao---05/04/2016 04:18:24 PM---Hi Alex/Casey
>                Thank you for responding and for sharing the testdata. I'm working on using
>                the testda
>
>                From: Valencia Serrao/Austin/Contr/IBM
>                To: Alex Behm <al...@cloudera.com>
>                Cc: Casey Ching <ca...@cloudera.com>,
>                dev@impala.incubator.apache.org, Nishidha
>                Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
>                Jagadale/Austin/Contr/IBM@IBMUS, David
>                Clissold/Austin/IBM@IBMUS
>                Date: 05/04/2016 04:18 PM
>                Subject: Re: Fw: Issues with generating testdata for
>                Impala
>
>
>
>                Hi Alex/Casey
>
>                Thank you for responding and for sharing the testdata. I'm
>                working on using the testdata to run the fe tests.
>
>                Meanwhile, I've posted the logs onto "Impala Dev" google
>                group. Here's the link:
>                *https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk*
>                <https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk>
>
>                Regards,
>                Valencia
>
>
>                Alex Behm ---05/04/2016 12:52:44 PM---Ahh, thanks Casey.
>                Did not know about that. Valencia, Impala's data loading expects the files
>                to be
>
>                From: Alex Behm <al...@cloudera.com>
>                To: Casey Ching <ca...@cloudera.com>
>                Cc: dev@impala.incubator.apache.org, Sudarshan
>                Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>                Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
>                Serrao/Austin/Contr/IBM@IBMUS
>                Date: 05/04/2016 12:52 PM
>                Subject: Re: Fw: Issues with generating testdata for Impala
>
>
>
>                Ahh, thanks Casey. Did not know about that.
>
>                Valencia, Impala's data loading expects the files to be
>                placed in IMPALA_HOME/testdata/impala-data
>
>                On Tue, May 3, 2016 at 11:21 PM, Casey Ching <
>                *casey@cloudera.com* <ca...@cloudera.com>> wrote:
>
>                   Comment inline below
>
>                   On May 3, 2016 at 11:18:06 PM, Alex Behm (
>                   *alex.behm@cloudera.com* <al...@cloudera.com>)
>                   wrote:
>                   Hi Valencia,
>
>                               I'm sorry you are having so much trouble
>                               with our setup. Let's see what we
>                               can do.
>
>                               There was an infra issue with receiving the
>                               logs you sent me. The
>                               email/attachment got rejected on our side.
>                               Maybe you can upload the logs
>                               somewhere so I can grab them?
>
>                               See more responses inline below.
>
>                               On Sat, Apr 30, 2016 at 5:01 AM, Valencia
>                               Serrao <*vserrao@us.ibm.com*
>                               <vs...@us.ibm.com>> wrote:
>
>                               > Hi Alex,
>                               >
>                               > I was digging deeper into the logs. I have
>                               > some findings and questions:
>                               >
>                               > 1. At the "Invalidating Metadata" step (as
>                               > mentioned in the mail below), I noticed that
>                               > it is trying to use Kerberos. Perhaps this is
>                               > preventing the testdata generation from
>                               > proceeding, as we are not using Kerberos. How
>                               > can this be done without involving Kerberos
>                               > support?
>                               >
>                               Kerberos is certainly not needed to build
>                               and run tests.
>
>                               >
>                               > 2. I executed the FE tests despite the
>                               > incomplete testdata generation; the tests
>                               > started and, as expected, failed. Many of the
>                               > failures (null pointer exceptions in
>                               > AuthorizationTests) have a common cause:
>                               > "tpch database does not exist.", e.g. as shown in
>                               > .Impala/cluster_logs/query_tests/test-run-workload.log.
>                               >
>                               > Does the "tpch" database get created
>                               > after the current blocker step,
>                               > "Invalidating Metadata"?
>                               >
>
>                               Yes, the TPCH database is created and
>                               loaded as part of that first phase.
>                               However, the data files are not yet
>                               publicly accessible. Let me work on
>                               that from my side, and get back to you
>                               soon. One way or the other we'll be
>                               able to provide you with the data.
>
>                   The data is at
>                   *https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp*
>                   <https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp> .
>                   The files are split into 50 MB pieces for git. You can put them back
>                   together as is done in
>                   *https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile*
>                   <https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile>
>
>                               >
>                               > 3. In the fe test console output log,
>                               another error shown:
>                               > ============================= test
>                               session starts
>                               > ==============================
>                               > platform linux2 -- Python 2.7.5 --
>                               py-1.4.30 -- pytest-2.7.2
>                               > rootdir: /work/, inifile:
>                               > plugins: random, xdist
>                               > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>                               >
>                               > These are not present/created on my VM.
>                               > May I know when these get created?
>                               >
>                               > 4. Could you also share the total number
>                               > of FE tests?
>                               >
>
>                               I'll privately send you the console output
>                               from a successful FE run.
>                               Hopefully that can help.
>
>                               Cheers,
>
>                               Alex
>
>                               >
>                               >
>                               > Looking forward to your reply.
>                               >
>                               > Regards,
>                               > Valencia
>                               >
>                               >
>                               >
>                               > From: Valencia Serrao/Austin/Contr/IBM
>                               > To: *dev@impala.incubator.apache.org*
>                               <de...@impala.incubator.apache.org>, Alex
>                               Behm <*alex.behm@cloudera.com*
>                               <al...@cloudera.com>>
>                               > Cc: Sudarshan
>                               Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>                               > Panpaliya/Austin/Contr/IBM@IBMUS,
>                               Valencia Serrao/Austin/Contr/IBM@IBMUS
>                               > Date: 04/30/2016 09:05 AM
>                               > Subject: Fw: Issues with generating
>                               testdata for Impala
>                               > ------------------------------
>                               >
>                               >
>                               >
>                               > Hi Alex,
>                               >
>                               > I've been able to make some progress on
>                               testdata generation, however, i
>                               > still face the following issues:
>                               >
>                               >
>                               >
>                               *******************************************************************************************************************************************************************
>                               > Invalidating Metadata
>                               >
>                               >
>                               (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
>                               > INSERT OVERWRITE TABLE
>                               functional_parquet.alltypes partition (year, month)
>                               > SELECT id, bool_col, tinyint_col,
>                               smallint_col, int_col, bigint_col,
>                               > float_col, double_col, date_string_col,
>                               string_col, timestamp_col, year,
>                               > month
>                               > FROM functional.alltypes
>                               >
>                               > Data Loading from Impala failed with
>                               error: ImpalaBeeswaxException:
>                               > INNER EXCEPTION: <class 'socket.error'>
>                               > MESSAGE: [Errno 104] Connection reset by
>                               peer
>                               > Error in
>                               /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
>                               > 41: while [ -n "$*" ]
>                               > Error in
>                               /root/nishidha/Impala/buildall.sh at line 368:
>                               >
>                               ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
>                               > <<< Y
>                               >
>                               >
>                               *************************************************************************************************************************************************************************
>                               >
>                               > i continued with fe tests as is. Here is
>                               the complete output log.
>                               > [attachment "fe_test_output.zip" deleted
>                               by Valencia
>                               > Serrao/Austin/Contr/IBM]
>                               >
>                               > Cluster logs: [attachment
>                               "cluster_logs.7z" deleted by Valencia
>                               > Serrao/Austin/Contr/IBM]
>                               >
>                               > Kindly guide me on the same.
>                               >
>                               > Regards,
>                               > Valencia
>                               > ----- Forwarded by Valencia
>                               Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
>                               > -----
>                               >
>                               > From: Sudarshan Jagadale/Austin/Contr/IBM
>                               > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>                               > Date: 04/29/2016 10:49 AM
>                               > Subject: Fw: Issues with generating
>                               testdata for Impala
>                               > ------------------------------
>                               >
>                               >
>                               > FYI
>                               > Thanks and Regards
>                               > Sudarshan Jagadale
>                               > Power Open Source Solutions
>                               > ----- Forwarded by Sudarshan
>                               Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
>                               > AM -----
>                               >
>                               > From: Alex Behm <*alex.behm@cloudera.com*
>                               <al...@cloudera.com>>
>                               > To: *dev@impala.incubator.apache.org*
>                               <de...@impala.incubator.apache.org>
>                               > Cc: Sudarshan
>                               Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>                               > Panpaliya/Austin/Contr/IBM@IBMUS
>                               > Date: 04/28/2016 09:34 PM
>                               > Subject: Re: Issues with generating
>                               testdata for Impala
>                               > ------------------------------
>                               >
>                               >
>                               >
>                               > Hi Valencia,
>                               >
>                               > sorry I did not get the attachment. Would
>                               you be able to tar.gz and attach
>                               > the whole cluster_logs directory?
>                               >
>                               > Alex
>                               >
>                               > On Thu, Apr 28, 2016 at 6:23 AM, Valencia
>                               Serrao <**vserrao@us.ibm.com*
>                               <vs...@us.ibm.com>*
>                               > <*vserrao@us.ibm.com* <vs...@us.ibm.com>>>
>                               wrote:
>                               >
>                               > Hi Alex,
>                               >
>                               > I tried building impala again with the
>                               following:
>                               > HDFS CDH 5.7.0 (
>                               > *
>                               *http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3**
>                               <http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3*>
>                               > <
>                               *http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3*
>                               <http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3>
>                               >
>                               > )
>                               > HBASE CDH 5.7.0 SNAPSHOT (
>                               > *
>                               *http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz**
>                               <http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz*>
>                               > <
>                               *http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz*
>                               <http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz>>
>                               )
>                               > - this required to patch in a fix (
>                               > *
>                               *https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch**
>                               <https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch*>
>                               > <
>                               *https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch*
>                               <https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch>
>                               >
>                               > )
>                               > HIVE CDH 5.8.0 SNAPSHOT
>                               >
>                               > With the above combination, i'm able to
>                               move past the exception and
>                               > also have the RegionServer service up and
>                               running. However, it now gives
>                               > error as below:
>                               >
>                               >
>                               >
>                               ********************************************************************************************************************
>                               >
>                               (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                               > CREATE EXTERNAL TABLE IF NOT EXISTS
>                               functional.decimal_tbl (
>                               > d1 DECIMAL,
>                               > d2 DECIMAL(10, 0),
>                               > d3 DECIMAL(20, 10),
>                               > d4 DECIMAL(38, 38),
>                               > d5 DECIMAL(10, 5))
>                               > PARTITIONED BY (d6 DECIMAL(9, 0))
>                               > ROW FORMAT delimited fields terminated by
>                               ','
>                               > STORED AS TEXTFILE
>                               > LOCATION '/test-warehouse/decimal_tbl'
>                               >
>                               >
>                               (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                               > USE functional
>                               >
>                               >
>                               (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                               > ALTER TABLE decimal_tbl ADD IF NOT EXISTS
>                               PARTITION(d6=1)
>                               >
>                               > Data Loading from Impala failed with
>                               error: ImpalaBeeswaxException:
>                               > INNER EXCEPTION: <class
>                               >
>                               'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
>                               > MESSAGE:
>                               > Error: null
>                               >
>                               >
>                               ******************************************************************************************************************
>                               >
>                               > Here is the complete log for the same.
>                               *(See attached file:
>                               > data-load-functional-exhaustive.log)*
>                               >
>                               > It would great if you could guide me on
>                               this issue, so i could proceed
>                               > with the fe tests.
>                               >
>                               > Still awaiting link to the source code of
>                               HDFS CDH 5.8.0
>                               >
>                               > Regards,
>                               > Valencia
>                               >
>                               >
>                               >
>                               >
>
>
>
>
>
>
>

Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Casey,

Data loading issues on ppc are resolved. I have been able to successfully
complete the data loading on ppc for Impala. The FE tests also ran
successfully, with 545 tests passing and 36 tests skipped.

I also ran the custom cluster tests (tests=41, failures=5, errors=0,
skipped=0). Please find the log attached. (See attached file:
8June_cc_tests.txt)

It would be great if you could share any pointers on these failures.

Regards,
Valencia





From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Valencia Serrao/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org, David
            Clissold/Austin/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS
Date:	05/09/2016 10:45 PM
Subject:	Re: Fw: Issues with generating testdata for Impala



Hi Valencia,

Have you tried setting up an x86 environment? That could be useful for
comparing to the ppc environment to see what is/isn’t working and being
able to see what the logs should look like.

If the tpch database isn’t there, that should mean data loading failed and
there should have been an error that caused the data loading to exit early
along with an error message in the logs. Did you see anything like that?
You might want to try only running the data loading step, then verifying
that the tpch database exists afterwards.
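The check suggested above could be scripted roughly as follows. This is only a sketch: the helper name has_database is hypothetical, and the usage line assumes that impala-shell is on the PATH, that an impalad is running on the default host and port, and that the database name is the first field of each output line.

```shell
#!/bin/sh
# has_database NAME  (hypothetical helper, not part of Impala)
# Reads a database listing on stdin, one database per line, and returns 0
# only if the first whitespace-separated field of some line equals NAME.
has_database() {
  awk -v db="$1" '$1 == db { found = 1 } END { exit !found }'
}

# Example use (assumption: impala-shell on PATH, impalad on the default port):
#   impala-shell -B -q 'show databases' | has_database tpch \
#     && echo "tpch exists" \
#     || echo "tpch missing: data loading did not finish"
```

Matching on the whole first field, rather than grepping for the substring, avoids a false positive from a database such as tpch_parquet.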

Casey


On May 9, 2016 at 5:27:49 AM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex/Casey,

      I re-ran the fe tests with the testdata you provided, but the result
      is the same as that reported in the earlier mail, with most of the
      failures occurring due to tpch database not existing.

      Steps followed to test are as follows:
      1. copy the testdata to IMPALA_HOME/testdata/impala-data.
      2. ./buildall.sh -notests -noclean -format -testdata
      3. ./bin/run_all_tests.sh

      We also tried the testdata generation on an Ubuntu x86 ppc machine;
      however, it stops at the same "Invalidate Metadata" step with the
      exception.

      Any pointers on these issues will be helpful.

      Regards,
      Valencia


      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <ca...@cloudera.com>
      Cc: Alex Behm <al...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 06:47 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey,

      I tried to run the frontend tests with the data provided. Following
      is the result:
      Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 [attachment
      "data-load-functional-exhaustive.zip" deleted by Valencia
      Serrao/Austin/Contr/IBM]


      Earlier, the number of "Errors" was 87, so it has now dropped by 10.
      However, the "Failures" count is still the same. Most of the
      failures in PlannerTest and AuthorizationTest are related to tpch
      (e.g. Database doesn't exist: tpch).
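To compare runs like the two described above, the summary line can be pulled out of a log and reduced to counts. The sketch below assumes a Surefire-style line of the form "Tests run: N, Failures: N, Errors: N, Skipped: N". The helper name summarize_tests is hypothetical, and the "passed" figure assumes that "Tests run" includes the skipped tests.

```shell
#!/bin/sh
# summarize_tests  (hypothetical helper)
# Reads a test log on stdin, takes the last "Tests run: ..." summary line,
# and prints the counts plus a derived "passed" value.
# Assumption: "Tests run" counts skipped tests, so
#   passed = run - failures - errors - skipped.
summarize_tests() {
  sed -n 's/.*Tests run: \([0-9][0-9]*\), Failures: \([0-9][0-9]*\), Errors: \([0-9][0-9]*\), Skipped: \([0-9][0-9]*\).*/\1 \2 \3 \4/p' |
    tail -n 1 |
    awk 'NF == 4 {
      printf "run=%d failed=%d errors=%d skipped=%d passed=%d\n",
             $1, $2, $3, $4, $1 - $2 - $3 - $4
    }'
}

# Example use (the log file name is a placeholder):
#   summarize_tests < fe_test_output.log
```

Fed the summary line quoted above, this prints run=545 failed=226 errors=77 skipped=36 passed=206.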

      With regard to the directory "impala_data", I've observed that it is
      not being accessed or used by any script. Are we missing any
      configuration?

      Kindly guide me on this.

      Regards,
      Valencia




      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <ca...@cloudera.com>
      Cc: Alex Behm <al...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 02:21 PM
      Subject: Re: Fw: Issues with generating testdata for Impala


      Thanks, Casey!

      I will let you know the test status.



      From: Casey Ching <ca...@cloudera.com>
      To: Alex Behm <al...@cloudera.com>, Valencia
      Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Date: 05/05/2016 01:09 PM
      Subject: Re: Fw: Issues with generating testdata for Impala







      On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com)
      wrote:


              Hi Alex,

              I've placed the individual testdata tars in
              IMPALA_HOME/testdata/impala-data. I've already executed steps
              1 to 10. I have some questions about steps 11 and 12 that I
              want to clarify:

              1) . bin/impala-config.sh
              2) mkdir -p $IMPALA_HOME/testdata/impala-data
              3) pushd $IMPALA_HOME/testdata/impala-data
              4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
              5) tar -xzf tpch.tar.gz
              6) rm tpch.tar.gz
              7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
              8) tar -xzf tpcds.tar.gz
              9) rm tpcds.tar.gz
              10) popd

              11) ./buildall.sh -notests -noclean -format
              -----Here I've removed the -testdata option.
              The reason to do this is to clear the previously generated
              partial schemas.
      I think the -format option is supposed to clear out any old state.
      The -testdata flag is probably needed to generate and load the test
      data.


              12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is
              this step required? Why?
      That is only for docker. It helps to reduce the image size. You
      shouldn’t need to do that or any of the other rm commands.
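Steps 4 to 9 above rely on bash brace expansion (for example /tmp/tpch.tar.gz{0..6}), which quietly passes a non-existent file name to cat if a piece was never downloaded. A rough sketch of the same reassembly with an explicit check for missing pieces follows. The helper name join_pieces is hypothetical, and the piece counts are taken from the steps quoted above.

```shell
#!/bin/sh
# join_pieces PREFIX LAST OUT  (hypothetical helper)
# Concatenates PREFIX0, PREFIX1, ..., PREFIX<LAST> into OUT in numeric
# order. Returns non-zero and names the piece if any one is missing.
join_pieces() {
  prefix=$1
  last=$2
  out=$3
  : > "$out" || return 1          # create or truncate the output file
  i=0
  while [ "$i" -le "$last" ]; do
    if [ ! -f "${prefix}${i}" ]; then
      echo "missing piece: ${prefix}${i}" >&2
      return 1
    fi
    cat "${prefix}${i}" >> "$out" || return 1
    i=$((i + 1))
  done
}

# Example use, mirroring steps 4-5 and 7-8 (run inside
# $IMPALA_HOME/testdata/impala-data):
#   join_pieces /tmp/tpch.tar.gz 6 tpch.tar.gz   && tar -xzf tpch.tar.gz
#   join_pieces /tmp/tpcds.tar.gz 3 tpcds.tar.gz && tar -xzf tpcds.tar.gz
```

Because tar only runs when join_pieces succeeds, an incomplete download is reported directly instead of surfacing later as a corrupt-archive error.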


              Could you kindly confirm these steps? If there are any
              corrections, please let me know.

              Regards,
              Valencia




              From: Valencia Serrao/Austin/Contr/IBM
              To: Alex Behm <al...@cloudera.com>
              Cc: Casey Ching <ca...@cloudera.com>,
              dev@impala.incubator.apache.org, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, David
              Clissold/Austin/IBM@IBMUS
              Date: 05/04/2016 04:18 PM
              Subject: Re: Fw: Issues with generating testdata for Impala




              Hi Alex/Casey

              Thank you for responding and for sharing the testdata. I'm
              working on using the testdata to run the fe tests.

              Meanwhile, I've posted the logs onto "Impala Dev" google
              group. Here's the link:
              https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


              Regards,
              Valencia



              From: Alex Behm <al...@cloudera.com>
              To: Casey Ching <ca...@cloudera.com>
              Cc: dev@impala.incubator.apache.org, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
              Serrao/Austin/Contr/IBM@IBMUS
              Date: 05/04/2016 12:52 PM
              Subject: Re: Fw: Issues with generating testdata for Impala



              Ahh, thanks Casey. Did not know about that.

              Valencia, Impala's data loading expects the files to be
              placed in IMPALA_HOME/testdata/impala-data

              On Tue, May 3, 2016 at 11:21 PM, Casey Ching <
              casey@cloudera.com> wrote:


                  Comment inline below


                  On May 3, 2016 at 11:18:06 PM, Alex Behm (
                  alex.behm@cloudera.com) wrote:


                              Hi Valencia,

                              I'm sorry you are having so much trouble with
                              our setup. Let's see what we
                              can do.

                              There was an infra issue with receiving the
                              logs you sent me. The
                              email/attachment got rejected on our side.
                              Maybe you can upload the logs
                              somewhere so I can grab them?

                              See more responses inline below.

                              On Sat, Apr 30, 2016 at 5:01 AM, Valencia
                              Serrao <vs...@us.ibm.com> wrote:

                              > Hi Alex,
                              >
                              > I was going more deeper through the logs. I
                              have some findings and queries:
                              >
                              > 1. At the "Invalidating Metadata" step (as
                              mentioned in below mail), i
                              > noticed that, it is trying to use kerberos.
                              Perhaps, this is preventing the
                              > testdata generation from proceeding, as we
                              are not using Kerberos.
                              > I need to know how this can be done without
                              involving Kerberos support ?
                              >
                              Kerberos is certainly not needed to build and
                              run tests.

                              >
                              > 2. I had executed the fe tests despite the
                              incomplete testdata generation,
                              > the tests started and surely have failed.
                              Many of these (null pointer
                              > exception in AuthorzationTests) have a
                              common cause: "tpch database does
                              > not exist."
                              > e.g. as shown
                              in .Impala/cluster_logs/query_tests/test-run-workload.log.

                              >
                              > Does the "tpch" database gets created after
                              the current blocker step
                              > "Invalidating Metadata" ?
                              >

                              Yes, the TPCH database is created and loaded
                              as part of that first phase.
                              However, the data files are not yet publicly
                              accessible. Let me work on
                              that from my side, and get back to you soon.
                              One way or the other we'll be
                              able to provide you with the data.

                  The data is at
                  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
                   . The files are split into 50 MB pieces for git. You can
                  put them back together as is done in
                  https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile


                              >
                              > 3. In the fe test console output log,
                              another error shown:
                              > ============================= test session
                              starts
                              > ==============================
                              > platform linux2 -- Python 2.7.5 --
                              py-1.4.30 -- pytest-2.7.2
                              > rootdir: /work/, inifile:
                              > plugins: random, xdist
                              > ERROR: file not found: /work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/

                              >
                              > These are not present/created on my vm. May
                              i know when these get created ?
                              >
                              > 4. Could you also share the total number of
                              fe tests ?
                              >

                              I'll privately send you the console output
                              from a successful FE run.
                              Hopefully that can help.

                              Cheers,

                              Alex

                              >
                              >
                              > Looking forward to your reply.
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              > [image: Inactive hide details for Valencia
                              Serrao---04/30/2016 09:05:54
                              > AM---Hi Alex, I've been able to make some
                              progress on testdata]Valencia
                              > Serrao---04/30/2016 09:05:54 AM---Hi Alex,
                              I've been able to make some
                              > progress on testdata generation, however, i
                              still face the foll
                              >
                              > From: Valencia Serrao/Austin/Contr/IBM
                              > To: dev@impala.incubator.apache.org, Alex
                              Behm <al...@cloudera.com>
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
                              Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/30/2016 09:05 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Alex,
                              >
                              > I've been able to make some progress on
                              testdata generation, however, i
                              > still face the following issues:
                              >
                              >
                              >
                              *******************************************************************************************************************************************************************

                              > Invalidating Metadata
                              >
                              >
                              (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

                              > INSERT OVERWRITE TABLE
                              functional_parquet.alltypes partition (year,
                              month)
                              > SELECT id, bool_col, tinyint_col,
                              smallint_col, int_col, bigint_col,
                              > float_col, double_col, date_string_col,
                              string_col, timestamp_col, year,
                              > month
                              > FROM functional.alltypes
                              >
                              > Data Loading from Impala failed with error:
                              ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class 'socket.error'>
                              > MESSAGE: [Errno 104] Connection reset by
                              peer
                              > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh
                              at line 41: while [ -n "$*" ]
                              > Error in /root/nishidha/Impala/buildall.sh
                              at line 368:
                              > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
                              > <<< Y
                              >
                              >
                              *************************************************************************************************************************************************************************

                              >
                              > i continued with fe tests as is. Here is
                              the complete output log.
                              > [attachment "fe_test_output.zip" deleted by
                              Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Cluster logs: [attachment "cluster_logs.7z"
                              deleted by Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Kindly guide me on the same.
                              >
                              > Regards,
                              > Valencia
                              > ----- Forwarded by Valencia
                              Serrao/Austin/Contr/IBM on 04/29/2016 10:57
                              AM
                              > -----
                              >
                              > From: Sudarshan Jagadale/Austin/Contr/IBM
                              > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/29/2016 10:49 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              > FYI
                              > Thanks and Regards
                              > Sudarshan Jagadale
                              > Power Open Source Solutions
                              > ----- Forwarded by Sudarshan
                              Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
                              > AM -----
                              >
                              > From: Alex Behm <al...@cloudera.com>
                              > To: dev@impala.incubator.apache.org
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS
                              > Date: 04/28/2016 09:34 PM
                              > Subject: Re: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Valencia,
                              >
                              > sorry I did not get the attachment. Would
                              you be able to tar.gz and attach
                              > the whole cluster_logs directory?
                              >
                              > Alex
                              >
                              > On Thu, Apr 28, 2016 at 6:23 AM, Valencia
                              Serrao <*vserrao@us.ibm.com*
                              > <vs...@us.ibm.com>> wrote:
                              >
                              > Hi Alex,
                              >
                              > I tried building Impala again with the following:
                              > HDFS CDH 5.7.0 (
                              > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
                              > )
                              > HBASE CDH 5.7.0 SNAPSHOT (
                              > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz )
                              > - this required patching in a fix (
                              > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
                              > )
                              > HIVE CDH 5.8.0 SNAPSHOT
                              >
                              > With the above combination, i'm able to
                              move past the exception and
                              > also have the RegionServer service up and
                              running. However, it now gives
                              > error as below:
                              >
                              >
                              >
                              > ********************************************************************************************************************
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
                              > d1 DECIMAL,
                              > d2 DECIMAL(10, 0),
                              > d3 DECIMAL(20, 10),
                              > d4 DECIMAL(38, 38),
                              > d5 DECIMAL(10, 5))
                              > PARTITIONED BY (d6 DECIMAL(9, 0))
                              > ROW FORMAT delimited fields terminated by ','
                              > STORED AS TEXTFILE
                              > LOCATION '/test-warehouse/decimal_tbl'
                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > USE functional
                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
                              >
                              > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class
                              > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                              > MESSAGE:
                              > Error: null
                              >
                              > ******************************************************************************************************************
                              >
                              > Here is the complete log for the same.
                              *(See attached file:
                              > data-load-functional-exhaustive.log)*
                              >
                              > It would be great if you could guide me on this issue, so I could proceed
                              > with the fe tests.
                              >
                              > Still awaiting link to the source code of
                              HDFS CDH 5.8.0
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              >
                              >





Re: Fw: Issues with generating testdata for Impala

Posted by Casey Ching <ca...@cloudera.com>.
I’ve seen that HBase error too and didn’t have time to look into it. I’m pretty sure it’s a test problem, though, so you can ignore it.


On May 13, 2016 at 1:35:33 AM, Valencia Serrao (vserrao@us.ibm.com) wrote:

Hi Casey,

Yes, I have tried Impala data loading in an x86 environment. The testdata loading completes successfully. The frontend tests have also run, with one failure.
Test status on x86: Tests run: 506, Failures: 1, Errors: 0, Skipped: 20
The failure is an assertion in the Planner test.

The logs are here: (See attached file: TestLoadingConsoleOutput-X86.zip) and (See attached file: com.cloudera.impala.planner.PlannerTest.zip)

Please comment on the same.

Regarding the ppc and x86 logs for testdata loading, I'll post them in a separate email.

Regards,
Valencia



 Casey Ching ---05/09/2016 10:45:15 PM---Hi Valencia, Have you tried setting up an x86 environment? That could be useful for comparing to the

From: Casey Ching <ca...@cloudera.com>
To: Alex Behm <al...@cloudera.com>, Valencia Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc: Valencia Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org, David Clissold/Austin/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS
Date: 05/09/2016 10:45 PM
Subject: Re: Fw: Issues with generating testdata for Impala




Hi Valencia,

Have you tried setting up an x86 environment? That could be useful for comparing to the ppc environment to see what is/isn’t working and being able to see what the logs should look like.

If the tpch database isn’t there, that should mean data loading failed and there should have been an error that caused the data loading to exit early along with an error message in the logs. Did you see anything like that? You might want to try only running the data loading step, then verifying that the tpch database exists afterwards.
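
A minimal sketch of the "verify that the tpch database exists" check, assuming the list of database names is available one per line with the name in the first column. The helper name has_database is hypothetical, and piping from bin/impala-shell.sh with -B -q is an assumption about the local setup rather than something confirmed in this thread.

```shell
# has_database NAME: read database names on stdin (one per line, name in the
# first column) and succeed only if NAME matches a whole name exactly.
# Hypothetical helper for illustration; not part of the Impala scripts.
has_database() {
  awk -v db="$1" '$1 == db { found = 1 } END { exit !found }'
}

# Possible usage (assumes impala-shell.sh is available and a cluster is up):
#   "$IMPALA_HOME"/bin/impala-shell.sh -B -q 'show databases' | has_database tpch \
#     && echo "tpch is present" || echo "tpch is missing"
```

Matching on the whole first column avoids a false positive from names that merely start with tpch, such as tpch_parquet.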

Casey
On May 9, 2016 at 5:27:49 AM, Valencia Serrao (vserrao@us.ibm.com) wrote:

Hi Alex/Casey,

I re-ran the fe tests with the testdata you provided, but the result is the same as that reported in the earlier mail, with most of the failures occurring due to tpch database not existing.

Steps followed to test are as follows:
1. copy the testdata to IMPALA_HOME/testdata/impala-data.
2. ./buildall.sh -notests -noclean -format -testdata
3. ./bin/run_all_tests.sh

We had also tried the testdata generation on Ubuntu x86 ppc machine however, it stops at the same "Invalidate Metadata" step with the exception.

Any pointers on these issues will be helpful.

Regards,
Valencia

Valencia Serrao---05/05/2016 06:47:59 PM---Hi Alex/Casey, I tried to run the frontend tests with the data provided. Following is the result:

From: Valencia Serrao/Austin/Contr/IBM
To: Casey Ching <ca...@cloudera.com>
Cc: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/05/2016 06:47 PM
Subject: Re: Fw: Issues with generating testdata for Impala  

Hi Alex/Casey,

I tried to run the frontend tests with the data provided. Following is the result:
Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 [attachment "data-load-functional-exhaustive.zip" deleted by Valencia Serrao/Austin/Contr/IBM]


Earlier, the number of "Errors" was 87, so it has now dropped by 10. However, the "Failures" count is still the same. Most of the failures in PlannerTest and AuthorizationTest are related to tpch (e.g. Database doesn't exist: tpch).

With regard to the directory "impala_data", I've observed that it is not being accessed/used by any script. Are we missing any configuration?

Kindly guide me on this.

Regards,
Valencia



Valencia Serrao---05/05/2016 02:21:56 PM---Thanks, Casey! I will let you know the test status.

From: Valencia Serrao/Austin/Contr/IBM
To: Casey Ching <ca...@cloudera.com>
Cc: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/05/2016 02:21 PM
Subject: Re: Fw: Issues with generating testdata for Impala


Thanks, Casey!

I will let you know the test status.


Casey Ching ---05/05/2016 01:09:11 PM---On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote: Hi Alex,

From: Casey Ching <ca...@cloudera.com>
To: Alex Behm <al...@cloudera.com>, Valencia Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Date: 05/05/2016 01:09 PM
Subject: Re: Fw: Issues with generating testdata for Impala




On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:

Hi Alex,

I've placed the individual testdata tars at IMPALA_HOME/testdata/impala-data. I've already executed steps 1 to 10. I have some questions about steps 11 and 12 that I'd like to clarify:

1) . bin/impala-config.sh
2) mkdir -p $IMPALA_HOME/testdata/impala-data
3) pushd $IMPALA_HOME/testdata/impala-data
4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
5) tar -xzf tpch.tar.gz
6) rm tpch.tar.gz
7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
8) tar -xzf tpcds.tar.gz
9) rm tpcds.tar.gz
10) popd

11) ./buildall.sh -notests -noclean -format
-----Here I've removed the -testdata option.
The reason to do this is to clear the previously generated partial schemas.
I think the -format option is supposed to clear out any old state. The -testdata flag is probably needed to generate and load the test data.


12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step required? Why?
That is only for docker. It helps to reduce the image size. You shouldn’t need to do that or any of the other rm commands.


Could you kindly confirm on these steps ? If any corrections, please let me know.

Regards,
Valencia



Valencia Serrao---05/04/2016 04:18:24 PM---Hi Alex/Casey Thank you for responding and for sharing the testdata. I'm working on using the testda

From: Valencia Serrao/Austin/Contr/IBM
To: Alex Behm <al...@cloudera.com>
Cc: Casey Ching <ca...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
Date: 05/04/2016 04:18 PM
Subject: Re: Fw: Issues with generating testdata for Impala


Hi Alex/Casey

Thank you for responding and for sharing the testdata. I'm working on using the testdata to run the fe tests.

Meanwhile, I've posted the logs onto "Impala Dev" google group. Here's the link: https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk

Regards,
Valencia


Alex Behm ---05/04/2016 12:52:44 PM---Ahh, thanks Casey. Did not know about that. Valencia, Impala's data loading expects the files to be

From: Alex Behm <al...@cloudera.com>
To: Casey Ching <ca...@cloudera.com>
Cc: dev@impala.incubator.apache.org, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/04/2016 12:52 PM
Subject: Re: Fw: Issues with generating testdata for Impala



Ahh, thanks Casey. Did not know about that.

Valencia, Impala's data loading expects the files to be placed in IMPALA_HOME/testdata/impala-data
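
A small sketch of a sanity check before starting the data load, assuming all that matters is that the directory exists and is not empty. The helper name testdata_dir_ready is hypothetical. Note that the expected name uses a hyphen (impala-data), so a directory named impala_data would not be found by this check.

```shell
# testdata_dir_ready [DIR]: succeed if DIR exists and contains at least one
# entry. DIR defaults to the location named above. Hypothetical helper.
testdata_dir_ready() {
  dir=${1:-"$IMPALA_HOME/testdata/impala-data"}
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Possible usage:
#   testdata_dir_ready || echo "impala-data is missing or empty" >&2
```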

On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com> wrote:


Comment inline below
On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:

Hi Valencia,

I'm sorry you are having so much trouble with our setup. Let's see what we
can do.

There was an infra issue with receiving the logs you sent me. The
email/attachment got rejected on our side. Maybe you can upload the logs
somewhere so I can grab them?

See more responses inline below.

On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Alex,
>
> I was going deeper through the logs. I have some findings and queries:
>
> 1. At the "Invalidating Metadata" step (as mentioned in below mail), i
> noticed that, it is trying to use kerberos. Perhaps, this is preventing the
> testdata generation from proceeding, as we are not using Kerberos.
> I need to know how this can be done without involving Kerberos support ?
>
Kerberos is certainly not needed to build and run tests.

>
> 2. I had executed the fe tests despite the incomplete testdata generation,
> the tests started and surely have failed. Many of these (null pointer
> exception in AuthorizationTests) have a common cause: "tpch database does
> not exist."
> e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
>
> Does the "tpch" database gets created after the current blocker step
> "Invalidating Metadata" ?
>

Yes, the TPCH database is created and loaded as part of that first phase.
However, the data files are not yet publicly accessible. Let me work on
that from my side, and get back to you soon. One way or the other we'll be
able to provide you with the data.

The data is at https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp . The files are split into 50 MB pieces for git. You can put them back together as is done in https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
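
A minimal sketch of putting the split pieces back together, assuming they are named with a common prefix followed by 0, 1, 2 and so on (as in tpch.tar.gz0 to tpch.tar.gz6). The helper name join_pieces is hypothetical; it does the same job as a plain cat over the pieces, but stops if a piece is missing instead of producing a truncated archive.

```shell
# join_pieces PREFIX COUNT OUT: concatenate PREFIX0 .. PREFIX(COUNT-1) into
# OUT in numeric order, failing on the first missing piece.
# Hypothetical helper for illustration.
join_pieces() {
  prefix=$1
  count=$2
  out=$3
  : > "$out" || return 1
  i=0
  while [ "$i" -lt "$count" ]; do
    if [ ! -f "${prefix}${i}" ]; then
      echo "missing piece: ${prefix}${i}" >&2
      return 1
    fi
    cat "${prefix}${i}" >> "$out" || return 1
    i=$((i + 1))
  done
}

# Possible usage, assuming the pieces were downloaded to /tmp:
#   join_pieces /tmp/tpch.tar.gz 7 tpch.tar.gz && tar -xzf tpch.tar.gz
#   join_pieces /tmp/tpcds.tar.gz 4 tpcds.tar.gz && tar -xzf tpcds.tar.gz
```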

>
> 3. In the fe test console output log, another error shown:
> ============================= test session starts
> ==============================
> platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
> rootdir: /work/, inifile:
> plugins: random, xdist
> ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>
> These are not present/created on my vm. May i know when these get created ?
>
> 4. Could you also share the total number of fe tests ?
>

I'll privately send you the console output from a successful FE run.
Hopefully that can help.

Cheers,

Alex

>
>
> Looking forward to your reply.
>
> Regards,
> Valencia
>
>
> Valencia Serrao---04/30/2016 09:05:54 AM---Hi Alex, I've been able to make some
> progress on testdata generation, however, i still face the foll
>
> From: Valencia Serrao/Austin/Contr/IBM
> To: dev@impala.incubator.apache.org, Alex Behm <al...@cloudera.com>
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/30/2016 09:05 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Alex,
>
> I've been able to make some progress on testdata generation, however, i
> still face the following issues:
>
>
> *******************************************************************************************************************************************************************
> Invalidating Metadata
>
> (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
> INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
> SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
> float_col, double_col, date_string_col, string_col, timestamp_col, year,
> month
> FROM functional.alltypes
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class 'socket.error'>
> MESSAGE: [Errno 104] Connection reset by peer
> Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
> 41: while [ -n "$*" ]
> Error in /root/nishidha/Impala/buildall.sh at line 368:
> ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
> <<< Y
>
> *************************************************************************************************************************************************************************
>
> i continued with fe tests as is. Here is the complete output log.
> [attachment "fe_test_output.zip" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Kindly guide me on the same.
>
> Regards,
> Valencia
> ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
> -----
>
> From: Sudarshan Jagadale/Austin/Contr/IBM
> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/29/2016 10:49 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
> FYI
> Thanks and Regards
> Sudarshan Jagadale
> Power Open Source Solutions
> ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
> AM -----
>
> From: Alex Behm <al...@cloudera.com>
> To: dev@impala.incubator.apache.org  
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS
> Date: 04/28/2016 09:34 PM
> Subject: Re: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>
> sorry I did not get the attachment. Would you be able to tar.gz and attach
> the whole cluster_logs directory?
>
> Alex
>
> On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*vserrao@us.ibm.com*
> <vs...@us.ibm.com>> wrote:
>
> Hi Alex,
>
> I tried building Impala again with the following:
> HDFS CDH 5.7.0 (
> http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
> )
> HBASE CDH 5.7.0 SNAPSHOT (
> http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz )
> - this required patching in a fix (
> https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
> )
> HIVE CDH 5.8.0 SNAPSHOT
>
> With the above combination, i'm able to move past the exception and
> also have the RegionServer service up and running. However, it now gives
> error as below:
>
>
> ********************************************************************************************************************
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
> d1 DECIMAL,
> d2 DECIMAL(10, 0),
> d3 DECIMAL(20, 10),
> d4 DECIMAL(38, 38),
> d5 DECIMAL(10, 5))
> PARTITIONED BY (d6 DECIMAL(9, 0))
> ROW FORMAT delimited fields terminated by ','
> STORED AS TEXTFILE
> LOCATION '/test-warehouse/decimal_tbl'
>
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> USE functional
>
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class
> 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
> MESSAGE:
> Error: null
>
> ******************************************************************************************************************
>
> Here is the complete log for the same. *(See attached file:
> data-load-functional-exhaustive.log)*
>
> It would be great if you could guide me on this issue, so I could proceed
> with the fe tests.
>
> Still awaiting link to the source code of HDFS CDH 5.8.0
>
> Regards,
> Valencia
>
>
>
>




Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Casey,

Yes, I have tried Impala data loading in an x86 environment. The testdata
loading completes successfully. The frontend tests have also run, with
one failure.
Test status on x86: Tests run: 506, Failures: 1, Errors: 0, Skipped: 20
The failure is an assertion in the Planner test.
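
For comparing runs, a rough sketch that turns a summary line like the one above into a count of passing tests. It assumes the usual Maven Surefire convention, in which "Tests run" already includes failures, errors and skipped tests; the helper name passed_count is hypothetical.

```shell
# passed_count: read "Tests run: N, Failures: F, Errors: E, Skipped: S" lines
# on stdin and print N - F - E - S for each one.
# Assumes "Tests run" includes failed, errored and skipped tests.
passed_count() {
  awk -F'[:,]' '/Tests run:/ { print $2 - $4 - $6 - $8 }'
}

# Possible usage:
#   echo 'Tests run: 506, Failures: 1, Errors: 0, Skipped: 20' | passed_count
```

Under that assumption the x86 run above has 485 passing tests, while the earlier run reporting 545 tests with 226 failures, 77 errors and 36 skipped has 206.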

The logs are here: (See attached file: TestLoadingConsoleOutput-X86.zip)
and (See attached file: com.cloudera.impala.planner.PlannerTest.zip)

Please comment on the same.

Regarding the ppc and x86 logs for testdata loading, I'll post them in a
separate email.

Regards,
Valencia





From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Valencia Serrao/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org, David
            Clissold/Austin/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS
Date:	05/09/2016 10:45 PM
Subject:	Re: Fw: Issues with generating testdata for Impala



Hi Valencia,

Have you tried setting up an x86 environment? That could be useful for
comparing to the ppc environment to see what is/isn’t working and being
able to see what the logs should look like.

If the tpch database isn’t there, that should mean data loading failed and
there should have been an error that caused the data loading to exit early
along with an error message in the logs. Did you see anything like that?
You might want to try only running the data loading step, then verifying
that the tpch database exists afterwards.

Casey


On May 9, 2016 at 5:27:49 AM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex/Casey,

      I re-ran the fe tests with the testdata you provided, but the result
      is the same as that reported in the earlier mail, with most of the
      failures occurring due to tpch database not existing.

      Steps followed to test are as follows:
      1. copy the testdata to IMPALA_HOME/testdata/impala-data.
      2. ./buildall.sh -notests -noclean -format -testdata
      3. ./bin/run_all_tests.sh

      We had also tried the testdata generation on Ubuntu x86 ppc machine
      however, it stops at the same "Invalidate Metadata" step with the
      exception.

      Any pointers on these issues will be helpful.

      Regards,
      Valencia

      Valencia Serrao---05/05/2016 06:47:59 PM---Hi Alex/Casey, I tried to
      run the frontend tests with the data provided. Following is the
      result:

      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <ca...@cloudera.com>
      Cc: Alex Behm <al...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 06:47 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey,

      I tried to run the frontend tests with the data provided. Following
      is the result:
      Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 [attachment
      "data-load-functional-exhaustive.zip" deleted by Valencia
      Serrao/Austin/Contr/IBM]


      Earlier, the number of "Errors" was 87, so it has now dropped by 10.
      However, the "Failures" count is still the same. Most of the
      failures in PlannerTest and AuthorizationTest are related to tpch
      (e.g. Database doesn't exist: tpch).

      With regard to the directory "impala_data", I've observed that it is
      not being accessed/used by any script. Are we missing any
      configuration?

      Kindly guide me on this.

      Regards,
      Valencia



      Valencia Serrao---05/05/2016 02:21:56 PM---Thanks, Casey! I will let
      you know the test status.

      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <ca...@cloudera.com>
      Cc: Alex Behm <al...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 02:21 PM
      Subject: Re: Fw: Issues with generating testdata for Impala


      Thanks, Casey!

      I will let you know the test status.


      Casey Ching ---05/05/2016 01:09:11 PM---On May 4, 2016 at 11:08:07
      PM, Valencia Serrao (vserrao@us.ibm.com) wrote: Hi Alex,

      From: Casey Ching <ca...@cloudera.com>
      To: Alex Behm <al...@cloudera.com>, Valencia
      Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Date: 05/05/2016 01:09 PM
      Subject: Re: Fw: Issues with generating testdata for Impala







      On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com)
      wrote:


              Hi Alex,

              I've placed the individual testdata tars at the
              IMPALA_HOME/testdata/impala-data. Steps 1...10 i've already
              executed. Some queries about step no:11 and step no:12, that
              i want to clarify:

              1) . bin/impala-config.sh
              2) mkdir -p $IMPALA_HOME/testdata/impala-data
              3) pushd $IMPALA_HOME/testdata/impala-data
              4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
              5) tar -xzf tpch.tar.gz
              6) rm tpch.tar.gz
              7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
              8) tar -xzf tpcds.tar.gz
              9) rm tpcds.tar.gz
              10) popd

              11) ./buildall.sh -notests -noclean -format
              -----Here I've removed the -testdata option.
              The reason to do this is to clear the previously generated
              partial schemas.
      I think the -format option is supposed to clear out any old state.
      The -testdata flag is probably needed to generate and load the test
      data.


              12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is
              this step required? Why?
      That is only for docker. It helps to reduce the image size. You
      shouldn’t need to do that or any of the other rm commands.


              Could you kindly confirm on these steps ? If any corrections,
              please let me know.

              Regards,
              Valencia



              Valencia Serrao---05/04/2016 04:18:24 PM---Hi Alex/Casey
              Thank you for responding and for sharing the testdata. I'm
              working on using the testda

              From: Valencia Serrao/Austin/Contr/IBM
              To: Alex Behm <al...@cloudera.com>
              Cc: Casey Ching <ca...@cloudera.com>,
              dev@impala.incubator.apache.org, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, David
              Clissold/Austin/IBM@IBMUS
              Date: 05/04/2016 04:18 PM
              Subject: Re: Fw: Issues with generating testdata for Impala




              Hi Alex/Casey

              Thank you for responding and for sharing the testdata. I'm
              working on using the testdata to run the fe tests.

              Meanwhile, I've posted the logs onto "Impala Dev" google
              group. Here's the link:
              https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


              Regards,
              Valencia


              Alex Behm ---05/04/2016 12:52:44 PM---Ahh, thanks Casey. Did
              not know about that. Valencia, Impala's data loading expects
              the files to be

              From: Alex Behm <al...@cloudera.com>
              To: Casey Ching <ca...@cloudera.com>
              Cc: dev@impala.incubator.apache.org, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
              Serrao/Austin/Contr/IBM@IBMUS
              Date: 05/04/2016 12:52 PM
              Subject: Re: Fw: Issues with generating testdata for Impala



              Ahh, thanks Casey. Did not know about that.

              Valencia, Impala's data loading expects the files to be
              placed in IMPALA_HOME/testdata/impala-data

              On Tue, May 3, 2016 at 11:21 PM, Casey Ching <
              casey@cloudera.com> wrote:


                  Comment inline below


                  On May 3, 2016 at 11:18:06 PM, Alex Behm (
                  alex.behm@cloudera.com) wrote:


                              Hi Valencia,

                              I'm sorry you are having so much trouble with
                              our setup. Let's see what we
                              can do.

                              There was an infra issue with receiving the
                              logs you sent me. The
                              email/attachment got rejected on our side.
                              Maybe you can upload the logs
                              somewhere so I can grab them?

                              See more responses inline below.

                              On Sat, Apr 30, 2016 at 5:01 AM, Valencia
                              Serrao <vs...@us.ibm.com> wrote:

                              > Hi Alex,
                              >
                              > I was going more deeper through the logs. I
                              have some findings and queries:
                              >
                              > 1. At the "Invalidating Metadata" step (as
                              mentioned in below mail), i
                              > noticed that, it is trying to use kerberos.
                              Perhaps, this is preventing the
                              > testdata generation from proceeding, as we
                              are not using Kerberos.
                              > I need to know how this can be done without
                              involving Kerberos support ?
                              >
                              Kerberos is certainly not needed to build and
                              run tests.

                              >
                              > 2. I ran the fe tests despite the incomplete
                              > testdata generation. The tests started and, as
                              > expected, failed. Many of the failures (null
                              > pointer exceptions in AuthorizationTest) have a
                              > common cause: "tpch database does not exist",
                              > e.g. as shown in
                              > .Impala/cluster_logs/query_tests/test-run-workload.log.
                              >
                              > Does the "tpch" database get created after the
                              > current blocker step "Invalidating Metadata"?
                              >

                              Yes, the TPCH database is created and loaded
                              as part of that first phase.
                              However, the data files are not yet publicly
                              accessible. Let me work on
                              that from my side, and get back to you soon.
                              One way or the other we'll be
                              able to provide you with the data.

                  The data is at
                  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
                   . The files are split into 50 MB pieces for git. You can
                  put them back together as is done in
                  https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
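
A minimal sketch of putting the split pieces back together, assuming a POSIX shell and that the pieces are named with a numeric suffix starting at 0 (as in the tpch.tar.gz{0..6} and tpcds.tar.gz{0..3} commands quoted elsewhere in this thread). The helper name join_pieces is hypothetical and is not part of the Impala scripts.

```shell
# join_pieces <prefix> <last_index> <output>
# Concatenates <prefix>0, <prefix>1, ... <prefix><last_index> into <output>,
# in numeric order, and fails if any piece is missing.
join_pieces() {
  prefix="$1"
  last="$2"
  out="$3"
  : > "$out"                      # start from an empty output file
  i=0
  while [ "$i" -le "$last" ]; do
    if [ ! -f "${prefix}${i}" ]; then
      echo "missing piece: ${prefix}${i}" >&2
      return 1
    fi
    cat "${prefix}${i}" >> "$out"
    i=$((i + 1))
  done
}
```

With the pieces downloaded to /tmp, this could be used as: join_pieces /tmp/tpch.tar.gz 6 tpch.tar.gz && tar -xzf tpch.tar.gz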


                              >
                              > 3. The fe test console output log shows another error:
                              > ============================= test session starts ==============================
                              > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
                              > rootdir: /work/, inifile:
                              > plugins: random, xdist
                              > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
                              >
                              > These are not present/created on my VM. May I
                              > know when they get created?
                              >
                              > 4. Could you also share the total number of fe
                              > tests?
                              >

                              I'll privately send you the console output
                              from a successful FE run.
                              Hopefully that can help.

                              Cheers,

                              Alex

                              >
                              >
                              > Looking forward to your reply.
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              >
                              > From: Valencia Serrao/Austin/Contr/IBM
                              > To: dev@impala.incubator.apache.org, Alex
                              Behm <al...@cloudera.com>
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
                              Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/30/2016 09:05 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Alex,
                              >
                              > I've been able to make some progress on
                              testdata generation, however, i
                              > still face the following issues:
                              >
                              >
                              >
                              *******************************************************************************************************************************************************************

                              > Invalidating Metadata
                              >
                              >
                              > (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
                              > INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
                              > SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
                              > float_col, double_col, date_string_col, string_col, timestamp_col, year,
                              > month
                              > FROM functional.alltypes
                              >
                              > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class 'socket.error'>
                              > MESSAGE: [Errno 104] Connection reset by peer
                              > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
                              > Error in /root/nishidha/Impala/buildall.sh at line 368:
                              > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
                              > <<< Y
                              >
                              >
                              *************************************************************************************************************************************************************************

                              >
                              > I continued with the fe tests as is. Here is
                              > the complete output log.
                              > [attachment "fe_test_output.zip" deleted by
                              Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Cluster logs: [attachment "cluster_logs.7z"
                              deleted by Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Kindly guide me on the same.
                              >
                              > Regards,
                              > Valencia
                              > ----- Forwarded by Valencia
                              Serrao/Austin/Contr/IBM on 04/29/2016 10:57
                              AM
                              > -----
                              >
                              > From: Sudarshan Jagadale/Austin/Contr/IBM
                              > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/29/2016 10:49 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              > FYI
                              > Thanks and Regards
                              > Sudarshan Jagadale
                              > Power Open Source Solutions
                              > ----- Forwarded by Sudarshan
                              Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
                              > AM -----
                              >
                              > From: Alex Behm <al...@cloudera.com>
                              > To: dev@impala.incubator.apache.org
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS
                              > Date: 04/28/2016 09:34 PM
                              > Subject: Re: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Valencia,
                              >
                              > sorry I did not get the attachment. Would
                              you be able to tar.gz and attach
                              > the whole cluster_logs directory?
                              >
                              > Alex
                              >
                              > On Thu, Apr 28, 2016 at 6:23 AM, Valencia
                              > Serrao <vs...@us.ibm.com> wrote:
                              >
                              > Hi Alex,
                              >
                              > I tried building Impala again with the following:
                              > HDFS CDH 5.7.0
                              > (http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3)
                              > HBASE CDH 5.7.0 SNAPSHOT
                              > (http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz)
                              > - this required patching in a fix
                              > (https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch)
                              > HIVE CDH 5.8.0 SNAPSHOT
                              >
                              > With the above combination, I'm able to move past
                              > the exception and also have the RegionServer
                              > service up and running. However, it now gives the
                              > error below:
                              >
                              >
                              >
                              ********************************************************************************************************************

                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
                              > d1 DECIMAL,
                              > d2 DECIMAL(10, 0),
                              > d3 DECIMAL(20, 10),
                              > d4 DECIMAL(38, 38),
                              > d5 DECIMAL(10, 5))
                              > PARTITIONED BY (d6 DECIMAL(9, 0))
                              > ROW FORMAT delimited fields terminated by ','
                              > STORED AS TEXTFILE
                              > LOCATION '/test-warehouse/decimal_tbl'
                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > USE functional
                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
                              >
                              > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                              > MESSAGE:
                              > Error: null
                              >
                              >
                              ******************************************************************************************************************

                              >
                              > Here is the complete log for the same. (See
                              > attached file: data-load-functional-exhaustive.log)
                              >
                              > It would be great if you could guide me on this
                              > issue, so I can proceed with the fe tests.
                              >
                              > I am still awaiting a link to the source code of
                              > HDFS CDH 5.8.0.
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              >
                              >




Re: Fw: Issues with generating testdata for Impala

Posted by Jim Apple <jb...@cloudera.com>.
Nishidha, if I were in your shoes, I would first turn on core dumps
(ulimit -c unlimited) and then look at the backtrace. After that, gdb
(with watchpoints, as suggested by Tim in
https://issues.cloudera.org/browse/IMPALA-3308) is probably what I would
try next, to find what led to the errant value.
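
A minimal sketch of that first step, assuming a POSIX shell, gdb on the PATH, and a core file already written by the crashed impalad. The helper names and both paths passed to print_backtrace are hypothetical; where the kernel writes core files depends on the local core_pattern setting.

```shell
# Allow core dumps for processes started from this shell.
# Run this before starting the cluster so impalad inherits the limit.
enable_core_dumps() {
  ulimit -c unlimited
}

# print_backtrace <binary> <core-file>
# Prints a backtrace of every thread recorded in the core file.
print_backtrace() {
  binary="$1"
  core="$2"
  if [ ! -f "$binary" ] || [ ! -f "$core" ]; then
    echo "usage: print_backtrace <binary> <core-file>" >&2
    return 1
  fi
  # --batch runs the -ex commands and then exits instead of going interactive.
  gdb --batch -ex "thread apply all bt" "$binary" "$core"
}
```

The frame that reaches the StringVal constructor with a negative length would be the place to start looking; a watchpoint on the length variable in a live gdb session is the follow-up step mentioned above.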

Unfortunately, it is very difficult to debug errors like this remotely,
with person A at the console and person B on the email thread.

On Fri, May 13, 2016 at 5:29 AM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Casey,
>
> As observed from the logs of the testdata loading step on ppc:
> 1. There is an assertion failure: impalad:
> /root/nishidha/Impala/be/src/udf/udf.h:559:
> impala_udf::StringVal::StringVal(uint8_t*, int): Assertion `len >= 0'
> failed.
> 2. Connection Reset:
> I0510 10:45:04.258194 1574 thrift-util.cc:109] TSocket::read() recv()
> <Host: ::ffff:10.77.67.118 Port: 38070>Connection reset by peer
> I0510 10:45:04.258412 1574 thrift-util.cc:109] TThreadedServer client
> died: ECONNRESET
>
> Logs are uploaded here:
> https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/YLX3pKx-MAY
>
>
> Kindly guide me on these issues.
>
> Regards,
> Valencia
>
>
> From: Valencia Serrao/Austin/Contr/IBM
> To: Casey Ching <ca...@cloudera.com>
> Cc: Alex Behm <al...@cloudera.com>, David Clissold/Austin/IBM@IBMUS,
> dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS,
> Sudarshan Jagadale/Austin/Contr/IBM@IBMUS
> Date: 05/10/2016 11:49 AM
> Subject: Re: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
> Hi Casey,
>
> Thank you for the response.
>
> Yes, we tried to set up the x86 environment, but testdata generation
> fails there too. We are looking more deeply into the ppc and x86 logs,
> and I will let you know the findings.
>
> As you suggested, I also ran the data loading step and checked through
> impala-shell whether tpch exists. The tpch database doesn't exist.
> *Command used:*
> [testvm:21000] > describe database tpch;
> *Result: *
> Query: describe database tpch
> ERROR: AnalysisException: Database does not exist: tpch
>
> Could you please share the build or test results/logs so we can verify
> our setup? For example:
> 1. Output of: buildall.sh -noclean -notests -format -testdata
> 2. The cluster_logs
>
> Looking forward to your reply.
>
> Regards,
> Valencia
>
>
>
>
> From: Casey Ching <ca...@cloudera.com>
> To: Alex Behm <al...@cloudera.com>, Valencia
> Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
> Cc: Valencia Serrao/Austin/Contr/IBM@IBMUS,
> dev@impala.incubator.apache.org, David Clissold/Austin/IBM@IBMUS,
> Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS
> Date: 05/09/2016 10:45 PM
> Subject: Re: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>
> Have you tried setting up an x86 environment? That could be useful for
> comparing to the ppc environment to see what is/isn’t working and being
> able to see what the logs should look like.
>
> If the tpch database isn’t there, that should mean data loading failed and
> there should have been an error that caused the data loading to exit early
> along with an error message in the logs. Did you see anything like that?
> You might want to try only running the data loading step, then verifying
> that the tpch database exists afterwards.
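
A minimal sketch of that verification, assuming impala-shell is on the PATH, that it exits with a non-zero status when a query passed with -q fails, and that impalad listens on port 21000 as in the session quoted earlier in this thread. The function name and the IMPALA_SHELL and IMPALAD override variables are hypothetical.

```shell
# database_exists <db>
# Returns 0 if "describe database <db>" succeeds, non-zero otherwise.
# IMPALA_SHELL and IMPALAD are hypothetical overrides used only by this sketch.
database_exists() {
  "${IMPALA_SHELL:-impala-shell}" -i "${IMPALAD:-localhost:21000}" \
    -q "describe database $1" >/dev/null 2>&1
}
```

For example: database_exists tpch || echo "tpch missing: check the data loading logs for the first error"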
>
> Casey
>
> On May 9, 2016 at 5:27:49 AM, Valencia Serrao (vs...@us.ibm.com) wrote:
>
>    Hi Alex/Casey,
>
>       I re-ran the fe tests with the testdata you provided, but the
>       result is the same as that reported in the earlier mail, with most of the
>       failures occurring due to tpch database not existing.
>
>       Steps followed to test are as follows:
>       1. copy the testdata to IMPALA_HOME/testdata/impala-data.
>       2. ./buildall.sh -notests -noclean -format -testdata
>       3. ./bin/run_all_tests.sh
>
>       We also tried the testdata generation on an Ubuntu x86 ppc machine;
>       however, it stops at the same "Invalidate Metadata" step with the exception.
>
>       Any pointers on these issues will be helpful.
>
>       Regards,
>       Valencia
>
>
>       From: Valencia Serrao/Austin/Contr/IBM
>       To: Casey Ching <ca...@cloudera.com>
>       Cc: Alex Behm <al...@cloudera.com>,
>       dev@impala.incubator.apache.org, Nishidha
>       Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
>       Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
>       Valencia Serrao/Austin/Contr/IBM@IBMUS
>       Date: 05/05/2016 06:47 PM
>       Subject: Re: Fw: Issues with generating testdata for Impala
>
>       ------------------------------
>
>
>       Hi Alex/Casey,
>
>       I tried to run the frontend tests with the data provided. Following
>       is the result:
>       Tests run: 545, Failures: 226, Errors: 77, Skipped: 36
>       [attachment "data-load-functional-exhaustive.zip" deleted by
>       Valencia Serrao/Austin/Contr/IBM]
>
>
>       Earlier, the number of "Errors" was 87, so it has now dropped by 10.
>       However, the "Failures" count is still the same. Most of the
>       failures in PlannerTest and AuthorizationTest are related to tpch (e.g.
>       Database doesn't exist: tpch).
>
>       With regard to the directory "impala_data", I've observed that it
>       is not being accessed/used by any script. Are we missing any
>       configuration?
>
>       Kindly guide me on this.
>
>       Regards,
>       Valencia
>
>
>
>
>       From: Valencia Serrao/Austin/Contr/IBM
>       To: Casey Ching <ca...@cloudera.com>
>       Cc: Alex Behm <al...@cloudera.com>,
>       dev@impala.incubator.apache.org, Nishidha
>       Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
>       Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
>       Valencia Serrao/Austin/Contr/IBM@IBMUS
>       Date: 05/05/2016 02:21 PM
>       Subject: Re: Fw: Issues with generating testdata for Impala
>       ------------------------------
>
>
>       Thanks, Casey!
>
>       I will let you know the test status.
>
>
>
>       From: Casey Ching <ca...@cloudera.com>
>       To: Alex Behm <al...@cloudera.com>, Valencia
>       Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
>       Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>       Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
>       Date: 05/05/2016 01:09 PM
>       Subject: Re: Fw: Issues with generating testdata for Impala
>       ------------------------------
>
>
>
>
>       On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vs...@us.ibm.com) wrote:
>       Hi Alex,
>
>                I've placed the individual testdata tars in
>                IMPALA_HOME/testdata/impala-data. I've already executed
>                steps 1 to 10. I have some questions about steps 11 and 12
>                that I want to clarify:
>
>                1) . bin/impala-config.sh
>                2) mkdir -p $IMPALA_HOME/testdata/impala-data
>                3) pushd $IMPALA_HOME/testdata/impala-data
>                4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
>                5) tar -xzf tpch.tar.gz
>                6) rm tpch.tar.gz
>                7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
>                8) tar -xzf tpcds.tar.gz
>                9) rm tpcds.tar.gz
>                10) popd
>
>                11) ./buildall.sh -notests -noclean -format
>                -----Here I've removed the -testdata option.
>                The reason to do this is to clear the previously generated
>                partial schemas.
>             I think the -format option is supposed to clear out any old
>       state. The -testdata flag is probably needed to generate and load the test
>       data.
>
>
>                12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is
>                this step required? Why?
>             That is only for docker. It helps to reduce the image size.
>       You shouldn’t need to do that or any of the other rm commands.
>
>
>                Could you kindly confirm these steps? If there are any
>                corrections, please let me know.
>
>                Regards,
>                Valencia
>
>
>
>
>                From: Valencia Serrao/Austin/Contr/IBM
>                To: Alex Behm <al...@cloudera.com>
>                Cc: Casey Ching <ca...@cloudera.com>,
>                dev@impala.incubator.apache.org, Nishidha
>                Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
>                Jagadale/Austin/Contr/IBM@IBMUS, David
>                Clissold/Austin/IBM@IBMUS
>                Date: 05/04/2016 04:18 PM
>                Subject: Re: Fw: Issues with generating testdata for
>                Impala
>
>
>
>                Hi Alex/Casey
>
>                Thank you for responding and for sharing the testdata. I'm
>                working on using the testdata to run the fe tests.
>
>                Meanwhile, I've posted the logs onto "Impala Dev" google
>                group. Here's the link:
>                https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk
>
>                Regards,
>                Valencia
>
>
>
>                From: Alex Behm <al...@cloudera.com>
>                To: Casey Ching <ca...@cloudera.com>
>                Cc: dev@impala.incubator.apache.org, Sudarshan
>                Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>                Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
>                Serrao/Austin/Contr/IBM@IBMUS
>                Date: 05/04/2016 12:52 PM
>                Subject: Re: Fw: Issues with generating testdata for Impala
>
>
>
>                Ahh, thanks Casey. Did not know about that.
>
>                Valencia, Impala's data loading expects the files to be
>                placed in IMPALA_HOME/testdata/impala-data
>
>                On Tue, May 3, 2016 at 11:21 PM, Casey Ching
>                <ca...@cloudera.com> wrote:
>
>                   Comment inline below
>
>                   On May 3, 2016 at 11:18:06 PM, Alex Behm
>                   (al...@cloudera.com) wrote:
>                   Hi Valencia,
>
>                               I'm sorry you are having so much trouble
>                               with our setup. Let's see what we
>                               can do.
>
>                               There was an infra issue with receiving the
>                               logs you sent me. The
>                               email/attachment got rejected on our side.
>                               Maybe you can upload the logs
>                               somewhere so I can grab them?
>
>                               See more responses inline below.
>
>                               On Sat, Apr 30, 2016 at 5:01 AM, Valencia
>                               Serrao <vs...@us.ibm.com> wrote:
>
>                               > Hi Alex,
>                               >
>                               > I went deeper through the logs. I have some
>                               > findings and queries:
>                               >
>                               > 1. At the "Invalidating Metadata" step (as
>                               > mentioned in the mail below), I noticed that
>                               > it is trying to use Kerberos. Perhaps this is
>                               > preventing the testdata generation from
>                               > proceeding, as we are not using Kerberos.
>                               > How can this be done without involving
>                               > Kerberos support?
>                               >
>                               Kerberos is certainly not needed to build
>                               and run tests.
>
>                               >
>                               > 2. I ran the fe tests despite the incomplete
>                               > testdata generation. The tests started and, as
>                               > expected, failed. Many of the failures (null
>                               > pointer exceptions in AuthorizationTest) have a
>                               > common cause: "tpch database does not exist",
>                               > e.g. as shown in
>                               > .Impala/cluster_logs/query_tests/test-run-workload.log.
>                               >
>                               > Does the "tpch" database get created after the
>                               > current blocker step "Invalidating Metadata"?
>                               >
>
>                               Yes, the TPCH database is created and
>                               loaded as part of that first phase.
>                               However, the data files are not yet
>                               publicly accessible. Let me work on
>                               that from my side, and get back to you
>                               soon. One way or the other we'll be
>                               able to provide you with the data.
>
>                   The data is at
>                   https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp .
>                   The files are split into 50 MB pieces for git. You can put them back
>                   together as is done in
>                   https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
>
>                               >
>                               > 3. The fe test console output log shows another error:
>                               > ============================= test session starts ==============================
>                               > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
>                               > rootdir: /work/, inifile:
>                               > plugins: random, xdist
>                               > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>                               >
>                               > These are not present/created on my VM. May I
>                               > know when they get created?
>                               >
>                               > 4. Could you also share the total number of fe
>                               > tests?
>                               >
>
>                               I'll privately send you the console output
>                               from a successful FE run.
>                               Hopefully that can help.
>
>                               Cheers,
>
>                               Alex
>
>                               >
>                               >
>                               > Looking forward to your reply.
>                               >
>                               > Regards,
>                               > Valencia
>                               >
>                               >
>                               >
>                               > From: Valencia Serrao/Austin/Contr/IBM
>                               > To: dev@impala.incubator.apache.org, Alex
>                               Behm <al...@cloudera.com>
>                               > Cc: Sudarshan
>                               Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>                               > Panpaliya/Austin/Contr/IBM@IBMUS,
>                               Valencia Serrao/Austin/Contr/IBM@IBMUS
>                               > Date: 04/30/2016 09:05 AM
>                               > Subject: Fw: Issues with generating
>                               testdata for Impala
>                               > ------------------------------
>                               >
>                               >
>                               >
>                               > Hi Alex,
>                               >
>                               > I've been able to make some progress on
>                               testdata generation, however, i
>                               > still face the following issues:
>                               >
>                               >
>                               >
>                               *******************************************************************************************************************************************************************
>                               > Invalidating Metadata
>                               >
>                               >
>                               (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
>                               > INSERT OVERWRITE TABLE
>                               functional_parquet.alltypes partition (year, month)
>                               > SELECT id, bool_col, tinyint_col,
>                               smallint_col, int_col, bigint_col,
>                               > float_col, double_col, date_string_col,
>                               string_col, timestamp_col, year,
>                               > month
>                               > FROM functional.alltypes
>                               >
>                               > Data Loading from Impala failed with
>                               error: ImpalaBeeswaxException:
>                               > INNER EXCEPTION: <class 'socket.error'>
>                               > MESSAGE: [Errno 104] Connection reset by
>                               peer
>                               > Error in
>                               /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
>                               > 41: while [ -n "$*" ]
>                               > Error in
>                               /root/nishidha/Impala/buildall.sh at line 368:
>                               >
>                               ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
>                               > <<< Y
>                               >
>                               >
>                               *************************************************************************************************************************************************************************
>                               >
>                               > i continued with fe tests as is. Here is
>                               the complete output log.
>                               > [attachment "fe_test_output.zip" deleted
>                               by Valencia
>                               > Serrao/Austin/Contr/IBM]
>                               >
>                               > Cluster logs: [attachment
>                               "cluster_logs.7z" deleted by Valencia
>                               > Serrao/Austin/Contr/IBM]
>                               >
>                               > Kindly guide me on the same.
>                               >
>                               > Regards,
>                               > Valencia
>                               > ----- Forwarded by Valencia
>                               Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
>                               > -----
>                               >
>                               > From: Sudarshan Jagadale/Austin/Contr/IBM
>                               > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>                               > Date: 04/29/2016 10:49 AM
>                               > Subject: Fw: Issues with generating
>                               testdata for Impala
>                               > ------------------------------
>                               >
>                               >
>                               > FYI
>                               > Thanks and Regards
>                               > Sudarshan Jagadale
>                               > Power Open Source Solutions
>                               > ----- Forwarded by Sudarshan
>                               Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
>                               > AM -----
>                               >
>                               > From: Alex Behm <al...@cloudera.com>
>                               > To: dev@impala.incubator.apache.org
>                               > Cc: Sudarshan
>                               Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>                               > Panpaliya/Austin/Contr/IBM@IBMUS
>                               > Date: 04/28/2016 09:34 PM
>                               > Subject: Re: Issues with generating
>                               testdata for Impala
>                               > ------------------------------
>                               >
>                               >
>                               >
>                               > Hi Valencia,
>                               >
>                               > sorry I did not get the attachment. Would
>                               you be able to tar.gz and attach
>                               > the whole cluster_logs directory?
>                               >
>                               > Alex
>                               >
>                               > On Thu, Apr 28, 2016 at 6:23 AM, Valencia
>                               Serrao <vs...@us.ibm.com> wrote:
>                               >
>                               > Hi Alex,
>                               >
>                               > I tried building impala again with the
>                               following:
>                               > HDFS CDH 5.7.0 (
>                               > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
>                               > )
>                               > HBASE CDH 5.7.0 SNAPSHOT (
>                               > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
>                               > )
>                               > - this required to patch in a fix (
>                               > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
>                               > )
>                               > HIVE CDH 5.8.0 SNAPSHOT
>                               >
>                               > With the above combination, i'm able to
>                               move past the exception and
>                               > also have the RegionServer service up and
>                               running. However, it now gives
>                               > error as below:
>                               >
>                               >
>                               >
>                               ********************************************************************************************************************
>                               >
>                               (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                               > CREATE EXTERNAL TABLE IF NOT EXISTS
>                               functional.decimal_tbl (
>                               > d1 DECIMAL,
>                               > d2 DECIMAL(10, 0),
>                               > d3 DECIMAL(20, 10),
>                               > d4 DECIMAL(38, 38),
>                               > d5 DECIMAL(10, 5))
>                               > PARTITIONED BY (d6 DECIMAL(9, 0))
>                               > ROW FORMAT delimited fields terminated by
>                               ','
>                               > STORED AS TEXTFILE
>                               > LOCATION '/test-warehouse/decimal_tbl'
>                               >
>                               >
>                               (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                               > USE functional
>                               >
>                               >
>                               (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                               > ALTER TABLE decimal_tbl ADD IF NOT EXISTS
>                               PARTITION(d6=1)
>                               >
>                               > Data Loading from Impala failed with
>                               error: ImpalaBeeswaxException:
>                               > INNER EXCEPTION: <class
>                               >
>                               'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
>                               > MESSAGE:
>                               > Error: null
>                               >
>                               >
>                               ******************************************************************************************************************
>                               >
>                               > Here is the complete log for the same.
>                               *(See attached file:
>                               > data-load-functional-exhaustive.log)*
>                               >
>                               > It would great if you could guide me on
>                               this issue, so i could proceed
>                               > with the fe tests.
>                               >
>                               > Still awaiting link to the source code of
>                               HDFS CDH 5.8.0
>                               >
>                               > Regards,
>                               > Valencia
>                               >
>                               >
>                               >
>                               >
>
>
>
>
>
>

Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Casey,

As observed from the logs of testdata loading step on ppc:
1. There is an assertion failure at:
impalad: /root/nishidha/Impala/be/src/udf/udf.h:559:
impala_udf::StringVal::StringVal(uint8_t*, int): Assertion `len >= 0'
failed.
2. Connection Reset:
I0510 10:45:04.258194  1574 thrift-util.cc:109] TSocket::read() recv()
<Host: ::ffff:10.77.67.118 Port: 38070>Connection reset by peer
I0510 10:45:04.258412  1574 thrift-util.cc:109] TThreadedServer client
died: ECONNRESET
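
A minimal sketch, in Python, of how one might scan an impalad log for the two symptoms listed above. The function name scan_log and both regular expressions are illustrative assumptions based only on the lines quoted here; they are not part of Impala's tooling, and the sketch assumes each log message sits on a single line in the real log file.

```python
import re

# Patterns modelled on the two symptoms quoted in this message.
# Both are assumptions derived from those lines, not Impala-provided tooling.
ASSERTION_RE = re.compile(r"Assertion `(?P<expr>[^']*)' failed")
RESET_RE = re.compile(r"Connection reset by peer|ECONNRESET")


def scan_log(lines):
    """Return (assertions, resets) for an iterable of log lines.

    assertions: list of (line_number, failed_expression) tuples
    resets:     list of line numbers mentioning a connection reset
    """
    assertions = []
    resets = []
    for number, line in enumerate(lines, start=1):
        match = ASSERTION_RE.search(line)
        if match:
            assertions.append((number, match.group("expr")))
        if RESET_RE.search(line):
            resets.append(number)
    return assertions, resets
```

Passing an open log file to scan_log and comparing the line numbers shows whether the assertion precedes the connection resets, which would suggest the impalad crash is the cause and the reset only a consequence.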

Logs are uploaded here:
https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/YLX3pKx-MAY


Kindly guide me on these issues.

Regards,
Valencia



From:	Valencia Serrao/Austin/Contr/IBM
To:	Casey Ching <ca...@cloudera.com>
Cc:	Alex Behm <al...@cloudera.com>, David
            Clissold/Austin/IBM@IBMUS, dev@impala.incubator.apache.org,
            Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS
Date:	05/10/2016 11:49 AM
Subject:	Re: Fw: Issues with generating testdata for Impala


Hi Casey,

Thank you for the response.

Yes, we tried to set up the x86 environment, but testdata generation
fails there as well. We are looking more deeply into the ppc and x86
logs, and I will let you know the findings.

As you suggested, I also ran the data loading step and checked through
impala-shell whether tpch exists. The tpch database doesn't exist.
Command used:
[testvm:21000] > describe database tpch;
Result:
Query: describe database tpch
ERROR: AnalysisException: Database does not exist: tpch
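
The same check can be scripted. Below is a hedged Python 3 sketch that lists databases through impala-shell and looks for tpch. It assumes impala-shell is on the PATH and accepts the -i (impalad address), -B (delimited output) and -q (query) options, and that each output row starts with the database name followed by a tab. The helper names and the default address are hypothetical.

```python
import subprocess


def parse_database_names(shell_output):
    """Extract database names from tab-delimited SHOW DATABASES output."""
    names = []
    for line in shell_output.splitlines():
        line = line.strip()
        if line:
            # Assumption: the name is the first tab-separated column.
            names.append(line.split("\t")[0])
    return names


def database_exists(name, impalad="localhost:21000"):
    """Run SHOW DATABASES via impala-shell and report whether `name` is listed.

    The default impalad address is a placeholder; pass your own host:port.
    """
    result = subprocess.run(
        ["impala-shell", "-i", impalad, "-B", "-q", "show databases"],
        capture_output=True,
        text=True,
        check=True,
    )
    return name in parse_database_names(result.stdout)
```

Calling database_exists("tpch", "testvm:21000") right after the data loading step gives a yes/no answer without opening an interactive shell.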

Could you please share the build or test results/logs so we can verify our
setup? For example:
1. Output of:  buildall.sh -noclean -notests -format -testdata
2. The cluster_logs

Looking forward to your reply.

Regards,
Valencia





From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Valencia Serrao/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org, David
            Clissold/Austin/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS
Date:	05/09/2016 10:45 PM
Subject:	Re: Fw: Issues with generating testdata for Impala



Hi Valencia,

Have you tried setting up an x86 environment? That could be useful for
comparing to the ppc environment to see what is/isn’t working and being
able to see what the logs should look like.

If the tpch database isn’t there, that should mean data loading failed and
there should have been an error that caused the data loading to exit early
along with an error message in the logs. Did you see anything like that?
You might want to try only running the data loading step, then verifying
that the tpch database exists afterwards.

Casey


On May 9, 2016 at 5:27:49 AM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex/Casey,

      I re-ran the fe tests with the testdata you provided, but the result
      is the same as that reported in the earlier mail, with most of the
      failures occurring due to tpch database not existing.

      Steps followed to test are as follows:
      1. copy the testdata to IMPALA_HOME/testdata/impala-data.
      2. ./buildall.sh -notests -noclean -format -testdata
      3. ./bin/run_all_tests.sh

      We also tried the testdata generation on an Ubuntu x86 ppc machine;
      however, it stops at the same "Invalidate Metadata" step with the
      exception.

      Any pointers on these issues will be helpful.

      Regards,
      Valencia


      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <ca...@cloudera.com>
      Cc: Alex Behm <al...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 06:47 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey,

      I tried to run the frontend tests with the data provided. Following
      is the result:
      Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 [attachment
      "data-load-functional-exhaustive.zip" deleted by Valencia
      Serrao/Austin/Contr/IBM]


      Earlier, the number of "Errors" was 87, so it has now dropped by
      10. However, the "Failures" count is still the same. Most of the
      Failures in PlannerTest and AuthorizationTest are related to tpch
      (e.g. Database doesn't exist: tpch).
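
For comparing runs, a small Python sketch that parses a summary line in the format quoted above and derives the number of passing tests. The names are illustrative, and the "passed" figure assumes the "Tests run" count includes failed, errored and skipped tests, as in Maven Surefire summaries.

```python
import re

SUMMARY_RE = re.compile(
    r"Tests run: (\d+), Failures: (\d+), Errors: (\d+), Skipped: (\d+)"
)


def parse_summary(text):
    """Parse a 'Tests run: ..., Failures: ..., Errors: ..., Skipped: ...' line."""
    match = SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no test summary found")
    run, failures, errors, skipped = (int(group) for group in match.groups())
    return {
        "run": run,
        "failures": failures,
        "errors": errors,
        "skipped": skipped,
        # Assumption: 'run' counts failed, errored and skipped tests too.
        "passed": run - failures - errors - skipped,
    }
```

Applied to the line above, this yields 77 errors (10 fewer than the earlier 87) and, under the stated assumption, 206 passing tests out of 545.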

      With regard to the directory "impala_data", I've observed that it is
      not being accessed/used by any script. Are we missing any
      configuration?

      Kindly guide me on this.

      Regards,
      Valencia




      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <ca...@cloudera.com>
      Cc: Alex Behm <al...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 02:21 PM
      Subject: Re: Fw: Issues with generating testdata for Impala


      Thanks, Casey!

      I will let you know the test status.



      From: Casey Ching <ca...@cloudera.com>
      To: Alex Behm <al...@cloudera.com>, Valencia
      Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Date: 05/05/2016 01:09 PM
      Subject: Re: Fw: Issues with generating testdata for Impala







      On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com)
      wrote:


              Hi Alex,

              I've placed the individual testdata tars at
              IMPALA_HOME/testdata/impala-data. I've already executed
              steps 1 to 10. I'd like to clarify some questions about
              steps 11 and 12:

              1) . bin/impala-config.sh
              2) mkdir -p $IMPALA_HOME/testdata/impala-data
              3) pushd $IMPALA_HOME/testdata/impala-data
              4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
              5) tar -xzf tpch.tar.gz
              6) rm tpch.tar.gz
              7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
              8) tar -xzf tpcds.tar.gz
              9) rm tpcds.tar.gz
              10) popd

              11) ./buildall.sh -notests -noclean -format
              -----Here I've removed the -testdata option.
              The reason to do this is to clear the previously generated
              partial schemas.
      I think the -format option is supposed to clear out any old state.
      The -testdata flag is probably needed to generate and load the test
      data.


              12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is
              this step required? Why?
      That is only for docker. It helps to reduce the image size. You
      shouldn’t need to do that or any of the other rm commands.
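
For reference, steps 4 and 5 above (concatenate the split pieces, then unpack) can be sketched in Python as follows. This is only an illustration under the assumption that the pieces are plain byte-wise splits of one .tar.gz file, as the cat command implies; the function names are hypothetical.

```python
import shutil
import tarfile


def reassemble(part_paths, archive_path):
    """Concatenate the pieces, in the order given, into one archive file."""
    with open(archive_path, "wb") as archive:
        for part in part_paths:
            with open(part, "rb") as piece:
                shutil.copyfileobj(piece, archive)
    return archive_path


def extract(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive into dest_dir (like tar -xzf)."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
```

For the tpch pieces, the part list would be ["/tmp/tpch.tar.gz%d" % i for i in range(7)], matching the {0..6} range in step 4. Order matters: pieces concatenated out of order produce a corrupt archive.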


              Could you kindly confirm these steps? If there are any
              corrections, please let me know.

              Regards,
              Valencia




              From: Valencia Serrao/Austin/Contr/IBM
              To: Alex Behm <al...@cloudera.com>
              Cc: Casey Ching <ca...@cloudera.com>,
              dev@impala.incubator.apache.org, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, David
              Clissold/Austin/IBM@IBMUS
              Date: 05/04/2016 04:18 PM
              Subject: Re: Fw: Issues with generating testdata for Impala




              Hi Alex/Casey

              Thank you for responding and for sharing the testdata. I'm
              working on using the testdata to run the fe tests.

              Meanwhile, I've posted the logs onto "Impala Dev" google
              group. Here's the link:
              https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


              Regards,
              Valencia



              From: Alex Behm <al...@cloudera.com>
              To: Casey Ching <ca...@cloudera.com>
              Cc: dev@impala.incubator.apache.org, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
              Serrao/Austin/Contr/IBM@IBMUS
              Date: 05/04/2016 12:52 PM
              Subject: Re: Fw: Issues with generating testdata for Impala



              Ahh, thanks Casey. Did not know about that.

              Valencia, Impala's data loading expects the files to be
              placed in IMPALA_HOME/testdata/impala-data

              On Tue, May 3, 2016 at 11:21 PM, Casey Ching <
              casey@cloudera.com> wrote:


                  Comment inline below


                  On May 3, 2016 at 11:18:06 PM, Alex Behm (
                  alex.behm@cloudera.com) wrote:


                              Hi Valencia,

                              I'm sorry you are having so much trouble with
                              our setup. Let's see what we
                              can do.

                              There was an infra issue with receiving the
                              logs you sent me. The
                              email/attachment got rejected on our side.
                              Maybe you can upload the logs
                              somewhere so I can grab them?

                              See more responses inline below.

                              On Sat, Apr 30, 2016 at 5:01 AM, Valencia
                              Serrao <vs...@us.ibm.com> wrote:

                              > Hi Alex,
                              >
                              > I was going deeper through the logs. I
                              have some findings and queries:
                              >
                              > 1. At the "Invalidating Metadata" step (as
                              mentioned in below mail), i
                              > noticed that, it is trying to use kerberos.
                              Perhaps, this is preventing the
                              > testdata generation from proceeding, as we
                              are not using Kerberos.
                              > I need to know how this can be done without
                              involving Kerberos support ?
                              >
                              Kerberos is certainly not needed to build and
                              run tests.

                              >
                              > 2. I had executed the fe tests despite the
                              incomplete testdata generation,
                              > the tests started and surely have failed.
                              Many of these (null pointer
                              > exception in AuthorizationTests) have a
                              common cause: "tpch database does
                              > not exist."
                              > e.g. as shown
                              in .Impala/cluster_logs/query_tests/test-run-workload.log.

                              >
                              > Does the "tpch" database get created after
                              the current blocker step
                              > "Invalidating Metadata" ?
                              >

                              Yes, the TPCH database is created and loaded
                              as part of that first phase.
                              However, the data files are not yet publicly
                              accessible. Let me work on
                              that from my side, and get back to you soon.
                              One way or the other we'll be
                              able to provide you with the data.

                  The data is at
                  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
                   . The files are split into 50 MB pieces for git. You can
                  put them back together as is done in
                  https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile


                              >
                              > 3. In the fe test console output log,
                              another error shown:
                              > ============================= test session
                              starts
                              > ==============================
                              > platform linux2 -- Python 2.7.5 --
                              py-1.4.30 -- pytest-2.7.2
                              > rootdir: /work/, inifile:
                              > plugins: random, xdist
                              > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/

                              >
                              > These are not present/created on my vm. May
                              i know when these get created ?
                              >
                              > 4. Could you also share the total number of
                              fe tests ?
                              >

                              I'll privately send you the console output
                              from a successful FE run.
                              Hopefully that can help.

                              Cheers,

                              Alex

                              >
                              >
                              > Looking forward to your reply.
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              >
                              > From: Valencia Serrao/Austin/Contr/IBM
                              > To: dev@impala.incubator.apache.org, Alex
                              Behm <al...@cloudera.com>
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
                              Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/30/2016 09:05 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Alex,
                              >
                              > I've been able to make some progress on
                              testdata generation, however, i
                              > still face the following issues:
                              >
                              >
                              >
                              *******************************************************************************************************************************************************************

                              > Invalidating Metadata
                              >
                              >
                              (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

                              > INSERT OVERWRITE TABLE
                              functional_parquet.alltypes partition (year,
                              month)
                              > SELECT id, bool_col, tinyint_col,
                              smallint_col, int_col, bigint_col,
                              > float_col, double_col, date_string_col,
                              string_col, timestamp_col, year,
                              > month
                              > FROM functional.alltypes
                              >
                              > Data Loading from Impala failed with error:
                              ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class 'socket.error'>
                              > MESSAGE: [Errno 104] Connection reset by
                              peer
                              > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh
                              > at line 41: while [ -n "$*" ]
                              > Error in /root/nishidha/Impala/buildall.sh at line 368:
                              > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
                              > <<< Y
                              >
                              >
                              *************************************************************************************************************************************************************************

                              >
                              > i continued with fe tests as is. Here is
                              the complete output log.
                              > [attachment "fe_test_output.zip" deleted by
                              Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Cluster logs: [attachment "cluster_logs.7z"
                              deleted by Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Kindly guide me on the same.
                              >
                              > Regards,
                              > Valencia
                              > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
                              > -----
                              >
                              > From: Sudarshan Jagadale/Austin/Contr/IBM
                              > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/29/2016 10:49 AM
                              > Subject: Fw: Issues with generating testdata for Impala
                              > ------------------------------
                              >
                              >
                              > FYI
                              > Thanks and Regards
                              > Sudarshan Jagadale
                              > Power Open Source Solutions
                              > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
                              > AM -----
                              >
                              > From: Alex Behm <al...@cloudera.com>
                              > To: dev@impala.incubator.apache.org
                              > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS
                              > Date: 04/28/2016 09:34 PM
                              > Subject: Re: Issues with generating testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Valencia,
                              >
                              > sorry I did not get the attachment. Would you be able to tar.gz and attach
                              > the whole cluster_logs directory?
                              >
                              > Alex
                              >
                              > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*vserrao@us.ibm.com*
                              > <vs...@us.ibm.com>> wrote:
                              >
                              > Hi Alex,
                              >
                              > I tried building impala again with the following:
                              > HDFS CDH 5.7.0 (
                              > <http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3>
                              > )
                              > HBASE CDH 5.7.0 SNAPSHOT (
                              > <http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz> )
                              > - this required to patch in a fix (
                              > <https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch>
                              > )
                              > HIVE CDH 5.8.0 SNAPSHOT
                              >
                              > With the above combination, i'm able to move past the exception and
                              > also have the RegionServer service up and running. However, it now gives
                              > error as below:
                              >
                              >
                              > ********************************************************************************************************************
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
                              > d1 DECIMAL,
                              > d2 DECIMAL(10, 0),
                              > d3 DECIMAL(20, 10),
                              > d4 DECIMAL(38, 38),
                              > d5 DECIMAL(10, 5))
                              > PARTITIONED BY (d6 DECIMAL(9, 0))
                              > ROW FORMAT delimited fields terminated by ','
                              > STORED AS TEXTFILE
                              > LOCATION '/test-warehouse/decimal_tbl'
                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > USE functional
                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
                              >
                              > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class
                              > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                              > MESSAGE:
                              > Error: null
                              >
                              > ******************************************************************************************************************
                              >
                              > Here is the complete log for the same. *(See attached file:
                              > data-load-functional-exhaustive.log)*
                              >
                              > It would be great if you could guide me on this issue, so i could proceed
                              > with the fe tests.
                              >
                              > Still awaiting link to the source code of HDFS CDH 5.8.0
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              >
                              >




Re: Fw: Issues with generating testdata for Impala

Posted by Casey Ching <ca...@cloudera.com>.
Hi Valencia,

Have you tried setting up an x86 environment? That could be useful for comparing to the ppc environment to see what is/isn’t working and being able to see what the logs should look like.

If the tpch database isn’t there, that should mean data loading failed and there should have been an error that caused the data loading to exit early along with an error message in the logs. Did you see anything like that? You might want to try only running the data loading step, then verifying that the tpch database exists afterwards.
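A minimal sketch of that check, for illustration only. The helper name has_database is hypothetical, and the usage line assumes a running impalad plus the development wrapper script at $IMPALA_HOME/bin/impala-shell.sh; adjust it to whatever impala-shell binary is on your path. -B (delimited output) and -q (run a single query) are standard impala-shell options.

```shell
# Hypothetical helper: read one database per line from stdin (name in the
# first whitespace-separated field) and succeed only if the requested
# database name matches one of them exactly.
has_database() {
  awk -v db="$1" '$1 == db { found = 1 } END { exit !found }'
}

# Assumed usage after the data loading step has finished:
#   "$IMPALA_HOME"/bin/impala-shell.sh -B -q 'SHOW DATABASES' | has_database tpch \
#     && echo "tpch present" || echo "tpch missing: check the data load logs"
```

Matching on the first field only means the check still works if the shell prints a comment column next to each database name.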

Casey
On May 9, 2016 at 5:27:49 AM, Valencia Serrao (vserrao@us.ibm.com) wrote:

Hi Alex/Casey,

I re-ran the fe tests with the testdata you provided, but the result is the same as that reported in the earlier mail, with most of the failures occurring due to tpch database not existing.

Steps followed to test are as follows:
1. copy the testdata to IMPALA_HOME/testdata/impala-data.
2. ./buildall.sh -notests -noclean -format -testdata
3. ./bin/run_all_tests.sh

We had also tried the testdata generation on Ubuntu x86 ppc machine however, it stops at the same "Invalidate Metadata" step with the exception.

Any pointers on these issues will be helpful.

Regards,
Valencia


From: Valencia Serrao/Austin/Contr/IBM
To: Casey Ching <ca...@cloudera.com>
Cc: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/05/2016 06:47 PM
Subject: Re: Fw: Issues with generating testdata for Impala



Hi Alex/Casey,

I tried to run the frontend tests with the data provided. Following is the result:
Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 [attachment "data-load-functional-exhaustive.zip" deleted by Valencia Serrao/Austin/Contr/IBM]


Earlier, the number of "Errors" was 87, so it has now dropped by 10. However, the "Failures" count is still the same. Most of the Failures in PlannerTest and AuthorizationTest are related to tpch (e.g. Database doesn't exist: tpch).

With regard to the directory "impala_data", i've observed that it is not being accessed/used by any script. Are we missing any configuration?

Kindly guide me on this.

Regards,
Valencia




From: Valencia Serrao/Austin/Contr/IBM
To: Casey Ching <ca...@cloudera.com>
Cc: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/05/2016 02:21 PM
Subject: Re: Fw: Issues with generating testdata for Impala


Thanks, Casey!

I will let you know the test status.



From: Casey Ching <ca...@cloudera.com>
To: Alex Behm <al...@cloudera.com>, Valencia Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Date: 05/05/2016 01:09 PM
Subject: Re: Fw: Issues with generating testdata for Impala





On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:

Hi Alex,

I've placed the individual testdata tars at IMPALA_HOME/testdata/impala-data. I've already executed steps 1 to 10. I have some queries about steps 11 and 12 that i want to clarify:

1) . bin/impala-config.sh
2) mkdir -p $IMPALA_HOME/testdata/impala-data
3) pushd $IMPALA_HOME/testdata/impala-data
4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
5) tar -xzf tpch.tar.gz
6) rm tpch.tar.gz
7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
8) tar -xzf tpcds.tar.gz
9) rm tpcds.tar.gz
10) popd

11) ./buildall.sh -notests -noclean -format
-----Here I've removed the -testdata option.
The reason to do this is to clear the previously generated partial schemas.
I think the -format option is supposed to clear out any old state. The -testdata flag is probably needed to generate and load the test data.


12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step required? Why?
That is only for docker. It helps to reduce the image size. You shouldn’t need to do that or any of the other rm commands.


Could you kindly confirm these steps? If there are any corrections, please let me know.

Regards,
Valencia




From: Valencia Serrao/Austin/Contr/IBM
To: Alex Behm <al...@cloudera.com>
Cc: Casey Ching <ca...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
Date: 05/04/2016 04:18 PM
Subject: Re: Fw: Issues with generating testdata for Impala


Hi Alex/Casey

Thank you for responding and for sharing the testdata. I'm working on using the testdata to run the fe tests.

Meanwhile, I've posted the logs onto "Impala Dev" google group. Here's the link: https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk

Regards,
Valencia



From: Alex Behm <al...@cloudera.com>
To: Casey Ching <ca...@cloudera.com>
Cc: dev@impala.incubator.apache.org, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/04/2016 12:52 PM
Subject: Re: Fw: Issues with generating testdata for Impala



Ahh, thanks Casey. Did not know about that.

Valencia, Impala's data loading expects the files to be placed in IMPALA_HOME/testdata/impala-data

On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com> wrote:

Comment inline below
On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:

Hi Valencia,

I'm sorry you are having so much trouble with our setup. Let's see what we
can do.

There was an infra issue with receiving the logs you sent me. The
email/attachment got rejected on our side. Maybe you can upload the logs
somewhere so I can grab them?

See more responses inline below.

On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Alex,
>
> I was going deeper through the logs. I have some findings and queries:
>
> 1. At the "Invalidating Metadata" step (as mentioned in below mail), i
> noticed that, it is trying to use kerberos. Perhaps, this is preventing the
> testdata generation from proceeding, as we are not using Kerberos.
> I need to know how this can be done without involving Kerberos support ?
>
Kerberos is certainly not needed to build and run tests.

>
> 2. I had executed the fe tests despite the incomplete testdata generation,
> the tests started and surely have failed. Many of these (null pointer
> exception in AuthorizationTests) have a common cause: "tpch database does
> not exist."
> e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
>
> Does the "tpch" database get created after the current blocker step
> "Invalidating Metadata" ?
>

Yes, the TPCH database is created and loaded as part of that first phase.
However, the data files are not yet publicly accessible. Let me work on
that from my side, and get back to you soon. One way or the other we'll be
able to provide you with the data.

The data is at https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp . The files are split into 50 MB pieces for git. You can put them back together as is done in https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
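The reassembly can be sketched as a small shell function. This is an illustration under stated assumptions, not the contents of the Dockerfile linked above: the function name reassemble_and_extract is hypothetical, and it assumes the pieces are named <prefix>0, <prefix>1, and so on (as in /tmp/tpch.tar.gz0 through /tmp/tpch.tar.gz6) and together form one gzip-compressed tar archive.

```shell
# Hypothetical helper: concatenate numbered pieces <prefix>0 .. <prefix>N in
# numeric order, then unpack the rebuilt archive into a target directory.
reassemble_and_extract() {
  prefix=$1   # e.g. /tmp/tpch.tar.gz
  last=$2     # index of the last piece, e.g. 6
  dest=$3     # e.g. "$IMPALA_HOME/testdata/impala-data"

  mkdir -p "$dest" || return 1
  archive=$(mktemp) || return 1
  i=0
  while [ "$i" -le "$last" ]; do
    # Stop if a piece is missing rather than building a corrupt archive.
    if [ ! -f "${prefix}${i}" ]; then
      echo "missing piece: ${prefix}${i}" >&2
      rm -f "$archive"
      return 1
    fi
    cat "${prefix}${i}" >> "$archive"
    i=$((i + 1))
  done
  tar -xzf "$archive" -C "$dest"
  status=$?
  rm -f "$archive"   # the temporary archive is not needed once unpacked
  return "$status"
}

# Assumed usage, with piece counts taken from the steps quoted in this thread:
#   reassemble_and_extract /tmp/tpch.tar.gz 6 "$IMPALA_HOME/testdata/impala-data"
#   reassemble_and_extract /tmp/tpcds.tar.gz 3 "$IMPALA_HOME/testdata/impala-data"
```

The explicit loop is used instead of a brace expansion such as {0..6} so that a missing piece is reported before tar runs.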

>
> 3. In the fe test console output log, another error shown:
> ============================= test session starts
> ==============================
> platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
> rootdir: /work/, inifile:
> plugins: random, xdist
> ERROR: file not found: /work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>
> These are not present/created on my vm. May i know when these get created ?
>
> 4. Could you also share the total number of fe tests ?
>

I'll privately send you the console output from a successful FE run.
Hopefully that can help.

Cheers,

Alex

>
>
> Looking forward to your reply.
>
> Regards,
> Valencia
>
>
> From: Valencia Serrao/Austin/Contr/IBM
> To: dev@impala.incubator.apache.org, Alex Behm <al...@cloudera.com>
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/30/2016 09:05 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Alex,
>
> I've been able to make some progress on testdata generation, however, i
> still face the following issues:
>
>
> *******************************************************************************************************************************************************************
> Invalidating Metadata
>
> (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
> INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
> SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
> float_col, double_col, date_string_col, string_col, timestamp_col, year,
> month
> FROM functional.alltypes
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class 'socket.error'>
> MESSAGE: [Errno 104] Connection reset by peer
> Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
> 41: while [ -n "$*" ]
> Error in /root/nishidha/Impala/buildall.sh at line 368:
> ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
> <<< Y
>
> *************************************************************************************************************************************************************************
>
> i continued with fe tests as is. Here is the complete output log.
> [attachment "fe_test_output.zip" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Kindly guide me on the same.
>
> Regards,
> Valencia
> ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
> -----
>
> From: Sudarshan Jagadale/Austin/Contr/IBM
> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/29/2016 10:49 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
> FYI
> Thanks and Regards
> Sudarshan Jagadale
> Power Open Source Solutions
> ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
> AM -----
>
> From: Alex Behm <al...@cloudera.com>
> To: dev@impala.incubator.apache.org  
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS
> Date: 04/28/2016 09:34 PM
> Subject: Re: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>
> sorry I did not get the attachment. Would you be able to tar.gz and attach
> the whole cluster_logs directory?
>
> Alex
>
> On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*vserrao@us.ibm.com*
> <vs...@us.ibm.com>> wrote:
>
> Hi Alex,
>
> I tried building impala again with the following:
> HDFS CDH 5.7.0 (
> <http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3>
> )
> HBASE CDH 5.7.0 SNAPSHOT (
> <http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz> )
> - this required to patch in a fix (
> <https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch>
> )
> HIVE CDH 5.8.0 SNAPSHOT
>
> With the above combination, i'm able to move past the exception and
> also have the RegionServer service up and running. However, it now gives
> error as below:
>
>
> ********************************************************************************************************************
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
> d1 DECIMAL,
> d2 DECIMAL(10, 0),
> d3 DECIMAL(20, 10),
> d4 DECIMAL(38, 38),
> d5 DECIMAL(10, 5))
> PARTITIONED BY (d6 DECIMAL(9, 0))
> ROW FORMAT delimited fields terminated by ','
> STORED AS TEXTFILE
> LOCATION '/test-warehouse/decimal_tbl'
>
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> USE functional
>
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class
> 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
> MESSAGE:
> Error: null
>
> ******************************************************************************************************************
>
> Here is the complete log for the same. *(See attached file:
> data-load-functional-exhaustive.log)*
>
> It would be great if you could guide me on this issue, so i could proceed
> with the fe tests.
>
> Still awaiting link to the source code of HDFS CDH 5.8.0
>
> Regards,
> Valencia
>
>
>
>




Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Alex/Casey,

I re-ran the fe tests with the testdata you provided, but the result is the
same as that reported in the earlier mail, with most of  the failures
occurring due to tpch database not existing.

Steps followed to test are as follows:
1. copy the testdata to IMPALA_HOME/testdata/impala-data.
2. ./buildall.sh -notests -noclean -format -testdata
3. ./bin/run_all_tests.sh

We had also tried the testdata generation on Ubuntu x86 ppc machine
however, it stops at the same "Invalidate Metadata" step with the
exception.

Any pointers on these issues will be helpful.

Regards,
Valencia



From:	Valencia Serrao/Austin/Contr/IBM
To:	Casey Ching <ca...@cloudera.com>
Cc:	Alex Behm <al...@cloudera.com>,
            dev@impala.incubator.apache.org, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, David
            Clissold/Austin/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/05/2016 06:47 PM
Subject:	Re: Fw: Issues with generating testdata for Impala


Hi Alex/Casey,

I tried to run the frontend tests with the data provided. Following is the
result:
	Tests run: 545, Failures: 226, Errors: 77, Skipped: 36    [attachment
"data-load-functional-exhaustive.zip" deleted by Valencia
Serrao/Austin/Contr/IBM]


Earlier, the number of "Errors" was 87, so it has now dropped by 10.
However, the "Failures" count is still the same. Most of the Failures in
PlannerTest and AuthorizationTest are related to tpch (e.g. Database
doesn't exist: tpch).

With regard to the directory "impala_data", i've observed that it is not
being accessed/used by any script. Are we missing any configuration?

Kindly guide me on this.

Regards,
Valencia





From:	Valencia Serrao/Austin/Contr/IBM
To:	Casey Ching <ca...@cloudera.com>
Cc:	Alex Behm <al...@cloudera.com>,
            dev@impala.incubator.apache.org, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, David
            Clissold/Austin/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/05/2016 02:21 PM
Subject:	Re: Fw: Issues with generating testdata for Impala


Thanks, Casey!

I will let you know the test status.




From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org
Date:	05/05/2016 01:09 PM
Subject:	Re: Fw: Issues with generating testdata for Impala








On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex,

      I've placed the individual testdata tars at
      IMPALA_HOME/testdata/impala-data. I've already executed steps 1 to 10.
      I have some queries about steps 11 and 12 that i want to clarify:

      1) . bin/impala-config.sh
      2) mkdir -p $IMPALA_HOME/testdata/impala-data
      3) pushd $IMPALA_HOME/testdata/impala-data
      4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
      5) tar -xzf tpch.tar.gz
      6) rm tpch.tar.gz
      7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
      8) tar -xzf tpcds.tar.gz
      9) rm tpcds.tar.gz
      10) popd

      11) ./buildall.sh -notests -noclean -format
      -----Here I've removed the -testdata option.
      The reason to do this is to clear the previously generated partial
      schemas.


I think the -format option is supposed to clear out any old state. The
-testdata flag is probably needed to generate and load the test data.




      12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step
      required? Why?


That is only for docker. It helps to reduce the image size. You shouldn’t
need to do that or any of the other rm commands.




      Could you kindly confirm these steps? If there are any corrections,
      please let me know.

      Regards,
      Valencia




      From: Valencia Serrao/Austin/Contr/IBM
      To: Alex Behm <al...@cloudera.com>
      Cc: Casey Ching <ca...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
      Date: 05/04/2016 04:18 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey

      Thank you for responding and for sharing the testdata. I'm working on
      using the testdata to run the fe tests.

      Meanwhile, I've posted the logs onto "Impala Dev" google group.
      Here's the link:
      https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


      Regards,
      Valencia



      From: Alex Behm <al...@cloudera.com>
      To: Casey Ching <ca...@cloudera.com>
      Cc: dev@impala.incubator.apache.org, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
      Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/04/2016 12:52 PM
      Subject: Re: Fw: Issues with generating testdata for Impala



      Ahh, thanks Casey. Did not know about that.

      Valencia, Impala's data loading expects the files to be placed in
      IMPALA_HOME/testdata/impala-data

      On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com>
      wrote:
          Comment inline below



          On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com)
          wrote:


                  Hi Valencia,

                  I'm sorry you are having so much trouble with our setup. Let's see what we
                  can do.

                  There was an infra issue with receiving the logs you sent me. The
                  email/attachment got rejected on our side. Maybe you can upload the logs
                  somewhere so I can grab them?

                  See more responses inline below.

                  On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vserrao@us.ibm.com> wrote:

                  > Hi Alex,
                  >
                  > I was going deeper through the logs. I have some findings and queries:
                  >
                  > 1. At the "Invalidating Metadata" step (as mentioned in below mail), i
                  > noticed that, it is trying to use kerberos. Perhaps, this is preventing the
                  > testdata generation from proceeding, as we are not using Kerberos.
                  > I need to know how this can be done without involving Kerberos support ?
                  >
                  Kerberos is certainly not needed to build and run tests.

                  >
                  > 2. I had executed the fe tests despite the incomplete testdata generation,
                  > the tests started and surely have failed. Many of these (null pointer
                  > exception in AuthorizationTests) have a common cause: "tpch database does
                  > not exist."
                  > e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
                  >
                  > Does the "tpch" database get created after the current blocker step
                  > "Invalidating Metadata" ?
                  >

                  Yes, the TPCH database is created and loaded as part of that first phase.
                  However, the data files are not yet publicly accessible. Let me work on
                  that from my side, and get back to you soon. One way or the other we'll be
                  able to provide you with the data.

          The data is at
          https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
           . The files are split into 50 MB pieces for git. You can put
          them back together as is done in
          https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile

                  >
                  > 3. In the fe test console output log, another error shown:
                  > ============================= test session starts
                  > ==============================
                  > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
                  > rootdir: /work/, inifile:
                  > plugins: random, xdist
                  > ERROR: file not found: /work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
                  >
                  > These are not present/created on my vm. May i know when these get created ?
                  >
                  > 4. Could you also share the total number of fe tests ?
                  >

                  I'll privately send you the console output from a
                  successful FE run.
                  Hopefully that can help.

                  Cheers,

                  Alex

                  >
                  >
                  > Looking forward to your reply.
                  >
                  > Regards,
                  > Valencia
                  >
                  >
                  > From: Valencia Serrao/Austin/Contr/IBM
                  > To: dev@impala.incubator.apache.org, Alex Behm <alex.behm@cloudera.com>
                  > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                  > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
                  > Date: 04/30/2016 09:05 AM
                  > Subject: Fw: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  >
                  > Hi Alex,
                  >
                  > I've been able to make some progress on testdata
                  generation, however, i
                  > still face the following issues:
                  >
                  >
                  >
                  *******************************************************************************************************************************************************************

                  > Invalidating Metadata
                  >
                  >
                  (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

                  > INSERT OVERWRITE TABLE functional_parquet.alltypes
                  partition (year, month)
                  > SELECT id, bool_col, tinyint_col, smallint_col,
                  int_col, bigint_col,
                  > float_col, double_col, date_string_col, string_col,
                  timestamp_col, year,
                  > month
                  > FROM functional.alltypes
                  >
                  > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                  > INNER EXCEPTION: <class 'socket.error'>
                  > MESSAGE: [Errno 104] Connection reset by peer
                  > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
                  > Error in /root/nishidha/Impala/buildall.sh at line 368:
                  > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
                  >
                  >
                  *************************************************************************************************************************************************************************

                  >
                  > i continued with fe tests as is. Here is the complete
                  output log.
                  > [attachment "fe_test_output.zip" deleted by Valencia
                  > Serrao/Austin/Contr/IBM]
                  >
                  > Cluster logs: [attachment "cluster_logs.7z" deleted by
                  Valencia
                  > Serrao/Austin/Contr/IBM]
                  >
                  > Kindly guide me on the same.
                  >
                  > Regards,
                  > Valencia
                  > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on
                  04/29/2016 10:57 AM
                  > -----
                  >
                  > From: Sudarshan Jagadale/Austin/Contr/IBM
                  > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                  > Date: 04/29/2016 10:49 AM
                  > Subject: Fw: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  > FYI
                  > Thanks and Regards
                  > Sudarshan Jagadale
                  > Power Open Source Solutions
                  > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM
                  on 04/29/2016 10:48
                  > AM -----
                  >
                  > From: Alex Behm <al...@cloudera.com>
                  > To: dev@impala.incubator.apache.org
                  > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                  > Panpaliya/Austin/Contr/IBM@IBMUS
                  > Date: 04/28/2016 09:34 PM
                  > Subject: Re: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  >
                  > Hi Valencia,
                  >
                  > sorry I did not get the attachment. Would you be able
                  to tar.gz and attach
                  > the whole cluster_logs directory?
                  >
                  > Alex
                  >
                  > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*
                  vserrao@us.ibm.com*
                  > <vs...@us.ibm.com>> wrote:
                  >
                  > Hi Alex,
                  >
                  > I tried building impala again with the following:
                  > HDFS CDH 5.7.0 (
                  > *
                  http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3*

                  > <
                  http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
                  >
                  > )
                  > HBASE CDH 5.7.0 SNAPSHOT (
                  > *
                  http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz*

                  > <
                  http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
                  > )
                  > - this required to patch in a fix (
                  > *
                  https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch*

                  > <
                  https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
                  >
                  > )
                  > HIVE CDH 5.8.0 SNAPSHOT
                  >
                  > With the above combination, i'm able to move past the
                  exception and
                  > also have the RegionServer service up and running.
                  However, it now gives
                  > error as below:
                  >
                  >
                  >
                  ********************************************************************************************************************

                  >
                  (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                  > CREATE EXTERNAL TABLE IF NOT EXISTS
                  functional.decimal_tbl (
                  > d1 DECIMAL,
                  > d2 DECIMAL(10, 0),
                  > d3 DECIMAL(20, 10),
                  > d4 DECIMAL(38, 38),
                  > d5 DECIMAL(10, 5))
                  > PARTITIONED BY (d6 DECIMAL(9, 0))
                  > ROW FORMAT delimited fields terminated by ','
                  > STORED AS TEXTFILE
                  > LOCATION '/test-warehouse/decimal_tbl'
                  >
                  >
                  (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                  > USE functional
                  >
                  >
                  (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                  > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION
                  (d6=1)
                  >
                  > Data Loading from Impala failed with error:
                  ImpalaBeeswaxException:
                  > INNER EXCEPTION: <class
                  > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                  > MESSAGE:
                  > Error: null
                  >
                  >
                  ******************************************************************************************************************

                  >
                  > Here is the complete log for the same. *(See attached
                  file:
                  > data-load-functional-exhaustive.log)*
                  >
                  > It would great if you could guide me on this issue, so
                  i could proceed
                  > with the fe tests.
                  >
                  > Still awaiting link to the source code of HDFS CDH
                  5.8.0
                  >
                  > Regards,
                  > Valencia
                  >
                  >
                  >
                  >



Re: Fw: Issues with generating testdata for Impala

Posted by Nishidha Panpaliya <ni...@us.ibm.com>.
Hello,

Today, we tried building Impala on Ubuntu 15.04 x86_64 using Impala's
toolchain. Unfortunately, test data generation failed there as well.

I think we are missing some setup step, which is why we see the issue on
both platforms (x86 as well as ppc). It would be great if you could provide
us with any document with build and test instructions, just to verify our setup.

Anup will be sending the latest logs on x86.

Thanks,
Nishidha



From:	Valencia Serrao/Austin/Contr/IBM
To:	Casey Ching <ca...@cloudera.com>
Cc:	Alex Behm <al...@cloudera.com>,
            dev@impala.incubator.apache.org, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, David
            Clissold/Austin/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/05/2016 06:47 PM
Subject:	Re: Fw: Issues with generating testdata for Impala


Hi Alex/Casey,

I tried to run the frontend tests with the data provided. Following is the
result:
	Tests run: 545, Failures: 226, Errors: 77, Skipped: 36    (See
attached file: data-load-functional-exhaustive.zip)


Earlier, the number of "Errors" was 87, so it has now dropped by 10.
However, the "Failures" count is still the same. Most of the failures in
PlannerTest and AuthorizationTest are related to tpch (e.g. Database
doesn't exist: tpch).

With regard to the directory "impala_data", I've observed that it is not
being accessed/used by any script. Are we missing any configuration?

Kindly guide me on this.

Regards,
Valencia





From:	Valencia Serrao/Austin/Contr/IBM
To:	Casey Ching <ca...@cloudera.com>
Cc:	Alex Behm <al...@cloudera.com>,
            dev@impala.incubator.apache.org, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, David
            Clissold/Austin/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/05/2016 02:21 PM
Subject:	Re: Fw: Issues with generating testdata for Impala


Thanks, Casey!

I will let you know the test status.




From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org
Date:	05/05/2016 01:09 PM
Subject:	Re: Fw: Issues with generating testdata for Impala








On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex,

      I've placed the individual testdata tars in
      IMPALA_HOME/testdata/impala-data. I've already executed steps 1 to 10.
      I have some questions about steps 11 and 12 that I want to clarify:

      1) . bin/impala-config.sh
      2) mkdir -p $IMPALA_HOME/testdata/impala-data
      3) pushd $IMPALA_HOME/testdata/impala-data
      4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
      5) tar -xzf tpch.tar.gz
      6) rm tpch.tar.gz
      7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
      8) tar -xzf tpcds.tar.gz
      9) rm tpcds.tar.gz
      10) popd

      11) ./buildall.sh -notests -noclean -format
      -----Here I've removed the -testdata option.
      The reason to do this is to clear the previously generated partial
      schemas.


I think the -format option is supposed to clear out any old state. The
-testdata flag is probably needed to generate and load the test data.




      12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step
      required? Why?


That is only for docker. It helps to reduce the image size. You shouldn’t
need to do that or any of the other rm commands.
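
Steps 1 to 10 above, without the optional rm commands, could be wrapped in a small helper along these lines. This is only a sketch: the function name reassemble_archive is hypothetical and not part of the Impala tree, and the piece counts (tpch 0..6, tpcds 0..3) and the /tmp location are simply taken from the steps quoted above. It checks that every piece exists before extracting, so a missing piece fails loudly instead of producing a truncated archive.

```shell
# Hypothetical helper (not part of the Impala tree): rebuild an archive from
# numbered pieces <prefix>0 .. <prefix>N, then extract it into <dest>.
# Usage: reassemble_archive <prefix> <last_index> <dest>
reassemble_archive() {
  prefix="$1"; last="$2"; dest="$3"
  mkdir -p "$dest" || return 1
  joined="$dest/$(basename "$prefix")"
  : > "$joined"                      # start from an empty output file
  i=0
  while [ "$i" -le "$last" ]; do
    if [ ! -f "$prefix$i" ]; then
      echo "missing piece: $prefix$i" >&2
      rm -f "$joined"                # do not leave a partial archive behind
      return 1
    fi
    cat "$prefix$i" >> "$joined"     # append pieces in numeric order
    i=$((i + 1))
  done
  tar -xzf "$joined" -C "$dest" || return 1
}

# Example calls, assuming the pieces were downloaded to /tmp and that
# bin/impala-config.sh has been sourced so IMPALA_HOME is set:
#   reassemble_archive /tmp/tpch.tar.gz 6 "$IMPALA_HOME/testdata/impala-data"
#   reassemble_archive /tmp/tpcds.tar.gz 3 "$IMPALA_HOME/testdata/impala-data"
```

The joined tarball is left in place after extraction, since the rm steps only matter for keeping a docker image small.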




      Could you kindly confirm these steps? If there are any corrections,
      please let me know.

      Regards,
      Valencia




      From: Valencia Serrao/Austin/Contr/IBM
      To: Alex Behm <al...@cloudera.com>
      Cc: Casey Ching <ca...@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
      Date: 05/04/2016 04:18 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey

      Thank you for responding and for sharing the testdata. I'm working on
      using the testdata to run the fe tests.

      Meanwhile, I've posted the logs onto "Impala Dev" google group.
      Here's the link:
      https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


      Regards,
      Valencia



      From: Alex Behm <al...@cloudera.com>
      To: Casey Ching <ca...@cloudera.com>
      Cc: dev@impala.incubator.apache.org, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
      Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/04/2016 12:52 PM
      Subject: Re: Fw: Issues with generating testdata for Impala



      Ahh, thanks Casey. Did not know about that.

      Valencia, Impala's data loading expects the files to be placed in
      IMPALA_HOME/testdata/impala-data




Re: Fw: Issues with generating testdata for Impala

Posted by Casey Ching <ca...@cloudera.com>.


On May 6, 2016 at 6:47:37 AM, Anup Halarnkar (anuph@us.ibm.com) wrote:

Hi,

As per Nishidha's request, sharing the log of failure...
(See attached file: data-load-functional-exhaustive.tar.gz)

The final few lines of the log are reproduced below for your reference.

####################################################################################################################
Impala Cluster Running with 3 nodes.
Deleting key: testkey1 from KeyProvider: KMSClientProvider[http://127.0.0.1:16000/kms/v1/]
testkey1 has been successfully deleted.
KMSClientProvider[http://127.0.0.1:16000/kms/v1/] has been updated.
testkey1 has been successfully created with options Options{cipher='AES/CTR/NoPadding', bitLength=128, description='null', attributes=null}.
KMSClientProvider[http://127.0.0.1:16000/kms/v1/] has been updated.
Deleting key: testkey2 from KeyProvider: KMSClientProvider[http://127.0.0.1:16000/kms/v1/]
testkey2 has been successfully deleted.
KMSClientProvider[http://127.0.0.1:16000/kms/v1/] has been updated.
testkey2 has been successfully created with options Options{cipher='AES/CTR/NoPadding', bitLength=128, description='null', attributes=null}.
KMSClientProvider[http://127.0.0.1:16000/kms/v1/] has been updated.
Successfully added cache pool testPool.
LOADING CUSTOM SCHEMAS
Loading workload 'functional-query' Using exploration strategy 'exhaustive'. Logging to /root/deepali/Impala/cluster_logs/data_loading/data-load-functional-exhaustive.log
Error loading data. The end of the log file is:
select * from functional.decimal_tiny

(load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
INSERT OVERWRITE TABLE functional_parquet.widetable_250_cols
select * from functional.widetable_250_cols

(load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
INSERT OVERWRITE TABLE functional_parquet.widetable_500_cols
select * from functional.widetable_500_cols

(load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
INSERT OVERWRITE TABLE functional_parquet.widetable_1000_cols
select * from functional.widetable_1000_cols

Data Loading from Impala failed with error: ImpalaBeeswaxException:
Query aborted:Cancelled due to unreachable impalad(s): hj-ibmibm1150:22002

This is the kind of message you’ll see when Impala crashes. You might be able to find the cause in the logs, but I usually go straight to the core dump with gdb.

As far as instructions go, I think you have everything now.
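
For the core-dump route mentioned above, a minimal sketch might look like the following. Both function names are hypothetical, and so is the assumption that core files land in a known directory with names starting with "core"; the real name and location depend on kernel.core_pattern and on `ulimit -c`. The path to the impalad binary is a placeholder to be replaced with your own build output.

```shell
# Hypothetical helpers, not part of the Impala tree.

# Print the most recently modified file matching core* in a directory.
# Assumption: core files are written there with a name starting with "core".
newest_core() {
  ls -t "$1"/core* 2>/dev/null | head -n 1
}

# Print a backtrace of every thread from a core dump, non-interactively.
# $1 = path to the binary that crashed, $2 = path to the core file.
print_backtrace() {
  gdb -batch -ex "thread apply all bt" "$1" "$2"
}

# Example (paths are placeholders):
#   core="$(newest_core "$IMPALA_HOME")"
#   [ -n "$core" ] && print_backtrace /path/to/impalad "$core"
```

Running gdb in batch mode keeps the backtrace easy to paste into a reply on the list.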



Error in /root/deepali/Impala/testdata/bin/create-load-data.sh at line 145: return 1
Error in ./buildall.sh at line 365: ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
####################################################################################################################


Thanks and Regards,
Anup Halarnkar


From: Nishidha Panpaliya/Austin/Contr/IBM
To: Casey Ching <ca...@cloudera.com>, Alex Behm <al...@cloudera.com>
Cc: dev@impala.incubator.apache.org, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS, Anup Halarnkar/Austin/Contr/IBM@IBMUS
Date: 05/06/2016 06:09 PM
Subject: Re: Fw: Issues with generating testdata for Impala



Hello,

Today, we tried building Impala on Ubuntu 15.04 x86_64 using Impala's toolchain. Unfortunately, test data generation failed there as well.

I think we are missing some setup step, which is why we see the issue on both platforms (x86 as well as ppc). It would be great if you could provide us with any document with build and test instructions, just to verify our setup.

Anup will be sending the latest logs on x86.

Thanks,
Nishidha



From: Valencia Serrao/Austin/Contr/IBM
To: Casey Ching <ca...@cloudera.com>
Cc: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/05/2016 06:47 PM
Subject: Re: Fw: Issues with generating testdata for Impala


Hi Alex/Casey,

I tried to run the frontend tests with the data provided. Following is the result:
Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 (See attached file: data-load-functional-exhaustive.zip)


Earlier, the number of "Errors" were 87 , so now they have reduced by 10. However, the "Failures" count is still the same. Most of the Failures in PlannerTest and AuthorizationTest are related to tpch (e.g. Database doesn't exist: tpch).

With regard to the directory "impala_data", I've observed that it is not being accessed or used by any script. Are we missing any configuration?

Kindly guide me on this.

Regards,
Valencia




From: Valencia Serrao/Austin/Contr/IBM
To: Casey Ching <ca...@cloudera.com>
Cc: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/05/2016 02:21 PM
Subject: Re: Fw: Issues with generating testdata for Impala


Thanks, Casey!

I will let you know the test status.



From: Casey Ching <ca...@cloudera.com>
To: Alex Behm <al...@cloudera.com>, Valencia Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Date: 05/05/2016 01:09 PM
Subject: Re: Fw: Issues with generating testdata for Impala





On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:

Hi Alex,

I've placed the individual testdata tars in IMPALA_HOME/testdata/impala-data. I've already executed steps 1 through 10. I have some questions about steps 11 and 12 that I'd like to clarify:

1) . bin/impala-config.sh
2) mkdir -p $IMPALA_HOME/testdata/impala-data
3) pushd $IMPALA_HOME/testdata/impala-data
4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
5) tar -xzf tpch.tar.gz
6) rm tpch.tar.gz
7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
8) tar -xzf tpcds.tar.gz
9) rm tpcds.tar.gz
10) popd

11) ./buildall.sh -notests -noclean -format
-----Here I've removed the -testdata option.
The reason to do this is to clear the previously generated partial schemas.
I think the -format option is supposed to clear out any old state. The -testdata flag is probably needed to generate and load the test data.


12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step required? Why?
That is only for Docker. It helps to reduce the image size. You shouldn’t need to do that or any of the other rm commands.


Could you kindly confirm these steps? If there are any corrections, please let me know.

Regards,
Valencia




From: Valencia Serrao/Austin/Contr/IBM
To: Alex Behm <al...@cloudera.com>
Cc: Casey Ching <ca...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
Date: 05/04/2016 04:18 PM
Subject: Re: Fw: Issues with generating testdata for Impala


Hi Alex/Casey

Thank you for responding and for sharing the testdata. I'm working on using the testdata to run the fe tests.

Meanwhile, I've posted the logs onto "Impala Dev" google group. Here's the link: https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk

Regards,
Valencia



From: Alex Behm <al...@cloudera.com>
To: Casey Ching <ca...@cloudera.com>
Cc: dev@impala.incubator.apache.org, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/04/2016 12:52 PM
Subject: Re: Fw: Issues with generating testdata for Impala



Ahh, thanks Casey. Did not know about that.

Valencia, Impala's data loading expects the files to be placed in IMPALA_HOME/testdata/impala-data

On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com> wrote:

Comment inline below
On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:

Hi Valencia,

I'm sorry you are having so much trouble with our setup. Let's see what we
can do.

There was an infra issue with receiving the logs you sent me. The
email/attachment got rejected on our side. Maybe you can upload the logs
somewhere so I can grab them?

See more responses inline below.

On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Alex,
>
> I was going deeper through the logs. I have some findings and queries:
>
> 1. At the "Invalidating Metadata" step (as mentioned in the mail below), I
> noticed that it is trying to use Kerberos. Perhaps this is preventing the
> testdata generation from proceeding, as we are not using Kerberos.
> How can this be done without involving Kerberos support?
>
Kerberos is certainly not needed to build and run tests.

>
> 2. I executed the fe tests despite the incomplete testdata generation;
> the tests started and, as expected, failed. Many of these (null pointer
> exceptions in AuthorizationTests) have a common cause: "tpch database does
> not exist."
> e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
>
> Does the "tpch" database get created after the current blocker step
> "Invalidating Metadata"?
>

Yes, the TPCH database is created and loaded as part of that first phase.
However, the data files are not yet publicly accessible. Let me work on
that from my side, and get back to you soon. One way or the other we'll be
able to provide you with the data.

The data is at https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp . The files are split into 50 MB pieces for git. You can put them back together as is done in https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile

>
> 3. In the fe test console output log, another error is shown:
> ============================= test session starts ==============================
> platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
> rootdir: /work/, inifile:
> plugins: random, xdist
> ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>
> These are not present/created on my VM. When do these get created?
>
> 4. Could you also share the total number of fe tests?
>

I'll privately send you the console output from a successful FE run.
Hopefully that can help.

Cheers,

Alex

>
>
> Looking forward to your reply.
>
> Regards,
> Valencia
>
>
> From: Valencia Serrao/Austin/Contr/IBM
> To: dev@impala.incubator.apache.org, Alex Behm <al...@cloudera.com>
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/30/2016 09:05 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Alex,
>
> I've been able to make some progress on testdata generation; however, I
> still face the following issues:
>
>
> *******************************************************************************************************************************************************************
> Invalidating Metadata
>
> (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
> INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
> SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
> float_col, double_col, date_string_col, string_col, timestamp_col, year,
> month
> FROM functional.alltypes
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class 'socket.error'>
> MESSAGE: [Errno 104] Connection reset by peer
> Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
> 41: while [ -n "$*" ]
> Error in /root/nishidha/Impala/buildall.sh at line 368:
> ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
> <<< Y
>
> *************************************************************************************************************************************************************************
>
> I continued with the fe tests as is. Here is the complete output log.
> [attachment "fe_test_output.zip" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Kindly guide me on the same.
>
> Regards,
> Valencia
> ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
> -----
>
> From: Sudarshan Jagadale/Austin/Contr/IBM
> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/29/2016 10:49 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
> FYI
> Thanks and Regards
> Sudarshan Jagadale
> Power Open Source Solutions
> ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
> AM -----
>
> From: Alex Behm <al...@cloudera.com>
> To: dev@impala.incubator.apache.org 
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS
> Date: 04/28/2016 09:34 PM
> Subject: Re: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>
> sorry I did not get the attachment. Would you be able to tar.gz and attach
> the whole cluster_logs directory?
>
> Alex
>
> On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*vserrao@us.ibm.com*
> <vs...@us.ibm.com>> wrote:
>
> Hi Alex,
>
> I tried building impala again with the following:
> HDFS CDH 5.7.0 (
> http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
> )
> HBASE CDH 5.7.0 SNAPSHOT (
> http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz )
> - this required patching in a fix (
> https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
> )
> HIVE CDH 5.8.0 SNAPSHOT
>
> With the above combination, I'm able to move past the exception and
> also have the RegionServer service up and running. However, it now gives
> the error below:
>
>
> ********************************************************************************************************************
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
> d1 DECIMAL,
> d2 DECIMAL(10, 0),
> d3 DECIMAL(20, 10),
> d4 DECIMAL(38, 38),
> d5 DECIMAL(10, 5))
> PARTITIONED BY (d6 DECIMAL(9, 0))
> ROW FORMAT delimited fields terminated by ','
> STORED AS TEXTFILE
> LOCATION '/test-warehouse/decimal_tbl'
>
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> USE functional
>
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class
> 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
> MESSAGE:
> Error: null
>
> ******************************************************************************************************************
>
> Here is the complete log for the same. *(See attached file:
> data-load-functional-exhaustive.log)*
>
> It would be great if you could guide me on this issue, so I can proceed
> with the fe tests.
>
> Still awaiting a link to the source code of HDFS CDH 5.8.0.
>
> Regards,
> Valencia
>
>
>
>





Re: Fw: Issues with generating testdata for Impala

Posted by Casey Ching <ca...@cloudera.com>.


On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:

Hi Alex,

I've placed the individual testdata tars in IMPALA_HOME/testdata/impala-data. I've already executed steps 1 through 10. I have some questions about steps 11 and 12 that I'd like to clarify:

1) . bin/impala-config.sh
2) mkdir -p $IMPALA_HOME/testdata/impala-data
3) pushd $IMPALA_HOME/testdata/impala-data
4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
5) tar -xzf tpch.tar.gz
6) rm tpch.tar.gz
7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
8) tar -xzf tpcds.tar.gz
9) rm tpcds.tar.gz
10) popd

11) ./buildall.sh -notests -noclean -format
-----Here I've removed the -testdata option.
The reason to do this is to clear the previously generated partial schemas.

I think the -format option is supposed to clear out any old state. The -testdata flag is probably needed to generate and load the test data.



12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step required? Why?

That is only for Docker. It helps to reduce the image size. You shouldn’t need to do that or any of the other rm commands.
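
Steps 4 to 9 above (join the numbered pieces in order, then extract the result) can be sketched as a small helper. This is only an illustration, not something taken from the Impala scripts: the function name reassemble and the throwaway demo data are made up here, and the demo assumes GNU split for the -d numeric suffixes. The real pieces are the /tmp/tpch.tar.gz{0..6} and /tmp/tpcds.tar.gz{0..3} files listed in the steps.

```shell
#!/bin/sh
set -eu

# reassemble PREFIX DEST (hypothetical helper)
# Joins PREFIX0, PREFIX1, ... in numeric order and extracts the archive into DEST.
reassemble() {
  prefix=$1
  dest=$2
  joined=$(mktemp)
  i=0
  while [ -f "${prefix}${i}" ]; do
    cat "${prefix}${i}" >> "$joined"
    i=$((i + 1))
  done
  if [ "$i" -eq 0 ]; then
    echo "no pieces found for ${prefix}" >&2
    rm -f "$joined"
    return 1
  fi
  mkdir -p "$dest"
  tar -xzf "$joined" -C "$dest"
  rm -f "$joined"
  echo "joined $i pieces into $dest"
}

# Demo on throwaway data: build a tiny archive, split it, then put it back together.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/tpch"
echo "sample row" > "$work/src/tpch/lineitem.tbl"
tar -czf "$work/tpch.tar.gz" -C "$work/src" tpch
# GNU split: -d gives numeric suffixes, -a 1 keeps them to one digit (tpch.tar.gz0, ...).
split -b 128 -d -a 1 "$work/tpch.tar.gz" "$work/tpch.tar.gz"
rm "$work/tpch.tar.gz"

reassemble "$work/tpch.tar.gz" "$work/out"
cat "$work/out/tpch/lineitem.tbl"
```

For the real data, the equivalent call would be along the lines of reassemble /tmp/tpch.tar.gz "$IMPALA_HOME/testdata/impala-data", run after sourcing bin/impala-config.sh so that IMPALA_HOME is set. The helper never leaves a joined tarball in the destination, which is all the rm steps in the list achieve.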



Could you kindly confirm these steps? If there are any corrections, please let me know.

Regards,
Valencia




From: Valencia Serrao/Austin/Contr/IBM
To: Alex Behm <al...@cloudera.com>
Cc: Casey Ching <ca...@cloudera.com>, dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
Date: 05/04/2016 04:18 PM
Subject: Re: Fw: Issues with generating testdata for Impala



Hi Alex/Casey

Thank you for responding and for sharing the testdata. I'm working on using the testdata to run the fe tests.

Meanwhile, I've posted the logs onto "Impala Dev" google group. Here's the link: https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk

Regards,
Valencia



From: Alex Behm <al...@cloudera.com>
To: Casey Ching <ca...@cloudera.com>
Cc: dev@impala.incubator.apache.org, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 05/04/2016 12:52 PM
Subject: Re: Fw: Issues with generating testdata for Impala



Ahh, thanks Casey. Did not know about that.

Valencia, Impala's data loading expects the files to be placed in IMPALA_HOME/testdata/impala-data

On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com> wrote:
Comment inline below

On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:

Hi Valencia,

I'm sorry you are having so much trouble with our setup. Let's see what we
can do.

There was an infra issue with receiving the logs you sent me. The
email/attachment got rejected on our side. Maybe you can upload the logs
somewhere so I can grab them?

See more responses inline below.

On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Alex,
>
> I was going deeper through the logs. I have some findings and queries:
>
> 1. At the "Invalidating Metadata" step (as mentioned in the mail below), I
> noticed that it is trying to use Kerberos. Perhaps this is preventing the
> testdata generation from proceeding, as we are not using Kerberos.
> How can this be done without involving Kerberos support?
>
Kerberos is certainly not needed to build and run tests.

>
> 2. I executed the fe tests despite the incomplete testdata generation;
> the tests started and, as expected, failed. Many of these (null pointer
> exceptions in AuthorizationTests) have a common cause: "tpch database does
> not exist."
> e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
>
> Does the "tpch" database get created after the current blocker step
> "Invalidating Metadata"?
>

Yes, the TPCH database is created and loaded as part of that first phase.
However, the data files are not yet publicly accessible. Let me work on
that from my side, and get back to you soon. One way or the other we'll be
able to provide you with the data.

The data is at https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp . The files are split into 50 MB pieces for git. You can put them back together as is done in https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile

>
> 3. In the fe test console output log, another error is shown:
> ============================= test session starts ==============================
> platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
> rootdir: /work/, inifile:
> plugins: random, xdist
> ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>
> These are not present/created on my VM. When do these get created?
>
> 4. Could you also share the total number of fe tests?
>

I'll privately send you the console output from a successful FE run.
Hopefully that can help.

Cheers,

Alex

>
>
> Looking forward to your reply.
>
> Regards,
> Valencia
>
>
> From: Valencia Serrao/Austin/Contr/IBM
> To: dev@impala.incubator.apache.org, Alex Behm <al...@cloudera.com>
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/30/2016 09:05 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Alex,
>
> I've been able to make some progress on testdata generation; however, I
> still face the following issues:
>
>
> *******************************************************************************************************************************************************************
> Invalidating Metadata
>
> (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
> INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
> SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
> float_col, double_col, date_string_col, string_col, timestamp_col, year,
> month
> FROM functional.alltypes
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class 'socket.error'>
> MESSAGE: [Errno 104] Connection reset by peer
> Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
> 41: while [ -n "$*" ]
> Error in /root/nishidha/Impala/buildall.sh at line 368:
> ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
> <<< Y
>
> *************************************************************************************************************************************************************************
>
> I continued with the fe tests as is. Here is the complete output log.
> [attachment "fe_test_output.zip" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
> Serrao/Austin/Contr/IBM]
>
> Kindly guide me on the same.
>
> Regards,
> Valencia
> ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
> -----
>
> From: Sudarshan Jagadale/Austin/Contr/IBM
> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 04/29/2016 10:49 AM
> Subject: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
> FYI
> Thanks and Regards
> Sudarshan Jagadale
> Power Open Source Solutions
> ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
> AM -----
>
> From: Alex Behm <al...@cloudera.com>
> To: dev@impala.incubator.apache.org 
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS
> Date: 04/28/2016 09:34 PM
> Subject: Re: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>
> sorry I did not get the attachment. Would you be able to tar.gz and attach
> the whole cluster_logs directory?
>
> Alex
>
> On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*vserrao@us.ibm.com*
> <vs...@us.ibm.com>> wrote:
>
> Hi Alex,
>
> I tried building impala again with the following:
> HDFS CDH 5.7.0 (
> http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
> )
> HBASE CDH 5.7.0 SNAPSHOT (
> http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz )
> - this required patching in a fix (
> https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
> )
> HIVE CDH 5.8.0 SNAPSHOT
>
> With the above combination, I'm able to move past the exception and
> also have the RegionServer service up and running. However, it now gives
> the error below:
>
>
> ********************************************************************************************************************
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
> d1 DECIMAL,
> d2 DECIMAL(10, 0),
> d3 DECIMAL(20, 10),
> d4 DECIMAL(38, 38),
> d5 DECIMAL(10, 5))
> PARTITIONED BY (d6 DECIMAL(9, 0))
> ROW FORMAT delimited fields terminated by ','
> STORED AS TEXTFILE
> LOCATION '/test-warehouse/decimal_tbl'
>
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> USE functional
>
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
>
> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> INNER EXCEPTION: <class
> 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
> MESSAGE:
> Error: null
>
> ******************************************************************************************************************
>
> Here is the complete log for the same. *(See attached file:
> data-load-functional-exhaustive.log)*
>
> It would be great if you could guide me on this issue, so I can proceed
> with the fe tests.
>
> Still awaiting a link to the source code of HDFS CDH 5.8.0.
>
> Regards,
> Valencia
>
>
>
>



Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Alex,

I've placed the individual testdata tars in
IMPALA_HOME/testdata/impala-data. I've already executed steps 1 through 10.
I have some questions about steps 11 and 12 that I'd like to clarify:

1) . bin/impala-config.sh
2) mkdir -p $IMPALA_HOME/testdata/impala-data
3) pushd $IMPALA_HOME/testdata/impala-data
4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
5) tar -xzf tpch.tar.gz
6) rm tpch.tar.gz
7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
8) tar -xzf tpcds.tar.gz
9) rm tpcds.tar.gz
10) popd

11) ./buildall.sh -notests -noclean -format
-----Here I've removed the -testdata option.
The reason to do this is to clear the previously generated partial schemas.

12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step
required? Why?

Could you kindly confirm these steps? If there are any corrections, please
let me know.

Regards,
Valencia





From:	Valencia Serrao/Austin/Contr/IBM
To:	Alex Behm <al...@cloudera.com>
Cc:	Casey Ching <ca...@cloudera.com>,
            dev@impala.incubator.apache.org, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, David
            Clissold/Austin/IBM@IBMUS
Date:	05/04/2016 04:18 PM
Subject:	Re: Fw: Issues with generating testdata for Impala


Hi Alex/Casey

Thank you for responding and for sharing the testdata. I'm working on using
the testdata to run the fe tests.

Meanwhile, I've posted the logs onto "Impala Dev" google group. Here's the
link:
https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk

Regards,
Valencia




From:	Alex Behm <al...@cloudera.com>
To:	Casey Ching <ca...@cloudera.com>
Cc:	dev@impala.incubator.apache.org, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/04/2016 12:52 PM
Subject:	Re: Fw: Issues with generating testdata for Impala



Ahh, thanks Casey. Did not know about that.

Valencia, Impala's data loading expects the files to be placed
in IMPALA_HOME/testdata/impala-data

On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com> wrote:
  Comment inline below




  On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:


        Hi Valencia,

        I'm sorry you are having so much trouble with our setup. Let's see
        what we
        can do.

        There was an infra issue with receiving the logs you sent me. The
        email/attachment got rejected on our side. Maybe you can upload the
        logs
        somewhere so I can grab them?

        See more responses inline below.

        On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <
        vserrao@us.ibm.com> wrote:

        > Hi Alex,
        >
        > I was going deeper through the logs. I have some findings and
        > queries:
        >
        > 1. At the "Invalidating Metadata" step (as mentioned in the mail
        > below), I noticed that it is trying to use Kerberos. Perhaps this is
        > preventing the testdata generation from proceeding, as we are not
        > using Kerberos.
        > How can this be done without involving Kerberos support?
        >
        Kerberos is certainly not needed to build and run tests.

        >
        > 2. I executed the fe tests despite the incomplete testdata
        > generation; the tests started and, as expected, failed. Many of these
        > (null pointer exceptions in AuthorizationTests) have a common cause:
        > "tpch database does not exist."
        > e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
        >
        > Does the "tpch" database get created after the current blocker step
        > "Invalidating Metadata"?
        >

        Yes, the TPCH database is created and loaded as part of that first
        phase.
        However, the data files are not yet publicly accessible. Let me
        work on
        that from my side, and get back to you soon. One way or the other
        we'll be
        able to provide you with the data.


  The data is at
  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
   . The files are split into 50 MB pieces for git. You can put them back
  together as is done in
  https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile


        >
        > 3. In the fe test console output log, another error is shown:
        > ============================= test session starts ==============================
        > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
        > rootdir: /work/, inifile:
        > plugins: random, xdist
        > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
        >
        > These are not present/created on my VM. When do these get created?
        >
        > 4. Could you also share the total number of fe tests?
        >

        I'll privately send you the console output from a successful FE
        run.
        Hopefully that can help.

        Cheers,

        Alex

        >
        >
        > Looking forward to your reply.
        >
        > Regards,
        > Valencia
        >
        >
        > From: Valencia Serrao/Austin/Contr/IBM
        > To: dev@impala.incubator.apache.org, Alex Behm <
        alex.behm@cloudera.com>
        > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
        > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
        Serrao/Austin/Contr/IBM@IBMUS
        > Date: 04/30/2016 09:05 AM
        > Subject: Fw: Issues with generating testdata for Impala
        > ------------------------------
        >
        >
        >
        > Hi Alex,
        >
        > I've been able to make some progress on testdata generation,
        however, i
        > still face the following issues:
        >
        >
        >
        *******************************************************************************************************************************************************************

        > Invalidating Metadata
        >
        >
        (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

        > INSERT OVERWRITE TABLE functional_parquet.alltypes partition
        (year, month)
        > SELECT id, bool_col, tinyint_col, smallint_col, int_col,
        bigint_col,
        > float_col, double_col, date_string_col, string_col,
        timestamp_col, year,
        > month
        > FROM functional.alltypes
        >
        > Data Loading from Impala failed with error: ImpalaBeeswaxException:
        > INNER EXCEPTION: <class 'socket.error'>
        > MESSAGE: [Errno 104] Connection reset by peer
        > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
        > 41: while [ -n "$*" ]
        > Error in /root/nishidha/Impala/buildall.sh at line 368:
        > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
        > <<< Y
        >
        >
        *************************************************************************************************************************************************************************

        >
        > i continued with fe tests as is. Here is the complete output log.

        > [attachment "fe_test_output.zip" deleted by Valencia
        > Serrao/Austin/Contr/IBM]
        >
        > Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
        > Serrao/Austin/Contr/IBM]
        >
        > Kindly guide me on the same.
        >
        > Regards,
        > Valencia
        > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016
        10:57 AM
        > -----
        >
        > From: Sudarshan Jagadale/Austin/Contr/IBM
        > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
        > Date: 04/29/2016 10:49 AM
        > Subject: Fw: Issues with generating testdata for Impala
        > ------------------------------
        >
        >
        > FYI
        > Thanks and Regards
        > Sudarshan Jagadale
        > Power Open Source Solutions
        > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on
        04/29/2016 10:48
        > AM -----
        >
        > From: Alex Behm <al...@cloudera.com>
        > To: dev@impala.incubator.apache.org
        > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
        > Panpaliya/Austin/Contr/IBM@IBMUS
        > Date: 04/28/2016 09:34 PM
        > Subject: Re: Issues with generating testdata for Impala
        > ------------------------------
        >
        >
        >
        > Hi Valencia,
        >
        > sorry I did not get the attachment. Would you be able to tar.gz
        and attach
        > the whole cluster_logs directory?
        >
        > Alex
        >
        > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*
        vserrao@us.ibm.com*
        > <vs...@us.ibm.com>> wrote:
        >
        > Hi Alex,
        >
        > I tried building impala again with the following:
        > HDFS CDH 5.7.0 (
        > *
        http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3*

        > <
        http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
        >
        > )
        > HBASE CDH 5.7.0 SNAPSHOT (
        > *
        http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz*

        > <
        http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
        > )
        > - this required to patch in a fix (
        > *
        https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch*

        > <
        https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
        >
        > )
        > HIVE CDH 5.8.0 SNAPSHOT
        >
        > With the above combination, i'm able to move past the exception
        and
        > also have the RegionServer service up and running. However, it
        now gives
        > error as below:
        >
        >
        >
        ********************************************************************************************************************

        >
        (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

        > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
        > d1 DECIMAL,
        > d2 DECIMAL(10, 0),
        > d3 DECIMAL(20, 10),
        > d4 DECIMAL(38, 38),
        > d5 DECIMAL(10, 5))
        > PARTITIONED BY (d6 DECIMAL(9, 0))
        > ROW FORMAT delimited fields terminated by ','
        > STORED AS TEXTFILE
        > LOCATION '/test-warehouse/decimal_tbl'
        >
        >
        (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

        > USE functional
        >
        >
        (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

        > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
        >
        > Data Loading from Impala failed with error:
        ImpalaBeeswaxException:
        > INNER EXCEPTION: <class
        > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
        > MESSAGE:
        > Error: null
        >
        >
        ******************************************************************************************************************

        >
        > Here is the complete log for the same. *(See attached file:
        > data-load-functional-exhaustive.log)*
        >
        > It would great if you could guide me on this issue, so i could
        proceed
        > with the fe tests.
        >
        > Still awaiting link to the source code of HDFS CDH 5.8.0
        >
        > Regards,
        > Valencia
        >
        >
        >
        >


Re: Fw: Issues with generating testdata for Impala

Posted by Alex Behm <al...@cloudera.com>.
Ahh, thanks Casey. Did not know about that.

Valencia, Impala's data loading expects the files to be placed in
$IMPALA_HOME/testdata/impala-data.
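
For what it's worth, here is a rough sketch of the reassembly step that
Casey describes in the quoted comment below: the split pieces are
concatenated back into one archive, which is then unpacked under
IMPALA_HOME/testdata/impala-data. The piece naming pattern, the assumption
that the result is a tar archive, and the function name are all made up for
illustration; the Dockerfile linked below is the authoritative reference for
what is actually done.

```python
import glob
import os
import shutil
import tarfile


def reassemble_and_extract(pieces_glob, dest_dir):
    """Concatenate split archive pieces and unpack the result into dest_dir.

    pieces_glob is a hypothetical pattern such as '/tmp/tpch.tar.gz.*'.
    Pieces are joined in lexicographic order, which is only correct if the
    suffixes have a fixed width (for example .aa, .ab, ...).
    """
    pieces = sorted(glob.glob(pieces_glob))
    if not pieces:
        raise FileNotFoundError("no pieces match %s" % pieces_glob)
    os.makedirs(dest_dir, exist_ok=True)

    # Join the pieces byte-for-byte into one temporary archive.
    combined = os.path.join(dest_dir, "combined.tar")
    with open(combined, "wb") as out:
        for piece in pieces:
            with open(piece, "rb") as src:
                shutil.copyfileobj(src, out)

    # "r:*" lets tarfile detect the compression (gzip, bzip2, none, ...).
    with tarfile.open(combined, "r:*") as archive:
        archive.extractall(dest_dir)
    os.remove(combined)
    return pieces
```

With IMPALA_HOME set, a call could look like
reassemble_and_extract("/tmp/tpch.tar.gz.*",
os.path.join(os.environ["IMPALA_HOME"], "testdata", "impala-data")).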

On Tue, May 3, 2016 at 11:21 PM, Casey Ching <ca...@cloudera.com> wrote:

> Comment inline below
>
>
> On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:
>
> Hi Valencia,
>
> I'm sorry you are having so much trouble with our setup. Let's see what we
> can do.
>
> There was an infra issue with receiving the logs you sent me. The
> email/attachment got rejected on our side. Maybe you can upload the logs
> somewhere so I can grab them?
>
> See more responses inline below.
>
> On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com>
> wrote:
>
> > Hi Alex,
> >
> > I was going more deeper through the logs. I have some findings and
> queries:
> >
> > 1. At the "Invalidating Metadata" step (as mentioned in below mail), i
> > noticed that, it is trying to use kerberos. Perhaps, this is preventing
> the
> > testdata generation from proceeding, as we are not using Kerberos.
> > I need to know how this can be done without involving Kerberos support ?
> >
> Kerberos is certainly not needed to build and run tests.
>
> >
> > 2. I had executed the fe tests despite the incomplete testdata
> generation,
> > the tests started and surely have failed. Many of these (null pointer
> > exception in AuthorzationTests) have a common cause: "tpch database does
> > not exist."
> > e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
> >
> > Does the "tpch" database gets created after the current blocker step
> > "Invalidating Metadata" ?
> >
>
> Yes, the TPCH database is created and loaded as part of that first phase.
> However, the data files are not yet publicly accessible. Let me work on
> that from my side, and get back to you soon. One way or the other we'll be
> able to provide you with the data.
>
>
> The data is at
> https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp .
> The files are split into 50 MB pieces for git. You can put them back
> together as is done in
> https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
>
>
> >
> > 3. In the fe test console output log, another error shown:
> > ============================= test session starts
> > ==============================
> > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
> > rootdir: /work/, inifile:
> > plugins: random, xdist
> > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
> >
> > These are not present/created on my vm. May i know when these get
> created ?
> >
> > 4. Could you also share the total number of fe tests ?
> >
>
> I'll privately send you the console output from a successful FE run.
> Hopefully that can help.
>
> Cheers,
>
> Alex
>
> >
> >
> > Looking forward to your reply.
> >
> > Regards,
> > Valencia
> >
> >
> >
> > From: Valencia Serrao/Austin/Contr/IBM
> > To: dev@impala.incubator.apache.org, Alex Behm <al...@cloudera.com>
> > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
> > Date: 04/30/2016 09:05 AM
> > Subject: Fw: Issues with generating testdata for Impala
> > ------------------------------
> >
> >
> >
> > Hi Alex,
> >
> > I've been able to make some progress on testdata generation, however, i
> > still face the following issues:
> >
> >
> >
> *******************************************************************************************************************************************************************
>
> > Invalidating Metadata
> >
> >
> (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
>
> > INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year,
> month)
> > SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
> > float_col, double_col, date_string_col, string_col, timestamp_col, year,
> > month
> > FROM functional.alltypes
> >
> > Data Loading from Impala failed with error: ImpalaBeeswaxException:
> > INNER EXCEPTION: <class 'socket.error'>
> > MESSAGE: [Errno 104] Connection reset by peer
> > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
> > 41: while [ -n "$*" ]
> > Error in /root/nishidha/Impala/buildall.sh at line 368:
> > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
> > <<< Y
> >
> >
> *************************************************************************************************************************************************************************
>
> >
> > i continued with fe tests as is. Here is the complete output log.
> > [attachment "fe_test_output.zip" deleted by Valencia
> > Serrao/Austin/Contr/IBM]
> >
> > Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
> > Serrao/Austin/Contr/IBM]
> >
> > Kindly guide me on the same.
> >
> > Regards,
> > Valencia
> > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57
> AM
> > -----
> >
> > From: Sudarshan Jagadale/Austin/Contr/IBM
> > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
> > Date: 04/29/2016 10:49 AM
> > Subject: Fw: Issues with generating testdata for Impala
> > ------------------------------
> >
> >
> > FYI
> > Thanks and Regards
> > Sudarshan Jagadale
> > Power Open Source Solutions
> > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016
> 10:48
> > AM -----
> >
> > From: Alex Behm <al...@cloudera.com>
> > To: dev@impala.incubator.apache.org
> > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> > Panpaliya/Austin/Contr/IBM@IBMUS
> > Date: 04/28/2016 09:34 PM
> > Subject: Re: Issues with generating testdata for Impala
> > ------------------------------
> >
> >
> >
> > Hi Valencia,
> >
> > sorry I did not get the attachment. Would you be able to tar.gz and
> attach
> > the whole cluster_logs directory?
> >
> > Alex
> >
> > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*vserrao@us.ibm.com*
> > <vs...@us.ibm.com>> wrote:
> >
> > Hi Alex,
> >
> > I tried building impala again with the following:
> > HDFS CDH 5.7.0 (
> > *
> http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3*
> > <
> http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3>
>
> > )
> > HBASE CDH 5.7.0 SNAPSHOT (
> > *http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz*
> > <http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz> )
> > - this required to patch in a fix (
> > *
> https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch*
> > <
> https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch>
>
> > )
> > HIVE CDH 5.8.0 SNAPSHOT
> >
> > With the above combination, i'm able to move past the exception and
> > also have the RegionServer service up and running. However, it now gives
> > error as below:
> >
> >
> >
> ********************************************************************************************************************
>
> > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
> > d1 DECIMAL,
> > d2 DECIMAL(10, 0),
> > d3 DECIMAL(20, 10),
> > d4 DECIMAL(38, 38),
> > d5 DECIMAL(10, 5))
> > PARTITIONED BY (d6 DECIMAL(9, 0))
> > ROW FORMAT delimited fields terminated by ','
> > STORED AS TEXTFILE
> > LOCATION '/test-warehouse/decimal_tbl'
> >
> > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> > USE functional
> >
> > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
> > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
> >
> > Data Loading from Impala failed with error: ImpalaBeeswaxException:
> > INNER EXCEPTION: <class
> > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
> > MESSAGE:
> > Error: null
> >
> >
> ******************************************************************************************************************
>
> >
> > Here is the complete log for the same. *(See attached file:
> > data-load-functional-exhaustive.log)*
> >
> > It would great if you could guide me on this issue, so i could proceed
> > with the fe tests.
> >
> > Still awaiting link to the source code of HDFS CDH 5.8.0
> >
> > Regards,
> > Valencia
> >
> >
> >
> >
>
>

Re: Fw: Issues with generating testdata for Impala

Posted by Tim Armstrong <ta...@cloudera.com>.
Hi Valencia,
  I have an update on the TPC-H/TPC-DS test data - I'm looking at
automating that part of data generation. I was able to verify that it is
the unmodified output of the TPC-H/TPC-DS data generator utilities (the
versions we have in native-toolchain). The only change is to move each
generated file into a subdirectory.
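
A minimal sketch of that last step, assuming the generator writes one flat
file per table (TPC-H dbgen uses a .tbl suffix; other generators may use a
different one) and that each subdirectory is named after its table, as the
path testdata/impala-data/tpch/lineitem in the error quoted below suggests.
The function name and arguments are hypothetical, not part of Impala's
scripts.

```python
import os
import shutil


def move_into_table_dirs(generated_dir, dest_dir, suffix=".tbl"):
    """Move each generated <table><suffix> file into dest_dir/<table>/.

    Returns the sorted list of table names that were moved. Files without
    the given suffix are left untouched.
    """
    moved = []
    for name in sorted(os.listdir(generated_dir)):
        if not name.endswith(suffix):
            continue
        table = name[: -len(suffix)]
        table_dir = os.path.join(dest_dir, table)
        os.makedirs(table_dir, exist_ok=True)
        shutil.move(os.path.join(generated_dir, name),
                    os.path.join(table_dir, name))
        moved.append(table)
    return moved
```

For example, move_into_table_dirs("/tmp/dbgen-out", "testdata/impala-data/tpch")
would place lineitem.tbl at testdata/impala-data/tpch/lineitem/lineitem.tbl.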

- Tim



On Tue, Jul 19, 2016 at 9:23 PM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Tim,
>
> Thanks for the update.
>
> Regards,
> Valencia
>
>
> From: Tim Armstrong <ta...@cloudera.com>
> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
> Cc: Alex Behm <al...@cloudera.com>, Casey Ching <ca...@cloudera.com>,
> dev@impala.incubator.apache.org, Manish Patil/Austin/Contr/IBM@IBMUS,
> Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
> Jagadale/Austin/Contr/IBM@IBMUS
> Date: 07/20/2016 02:35 AM
>
> Subject: Re: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>   I wasn't able to get a clear answer, but as far as we know it hasn't
> been modified.
>
> - Tim
>
> On Tue, Jul 12, 2016 at 4:59 AM, Valencia Serrao <*vserrao@us.ibm.com*
> <vs...@us.ibm.com>> wrote:
>
>    Hi Tim,
>
>    Thank you for responding.
>
>    Please do let me know if any post-processing was done on the data at
>    *https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp*
>    <https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp>
>    *.*
>
>    Regards,
>    Valencia
>
>
>
>    From: Tim Armstrong <*tarmstrong@cloudera.com*
>    <ta...@cloudera.com>>
>    To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>    Cc: Casey Ching <*casey@cloudera.com* <ca...@cloudera.com>>, Alex Behm
>    <*alex.behm@cloudera.com* <al...@cloudera.com>>,
>    *dev@impala.incubator.apache.org* <de...@impala.incubator.apache.org>,
>    Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
>    Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
>    Date: 07/08/2016 01:31 AM
>
>
>    Subject: Re: Fw: Issues with generating testdata for Impala
>    ------------------------------
>
>
>
>    Hi Valencia,
>      The data is scale factor 1 for the TPC-H and TPC-DS benchmarks:
>    *http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp*
>    <http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp>
>
>    I imagine you could reconstruct it using their data generators.
>
>    I'm unsure if we modified those data generators at all or did any
>    postprocessing. I'm going to check if anyone knows exactly how that data
>    was generated originally.
>
>    On Wed, Jul 6, 2016 at 10:52 PM, Valencia Serrao <*vserrao@us.ibm.com*
>    <vs...@us.ibm.com>> wrote:
>       Hi Casey/Alex/Tim,
>
>          I need to know whether it is possible to generate the tpch and
>          tpcds data without using the tar's you provided at
>          *https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp*
>          <https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp>.
>          Because when i tried to load data without using the tpch and tpcds tars,
>          though functional-query data loaded successfully, I got the following error
>          during the TPC-H data load step:
>
>
>          Error: Error while compiling statement: FAILED: SemanticException Line
>          1:23 Invalid path ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No
>          files matching path file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
>          (state=42000,code=40000)
>          org.apache.hive.service.cli.HiveSQLException: Error while compiling
>          statement: FAILED: SemanticException Line 1:23 Invalid path
>          ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching
>          path file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
>              at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:235)
>              at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:221)
>              at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
>              at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
>              at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
>              at org.apache.hive.beeline.Commands.sql(Commands.java:976)
>              at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1085)
>              at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:917)
>              at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:895)
>              at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:837)
>              at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:482)
>              at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
>              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>              at java.lang.reflect.Method.invoke(Method.java:606)
>              at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>              at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>          Caused by: org.apache.hive.service.cli.HiveSQLException: Error while
>          compiling statement: FAILED: SemanticException Line 1:23 Invalid path
>          ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching
>          path file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
>
>
>          Regards,
>          Valencia
>
>
>          From: Casey Ching <*casey@cloudera.com* <ca...@cloudera.com>>
>          To: Alex Behm <*alex.behm@cloudera.com* <al...@cloudera.com>>,
>          *dev@impala.incubator.apache.org*
>          <de...@impala.incubator.apache.org>
>          Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>          Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
>          Serrao/Austin/Contr/IBM@IBMUS
>          Date: 05/04/2016 11:51 AM
>          Subject: Re: Fw: Issues with generating testdata for Impala
>          ------------------------------
>
>
>
>
>          Comment inline below
>
>          On May 3, 2016 at 11:18:06 PM, Alex Behm (
>          *alex.behm@cloudera.com* <al...@cloudera.com>) wrote:
>          Hi Valencia,
>
>                                  I'm sorry you are having so much trouble
>                                  with our setup. Let's see what we
>                                  can do.
>
>                                  There was an infra issue with receiving
>                                  the logs you sent me. The
>                                  email/attachment got rejected on our
>                                  side. Maybe you can upload the logs
>                                  somewhere so I can grab them?
>
>                                  See more responses inline below.
>
>                                  On Sat, Apr 30, 2016 at 5:01 AM,
>                                  Valencia Serrao <*vserrao@us.ibm.com*
>                                  <vs...@us.ibm.com>> wrote:
>
>                                  > Hi Alex,
>                                  >
>                                  > I was going more deeper through the
>                                  logs. I have some findings and queries:
>                                  >
>                                  > 1. At the "Invalidating Metadata" step
>                                  (as mentioned in below mail), i
>                                  > noticed that, it is trying to use
>                                  kerberos. Perhaps, this is preventing the
>                                  > testdata generation from proceeding,
>                                  as we are not using Kerberos.
>                                  > I need to know how this can be done
>                                  without involving Kerberos support ?
>                                  >
>                                  Kerberos is certainly not needed to
>                                  build and run tests.
>
>                                  >
>                                  > 2. I had executed the fe tests despite
>                                  the incomplete testdata generation,
>                                  > the tests started and surely have
>                                  failed. Many of these (null pointer
>                                  > exception in AuthorzationTests) have a
>                                  common cause: "tpch database does
>                                  > not exist."
>                                  > e.g. as shown in
>                                  .Impala/cluster_logs/query_tests/test-run-workload.log.
>                                  >
>                                  > Does the "tpch" database gets created
>                                  after the current blocker step
>                                  > "Invalidating Metadata" ?
>                                  >
>
>                                  Yes, the TPCH database is created and
>                                  loaded as part of that first phase.
>                                  However, the data files are not yet
>                                  publicly accessible. Let me work on
>                                  that from my side, and get back to you
>                                  soon. One way or the other we'll be
>                                  able to provide you with the data.
>
>          The data is at
>          *https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp*
>          <https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp>
>          . The files are split into 50 MB pieces for git. You can put them back
>          together as is done in
>          *https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile*
>          <https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile>
>
>                                  >
>                                  > 3. In the fe test console output log,
>                                  another error shown:
>                                  > ============================= test
>                                  session starts
>                                  > ==============================
>                                  > platform linux2 -- Python 2.7.5 --
>                                  py-1.4.30 -- pytest-2.7.2
>                                  > rootdir: /work/, inifile:
>                                  > plugins: random, xdist
>                                  > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>                                  >
>                                  > These are not present/created on my
>                                  vm. May i know when these get created ?
>                                  >
>                                  > 4. Could you also share the total
>                                  number of fe tests ?
>                                  >
>
>                                  I'll privately send you the console
>                                  output from a successful FE run.
>                                  Hopefully that can help.
>
>                                  Cheers,
>
>                                  Alex
>
>                                  >
>                                  >
>                                  > Looking forward to your reply.
>                                  >
>                                  > Regards,
>                                  > Valencia
>                                  >
>                                  >
>                                  >
>                                  > From: Valencia Serrao/Austin/Contr/IBM
>                                  > To: *dev@impala.incubator.apache.org*
>                                  <de...@impala.incubator.apache.org>, Alex
>                                  Behm <*alex.behm@cloudera.com*
>                                  <al...@cloudera.com>>
>                                  > Cc: Sudarshan
>                                  Jagadale/Austin/Contr/IBM@IBMUS,
>                                  Nishidha
>                                  > Panpaliya/Austin/Contr/IBM@IBMUS,
>                                  Valencia Serrao/Austin/Contr/IBM@IBMUS
>                                  > Date: 04/30/2016 09:05 AM
>                                  > Subject: Fw: Issues with generating
>                                  testdata for Impala
>                                  > ------------------------------
>                                  >
>                                  >
>                                  >
>                                  > Hi Alex,
>                                  >
>                                  > I've been able to make some progress
>                                  on testdata generation, however, i
>                                  > still face the following issues:
>                                  >
>                                  >
>                                  >
>                                  *******************************************************************************************************************************************************************
>
>                                  > Invalidating Metadata
>                                  >
>                                  >
>                                  (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
>
>                                  > INSERT OVERWRITE TABLE
>                                  functional_parquet.alltypes partition (year, month)
>                                  > SELECT id, bool_col, tinyint_col,
>                                  smallint_col, int_col, bigint_col,
>                                  > float_col, double_col,
>                                  date_string_col, string_col, timestamp_col, year,
>                                  > month
>                                  > FROM functional.alltypes
>                                  >
>                                  > Data Loading from Impala failed with
>                                  error: ImpalaBeeswaxException:
>                                  > INNER EXCEPTION: <class
>                                  'socket.error'>
>                                  > MESSAGE: [Errno 104] Connection reset
>                                  by peer
>                                  > Error in
>                                  /root/nishidha/Impala/testdata/bin/create-load-data.sh at line
>                                  > 41: while [ -n "$*" ]
>                                  > Error in
>                                  /root/nishidha/Impala/buildall.sh at line 368:
>                                  >
>                                  ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
>                                  > <<< Y
>                                  >
>                                  >
>                                  *************************************************************************************************************************************************************************
>
>                                  >
>                                  > i continued with fe tests as is. Here
>                                  is the complete output log.
>                                  > [attachment "fe_test_output.zip"
>                                  deleted by Valencia
>                                  > Serrao/Austin/Contr/IBM]
>                                  >
>                                  > Cluster logs: [attachment
>                                  "cluster_logs.7z" deleted by Valencia
>                                  > Serrao/Austin/Contr/IBM]
>                                  >
>                                  > Kindly guide me on the same.
>                                  >
>                                  > Regards,
>                                  > Valencia
>                                  > ----- Forwarded by Valencia
>                                  Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM
>                                  > -----
>                                  >
>                                  > From: Sudarshan
>                                  Jagadale/Austin/Contr/IBM
>                                  > To: Valencia
>                                  Serrao/Austin/Contr/IBM@IBMUS
>                                  > Date: 04/29/2016 10:49 AM
>                                  > Subject: Fw: Issues with generating
>                                  testdata for Impala
>                                  > ------------------------------
>                                  >
>                                  >
>                                  > FYI
>                                  > Thanks and Regards
>                                  > Sudarshan Jagadale
>                                  > Power Open Source Solutions
>                                  > ----- Forwarded by Sudarshan
>                                  Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
>                                  > AM -----
>                                  >
>                                  > From: Alex Behm <
>                                  *alex.behm@cloudera.com*
>                                  <al...@cloudera.com>>
>                                  > To: *dev@impala.incubator.apache.org*
>                                  <de...@impala.incubator.apache.org>
>                                  > Cc: Sudarshan
>                                  Jagadale/Austin/Contr/IBM@IBMUS,
>                                  Nishidha
>                                  > Panpaliya/Austin/Contr/IBM@IBMUS
>                                  > Date: 04/28/2016 09:34 PM
>                                  > Subject: Re: Issues with generating
>                                  testdata for Impala
>                                  > ------------------------------
>                                  >
>                                  >
>                                  >
>                                  > Hi Valencia,
>                                  >
>                                  > sorry I did not get the attachment. Would you be able to tar.gz and
>                                  > attach the whole cluster_logs directory?
>                                  >
>                                  > Alex
>                                  >
>                                  > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <vs...@us.ibm.com> wrote:
>                                  >
>                                  > Hi Alex,
>                                  >
>                                  > I tried building impala again with the following:
>                                  > HDFS CDH 5.7.0 (
>                                  > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
>                                  > )
>                                  > HBASE CDH 5.7.0 SNAPSHOT (
>                                  > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
>                                  > )
>                                  > - this required patching in a fix (
>                                  > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
>                                  > )
>                                  > HIVE CDH 5.8.0 SNAPSHOT
>                                  >
>                                  > With the above combination, i'm able to move past the exception and
>                                  > also have the RegionServer service up and running. However, it now gives
>                                  > error as below:
>                                  >
>                                  > ********************************************************************************
>                                  > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                                  > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
>                                  > d1 DECIMAL,
>                                  > d2 DECIMAL(10, 0),
>                                  > d3 DECIMAL(20, 10),
>                                  > d4 DECIMAL(38, 38),
>                                  > d5 DECIMAL(10, 5))
>                                  > PARTITIONED BY (d6 DECIMAL(9, 0))
>                                  > ROW FORMAT delimited fields terminated by ','
>                                  > STORED AS TEXTFILE
>                                  > LOCATION '/test-warehouse/decimal_tbl'
>                                  >
>                                  > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                                  > USE functional
>                                  >
>                                  > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                                  > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
>                                  >
>                                  > Data Loading from Impala failed with error: ImpalaBeeswaxException:
>                                  > INNER EXCEPTION: <class 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
>                                  > MESSAGE:
>                                  > Error: null
>                                  >
>                                  > ********************************************************************************
>
>                                  >
>                                  > Here is the complete log for the same. (See attached file:
>                                  > data-load-functional-exhaustive.log)
>                                  >
>                                  > It would be great if you could guide me on this issue, so I could
>                                  > proceed with the fe tests.
>                                  >
>                                  > Still awaiting a link to the source code of HDFS CDH 5.8.0.
>                                  >
>                                  > Regards,
>                                  > Valencia

Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Tim,

Thanks for the update.

Regards,
Valencia



From:	Tim Armstrong <ta...@cloudera.com>
To:	Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc:	Alex Behm <al...@cloudera.com>, Casey Ching
            <ca...@cloudera.com>, dev@impala.incubator.apache.org, Manish
            Patil/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS
Date:	07/20/2016 02:35 AM
Subject:	Re: Fw: Issues with generating testdata for Impala



Hi Valencia,
  I wasn't able to get a clear answer, but as far as we know it hasn't been
modified.

- Tim

On Tue, Jul 12, 2016 at 4:59 AM, Valencia Serrao <vs...@us.ibm.com>
wrote:
  Hi Tim,

  Thank you for responding.

  Please do let me know if any post-processing was done on the data at
  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
  .

  Regards,
  Valencia



  From: Tim Armstrong <ta...@cloudera.com>
  To: Valencia Serrao/Austin/Contr/IBM@IBMUS
  Cc: Casey Ching <ca...@cloudera.com>, Alex Behm <al...@cloudera.com>,
  dev@impala.incubator.apache.org, Nishidha
  Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
  Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
  Date: 07/08/2016 01:31 AM
  Subject: Re: Fw: Issues with generating testdata for Impala



  Hi Valencia,
    The data is scale factor 1 for the TPC-H and TPC-DS benchmarks:
  http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp


  I imagine you could reconstruct it using their data generators.

  I'm unsure if we modified those data generators at all or did any
  postprocessing. I'm going to check if anyone knows exactly how that data
  was generated originally.
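
  If the data is regenerated with the TPC generators, one cheap sanity check
  is to compare row counts against the scale factor 1 table sizes given in the
  TPC-H specification. The sketch below assumes one pipe-delimited
  `<table>.tbl` file per table in a single directory, which is how dbgen
  typically names its output. The helper names and the directory layout are
  illustrative, not part of Impala's tooling, and matching counts would not
  show that the files are identical to the ones in the tars.

```python
import os

# Table sizes at scale factor 1, as listed in the TPC-H specification.
TPCH_SF1_ROWS = {
    "region": 5,
    "nation": 25,
    "supplier": 10000,
    "customer": 150000,
    "part": 200000,
    "partsupp": 800000,
    "orders": 1500000,
    "lineitem": 6001215,
}


def count_rows(path):
    """Count the lines in a delimited text file (one record per line)."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def check_row_counts(data_dir, expected=TPCH_SF1_ROWS, suffix=".tbl"):
    """Return {table: (expected, actual)} for every table whose count differs.

    A missing file is reported with an actual count of None.
    The <table>.tbl naming is an assumption about the generator output.
    """
    mismatches = {}
    for table, want in expected.items():
        path = os.path.join(data_dir, table + suffix)
        got = count_rows(path) if os.path.isfile(path) else None
        if got != want:
            mismatches[table] = (want, got)
    return mismatches
```

  An empty result from `check_row_counts("/path/to/generated/tpch")` would
  mean every table has the expected number of rows.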

  On Wed, Jul 6, 2016 at 10:52 PM, Valencia Serrao <vs...@us.ibm.com>
  wrote:
        Hi Casey/Alex/Tim,

        I need to know whether it is possible to generate the tpch and
        tpcds data without using the tars you provided at
        https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp

        When I tried to load data without the tpch and tpcds tars, the
        functional-query data loaded successfully, but I got the following
        error during the TPC-H data load step:

        Error: Error while compiling statement: FAILED: SemanticException
        Line 1:23 Invalid path
        ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files
        matching path file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
        (state=42000,code=40000)
        org.apache.hive.service.cli.HiveSQLException: Error while compiling
        statement: FAILED: SemanticException Line 1:23 Invalid path
        ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files
        matching path file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
            at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:235)
            at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:221)
            at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
            at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
            at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
            at org.apache.hive.beeline.Commands.sql(Commands.java:976)
            at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1085)
            at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:917)
            at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:895)
            at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:837)
            at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:482)
            at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
        Caused by: org.apache.hive.service.cli.HiveSQLException: Error
        while compiling statement: FAILED: SemanticException Line 1:23
        Invalid path ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No
        files matching path
        file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
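
        The "No files matching path" message suggests the load step expects
        local files under testdata/impala-data/tpch/<table>. A small
        pre-flight check along the following lines could confirm which
        inputs are absent before the load is started. It assumes one
        directory per table, as the path in the error implies; the function
        name is hypothetical, and the table list is simply the eight tables
        of the TPC-H schema.

```python
import os

# The eight tables defined by the TPC-H schema.
TPCH_TABLES = [
    "customer", "lineitem", "nation", "orders",
    "part", "partsupp", "region", "supplier",
]


def missing_table_dirs(data_root, tables=TPCH_TABLES):
    """Return the tables whose directory under data_root is absent or has no files.

    The <data_root>/<table>/ layout is an assumption inferred from the
    error path, not something confirmed in this thread.
    """
    missing = []
    for table in tables:
        table_dir = os.path.join(data_root, table)
        has_file = os.path.isdir(table_dir) and any(
            os.path.isfile(os.path.join(table_dir, name))
            for name in os.listdir(table_dir)
        )
        if not has_file:
            missing.append(table)
    return missing
```

        For the path in the error above this would be called as
        `missing_table_dirs("/ImpalaPPC/testdata/impala-data/tpch")`.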


        Regards,
        Valencia


        From: Casey Ching <ca...@cloudera.com>
        To: Alex Behm <al...@cloudera.com>,
        dev@impala.incubator.apache.org
        Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
        Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
        Serrao/Austin/Contr/IBM@IBMUS
        Date: 05/04/2016 11:51 AM
        Subject: Re: Fw: Issues with generating testdata for Impala




        Comment inline below


        On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com)
        wrote:


                                Hi Valencia,

                                I'm sorry you are having so much trouble
                                with our setup. Let's see what we
                                can do.

                                There was an infra issue with receiving the
                                logs you sent me. The
                                email/attachment got rejected on our side.
                                Maybe you can upload the logs
                                somewhere so I can grab them?

                                See more responses inline below.

                                On Sat, Apr 30, 2016 at 5:01 AM, Valencia
                                Serrao <vs...@us.ibm.com> wrote:

                                > Hi Alex,
                                >
                                > I went deeper through the logs and have
                                > some findings and queries:
                                >
                                > 1. At the "Invalidating Metadata" step (as
                                > mentioned in the mail below), I noticed
                                > that it is trying to use Kerberos. Perhaps
                                > this is preventing the testdata generation
                                > from proceeding, as we are not using
                                > Kerberos. How can this be done without
                                > involving Kerberos support?
                                >
                                Kerberos is certainly not needed to build
                                and run tests.

                                >
                                > 2. I executed the fe tests despite the
                                > incomplete testdata generation; the tests
                                > started and, as expected, failed. Many of
                                > these (null pointer exception in
                                > AuthorizationTests) have a common cause:
                                > "tpch database does not exist."
                                > e.g. as shown in
                                > .Impala/cluster_logs/query_tests/test-run-workload.log.
                                >
                                > Does the "tpch" database get created after
                                > the current blocker step "Invalidating
                                > Metadata"?
                                >

                                Yes, the TPCH database is created and
                                loaded as part of that first phase.
                                However, the data files are not yet
                                publicly accessible. Let me work on
                                that from my side, and get back to you
                                soon. One way or the other we'll be
                                able to provide you with the data.

        The data is at
        https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
         . The files are split into 50 MB pieces for git. You can put them
        back together as is done in
        https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
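
        For anyone not using the Dockerfile, putting split pieces back
        together amounts to concatenating them in order. Here is a minimal
        sketch, assuming the pieces share a common prefix and sort correctly
        by name (as the output of `split` does); the actual piece names in
        the repository are not given in this thread, so the
        `tpch.tar.gz.` prefix in the usage note is hypothetical.

```python
import glob
import os
import shutil


def reassemble(prefix, output_path):
    """Concatenate all files starting with `prefix`, in name order, into output_path.

    Returns the list of pieces used. The naming scheme (common prefix,
    lexicographically ordered suffixes) is an assumption.
    """
    out_abs = os.path.abspath(output_path)
    parts = sorted(
        p for p in glob.glob(glob.escape(prefix) + "*")
        if os.path.abspath(p) != out_abs  # never read the output as a piece
    )
    if not parts:
        raise FileNotFoundError("no pieces match prefix %r" % prefix)
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts
```

        Usage would look like `reassemble("tmp/tpch.tar.gz.", "tpch.tar.gz")`,
        after which the resulting archive can be extracted as usual.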

                                >
                                > 3. In the fe test console output log,
                                > another error is shown:
                                > ============================= test session starts ==============================
                                > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
                                > rootdir: /work/, inifile:
                                > plugins: random, xdist
                                > ERROR: file not found: /work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
                                >
                                > These are not present/created on my vm.
                                > May I know when these get created?
                                >
                                > 4. Could you also share the total number
                                > of fe tests?
                                >

                                I'll privately send you the console output
                                from a successful FE run.
                                Hopefully that can help.

                                Cheers,

                                Alex

                                >
                                >
                                > Looking forward to your reply.
                                >
                                > Regards,
                                > Valencia
                                >
                                >
                                > From: Valencia Serrao/Austin/Contr/IBM
                                > To: dev@impala.incubator.apache.org, Alex Behm <al...@cloudera.com>
                                > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                                > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
                                > Date: 04/30/2016 09:05 AM
                                > Subject: Fw: Issues with generating testdata for Impala
                                > ------------------------------
                                >
                                > Hi Alex,
                                >
                                > I've been able to make some progress on testdata generation; however, I
                                > still face the following issues:
                                >
                                >
                                >
                                > ********************************************************************************
                                > Invalidating Metadata
                                >
                                > (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
                                > INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
                                > SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
                                > float_col, double_col, date_string_col, string_col, timestamp_col, year,
                                > month
                                > FROM functional.alltypes
                                >
                                > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                                > INNER EXCEPTION: <class 'socket.error'>
                                > MESSAGE: [Errno 104] Connection reset by peer
                                > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
                                > Error in /root/nishidha/Impala/buildall.sh at line 368:
                                > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
                                >
                                > ********************************************************************************

                                >
                                > I continued with the fe tests as is. Here is the complete output log.
                                > [attachment "fe_test_output.zip" deleted by Valencia Serrao/Austin/Contr/IBM]
                                >
                                > Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
                                > Serrao/Austin/Contr/IBM]
                                >
                                > Kindly guide me on the same.
                                >
                                > Regards,
                                > Valencia
                                > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM -----
                                >
                                > From: Sudarshan Jagadale/Austin/Contr/IBM
                                > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                                > Date: 04/29/2016 10:49 AM
                                > Subject: Fw: Issues with generating testdata for Impala
                                > ------------------------------
                                >
                                >
                                > FYI
                                > Thanks and Regards
                                > Sudarshan Jagadale
                                > Power Open Source Solutions
                                > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016 10:48 AM -----
                                >
                                > From: Alex Behm <al...@cloudera.com>
                                > To: dev@impala.incubator.apache.org
                                > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                                > Panpaliya/Austin/Contr/IBM@IBMUS
                                > Date: 04/28/2016 09:34 PM
                                > Subject: Re: Issues with generating testdata for Impala
                                > ------------------------------
                                >
                                >
                                >
                                > Hi Valencia,
                                >
                                > sorry I did not get the attachment. Would you be able to tar.gz and
                                > attach the whole cluster_logs directory?
                                >
                                > Alex
                                >
                                > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <vs...@us.ibm.com> wrote:
                                >
                                > Hi Alex,
                                >
                                > I tried building impala again with the following:
                                > HDFS CDH 5.7.0 (
                                > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
                                > )
                                > HBASE CDH 5.7.0 SNAPSHOT (
                                > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
                                > )
                                > - this required patching in a fix (
                                > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
                                > )
                                > HIVE CDH 5.8.0 SNAPSHOT
                                >
                                > With the above combination, i'm able to move past the exception and
                                > also have the RegionServer service up and running. However, it now gives
                                > error as below:
                                >
                                > ********************************************************************************
                                > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                                > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
                                > d1 DECIMAL,
                                > d2 DECIMAL(10, 0),
                                > d3 DECIMAL(20, 10),
                                > d4 DECIMAL(38, 38),
                                > d5 DECIMAL(10, 5))
                                > PARTITIONED BY (d6 DECIMAL(9, 0))
                                > ROW FORMAT delimited fields terminated by ','
                                > STORED AS TEXTFILE
                                > LOCATION '/test-warehouse/decimal_tbl'
                                >
                                > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                                > USE functional
                                >
                                > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                                > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
                                >
                                > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                                > INNER EXCEPTION: <class 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                                > MESSAGE:
                                > Error: null
                                >
                                > ********************************************************************************

                                >
                                > Here is the complete log for the same. (See attached file:
                                > data-load-functional-exhaustive.log)
                                >
                                > It would be great if you could guide me on this issue, so I could
                                > proceed with the fe tests.
                                >
                                > Still awaiting a link to the source code of HDFS CDH 5.8.0.
                                >
                                > Regards,
                                > Valencia
                                >

Re: Fw: Issues with generating testdata for Impala

Posted by Tim Armstrong <ta...@cloudera.com>.
Hi Valencia,
  I wasn't able to get a clear answer, but as far as we know it hasn't been
modified.

- Tim

On Tue, Jul 12, 2016 at 4:59 AM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Tim,
>
> Thank you for responding.
>
> Please do let me know if any post-processing was done on the data at
> https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
>
> Regards,
> Valencia
>
>
> From: Tim Armstrong <ta...@cloudera.com>
> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
> Cc: Casey Ching <ca...@cloudera.com>, Alex Behm <al...@cloudera.com>,
> dev@impala.incubator.apache.org, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS,
> Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Manish
> Patil/Austin/Contr/IBM@IBMUS
> Date: 07/08/2016 01:31 AM
>
> Subject: Re: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Hi Valencia,
>   The data is scale factor 1 for the TPC-H and TPC-DS benchmarks:
> http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp
>
> I imagine you could reconstruct it using their data generators.
>
> I'm unsure if we modified those data generators at all or did any
> postprocessing. I'm going to check if anyone knows exactly how that data
> was generated originally.
>
> On Wed, Jul 6, 2016 at 10:52 PM, Valencia Serrao <vs...@us.ibm.com> wrote:
>
>    Hi Casey/Alex/Tim,
>
>    I need to know whether it is possible to generate the tpch and tpcds
>    data without using the tars you provided at
>    https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
>
>    When I tried to load data without the tpch and tpcds tars, the
>    functional-query data loaded successfully, but I got the following
>    error during the TPC-H data load step:
>
>    Error: Error while compiling statement: FAILED: SemanticException Line
>    1:23 Invalid path ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No
>    files matching path file: /ImpalaPPC/testdata/impala-data/tpch/lineitem
>    (state=42000,code=40000)
>    org.apache.hive.service.cli.HiveSQLException: Error while compiling
>    statement: FAILED: SemanticException Line 1:23 Invalid path
>    ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching path
>    file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
>    at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:235)
>    at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:221)
>    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
>    at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
>    at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
>    at org.apache.hive.beeline.Commands.sql(Commands.java:976)
>    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1085)
>    at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:917)
>    at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:895)
>    at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:837)
>    at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:482)
>    at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    at java.lang.reflect.Method.invoke(Method.java:606)
>    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>    Caused by: org.apache.hive.service.cli.HiveSQLException: Error while
>    compiling statement: FAILED: SemanticException Line 1:23 Invalid path
>    ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching path
>    file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
>
>
>    Regards,
>    Valencia
>
>
>    From: Casey Ching <ca...@cloudera.com>
>    To: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org
>    Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>    Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
>    Serrao/Austin/Contr/IBM@IBMUS
>    Date: 05/04/2016 11:51 AM
>    Subject: Re: Fw: Issues with generating testdata for Impala
>    ------------------------------
>
>
>
>
>    Comment inline below
>
>    On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:
>    Hi Valencia,
>
>                I'm sorry you are having so much trouble with our setup.
>                Let's see what we
>                can do.
>
>                There was an infra issue with receiving the logs you sent
>                me. The
>                email/attachment got rejected on our side. Maybe you can
>                upload the logs
>                somewhere so I can grab them?
>
>                See more responses inline below.
>
>                On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vserrao@us.ibm.com> wrote:
>
>                > Hi Alex,
>                >
>                > I was going more deeper through the logs. I have some
>                findings and queries:
>                >
>                > 1. At the "Invalidating Metadata" step (as mentioned in
>                below mail), i
>                > noticed that, it is trying to use kerberos. Perhaps,
>                this is preventing the
>                > testdata generation from proceeding, as we are not using
>                Kerberos.
>                > I need to know how this can be done without involving
>                Kerberos support ?
>                >
>                Kerberos is certainly not needed to build and run tests.
>
>                >
>                > 2. I had executed the fe tests despite the incomplete
>                testdata generation,
>                > the tests started and surely have failed. Many of these
>                (null pointer
>                > exception in AuthorzationTests) have a common cause:
>                "tpch database does
>                > not exist."
>                > e.g. as shown in
>                .Impala/cluster_logs/query_tests/test-run-workload.log.
>                >
>                > Does the "tpch" database gets created after the current
>                blocker step
>                > "Invalidating Metadata" ?
>                >
>
>                Yes, the TPCH database is created and loaded as part of
>                that first phase.
>                However, the data files are not yet publicly accessible.
>                Let me work on
>                that from my side, and get back to you soon. One way or
>                the other we'll be
>                able to provide you with the data.
>
>    The data is at
>    https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
>    . The files are split into 50 MB pieces for git. You can put them back
>    together as is done in
>    https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
>
>                >
>                > 3. In the fe test console output log, another error
>                shown:
>                > ============================= test session starts
>                > ==============================
>                > platform linux2 -- Python 2.7.5 -- py-1.4.30 --
>                pytest-2.7.2
>                > rootdir: /work/, inifile:
>                > plugins: random, xdist
>                > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>                >
>                > These are not present/created on my vm. May i know when
>                these get created ?
>                >
>                > 4. Could you also share the total number of fe tests ?
>                >
>
>                I'll privately send you the console output from a
>                successful FE run.
>                Hopefully that can help.
>
>                Cheers,
>
>                Alex
>
>                >
>                >
>                > Looking forward to your reply.
>                >
>                > Regards,
>                > Valencia
>                >
>                >
>                >
>                > From: Valencia Serrao/Austin/Contr/IBM
>                > To: dev@impala.incubator.apache.org, Alex Behm <alex.behm@cloudera.com>
>                > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>                > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
>                Serrao/Austin/Contr/IBM@IBMUS
>                > Date: 04/30/2016 09:05 AM
>                > Subject: Fw: Issues with generating testdata for Impala
>                > ------------------------------
>                >
>                >
>                >
>                > Hi Alex,
>                >
>                > I've been able to make some progress on testdata
>                generation, however, i
>                > still face the following issues:
>                >
>                >
>                >
>                *******************************************************************************************************************************************************************
>
>                > Invalidating Metadata
>                >
>                >
>                (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
>
>                > INSERT OVERWRITE TABLE functional_parquet.alltypes
>                partition (year, month)
>                > SELECT id, bool_col, tinyint_col, smallint_col, int_col,
>                bigint_col,
>                > float_col, double_col, date_string_col, string_col,
>                timestamp_col, year,
>                > month
>                > FROM functional.alltypes
>                >
>                > Data Loading from Impala failed with error:
>                ImpalaBeeswaxException:
>                > INNER EXCEPTION: <class 'socket.error'>
>                > MESSAGE: [Errno 104] Connection reset by peer
>                > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
>                > Error in /root/nishidha/Impala/buildall.sh at line 368:
>                > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
>                >
>                >
>                *************************************************************************************************************************************************************************
>
>                >
>                > i continued with fe tests as is. Here is the complete
>                output log.
>                > [attachment "fe_test_output.zip" deleted by Valencia
>                > Serrao/Austin/Contr/IBM]
>                >
>                > Cluster logs: [attachment "cluster_logs.7z" deleted by
>                Valencia
>                > Serrao/Austin/Contr/IBM]
>                >
>                > Kindly guide me on the same.
>                >
>                > Regards,
>                > Valencia
>                > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on
>                04/29/2016 10:57 AM
>                > -----
>                >
>                > From: Sudarshan Jagadale/Austin/Contr/IBM
>                > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>                > Date: 04/29/2016 10:49 AM
>                > Subject: Fw: Issues with generating testdata for Impala
>                > ------------------------------
>                >
>                >
>                > FYI
>                > Thanks and Regards
>                > Sudarshan Jagadale
>                > Power Open Source Solutions
>                > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM
>                on 04/29/2016 10:48
>                > AM -----
>                >
>                > From: Alex Behm <al...@cloudera.com>
>                > To: dev@impala.incubator.apache.org
>                > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>                > Panpaliya/Austin/Contr/IBM@IBMUS
>                > Date: 04/28/2016 09:34 PM
>                > Subject: Re: Issues with generating testdata for Impala
>                > ------------------------------
>                >
>                >
>                >
>                > Hi Valencia,
>                >
>                > sorry I did not get the attachment. Would you be able to
>                tar.gz and attach
>                > the whole cluster_logs directory?
>                >
>                > Alex
>                >
>                > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <vs...@us.ibm.com> wrote:
>                >
>                > Hi Alex,
>                >
>                > I tried building impala again with the following:
>                > HDFS CDH 5.7.0 (
>                > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
>                > )
>                > HBASE CDH 5.7.0 SNAPSHOT (
>                > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
>                > )
>                > - this required to patch in a fix (
>                > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
>                > )
>                > HIVE CDH 5.8.0 SNAPSHOT
>                >
>                > With the above combination, i'm able to move past the
>                exception and
>                > also have the RegionServer service up and running.
>                However, it now gives
>                > error as below:
>                >
>                >
>                >
>                ********************************************************************************************************************
>
>                >
>                (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                > CREATE EXTERNAL TABLE IF NOT EXISTS
>                functional.decimal_tbl (
>                > d1 DECIMAL,
>                > d2 DECIMAL(10, 0),
>                > d3 DECIMAL(20, 10),
>                > d4 DECIMAL(38, 38),
>                > d5 DECIMAL(10, 5))
>                > PARTITIONED BY (d6 DECIMAL(9, 0))
>                > ROW FORMAT delimited fields terminated by ','
>                > STORED AS TEXTFILE
>                > LOCATION '/test-warehouse/decimal_tbl'
>                >
>                >
>                (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                > USE functional
>                >
>                >
>                (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>                > ALTER TABLE decimal_tbl ADD IF NOT EXISTS
>                PARTITION(d6=1)
>                >
>                > Data Loading from Impala failed with error:
>                ImpalaBeeswaxException:
>                > INNER EXCEPTION: <class
>                > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
>                > MESSAGE:
>                > Error: null
>                >
>                >
>                ******************************************************************************************************************
>
>                >
>                > Here is the complete log for the same. *(See attached
>                file:
>                > data-load-functional-exhaustive.log)*
>                >
>                > It would great if you could guide me on this issue, so i
>                could proceed
>                > with the fe tests.
>                >
>                > Still awaiting link to the source code of HDFS CDH 5.8.0
>                >
>                > Regards,
>                > Valencia
>                >
>                >
>                >
>                >
>
>
>
>
>

Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Tim,

Thank you for responding.

Please do let me know if any post-processing was done on the data at
https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
.

Regards,
Valencia




From:	Tim Armstrong <ta...@cloudera.com>
To:	Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc:	Casey Ching <ca...@cloudera.com>, Alex Behm
            <al...@cloudera.com>, dev@impala.incubator.apache.org,
            Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Manish
            Patil/Austin/Contr/IBM@IBMUS
Date:	07/08/2016 01:31 AM
Subject:	Re: Fw: Issues with generating testdata for Impala



Hi Valencia,
  The data is scale factor 1 for the TPC-H and TPC-DS benchmarks:
http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp


I imagine you could reconstruct it using their data generators.

I'm unsure if we modified those data generators at all or did any
postprocessing. I'm going to check if anyone knows exactly how that data
was generated originally.

On Wed, Jul 6, 2016 at 10:52 PM, Valencia Serrao <vs...@us.ibm.com>
wrote:
  Hi Casey/Alex/Tim,

  I need to know whether it is possible to generate the tpch and tpcds data
  without using the tars you provided at
  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
  . When I tried to load data without using the tpch and tpcds tars, the
  functional-query data loaded successfully, but I got the following error
  during the TPC-H data load step:

  Error: Error while compiling statement: FAILED: SemanticException Line
  1:23 Invalid path ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No
  files matching path file: /ImpalaPPC/testdata/impala-data/tpch/lineitem
  (state=42000,code=40000)
  org.apache.hive.service.cli.HiveSQLException: Error while compiling
  statement: FAILED: SemanticException Line 1:23 Invalid path
  ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching path
  file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
  at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:235)
  at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:221)
  at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
  at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
  at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
  at org.apache.hive.beeline.Commands.sql(Commands.java:976)
  at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1085)
  at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:917)
  at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:895)
  at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:837)
  at org.apache.hive.beeline.BeeLine.mainWithInputRedirection
  (BeeLine.java:482)
  at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke
  (NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke
  (DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
  Caused by: org.apache.hive.service.cli.HiveSQLException: Error while
  compiling statement: FAILED: SemanticException Line 1:23 Invalid path
  ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching path
  file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
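
The "No files matching path" error above means Hive found nothing to load under the local tpch/lineitem directory. A minimal Python sketch of a pre-flight check that reports which table directories are absent or empty before the load step runs. The <data_root>/<workload>/<table> layout is only inferred from the path in the error message, and the helper name missing_table_dirs is hypothetical; neither comes from the Impala scripts.

```python
from pathlib import Path

# The eight base tables defined by the TPC-H specification.
TPCH_TABLES = ("customer", "lineitem", "nation", "orders",
               "part", "partsupp", "region", "supplier")


def missing_table_dirs(data_root, workload="tpch", tables=TPCH_TABLES):
    """Return the tables whose directory is absent or contains no files.

    Assumes a <data_root>/<workload>/<table>/ layout, inferred from the
    path in the error message; this is an assumption, not a documented one.
    """
    missing = []
    for table in tables:
        table_dir = Path(data_root) / workload / table
        has_files = table_dir.is_dir() and any(
            p.is_file() for p in table_dir.iterdir())
        if not has_files:
            missing.append(table)
    return missing
```

For the setup in this thread the call would be missing_table_dirs("/ImpalaPPC/testdata/impala-data"); a non-empty result suggests the tars were not unpacked where the loader expects them.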


  Regards,
  Valencia

  From: Casey Ching <ca...@cloudera.com>
  To: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org
  Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
  Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
  Date: 05/04/2016 11:51 AM
  Subject: Re: Fw: Issues with generating testdata for Impala




  Comment inline below



  On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:


              Hi Valencia,

              I'm sorry you are having so much trouble with our setup.
              Let's see what we
              can do.

              There was an infra issue with receiving the logs you sent me.
              The
              email/attachment got rejected on our side. Maybe you can
              upload the logs
              somewhere so I can grab them?

              See more responses inline below.

              On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <
              vserrao@us.ibm.com> wrote:

              > Hi Alex,
              >
              > I was going more deeper through the logs. I have some
              findings and queries:
              >
              > 1. At the "Invalidating Metadata" step (as mentioned in
              below mail), i
              > noticed that, it is trying to use kerberos. Perhaps, this
              is preventing the
              > testdata generation from proceeding, as we are not using
              Kerberos.
              > I need to know how this can be done without involving
              Kerberos support ?
              >
              Kerberos is certainly not needed to build and run tests.

              >
              > 2. I had executed the fe tests despite the incomplete
              testdata generation,
              > the tests started and surely have failed. Many of these
              (null pointer
              > exception in AuthorzationTests) have a common cause: "tpch
              database does
              > not exist."
              > e.g. as shown
              in .Impala/cluster_logs/query_tests/test-run-workload.log.
              >
              > Does the "tpch" database gets created after the current
              blocker step
              > "Invalidating Metadata" ?
              >

              Yes, the TPCH database is created and loaded as part of that
              first phase.
              However, the data files are not yet publicly accessible. Let
              me work on
              that from my side, and get back to you soon. One way or the
              other we'll be
              able to provide you with the data.

  The data is at
  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
   . The files are split into 50 MB pieces for git. You can put them back
  together as is done in
  https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
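
Putting the pieces back together amounts to concatenating them in order. A minimal Python sketch, assuming the pieces share a common file-name prefix and sort lexically into the right order (as the output of the Unix split tool does). The piece names used here are hypothetical; the Dockerfile linked above remains the authoritative reference for the real file names and order.

```python
import shutil
from pathlib import Path


def reassemble(pieces_dir, prefix, output_path):
    """Concatenate the files named <prefix>* in pieces_dir into output_path.

    Pieces are joined in lexical order of their names (an assumption).
    Write output_path outside pieces_dir, or under a name that does not
    start with prefix, so a rerun does not pick up the previous output.
    """
    pieces = sorted(
        (p for p in Path(pieces_dir).iterdir()
         if p.is_file() and p.name.startswith(prefix)),
        key=lambda p: p.name)
    if not pieces:
        raise FileNotFoundError(
            f"no pieces starting with {prefix!r} in {pieces_dir}")
    with open(output_path, "wb") as out:
        for piece in pieces:
            with open(piece, "rb") as src:
                shutil.copyfileobj(src, out)  # stream copy, low memory use
    return [p.name for p in pieces]
```

The returned list of piece names can be compared against the repository listing to confirm that no piece was skipped.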

              >
              > 3. In the fe test console output log, another error shown:
              > ============================= test session starts
              > ==============================
              > platform linux2 -- Python 2.7.5 -- py-1.4.30 --
              pytest-2.7.2
              > rootdir: /work/, inifile:
              > plugins: random, xdist
              > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/

              >
              > These are not present/created on my vm. May i know when
              these get created ?
              >
              > 4. Could you also share the total number of fe tests ?
              >

              I'll privately send you the console output from a successful
              FE run.
              Hopefully that can help.

              Cheers,

              Alex

              >
              >
              > Looking forward to your reply.
              >
              > Regards,
              > Valencia
              >
              >
              >
              > From: Valencia Serrao/Austin/Contr/IBM
              > To: dev@impala.incubator.apache.org, Alex Behm <
              alex.behm@cloudera.com>
              > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
              > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
              Serrao/Austin/Contr/IBM@IBMUS
              > Date: 04/30/2016 09:05 AM
              > Subject: Fw: Issues with generating testdata for Impala
              > ------------------------------
              >
              >
              >
              > Hi Alex,
              >
              > I've been able to make some progress on testdata
              generation, however, i
              > still face the following issues:
              >
              >
              >
              *******************************************************************************************************************************************************************

              > Invalidating Metadata
              >
              >
              (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

              > INSERT OVERWRITE TABLE functional_parquet.alltypes
              partition (year, month)
              > SELECT id, bool_col, tinyint_col, smallint_col, int_col,
              bigint_col,
              > float_col, double_col, date_string_col, string_col,
              timestamp_col, year,
              > month
              > FROM functional.alltypes
              >
              > Data Loading from Impala failed with error:
              ImpalaBeeswaxException:
              > INNER EXCEPTION: <class 'socket.error'>
              > MESSAGE: [Errno 104] Connection reset by peer
              > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
              > Error in /root/nishidha/Impala/buildall.sh at line 368:
              > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
              >
              >
              *************************************************************************************************************************************************************************

              >
              > i continued with fe tests as is. Here is the complete
              output log.
              > [attachment "fe_test_output.zip" deleted by Valencia
              > Serrao/Austin/Contr/IBM]
              >
              > Cluster logs: [attachment "cluster_logs.7z" deleted by
              Valencia
              > Serrao/Austin/Contr/IBM]
              >
              > Kindly guide me on the same.
              >
              > Regards,
              > Valencia
              > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on
              04/29/2016 10:57 AM
              > -----
              >
              > From: Sudarshan Jagadale/Austin/Contr/IBM
              > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
              > Date: 04/29/2016 10:49 AM
              > Subject: Fw: Issues with generating testdata for Impala
              > ------------------------------
              >
              >
              > FYI
              > Thanks and Regards
              > Sudarshan Jagadale
              > Power Open Source Solutions
              > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on
              04/29/2016 10:48
              > AM -----
              >
              > From: Alex Behm <al...@cloudera.com>
              > To: dev@impala.incubator.apache.org
              > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
              > Panpaliya/Austin/Contr/IBM@IBMUS
              > Date: 04/28/2016 09:34 PM
              > Subject: Re: Issues with generating testdata for Impala
              > ------------------------------
              >
              >
              >
              > Hi Valencia,
              >
              > sorry I did not get the attachment. Would you be able to
              tar.gz and attach
              > the whole cluster_logs directory?
              >
              > Alex
              >
              > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <vs...@us.ibm.com> wrote:
              >
              > Hi Alex,
              >
              > I tried building impala again with the following:
              > HDFS CDH 5.7.0 (
              > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
              > )
              > HBASE CDH 5.7.0 SNAPSHOT (
              > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
              > )
              > - this required to patch in a fix (
              > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
              > )
              > HIVE CDH 5.8.0 SNAPSHOT
              >
              > With the above combination, i'm able to move past the
              exception and
              > also have the RegionServer service up and running. However,
              it now gives
              > error as below:
              >
              >
              >
              ********************************************************************************************************************

              >
              (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

              > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl
              (
              > d1 DECIMAL,
              > d2 DECIMAL(10, 0),
              > d3 DECIMAL(20, 10),
              > d4 DECIMAL(38, 38),
              > d5 DECIMAL(10, 5))
              > PARTITIONED BY (d6 DECIMAL(9, 0))
              > ROW FORMAT delimited fields terminated by ','
              > STORED AS TEXTFILE
              > LOCATION '/test-warehouse/decimal_tbl'
              >
              >
              (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

              > USE functional
              >
              >
              (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

              > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
              >
              > Data Loading from Impala failed with error:
              ImpalaBeeswaxException:
              > INNER EXCEPTION: <class
              > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
              > MESSAGE:
              > Error: null
              >
              >
              ******************************************************************************************************************

              >
              > Here is the complete log for the same. *(See attached file:

              > data-load-functional-exhaustive.log)*
              >
              > It would be great if you could guide me on this issue, so I could
              > proceed with the fe tests.
              >
              > Still awaiting link to the source code of HDFS CDH 5.8.0
              >
              > Regards,
              > Valencia
              >
              >
              >
              >











Re: Fw: Issues with generating testdata for Impala

Posted by Tim Armstrong <ta...@cloudera.com>.
Hi Valencia,
  The data is scale factor 1 for the TPC-H and TPC-DS benchmarks:
http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp

I imagine you could reconstruct it using their data generators.

I'm unsure if we modified those data generators at all or did any
postprocessing. I'm going to check if anyone knows exactly how that data
was generated originally.
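
As a rough sketch of what regenerating the data might look like, the snippet
below only builds command lines for the TPC-provided generators at scale
factor 1. It assumes the tools have been built locally as dbgen (TPC-H) and
dsdgen (TPC-DS); the binary paths and output directory are hypothetical, the
flags should be checked against the README of the generator version in use,
and none of this reproduces any postprocessing our own files may have had.

```python
import subprocess


def tpch_command(dbgen="./dbgen", scale=1):
    """Build a TPC-H dbgen command line; -s selects the scale factor.

    The default binary path is a placeholder, not a known location.
    """
    return [dbgen, "-s", str(scale)]


def tpcds_command(out_dir, dsdgen="./dsdgen", scale=1):
    """Build a TPC-DS dsdgen command line writing into out_dir.

    -SCALE and -DIR are the option names assumed here; verify them
    against the toolkit you actually built.
    """
    return [dsdgen, "-SCALE", str(scale), "-DIR", out_dir]


def run(command, cwd=None):
    """Run a generator and raise if it exits with a non-zero status."""
    subprocess.check_call(command, cwd=cwd)


# Example (not executed here):
#   run(tpch_command(), cwd="/path/to/tpch/dbgen")
#   run(tpcds_command("/tmp/tpcds-sf1"), cwd="/path/to/tpcds/tools")
```

The generated files would still need to be placed where the data load scripts
expect them, which this sketch does not attempt.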

On Wed, Jul 6, 2016 at 10:52 PM, Valencia Serrao <vs...@us.ibm.com> wrote:

> Hi Casey/Alex/Tim,
>
> I need to know whether it is possible to generate the tpch and tpcds data
> without using the tars you provided at the URL below:
> https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
>
> I ask because when I tried to load data without the tpch and tpcds tars,
> the functional-query data loaded successfully, but the TPC-H data load
> step failed with the following error:
>
> Error: Error while compiling statement: FAILED: SemanticException Line
> 1:23 Invalid path ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No
> files matching path file: /ImpalaPPC/testdata/impala-data/tpch/lineitem
> (state=42000,code=40000)
> org.apache.hive.service.cli.HiveSQLException: Error while compiling
> statement: FAILED: SemanticException Line 1:23 Invalid path
> ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching path
> file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
>         at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:235)
>         at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:221)
>         at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
>         at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
>         at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
>         at org.apache.hive.beeline.Commands.sql(Commands.java:976)
>         at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1085)
>         at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:917)
>         at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:895)
>         at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:837)
>         at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:482)
>         at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while
> compiling statement: FAILED: SemanticException Line 1:23 Invalid path
> ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching path
> file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
>
>
> Regards,
> Valencia
>
> From: Casey Ching <ca...@cloudera.com>
> To: Alex Behm <al...@cloudera.com>, dev@impala.incubator.apache.org
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
> Date: 05/04/2016 11:51 AM
> Subject: Re: Fw: Issues with generating testdata for Impala
> ------------------------------
>
>
>
> Comment inline below
>
> On May 3, 2016 at 11:18:06 PM, Alex Behm (*alex.behm@cloudera.com*
> <al...@cloudera.com>) wrote:
>
>    Hi Valencia,
>
>       I'm sorry you are having so much trouble with our setup. Let's see
>       what we
>       can do.
>
>       There was an infra issue with receiving the logs you sent me. The
>       email/attachment got rejected on our side. Maybe you can upload the
>       logs
>       somewhere so I can grab them?
>
>       See more responses inline below.
>
>       On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com>
>       wrote:
>
>       > Hi Alex,
>       >
>       > I was going deeper through the logs. I have some findings and queries:
>       >
>       > 1. At the "Invalidating Metadata" step (as mentioned in below
>       mail), i
>       > noticed that, it is trying to use kerberos. Perhaps, this is
>       preventing the
>       > testdata generation from proceeding, as we are not using
>       Kerberos.
>       > I need to know how this can be done without involving Kerberos
>       support ?
>       >
>       Kerberos is certainly not needed to build and run tests.
>
>       >
>       > 2. I had executed the fe tests despite the incomplete testdata
>       generation,
>       > the tests started and surely have failed. Many of these (null
>       pointer
>       > exception in AuthorzationTests) have a common cause: "tpch
>       database does
>       > not exist."
>       > e.g. as shown in
>       .Impala/cluster_logs/query_tests/test-run-workload.log.
>       >
>       > Does the "tpch" database gets created after the current blocker
>       step
>       > "Invalidating Metadata" ?
>       >
>
>       Yes, the TPCH database is created and loaded as part of that first
>       phase.
>       However, the data files are not yet publicly accessible. Let me
>       work on
>       that from my side, and get back to you soon. One way or the other
>       we'll be
>       able to provide you with the data.
>
>
> The data is at
> *https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp*
> <https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp>
> . The files are split into 50 MB pieces for git. You can put them back
> together as is done in
> *https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile*
> <https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile>
>
>
>       >
>       > 3. In the fe test console output log, another error shown:
>       > ============================= test session starts
>       > ==============================
>       > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
>       > rootdir: /work/, inifile:
>       > plugins: random, xdist
>       > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>       >
>       > These are not present/created on my vm. May i know when these get
>       created ?
>       >
>       > 4. Could you also share the total number of fe tests ?
>       >
>
>       I'll privately send you the console output from a successful FE
>       run.
>       Hopefully that can help.
>
>       Cheers,
>
>       Alex
>
>       >
>       >
>       > Looking forward to your reply.
>       >
>       > Regards,
>       > Valencia
>       >
>       >
>       >
>       > From: Valencia Serrao/Austin/Contr/IBM
>       > To: dev@impala.incubator.apache.org, Alex Behm <
>       alex.behm@cloudera.com>
>       > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>       > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
>       Serrao/Austin/Contr/IBM@IBMUS
>       > Date: 04/30/2016 09:05 AM
>       > Subject: Fw: Issues with generating testdata for Impala
>       > ------------------------------
>       >
>       >
>       >
>       > Hi Alex,
>       >
>       > I've been able to make some progress on testdata generation,
>       however, i
>       > still face the following issues:
>       >
>       >
>       >
>       *******************************************************************************************************************************************************************
>
>       > Invalidating Metadata
>       >
>       >
>       (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
>
>       > INSERT OVERWRITE TABLE functional_parquet.alltypes partition
>       (year, month)
>       > SELECT id, bool_col, tinyint_col, smallint_col, int_col,
>       bigint_col,
>       > float_col, double_col, date_string_col, string_col,
>       timestamp_col, year,
>       > month
>       > FROM functional.alltypes
>       >
>       > Data Loading from Impala failed with error:
>       ImpalaBeeswaxException:
>       > INNER EXCEPTION: <class 'socket.error'>
>       > MESSAGE: [Errno 104] Connection reset by peer
>       > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh
>       at line
>       > 41: while [ -n "$*" ]
>       > Error in /root/nishidha/Impala/buildall.sh at line 368:
>       > ${IMPALA_HOME}/testdata/bin/create-load-data.sh
>       ${CREATE_LOAD_DATA_ARGS}
>       > <<< Y
>       >
>       >
>       *************************************************************************************************************************************************************************
>
>       >
>       > i continued with fe tests as is. Here is the complete output log.
>       > [attachment "fe_test_output.zip" deleted by Valencia
>       > Serrao/Austin/Contr/IBM]
>       >
>       > Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
>       > Serrao/Austin/Contr/IBM]
>       >
>       > Kindly guide me on the same.
>       >
>       > Regards,
>       > Valencia
>       > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016
>       10:57 AM
>       > -----
>       >
>       > From: Sudarshan Jagadale/Austin/Contr/IBM
>       > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>       > Date: 04/29/2016 10:49 AM
>       > Subject: Fw: Issues with generating testdata for Impala
>       > ------------------------------
>       >
>       >
>       > FYI
>       > Thanks and Regards
>       > Sudarshan Jagadale
>       > Power Open Source Solutions
>       > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on
>       04/29/2016 10:48
>       > AM -----
>       >
>       > From: Alex Behm <al...@cloudera.com>
>       > To: dev@impala.incubator.apache.org
>       > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
>       > Panpaliya/Austin/Contr/IBM@IBMUS
>       > Date: 04/28/2016 09:34 PM
>       > Subject: Re: Issues with generating testdata for Impala
>       > ------------------------------
>       >
>       >
>       >
>       > Hi Valencia,
>       >
>       > sorry I did not get the attachment. Would you be able to tar.gz
>       and attach
>       > the whole cluster_logs directory?
>       >
>       > Alex
>       >
>       > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <*
>       vserrao@us.ibm.com*
>       > <vs...@us.ibm.com>> wrote:
>       >
>       > Hi Alex,
>       >
>       > I tried building impala again with the following:
>       > HDFS CDH 5.7.0 (
>       > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
>       > )
>       > HBASE CDH 5.7.0 SNAPSHOT (
>       > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
>       > )
>       > - this required patching in a fix (
>       > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
>       > )
>       > HIVE CDH 5.8.0 SNAPSHOT
>       >
>       > With the above combination, I'm able to move past the exception and
>       > also have the RegionServer service up and running. However, it now
>       > gives the error below:
>       >
>       >
>       >
>       ********************************************************************************************************************
>
>       >
>       (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>       > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
>       > d1 DECIMAL,
>       > d2 DECIMAL(10, 0),
>       > d3 DECIMAL(20, 10),
>       > d4 DECIMAL(38, 38),
>       > d5 DECIMAL(10, 5))
>       > PARTITIONED BY (d6 DECIMAL(9, 0))
>       > ROW FORMAT delimited fields terminated by ','
>       > STORED AS TEXTFILE
>       > LOCATION '/test-warehouse/decimal_tbl'
>       >
>       >
>       (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>       > USE functional
>       >
>       >
>       (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
>       > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
>       >
>       > Data Loading from Impala failed with error:
>       ImpalaBeeswaxException:
>       > INNER EXCEPTION: <class
>       > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
>       > MESSAGE:
>       > Error: null
>       >
>       >
>       ******************************************************************************************************************
>
>       >
>       > Here is the complete log for the same. *(See attached file:
>       > data-load-functional-exhaustive.log)*
>       >
>       > It would be great if you could guide me on this issue, so I could
>       > proceed with the fe tests.
>       >
>       > Still awaiting link to the source code of HDFS CDH 5.8.0
>       >
>       > Regards,
>       > Valencia
>       >
>       >
>       >
>       >
>
>
>
>

Re: Fw: Issues with generating testdata for Impala

Posted by Valencia Serrao <vs...@us.ibm.com>.
Hi Casey/Alex/Tim,

I need to know whether it is possible to generate the tpch and tpcds data
without using the tars you provided at the URL below:
https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp

I ask because when I tried to load data without the tpch and tpcds tars,
the functional-query data loaded successfully, but the TPC-H data load
step failed with the following error:

Error: Error while compiling statement: FAILED: SemanticException Line 1:23
Invalid path ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files
matching path file: /ImpalaPPC/testdata/impala-data/tpch/lineitem
(state=42000,code=40000)
org.apache.hive.service.cli.HiveSQLException: Error while compiling
statement: FAILED: SemanticException Line 1:23 Invalid path
''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching path
file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:235)
        at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:221)
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
        at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
        at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
        at org.apache.hive.beeline.Commands.sql(Commands.java:976)
        at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1085)
        at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:917)
        at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:895)
        at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:837)
        at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:482)
        at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while
compiling statement: FAILED: SemanticException Line 1:23 Invalid path
''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files matching path
file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
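
The "No files matching path" message suggests that the local directory Hive
was pointed at is missing or empty. A minimal pre-flight check along these
lines could confirm that before the load step runs. The helper name and the
table list are illustrative only and are not part of the Impala scripts; the
data root is taken from the path in the error above.

```python
import os


def missing_table_dirs(data_root, tables):
    """Return the tables whose directory is absent or contains no files."""
    missing = []
    for table in tables:
        table_dir = os.path.join(data_root, table)
        has_files = os.path.isdir(table_dir) and any(
            os.path.isfile(os.path.join(table_dir, name))
            for name in os.listdir(table_dir)
        )
        if not has_files:
            missing.append(table)
    return missing


# Example (hypothetical table list):
#   missing_table_dirs("/ImpalaPPC/testdata/impala-data/tpch",
#                      ["lineitem", "orders"])
```

An empty result would mean each listed table has at least one data file on
local disk; it says nothing about whether the file contents are valid.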


Regards,
Valencia



From:	Casey Ching <ca...@cloudera.com>
To:	Alex Behm <al...@cloudera.com>,
            dev@impala.incubator.apache.org
Cc:	Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/04/2016 11:51 AM
Subject:	Re: Fw: Issues with generating testdata for Impala



Comment inline below




On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:


      Hi Valencia,

      I'm sorry you are having so much trouble with our setup. Let's see
      what we
      can do.

      There was an infra issue with receiving the logs you sent me. The
      email/attachment got rejected on our side. Maybe you can upload the
      logs
      somewhere so I can grab them?

      See more responses inline below.

      On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com>
      wrote:

      > Hi Alex,
      >
      > I was going deeper through the logs. I have some findings and queries:
      >
      > 1. At the "Invalidating Metadata" step (as mentioned in below
      mail), i
      > noticed that, it is trying to use kerberos. Perhaps, this is
      preventing the
      > testdata generation from proceeding, as we are not using Kerberos.
      > I need to know how this can be done without involving Kerberos
      support ?
      >
      Kerberos is certainly not needed to build and run tests.

      >
      > 2. I had executed the fe tests despite the incomplete testdata
      generation,
      > the tests started and surely have failed. Many of these (null
      pointer
      > exception in AuthorzationTests) have a common cause: "tpch database
      does
      > not exist."
      > e.g. as shown
      in .Impala/cluster_logs/query_tests/test-run-workload.log.
      >
      > Does the "tpch" database gets created after the current blocker
      step
      > "Invalidating Metadata" ?
      >

      Yes, the TPCH database is created and loaded as part of that first
      phase.
      However, the data files are not yet publicly accessible. Let me work
      on
      that from my side, and get back to you soon. One way or the other
      we'll be
      able to provide you with the data.


The data is at
https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
 . The files are split into 50 MB pieces for git. You can put them back
together as is done in
https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile


      >
      > 3. In the fe test console output log, another error shown:
      > ============================= test session starts
      > ==============================
      > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
      > rootdir: /work/, inifile:
      > plugins: random, xdist
      > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
      >
      > These are not present/created on my vm. May i know when these get
      created ?
      >
      > 4. Could you also share the total number of fe tests ?
      >

      I'll privately send you the console output from a successful FE run.
      Hopefully that can help.

      Cheers,

      Alex

      >
      >
      > Looking forward to your reply.
      >
      > Regards,
      > Valencia
      >
      >
      >
      > From: Valencia Serrao/Austin/Contr/IBM
      > To: dev@impala.incubator.apache.org, Alex Behm
      <al...@cloudera.com>
      > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
      Serrao/Austin/Contr/IBM@IBMUS
      > Date: 04/30/2016 09:05 AM
      > Subject: Fw: Issues with generating testdata for Impala
      > ------------------------------
      >
      >
      >
      > Hi Alex,
      >
      > I've been able to make some progress on testdata generation,
      however, i
      > still face the following issues:
      >
      >
      >
      *******************************************************************************************************************************************************************

      > Invalidating Metadata
      >
      >
      (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

      > INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year,
      month)
      > SELECT id, bool_col, tinyint_col, smallint_col, int_col,
      bigint_col,
      > float_col, double_col, date_string_col, string_col, timestamp_col,
      year,
      > month
      > FROM functional.alltypes
      >
      > Data Loading from Impala failed with error: ImpalaBeeswaxException:

      > INNER EXCEPTION: <class 'socket.error'>
      > MESSAGE: [Errno 104] Connection reset by peer
      > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at
      line
      > 41: while [ -n "$*" ]
      > Error in /root/nishidha/Impala/buildall.sh at line 368:
      > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}
      > <<< Y
      >
      >
      *************************************************************************************************************************************************************************

      >
      > i continued with fe tests as is. Here is the complete output log.
      > [attachment "fe_test_output.zip" deleted by Valencia
      > Serrao/Austin/Contr/IBM]
      >
      > Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia
      > Serrao/Austin/Contr/IBM]
      >
      > Kindly guide me on the same.
      >
      > Regards,
      > Valencia
      > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016
      10:57 AM
      > -----
      >
      > From: Sudarshan Jagadale/Austin/Contr/IBM
      > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
      > Date: 04/29/2016 10:49 AM
      > Subject: Fw: Issues with generating testdata for Impala
      > ------------------------------
      >
      >
      > FYI
      > Thanks and Regards
      > Sudarshan Jagadale
      > Power Open Source Solutions
      > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on
      04/29/2016 10:48
      > AM -----
      >
      > From: Alex Behm <al...@cloudera.com>
      > To: dev@impala.incubator.apache.org
      > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      > Panpaliya/Austin/Contr/IBM@IBMUS
      > Date: 04/28/2016 09:34 PM
      > Subject: Re: Issues with generating testdata for Impala
      > ------------------------------
      >
      >
      >
      > Hi Valencia,
      >
      > sorry I did not get the attachment. Would you be able to tar.gz and
      attach
      > the whole cluster_logs directory?
      >
      > Alex
      >
      > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao
      <*vserrao@us.ibm.com*
      > <vs...@us.ibm.com>> wrote:
      >
      > Hi Alex,
      >
      > I tried building impala again with the following:
      > HDFS CDH 5.7.0 (
      > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
      > )
      > HBASE CDH 5.7.0 SNAPSHOT (
      > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
      > )
      > - this required patching in a fix (
      > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
      > )
      > HIVE CDH 5.8.0 SNAPSHOT
      >
      > With the above combination, I'm able to move past the exception and
      > also have the RegionServer service up and running. However, it now
      > gives the error below:
      >
      >
      >
      ********************************************************************************************************************

      >
      (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

      > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
      > d1 DECIMAL,
      > d2 DECIMAL(10, 0),
      > d3 DECIMAL(20, 10),
      > d4 DECIMAL(38, 38),
      > d5 DECIMAL(10, 5))
      > PARTITIONED BY (d6 DECIMAL(9, 0))
      > ROW FORMAT delimited fields terminated by ','
      > STORED AS TEXTFILE
      > LOCATION '/test-warehouse/decimal_tbl'
      >
      >
      (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

      > USE functional
      >
      >
      (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

      > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
      >
      > Data Loading from Impala failed with error: ImpalaBeeswaxException:

      > INNER EXCEPTION: <class
      > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
      > MESSAGE:
      > Error: null
      >
      >
      ******************************************************************************************************************

      >
      > Here is the complete log for the same. *(See attached file:
      > data-load-functional-exhaustive.log)*
      >
      > It would be great if you could guide me on this issue, so I could
      > proceed with the fe tests.
      >
      > Still awaiting link to the source code of HDFS CDH 5.8.0
      >
      > Regards,
      > Valencia
      >
      >
      >
      >


Re: Fw: Issues with generating testdata for Impala

Posted by Casey Ching <ca...@cloudera.com>.
Comment inline below


On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com) wrote:

Hi Valencia,  

I'm sorry you are having so much trouble with our setup. Let's see what we  
can do.  

There was an infra issue with receiving the logs you sent me. The  
email/attachment got rejected on our side. Maybe you can upload the logs  
somewhere so I can grab them?  

See more responses inline below.  

On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <vs...@us.ibm.com> wrote:  

> Hi Alex,  
>  
> I was going deeper through the logs. I have some findings and queries:
>  
> 1. At the "Invalidating Metadata" step (as mentioned in below mail), i  
> noticed that, it is trying to use kerberos. Perhaps, this is preventing the  
> testdata generation from proceeding, as we are not using Kerberos.  
> I need to know how this can be done without involving Kerberos support ?  
>  
Kerberos is certainly not needed to build and run tests.  

>  
> 2. I had executed the fe tests despite the incomplete testdata generation,  
> the tests started and surely have failed. Many of these (null pointer  
> exception in AuthorzationTests) have a common cause: "tpch database does  
> not exist."  
> e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.  
>  
> Does the "tpch" database gets created after the current blocker step  
> "Invalidating Metadata" ?  
>  

Yes, the TPCH database is created and loaded as part of that first phase.  
However, the data files are not yet publicly accessible. Let me work on  
that from my side, and get back to you soon. One way or the other we'll be  
able to provide you with the data.  


The data is at https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp . The files are split into 50 MB pieces for git. You can put them back together as is done in https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
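
For anyone not using the Dockerfile, putting the pieces back together amounts
to concatenating them in order. Here is a minimal sketch, assuming the pieces
share a common prefix and sort lexicographically into the right order (as
suffixes produced by split normally do). The file names in the example are
hypothetical; check the repository for the real ones.

```python
import glob
import shutil


def reassemble(pattern, output_path):
    """Concatenate all files matching pattern, in sorted order, into output_path.

    output_path must not itself match pattern. Returns the list of
    pieces that were joined.
    """
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError("no pieces match %r" % pattern)
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts


# Example (hypothetical names):
#   reassemble("/tmp/tpch.tar.gz.part_*", "/tmp/tpch.tar.gz")
```

This does the same job as a shell "cat pieces* > file"; the Dockerfile linked
above remains the reference for the actual piece names and target paths.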


>  
> 3. In the fe test console output log, another error shown:  
> ============================= test session starts  
> ==============================  
> platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2  
> rootdir: /work/, inifile:  
> plugins: random, xdist  
> ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
>  
> These are not present/created on my vm. May i know when these get created ?  
>  
> 4. Could you also share the total number of fe tests ?  
>  

I'll privately send you the console output from a successful FE run.  
Hopefully that can help.  

Cheers,  

Alex  

>  
>  
> Looking forward to your reply.  
>  
> Regards,  
> Valencia  
>  
>  
>  
> From: Valencia Serrao/Austin/Contr/IBM  
> To: dev@impala.incubator.apache.org, Alex Behm <al...@cloudera.com>  
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha  
> Panpaliya/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS  
> Date: 04/30/2016 09:05 AM  
> Subject: Fw: Issues with generating testdata for Impala  
> ------------------------------  
>  
>  
>  
> Hi Alex,  
>  
> I've been able to make some progress on testdata generation, however, i  
> still face the following issues:  
>  
>  
> *******************************************************************************************************************************************************************  
> Invalidating Metadata  
>  
> (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):  
> INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)  
> SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,  
> float_col, double_col, date_string_col, string_col, timestamp_col, year,  
> month  
> FROM functional.alltypes  
>  
> Data Loading from Impala failed with error: ImpalaBeeswaxException:  
> INNER EXCEPTION: <class 'socket.error'>  
> MESSAGE: [Errno 104] Connection reset by peer  
> Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line  
> 41: while [ -n "$*" ]  
> Error in /root/nishidha/Impala/buildall.sh at line 368:  
> ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS}  
> <<< Y  
>  
> *************************************************************************************************************************************************************************  
>  
> I continued with the FE tests as is. Here is the complete output log.
> [attachment "fe_test_output.zip" deleted by Valencia  
> Serrao/Austin/Contr/IBM]  
>  
> Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia  
> Serrao/Austin/Contr/IBM]  
>  
> Kindly guide me on the same.  
>  
> Regards,  
> Valencia  
> ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 04/29/2016 10:57 AM  
> -----  
>  
> From: Sudarshan Jagadale/Austin/Contr/IBM  
> To: Valencia Serrao/Austin/Contr/IBM@IBMUS  
> Date: 04/29/2016 10:49 AM  
> Subject: Fw: Issues with generating testdata for Impala  
> ------------------------------  
>  
>  
> FYI  
> Thanks and Regards  
> Sudarshan Jagadale  
> Power Open Source Solutions  
> ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 04/29/2016 10:48  
> AM -----  
>  
> From: Alex Behm <al...@cloudera.com>  
> To: dev@impala.incubator.apache.org  
> Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha  
> Panpaliya/Austin/Contr/IBM@IBMUS  
> Date: 04/28/2016 09:34 PM  
> Subject: Re: Issues with generating testdata for Impala  
> ------------------------------  
>  
>  
>  
> Hi Valencia,  
>  
> Sorry, I did not get the attachment. Would you be able to tar.gz and attach
> the whole cluster_logs directory?
>  
> Alex  
>  
> On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <vs...@us.ibm.com> wrote:
>  
> Hi Alex,  
>  
> I tried building Impala again with the following:
> HDFS CDH 5.7.0 (
> http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
> )
> HBASE CDH 5.7.0 SNAPSHOT (
> http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz )
> - this required patching in a fix (
> https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
> )
> HIVE CDH 5.8.0 SNAPSHOT
>  
> With the above combination, I'm able to move past the exception and
> also have the RegionServer service up and running. However, it now gives
> the error below:
>  
>  
> ********************************************************************************************************************  
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):  
> CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (  
> d1 DECIMAL,  
> d2 DECIMAL(10, 0),  
> d3 DECIMAL(20, 10),  
> d4 DECIMAL(38, 38),  
> d5 DECIMAL(10, 5))  
> PARTITIONED BY (d6 DECIMAL(9, 0))  
> ROW FORMAT delimited fields terminated by ','  
> STORED AS TEXTFILE  
> LOCATION '/test-warehouse/decimal_tbl'  
>  
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):  
> USE functional  
>  
> (load-functional-query-exhaustive-impala-generated-text-none-none.sql):  
> ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)  
>  
> Data Loading from Impala failed with error: ImpalaBeeswaxException:  
> INNER EXCEPTION: <class  
> 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>  
> MESSAGE:  
> Error: null  
>  
> ******************************************************************************************************************  
>  
> Here is the complete log for the same. *(See attached file:  
> data-load-functional-exhaustive.log)*  
>  
> It would be great if you could guide me on this issue so I can proceed
> with the FE tests.
>
> I am still awaiting a link to the source code of HDFS CDH 5.8.0.
>  
> Regards,  
> Valencia  
>  
>  
>  
>