You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by nonstop-qfchen <gi...@git.apache.org> on 2015/09/19 05:15:24 UTC

[GitHub] incubator-trafodion pull request: Jira1443 final committed

GitHub user nonstop-qfchen opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/87

    Jira1443 final committed

    Hi,
    
    Please kindly review the changes and pull the request if it is OK. 
    
    This set enables update statistics for native HBase and HIVE tables (aka JIRA1443).  The set includes the following changes. 
    0.  No change to the syntax and semantics of the update stats utility for native tables. 
    1. Use of External Tables as aliases to native tables, in the context of creating histogram IDs. The IDs used are actually those of the external tables. 
    2. The histogram tables for native tables are stored in schema TRAFODION._HBASESTATS_ and TRAFODION._HIVESTATS_, respectively.  There is no change to the location of the histogram tables for  Trafodion tables. 
    3. The restriction that a new table can not be created LIKE an external table is removed. This facilitates the creation of sample tables for the native tables
    4. The column stats for native tables now behave exactly the same as for Trafodion tables (which leads to many changes in the expected files for regression/hive).
    5. A work-around (disable PCODE for one query) is added to EXECUCTOR/TEST002. Without it, the compilation runs away in PCODE optimization. 
    6. TPC-DS data now is converted to UTF8 before loading into the HIVE tables. This is to fix a character set conversion error when the source string is "Côte d'Ivoire". The hex code for the 2nd character  in the un-converted string is 0xd4, not exactly a valid UTF8 character. After the conversion, the character is encoded with 2 bytes. 
    7. Two new tests added: seabase/TEST028 (update stats for native HBase tables), and hive/TEST021 (update stats for native hive tables - in TPC-DS)
    
    Thanks --Qifan

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nonstop-qfchen/incubator-trafodion jira1443_final_committed

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/87.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #87
    
----
commit 3d212c68f2a4bca5388178068117da5f89119b80
Author: Qifan Chen <qf...@dev02.trafodion.org>
Date:   2015-09-17T03:54:24Z

    jira-1443 final.

commit 890c306d2aa99635581d4abe5215ccd08fade280
Author: Qifan Chen <qf...@dev02.trafodion.org>
Date:   2015-09-17T14:49:24Z

    add a new regression test for upstats for native hbase tables

commit 8886fa81d810035a26535f945e9d889b718cd480
Author: Qifan Chen <qf...@dev02.trafodion.org>
Date:   2015-09-17T16:54:28Z

    fix hive stats not being used by the compiler

commit d3e4151308c0c0460b835d0162e6a6e009d6cd82
Author: Qifan Chen <qf...@dev02.trafodion.org>
Date:   2015-09-18T19:05:54Z

    Avoid creating an external table when the access is not originated from update stats.

commit 566ce15272c238621dc1d121f77137b8837e67e8
Author: Qifan Chen <qf...@dev02.trafodion.org>
Date:   2015-09-19T02:54:02Z

    Update regression expected files, mostly for HIVE and for EXECUTOR/TEST002

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40011275
  
    --- Diff: core/sql/optimizer/ObjectNames.cpp ---
    @@ -1385,6 +1386,17 @@ void QualifiedName::print(FILE* ofd, const char* indent, const char* title) cons
     #endif
     }
     
    +NABoolean QualifiedName::isHistograms() const
    +{
    +   return (getObjectName() == "SB_HISTOGRAMS");
    +}
    +
    +NABoolean QualifiedName::isHistogramIntervals() const
    +{
    +   return (getObjectName() == "SB_HISTOGRAM_INTERVALS");
    +}
    --- End diff --
    
    changed.
    
    On Mon, Sep 21, 2015 at 1:11 PM, Roberta Marton <no...@github.com>
    wrote:
    
    > In core/sql/optimizer/ObjectNames.cpp
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r40004717>
    > :
    >
    > > @@ -1385,6 +1386,17 @@ void QualifiedName::print(FILE* ofd, const char* indent, const char* title) cons
    > >  #endif
    > >  }
    > >
    > > +NABoolean QualifiedName::isHistograms() const
    > > +{
    > > +   return (getObjectName() == "SB_HISTOGRAMS");
    > > +}
    > > +
    > > +NABoolean QualifiedName::isHistogramIntervals() const
    > > +{
    > > +   return (getObjectName() == "SB_HISTOGRAM_INTERVALS");
    > > +}
    >
    > Might want to use the predefined values found in ComSmallDefs.h -
    > HBASE_HIST_NAME & HBASE_HISTINT_NAME
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r40004717>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40006091
  
    --- Diff: core/sql/ustat/hs_globals.cpp ---
    @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
             HSHandleError(retcode);
             TM->Commit(); // Must commit this transaction (even if schema didn't get created).
           }
    +                                              /*=====================================*/
    +                                              /*   CREATE HBASE STATS SCHEMA         */
    +                                              /*   typically as trafodion.hbasestats */
    +                                              /*=====================================*/
    +    if (isNativeHbaseCat(objDef->getCatName()))
    +      {
    +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    +        TM->Begin("Create schema for native hbase stats.");
    +        NAString ddl = "CREATE SCHEMA IF NOT EXISTS ";
    +        ddl.append(HBASE_STATS_CATALOG).append('.').append(HBASE_STATS_SCHEMA);
    --- End diff --
    
    Do you want these schemas to be private or shared?  All tables in a private schema are owned by the creator.  Tables in a shared schema are owned by the creator. 
    Who owns these schemas?  You should add a authorization clause to specify the owner.   You could make DB__ROOT the owner or add a new role as the owner. 
    The MANAGE_STATISTICS is a privilege we support today.  Anyone with the MANAGE_STATISTICS privilege can perform update statistics.  Will this work today with sample tables.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39989410
  
    --- Diff: core/sql/regress/seabase/dml.sb028 ---
    @@ -0,0 +1,11 @@
    +
    --- End diff --
    
    Thanks.  Will include the copyright notice.
    
    On Mon, Sep 21, 2015 at 11:03 AM, DaveBirdsall <no...@github.com>
    wrote:
    
    > In core/sql/regress/seabase/dml.sb028
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r39988941>
    > :
    >
    > > @@ -0,0 +1,11 @@
    > > +
    >
    > As should this
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r39988941>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39989155
  
    --- Diff: core/sql/regress/executor/TEST002 ---
    @@ -661,7 +661,9 @@ left outer join (
       ) T) T) T) T) T) T
     on (t002t6.c = T.a)
     ;
    +
     control query shape merge_join(cut,cut);
    +cqd PCODE_OPT_LEVEL '0';
    --- End diff --
    
    Yes, I'll file a JIRA. In the process ....
    
    On Mon, Sep 21, 2015 at 10:59 AM, DaveBirdsall <no...@github.com>
    wrote:
    
    > In core/sql/regress/executor/TEST002
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r39988524>
    > :
    >
    > >  control query shape merge_join(cut,cut);
    > > +cqd PCODE_OPT_LEVEL '0';
    >
    > This is worrisome actually. Is there a follow-on JIRA to fix the Pcode
    > problems?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r39988524>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40009573
  
    --- Diff: core/sql/ustat/hs_globals.cpp ---
    @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
             HSHandleError(retcode);
             TM->Commit(); // Must commit this transaction (even if schema didn't get created).
           }
    +                                              /*=====================================*/
    +                                              /*   CREATE HBASE STATS SCHEMA         */
    +                                              /*   typically as trafodion.hbasestats */
    +                                              /*=====================================*/
    +    if (isNativeHbaseCat(objDef->getCatName()))
    +      {
    +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    +        TM->Begin("Create schema for native hbase stats.");
    +        NAString ddl = "CREATE SCHEMA IF NOT EXISTS ";
    +        ddl.append(HBASE_STATS_CATALOG).append('.').append(HBASE_STATS_SCHEMA);
    --- End diff --
    
    I will discuss this with you.  Sounds like we need to create two schemas
    (one for native HBase and one for  native HIVE) during the cluster setup
    step.  During update stats, it is too late because a regular user can
    perform the update stats.
    
    On Mon, Sep 21, 2015 at 1:21 PM, Roberta Marton <no...@github.com>
    wrote:
    
    > In core/sql/ustat/hs_globals.cpp
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r40006091>
    > :
    >
    > > @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
    > >          HSHandleError(retcode);
    > >          TM->Commit(); // Must commit this transaction (even if schema didn't get created).
    > >        }
    > > +                                              /*=====================================*/
    > > +                                              /*   CREATE HBASE STATS SCHEMA         */
    > > +                                              /*   typically as trafodion.hbasestats */
    > > +                                              /*=====================================*/
    > > +    if (isNativeHbaseCat(objDef->getCatName()))
    > > +      {
    > > +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    > > +        TM->Begin("Create schema for native hbase stats.");
    > > +        NAString ddl = "CREATE SCHEMA IF NOT EXISTS ";
    > > +        ddl.append(HBASE_STATS_CATALOG).append('.').append(HBASE_STATS_SCHEMA);
    >
    > Do you want these schemas to be private or shared? All tables in a private
    > schema are owned by the creator. Tables in a shared schema are owned by the
    > creator.
    > Who owns these schemas? You should add a authorization clause to specify
    > the owner. You could make DB__ROOT the owner or add a new role as the
    > owner.
    > The MANAGE_STATISTICS is a privilege we support today. Anyone with the
    > MANAGE_STATISTICS privilege can perform update statistics. Will this work
    > today with sample tables.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r40006091>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40006356
  
    --- Diff: core/sql/ustat/hs_globals.cpp ---
    @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
             HSHandleError(retcode);
             TM->Commit(); // Must commit this transaction (even if schema didn't get created).
           }
    +                                              /*=====================================*/
    +                                              /*   CREATE HBASE STATS SCHEMA         */
    +                                              /*   typically as trafodion.hbasestats */
    +                                              /*=====================================*/
    +    if (isNativeHbaseCat(objDef->getCatName()))
    +      {
    +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    +        TM->Begin("Create schema for native hbase stats.");
    +        NAString ddl = "CREATE SCHEMA IF NOT EXISTS ";
    +        ddl.append(HBASE_STATS_CATALOG).append('.').append(HBASE_STATS_SCHEMA);
    +        retcode = HSFuncExecQuery(ddl, -UERR_INTERNAL_ERROR, NULL,
    +                                  "Creating schema for native HBase statistics", NULL,
    +                                  NULL);
    +        HSHandleError(retcode);
    +        TM->Commit(); // Must commit this transaction (even if schema didn't get created).
    +      }
                                                   /*==============================*/
    --- End diff --
    
    Just curious, why do you commit a failure?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39987723
  
    --- Diff: core/sqf/sql/scripts/install_hadoop_regr_test_env ---
    @@ -214,6 +213,20 @@ fi
     
       cd $MY_TPCDS_DATA_DIR
     
    +  which iconv >>${MY_LOG_FILE} 2>&1
    +  if (( $? != 0 ))
    +  then
    +    echo "iconv utility not available. The data will be in ISO-8859-1 format."
    +  else
    +    echo "Converting the data into UTF-8 format ..."
    +    for t in date_dim time_dim item customer customer_demographics household_demographics customer_address store promotion store_sales
    +      do
    +        iconv -f ISO-8859-1 -t UTF-8 -o ${t}.utf8.dat ${t}.dat >>${MY_LOG_FILE} 2>&1
    +        mv ${t}.utf8.dat ${t}.dat
    --- End diff --
    
    This is not idempotent. (If we do it a second time we'll get invalid data.) If this script is interrupted and restarted from the beginning, do we get fresh copies of all the data?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by sureshsubbiah <gi...@git.apache.org>.
Github user sureshsubbiah commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39984765
  
    --- Diff: core/sql/regress/hive/EXPECTED021 ---
    @@ -0,0 +1,132 @@
    +>>obey TEST021(setup);
    +>>--------------------------------------------------------------------------
    +>>
    +>>set schema hive.hive;
    +
    +--- SQL operation complete.
    +>>cqd mode_seahive 'ON';
    +
    +--- SQL operation complete.
    +>>
    +>>obey TEST021(tests);
    +>>--------------------------------------------------------------------------
    +>>-- test update stats on native hive tables
    +>>
    +>>update statistics for table customer on every column sample random 2 percent;
    +
    +--- SQL operation complete.
    +>>
    +>>showstats for table customer on "C_BIRTH_YEAR" detail;
    +
    +Detailed Histogram data for Table HIVE.HIVE.CUSTOMER
    +Table ID: 8826569449427418914
    +
    +Hist ID:    1850933272
    +Column(s):  C_BIRTH_YEAR
    +Total Rows: 100000
    +Total UEC:  70
    +Low Value:  (1924)
    +High Value: (1992)
    +Intervals:  70
    +
    +Number    Rowcount         UEC Boundary
    +====== =========== =========== ======================================
    +     0           0           0 (1924)
    +     1        1300           1 (1924)
    +     2        1750           1 (1925)
    +     3        1600           1 (1926)
    +     4        1400           1 (1927)
    +     5        1250           1 (1928)
    +     6        1150           1 (1929)
    +     7        1400           1 (1930)
    +     8         800           1 (1931)
    +     9        1700           1 (1932)
    +    10        1300           1 (1933)
    +    11        1500           1 (1934)
    +    12        1550           1 (1935)
    +    13        1400           1 (1936)
    +    14        1700           1 (1937)
    +    15        1300           1 (1938)
    +    16        1800           1 (1939)
    +    17        1450           1 (1940)
    +    18        1450           1 (1941)
    +    19        1700           1 (1942)
    +    20        1750           1 (1943)
    +    21        1450           1 (1944)
    +    22        1150           1 (1945)
    +    23        1000           1 (1946)
    +    24        1250           1 (1947)
    +    25        1600           1 (1948)
    +    26        1550           1 (1949)
    +    27        1650           1 (1950)
    +    28        1550           1 (1951)
    +    29        1050           1 (1952)
    +    30        1500           1 (1953)
    +    31        1200           1 (1954)
    +    32        1700           1 (1955)
    +    33        1450           1 (1956)
    +    34        1600           1 (1957)
    +    35        1550           1 (1958)
    +    36        1100           1 (1959)
    +    37        1500           1 (1960)
    +    38        1750           1 (1961)
    +    39         950           1 (1962)
    +    40        1700           1 (1963)
    +    41        1500           1 (1964)
    +    42        1100           1 (1965)
    +    43        1350           1 (1966)
    +    44        1350           1 (1967)
    +    45        1400           1 (1968)
    +    46        1300           1 (1969)
    +    47        1700           1 (1970)
    +    48        2000           1 (1971)
    +    49        1300           1 (1972)
    +    50        1650           1 (1973)
    +    51         850           1 (1974)
    +    52        1050           1 (1975)
    +    53        1300           1 (1976)
    +    54        1000           1 (1977)
    +    55        1350           1 (1978)
    +    56        1700           1 (1979)
    +    57        1100           1 (1980)
    +    58        1650           1 (1981)
    +    59        1700           1 (1982)
    +    60        1150           1 (1983)
    +    61        1100           1 (1984)
    +    62        1850           1 (1985)
    +    63         950           1 (1986)
    +    64        1350           1 (1987)
    +    65        1850           1 (1988)
    +    66         950           1 (1989)
    +    67        1400           1 (1990)
    +    68        1250           1 (1991)
    +    69        1100           1 (1992)
    +    70        3200           1 (NULL)
    +
    --- End diff --
    
    With sampling can we get the same results every time for a consistent test?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39991101
  
    --- Diff: core/sql/ustat/hs_yacc.y ---
    @@ -60,6 +60,8 @@ extern THREAD_P HSGlobalsClass *hs_globals_y;
     extern int yylex(YYSTYPE * lvalp, void* scanner);
     //extern int yylex();
     
    +extern NABoolean isUpdatestatsStmt;
    --- End diff --
    
    Good point. I'll move it to hs_globals.
    
    On Mon, Sep 21, 2015 at 11:11 AM, DaveBirdsall <no...@github.com>
    wrote:
    
    > In core/sql/ustat/hs_yacc.y
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r39990031>
    > :
    >
    > > @@ -60,6 +60,8 @@ extern THREAD_P HSGlobalsClass *hs_globals_y;
    > >  extern int yylex(YYSTYPE * lvalp, void* scanner);
    > >  //extern int yylex();
    > >
    > > +extern NABoolean isUpdatestatsStmt;
    >
    > There's no object we can stick this global inside of?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r39990031>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40007563
  
    --- Diff: core/sql/ustat/hs_la.cpp ---
    @@ -994,13 +999,118 @@ Lng32 HSHiveTableDef::DescribeColumnNames()
     }
     
     //=====================================================
    +//
    +NAString HSHbaseTableDef::getHistLoc(formatType format) const
    +{
    +  if ( HSGlobalsClass::isNativeHbaseCat(getCatName(format))) {
    +    return HBASE_STATS_CATALOG "." HBASE_STATS_SCHEMA;
    +  } else {  
    +    NAString name(getCatName(format));
    +    name.append(".");
    +    name.append(getSchemaName(format));
    +    return name;
    +  }
    +}
    +
    +static 
    +Lng32 CreateExternalTable(const NAString& catName, const NAString& schName, const NAString& nativeTableName)
    +{
    +   HSLogMan *LM = HSLogMan::Instance();
    +   if (LM->LogNeeded())
    +      {
    +        snprintf(LM->msg, sizeof(LM->msg), "Creating external table table for %s on demand.",
    +                          nativeTableName.data());
    +        LM->Log(LM->msg);
    +      }
    +
    +   // do not have to worry about the catalog and schema for the new external table 
    +   // here. These names will be determined by the processing logic. 
    +   NAString ddl = "CREATE EXTERNAL TABLE ";
    +   ddl.append(nativeTableName);
    +   ddl.append(" FOR ");
    +   ddl.append(catName);
    +   ddl.append(".");
    +   ddl.append(schName);
    +   ddl.append(".");
    +   ddl.append(nativeTableName);
    +   ddl.append(" ATTRIBUTE ");
    +
    +   HSTranMan *TM = HSTranMan::Instance();
    +   if (TM->InTransaction())
    +      ddl.append(" AUDIT");
    +   else
    +      ddl.append("NO AUDIT");
    +
    +   Lng32 retcode = HSFuncExecDDL(ddl.data(), - UERR_INTERNAL_ERROR, NULL,
    +                            "Create external table", NULL);
    +
    +   if (retcode < 0 && LM->LogNeeded())
    +      {
    +        snprintf(LM->msg, sizeof(LM->msg), "Creation of the external table failed.");
    +        LM->Log(LM->msg);
    +      }
    +
    +   return retcode;
    +}
    +
    +NABoolean HSTableDef::setObjectUID(NABoolean createExternalTable)
    +{
    +  objectUID_ = naTbl_->objectUid().get_value();
    +
    +  if (createExternalTable && objectUID_ <= 0 && 
    +      HSGlobalsClass::isNativeCat(getCatName(EXTERNAL_FORMAT)) ) {
    +
    +    // If objectUID is not set, it means there is no corresponding
    +    // external table created for it. Need to create one here.
    +    NAString catName = getCatName(EXTERNAL_FORMAT);
    +    NAString schName = getSchemaName(EXTERNAL_FORMAT);
    +    NAString objName = getObjectName(EXTERNAL_FORMAT);
    +    Lng32 retcode = CreateExternalTable(catName, schName, objName);
    +
    +    if (retcode != 0)
    +      return FALSE;
     
    -NABoolean HSHbaseTableDef::objExists(labelDetail unused)
    +    setNATable();
    +    if (!naTbl_)
    +      return FALSE;
    +
    --- End diff --
    
    Do you get a reasonable error back if unable to create the external table or the special schemas?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39985378
  
    --- Diff: core/sql/regress/hive/EXPECTED021 ---
    @@ -0,0 +1,132 @@
    +>>obey TEST021(setup);
    +>>--------------------------------------------------------------------------
    +>>
    +>>set schema hive.hive;
    +
    +--- SQL operation complete.
    +>>cqd mode_seahive 'ON';
    +
    +--- SQL operation complete.
    +>>
    +>>obey TEST021(tests);
    +>>--------------------------------------------------------------------------
    +>>-- test update stats on native hive tables
    +>>
    +>>update statistics for table customer on every column sample random 2 percent;
    +
    +--- SQL operation complete.
    +>>
    +>>showstats for table customer on "C_BIRTH_YEAR" detail;
    +
    +Detailed Histogram data for Table HIVE.HIVE.CUSTOMER
    +Table ID: 8826569449427418914
    +
    +Hist ID:    1850933272
    +Column(s):  C_BIRTH_YEAR
    +Total Rows: 100000
    +Total UEC:  70
    +Low Value:  (1924)
    +High Value: (1992)
    +Intervals:  70
    +
    +Number    Rowcount         UEC Boundary
    +====== =========== =========== ======================================
    +     0           0           0 (1924)
    +     1        1300           1 (1924)
    +     2        1750           1 (1925)
    +     3        1600           1 (1926)
    +     4        1400           1 (1927)
    +     5        1250           1 (1928)
    +     6        1150           1 (1929)
    +     7        1400           1 (1930)
    +     8         800           1 (1931)
    +     9        1700           1 (1932)
    +    10        1300           1 (1933)
    +    11        1500           1 (1934)
    +    12        1550           1 (1935)
    +    13        1400           1 (1936)
    +    14        1700           1 (1937)
    +    15        1300           1 (1938)
    +    16        1800           1 (1939)
    +    17        1450           1 (1940)
    +    18        1450           1 (1941)
    +    19        1700           1 (1942)
    +    20        1750           1 (1943)
    +    21        1450           1 (1944)
    +    22        1150           1 (1945)
    +    23        1000           1 (1946)
    +    24        1250           1 (1947)
    +    25        1600           1 (1948)
    +    26        1550           1 (1949)
    +    27        1650           1 (1950)
    +    28        1550           1 (1951)
    +    29        1050           1 (1952)
    +    30        1500           1 (1953)
    +    31        1200           1 (1954)
    +    32        1700           1 (1955)
    +    33        1450           1 (1956)
    +    34        1600           1 (1957)
    +    35        1550           1 (1958)
    +    36        1100           1 (1959)
    +    37        1500           1 (1960)
    +    38        1750           1 (1961)
    +    39         950           1 (1962)
    +    40        1700           1 (1963)
    +    41        1500           1 (1964)
    +    42        1100           1 (1965)
    +    43        1350           1 (1966)
    +    44        1350           1 (1967)
    +    45        1400           1 (1968)
    +    46        1300           1 (1969)
    +    47        1700           1 (1970)
    +    48        2000           1 (1971)
    +    49        1300           1 (1972)
    +    50        1650           1 (1973)
    +    51         850           1 (1974)
    +    52        1050           1 (1975)
    +    53        1300           1 (1976)
    +    54        1000           1 (1977)
    +    55        1350           1 (1978)
    +    56        1700           1 (1979)
    +    57        1100           1 (1980)
    +    58        1650           1 (1981)
    +    59        1700           1 (1982)
    +    60        1150           1 (1983)
    +    61        1100           1 (1984)
    +    62        1850           1 (1985)
    +    63         950           1 (1986)
    +    64        1350           1 (1987)
    +    65        1850           1 (1988)
    +    66         950           1 (1989)
    +    67        1400           1 (1990)
    +    68        1250           1 (1991)
    +    69        1100           1 (1992)
    +    70        3200           1 (NULL)
    +
    --- End diff --
    
    The answer is no. That is the reason that showstats and explain output are
    excluded from the test.
    
    On Mon, Sep 21, 2015 at 10:30 AM, Suresh Subbiah <no...@github.com>
    wrote:
    
    > In core/sql/regress/hive/EXPECTED021
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r39984765>
    > :
    >
    > > +    56        1700           1 (1979)
    > > +    57        1100           1 (1980)
    > > +    58        1650           1 (1981)
    > > +    59        1700           1 (1982)
    > > +    60        1150           1 (1983)
    > > +    61        1100           1 (1984)
    > > +    62        1850           1 (1985)
    > > +    63         950           1 (1986)
    > > +    64        1350           1 (1987)
    > > +    65        1850           1 (1988)
    > > +    66         950           1 (1989)
    > > +    67        1400           1 (1990)
    > > +    68        1250           1 (1991)
    > > +    69        1100           1 (1992)
    > > +    70        3200           1 (NULL)
    > > +
    >
    > With sampling can we get the same results every time for a consistent test?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r39984765>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39990182
  
    --- Diff: core/sql/ustat/hs_parser.cpp ---
    @@ -428,25 +435,22 @@ Lng32 AddTableName( const hs_table_type type
                 LM->Log(LM->msg);
              }
     
    -        const char* period = strchr(hs_globals->catSch->data(), '.');
    -        NAString catName(hs_globals->catSch->data(),
    -                         period - hs_globals->catSch->data());
    +        NAString catName(hs_globals->objDef->getCatName());
    +
    +        *hs_globals->hstogram_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    +
    +        *hs_globals->hsintval_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    --- End diff --
    
    In the lines followed, each will get the corresponding table name. Here
    they are assigned with the identical catalog and schema name.
    
    On Mon, Sep 21, 2015 at 11:09 AM, DaveBirdsall <no...@github.com>
    wrote:
    
    > In core/sql/ustat/hs_parser.cpp
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r39989803>
    > :
    >
    > > @@ -428,25 +435,22 @@ Lng32 AddTableName( const hs_table_type type
    > >              LM->Log(LM->msg);
    > >           }
    > >
    > > -        const char* period = strchr(hs_globals->catSch->data(), '.');
    > > -        NAString catName(hs_globals->catSch->data(),
    > > -                         period - hs_globals->catSch->data());
    > > +        NAString catName(hs_globals->objDef->getCatName());
    > > +
    > > +        *hs_globals->hstogram_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    > > +
    > > +        *hs_globals->hsintval_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    >
    > Looks like hsintval_table will get the same value as hstogram_table? (At
    > least I can't see anything different on the right hand side of the two
    > assigns)
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r39989803>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39990031
  
    --- Diff: core/sql/ustat/hs_yacc.y ---
    @@ -60,6 +60,8 @@ extern THREAD_P HSGlobalsClass *hs_globals_y;
     extern int yylex(YYSTYPE * lvalp, void* scanner);
     //extern int yylex();
     
    +extern NABoolean isUpdatestatsStmt;
    --- End diff --
    
    There's no object we can stick this global inside of?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40004717
  
    --- Diff: core/sql/optimizer/ObjectNames.cpp ---
    @@ -1385,6 +1386,17 @@ void QualifiedName::print(FILE* ofd, const char* indent, const char* title) cons
     #endif
     }
     
    +NABoolean QualifiedName::isHistograms() const
    +{
    +   return (getObjectName() == "SB_HISTOGRAMS");
    +}
    +
    +NABoolean QualifiedName::isHistogramIntervals() const
    +{
    +   return (getObjectName() == "SB_HISTOGRAM_INTERVALS");
    +}
    --- End diff --
    
    Might want to use the predefined values found in ComSmallDefs.h - HBASE_HIST_NAME & HBASE_HISTINT_NAME


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39989803
  
    --- Diff: core/sql/ustat/hs_parser.cpp ---
    @@ -428,25 +435,22 @@ Lng32 AddTableName( const hs_table_type type
                 LM->Log(LM->msg);
              }
     
    -        const char* period = strchr(hs_globals->catSch->data(), '.');
    -        NAString catName(hs_globals->catSch->data(),
    -                         period - hs_globals->catSch->data());
    +        NAString catName(hs_globals->objDef->getCatName());
    +
    +        *hs_globals->hstogram_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    +
    +        *hs_globals->hsintval_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    --- End diff --
    
    Looks like hsintval_table will get the same value as hstogram_table? (At least I can't see anything different on the right hand side of the two assigns)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by sureshsubbiah <gi...@git.apache.org>.
Github user sureshsubbiah commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39984607
  
    --- Diff: core/sql/regress/hive/EXPECTED020 ---
    @@ -177,10 +177,10 @@ Suite 100             College
     LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
     ---- ---- ---- --------------------  --------  --------------------  ---------
     
    -3    .    4    root                                                  5.00E+001
    -2    1    3    hybrid_hash_join                                      5.00E+001
    -.    .    2    hive_scan                       STORE                 1.10E+001
    -.    .    1    hive_scan                       STORE_ORC             9.00E+000
    +3    .    4    root                                                  4.08E+003
    +1    2    3    nested_join                                           4.08E+003
    +.    .    2    hive_scan                       STORE_ORC             4.08E+001
    +.    .    1    hive_scan                       STORE                 1.00E+002
    --- End diff --
    
    Like this plan!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40010849
  
    --- Diff: core/sql/ustat/hs_la.cpp ---
    @@ -994,13 +999,118 @@ Lng32 HSHiveTableDef::DescribeColumnNames()
     }
     
     //=====================================================
    +//
    +NAString HSHbaseTableDef::getHistLoc(formatType format) const
    +{
    +  if ( HSGlobalsClass::isNativeHbaseCat(getCatName(format))) {
    +    return HBASE_STATS_CATALOG "." HBASE_STATS_SCHEMA;
    +  } else {  
    +    NAString name(getCatName(format));
    +    name.append(".");
    +    name.append(getSchemaName(format));
    +    return name;
    +  }
    +}
    +
    +static 
    +Lng32 CreateExternalTable(const NAString& catName, const NAString& schName, const NAString& nativeTableName)
    +{
    +   HSLogMan *LM = HSLogMan::Instance();
    +   if (LM->LogNeeded())
    +      {
    +        snprintf(LM->msg, sizeof(LM->msg), "Creating external table table for %s on demand.",
    +                          nativeTableName.data());
    +        LM->Log(LM->msg);
    +      }
    +
    +   // do not have to worry about the catalog and schema for the new external table 
    +   // here. These names will be determined by the processing logic. 
    +   NAString ddl = "CREATE EXTERNAL TABLE ";
    +   ddl.append(nativeTableName);
    +   ddl.append(" FOR ");
    +   ddl.append(catName);
    +   ddl.append(".");
    +   ddl.append(schName);
    +   ddl.append(".");
    +   ddl.append(nativeTableName);
    +   ddl.append(" ATTRIBUTE ");
    +
    +   HSTranMan *TM = HSTranMan::Instance();
    +   if (TM->InTransaction())
    +      ddl.append(" AUDIT");
    +   else
    +      ddl.append("NO AUDIT");
    +
    +   Lng32 retcode = HSFuncExecDDL(ddl.data(), - UERR_INTERNAL_ERROR, NULL,
    +                            "Create external table", NULL);
    +
    +   if (retcode < 0 && LM->LogNeeded())
    +      {
    +        snprintf(LM->msg, sizeof(LM->msg), "Creation of the external table failed.");
    +        LM->Log(LM->msg);
    +      }
    +
    +   return retcode;
    +}
    +
    +NABoolean HSTableDef::setObjectUID(NABoolean createExternalTable)
    +{
    +  objectUID_ = naTbl_->objectUid().get_value();
    +
    +  if (createExternalTable && objectUID_ <= 0 && 
    +      HSGlobalsClass::isNativeCat(getCatName(EXTERNAL_FORMAT)) ) {
    +
    +    // If objectUID is not set, it means there is no corresponding
    +    // external table created for it. Need to create one here.
    +    NAString catName = getCatName(EXTERNAL_FORMAT);
    +    NAString schName = getSchemaName(EXTERNAL_FORMAT);
    +    NAString objName = getObjectName(EXTERNAL_FORMAT);
    +    Lng32 retcode = CreateExternalTable(catName, schName, objName);
    +
    +    if (retcode != 0)
    +      return FALSE;
     
    -NABoolean HSHbaseTableDef::objExists(labelDetail unused)
    +    setNATable();
    +    if (!naTbl_)
    +      return FALSE;
    +
    --- End diff --
    
    Yes. the error code will be returned in comedians area.
    
    On Mon, Sep 21, 2015 at 1:32 PM, Roberta Marton <no...@github.com>
    wrote:
    
    > In core/sql/ustat/hs_la.cpp
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r40007563>
    > :
    >
    > >
    > > -NABoolean HSHbaseTableDef::objExists(labelDetail unused)
    > > +    setNATable();
    > > +    if (!naTbl_)
    > > +      return FALSE;
    > > +
    >
    > Do you get a reasonable error back if unable to create the external table
    > or the special schemas?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r40007563>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-trafodion/pull/87


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40009684
  
    --- Diff: core/sql/ustat/hs_parser.cpp ---
    @@ -428,25 +435,22 @@ Lng32 AddTableName( const hs_table_type type
                 LM->Log(LM->msg);
              }
     
    -        const char* period = strchr(hs_globals->catSch->data(), '.');
    -        NAString catName(hs_globals->catSch->data(),
    -                         period - hs_globals->catSch->data());
    +        NAString catName(hs_globals->objDef->getCatName());
    +
    +        *hs_globals->hstogram_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    +
    +        *hs_globals->hsintval_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    +
             NABoolean isHbaseOrHive = HSGlobalsClass::isHbaseCat(catName) ||
                                       HSGlobalsClass::isHiveCat(catName);
     
    -        *hs_globals->hstogram_table = 
    -            getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    -        if (isHbaseOrHive)
    +        if (isHbaseOrHive) {
               hs_globals->hstogram_table->append(".").append(HBASE_HIST_NAME);
    -        else
    -          hs_globals->hstogram_table->append(".HISTOGRAMS");
    -
    -        *hs_globals->hsintval_table = 
    -            getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    -        if (isHbaseOrHive)
               hs_globals->hsintval_table->append(".").append(HBASE_HISTINT_NAME);
    -        else
    +        } else {
    +          hs_globals->hstogram_table->append(".HISTOGRAMS");
               hs_globals->hsintval_table->append(".HISTOGRAM_INTERVALS");
    +        }
    --- End diff --
    
    I currently don't have a working SQL, but are histogram  table names HISTOGRAMS or SB_HISTOGRAMS?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39988941
  
    --- Diff: core/sql/regress/seabase/dml.sb028 ---
    @@ -0,0 +1,11 @@
    +
    --- End diff --
    
    As should this


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39989234
  
    --- Diff: core/sql/ustat/hs_globals.cpp ---
    @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
             HSHandleError(retcode);
             TM->Commit(); // Must commit this transaction (even if schema didn't get created).
           }
    +                                              /*=====================================*/
    +                                              /*   CREATE HBASE STATS SCHEMA         */
    +                                              /*   typically as trafodion.hbasestats */
    +                                              /*=====================================*/
    +    if (isNativeHbaseCat(objDef->getCatName()))
    +      {
    +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    +        TM->Begin("Create schema for native hbase stats.");
    --- End diff --
    
    Is it known that there is no user transaction active in this code path?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40008277
  
    --- Diff: core/sql/ustat/hs_globals.cpp ---
    @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
             HSHandleError(retcode);
             TM->Commit(); // Must commit this transaction (even if schema didn't get created).
           }
    +                                              /*=====================================*/
    +                                              /*   CREATE HBASE STATS SCHEMA         */
    +                                              /*   typically as trafodion.hbasestats */
    +                                              /*=====================================*/
    +    if (isNativeHbaseCat(objDef->getCatName()))
    +      {
    +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    +        TM->Begin("Create schema for native hbase stats.");
    +        NAString ddl = "CREATE SCHEMA IF NOT EXISTS ";
    +        ddl.append(HBASE_STATS_CATALOG).append('.').append(HBASE_STATS_SCHEMA);
    +        retcode = HSFuncExecQuery(ddl, -UERR_INTERNAL_ERROR, NULL,
    +                                  "Creating schema for native HBase statistics", NULL,
    +                                  NULL);
    +        HSHandleError(retcode);
    +        TM->Commit(); // Must commit this transaction (even if schema didn't get created).
    --- End diff --
    
    The comment is misleading.  The Commit method actually checks retched_ and
    if it is less than 0 indicating an error had occurred, the method does
    nothing more other than log the error code to ULOG.
    
    I'l fix the comment to indicate this.
    
    
    On Mon, Sep 21, 2015 at 1:25 PM, Roberta Marton <no...@github.com>
    wrote:
    
    > In core/sql/ustat/hs_globals.cpp
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r40006615>
    > :
    >
    > > @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
    > >          HSHandleError(retcode);
    > >          TM->Commit(); // Must commit this transaction (even if schema didn't get created).
    > >        }
    > > +                                              /*=====================================*/
    > > +                                              /*   CREATE HBASE STATS SCHEMA         */
    > > +                                              /*   typically as trafodion.hbasestats */
    > > +                                              /*=====================================*/
    > > +    if (isNativeHbaseCat(objDef->getCatName()))
    > > +      {
    > > +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    > > +        TM->Begin("Create schema for native hbase stats.");
    > > +        NAString ddl = "CREATE SCHEMA IF NOT EXISTS ";
    > > +        ddl.append(HBASE_STATS_CATALOG).append('.').append(HBASE_STATS_SCHEMA);
    > > +        retcode = HSFuncExecQuery(ddl, -UERR_INTERNAL_ERROR, NULL,
    > > +                                  "Creating schema for native HBase statistics", NULL,
    > > +                                  NULL);
    > > +        HSHandleError(retcode);
    > > +        TM->Commit(); // Must commit this transaction (even if schema didn't get created).
    >
    > Just curious, why commit a failed request. If the create schema failed,
    > could it leave the metadata in on inconsistent state? i.e., creating a
    > schema updates several metadata tables, if the operation failed after
    > updating one table (e.g. OBJECTS) but before it completes another (e.g.
    > TABLES) there could be an issue.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r40006615>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39989059
  
    --- Diff: core/sqf/sql/scripts/install_hadoop_regr_test_env ---
    @@ -214,6 +213,20 @@ fi
     
       cd $MY_TPCDS_DATA_DIR
     
    +  which iconv >>${MY_LOG_FILE} 2>&1
    +  if (( $? != 0 ))
    +  then
    +    echo "iconv utility not available. The data will be in ISO-8859-1 format."
    +  else
    +    echo "Converting the data into UTF-8 format ..."
    +    for t in date_dim time_dim item customer customer_demographics household_demographics customer_address store promotion store_sales
    +      do
    +        iconv -f ISO-8859-1 -t UTF-8 -o ${t}.utf8.dat ${t}.dat >>${MY_LOG_FILE} 2>&1
    +        mv ${t}.utf8.dat ${t}.dat
    --- End diff --
    
    The data set is generated from dsdgen step above with the last command
    being " ./dsdgen -force $FORCE -dir $MY_TPCDS_DATA_DIR -scale $SCALE -table
    promotion   >>${MY_LOG_FILE} 2>&1".
    
    So if we rerun the command again, the input to iconv will be the same as in
    the previous run.
    
    On Mon, Sep 21, 2015 at 10:53 AM, DaveBirdsall <no...@github.com>
    wrote:
    
    > In core/sqf/sql/scripts/install_hadoop_regr_test_env
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r39987723>
    > :
    >
    > > @@ -214,6 +213,20 @@ fi
    > >
    > >    cd $MY_TPCDS_DATA_DIR
    > >
    > > +  which iconv >>${MY_LOG_FILE} 2>&1
    > > +  if (( $? != 0 ))
    > > +  then
    > > +    echo "iconv utility not available. The data will be in ISO-8859-1 format."
    > > +  else
    > > +    echo "Converting the data into UTF-8 format ..."
    > > +    for t in date_dim time_dim item customer customer_demographics household_demographics customer_address store promotion store_sales
    > > +      do
    > > +        iconv -f ISO-8859-1 -t UTF-8 -o ${t}.utf8.dat ${t}.dat >>${MY_LOG_FILE} 2>&1
    > > +        mv ${t}.utf8.dat ${t}.dat
    >
    > This is not idempotent. (If we do it a second time we'll get invalid
    > data.) If this script is interrupted and restarted from the beginning, do
    > we get fresh copies of all the data?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r39987723>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40003413
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -1573,7 +1573,8 @@ NATable *BindWA::getNATable(CorrName& corrName,
       // native HIVE or HBASE objects unless the allowExternalTables flag is set.  
       // allowExternalTables is set for drop table and SHOWDDL statements.  
       // TDB - may want to merge the Trafodion version with the native version.
    -  if ((table) && (table->isExternalTable() && (! bindWA->allowExternalTables())))
    +  if ((table) && (table->isExternalTable() && (! bindWA->allowExternalTables()) &&
    --- End diff --
    
    As an option, you could set "allowExternalTables" in the code that get the NATable structure instead of adding another check.  Not sure how many places in Histogram code you need to get NATable information.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40004034
  
    --- Diff: core/sql/optimizer/NATable.cpp ---
    @@ -5110,6 +5138,12 @@ NATable::NATable(BindWA *bindWA,
         setIsExternalTable(TRUE);
       }
      
    +  if (qualifiedName_.getQualifiedNameObj().isHistograms() || 
    +      qualifiedName_.getQualifiedNameObj().isHistogramIntervals())
    --- End diff --
    
    CmpSeabaseDDL.h has a static method called isHistogramTable that checks for both histograms and histogram_intervals.  I might be better to call this in case you ever add another histograms table.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40006615
  
    --- Diff: core/sql/ustat/hs_globals.cpp ---
    @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
             HSHandleError(retcode);
             TM->Commit(); // Must commit this transaction (even if schema didn't get created).
           }
    +                                              /*=====================================*/
    +                                              /*   CREATE HBASE STATS SCHEMA         */
    +                                              /*   typically as trafodion.hbasestats */
    +                                              /*=====================================*/
    +    if (isNativeHbaseCat(objDef->getCatName()))
    +      {
    +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    +        TM->Begin("Create schema for native hbase stats.");
    +        NAString ddl = "CREATE SCHEMA IF NOT EXISTS ";
    +        ddl.append(HBASE_STATS_CATALOG).append('.').append(HBASE_STATS_SCHEMA);
    +        retcode = HSFuncExecQuery(ddl, -UERR_INTERNAL_ERROR, NULL,
    +                                  "Creating schema for native HBase statistics", NULL,
    +                                  NULL);
    +        HSHandleError(retcode);
    +        TM->Commit(); // Must commit this transaction (even if schema didn't get created).
    --- End diff --
    
    Just curious, why commit a failed request.  If the create schema failed, could it leave the metadata in on inconsistent state?  i.e., creating a schema updates several metadata tables, if the operation failed after updating one table (e.g. OBJECTS) but before it completes another (e.g. TABLES) there could be an issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40010341
  
    --- Diff: core/sql/ustat/hs_parser.cpp ---
    @@ -428,25 +435,22 @@ Lng32 AddTableName( const hs_table_type type
                 LM->Log(LM->msg);
              }
     
    -        const char* period = strchr(hs_globals->catSch->data(), '.');
    -        NAString catName(hs_globals->catSch->data(),
    -                         period - hs_globals->catSch->data());
    +        NAString catName(hs_globals->objDef->getCatName());
    +
    +        *hs_globals->hstogram_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    +
    +        *hs_globals->hsintval_table = getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    +
             NABoolean isHbaseOrHive = HSGlobalsClass::isHbaseCat(catName) ||
                                       HSGlobalsClass::isHiveCat(catName);
     
    -        *hs_globals->hstogram_table = 
    -            getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    -        if (isHbaseOrHive)
    +        if (isHbaseOrHive) {
               hs_globals->hstogram_table->append(".").append(HBASE_HIST_NAME);
    -        else
    -          hs_globals->hstogram_table->append(".HISTOGRAMS");
    -
    -        *hs_globals->hsintval_table = 
    -            getHistogramsTableLocation(hs_globals->catSch->data(), FALSE);
    -        if (isHbaseOrHive)
               hs_globals->hsintval_table->append(".").append(HBASE_HISTINT_NAME);
    -        else
    +        } else {
    +          hs_globals->hstogram_table->append(".HISTOGRAMS");
               hs_globals->hsintval_table->append(".HISTOGRAM_INTERVALS");
    +        }
    --- End diff --
    
    In trafodion, the test isHbaseOrHive will always be true. So we will always
    be using SB_HISTOGRAMS and SB_HISTOGRAM_INTERVALS.  The other two names are
    use by SQ.
    
    Maybe later on, we can do a cleanup.
    
    On Mon, Sep 21, 2015 at 1:49 PM, Roberta Marton <no...@github.com>
    wrote:
    
    > In core/sql/ustat/hs_parser.cpp
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r40009684>
    > :
    >
    > >            hs_globals->hsintval_table->append(".HISTOGRAM_INTERVALS");
    > > +        }
    >
    > I currently don't have a working SQL, but are histogram table names
    > HISTOGRAMS or SB_HISTOGRAMS?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r40009684>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39988915
  
    --- Diff: core/sql/regress/seabase/ddl.sb028 ---
    @@ -0,0 +1,23 @@
    +
    --- End diff --
    
    This file should also have an Apache copyright


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39988524
  
    --- Diff: core/sql/regress/executor/TEST002 ---
    @@ -661,7 +661,9 @@ left outer join (
       ) T) T) T) T) T) T
     on (t002t6.c = T.a)
     ;
    +
     control query shape merge_join(cut,cut);
    +cqd PCODE_OPT_LEVEL '0';
    --- End diff --
    
    This is worrisome actually. Is there a follow-on JIRA to fix the Pcode problems?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r39990016
  
    --- Diff: core/sql/ustat/hs_globals.cpp ---
    @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
             HSHandleError(retcode);
             TM->Commit(); // Must commit this transaction (even if schema didn't get created).
           }
    +                                              /*=====================================*/
    +                                              /*   CREATE HBASE STATS SCHEMA         */
    +                                              /*   typically as trafodion.hbasestats */
    +                                              /*=====================================*/
    +    if (isNativeHbaseCat(objDef->getCatName()))
    +      {
    +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    +        TM->Begin("Create schema for native hbase stats.");
    --- End diff --
    
    Yes. the Begin() and Commit() methods detect any existing transactions in
    progress and if one if found, these two methods are no op.
    
    On Mon, Sep 21, 2015 at 11:05 AM, DaveBirdsall <no...@github.com>
    wrote:
    
    > In core/sql/ustat/hs_globals.cpp
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r39989234>
    > :
    >
    > > @@ -2887,6 +2890,22 @@ Lng32 HSGlobalsClass::Initialize()
    > >          HSHandleError(retcode);
    > >          TM->Commit(); // Must commit this transaction (even if schema didn't get created).
    > >        }
    > > +                                              /*=====================================*/
    > > +                                              /*   CREATE HBASE STATS SCHEMA         */
    > > +                                              /*   typically as trafodion.hbasestats */
    > > +                                              /*=====================================*/
    > > +    if (isNativeHbaseCat(objDef->getCatName()))
    > > +      {
    > > +        HSTranMan *TM = HSTranMan::Instance(); // Must have transaction around this.
    > > +        TM->Begin("Create schema for native hbase stats.");
    >
    > Is it known that there is no user transaction active in this code path?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r39989234>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40007320
  
    --- Diff: core/sql/ustat/hs_la.cpp ---
    @@ -994,13 +999,118 @@ Lng32 HSHiveTableDef::DescribeColumnNames()
     }
     
     //=====================================================
    +//
    +NAString HSHbaseTableDef::getHistLoc(formatType format) const
    +{
    +  if ( HSGlobalsClass::isNativeHbaseCat(getCatName(format))) {
    +    return HBASE_STATS_CATALOG "." HBASE_STATS_SCHEMA;
    +  } else {  
    +    NAString name(getCatName(format));
    +    name.append(".");
    +    name.append(getSchemaName(format));
    +    return name;
    +  }
    +}
    +
    +static 
    +Lng32 CreateExternalTable(const NAString& catName, const NAString& schName, const NAString& nativeTableName)
    +{
    +   HSLogMan *LM = HSLogMan::Instance();
    +   if (LM->LogNeeded())
    +      {
    +        snprintf(LM->msg, sizeof(LM->msg), "Creating external table table for %s on demand.",
    +                          nativeTableName.data());
    +        LM->Log(LM->msg);
    +      }
    +
    +   // do not have to worry about the catalog and schema for the new external table 
    +   // here. These names will be determined by the processing logic. 
    +   NAString ddl = "CREATE EXTERNAL TABLE ";
    +   ddl.append(nativeTableName);
    +   ddl.append(" FOR ");
    +   ddl.append(catName);
    +   ddl.append(".");
    +   ddl.append(schName);
    +   ddl.append(".");
    +   ddl.append(nativeTableName);
    +   ddl.append(" ATTRIBUTE ");
    +
    +   HSTranMan *TM = HSTranMan::Instance();
    +   if (TM->InTransaction())
    +      ddl.append(" AUDIT");
    +   else
    +      ddl.append("NO AUDIT");
    +
    --- End diff --
    
    I always assumed that external tables are non-audited.  You are creating them either audited or non-audited.  Since these tables do not contain any Trafodion data, is there any reason to make them audited?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: Jira1443 final committed

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/87#discussion_r40009877
  
    --- Diff: core/sql/ustat/hs_la.cpp ---
    @@ -994,13 +999,118 @@ Lng32 HSHiveTableDef::DescribeColumnNames()
     }
     
     //=====================================================
    +//
    +NAString HSHbaseTableDef::getHistLoc(formatType format) const
    +{
    +  if ( HSGlobalsClass::isNativeHbaseCat(getCatName(format))) {
    +    return HBASE_STATS_CATALOG "." HBASE_STATS_SCHEMA;
    +  } else {  
    +    NAString name(getCatName(format));
    +    name.append(".");
    +    name.append(getSchemaName(format));
    +    return name;
    +  }
    +}
    +
    +static 
    +Lng32 CreateExternalTable(const NAString& catName, const NAString& schName, const NAString& nativeTableName)
    +{
    +   HSLogMan *LM = HSLogMan::Instance();
    +   if (LM->LogNeeded())
    +      {
    +        snprintf(LM->msg, sizeof(LM->msg), "Creating external table table for %s on demand.",
    +                          nativeTableName.data());
    +        LM->Log(LM->msg);
    +      }
    +
    +   // do not have to worry about the catalog and schema for the new external table 
    +   // here. These names will be determined by the processing logic. 
    +   NAString ddl = "CREATE EXTERNAL TABLE ";
    +   ddl.append(nativeTableName);
    +   ddl.append(" FOR ");
    +   ddl.append(catName);
    +   ddl.append(".");
    +   ddl.append(schName);
    +   ddl.append(".");
    +   ddl.append(nativeTableName);
    +   ddl.append(" ATTRIBUTE ");
    +
    +   HSTranMan *TM = HSTranMan::Instance();
    +   if (TM->InTransaction())
    +      ddl.append(" AUDIT");
    +   else
    +      ddl.append("NO AUDIT");
    +
    --- End diff --
    
    Good point.
    
    On Mon, Sep 21, 2015 at 1:30 PM, Roberta Marton <no...@github.com>
    wrote:
    
    > In core/sql/ustat/hs_la.cpp
    > <https://github.com/apache/incubator-trafodion/pull/87#discussion_r40007320>
    > :
    >
    > > +   NAString ddl = "CREATE EXTERNAL TABLE ";
    > > +   ddl.append(nativeTableName);
    > > +   ddl.append(" FOR ");
    > > +   ddl.append(catName);
    > > +   ddl.append(".");
    > > +   ddl.append(schName);
    > > +   ddl.append(".");
    > > +   ddl.append(nativeTableName);
    > > +   ddl.append(" ATTRIBUTE ");
    > > +
    > > +   HSTranMan *TM = HSTranMan::Instance();
    > > +   if (TM->InTransaction())
    > > +      ddl.append(" AUDIT");
    > > +   else
    > > +      ddl.append("NO AUDIT");
    > > +
    >
    > I always assumed that external tables are non-audited. You are creating
    > them either audited or non-audited. Since these tables do not contain any
    > Trafodion data, is there any reason to make them audited?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/87/files#r40007320>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---