Posted to dev@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2017/09/11 18:41:00 UTC

[jira] [Created] (HIVE-17505) hive.optimize.union.remove=true doesn't work with insert into

Eugene Koifman created HIVE-17505:
-------------------------------------

             Summary: hive.optimize.union.remove=true doesn't work with insert into
                 Key: HIVE-17505
                 URL: https://issues.apache.org/jira/browse/HIVE-17505
             Project: Hive
          Issue Type: Bug
            Reporter: Eugene Koifman


Add this test to TestTxnNoBuckets (the problem is not related to Acid; this is just a repro):
{noformat}
@Test
  public void testToAcidConversionMultiBucket() throws Exception {
    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_OPTIMIZE_UNION_REMOVE, true);
    hiveConf.setVar(HiveConf.ConfVars.HIVEFETCHTASKCONVERSION, "none");
    d.close();
    d = new Driver(hiveConf);
    int[][] values = {{1,2},{3,4},{5,6},{7,8},{9,10}};
    runStatementOnDriver("insert into " + Table.ACIDTBL + makeValuesClause(values));
    runStatementOnDriver("drop table if exists T");
    runStatementOnDriver("create table T (a int, b int) stored as ORC  TBLPROPERTIES ('transactional'='false')");//todo: try with T bucketed
//    runStatementOnDriver("insert into T select a,b from (" + "select a, b from " + Table.ACIDTBL + " where a <= 5 union all select a, b from " + Table.ACIDTBL + " where a >= 5" + ") S order by a, b");
    runStatementOnDriver("insert into T(a,b) select a, b from " + Table.ACIDTBL + " where a between 1 and 3 group by a, b union all select a, b from " + Table.ACIDTBL + " where a between 5 and 7 union all select a, b from " + Table.ACIDTBL + " where a >= 9");
    List<String> rs = runStatementOnDriver("select a, b, INPUT__FILE__NAME from T order by a, b, INPUT__FILE__NAME");
    LOG.warn("before converting to acid");
    for(String s : rs) {
      LOG.warn(s);
    }
  }
{noformat}
Running it creates the following staging directory layout:
{noformat}
ekoifman:apache-hive-3.0.0-SNAPSHOT-bin ekoifman$ tree ~/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnNoBuckets-1505153866252/warehouse/t/.hive-staging_hive_2017-09-11_11-18-48_614_1924461543400304640-1/
/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnNoBuckets-1505153866252/warehouse/t/.hive-staging_hive_2017-09-11_11-18-48_614_1924461543400304640-1/
├── -ext-10002
│   ├── 19
│   │   └── 000000_0
│   ├── 20
│   │   └── 000000_0
│   └── 21
│       └── 000000_0
└── _tmp.-ext-10000

5 directories, 3 files
{noformat}
However,
_Hive.copyFiles(HiveConf conf, Path srcf, Path destf, FileSystem fs, boolean isSrcLocal, boolean isAcid, List<Path> newFiles)_
bails out at:
{noformat}
    if (srcs == null) {
      LOG.info("No sources specified to move: " + srcf);
      return;
      // srcs = new FileStatus[0]; Why is this needed?
    }

{noformat}
As a result, nothing is moved into the table directory and the table T ends up empty.
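
For comparison, the rows that T should contain can be worked out from the test data and the three branches of the union all in the repro query. The sketch below is a standalone illustration only; the class and method names (ExpectedRows, expectedRows) are made up for this example and are not part of Hive or of the test class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not Hive code: evaluates the repro query's three
// UNION ALL branches in plain Java against the inserted test values.
public class ExpectedRows {

    static List<List<Integer>> expectedRows(int[][] values) {
        List<List<Integer>> out = new ArrayList<>();

        // Branch 1: "where a between 1 and 3 group by a, b".
        // GROUP BY a, b keeps one row per distinct (a, b) pair.
        Set<List<Integer>> grouped = new LinkedHashSet<>();
        for (int[] r : values) {
            if (r[0] >= 1 && r[0] <= 3) {
                grouped.add(Arrays.asList(r[0], r[1]));
            }
        }
        out.addAll(grouped);

        // Branch 2: "where a between 5 and 7" (no grouping).
        for (int[] r : values) {
            if (r[0] >= 5 && r[0] <= 7) {
                out.add(Arrays.asList(r[0], r[1]));
            }
        }

        // Branch 3: "where a >= 9" (no grouping).
        for (int[] r : values) {
            if (r[0] >= 9) {
                out.add(Arrays.asList(r[0], r[1]));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Same values the test inserts into ACIDTBL.
        int[][] values = {{1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}};
        for (List<Integer> row : expectedRows(values)) {
            System.out.println(row.get(0) + "\t" + row.get(1));
        }
    }
}
```

With the values from the test, the branches contribute two, two and one rows respectively, so the final select from T should return five rows rather than none.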

cc [~ashutoshc], [~jcamachorodriguez]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)