Posted to dev@pig.apache.org by "Krzysztof Indyk (JIRA)" <ji...@apache.org> on 2015/10/17 14:38:05 UTC

[jira] [Created] (PIG-4705) Error Schema for data cannot be determined using HCatalog

Krzysztof Indyk created PIG-4705:
------------------------------------

             Summary: Error Schema for data cannot be determined using HCatalog
                 Key: PIG-4705
                 URL: https://issues.apache.org/jira/browse/PIG-4705
             Project: Pig
          Issue Type: Bug
          Components: tez
    Affects Versions: 0.15.0
         Environment: HDP 2.3.2
            Reporter: Krzysztof Indyk


When we use {{HCatalog}} as both the source and the destination of data for {{Pig}} on {{Tez}}, we get ??ERROR 1115: Schema for data cannot be determined??.
Pig works fine when we run on MapReduce, or when we use HCatalog as only one of the endpoints, e.g. loading data directly from a file and storing it using HCatalog.

The error appeared after upgrading from {{Pig 0.14}} on {{Tez 0.5.2}} to {{Pig 0.15}} on {{Tez 0.7.0}} ({{HDP 2.2.6}} to {{HDP 2.3.2}}).

To reproduce:
- create the Hive tables from hive_tables.hql
- load data into table_input from sample.csv
- run the following Pig script on Tez

{code}
data = LOAD 'table_input' USING org.apache.hive.hcatalog.pig.HCatLoader();
items_unique = DISTINCT data;

counted = FOREACH (GROUP items_unique BY col2)
    GENERATE
      group AS name,
      COUNT(items_unique) AS value;

STORE counted INTO 'table_output' USING org.apache.hive.hcatalog.pig.HCatStorer();
{code}
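
For comparison, here is a rough sketch of the variant that works for us, where HCatalog is used only for the store and the input is read directly from a file. The file path, the comma delimiter, and the column name col1 and both column types are assumptions for illustration only; the real schema is the one defined in hive_tables.hql.

{code}
-- Assumption: sample.csv is comma-delimited with two string columns;
-- col1 and the types are hypothetical, col2 is the column used in the script above.
data = LOAD 'sample.csv' USING PigStorage(',') AS (col1:chararray, col2:chararray);
items_unique = DISTINCT data;

counted = FOREACH (GROUP items_unique BY col2)
    GENERATE
      group AS name,
      COUNT(items_unique) AS value;

-- HCatalog is used only on the output side here.
STORE counted INTO 'table_output' USING org.apache.hive.hcatalog.pig.HCatStorer();
{code}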



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)