Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/05/08 07:44:06 UTC

[GitHub] [iceberg] liubo1022126 opened a new issue #2567: cannot find field xxx at Map operator initialization failed when iceberg table join hive table

liubo1022126 opened a new issue #2567:
URL: https://github.com/apache/iceberg/issues/2567


   Hi
   
   I have two tables: table A is in Iceberg format and table B is in plain textfile format.
   
   Table A looks like:
   ROW FORMAT SERDE 
     'org.apache.iceberg.mr.hive.HiveIcebergSerDe' 
   STORED BY 
     'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' 
   
   Table B looks like:
   ROW FORMAT SERDE 
     'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
   WITH SERDEPROPERTIES ( 
     'field.delim'='', 
     'line.delim'='\n', 
     'serialization.format'='', 
     'serialization.null.format'='') 
   STORED AS INPUTFORMAT 
     'org.apache.hadoop.mapred.TextInputFormat' 
   OUTPUTFORMAT 
     'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
   
   When I run a query in which table A left-joins table B (table A has 400W, i.e. about 4 million, records and table B has 10W, i.e. about 100,000), I get the error below. When table B has only 5W (about 50,000) records, the join succeeds.
   
   Error: java.lang.RuntimeException: Error in configuring object
   	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:113)
   	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:79)
   	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:137)
   	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
   	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
   	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
   Caused by: java.lang.reflect.InvocationTargetException
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:110)
   	... 9 more
   Caused by: java.lang.RuntimeException: Error in configuring object
   	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:113)
   	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:79)
   	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:137)
   	at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
   	... 14 more
   Caused by: java.lang.reflect.InvocationTargetException
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:110)
   	... 17 more
   Caused by: java.lang.RuntimeException: Map operator initialization failed
   	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:125)
   	... 22 more
   Caused by: java.lang.RuntimeException: cannot find field xxx_guid from [org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector$IcebergRecordStructField@8556f063, org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector$IcebergRecordStructField@767b213e, org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector$IcebergRecordStructField@145d3731]
   	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:523)
   	at org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector.getStructFieldRef(IcebergRecordObjectInspector.java:70)
   	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:56)
   	at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:1033)
   	at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1059)
   	at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:75)
   	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:366)
   	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:556)
   	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:508)
   	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
   	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:556)
   	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:508)
   	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
   	at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:501)
   	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:104)
   	... 22 more
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on issue #2567: cannot find field xxx at Map operator initialization failed when iceberg table join hive table

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #2567:
URL: https://github.com/apache/iceberg/issues/2567#issuecomment-841165790


   @dixingxing0: By any chance, could you create a failing unit test for it?




[GitHub] [iceberg] pvary edited a comment on issue #2567: cannot find field xxx at Map operator initialization failed when iceberg table join hive table

Posted by GitBox <gi...@apache.org>.
pvary edited a comment on issue #2567:
URL: https://github.com/apache/iceberg/issues/2567#issuecomment-841165790


   @dixingxing0: By any chance, could you create a failing unit test for it? That would greatly help in fixing the problem. Currently I am not able to reproduce the issue, so we would need that first.
   
   Thanks,
   Peter




[GitHub] [iceberg] pvary commented on issue #2567: cannot find field xxx at Map operator initialization failed when iceberg table join hive table

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #2567:
URL: https://github.com/apache/iceberg/issues/2567#issuecomment-836847031


   @liubo1022126: To be honest, I am a bit confused. Did @dixingxing0's solution work for you?
   When @dixingxing0 closed #2198, I was not sure how he fixed it, and I assumed that the fix was somewhere in his own code. If not, then we might want to find a general solution.
   
   @dixingxing0: Could you please help us out here?
   
   Thanks,
   Peter




[GitHub] [iceberg] dixingxing0 commented on issue #2567: cannot find field xxx at Map operator initialization failed when iceberg table join hive table

Posted by GitBox <gi...@apache.org>.
dixingxing0 commented on issue #2567:
URL: https://github.com/apache/iceberg/issues/2567#issuecomment-837659408


   Hi @pvary, as I described in #2198, this exception occurs in the `initializeMapOperator` phase, and `HiveIcebergSerDe` will not be used to deserialize the data of the Hive table, so I just caught the exception in `IcebergRecordObjectInspector#getStructFieldRef` and returned a dummy struct field. I'll submit a PR to show how it works.
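
   For readers following along, the workaround described above can be sketched roughly as follows. This is a standalone illustration, not the actual Iceberg or Hive code: `Field`, `StructInspector`, `strictLookup` and `lenientLookup` are hypothetical stand-ins for Hive's struct field and object inspector types, and the `"int"` placeholder type is an assumption chosen only for the example.

   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;

   public class DummyFieldSketch {
       // Hypothetical stand-in for a struct field: just a name and a type name.
       static final class Field {
           final String name;
           final String typeName;

           Field(String name, String typeName) {
               this.name = name;
               this.typeName = typeName;
           }
       }

       // Hypothetical stand-in for a struct object inspector over a fixed schema.
       static final class StructInspector {
           private final Map<String, Field> fields = new LinkedHashMap<>();

           StructInspector add(String name, String typeName) {
               fields.put(name, new Field(name, typeName));
               return this;
           }

           // Strict lookup: fails the same way as the stack trace in this issue.
           Field strictLookup(String name) {
               Field field = fields.get(name);
               if (field == null) {
                   throw new RuntimeException(
                       "cannot find field " + name + " from " + fields.keySet());
               }
               return field;
           }

           // Lenient lookup: catch the failure and hand back a placeholder field.
           // The placeholder type ("int") is an assumption made for this sketch.
           Field lenientLookup(String name) {
               try {
                   return strictLookup(name);
               } catch (RuntimeException e) {
                   return new Field(name, "int");
               }
           }
       }

       public static void main(String[] args) {
           StructInspector icebergSide =
               new StructInspector().add("id", "bigint").add("name", "string");
           // xxx_guid belongs to the other table, so only the lenient lookup survives.
           Field dummy = icebergSide.lenientLookup("xxx_guid");
           System.out.println(dummy.name + " -> " + dummy.typeName);
       }
   }
   ```

   The sketch only shows the shape of the idea: the lookup no longer throws during operator initialization, but the placeholder carries a made-up type rather than the real type of the column.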




[GitHub] [iceberg] liubo1022126 commented on issue #2567: cannot find field xxx at Map operator initialization failed when iceberg table join hive table

Posted by GitBox <gi...@apache.org>.
liubo1022126 commented on issue #2567:
URL: https://github.com/apache/iceberg/issues/2567#issuecomment-835182253


   #2198 




[GitHub] [iceberg] dixingxing0 commented on issue #2567: cannot find field xxx at Map operator initialization failed when iceberg table join hive table

Posted by GitBox <gi...@apache.org>.
dixingxing0 commented on issue #2567:
URL: https://github.com/apache/iceberg/issues/2567#issuecomment-841139053


   Hi @pvary, after more testing, I found that my solution does not work as expected.
   
   I return a dummy field of type `Integer` when `IcebergRecordObjectInspector#getStructFieldRef` raises 'cannot find field xxx_guid', but the `Integer` type does not work with all SQL functions, e.g. `CONCAT_WS(xxx_guid, '###')` raises an exception:
   ```
   org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: Argument 72 of function CONCAT_WS must be "string or array<string>"
   ```
   
   So, I think we still need another general solution.
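
   To make the limitation concrete, here is a minimal sketch of why a fixed-type placeholder breaks down. It is not Hive's real argument checking: `checkConcatWsArgs` and `accepts` are hypothetical helpers, and the rule they enforce is a simplification loosely modeled on the error message above.

   ```java
   import java.util.Arrays;
   import java.util.List;

   public class DummyTypeMismatchSketch {
       // Hypothetical, simplified argument check: every argument type must be
       // "string" or "array<string>", as the error message above suggests.
       static void checkConcatWsArgs(List<String> argTypes) {
           for (int i = 0; i < argTypes.size(); i++) {
               String type = argTypes.get(i);
               if (!type.equals("string") && !type.equals("array<string>")) {
                   throw new IllegalArgumentException(
                       "Argument " + i + " of function CONCAT_WS must be "
                           + "\"string or array<string>\", but got " + type);
               }
           }
       }

       // Returns true when the argument types pass the simplified check.
       static boolean accepts(List<String> argTypes) {
           try {
               checkConcatWsArgs(argTypes);
               return true;
           } catch (IllegalArgumentException e) {
               return false;
           }
       }

       public static void main(String[] args) {
           // The real column type (assumed here to be string) passes the check.
           System.out.println(accepts(Arrays.asList("string", "string")));
           // A placeholder field typed as int is rejected by the same check.
           System.out.println(accepts(Arrays.asList("int", "string")));
       }
   }
   ```

   Any placeholder with a hard-coded type runs into the same problem for some function, which is why a general fix would need to resolve the real column type rather than guess one.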
   
   
   




[GitHub] [iceberg] pvary commented on issue #2567: cannot find field xxx at Map operator initialization failed when iceberg table join hive table

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #2567:
URL: https://github.com/apache/iceberg/issues/2567#issuecomment-841119237


   Thanks @dixingxing0!

