Posted to dev@hive.apache.org by wanghaifei <wa...@jd.com> on 2015/03/06 09:42:58 UTC

How does Hive 0.14 read delta files concurrently?

Dear sir,
   Problem 1: How are files read concurrently?
       Hive 0.14 reads the file directly from HDFS. The following is the log record:


15/02/26 16:43:31 [main]: INFO orc.ReaderImpl: Reading ORC rows from hdfs://spark-jrdata-12.pekdc1.jdfin.local:9000/user/hive/warehouse/sku_01/end_dt=20150111/000000_0 with {include: [true, true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false], offset: 0, length: 9223372036854775807}       

    Here is my question. Hive 0.13 reads the file through MapReduce (MR), so even a large data volume is processed quickly. How does Hive 0.14 read the file concurrently in order to improve query speed? I understand that Hive 0.14, through its packed data structure, fetches only the columns a query needs instead of whole rows.
      Could you tell me in detail which class implements the concurrent read?
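
      For reference, the include flags in the log line above can be decoded as in the short Python sketch below. It is only an illustration under an assumption: the table schema is flat, so index 0 of the array is the root struct and index i is the i-th column. The helper names parse_include and selected_columns are hypothetical and are not part of Hive.

```python
# Hypothetical helpers, not part of Hive: decode the "include" flags that
# orc.ReaderImpl prints in its log line.
# Assumption: flat schema, so index 0 is the root struct and index i (i >= 1)
# is the i-th top-level column.


def parse_include(text):
    """Turn '[true, false, ...]' as printed in the log into a list of bools."""
    items = text.strip().strip("[]").split(",")
    return [item.strip() == "true" for item in items]


def selected_columns(include):
    """Return the 1-based positions of the columns flagged for reading."""
    return [i for i, flag in enumerate(include) if flag and i > 0]


# Flags copied from the log line above.
LOG_INCLUDE = (
    "[true, true, true, false, false, false, false, false, false, false, "
    "false, false, false, false, false, false, false, false, false, false]"
)

if __name__ == "__main__":
    print(selected_columns(parse_include(LOG_INCLUDE)))  # prints [1, 2]
```

      Under that assumption, the log shows that only the first two columns of the table are read for this query.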

   Problem 2: Which class implements the merging of the data? I would like to know the details.
        
       I look forward to your answer.
       Thank you.