Posted to user@kylin.apache.org by ShaoFeng Shi <sh...@apache.org> on 2017/09/07 09:02:59 UTC

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

Hi Alexander,

I encountered a problem when using HDFS for cube building and S3 for HBase
on EMR. In the "Load HFile to HBase Table" step, Kylin got a failure with a
time-out error:

Thu Sep 07 15:33:27 GMT+08:00 2017,
RpcRetryingCaller{globalStartTime=1504769048975, pause=100, retries=35},
java.io.IOException: Call to ip-10-0-0-28.ec2.internal/10.0.0.28:16020
failed on local exception:
org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=41,
waitTime=60001, operationTimeout=60000

In the HBase region server log, I saw HBase uploading the HFile to S3; since
the cube is a little big (13 GB), it takes much longer than usual, and the
Kylin client closed the connection when it hit the timeout:

2017-09-07 08:01:12,275 INFO
[RpcServer.FifoWFPBQ.default.handler=16,queue=1,port=16020]
regionserver.HRegionFileSystem: Bulk-load file
hdfs://ip-10-0-0-118.ec2.internal:8020/kylin/kylin_default_instance/kylin-cdcb5f57-2ea9-47d9-85db-7a6c7490cc55/test/hfile/F1/a897b4d33ed648e6a5d0bfb05cffdfd6
is on different filesystem than the destination store. Copying file over to
destination filesystem.
2017-09-07 08:01:23,919 INFO
[RpcServer.FifoWFPBQ.default.handler=22,queue=1,port=16020]
s3.MultipartUploadManager: completed multipart upload of 8 parts 965420145
bytes

2017-09-07 08:26:33,838 WARN
[RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow):
{"call":"BulkLoadHFile(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$BulkLoadHFileRequest)","starttimems":1504770958916,"responsesize":2,"method":"BulkLoadHFile","param":"TODO: class org.apache.hadoop.hbase.protobuf.generated.ClientProtos$BulkLoadHFileRequest","processingtimems":1834922,"client":"10.0.0.243:49152","queuetimems":0,"class":"HRegionServer"}
2017-09-07 08:26:33,838 WARN
[RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020] ipc.RpcServer:
RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020: caught a
ClosedChannelException, this means that the server /10.0.0.28:16020 was
processing a request but the client went away. The error message was: null

So I wonder how you bypassed this problem; did you set a very large
timeout value for HBase, or is your cube just not that big? Thanks.
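
For reference, the timeouts in the error above are the HBase client RPC and
operation timeouts. A minimal hbase-site.xml sketch of the kind of change that
would extend them (the 30-minute value is only illustrative, and which of the
two properties takes effect can vary by HBase version):

  <property>
    <name>hbase.rpc.timeout</name>
    <value>1800000</value>
  </property>
  <property>
    <name>hbase.client.operation.timeout</name>
    <value>1800000</value>
  </property>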



2017-08-14 14:19 GMT+08:00 Alexander Sterligov <st...@joom.it>:

> Here is the ticket for the hfile-on-s3 issue -
> https://issues.apache.org/jira/browse/KYLIN-2788
>
> On Mon, Aug 14, 2017 at 9:17 AM, Alexander Sterligov <st...@joom.it>
> wrote:
>
>> I forgot there was one more issue with s3 -
>> https://issues.apache.org/jira/browse/KYLIN-2740.
>>
>> The global dictionary in 2.0 doesn't work out of the box. I patched kylin
>> as described in the ticket.
>>
>> On Sun, Aug 13, 2017 at 4:24 AM, ShaoFeng Shi <sh...@apache.org>
>> wrote:
>>
>>> Nice; the issue of writing the hfile to S3 needs more investigation.
>>> Please open a Kylin JIRA for tracking; we will update it there if we have
>>> any findings.
>>>
>>> 2017-08-12 23:52 GMT+08:00 Alexander Sterligov <st...@joom.it>:
>>>
>>>> Query performance is pretty much the same as in the slides about kylin. I
>>>> have a high bucket cache hit rate (>90%), so data is almost always read
>>>> from local disk. For some other use cases it might be different.
>>>>
>>>> On 12 Aug 2017 at 17:59, "ShaoFeng Shi" <shaofengshi@apache.org> wrote:
>>>>
>>>> Cool; how about the query performance with data on s3?
>>>>
>>>> 2017-08-11 23:27 GMT+08:00 Alexander Sterligov <st...@joom.it>:
>>>>
>>>>> Yes, that's the only one for now.
>>>>>
>>>>> On Fri, Aug 11, 2017 at 6:23 PM, ShaoFeng Shi <sh...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> No need to add them, I think, because I see they are already in the
>>>>>> configuration of that step.
>>>>>>
>>>>>> Is this the only issue you see with Kylin on EMR+S3?
>>>>>>
>>>>>> [image: inline image 1]
>>>>>>
>>>>>> 2017-08-11 20:26 GMT+08:00 Alexander Sterligov <st...@joom.it>:
>>>>>>
>>>>>>> What if we add the direct-output settings to kylin_job_conf.xml
>>>>>>> and kylin_job_conf_inmem.xml?
>>>>>>>
>>>>>>> hbase.zookeeper.quorum, for example, doesn't work if it is not specified
>>>>>>> in these configs.
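
As an illustration of what "specified in these configs" means, a minimal sketch
of such an entry in kylin_job_conf.xml (the host names below are placeholders):

  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk-host-1,zk-host-2,zk-host-3</value>
  </property>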
>>>>>>>
>>>>>>> On Fri, Aug 11, 2017 at 3:13 PM, ShaoFeng Shi <
>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>
>>>>>>>> EMR enables direct output in mapred-site.xml, but in this step these
>>>>>>>> settings don't seem to work (although the job's configuration shows
>>>>>>>> they are there). I disabled the direct output but the behavior didn't
>>>>>>>> change. I did some searching but found nothing. I need to drop the EMR
>>>>>>>> cluster now, and may get back to it later.
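
A sketch of the direct-output switches being discussed, as they would appear in
mapred-site.xml with direct output disabled; the property names here are an
assumption based on older EMR releases and should be verified against the
cluster's own mapred-site.xml:

  <!-- assumed EMR direct-output properties; verify the names on your EMR release -->
  <property>
    <name>mapred.output.direct.EmrFileSystem</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.output.direct.NativeS3FileSystem</name>
    <value>false</value>
  </property>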
>>>>>>>>
>>>>>>>> If you have any idea or findings, please share them. We'd like to
>>>>>>>> make Kylin support the cloud better.
>>>>>>>>
>>>>>>>> Thanks for your feedback!
>>>>>>>>
>>>>>>>> 2017-08-11 19:19 GMT+08:00 Alexander Sterligov <sterligovak@joom.it
>>>>>>>> >:
>>>>>>>>
>>>>>>>>> Any ideas how to fix that?
>>>>>>>>>
>>>>>>>>> On Fri, Aug 11, 2017 at 2:16 PM, ShaoFeng Shi <
>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> I got the same problem as you:
>>>>>>>>>>
>>>>>>>>>> 2017-08-11 08:44:16,342 WARN  [Job 2c86b4b6-7639-4a97-ba63-63c9dca095f6-2255]
>>>>>>>>>> mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did not find
>>>>>>>>>> any files to load in directory
>>>>>>>>>> s3://privatekeybucket-anac5h41523l/kylin/kylin_default_instance/kylin-2c86b4b6-7639-4a97-ba63-63c9dca095f6/kylin_sales_cube_clone3/hfile.
>>>>>>>>>> Does it contain files in subdirectories that correspond to column family names?
>>>>>>>>>>
>>>>>>>>>> In the S3 view, I see the files exist in the "_temporary" folder; it
>>>>>>>>>> seems they were not moved to the target folder on completion. It seems
>>>>>>>>>> EMR tries to write directly to the output path, but actually does not.
>>>>>>>>>>
>>>>>>>>>> 2017-08-11 16:34 GMT+08:00 Alexander Sterligov <
>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>
>>>>>>>>>>> No, defaultFs is hdfs.
>>>>>>>>>>>
>>>>>>>>>>> I've seen such behavior when I set the working dir to s3 but didn't
>>>>>>>>>>> set cluster-fs at all. Maybe you have a typo in the name of the property.
>>>>>>>>>>> I used the old one, "kylin.hbase.cluster.fs".
>>>>>>>>>>>
>>>>>>>>>>> When both working-dir and cluster-fs were set to s3, I got the
>>>>>>>>>>> _temporary dir of the convert job on s3, but no hfiles. Also I saw the
>>>>>>>>>>> correct output path for the job in the log. But I didn't check whether the
>>>>>>>>>>> job creates temporary files on s3 and then copies the results to hdfs; I
>>>>>>>>>>> doubt that happens.
>>>>>>>>>>>
>>>>>>>>>>> Do you see proper arguments for the step in the log?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 11 Aug 2017, at 11:17, ShaoFeng Shi <sh...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Alexander,
>>>>>>>>>>>
>>>>>>>>>>> That makes sense. Using S3 for cube build and storage is
>>>>>>>>>>> required for a cloud Hadoop environment.
>>>>>>>>>>>
>>>>>>>>>>> I tried to reproduce this problem. I created an EMR cluster with S3 as
>>>>>>>>>>> HBase storage; in kylin.properties, I set "kylin.env.hdfs-working-dir"
>>>>>>>>>>> and "kylin.storage.hbase.cluster-fs" to the S3 bucket. But in
>>>>>>>>>>> the "Convert Cuboid Data to HFile" step, Kylin still writes to
>>>>>>>>>>> local HDFS. Did you modify core-site.xml to make S3 the default FS?
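
A minimal kylin.properties sketch of the two settings mentioned above (the
bucket name is a placeholder):

  kylin.env.hdfs-working-dir=s3://your-bucket/kylin
  kylin.storage.hbase.cluster-fs=s3://your-bucket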
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2017-08-10 22:53 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>
>>>>>>>>>>>> Yes, I worked around this problem that way and it works.
>>>>>>>>>>>>
>>>>>>>>>>>> One problem with this solution is that I have to use a pretty large
>>>>>>>>>>>> hdfs and it's expensive. I also have to garbage-collect it manually,
>>>>>>>>>>>> because the data is not moved to s3 but copied, and the Kylin cleanup job
>>>>>>>>>>>> doesn't work for it, because the main metadata folder is on s3. So it would
>>>>>>>>>>>> be really nice to put everything on s3.
>>>>>>>>>>>>
>>>>>>>>>>>> Another problem is that I had to raise the hbase rpc timeout,
>>>>>>>>>>>> because bulk loading from hdfs takes long. That was not trivial. 3 minutes
>>>>>>>>>>>> works well, but with the drawback of queries or metadata writes hanging for
>>>>>>>>>>>> 3 minutes if something bad happens. But that's a rare event.
>>>>>>>>>>>>
>>>>>>>>>>>> On 10 Aug 2017 at 17:42, "ShaoFeng Shi" <shaofengshi@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> How about leaving "kylin.hbase.cluster.fs" empty? This
>>>>>>>>>>>>> property is for a two-cluster deployment (one Hadoop cluster for cube
>>>>>>>>>>>>> build, the other for query).
>>>>>>>>>>>>>
>>>>>>>>>>>>> When it is empty, the HFile will be written to the default fs (HDFS
>>>>>>>>>>>>> in EMR) and then loaded into HBase. I'm not sure whether EMR HBase (using
>>>>>>>>>>>>> S3 as storage) can bulk-load files from HDFS or not. If it can, that would
>>>>>>>>>>>>> be great, as the write performance of HDFS would be better than S3.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-08-10 22:29 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also thought about it, but no, it's not consistency.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Consistent view is enabled. I use the same s3 for my own
>>>>>>>>>>>>>> map-reduce jobs and it's ok.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also checked whether it lost consistency (emrfs diff). No
>>>>>>>>>>>>>> problems.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In case of s3 inconsistency, files disappear right after they are
>>>>>>>>>>>>>> written and appear some time later. The hfiles didn't appear after a
>>>>>>>>>>>>>> day, but _temporary is there.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's 100% reproducible; I think I'll investigate this problem
>>>>>>>>>>>>>> by running the conversion job manually.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10 Aug 2017 at 17:18, "ShaoFeng Shi" <shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Did you enable the Consistent View? This article explains the
>>>>>>>>>>>>>>> challenge of using S3 directly in an ETL process:
>>>>>>>>>>>>>>> https://aws.amazon.com/cn/blogs/big-data/ensuring-consistency-when-using-amazon-s3-and-amazon-elastic-mapreduce-for-etl-workflows/
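
For context, EMRFS consistent view is an EMR-level setting; a sketch of how it
is typically enabled in emrfs-site.xml (the property name is given from memory
and should be checked against the EMR documentation):

  <property>
    <name>fs.s3.consistent</name>
    <value>true</value>
  </property>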
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2017-08-09 18:19 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, it's empty. Also I see this message in the log:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2017-08-09 09:02:35,947 WARN  [Job 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:234 : Skipping non-directory
>>>>>>>>>>>>>>>> s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile/_SUCCESS
>>>>>>>>>>>>>>>> 2017-08-09 09:02:36,009 WARN  [Job 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:252 : Skipping non-file
>>>>>>>>>>>>>>>> FileStatusExt{path=s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile/_temporary/1;
>>>>>>>>>>>>>>>> isDirectory=true; modification_time=0; access_time=0; owner=; group=;
>>>>>>>>>>>>>>>> permission=rwxrwxrwx; isSymlink=false}
>>>>>>>>>>>>>>>> 2017-08-09 09:02:36,014 WARN  [Job 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did not find
>>>>>>>>>>>>>>>> any files to load in directory
>>>>>>>>>>>>>>>> s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile.
>>>>>>>>>>>>>>>> Does it contain files in subdirectories that correspond to column family names?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Aug 9, 2017 at 1:15 PM, ShaoFeng Shi <
>>>>>>>>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The HFile will be moved to HBase data folder when bulk
>>>>>>>>>>>>>>>>> load finished; Did you check whether the HTable has data?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-08-09 17:54 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I set kylin.hbase.cluster.fs to the s3 bucket where hbase
>>>>>>>>>>>>>>>>>> lives.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The "Convert Cuboid Data to HFile" step finished without
>>>>>>>>>>>>>>>>>> errors. The statistics at the end of the job said that it had written
>>>>>>>>>>>>>>>>>> lots of data to s3.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> But there are no hfiles in the kylin_metadata folder
>>>>>>>>>>>>>>>>>> (kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/<table
>>>>>>>>>>>>>>>>>> name>/hfile), only a _temporary folder and a _SUCCESS file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> _temporary contains hfiles inside the attempt folders. It
>>>>>>>>>>>>>>>>>> looks like they were not copied from _temporary to the result dir. But
>>>>>>>>>>>>>>>>>> there are no errors, neither in the kylin log nor in the reducers' logs.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Then loading empty hfiles produces empty segments.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is that a bug or I'm doing something wrong?


-- 
Best regards,

Shaofeng Shi 史少锋

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

Posted by Alexander Sterligov <st...@joom.it>.
I totally agree:

> Another problem is that I had to raise the hbase rpc timeout, because bulk
> loading from hdfs takes long. That was not trivial. 3 minutes works well,
> but with the drawback of queries or metadata writes hanging for 3 minutes if
> something bad happens.
>

On Thu, Sep 7, 2017 at 1:23 PM, ShaoFeng Shi <sh...@apache.org> wrote:

> Setting hbase.rpc.timeout to a large value has a drawback, I think; it will
> cause other rpc operations to wait longer. So the best way is to write the
> HFile directly to the S3 bucket that HBase reads. I'm not sure whether HBase
> still needs a move operation; if it does, that will become another problem.
>
> 2017-09-07 18:02 GMT+08:00 Alexander Sterligov <st...@joom.it>:
>
>> Just in case - I've changed it in /etc/hbase/conf/hbase-site.xml
>>
>> On Thu, Sep 7, 2017 at 12:59 PM, ShaoFeng Shi <sh...@apache.org>
>> wrote:
>>
>>> Thanks; I also set a larger value for the rpc timeout, but it didn't
>>> change the behavior. I'm using EMR 5.5, not sure whether it is a bug.
>>>
>>> 2017-09-07 17:24 GMT+08:00 Alexander Sterligov <st...@joom.it>:
>>>
>>>> Hi,
>>>>
>>>> I've set large hbase timeout:
>>>>
>>>> <property>
>>>>     <name>hbase.rpc.timeout</name>
>>>>     <value>1800000</value>
>>>>   </property>

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

Posted by ShaoFeng Shi <sh...@apache.org>.
Setting hbase.rpc.timeout to a large value has a drawback, I think; it will
cause other rpc operations to wait longer. So the best way is to write the
HFile directly to the S3 bucket that HBase reads. I'm not sure whether HBase
still needs a move operation; if it does, that will become another problem.

-- 
Best regards,

Shaofeng Shi 史少锋

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

Posted by Alexander Sterligov <st...@joom.it>.
Just in case - I've changed it in /etc/hbase/conf/hbase-site.xml

On Thu, Sep 7, 2017 at 12:59 PM, ShaoFeng Shi <sh...@apache.org>
wrote:

> Thanks; I also set a larger value for the rpc timeout, but it didn't
> change the behavior. I'm using EMR 5.5, not sure whether it is a bug.
>
> 2017-09-07 17:24 GMT+08:00 Alexander Sterligov <st...@joom.it>:
>
>> Hi,
>>
>> I've set large hbase timeout:
>>
>> <property>
>>     <name>hbase.rpc.timeout</name>
>>     <value>1800000</value>
>>   </property>
>>
>> On Thu, Sep 7, 2017 at 12:02 PM, ShaoFeng Shi <sh...@apache.org>
>> wrote:
>>
>>> Hi Alexander,
>>>
>>> I encounter a problem when using HDFS for cubing building, and S3 for
>>> HBase on EMR. In the "Load HFile to HBase Table" step, Kylin got a failure
>>> with time out error:
>>>
>>> Thu Sep 07 15:33:27 GMT+08:00 2017, RpcRetryingCaller{globalStartTime=1504769048975,
>>> pause=100, retries=35}, java.io.IOException: Call to
>>> ip-10-0-0-28.ec2.internal/10.0.0.28:16020 failed on local exception:
>>> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=41,
>>> waitTime=60001, operationTimeout=60000
>>>
>>> In HBase region server, I saw HBase uploads the HFile to S3; Since the
>>> cube is a little big (13GB), it takes much longer time than usual. Kylin
>>> client closed the connection as it thought timeout:
>>>
>>> 2017-09-07 08:01:12,275 INFO  [RpcServer.FifoWFPBQ.default.handler=16,queue=1,port=16020]
>>> regionserver.HRegionFileSystem: Bulk-load file
>>> hdfs://ip-10-0-0-118.ec2.internal:8020/kylin/kylin_default_i
>>> nstance/kylin-cdcb5f57-2ea9-47d9-85db-7a6c7490cc55/test/hfil
>>> e/F1/a897b4d33ed648e6a5d0bfb05cffdfd6 is on different filesystem than
>>> the destination store. Copying file over to destination filesystem.
>>> 2017-09-07 08:01:23,919 INFO  [RpcServer.FifoWFPBQ.default.handler=22,queue=1,port=16020]
>>> s3.MultipartUploadManager: completed multipart upload of 8 parts 965420145
>>> bytes
>>>
>>> 2017-09-07 08:26:33,838 WARN  [RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020]
>>> ipc.RpcServer: (responseTooSlow): {"call":"BulkLoadHFile(org.apa
>>> che.hadoop.hbase.protobuf.generated.ClientProtos$BulkLoadHFi
>>> leRequest)","starttimems":1504770958916,"responsesize":2,"me
>>> thod":"BulkLoadHFile","param":"TODO: class
>>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Bulk
>>> LoadHFileRequest","processingtimems":1834922,"client":"10.0.0.243:49152
>>> ","queuetimems":0,"class":"HRegionServer"}
>>> 2017-09-07 08:26:33,838 WARN  [RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020]
>>> ipc.RpcServer: RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020:
>>> caught a ClosedChannelException, this means that the server /
>>> 10.0.0.28:16020 was processing a request but the client went away. The
>>> error message was: null
>>>
>>> So I wonder how did you bypass this problem, did you set a very large
>>> timeout value for HBase, or your cube size isn't that big? Thanks.
>>>
>>>
>>>
>>> 2017-08-14 14:19 GMT+08:00 Alexander Sterligov <st...@joom.it>:
>>>
>>>> Here is ticket for hfile on s3 issue - https://issues.apache.org/ji
>>>> ra/browse/KYLIN-2788
>>>>
>>>> On Mon, Aug 14, 2017 at 9:17 AM, Alexander Sterligov <
>>>> sterligovak@joom.it> wrote:
>>>>
>>>>> I forgot there was one more issue with s3 -
>>>>> https://issues.apache.org/jira/browse/KYLIN-2740.
>>>>>
>>>>> Global dictionary in 2.0 doesn't work out of the box. I patched kylin
>>>>> as described in ticket.
>>>>>
>>>>> On Sun, Aug 13, 2017 at 4:24 AM, ShaoFeng Shi <sh...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Nice; For the writting hfile to S3 issue,  it need more
>>>>>> investigation.  Please open a Kylin JIRA for tracking. We will update there
>>>>>> if has any finding.
>>>>>>
>>>>>> 2017-08-12 23:52 GMT+08:00 Alexander Sterligov <st...@joom.it>:
>>>>>>
>>>>>>> Query performance is pretty same as on slides about kylin. I have
>>>>>>> high bucket cache hit (>90%), so data is almost always read from local
>>>>>>> disk. For some other use cases it might be different.
>>>>>>>
>>>>>>> 12 авг. 2017 г. 17:59 пользователь "ShaoFeng Shi" <
>>>>>>> shaofengshi@apache.org> написал:
>>>>>>>
>>>>>>> Cool; how about the query performance with data on s3?
>>>>>>>
>>>>>>> 2017-08-11 23:27 GMT+08:00 Alexander Sterligov <st...@joom.it>
>>>>>>> :
>>>>>>>
>>>>>>>> Yes, that's the only one fow now.
>>>>>>>>
>>>>>>>> On Fri, Aug 11, 2017 at 6:23 PM, ShaoFeng Shi <
>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>
>>>>>>>>> No need to add I think, because I see they already in the
>>>>>>>>> configuration of that step.
>>>>>>>>>
>>>>>>>>> Is this the only issue you see with Kylin on EMR+S3?
>>>>>>>>>
>>>>>>>>> [image: 内嵌图片 1]
>>>>>>>>>
>>>>>>>>> 2017-08-11 20:26 GMT+08:00 Alexander Sterligov <
>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>
>>>>>>>>>> What if we shall add direct output in kylin_job_conf.xml
>>>>>>>>>> and kylin_job_conf_inmem.xml?
>>>>>>>>>>
>>>>>>>>>> hbase.zookeeper.quorum for example doesn't work if not specified
>>>>>>>>>> in these configs.
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 11, 2017 at 3:13 PM, ShaoFeng Shi <
>>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> EMR enables the direct output in mapred-site.xml, while in this
>>>>>>>>>>> step it seems these settings doesn't work (althoug the job's configuration
>>>>>>>>>>> shows they are there). I disabled the direct output but the behavior has no
>>>>>>>>>>> change. I did some search but no finding. I need drop the EMR now, and may
>>>>>>>>>>> get back it later.
>>>>>>>>>>>
>>>>>>>>>>> If you have any idea or findings, please share it. We'd like to
>>>>>>>>>>> make Kylin has better support for cloud.
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your feedback!
>>>>>>>>>>>
>>>>>>>>>>> 2017-08-11 19:19 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>
>>>>>>>>>>>> Any ideas how to fix that?
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Aug 11, 2017 at 2:16 PM, ShaoFeng Shi <
>>>>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I got the same problem as you:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-08-11 08:44:16,342 WARN  [Job
>>>>>>>>>>>>> 2c86b4b6-7639-4a97-ba63-63c9dca095f6-2255]
>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did
>>>>>>>>>>>>> not find any files to load in directory s3://privatekeybucket-anac5h41
>>>>>>>>>>>>> 523l/kylin/kylin_default_instance/kylin-2c86b4b6-7639-4a97-b
>>>>>>>>>>>>> a63-63c9dca095f6/kylin_sales_cube_clone3/hfile.  Does it
>>>>>>>>>>>>> contain files in subdirectories that correspond to column family names?
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the S3 view, I see the files exist in the "_temporary"
>>>>>>>>>>>>> folder; it seems they were not moved to the target folder on
>>>>>>>>>>>>> completion. It seems EMR tries to write directly to the output
>>>>>>>>>>>>> path, but actually does not.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-08-11 16:34 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> No, defaultFs is hdfs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I’ve seen such behavior when the working dir was set to s3 but
>>>>>>>>>>>>>> cluster-fs was not set at all. Maybe you have a typo in the
>>>>>>>>>>>>>> name of the property. I used the old one «kylin.hbase.cluster.fs».
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When both working-dir and cluster-fs were set to s3, I got the
>>>>>>>>>>>>>> _temporary dir of the convert job on s3, but no hfiles. I also
>>>>>>>>>>>>>> saw the correct output path for the job in the log. But I didn’t
>>>>>>>>>>>>>> check whether the job creates temporary files in s3 and then
>>>>>>>>>>>>>> copies the results to hdfs. I hardly believe that happens.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Do you see proper arguments for the step in the log?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11 Aug 2017, at 11:17, ShaoFeng Shi <shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Alexander,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That makes sense. Using S3 for Cube build and storage is
>>>>>>>>>>>>>> required for a cloud hadoop environment.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I tried to reproduce this problem. I created an EMR cluster
>>>>>>>>>>>>>> with S3 as HBase storage; in kylin.properties, I set
>>>>>>>>>>>>>> "kylin.env.hdfs-working-dir" and "kylin.storage.hbase.cluster-fs"
>>>>>>>>>>>>>> to the S3 bucket. But in the "Convert Cuboid Data to HFile"
>>>>>>>>>>>>>> step, Kylin still writes to local HDFS. Did you modify
>>>>>>>>>>>>>> core-site.xml to make S3 the default FS?
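
For reference, the two kylin.properties entries being described would look
roughly like this (the bucket name is a placeholder, not one from this
thread):

  kylin.env.hdfs-working-dir=s3://your-bucket/kylin
  kylin.storage.hbase.cluster-fs=s3://your-bucket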
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-08-10 22:53 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, I worked around this problem in that way and it works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One problem with such a solution is that I have to use a
>>>>>>>>>>>>>>> pretty large hdfs and it's expensive. I also have to garbage
>>>>>>>>>>>>>>> collect it manually, because the data is not moved to s3 but
>>>>>>>>>>>>>>> copied. The Kylin cleanup job doesn't work for it, because the
>>>>>>>>>>>>>>> main metadata folder is on s3. So it would be really nice to
>>>>>>>>>>>>>>> put everything on s3.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another problem is that I had to raise the hbase rpc timeout,
>>>>>>>>>>>>>>> because bulk loading from hdfs takes long. That was not
>>>>>>>>>>>>>>> trivial. 3 minutes works well, but with the drawback of
>>>>>>>>>>>>>>> queries or metadata writes hanging for 3 minutes if something
>>>>>>>>>>>>>>> bad happens. But that's a rare event.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10 Aug 2017 at 17:42, "ShaoFeng Shi" <shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> How about leaving "kylin.hbase.cluster.fs" empty? This
>>>>>>>>>>>>>>>> property is for a two-cluster deployment (one Hadoop cluster
>>>>>>>>>>>>>>>> for the cube build, the other for query).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When it is empty, the HFile will be written to the default fs
>>>>>>>>>>>>>>>> (HDFS in EMR), and then loaded into HBase. I'm not sure
>>>>>>>>>>>>>>>> whether EMR HBase (using S3 as storage) can bulk load files
>>>>>>>>>>>>>>>> from HDFS or not. If it can, that would be great, as the
>>>>>>>>>>>>>>>> write performance of HDFS would be better than S3.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2017-08-10 22:29 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I also thought about that, but no, it's not consistency.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Consistent view is enabled. I use the same s3 for my own
>>>>>>>>>>>>>>>>> map-reduce jobs and it's ok.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I also checked whether it lost consistency (emrfs diff). No
>>>>>>>>>>>>>>>>> problems.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In case of s3 inconsistency, files disappear right after
>>>>>>>>>>>>>>>>> they are written and appear some time later. The hfiles
>>>>>>>>>>>>>>>>> didn't appear after a day, but _temporary is there.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It's 100% reproducible; I think I'll investigate this
>>>>>>>>>>>>>>>>> problem by running the conversion job manually.
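
As background for the consistent-view exchange below, EMRFS consistent view
is normally switched on through the emrfs-site configuration; this fragment
is only a rough sketch, and the retry count is an illustrative value rather
than one mentioned in this thread:

  <property>
    <name>fs.s3.consistent</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.s3.consistent.retryCount</name>
    <value>5</value>
  </property>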
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 10 Aug 2017 at 17:18, "ShaoFeng Shi" <shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Did you enable the Consistent View? This article explains
>>>>>>>>>>>>>>>>>>> the challenge when using S3 directly for an ETL process:
>>>>>>>>>>>>>>>>>>> https://aws.amazon.com/cn/blogs/big-data/ensuring-consistency-when-using-amazon-s3-and-amazon-elastic-mapreduce-for-etl-workflows/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2017-08-09 18:19 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yes, it's empty. Also I see this message in the log:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2017-08-09 09:02:35,947 WARN  [Job
>>>>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:234 : Skipping
>>>>>>>>>>>>>>>>>>> non-directory s3://joom.emr.fs/home/producti
>>>>>>>>>>>>>>>>>>> on/bi/kylin/kylin_metadata/kyl
>>>>>>>>>>>>>>>>>>> in-1e436685-7102-4621-a4cb-6472b866126d
>>>>>>>>>>>>>>>>>>> /main_event_1_main/hfile/_SUCCESS
>>>>>>>>>>>>>>>>>>> 2017-08-09 09:02:36,009 WARN  [Job
>>>>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:252 : Skipping non-file
>>>>>>>>>>>>>>>>>>> FileStatusExt{path=s3://joom.e
>>>>>>>>>>>>>>>>>>> mr.fs/home/production/bi/kylin
>>>>>>>>>>>>>>>>>>> /kylin_metadata/kylin-1e436685
>>>>>>>>>>>>>>>>>>> -7102-4621-a4cb-6472b866126d/m
>>>>>>>>>>>>>>>>>>> ain_event_1_main/hfile/_temporary/1; isDirectory=true;
>>>>>>>>>>>>>>>>>>> modification_time=0; access_time=0; owner=; group=; permission=rwxrwxrwx;
>>>>>>>>>>>>>>>>>>> isSymlink=false}
>>>>>>>>>>>>>>>>>>> 2017-08-09 09:02:36,014 WARN  [Job
>>>>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:422 : Bulk load
>>>>>>>>>>>>>>>>>>> operation did not find any files to load in directory
>>>>>>>>>>>>>>>>>>> s3://joom.emr.fs/home/producti
>>>>>>>>>>>>>>>>>>> on/bi/kylin/kylin_metadata/kyl
>>>>>>>>>>>>>>>>>>> in-1e436685-7102-4621-a4cb-647
>>>>>>>>>>>>>>>>>>> 2b866126d/main_event_1_main/hfile.  Does it contain
>>>>>>>>>>>>>>>>>>> files in subdirectories that correspond to column family names?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Aug 9, 2017 at 1:15 PM, ShaoFeng Shi <
>>>>>>>>>>>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The HFile will be moved to the HBase data folder when the
>>>>>>>>>>>>>>>>>>>> bulk load finishes; did you check whether the HTable has data?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2017-08-09 17:54 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I set kylin.hbase.cluster.fs to s3 bucket where hbase
>>>>>>>>>>>>>>>>>>>>> lives.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The "Convert Cuboid Data to HFile" step finished without
>>>>>>>>>>>>>>>>>>>>> errors. Statistics at the end of the job said that it had
>>>>>>>>>>>>>>>>>>>>> written lots of data to s3.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> But there are no hfiles in the kylin_metadata folder
>>>>>>>>>>>>>>>>>>>>> (kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/<table
>>>>>>>>>>>>>>>>>>>>> name>/hfile), only a _temporary folder and a _SUCCESS file.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> _temporary contains hfiles inside attempt folders. It looks
>>>>>>>>>>>>>>>>>>>>> like they were not copied from _temporary to the result dir.
>>>>>>>>>>>>>>>>>>>>> But there are no errors in either the kylin log or the
>>>>>>>>>>>>>>>>>>>>> reducers' logs.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Then loading empty hfiles produces empty segments.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Is that a bug, or am I doing something wrong?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best regards,
>>>>>>>>>>>
>>>>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Shaofeng Shi 史少锋
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>>
>>>>>> Shaofeng Shi 史少锋
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>>
>>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

Posted by ShaoFeng Shi <sh...@apache.org>.
Thanks; I also set a larger value for the rpc timeout, but it didn't change
the behavior. I'm using EMR 5.5, not sure whether it is a bug.
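
One possibility (an assumption, not something established in this thread)
is that the 60-second limit in the earlier log comes from the HBase
client-side operation timeout rather than from hbase.rpc.timeout alone, so
the two are sometimes raised together in the hbase-site.xml read by the
Kylin server. A rough sketch with illustrative values:

  <property>
    <name>hbase.rpc.timeout</name>
    <value>1800000</value>
  </property>
  <property>
    <name>hbase.client.operation.timeout</name>
    <value>1800000</value>
  </property>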

2017-09-07 17:24 GMT+08:00 Alexander Sterligov <st...@joom.it>:

> Hi,
>
> I've set large hbase timeout:
>
> <property>
>   <name>hbase.rpc.timeout</name>
>   <value>1800000</value>
> </property>
>


-- 
Best regards,

Shaofeng Shi 史少锋

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

Posted by Alexander Sterligov <st...@joom.it>.
Hi,

I've set large hbase timeout:

<property>
  <name>hbase.rpc.timeout</name>
  <value>1800000</value>
</property>
