Posted to dev@kylin.apache.org by "胡志华 (万里通科技及数据中心商务智能团队数据分析组)" <HU...@pingan.com.cn> on 2016/05/06 15:35:25 UTC
Re: Re: problem happened at step "build base cuboid data"
I think it's difficult to check; the amount of data is huge.
Could you give me some suggestions?
-----Original Message-----
From: ShaoFeng Shi [mailto:shaofengshi@apache.org]
Sent: May 6, 2016 23:33
To: dev@kylin.apache.org
Subject: Re: Re: problem happened at step "build base cuboid data"
I mean the data in the Hive table; if there is some dirty data (e.g., a column declared as decimal that actually holds a string), it may cause the cube build to fail.
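To make the failure mode concrete, here is a minimal sketch of how non-numeric text triggers the `NumberFormatException` at the top of the stack trace. The class name `DirtyDecimalDemo`, the helper `parsesAsDecimal`, and the sample values are hypothetical, chosen only for illustration; the one fact relied on is that `java.math.BigDecimal(String)` rejects text that is not a valid decimal literal.

```java
import java.math.BigDecimal;

public class DirtyDecimalDemo {
    // Returns true when the raw text is accepted by the BigDecimal(String) constructor.
    public static boolean parsesAsDecimal(String raw) {
        try {
            new BigDecimal(raw);
            return true;
        } catch (NumberFormatException | NullPointerException e) {
            // NumberFormatException: text is not a decimal literal.
            // NullPointerException: the value itself is null.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample values: one clean, three "dirty".
        String[] samples = {"12.5000000", "abc", "", "1,200.50"};
        for (String s : samples) {
            System.out.println("[" + s + "] parses: " + parsesAsDecimal(s));
        }
    }
}
```

Note that the constructor does not trim whitespace or accept thousands separators, so values that look numeric to a person can still be rejected.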
2016-05-06 23:29 GMT+08:00 胡志华(万里通科技及数据中心商务智能团队数据分析组) <
HUZHIHUA160@pingan.com.cn>:
> You mean the data type? Here is the table description:
>
> > desc partner_txn_sub_order_ft0_s;
> OK
> sub_txn_id string
> txn_id string
> order_id string
> sub_order_id string
> pp_order_id string
> pp_sub_order_id string
> process_dt string
> slt_doc_id string
> doc_id string
> sub_doc_id string
> gain_pay_ind string
> pathway_ind string
> txn_type_ind string
> txn_sub_type_ind string
> product_id string
> product_name string
> rule_id string
> rule_name string
> slt_partner_id string
> slt_partner_desc string
> pay_cash decimal(22,7)
> pay_points decimal(22,7)
> gain_points decimal(22,7)
> discount decimal(22,7)
> age_level_ind string
> gender_ind string
> phone_province_ind string
> phone_city_ind string
> point_current_level_ind string
> binding_d string
> binding_m string
> is_email_verified int
> is_mobile_verified int
> is_app int
> partner_gain_pt_level_ind string
> wlt_txn_level_ind string
> brand_point_no string
> pathway_desc string
> is_activity int
> pt_log_d string
> partner_id string
>
> # Partition Information
> # col_name data_type comment
>
> pt_log_d string
> partner_id string
> Time taken: 0.124 seconds, Fetched: 47 row(s)
> hive>
>
> -----Original Message-----
> From: ShaoFeng Shi [mailto:shaofengshi@apache.org]
> Sent: May 6, 2016 23:27
> To: dev@kylin.apache.org
> Subject: Re: problem happened at step "build base cuboid data"
>
> It seems some data couldn't be parsed as a BigDecimal. You may need to check
> the data type in the source table.
>
> 2016-05-06 20:34 GMT+08:00 胡志华(万里通科技及数据中心商务智能团队数据分析组) <
> HUZHIHUA160@pingan.com.cn>:
>
> > Hi all,
> >
> > I am encountering a problem at the step "build base cuboid
> > data"; the MapReduce log is below.
> >
> > I googled it but found nothing useful. Can anyone help me?
> >
> >
> > Error: java.lang.NumberFormatException
> >     at java.math.BigDecimal.<init>(BigDecimal.java:470)
> >     at java.math.BigDecimal.<init>(BigDecimal.java:739)
> >     at org.apache.kylin.measure.basic.BigDecimalIngester.valueOf(BigDecimalIngester.java:39)
> >     at org.apache.kylin.measure.basic.BigDecimalIngester.valueOf(BigDecimalIngester.java:29)
> >     at org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.buildValueOf(BaseCuboidMapperBase.java:189)
> >     at org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.buildValue(BaseCuboidMapperBase.java:159)
> >     at org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.outputKV(BaseCuboidMapperBase.java:206)
> >     at org.apache.kylin.engine.mr.steps.HiveToBaseCuboidMapper.map(HiveToBaseCuboidMapper.java:53)
> >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> >     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:415)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> >     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> >
> >
> > **********************************************************************
> > **********************************************************
> > The information in this email is confidential and may be legally
> > privileged. If you have received this email in error or are not the
> > intended recipient, please immediately notify the sender and delete
> > this message from your computer. Any use, distribution, or copying
> > of this email other than by the intended recipient is strictly
> > prohibited. All messages sent to and from us may be monitored to
> > ensure compliance with internal policies and to protect our business.
> > Emails are not secure and cannot be guaranteed to be error free as
> > they can be intercepted, amended, lost or destroyed, or contain
> > viruses. Anyone who communicates with us by email is taken to accept
> > these risks.
> >
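> > Notice to senders and recipients of this email:
> > This email contains confidential information. If you have received it in error, please notify the sender and delete it immediately; do not use, distribute, or copy it.
> > All incoming and outgoing email is subject to the company's compliance monitoring. Email may be intercepted, altered, lost, or destroyed, or may contain computer viruses, and is therefore not secure.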
> >
> > **********************************************************************
> > **********************************************************
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>
>
--
Best regards,
Shaofeng Shi
Re: Re: Re: problem happened at step "build base cuboid data"
Posted by ShaoFeng Shi <sh...@apache.org>.
Hive is more error-tolerant, while Kylin needs the data to be cleaned first,
so as to ensure accuracy at a high aggregation level.
I don't have a good idea either; you may need to check the upstream system to
see whether there was some problem.
--
Best regards,
Shaofeng Shi