Posted to user@phoenix.apache.org by "Fustes, Diego" <Di...@ndt-global.com> on 2015/11/18 12:30:40 UTC

PhoenixConfigurationUtil.CURRENT_SCN_VALUE for phoenix-spark plugin does not work

Hi all,

I'm trying to use the phoenix-spark plugin to process data stored in an HBase table. Because of project requirements, I need to keep strict control over the timestamps of the new versions stored for each row.

I'm using the "saveToPhoenix" function to store a ProductRDD containing the updates that create a new version, and I'm setting PhoenixConfigurationUtil.CURRENT_SCN_VALUE in the configuration. However, the method seems to ignore this setting and uses the server time for the HBase puts instead. Is there any way to make the puts use the timestamp I specify? I'm using Phoenix 4.4.0, as included in HDP 2.3.

Thanks, Diego


NDT GDAC Spain S.L.
Diego Fustes, Big Data and Machine Learning Expert
Gran Vía de les Corts Catalanes 130, 11th floor
08038 Barcelona, Spain
Phone: +34 93 43 255 27
diego.fustes@ndt-global.com
www.ndt-global.com


RE: PhoenixConfigurationUtil.CURRENT_SCN_VALUE for phoenix-spark plugin does not work

Posted by "Fustes, Diego" <Di...@ndt-global.com>.
Hi Josh,

I'm already doing exactly that, but it does not seem to work. Keep in mind that this attribute needs to be set on every HBase put that is executed. When I use Phoenix through JDBC, I have to specify this value each time I create a Connection.
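
For reference, the JDBC approach mentioned above can be sketched roughly as follows. This is a minimal sketch, not code from the thread: the object name `ScnConnection`, its helper methods, and the `zkQuorum` parameter are hypothetical, and it assumes the Phoenix client JAR is on the classpath at run time. "CurrentSCN" is the connection property name behind PhoenixRuntime.CURRENT_SCN_ATTRIB.

```scala
import java.sql.{Connection, DriverManager}
import java.util.Properties

// Hypothetical helper object, for illustration only.
object ScnConnection {
  // Connection property Phoenix reads to pin the timestamp
  // (the value of PhoenixRuntime.CURRENT_SCN_ATTRIB).
  val CurrentScnKey = "CurrentSCN"

  /** Builds connection properties that pin the given timestamp (epoch millis). */
  def scnProperties(timestamp: Long): Properties = {
    val props = new Properties()
    props.setProperty(CurrentScnKey, timestamp.toString)
    props
  }

  /** Opens a Phoenix connection pinned to `timestamp`.
    * `zkQuorum` is a placeholder for the cluster's ZooKeeper quorum. */
  def open(zkQuorum: String, timestamp: Long): Connection =
    DriverManager.getConnection(s"jdbc:phoenix:$zkQuorum", scnProperties(timestamp))
}
```

The property applies to one Connection only, which is why it has to be supplied again for every new Connection, for example `ScnConnection.open("zk-host", ts)` before each batch of upserts.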

I've just created PHOENIX-2429 (https://issues.apache.org/jira/browse/PHOENIX-2429). Please take a look and let me know.

Thank you and best regards,

Diego



Re: PhoenixConfigurationUtil.CURRENT_SCN_VALUE for phoenix-spark plugin does not work

Posted by Josh Mahonin <jm...@gmail.com>.
Hi Diego,

Sadly it's not documented, but the 'saveToPhoenix' method can also take a
Hadoop 'Configuration' object. I haven't tested it, but in theory any
custom config parameters should trickle down to the Hadoop Input/Output
formats this way.

Can you try invoking 'saveToPhoenix' using something like:

  .saveToPhoenix(tableName, cols, conf, zkUrl = Some(url))

Or, assuming 'hbase.zookeeper.quorum' is set in the config, you should be
able to just use

  .saveToPhoenix(tableName, cols, conf)

ref:
https://github.com/apache/phoenix/blob/master/phoenix-spark/src/main/scala/org/apache/phoenix/spark/ProductRDDFunctions.scala#L26-L28
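
Spelled out in full, the call pattern under discussion might look like the sketch below. Everything specific in it is an assumption for illustration: the object name `SaveWithScn`, the `scnConf` helper, the table and column names, the ZooKeeper host, and the timestamp. Note that this thread reports the setting not being honored on Phoenix 4.4.0 (see PHOENIX-2429), so this is a way to reproduce the setup, not a confirmed fix.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil
import org.apache.phoenix.spark._
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver object, for illustration only.
object SaveWithScn {
  /** Hadoop Configuration carrying the ZooKeeper quorum and the desired SCN. */
  def scnConf(zkQuorum: String, timestamp: Long): Configuration = {
    val conf = new Configuration()
    conf.set("hbase.zookeeper.quorum", zkQuorum)
    conf.set(PhoenixConfigurationUtil.CURRENT_SCN_VALUE, timestamp.toString)
    conf
  }

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("save-with-scn"))
    try {
      // Placeholder rows; each tuple maps onto the listed columns in order.
      val rows = sc.parallelize(Seq((1L, "first"), (2L, "second")))
      // OUTPUT_TABLE, ID and COL1 are placeholders for an existing Phoenix table.
      rows.saveToPhoenix("OUTPUT_TABLE", Seq("ID", "COL1"),
        scnConf("zk-host", 1447849840000L))
    } finally sc.stop()
  }
}
```

Checking the cell timestamps afterwards (for example from the HBase shell) shows whether the configured value or the server time was used.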

Please let us know if that works for you, and if it doesn't, please file a
JIRA ticket.

Thanks,

Josh
