Posted to user@beam.apache.org by 陈竞 <cj...@gmail.com> on 2016/11/21 08:14:46 UTC

how to use key-value storage like redis with PCollection?

My dataflow case looks like this:

stream:
a stream needs to look up some data in Redis by key,

batch:
a table is left-joined with another table on a key.

I want to unify the two scenarios above with a single transform, something
like MapJoin, so I need to use a PCollection to represent the data in Redis.
The problem is that PCollection has no interface that makes it queryable.
Is there any solution for my case?

Re: Re: how to use key-value storage like redis with PCollection?

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Ya-Feng,

RedisIO is in my private repo (not yet public). I will push it to the
public repo ASAP.

Regards
JB


-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: Re: how to use key-value storage like redis with PCollection?

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi,

I created a pull request with RedisIO and RedisPubSubIO:

https://github.com/apache/beam/pull/1687

Two IOs are available: RedisIO for the key-value store, and
RedisPubSubIO for Redis PubSub.

I will update the PR today or tomorrow with:
- complete RedisCluster support (especially for sharding in RedisIO)
- support for List, Set, Hash, and Z* values. Right now, RedisIO
only deals with String key-value pairs.

Regards
JB


-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: how to use key-value storage like redis with PCollection?

Posted by "郭亚峰(默岭)" <ya...@alibaba-inc.com>.
Hi Jean-Baptiste,

Good morning.
My case is quite similar to Jing's. I want to join some relatively static
data from HBase (which is bulk loaded every day) in an unbounded pipeline.
I'd like to take a look at your code for reference. I checked your GitHub
but couldn't find anything close to the RedisIO you mentioned. Did I
overlook anything? Or could you send me a link to your RedisIO?

Thanks a lot.
Ya-Feng

------------------------------------------------------------------
From: Jean-Baptiste Onofré <jb...@nanthrax.net>
Sent: Tuesday, November 22, 2016 03:29
To: user <us...@beam.incubator.apache.org>
Subject: Re: how to use key-value storage like redis with PCollection?

Hi Amir,

I'm working on MqttIO right now; I will push RedisIO to my GitHub
right after.

I will let you know.

Regards
JB


-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: how to use key-value storage like redis with PCollection?

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Amir,

I'm working on MqttIO right now; I will push RedisIO to my GitHub
right after.

I will let you know.

Regards
JB


-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: how to use key-value storage like redis with PCollection?

Posted by amir bahmanyari <am...@yahoo.com>.
I am very curious about the RedisIO example you mentioned, JB... Thanks!


Re: how to use key-value storage like redis with PCollection?

Posted by Lukasz Cwik <lc...@google.com>.
Have you taken a look at the PCollectionView?

It allows you to use various views of a PCollection from within a DoFn.
This test
<https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ViewTest.java#L461>
is a short example where a multimap view is used to join two PCollections.
In your pipeline you would use the bounded PCollection as a map or
multimap view. You would then use a DoFn with the unbounded PCollection
as its main input and the view as a side input.
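
To make the join semantics concrete, here is a minimal plain-Java sketch of
what a multimap side input gives you. It deliberately uses no Beam classes,
so every name in it (SideInputJoinSketch, asMultimap, leftJoin) is
hypothetical and only models the idea: the bounded data is grouped by key
once, and each main-input element then looks up its key, emitting a row even
when nothing matches (a left join). In a real pipeline the corresponding
pieces would be View.asMultimap() for the bounded PCollection and a ParDo
with that view attached as a side input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a left join against a multimap "view".
// All names are hypothetical; this is not Beam API.
class SideInputJoinSketch {

    // Groups (key, value) rows by key, like a multimap view of a bounded dataset.
    public static Map<String, List<String>> asMultimap(List<String[]> lookupRows) {
        Map<String, List<String>> view = new HashMap<>();
        for (String[] row : lookupRows) {
            view.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return view;
    }

    // Left join: every main-input element is emitted at least once;
    // elements with no match in the view get "null" on the right side.
    public static List<String> leftJoin(List<String[]> mainInput,
                                        Map<String, List<String>> view) {
        List<String> joined = new ArrayList<>();
        for (String[] element : mainInput) {
            List<String> matches = view.get(element[0]);
            if (matches == null) {
                joined.add(element[0] + "," + element[1] + ",null");
            } else {
                for (String match : matches) {
                    joined.add(element[0] + "," + element[1] + "," + match);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Bounded side: the relatively static lookup table.
        Map<String, List<String>> view = asMultimap(Arrays.asList(
                new String[] {"user1", "gold"},
                new String[] {"user1", "beta"},
                new String[] {"user2", "silver"}));

        // Main side: stands in for elements arriving on the stream.
        List<String[]> events = Arrays.asList(
                new String[] {"user1", "click"},
                new String[] {"user3", "view"});

        for (String line : leftJoin(events, view)) {
            System.out.println(line);
        }
    }
}
```

The same leftJoin logic applies whether the main input is a batch table or a
stream, which is the unification asked about in the original question. The
catch is that the lookup side has to be small and static enough to be
materialized as a view.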

On Mon, Nov 21, 2016 at 3:28 AM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> Sure, it's on a private repo, let me push on the public one.
>
> I will let you know as soon as it's done.
>
> Thanks !
> Regards
> JB
>
> On 11/21/2016 10:25 AM, 陈竞 wrote:
>
>> ok, thank you very much. Could you show me your branch address?
>>
>> 2016-11-21 17:20 GMT+08:00 Jean-Baptiste Onofré <jb@nanthrax.net
>> <ma...@nanthrax.net>>:
>>
>>     I have an example, but with the RedisIO.
>>
>>     So, if you are interested, I can share my branch.
>>
>>     Regards
>>     JB
>>
>>     On 11/21/2016 10:18 AM, 陈竞 wrote:
>>
>>         could you show the example code of redis query with PCollection?
>>
>>         2016-11-21 16:41 GMT+08:00 Jean-Baptiste Onofré <jb@nanthrax.net
>>         <ma...@nanthrax.net>
>>         <mailto:jb@nanthrax.net <ma...@nanthrax.net>>>:
>>
>>
>>             Hi,
>>
>>             you can convert your PCollection<KV<?,?>> to a
>>             PCollection<POJO> and then create a DoFn to do the query.
>>
>>             By the way, I have a RedisIO mostly ready.
>>
>>             Regards
>>             JB
>>
>>
>>             On 11/21/2016 09:14 AM, 陈竞 wrote:
>>
>>                 my dataflow case is like that:
>>                 stream:
>>                 a stream want to query some data from redis with a key,
>>
>>                 batch:
>>                 a table left join another table in with a key
>>
>>                 i want to unify the two scenarios above by a transform
>>                 like MapJoin, so i need to use PCollection to represent
>>                 the data in redis, but the question is that PCollection
>>                 has no interface to make PCollection queryable, so is
>>                 there any solution for my case?
>>
>>
>>             --
>>             Jean-Baptiste Onofré
>>             jbonofre@apache.org <ma...@apache.org>
>>         <mailto:jbonofre@apache.org <ma...@apache.org>>
>>             http://blog.nanthrax.net
>>             Talend - http://www.talend.com
>>
>>
>>
>>
>>         --
>>         陈竞,中科院计算技术研究所,高性能计算机中心
>>         Jing Chen HPCC.ICT.AC <http://HPCC.ICT.AC> <http://HPCC.ICT.AC>
>>         China
>>
>>
>>     --
>>     Jean-Baptiste Onofré
>>     jbonofre@apache.org <ma...@apache.org>
>>     http://blog.nanthrax.net
>>     Talend - http://www.talend.com
>>
>>
>>
>>
>> --
>> 陈竞,中科院计算技术研究所,高性能计算机中心
>> Jing Chen HPCC.ICT.AC <http://HPCC.ICT.AC> China
>>
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Re: how to use key-value storage like redis with PCollection?

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Sure, it's in a private repo for now; let me push it to the public one.

I will let you know as soon as it's done.

Thanks !
Regards
JB

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: how to use key-value storage like redis with PCollection?

Posted by 陈竞 <cj...@gmail.com>.
OK, thank you very much. Could you share the address of your branch?

-- 
陈竞, Institute of Computing Technology, Chinese Academy of Sciences, High Performance Computer Center
Jing Chen HPCC.ICT.AC China

Re: how to use key-value storage like redis with PCollection?

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
I have an example, but it uses the RedisIO.

So, if you are interested, I can share my branch.

Regards
JB

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: how to use key-value storage like redis with PCollection?

Posted by 陈竞 <cj...@gmail.com>.
Could you show example code for querying Redis with a PCollection?

-- 
陈竞, Institute of Computing Technology, Chinese Academy of Sciences, High Performance Computer Center
Jing Chen HPCC.ICT.AC China

Re: how to use key-value storage like redis with PCollection?

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi,

You can convert your PCollection<KV<?,?>> to a PCollection<POJO> and
then create a DoFn that performs the query.
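
The per-element query suggested here can be pictured with a small plain-Java sketch. It does not use Beam or a Redis client: the `Enriched` POJO, the `enrich` method, and the `Function<String, String>` standing in for a Redis GET are all assumptions for illustration; in a pipeline the loop body would sit inside a DoFn.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java model of a per-element key lookup. Illustrative only; not Beam or Redis API. */
final class LookupSketch {

    /** Hypothetical POJO: a key and the value found for it (null when absent). */
    static final class Enriched {
        final String key;
        final String value;

        Enriched(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + " -> " + value;
        }
    }

    /** Looks up each key in the store; this loop body is what a DoFn would run per element. */
    static List<Enriched> enrich(List<String> keys, Function<String, String> store) {
        List<Enriched> out = new ArrayList<>();
        for (String key : keys) {
            out.add(new Enriched(key, store.apply(key)));
        }
        return out;
    }

    public static void main(String[] args) {
        // An in-memory map stands in for the external key-value store.
        Map<String, String> fakeStore = new HashMap<>();
        fakeStore.put("user1", "Paris");
        for (Enriched e : enrich(Arrays.asList("user1", "user2"), fakeStore::get)) {
            System.out.println(e);
        }
    }
}
```

Keys missing from the store come back with a null value, which gives left-join behaviour; a real lookup would likely batch or cache requests to avoid one round trip per element.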

By the way, I have a RedisIO mostly ready.

Regards
JB

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com