You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@fluo.apache.org by Alan Camillo <al...@blueshift.com.br> on 2017/12/19 18:25:11 UTC
Fluo application question
We just start a project that the objective is consolidate some personal
information using some business rules. It's a kind of ranking of the best
information of a person.
Today they use to reprocess every batch they receive comparing the new data
with all historical data. They're using Spark for this operation.
I'd like to propose something like this:
https://www.dropbox.com/s/glqhh7zzxd7g433/architecture.png?dl=0
Two questions:
- is it possible create an observer to synchronizes with HBase?
- Am I doing a good use of Fluo? If not, why?
Thank you all!
-----Original Message-----
From: Keith Turner [mailto:keith@deenlo.com]
Sent: Tuesday, December 19, 2017 2:21 PM
To: fluo-dev <de...@fluo.apache.org>
Subject: Re: About user group
On Tue, Dec 19, 2017 at 8:18 AM, Alan Camillo <al...@blueshift.com.br> wrote:
> Hello Fluo group!
>
> My name is Alan, I'm a big date architect and owner of a company
> called BlueShift Brasil. And I'm looking foward for Apache Fluo. I'd
> like to know about a user group to because I was no able to find, is it
> exist?
We currently do not have a user list. Feel free to ask any questions you
have here on the dev list.
>
> I have many questions to do and I would'nt like to post those here.
> If I could help if something in the project, please count on me.
If you are interested in contributing, the following may be a good issue to
start with.
https://github.com/apache/fluo-docker/issues/9
>
> Thanks!
> Alan Camillo
> *BlueShift *I IT Director
> Cel.: +55 11 98283-6358
> Tel.: +55 11 4605-5082
Re: Fluo application question
Posted by Keith Turner <ke...@deenlo.com>.
I have not tried the following, but if I were going to read data from
Kafka into Fluo I would start with the following code.
Consumer consumer = //a Kafka consumer
FluoClient client = //a Fluo client
while (true) {
ConsumerRecords<String, String> records = consumer.poll(1000);
try (LoaderExecutor le = client.newLoaderExecutor())
{
for (ConsumerRecord<String, String> record : records)
{
loader.execute((tx,ctx)-> {
// execute a Fluo transaction using record
});
}
} //when this try block exits all Fluo transactions are committed
//let Kafka know the data was successfully processed.
consumer.commitSync();
}
On Wed, Dec 20, 2017 at 10:46 AM, Alan Camillo <al...@blueshift.com.br> wrote:
> Thank you Keith for the answers and material you sent me.
> Just one more question about this solution:
> - What's the best way to consume data from Kafka to Flue. Do I need to
> implement something like in the webindex project: Kafka (Common Crawl) ->
> Spark -> Fluo? Or it's possible to ingest data directly from a Flue
> application?
>
> Thank you again!
> Alan Camillo
>
> -----Original Message-----
> From: Keith Turner [mailto:keith@deenlo.com]
> Sent: Tuesday, December 19, 2017 4:52 PM
> To: fluo-dev <de...@fluo.apache.org>
> Subject: Re: Fluo application question
>
> On Tue, Dec 19, 2017 at 1:25 PM, Alan Camillo <al...@blueshift.com.br> wrote:
>> We just start a project that the objective is consolidate some
>> personal information using some business rules. It's a kind of ranking
>> of the best information of a person.
>>
>> Today they use to reprocess every batch they receive comparing the new
>> data with all historical data. They're using Spark for this operation.
>> I'd like to propose something like this:
>> https://www.dropbox.com/s/glqhh7zzxd7g433/architecture.png?dl=0
>>
>> Two questions:
>> - is it possible create an observer to synchronizes with HBase?
>
> You could use an export queue to make updates to an HBase instance.
>
> http://fluo.apache.org/docs/fluo-recipes/1.1.0-incubating/export-queue/
>
> Also the slides below discuss the export queue (slide 27) and the concept of
> invert on export (slide 33). Invert on export would likely be useful for a
> key value store like hbase.
>
> https://www.slideshare.net/AccumuloSummit/accumulo-summit-2016-tips-for-writing-fluo-applications
>
> Fluo recipes does not currently have an exporter for HBase. It would be
> useful to add one to Fluo Recipes like the following for Accumulo.
>
> http://fluo.apache.org/docs/fluo-recipes/1.1.0-incubating/accumulo-export-queue/
>
>> - Am I doing a good use of Fluo? If not, why?
>
> It sounds like it may be a good fit. However, the exporter for HBase would
> need to be implemented. The Accumulo exporter is written in such a way that
> multiple transactions can share a single writer for efficiency. Not sure if
> this pattern should be followed for HBase.
>
>>
>> Thank you all!
>>
>> -----Original Message-----
>> From: Keith Turner [mailto:keith@deenlo.com]
>> Sent: Tuesday, December 19, 2017 2:21 PM
>> To: fluo-dev <de...@fluo.apache.org>
>> Subject: Re: About user group
>>
>> On Tue, Dec 19, 2017 at 8:18 AM, Alan Camillo <al...@blueshift.com.br>
>> wrote:
>>> Hello Fluo group!
>>>
>>> My name is Alan, I'm a big date architect and owner of a company
>>> called BlueShift Brasil. And I'm looking foward for Apache Fluo. I'd
>>> like to know about a user group to because I was no able to find, is
>>> it exist?
>>
>> We currently do not have a user list. Feel free to ask any questions
>> you have here on the dev list.
>>
>>>
>>> I have many questions to do and I would'nt like to post those here.
>>> If I could help if something in the project, please count on me.
>>
>> If you are interested in contributing, the following may be a good
>> issue to start with.
>>
>> https://github.com/apache/fluo-docker/issues/9
>>
>>>
>>> Thanks!
>>> Alan Camillo
>>> *BlueShift *I IT Director
>>> Cel.: +55 11 98283-6358
>>> Tel.: +55 11 4605-5082
RE: Fluo application question
Posted by Alan Camillo <al...@blueshift.com.br>.
Thank you Keith for the answers and material you sent me.
Just one more question about this solution:
- What's the best way to consume data from Kafka to Flue. Do I need to
implement something like in the webindex project: Kafka (Common Crawl) ->
Spark -> Fluo? Or it's possible to ingest data directly from a Flue
application?
Thank you again!
Alan Camillo
-----Original Message-----
From: Keith Turner [mailto:keith@deenlo.com]
Sent: Tuesday, December 19, 2017 4:52 PM
To: fluo-dev <de...@fluo.apache.org>
Subject: Re: Fluo application question
On Tue, Dec 19, 2017 at 1:25 PM, Alan Camillo <al...@blueshift.com.br> wrote:
> We just start a project that the objective is consolidate some
> personal information using some business rules. It's a kind of ranking
> of the best information of a person.
>
> Today they use to reprocess every batch they receive comparing the new
> data with all historical data. They're using Spark for this operation.
> I'd like to propose something like this:
> https://www.dropbox.com/s/glqhh7zzxd7g433/architecture.png?dl=0
>
> Two questions:
> - is it possible create an observer to synchronizes with HBase?
You could use an export queue to make updates to an HBase instance.
http://fluo.apache.org/docs/fluo-recipes/1.1.0-incubating/export-queue/
Also the slides below discuss the export queue (slide 27) and the concept of
invert on export (slide 33). Invert on export would likely be useful for a
key value store like hbase.
https://www.slideshare.net/AccumuloSummit/accumulo-summit-2016-tips-for-writing-fluo-applications
Fluo recipes does not currently have an exporter for HBase. It would be
useful to add one to Fluo Recipes like the following for Accumulo.
http://fluo.apache.org/docs/fluo-recipes/1.1.0-incubating/accumulo-export-queue/
> - Am I doing a good use of Fluo? If not, why?
It sounds like it may be a good fit. However, the exporter for HBase would
need to be implemented. The Accumulo exporter is written in such a way that
multiple transactions can share a single writer for efficiency. Not sure if
this pattern should be followed for HBase.
>
> Thank you all!
>
> -----Original Message-----
> From: Keith Turner [mailto:keith@deenlo.com]
> Sent: Tuesday, December 19, 2017 2:21 PM
> To: fluo-dev <de...@fluo.apache.org>
> Subject: Re: About user group
>
> On Tue, Dec 19, 2017 at 8:18 AM, Alan Camillo <al...@blueshift.com.br>
> wrote:
>> Hello Fluo group!
>>
>> My name is Alan, I'm a big date architect and owner of a company
>> called BlueShift Brasil. And I'm looking foward for Apache Fluo. I'd
>> like to know about a user group to because I was no able to find, is
>> it exist?
>
> We currently do not have a user list. Feel free to ask any questions
> you have here on the dev list.
>
>>
>> I have many questions to do and I would'nt like to post those here.
>> If I could help if something in the project, please count on me.
>
> If you are interested in contributing, the following may be a good
> issue to start with.
>
> https://github.com/apache/fluo-docker/issues/9
>
>>
>> Thanks!
>> Alan Camillo
>> *BlueShift *I IT Director
>> Cel.: +55 11 98283-6358
>> Tel.: +55 11 4605-5082
Re: Fluo application question
Posted by Keith Turner <ke...@deenlo.com>.
On Tue, Dec 19, 2017 at 1:25 PM, Alan Camillo <al...@blueshift.com.br> wrote:
> We just start a project that the objective is consolidate some personal
> information using some business rules. It's a kind of ranking of the best
> information of a person.
>
> Today they use to reprocess every batch they receive comparing the new data
> with all historical data. They're using Spark for this operation.
> I'd like to propose something like this:
> https://www.dropbox.com/s/glqhh7zzxd7g433/architecture.png?dl=0
>
> Two questions:
> - is it possible create an observer to synchronizes with HBase?
You could use an export queue to make updates to an HBase instance.
http://fluo.apache.org/docs/fluo-recipes/1.1.0-incubating/export-queue/
Also the slides below discuss the export queue (slide 27) and the
concept of invert on export (slide 33). Invert on export would likely
be useful for a key value store like hbase.
https://www.slideshare.net/AccumuloSummit/accumulo-summit-2016-tips-for-writing-fluo-applications
Fluo recipes does not currently have an exporter for HBase. It would
be useful to add one to Fluo Recipes like the following for Accumulo.
http://fluo.apache.org/docs/fluo-recipes/1.1.0-incubating/accumulo-export-queue/
> - Am I doing a good use of Fluo? If not, why?
It sounds like it may be a good fit. However, the exporter for HBase
would need to be implemented. The Accumulo exporter is written in
such a way that multiple transactions can share a single writer for
efficiency. Not sure if this pattern should be followed for HBase.
>
> Thank you all!
>
> -----Original Message-----
> From: Keith Turner [mailto:keith@deenlo.com]
> Sent: Tuesday, December 19, 2017 2:21 PM
> To: fluo-dev <de...@fluo.apache.org>
> Subject: Re: About user group
>
> On Tue, Dec 19, 2017 at 8:18 AM, Alan Camillo <al...@blueshift.com.br> wrote:
>> Hello Fluo group!
>>
>> My name is Alan, I'm a big date architect and owner of a company
>> called BlueShift Brasil. And I'm looking foward for Apache Fluo. I'd
>> like to know about a user group to because I was no able to find, is it
>> exist?
>
> We currently do not have a user list. Feel free to ask any questions you
> have here on the dev list.
>
>>
>> I have many questions to do and I would'nt like to post those here.
>> If I could help if something in the project, please count on me.
>
> If you are interested in contributing, the following may be a good issue to
> start with.
>
> https://github.com/apache/fluo-docker/issues/9
>
>>
>> Thanks!
>> Alan Camillo
>> *BlueShift *I IT Director
>> Cel.: +55 11 98283-6358
>> Tel.: +55 11 4605-5082