Posted to user@ignite.apache.org by Vladimir Tchernyi <vt...@gmail.com> on 2020/12/04 10:58:33 UTC

IgniteDataStreamer.keepBinary proposal

Hi, community



I've just finished working through a short page [1] about Ignite data
streaming, and I want to share my impressions. The situation is common to
many Ignite documentation pages; my impressions of them are the same.



My task was to adapt IgniteDataStreamer to load data in the binary
format, as described in my article [2]. I tried to use the same approach:

1) load data on the client node;

2) convert it to the binary form;

3) use IgniteDataStreamer/StreamReceiver pair (instead of
ComputeTaskAdapter/ComputeJobAdapter) to ingest data in the cache.



I modified my production code to use IgniteDataStreamer<BinaryObject,
BinaryObject> and StreamReceiver<BinaryObject, BinaryObject>, then tried to
start it on a dev cluster of 2 server nodes and 1 client node. The result:
a ClassNotFoundException for a class that exists only on the client node.



The solution seems to be calling streamer.keepBinary(true), but page [1]
never mentions it. I found that setter in the IgniteDataStreamer source
code after a full day of troubleshooting. Definitely, "In Ignite We
Trust" - what other reason would drive me to spend so much time?
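
A minimal sketch of such a setup, for illustration only. It assumes a cache
named "messages" already exists on the cluster; the cache name, the class
names, and the receiver logic are hypothetical and not taken from the
production code. It also assumes the receiver class itself is available on
the server nodes (on their classpath or through peer class loading), since
only the data classes are kept in binary form.

```java
import java.util.Collection;
import java.util.Map;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.binary.BinaryObject;
import org.apache.ignite.stream.StreamReceiver;

public class BinaryStreamingSketch {
    /** Hypothetical cache name; the cache is assumed to exist already. */
    static final String CACHE_NAME = "messages";

    /** Runs on the server nodes and handles entries only as BinaryObject. */
    static class PutReceiver implements StreamReceiver<BinaryObject, BinaryObject> {
        @Override
        public void receive(IgniteCache<BinaryObject, BinaryObject> cache,
                            Collection<Map.Entry<BinaryObject, BinaryObject>> entries) {
            // No POJO class is referenced here, so the servers do not need it.
            for (Map.Entry<BinaryObject, BinaryObject> e : entries)
                cache.put(e.getKey(), e.getValue());
        }
    }

    /** Streams entries that were already converted to binary form. */
    static void load(Ignite client, Map<BinaryObject, BinaryObject> data) {
        try (IgniteDataStreamer<BinaryObject, BinaryObject> streamer =
                 client.dataStreamer(CACHE_NAME)) {
            // Ask the streamer to keep entries in binary form on the
            // receiving side instead of deserializing them.
            streamer.keepBinary(true);
            streamer.receiver(new PutReceiver());
            streamer.addData(data);
        } // close() flushes any remaining buffered entries
    }
}
```

Without the keepBinary(true) call, the receiving side would presumably try
to deserialize the entries into the user class, which matches the
ClassNotFoundException described above.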



The code snippets on page [1] are hard to apply in real-world
applications because they use only simple types such as String and
Integer. They are more like unit tests.



My proposal: it would be great to create a small GitHub repo containing a
complete, compilable code example, one repo for every page. I think such
repos would keep new Ignite users in the project and prevent them from
leaving.



Regards,

Vladimir Tchernyi

--

[1] https://ignite.apache.org/docs/latest/data-streaming

[2] https://www.gridgain.com/resources/blog/how-fast-load-large-datasets-apache-ignite-using-key-value-api

Re: IgniteDataStreamer.keepBinary proposal

Posted by Vladimir Tchernyi <vt...@gmail.com>.
Hi Denis,

I think the code examples we already have do not show the nature of Ignite
as a DISTRIBUTED database. These examples are oriented toward a single-node
setup. An inexperienced user can get the false impression that a single
Ignite node can outperform, for example, a commercial database server.

IMHO the documentation should be written for a multi-node Ignite cluster. I
do not understand the purpose of showing how to stream 100_000 integer
values into a cache defined as <Integer, String>. In the real world, I need
to stream structured records (Kafka Avro messages), and I will create a
POJO to hold each message. It is known that Ignite does not
peer-deploy user POJOs, so using BinaryObject is the only way to forward my
POJOs to the remote nodes (correct me if I am wrong).
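
To illustrate the conversion step, here is a minimal sketch of building a
BinaryObject on the client with BinaryObjectBuilder, so that no user class
has to exist on the servers. The type name and the field names are
hypothetical. If a POJO instance already exists on the client,
ignite.binary().toBinary(pojo) would be an alternative way to get the
binary form.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.binary.BinaryObject;
import org.apache.ignite.binary.BinaryObjectBuilder;

public class BinaryConversionSketch {
    /** Hypothetical binary type name; no such class is needed on the servers. */
    static final String TYPE_NAME = "com.example.Message";

    /** Builds the binary form of one record from its (hypothetical) fields. */
    static BinaryObject toBinary(Ignite ignite, long id, String payload) {
        BinaryObjectBuilder builder = ignite.binary().builder(TYPE_NAME);
        builder.setField("id", id);
        builder.setField("payload", payload);
        return builder.build();
    }
}
```

Fields of the resulting object can be read back by name, for example with
BinaryObject.field("payload"), without deserializing the whole record.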

I trust Ignite, and I managed to create a really fast Ignite app in
production. But recently I again faced a long-forgotten feeling: the
page is nice but hard to use. I hope my experience will help to
improve the documentation.

Vladimir

PS
As for contributing, I need some time to get my Kafka Ignite app into
production to be sure of it. After that, I will be ready to contribute.
On Sat, Dec 5, 2020 at 06:31, Denis Magda <dm...@apache.org> wrote:

> Hi Vladimir,
>
> Most of the code snippets are already arranged in complete and
> ready-for-usage samples:
>
> https://github.com/apache/ignite/tree/master/docs/_docs/code-snippets/java/src/main/java/org/apache/ignite/snippets
>
> Anyway, those are code snippets that are injected into quite generic
> documentation pages. Your case represents a situation when someone needs to
> work with binary objects and streaming APIs. What if we add a data streamer
> example for BinaryObjects into Ignite's examples and put a reference to
> that example from the documentation page? Are you interested in
> contributing the example?
> https://github.com/apache/ignite/tree/master/examples
>
> -
> Denis

Re: IgniteDataStreamer.keepBinary proposal

Posted by Denis Magda <dm...@apache.org>.
Hi Vladimir,

Most of the code snippets are already arranged in complete and
ready-for-usage samples:
https://github.com/apache/ignite/tree/master/docs/_docs/code-snippets/java/src/main/java/org/apache/ignite/snippets

Anyway, those are code snippets that are injected into quite generic
documentation pages. Your case represents a situation when someone needs to
work with binary objects and streaming APIs. What if we add a data streamer
example for BinaryObjects into Ignite's examples and put a reference to
that example from the documentation page? Are you interested in
contributing the example?
https://github.com/apache/ignite/tree/master/examples

-
Denis

