Posted to user@ignite.apache.org by sri hari kali charan Tummala <ka...@gmail.com> on 2019/08/08 17:33:39 UTC

Ignite Spark Example Question

Hi All,

I am new to the Apache Ignite community and am testing Ignite to learn it.
In the example below, the code reads a JSON file and writes to an Ignite
in-memory table. Is it overwriting? Can I use append mode? I did try Spark's
append mode, .mode(org.apache.spark.sql.SaveMode.Append), without stopping
the Ignite application (no ignite.stop), which keeps the cache alive, and
tried to insert data into the cache twice, but I still get 4 records where
I expected 8. What would be the reason?

https://github.com/apache/ignite/blob/1f8cf042f67f523e23f795571f609a9c81726258/examples/src/main/spark/org/apache/ignite/examples/spark/IgniteDataFrameWriteExample.scala#L89

-- 
Thanks & Regards
Sri Tummala

Re: Ignite Spark Example Question

Posted by sri hari kali charan Tummala <ka...@gmail.com>.
Can I run Ignite and Spark in cluster mode? What I see in the GitHub example
is just local mode. If I use a cloud-hosted Ignite cluster, how would I
install Spark in distributed mode? Does it come with the Ignite cluster?

https://github.com/apache/ignite/blob/1f8cf042f67f523e23f795571f609a9c81726258/examples/src/main/spark/org/apache/ignite/examples/spark/IgniteDataFrameWriteExample.scala#L89

On Tue, Aug 13, 2019 at 6:53 AM Stephen Darlington <
stephen.darlington@gridgain.com> wrote:

> As I say, there’s nothing “out of the box” — you’d have to write it
> yourself. Exactly how you architect it would depend on what you’re trying
> to do.
>
> Regards,
> Stephen
>
> On 12 Aug 2019, at 19:59, sri hari kali charan Tummala <
> kali.tummala@gmail.com> wrote:
>
> Thanks Stephen , last question so I have to keep looping to find new data
> files in S3 and write to cache real time or is it already built in ?
>
> On Mon, Aug 12, 2019 at 5:43 AM Stephen Darlington <
> stephen.darlington@gridgain.com> wrote:
>
>> I don’t think there’s anything “out of the box,” but you could write a
>> custom CacheStore to do that.
>>
>> See here for more details:
>> https://apacheignite.readme.io/docs/3rd-party-store#section-custom-cachestore
>>
>> Regards,
>> Stephen
>>
>> On 9 Aug 2019, at 21:50, sri hari kali charan Tummala <
>> kali.tummala@gmail.com> wrote:
>>
>> one last question, is there an S3 connector for Ignite which can load s3
>> objects in realtime to ignite cache and data updates directly back to S3? I
>> can use spark as one alternative but is there another approach of doing?
>>
>> Let's say I want to build in-memory near real-time data lake files which
>> get loaded to S3 automatically gets loaded to Ignite (I can use spark
>> structured streaming jobs but is there a direct approach ?)
>>
>> On Fri, Aug 9, 2019 at 4:34 PM sri hari kali charan Tummala <
>> kali.tummala@gmail.com> wrote:
>>
>>> Thank you, I got it now I have to change the id values to see the same
>>> data as extra results (this is just for testing) amazing.
>>>
>>> val df = spark.sql("SELECT monotonically_increasing_id() AS id, name,
>>> department FROM json_person")
>>>
>>> df.write(append)... to ignite
>>>
>>> Thanks
>>> Sri
>>>
>>>
>>> On Fri, Aug 9, 2019 at 6:08 AM Andrei Aleksandrov <
>>> aealexsandrov@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Spark contains several *SaveModes *that will be applied if the table
>>>> that you are going to use exists:
>>>>
>>>> * *Overwrite *- with this option you *will try to re-create* existed
>>>> table or create new and load data there using IgniteDataStreamer
>>>> implementation
>>>> * *Append *- with this option you *will not try to re-create* existed
>>>> table or create new table and just load the data to existed table
>>>>
>>>> * *ErrorIfExists *- with this option you will get the exception if the
>>>> table that you are going to use exists
>>>>
>>>> * *Ignore *- with this option nothing will be done in case if the
>>>> table that you are going to use exists. If table already exists, the save
>>>> operation is expected to not save the contents of the DataFrame and to not
>>>> change the existing data.
>>>> According to your question:
>>>>
>>>> You should use the *Append *SaveMode for your spark integration in
>>>> case if you are going to store new data to cache and save the previous
>>>> stored data.
>>>>
>>>> Note, that in case if you will store the data for the same Primary Keys
>>>> then with data will be overwritten in Ignite table. For example:
>>>>
>>>> 1)Add person {id=1, name=Vlad, age=19} where id is the primary key
>>>> 2)Add person {id=1, name=Nikita, age=26} where id is the primary key
>>>>
>>>> In Ignite you will see only {id=1, name=Nikita, age=26}.
>>>>
>>>> Also here you can see the code sample for you and other information
>>>> about SaveModes:
>>>>
>>>>
>>>> https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes
>>>>
>>>> BR,
>>>> Andrei
>>>>
>>>
>>>
>>> --
>>> Thanks & Regards
>>> Sri Tummala
>>>
>>>
>>
>> --
>> Thanks & Regards
>> Sri Tummala
>>
>>
>>
>>
>
> --
> Thanks & Regards
> Sri Tummala
>
>
>
>

-- 
Thanks & Regards
Sri Tummala

Re: Ignite Spark Example Question

Posted by Stephen Darlington <st...@gridgain.com>.
As I say, there’s nothing “out of the box” — you’d have to write it yourself. Exactly how you architect it would depend on what you’re trying to do.

Regards,
Stephen

> On 12 Aug 2019, at 19:59, sri hari kali charan Tummala <ka...@gmail.com> wrote:
> 
> Thanks Stephen , last question so I have to keep looping to find new data files in S3 and write to cache real time or is it already built in ?
> 
> On Mon, Aug 12, 2019 at 5:43 AM Stephen Darlington <stephen.darlington@gridgain.com <ma...@gridgain.com>> wrote:
> I don’t think there’s anything “out of the box,” but you could write a custom CacheStore to do that.
> 
> See here for more details: https://apacheignite.readme.io/docs/3rd-party-store#section-custom-cachestore <https://apacheignite.readme.io/docs/3rd-party-store#section-custom-cachestore>
> 
> Regards,
> Stephen
> 
>> On 9 Aug 2019, at 21:50, sri hari kali charan Tummala <kali.tummala@gmail.com <ma...@gmail.com>> wrote:
>> 
>> one last question, is there an S3 connector for Ignite which can load s3 objects in realtime to ignite cache and data updates directly back to S3? I can use spark as one alternative but is there another approach of doing?
>> 
>> Let's say I want to build in-memory near real-time data lake files which get loaded to S3 automatically gets loaded to Ignite (I can use spark structured streaming jobs but is there a direct approach ?)
>> 
>> On Fri, Aug 9, 2019 at 4:34 PM sri hari kali charan Tummala <kali.tummala@gmail.com <ma...@gmail.com>> wrote:
>> Thank you, I got it now I have to change the id values to see the same data as extra results (this is just for testing) amazing.
>> 
>> val df = spark.sql("SELECT monotonically_increasing_id() AS id, name, department FROM json_person")
>> 
>> df.write(append)... to ignite
>> 
>> Thanks
>> Sri 
>> 
>> 
>> On Fri, Aug 9, 2019 at 6:08 AM Andrei Aleksandrov <aealexsandrov@gmail.com <ma...@gmail.com>> wrote:
>> Hi,
>> 
>> Spark contains several SaveModes that will be applied if the table that you are going to use exists:
>> 
>> * Overwrite - with this option you will try to re-create existed table or create new and load data there using IgniteDataStreamer implementation
>> * Append - with this option you will not try to re-create existed table or create new table and just load the data to existed table
>> * ErrorIfExists - with this option you will get the exception if the table that you are going to use exists
>> 
>> * Ignore - with this option nothing will be done in case if the table that you are going to use exists. If table already exists, the save operation is expected to not save the contents of the DataFrame and to not change the existing data.
>> 
>> According to your question:
>> 
>> You should use the Append SaveMode for your spark integration in case if you are going to store new data to cache and save the previous stored data.
>> 
>> Note, that in case if you will store the data for the same Primary Keys then with data will be overwritten in Ignite table. For example:
>> 
>> 1)Add person {id=1, name=Vlad, age=19} where id is the primary key
>> 2)Add person {id=1, name=Nikita, age=26} where id is the primary key
>> 
>> In Ignite you will see only {id=1, name=Nikita, age=26}.
>> 
>> Also here you can see the code sample for you and other information about SaveModes:
>> 
>> https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes <https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes>
>> 
>> BR,
>> Andrei
>> 
>> 
>> 
>> -- 
>> Thanks & Regards
>> Sri Tummala
>> 
>> 
>> 
>> -- 
>> Thanks & Regards
>> Sri Tummala
>> 
> 
> 
> 
> 
> -- 
> Thanks & Regards
> Sri Tummala
> 



Re: Ignite Spark Example Question

Posted by sri hari kali charan Tummala <ka...@gmail.com>.
Thanks Stephen. Last question: do I have to keep looping to find new data
files in S3 and write them to the cache in real time, or is that already
built in?

On Mon, Aug 12, 2019 at 5:43 AM Stephen Darlington <
stephen.darlington@gridgain.com> wrote:

> I don’t think there’s anything “out of the box,” but you could write a
> custom CacheStore to do that.
>
> See here for more details:
> https://apacheignite.readme.io/docs/3rd-party-store#section-custom-cachestore
>
> Regards,
> Stephen
>
> On 9 Aug 2019, at 21:50, sri hari kali charan Tummala <
> kali.tummala@gmail.com> wrote:
>
> one last question, is there an S3 connector for Ignite which can load s3
> objects in realtime to ignite cache and data updates directly back to S3? I
> can use spark as one alternative but is there another approach of doing?
>
> Let's say I want to build in-memory near real-time data lake files which
> get loaded to S3 automatically gets loaded to Ignite (I can use spark
> structured streaming jobs but is there a direct approach ?)
>
> On Fri, Aug 9, 2019 at 4:34 PM sri hari kali charan Tummala <
> kali.tummala@gmail.com> wrote:
>
>> Thank you, I got it now I have to change the id values to see the same
>> data as extra results (this is just for testing) amazing.
>>
>> val df = spark.sql("SELECT monotonically_increasing_id() AS id, name,
>> department FROM json_person")
>>
>> df.write(append)... to ignite
>>
>> Thanks
>> Sri
>>
>>
>> On Fri, Aug 9, 2019 at 6:08 AM Andrei Aleksandrov <
>> aealexsandrov@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Spark contains several *SaveModes *that will be applied if the table
>>> that you are going to use exists:
>>>
>>> * *Overwrite *- with this option you *will try to re-create* existed
>>> table or create new and load data there using IgniteDataStreamer
>>> implementation
>>> * *Append *- with this option you *will not try to re-create* existed
>>> table or create new table and just load the data to existed table
>>>
>>> * *ErrorIfExists *- with this option you will get the exception if the
>>> table that you are going to use exists
>>>
>>> * *Ignore *- with this option nothing will be done in case if the table
>>> that you are going to use exists. If table already exists, the save
>>> operation is expected to not save the contents of the DataFrame and to not
>>> change the existing data.
>>> According to your question:
>>>
>>> You should use the *Append *SaveMode for your spark integration in case
>>> if you are going to store new data to cache and save the previous stored
>>> data.
>>>
>>> Note, that in case if you will store the data for the same Primary Keys
>>> then with data will be overwritten in Ignite table. For example:
>>>
>>> 1)Add person {id=1, name=Vlad, age=19} where id is the primary key
>>> 2)Add person {id=1, name=Nikita, age=26} where id is the primary key
>>>
>>> In Ignite you will see only {id=1, name=Nikita, age=26}.
>>>
>>> Also here you can see the code sample for you and other information
>>> about SaveModes:
>>>
>>>
>>> https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes
>>>
>>> BR,
>>> Andrei
>>>
>>>
>>
>>
>> --
>> Thanks & Regards
>> Sri Tummala
>>
>>
>
> --
> Thanks & Regards
> Sri Tummala
>
>
>
>

-- 
Thanks & Regards
Sri Tummala

Re: Ignite Spark Example Question

Posted by Stephen Darlington <st...@gridgain.com>.
I don’t think there’s anything “out of the box,” but you could write a custom CacheStore to do that.

See here for more details: https://apacheignite.readme.io/docs/3rd-party-store#section-custom-cachestore

Regards,
Stephen
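
To make the suggestion concrete, below is a minimal plain-Scala sketch of
the write-through/read-through idea behind a custom CacheStore. The
FakeS3 stand-in and all names are illustrative only, not the Ignite API;
a real implementation would extend Ignite's CacheStoreAdapter and talk to
S3 with the AWS SDK:

```scala
import scala.collection.mutable

// Illustrative stand-in for an external store such as S3: key -> value.
class FakeS3 {
  val objects = mutable.Map.empty[String, String]
  def put(key: String, value: String): Unit = objects(key) = value
  def get(key: String): Option[String] = objects.get(key)
}

// The write-through idea behind a custom CacheStore: every cache write is
// also propagated to the backing store, and cache misses load from it.
class WriteThroughCache(store: FakeS3) {
  private val cache = mutable.Map.empty[String, String]

  def put(key: String, value: String): Unit = {
    cache(key) = value    // update the in-memory cache
    store.put(key, value) // ...and write through to the external store
  }

  def get(key: String): Option[String] = {
    cache.get(key).orElse {
      val loaded = store.get(key)         // read-through on a cache miss
      loaded.foreach(v => cache(key) = v) // populate the cache for next time
      loaded
    }
  }
}
```

The polling for new S3 files would live in a separate loader process or in
the store's loadCache-style bulk-load method, as the linked docs describe.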

> On 9 Aug 2019, at 21:50, sri hari kali charan Tummala <ka...@gmail.com> wrote:
> 
> one last question, is there an S3 connector for Ignite which can load s3 objects in realtime to ignite cache and data updates directly back to S3? I can use spark as one alternative but is there another approach of doing?
> 
> Let's say I want to build in-memory near real-time data lake files which get loaded to S3 automatically gets loaded to Ignite (I can use spark structured streaming jobs but is there a direct approach ?)
> 
> On Fri, Aug 9, 2019 at 4:34 PM sri hari kali charan Tummala <kali.tummala@gmail.com <ma...@gmail.com>> wrote:
> Thank you, I got it now I have to change the id values to see the same data as extra results (this is just for testing) amazing.
> 
> val df = spark.sql("SELECT monotonically_increasing_id() AS id, name, department FROM json_person")
> 
> df.write(append)... to ignite
> 
> Thanks
> Sri 
> 
> 
> On Fri, Aug 9, 2019 at 6:08 AM Andrei Aleksandrov <aealexsandrov@gmail.com <ma...@gmail.com>> wrote:
> Hi,
> 
> Spark contains several SaveModes that will be applied if the table that you are going to use exists:
> 
> * Overwrite - with this option you will try to re-create existed table or create new and load data there using IgniteDataStreamer implementation
> * Append - with this option you will not try to re-create existed table or create new table and just load the data to existed table
> * ErrorIfExists - with this option you will get the exception if the table that you are going to use exists
> 
> * Ignore - with this option nothing will be done in case if the table that you are going to use exists. If table already exists, the save operation is expected to not save the contents of the DataFrame and to not change the existing data.
> 
> According to your question:
> 
> You should use the Append SaveMode for your spark integration in case if you are going to store new data to cache and save the previous stored data.
> 
> Note, that in case if you will store the data for the same Primary Keys then with data will be overwritten in Ignite table. For example:
> 
> 1)Add person {id=1, name=Vlad, age=19} where id is the primary key
> 2)Add person {id=1, name=Nikita, age=26} where id is the primary key
> 
> In Ignite you will see only {id=1, name=Nikita, age=26}.
> 
> Also here you can see the code sample for you and other information about SaveModes:
> 
> https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes <https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes>
> 
> BR,
> Andrei
> 
> 
> 
> -- 
> Thanks & Regards
> Sri Tummala
> 
> 
> 
> -- 
> Thanks & Regards
> Sri Tummala
> 



Re: Ignite Spark Example Question

Posted by sri hari kali charan Tummala <ka...@gmail.com>.
One last question: is there an S3 connector for Ignite that can load S3
objects into the Ignite cache in real time and write data updates directly
back to S3? I can use Spark as one alternative, but is there another
approach?

Let's say I want to build an in-memory near-real-time data lake: files that
land in S3 automatically get loaded into Ignite (I can use Spark Structured
Streaming jobs, but is there a direct approach?)

On Fri, Aug 9, 2019 at 4:34 PM sri hari kali charan Tummala <
kali.tummala@gmail.com> wrote:

> Thank you, I got it now I have to change the id values to see the same
> data as extra results (this is just for testing) amazing.
>
> val df = spark.sql("SELECT monotonically_increasing_id() AS id, name,
> department FROM json_person")
>
> df.write(append)... to ignite
>
> Thanks
> Sri
>
>
> On Fri, Aug 9, 2019 at 6:08 AM Andrei Aleksandrov <ae...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Spark contains several *SaveModes *that will be applied if the table
>> that you are going to use exists:
>>
>> * *Overwrite *- with this option you *will try to re-create* existed
>> table or create new and load data there using IgniteDataStreamer
>> implementation
>> * *Append *- with this option you *will not try to re-create* existed
>> table or create new table and just load the data to existed table
>>
>> * *ErrorIfExists *- with this option you will get the exception if the
>> table that you are going to use exists
>>
>> * *Ignore *- with this option nothing will be done in case if the table
>> that you are going to use exists. If table already exists, the save
>> operation is expected to not save the contents of the DataFrame and to not
>> change the existing data.
>> According to your question:
>>
>> You should use the *Append *SaveMode for your spark integration in case
>> if you are going to store new data to cache and save the previous stored
>> data.
>>
>> Note, that in case if you will store the data for the same Primary Keys
>> then with data will be overwritten in Ignite table. For example:
>>
>> 1)Add person {id=1, name=Vlad, age=19} where id is the primary key
>> 2)Add person {id=1, name=Nikita, age=26} where id is the primary key
>>
>> In Ignite you will see only {id=1, name=Nikita, age=26}.
>>
>> Also here you can see the code sample for you and other information about
>> SaveModes:
>>
>>
>> https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes
>>
>> BR,
>> Andrei
>>
>>
>
>
> --
> Thanks & Regards
> Sri Tummala
>
>

-- 
Thanks & Regards
Sri Tummala

Re: Ignite Spark Example Question

Posted by sri hari kali charan Tummala <ka...@gmail.com>.
Thank you, I got it. Now I have to change the id values to see the same data
as extra records (this is just for testing). Amazing.

val df = spark.sql("SELECT monotonically_increasing_id() AS id, name,
department FROM json_person")

df.write(append)... to ignite

Thanks
Sri
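
The 4-versus-8-records behavior can be sketched with a plain Scala map
standing in for the Ignite table (one row per primary key; all names here
are illustrative, not the Ignite or Spark API). One caveat:
monotonically_increasing_id() is computed from the partition layout, so two
identical runs over the same input may generate the same ids and still
overwrite each other:

```scala
import scala.collection.mutable

// Simulate two "append" runs into a key-value table and report the row
// count after each: same keys overwrite, fresh keys add rows.
def simulateAppends(): (Int, Int, Int) = {
  val table = mutable.Map.empty[Long, String]
  val people = Seq("Alice", "Bob", "Carol", "Dave")

  // One append run: write each row under the key startId + row index.
  def append(startId: Long): Unit = {
    people.zipWithIndex.foreach { case (name, i) => table(startId + i) = name }
  }

  append(0)
  val afterFirst = table.size    // 4 rows after the first write

  append(0)                      // append again with the SAME ids
  val afterSameIds = table.size  // still 4: same primary keys overwrite

  append(4)                      // append with FRESH ids 4..7
  val afterFreshIds = table.size // 8: new keys add rows

  (afterFirst, afterSameIds, afterFreshIds)
}
```

So appending the same 4 rows twice under identical ids leaves 4 records,
while appending under fresh ids yields 8, matching what the thread observed.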


On Fri, Aug 9, 2019 at 6:08 AM Andrei Aleksandrov <ae...@gmail.com>
wrote:

> Hi,
>
> Spark contains several *SaveModes *that will be applied if the table that
> you are going to use exists:
>
> * *Overwrite *- with this option you *will try to re-create* existed
> table or create new and load data there using IgniteDataStreamer
> implementation
> * *Append *- with this option you *will not try to re-create* existed
> table or create new table and just load the data to existed table
>
> * *ErrorIfExists *- with this option you will get the exception if the
> table that you are going to use exists
>
> * *Ignore *- with this option nothing will be done in case if the table
> that you are going to use exists. If table already exists, the save
> operation is expected to not save the contents of the DataFrame and to not
> change the existing data.
> According to your question:
>
> You should use the *Append *SaveMode for your spark integration in case
> if you are going to store new data to cache and save the previous stored
> data.
>
> Note, that in case if you will store the data for the same Primary Keys
> then with data will be overwritten in Ignite table. For example:
>
> 1)Add person {id=1, name=Vlad, age=19} where id is the primary key
> 2)Add person {id=1, name=Nikita, age=26} where id is the primary key
>
> In Ignite you will see only {id=1, name=Nikita, age=26}.
>
> Also here you can see the code sample for you and other information about
> SaveModes:
>
>
> https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes
>
> BR,
> Andrei
>
>


-- 
Thanks & Regards
Sri Tummala

Re: Ignite Spark Example Question

Posted by Andrei Aleksandrov <ae...@gmail.com>.
Hi,

Spark has several *SaveModes *that apply when the table you are going 
to use already exists:

* *Overwrite *- with this option Spark *re-creates* the existing table 
(or creates a new one) and loads the data using the IgniteDataStreamer 
implementation
* *Append *- with this option Spark *does not re-create* the existing 
table and just loads the data into it

* *ErrorIfExists *- with this option you get an exception if the table 
you are going to use already exists

* *Ignore *- with this option nothing is done if the table already 
exists: the save operation does not save the contents of the DataFrame 
and does not change the existing data.
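
As a rough summary, the four modes can be sketched as a small decision
function (illustrative plain Scala modeling the rules above, not Spark's
actual internals):

```scala
// The four Spark save modes, as plain case objects for illustration.
sealed trait Mode
case object Overwrite     extends Mode
case object Append        extends Mode
case object ErrorIfExists extends Mode
case object Ignore        extends Mode

// What save() does, given whether the target table already exists.
def resolve(tableExists: Boolean, mode: Mode): String = {
  if (!tableExists) "create table and load data"
  else mode match {
    case Overwrite     => "re-create table and load data"
    case Append        => "load data into the existing table"
    case ErrorIfExists => "throw: table already exists"
    case Ignore        => "do nothing"
  }
}
```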

Regarding your question:

You should use the *Append *SaveMode for your Spark integration if you 
want to store new data in the cache while keeping the previously stored 
data.

Note that if you store data for the same primary keys, the data will be 
overwritten in the Ignite table. For example:

1)Add person {id=1, name=Vlad, age=19} where id is the primary key
2)Add person {id=1, name=Nikita, age=26} where id is the primary key

In Ignite you will see only {id=1, name=Nikita, age=26}.

You can also find a code sample and more information about SaveModes 
here:

https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes

BR,
Andrei

On 2019/08/08 17:33:39, sri hari kali charan Tummala <k....@gmail.com> 
wrote:
 > Hi All,
 >
 > I am new to Apache Ignite community I am testing out ignite for
 > knowledge sake in the below example the code reads a json file and
 > writes to ignite in-memory table is it overwriting can I do append
 > mode I did try spark append mode .mode(org.apache.spark.sql.SaveMode.Append)
 > without stopping one ignite application ignite.stop which keeps the
 > cache alive and tried to insert data to cache twice but I am still
 > getting 4 records I was expecting 8 records , what would be the reason ?
 >
 > https://github.com/apache/ignite/blob/1f8cf042f67f523e23f795571f609a9c81726258/examples/src/main/spark/org/apache/ignite/examples/spark/IgniteDataFrameWriteExample.scala#L89
 >
 > --
 > Thanks & Regards
 > Sri Tummala