Posted to user@ignite.apache.org by sri hari kali charan Tummala <ka...@gmail.com> on 2019/08/08 17:33:39 UTC
Ignite Spark Example Question
Hi All,
I am new to the Apache Ignite community and am testing Ignite to learn
it. In the example below, the code reads a JSON file and writes it to an
Ignite in-memory table. Is it overwriting, and can I use append mode? I
did try Spark's append mode, .mode(org.apache.spark.sql.SaveMode.Append),
without stopping the Ignite application (i.e. without calling
ignite.stop), which keeps the cache alive, and inserted data into the
cache twice, but I am still getting 4 records where I expected 8. What
could be the reason?
https://github.com/apache/ignite/blob/1f8cf042f67f523e23f795571f609a9c81726258/examples/src/main/spark/org/apache/ignite/examples/spark/IgniteDataFrameWriteExample.scala#L89
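For context, a minimal sketch of the append-mode write being described (based on the linked example; the config path, JSON file name, and table options here are illustrative assumptions, not values from the example):

```scala
import org.apache.ignite.spark.IgniteDataFrameSettings._
import org.apache.spark.sql.{SaveMode, SparkSession}

object AppendSketch extends App {
  val spark = SparkSession.builder()
    .appName("IgniteAppendSketch")
    .master("local")
    .getOrCreate()

  // Read the JSON file, as in the linked example.
  val personDataFrame = spark.read.json("person.json")

  // Append-mode write: running this twice doubles the row count only if
  // the rows carry distinct primary-key values; rows that share a primary
  // key are upserted (overwritten), not duplicated.
  personDataFrame.write
    .format(FORMAT_IGNITE)
    .option(OPTION_CONFIG_FILE, "config/example-ignite.xml")
    .option(OPTION_TABLE, "json_person")
    .option(OPTION_CREATE_TABLE_PRIMARY_KEY_FIELDS, "id")
    .mode(SaveMode.Append)
    .save()
}
```

If the JSON rows carry fixed id values, writing twice leaves the row count unchanged, which would match the 4-records observation above.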
--
Thanks & Regards
Sri Tummala
Re: Ignite Spark Example Question
Posted by sri hari kali charan Tummala <ka...@gmail.com>.
Can I run Ignite and Spark in cluster mode? The GitHub example only
shows local mode. If I use a GridGain cloud Ignite cluster, how would I
install Spark in distributed mode? Does it come with the Ignite cluster?
https://github.com/apache/ignite/blob/1f8cf042f67f523e23f795571f609a9c81726258/examples/src/main/spark/org/apache/ignite/examples/spark/IgniteDataFrameWriteExample.scala#L89
On Tue, Aug 13, 2019 at 6:53 AM Stephen Darlington <
stephen.darlington@gridgain.com> wrote:
> As I say, there’s nothing “out of the box” — you’d have to write it
> yourself. Exactly how you architect it would depend on what you’re trying
> to do.
>
> Regards,
> Stephen
>
> On 12 Aug 2019, at 19:59, sri hari kali charan Tummala <
> kali.tummala@gmail.com> wrote:
>
> Thanks Stephen, last question: do I have to keep looping to find new
> data files in S3 and write them to the cache in real time, or is that
> already built in?
--
Thanks & Regards
Sri Tummala
Re: Ignite Spark Example Question
Posted by Stephen Darlington <st...@gridgain.com>.
As I say, there’s nothing “out of the box” — you’d have to write it yourself. Exactly how you architect it would depend on what you’re trying to do.
Regards,
Stephen
> On 12 Aug 2019, at 19:59, sri hari kali charan Tummala <ka...@gmail.com> wrote:
>
> Thanks Stephen, last question: do I have to keep looping to find new data files in S3 and write them to the cache in real time, or is that already built in?
>
Re: Ignite Spark Example Question
Posted by sri hari kali charan Tummala <ka...@gmail.com>.
Thanks Stephen, last question: do I have to keep looping to find new
data files in S3 and write them to the cache in real time, or is that
already built in?
On Mon, Aug 12, 2019 at 5:43 AM Stephen Darlington <
stephen.darlington@gridgain.com> wrote:
> I don’t think there’s anything “out of the box,” but you could write a
> custom CacheStore to do that.
>
> See here for more details:
> https://apacheignite.readme.io/docs/3rd-party-store#section-custom-cachestore
>
> Regards,
> Stephen
--
Thanks & Regards
Sri Tummala
Re: Ignite Spark Example Question
Posted by Stephen Darlington <st...@gridgain.com>.
I don’t think there’s anything “out of the box,” but you could write a custom CacheStore to do that.
See here for more details: https://apacheignite.readme.io/docs/3rd-party-store#section-custom-cachestore
Regards,
Stephen
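To make the suggestion concrete, here is a hedged skeleton of such a custom CacheStore (the CacheStoreAdapter hooks are real Ignite API; the S3 interaction is deliberately left as comments, since the client code and bucket layout are assumptions about your setup):

```scala
import javax.cache.Cache
import org.apache.ignite.cache.store.CacheStoreAdapter

// Skeleton only: wire this into CacheConfiguration.setCacheStoreFactory
// and enable read-through/write-through on the cache.
class S3CacheStore extends CacheStoreAdapter[String, String] {
  // Called on a cache miss when read-through is enabled.
  override def load(key: String): String = {
    // Fetch the S3 object for `key` and return its contents here.
    null
  }

  // Called on cache updates when write-through is enabled.
  override def write(entry: Cache.Entry[_ <: String, _ <: String]): Unit = {
    // Put entry.getValue back to S3 under entry.getKey here.
  }

  // Called on cache removals.
  override def delete(key: Any): Unit = {
    // Delete the corresponding S3 object here.
  }
}
```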
> On 9 Aug 2019, at 21:50, sri hari kali charan Tummala <ka...@gmail.com> wrote:
>
> one last question: is there an S3 connector for Ignite that can load S3 objects into the Ignite cache in real time and write data updates directly back to S3? I can use Spark as one alternative, but is there another way of doing it?
>
> Let's say I want to build an in-memory, near-real-time data lake: files that land in S3 automatically get loaded into Ignite (I can use Spark Structured Streaming jobs, but is there a direct approach?)
Re: Ignite Spark Example Question
Posted by sri hari kali charan Tummala <ka...@gmail.com>.
one last question: is there an S3 connector for Ignite that can load S3
objects into the Ignite cache in real time and write data updates
directly back to S3? I can use Spark as one alternative, but is there
another way of doing it?
Let's say I want to build an in-memory, near-real-time data lake: files
that land in S3 automatically get loaded into Ignite (I can use Spark
Structured Streaming jobs, but is there a direct approach?)
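The Structured Streaming route mentioned here could be sketched roughly as follows (hedged: the bucket path, schema, and checkpoint location are placeholder assumptions; foreachBatch reuses the batch Ignite DataFrame writer):

```scala
import org.apache.ignite.spark.IgniteDataFrameSettings._
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.types._

object S3ToIgniteStream extends App {
  val spark = SparkSession.builder().appName("S3ToIgnite").getOrCreate()

  // A streaming file source needs the schema declared up front.
  val schema = new StructType()
    .add("id", LongType)
    .add("name", StringType)
    .add("department", StringType)

  // Watch an S3 prefix; new files are picked up as they land.
  val persons = spark.readStream.schema(schema).json("s3a://my-bucket/landing/")

  // Each micro-batch is appended to Ignite with the batch writer.
  persons.writeStream
    .option("checkpointLocation", "/tmp/ignite-s3-checkpoint")
    .foreachBatch { (batch: DataFrame, _: Long) =>
      batch.write
        .format(FORMAT_IGNITE)
        .option(OPTION_CONFIG_FILE, "config/example-ignite.xml")
        .option(OPTION_TABLE, "json_person")
        .mode(SaveMode.Append)
        .save()
    }
    .start()
    .awaitTermination()
}
```

This still polls S3 under the hood (the file source lists the prefix each trigger), so it does not change Stephen's point that nothing push-based comes out of the box.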
On Fri, Aug 9, 2019 at 4:34 PM sri hari kali charan Tummala <
kali.tummala@gmail.com> wrote:
> Thank you, I got it. Now I have to change the id values to see the same
> data come back as extra records (this is just for testing). Amazing.
>
> val df = spark.sql("SELECT monotonically_increasing_id() AS id, name,
> department FROM json_person")
>
> df.write.mode(SaveMode.Append)... to Ignite
>
> Thanks
> Sri
--
Thanks & Regards
Sri Tummala
Re: Ignite Spark Example Question
Posted by sri hari kali charan Tummala <ka...@gmail.com>.
Thank you, I got it. Now I have to change the id values to see the same
data come back as extra records (this is just for testing). Amazing.
val df = spark.sql("SELECT monotonically_increasing_id() AS id, name,
department FROM json_person")
df.write.mode(SaveMode.Append)... to Ignite
Thanks
Sri
On Fri, Aug 9, 2019 at 6:08 AM Andrei Aleksandrov <ae...@gmail.com>
wrote:
> Hi,
> Spark has several *SaveModes* that are applied when the table you are
> going to use already exists:
> * *Overwrite* - the existing table is dropped and re-created (or a new
> one is created), and the data is loaded using the IgniteDataStreamer
> implementation
> * *Append* - the table is not re-created; the data is simply loaded
> into the existing table
> * *ErrorIfExists* - you get an exception if the table already exists
> * *Ignore* - nothing is done if the table already exists: the save
> operation does not write the contents of the DataFrame and does not
> change the existing data
> Regarding your question: you should use the *Append* SaveMode for your
> Spark integration if you want to store new data in the cache while
> keeping the previously stored data.
> Note that if you store rows with the same primary keys, the data will
> be overwritten in the Ignite table. For example:
> 1) Add person {id=1, name=Vlad, age=19}, where id is the primary key
> 2) Add person {id=1, name=Nikita, age=26}, where id is the primary key
> In Ignite you will then see only {id=1, name=Nikita, age=26}.
> Here you can find a code sample and more information about SaveModes:
> https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes
> BR,
> Andrei
--
Thanks & Regards
Sri Tummala
Re: Ignite Spark Example Question
Posted by Andrei Aleksandrov <ae...@gmail.com>.
Hi,
Spark has several *SaveModes* that are applied when the table you are
going to use already exists:
* *Overwrite* - the existing table is dropped and re-created (or a new
one is created), and the data is loaded using the IgniteDataStreamer
implementation
* *Append* - the table is not re-created; the data is simply loaded
into the existing table
* *ErrorIfExists* - you get an exception if the table already exists
* *Ignore* - nothing is done if the table already exists: the save
operation does not write the contents of the DataFrame and does not
change the existing data
Regarding your question: you should use the *Append* SaveMode for your
Spark integration if you want to store new data in the cache while
keeping the previously stored data.
Note that if you store rows with the same primary keys, the data will be
overwritten in the Ignite table. For example:
1) Add person {id=1, name=Vlad, age=19}, where id is the primary key
2) Add person {id=1, name=Nikita, age=26}, where id is the primary key
In Ignite you will then see only {id=1, name=Nikita, age=26}.
Here you can find a code sample and more information about SaveModes:
https://apacheignite-fs.readme.io/docs/ignite-data-frame#section-saving-dataframes
BR,
Andrei
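The primary-key behaviour described above can be illustrated with a plain Scala map standing in for the Ignite table (illustration only; no Ignite API involved):

```scala
// A map keyed by the primary key models the table: appending a row with
// an existing key replaces the old value instead of adding a second row.
val table = scala.collection.mutable.Map[Int, (String, Int)]()

table += 1 -> ("Vlad", 19)   // 1) first insert for id=1
table += 1 -> ("Nikita", 26) // 2) same key: overwritten, count stays 1

assert(table.size == 1)
assert(table(1) == ("Nikita", 26))
```

This is why inserting the same 4 rows twice still yields 4 records: Append does not duplicate rows whose primary keys already exist.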