You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@ignite.apache.org by joseheitor <jo...@heitorprojects.com> on 2018/11/18 07:36:13 UTC

JDBC Streaming

JDBC Client connection URL with 'streaming=true' runs (much faster than
without) but no data is inserted into table. No errors are reported.

I have two nodes with one partitioned table. Without 'streaming=true' it
take around 8 hours to load 8,000,000 rows. The insert loop takes only
around 8 minutes with streaming=true, but no data is inserted into the
table.

Any suggestions?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Hi Ilya,

Tried executing SQL statement - 'SET STREAMING ON' from code, but this
produces an exception.

Here is my table definition:

CREATE TABLE public.test_data (
  pk LONG, id LONG, k VARCHAR, v VARCHAR,  PRIMARY KEY (pk, id))
  WITH "TEMPLATE=PARTITIONED, BACKUPS=1, ATOMICITY=TRANSACTIONAL,
WRITE_SYNCHRONIZATION_MODE=FULL_SYNC, AFFINITY_KEY=id";

CREATE INDEX test_data_id_k_v ON public.test_data (id, k, v);
CREATE INDEX test_data_k_id_v ON public.test_data (k, id, v);
CREATE INDEX test_data_k_v_id ON public.test_data (k, v, id);
CREATE INDEX test_data_id ON public.test_data (id);
CREATE INDEX test_data_k ON public.test_data (k);
CREATE INDEX test_data_v ON public.test_data (v);
CREATE INDEX test_data_k_v ON public.test_data (k, v);

Here is my code:

    public static void streamTestData() throws Exception {

        Connection dbConnection = null;
        PreparedStatement statement = null;

        String insertTableSQL = "INSERT INTO test_data (pk, id, k, v) VALUES
(?, ?, ?, ?)";
        try {
            
            Class.forName("org.apache.ignite.IgniteJdbcDriver");
            dbConnection = DriverManager.getConnection(
                   
"jdbc:ignite:cfg://cache=DATASTORE:streaming=true:streamingFlushFrequency=1000@file:///home/me/ignite-jdbc-client.xml",
                    "", "");
            dbConnection.setSchema("PUBLIC");

            statement = dbConnection.prepareStatement(insertTableSQL);
            int pk = 0;
            for (int id = 0; id < 10000; id ++) {

                statement.setInt(1, ++pk);
                statement.setInt(2, id);
                statement.setString(3, "trans.cust.first_name");
                statement.setString(4, pk % 2 == 0 ? "Joe" : "Mary");
                statement.executeUpdate();

                statement.setInt(1, ++pk);
                statement.setInt(2, id);
                statement.setString(3, "trans.cust.last_name");
                statement.setString(4, pk % 2 == 0 ? "Smythe" : "Pearson");
                statement.executeUpdate();

                statement.setInt(1, ++pk);
                statement.setInt(2, id);
                statement.setString(3, "trans.cust.email");
                statement.setString(4, pk % 2 == 0 ? "joe@work.com" :
"mary@home.org");
                statement.executeUpdate();

                statement.setInt(1, ++pk);
                statement.setInt(2, id);
                statement.setString(3, "trans.cust.gender");
                statement.setString(4, pk % 2 == 0 ? "Male" : "Female");
                statement.executeUpdate();

                long now = System.currentTimeMillis();

                statement.setInt(1, ++pk);
                statement.setInt(2, id);
                statement.setString(3, "trans.date");
                statement.setString(4, "201" + String.valueOf(now % 10) +
"-12-0" + String.valueOf(pk % 10));
                statement.executeUpdate();

                statement.setInt(1, ++pk);
                statement.setInt(2, id);
                statement.setString(3, "trans.amount");
                statement.setString(4, String.valueOf((now % 1000000) + 1));
                statement.executeUpdate();
            }
        } catch (Exception e) {
            e.printStackTrace(System.err);
        } finally {

            if (statement != null) {
                statement.close();
            }

            if (dbConnection != null) {
                dbConnection.close();
            }
        }
    }




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Have you tried force-flushing by doing SET STREAMING OFF; SET STREAMING ON;
every once in a while?

If this does not help, can you post a small reproducer on github?

Regards,
-- 
Ilya Kasnacheev


вс, 18 нояб. 2018 г. в 10:43, joseheitor <jo...@heitorprojects.com>:

> JDBC Client connection URL with 'streaming=true' runs (much faster than
> without) but no data is inserted into table. No errors are reported.
>
> I have two nodes with one partitioned table. Without 'streaming=true' it
> take around 8 hours to load 8,000,000 rows. The insert loop takes only
> around 8 minutes with streaming=true, but no data is inserted into the
> table.
>
> Any suggestions?
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Can anyone help with this?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: [RESOLVED] JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Awesome - it worked!

Thanks, Ilya.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: [RESOLVED] JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Thanks Ilya,

Will try your suggestions...

Where in the XML config do I set the 'failure detection timeout'?

Thanks,
Jose



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: [RESOLVED] JDBC Streaming

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Can you try to increase 'failure detection timeout' property of your
DiscoverySpi to something like 5 minutes (300000)?
This is failure detection handler which is supposed to catch deadlocks but
instead it catches things like WAL copying.

Also consider turning off WAL while ingesting data, turning it back on when
data load is finished.

this is by ALTER TABLE public.tableName NOLOGGING;
and then back ALTER TABLE public.tableName LOGGING;

Regards,
-- 
Ilya Kasnacheev


пт, 7 дек. 2018 г. в 19:29, joseheitor <jo...@heitorprojects.com>:

> Hi Ilya,
>
> Your recommendation works - thanks.
>
> When loading a few million records, I always seem to experience a node
> failure on one of my two nodes.
>
> I have attached the log from a recent run on version 2.7.0, failure
> occurred
> after about 10 minutes...
>
> Please let me know if you see anything I must address to prevent this?
>
> Thanks,
> Jose
>
> ignite-66c8e66a.gz
> <
> http://apache-ignite-users.70518.x6.nabble.com/file/t1652/ignite-66c8e66a.gz>
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: [RESOLVED] JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Hi Ilya,

Your recommendation works - thanks.

When loading a few million records, I always seem to experience a node
failure on one of my two nodes.

I have attached the log from a recent run on version 2.7.0, failure occurred
after about 10 minutes...

Please let me know if you see anything I must address to prevent this?

Thanks,
Jose

ignite-66c8e66a.gz
<http://apache-ignite-users.70518.x6.nabble.com/file/t1652/ignite-66c8e66a.gz>  



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: [RESOLVED] JDBC Streaming

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Yes, thin client has different approach to failover. However, it should not
freeze. Please collect thread dumps so that we can see why it froze. The
expectation here is that connection will be eventually dropped.

I have found a reason for your Thick driver streaming troubles. Turns out,
you have to specify cache=the cache you are streaming do.
You were supplying DATASTORE but were actually trying to stream to
TRANSACTIONS table which isn't in it.
The correct line will be:

if (stream)
    url = "jdbc:ignite:cfg://cache=SQL_PUBLIC_TRANSACTIONS:streaming=true:streamingFlushFrequency=1000@file://"
+ config;

I have created a ticket https://issues.apache.org/jira/browse/IGNITE-10515

Can't promise it will get traction since everyone seems to switch to
thin driver.

Regards,

-- 

Ilya Kasnacheev


пн, 3 дек. 2018 г. в 20:10, joseheitor <jo...@heitorprojects.com>:

> Hooray!!! - It works.
>
> Thanks, Ilya.
>
> Please continue your investigation of the JDBC Client Driver
> (thick-client),
> and let me know what you find...?
>
> What follows should perhaps be posted separately...but here's something I
> noticed, which I don't fully understand or know how to deal with:
>
> While bulk-loading data previously without streaming, via the THICK Client,
> I had a node go down. Data continued to load though, as my other node
> remained active. When I noticed this, I simply started up the failed node
> again. Data-loading paused for a while (presumably while the nodes
> synced???), and then continued until completed.
>
> Now, while bulk-loading data with streaming, on the THIN Client, I also had
> a node go down. Data did not however continue to load. The application did
> not see a dropped connection - it simply froze. When I noticed this, I
> started up the failed node again, as before. Data-loading did not however
> resume loading as before. (I had to kill the application and restart
> it...).
>
> Should Ignite not throw an exception, to alert the application that the
> database connection dropped? Or is there support for a connection timeout
> for an SQL execution statement? Or (even better) should the driver not have
> automatically detected the connection loss and re-established a new
> connection to the other node (as the THICK client presumably does...)? and
> continued.
>
> Thanks,
> Jose
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

[RESOLVED] JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Hooray!!! - It works.

Thanks, Ilya.

Please continue your investigation of the JDBC Client Driver (thick-client),
and let me know what you find...?

What follows should perhaps be posted separately...but here's something I
noticed, which I don't fully understand or know how to deal with:

While bulk-loading data previously without streaming, via the THICK Client,
I had a node go down. Data continued to load though, as my other node
remained active. When I noticed this, I simply started up the failed node
again. Data-loading paused for a while (presumably while the nodes
synced???), and then continued until completed.

Now, while bulk-loading data with streaming, on the THIN Client, I also had
a node go down. Data did not however continue to load. The application did
not see a dropped connection - it simply froze. When I noticed this, I
started up the failed node again, as before. Data-loading did not however
resume loading as before. (I had to kill the application and restart it...).

Should Ignite not throw an exception, to alert the application that the
database connection dropped? Or is there support for a connection timeout
for an SQL execution statement? Or (even better) should the driver not have
automatically detected the connection loss and re-established a new
connection to the other node (as the THICK client presumably does...)? and
continued.

Thanks,
Jose



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Indeed, you will need to reconnect if node that you're connected to fails.

It always supported returning cluster-wide data retrieval.

Regarda,
-- 
Ilya Kasnacheev


пн, 3 дек. 2018 г. в 15:25, joseheitor <jo...@heitorprojects.com>:

> Hi Ilya,
>
> Thanks for the response.
>
> My understanding was that Thin JDBC driver was only able to connect to a
> single node (not a cluster), so that if that node failed - it was not able
> to continue operating on the cluster... It would also only return data
> residing on that node (not records residing on other cluster nodes)???
>
> But I see that the docs now mention failover support with a list of hosts
> and aggregation of data from multiple nodes on the 'connected' node (was
> this added recently?).
>
> I will apply the necessary changes and test your suggestion.
>
> Thanks,
> Jose
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Hi Ilya,

Thanks for the response.

My understanding was that Thin JDBC driver was only able to connect to a
single node (not a cluster), so that if that node failed - it was not able
to continue operating on the cluster... It would also only return data
residing on that node (not records residing on other cluster nodes)???

But I see that the docs now mention failover support with a list of hosts
and aggregation of data from multiple nodes on the 'connected' node (was
this added recently?).

I will apply the necessary changes and test your suggestion.

Thanks,
Jose



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Apache Ignite SQL should be accessed by Ignite JDBC Thin driver. This is
the preferred way.

JDBC Thin driver also has streaming mode in the form of SET STREAMING
ON/OFF.

Please see attached file where I have introduced "thin" mode.

As for client mode streaming not working, I will look into it further.

Regards,
-- 
Ilya Kasnacheev


пн, 3 дек. 2018 г. в 01:13, joseheitor <jo...@heitorprojects.com>:

> Hi Ilya,
>
> Any update on your investigation of this issue...?
>
> Your comments that 'streaming mode' in Client driver and Client driver
> itself are near-deprecated - are very surprising and concerning!
>
> 1. Are you saying that Apache Ignite SQL will seize to be accessible via
> standard JDBC?
>
> 2. If 'streaming mode' is to be deprecated - will there be an alternative
> method of inserting high-throughput data via SQL?
>
> Thanks,
> Jose
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Hi Ilya,

Any update on your investigation of this issue...?

Your comments that 'streaming mode' in Client driver and Client driver
itself are near-deprecated - are very surprising and concerning!

1. Are you saying that Apache Ignite SQL will seize to be accessible via
standard JDBC?

2. If 'streaming mode' is to be deprecated - will there be an alternative
method of inserting high-throughput data via SQL?

Thanks,
Jose



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Hi Ilya,

Please see attached archive with simple application demonstrating the
problem.

ignite-streaming.gz
<http://apache-ignite-users.70518.x6.nabble.com/file/t1652/ignite-streaming.gz>  

Once you have extracted the contents of the archive - please refer to the
README for instructions... (and source)

Thanks,
Jose





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Can you create a small reproducer project which will exhibit this behavior?
Put it on e.g. Github.

Note that 'streaming mode' in Client driver and Client driver itself are
near-deprecated. So there may be some rough edges.
However, there are streaming mode tests in Apache Ignite and they pass.

Regards,
-- 
Ilya Kasnacheev


чт, 29 нояб. 2018 г. в 19:45, joseheitor <jo...@heitorprojects.com>:

> Hi Ilya,
>
> Yes - I am using JDBC Client driver to INSERT data into the SQL table. It
> works correctly (but slow) without setting 'streaming=true'.
>
> When I set 'streaming=true' in the connection string, as per the Ignite
> docs
> (
>
> https://apacheignite-sql.readme.io/docs/jdbc-client-driver#section-streaming-mode
> <
> https://apacheignite-sql.readme.io/docs/jdbc-client-driver#section-streaming-mode>
>
> ), then my data-insert code runs much faster, and without errors - but the
> table remains empty.
>
> I have followed the instructions on the docs carefully and reviewed several
> times over the last couple of days.
>
> There is either something else that needs to be done and is undocumented -
> or there is a bug in this feature.
>
> Please can you verify this functionality on your end and confirm...?
>
> Thanks,
> Jose
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Hi Ilya,

Yes - I am using JDBC Client driver to INSERT data into the SQL table. It
works correctly (but slow) without setting 'streaming=true'.

When I set 'streaming=true' in the connection string, as per the Ignite docs
(
https://apacheignite-sql.readme.io/docs/jdbc-client-driver#section-streaming-mode
<https://apacheignite-sql.readme.io/docs/jdbc-client-driver#section-streaming-mode> 
), then my data-insert code runs much faster, and without errors - but the
table remains empty.

I have followed the instructions on the docs carefully and reviewed several
times over the last couple of days.

There is either something else that needs to be done and is undocumented -
or there is a bug in this feature.

Please can you verify this functionality on your end and confirm...?

Thanks,
Jose



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

You can use DML on thick client, too, but it will be slower than Data
Streamer approach.

Regards,
-- 
Ilya Kasnacheev


вт, 27 нояб. 2018 г. в 11:22, joseheitor <jo...@heitorprojects.com>:

> Shouldn't I be able to do this through plain JDBC DML operations? The
> documentation for the JDBC Client Driver specifies this as a feature...
>
> Don't know whether I am doing something wrong, or if there is a bug?
>
> Is there any example showcasing this feature?
>
> Thanks,
> Jose
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Shouldn't I be able to do this through plain JDBC DML operations? The
documentation for the JDBC Client Driver specifies this as a feature...

Don't know whether I am doing something wrong, or if there is a bug?

Is there any example showcasing this feature?

Thanks,
Jose



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by Stephen Darlington <st...@gridgain.com>.
The streaming examples should work fine for you. “Grid caches” and “SQL tables” are not two, different things. They are just two ways of accessing the same underlying structures. You can happily insert data using a Data Streamer and access it later using SQL.

Regards,
Stephen

> On 26 Nov 2018, at 09:40, joseheitor <jo...@heitorprojects.com> wrote:
> 
> Hi Ilya,
> 
> I realize that Streaming is not supported via the JDBC Client node interface
> (*thanks for that info). But is it possible to stream-import data via (code)
> SQL directly and load it into persistent SQL table?
> 
> (The Ignite code examples for Streaming only show loading data into grid
> caches...not SQL tables)
> 
> Thanks
> 
> 
> 
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/



Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Hi Ilya,

I realize that Streaming is not supported via the JDBC Client node interface
(*thanks for that info). But is it possible to stream-import data via (code)
SQL directly and load it into persistent SQL table?

(The Ignite code examples for Streaming only show loading data into grid
caches...not SQL tables)

Thanks



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Yes, streaming mode is not supported by client node driver. In this case
your best bet is using Data Streamer anyway.
https://apacheignite.readme.io/v2.6/docs/data-streamers

Regards,
-- 
Ilya Kasnacheev


вс, 25 нояб. 2018 г. в 11:45, joseheitor <jo...@heitorprojects.com>:

> Is Streaming perhaps not supported with JDBC Client driver...?
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Is Streaming perhaps not supported with JDBC Client driver...?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: JDBC Streaming

Posted by joseheitor <jo...@heitorprojects.com>.
Can anyone spot if I am doing something wrong...?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/