Posted to user@ignite.apache.org by Banias H <ba...@gmail.com> on 2019/05/16 20:40:53 UTC

count not updated through SQL using DataStreamer

Hello Ignite experts,

I am very new to Ignite. I am trying to ingest 15M rows of data using
DataStreamer into a table in a two-node Ignite cluster (v2.7), but the
ingested data does not show up when I run SQL queries from DBeaver.

Here is the list of steps I took:

1. Start up two nodes using the following XML.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
       http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans.xsd">
  <bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Enabling Apache Ignite native persistence. -->
    <property name="dataStorageConfiguration">
      <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
          <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
            <property name="persistenceEnabled" value="true"/>
          </bean>
        </property>
      </bean>
    </property>
    <property name="discoverySpi">
      <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
          <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
            <property name="addresses">
              <list>
                <value>IP1</value>
                <value>IP2</value>
              </list>
            </property>
          </bean>
        </property>
      </bean>
    </property>
    <property name="cacheConfiguration">
      <bean class="org.apache.ignite.configuration.CacheConfiguration">
        <property name="name" value="FOO"/>
        <property name="cacheMode" value="PARTITIONED"/>
        <property name="backups" value="0"/>
      </bean>
    </property>
  </bean>
</beans>

2. Use the Python thin client to get the cache SQL_PUBLIC_FOO and insert ten
rows of data. After this step, both the thin client and the DBeaver SQL
client report the same count:

- thin client:

from pyignite import Client

# IP1 and IP2 stand for the addresses of the two server nodes.
nodes = [
    (IP1, 10800),
    (IP2, 10800),
]
client = Client()
client.connect(nodes)
cache = client.get_cache("SQL_PUBLIC_FOO")
print(cache.get_size())

returns 10

- SQL through DBeaver

SELECT COUNT(*) FROM FOO

returns 10

3. However, when I used DataStreamer to ingest 100 rows into the cache
SQL_PUBLIC_FOO, only the thin client showed the new count; SQL still
returned the old count:

- ingesting through DataStreamer
// I ran the jar on one of the Ignite nodes.
String CONFIG_FILE = <path to the xml file shown above>;
Ignition.setClientMode(true);
Ignite ignite = Ignition.start(CONFIG_FILE);
IgniteDataStreamer<Integer, String> stmr = ignite.dataStreamer("SQL_PUBLIC_FOO");
stmr.addData(rowCount, value);

- thin client:

nodes = [
    (IP1, 10800),
    (IP2, 10800),
]
client = Client()
client.connect(nodes)
cache = client.get_cache("SQL_PUBLIC_FOO")
print(cache.get_size())

returns 110

- SQL through DBeaver

SELECT COUNT(*) FROM FOO

returns 10

Could anyone shed some light on what I did wrong? I would like to use
DataStreamer to load much more data into the cache and then query it
through SQL.

Thanks for the help. I appreciate it.

Regards,
Calvin

Re: count not updated through SQL using DataStreamer

Posted by Ivan Pavlukhina <vo...@gmail.com>.
Hi Calvin,

Cache.size and SELECT COUNT(*) are not always equal in Ignite. Could you please tell us what arguments you passed to the IgniteDataStreamer.addData method?



Re: count not updated through SQL using DataStreamer

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Can you show how you created the FOO table? I guess the expected cache type
is NOT <Integer, String>, which is why no data is seen.

Regards,
-- 
Ilya Kasnacheev
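
To illustrate the two replies above, here is a toy model in plain Python. It is not Ignite code, and ToyCache and FooRow are hypothetical names made up for this sketch. It assumes Ilya's guess is right: the key/value store accepts any value, while the SQL side only sees entries whose value has the type the table was declared with. Under that assumption, the cache size and the SQL count diverge in the same way as reported in the question.

```python
from dataclasses import dataclass, field


@dataclass
class FooRow:
    """Hypothetical row type for table FOO."""
    name: str


@dataclass
class ToyCache:
    """Toy stand-in for a cache backing one SQL table (not Ignite code)."""
    table_value_type: type                       # type the SQL table expects
    entries: dict = field(default_factory=dict)  # every key/value pair stored

    def put(self, key, value):
        # The key/value store accepts any value, whatever its type.
        self.entries[key] = value

    def size(self):
        # Analogue of cache.get_size(): counts every stored entry.
        return len(self.entries)

    def sql_count(self):
        # Analogue of SELECT COUNT(*): counts only entries whose value
        # has the type the table was declared with.
        return sum(
            isinstance(v, self.table_value_type) for v in self.entries.values()
        )


cache = ToyCache(table_value_type=FooRow)

# Ten rows stored with the table's own row type.
for i in range(10):
    cache.put(i, FooRow(name=f"row-{i}"))

# One hundred entries stored as plain strings, as with <Integer, String>.
for i in range(10, 110):
    cache.put(i, f"row-{i}")

print(cache.size())       # 110
print(cache.sql_count())  # 10
```

If this is what happens in the real cluster, the streamed values would need to be built with the value type that the FOO table actually uses, which is why the table definition matters.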

