Posted to user@ignite.apache.org by csumi <sc...@sapient.com> on 2017/08/02 09:35:11 UTC

Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

The Ignite Hadoop accelerator is configured as described at
https://apacheignite.readme.io/v1.0/docs/hadoop-accelerator#section-secondary-file-system.

The idea is to run Hive queries on IGFS, but it does not seem to be working that
way. I am not sure how to confirm whether Hive is connecting to HDFS or to IGFS.
Also, I think that creating a partition by passing a comma-separated string to
fs.append is not creating the partition correctly, which results in Hive queries
returning no rows.
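
For concreteness, the append is done roughly like this (a minimal sketch; the
IGFS name, warehouse path, and row values here are illustrative):

import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteFileSystem;
import org.apache.ignite.Ignition;
import org.apache.ignite.igfs.IgfsPath;

public class IgfsAppendSketch {
    public static void main(String[] args) throws Exception {
        // Start a node with the same default-config as shown below.
        Ignite ignite = Ignition.start("default-config.xml");
        IgniteFileSystem fs = ignite.fileSystem("igfs"); // IGFS name is an assumption

        // Hive-style partitioned location: the partition values live in the path.
        IgfsPath path = new IgfsPath(
            "/usr/hive/warehouse/yt.db/stocks3/years=2017/months=7/days=4/data");

        // One comma-separated row, matching ROW FORMAT DELIMITED ... TERMINATED BY ','.
        String row = "AAPL,1501236980,120.34\n";

        // Append to the file, creating it if it does not exist yet (create = true).
        try (OutputStream out = fs.append(path, true)) {
            out.write(row.getBytes(StandardCharsets.UTF_8));
        }
    }
}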

Below are some of the configurations I have. Please let me know if I need to
share any more details.

core-site.xml has these properties:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>
<property>
    <name>fs.igfs.impl</name>
    <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
</property>
<property>
    <name>fs.AbstractFileSystem.igfs.impl</name>
    <value>org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem</value>
</property>
<property>
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>NEVER</value>
</property>

Ignite's default-config.xml has the configuration below:

<property name="secondaryFileSystem">
    <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
        <property name="fileSystemFactory">
            <bean class="org.apache.ignite.hadoop.fs.CachingHadoopFileSystemFactory">
                <property name="uri" value="hdfs://localhost:9000/"/>
                <property name="configPaths">
                    <list>
                        <value>D:/hadoop/etc/hadoop/core-site.xml</value>
                    </list>
                </property>
            </bean>
        </property>
    </bean>
</property>
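
For reference, the same secondary file system setup expressed programmatically
would look roughly like this (a sketch; the Spring XML above is what I actually
use):

import org.apache.ignite.configuration.FileSystemConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.hadoop.fs.CachingHadoopFileSystemFactory;
import org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem;

public class SecondaryFsConfigSketch {
    public static IgniteConfiguration configure() {
        // Factory producing the underlying HDFS file system.
        CachingHadoopFileSystemFactory factory = new CachingHadoopFileSystemFactory();
        factory.setUri("hdfs://localhost:9000/");
        factory.setConfigPaths("D:/hadoop/etc/hadoop/core-site.xml");

        // Wrap HDFS as the secondary file system behind IGFS.
        IgniteHadoopIgfsSecondaryFileSystem secondaryFs =
            new IgniteHadoopIgfsSecondaryFileSystem();
        secondaryFs.setFileSystemFactory(factory);

        FileSystemConfiguration igfsCfg = new FileSystemConfiguration();
        igfsCfg.setSecondaryFileSystem(secondaryFs);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setFileSystemConfiguration(igfsCfg);
        return cfg;
    }
}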








Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by csumi <sc...@sapient.com>.
Hi Mikhail,

Your last reply shows up as blank. Any luck reproducing the issue?

Thanks!




Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by Mikhail Getmanov <mi...@getmanov.name>.

Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by csumi <sc...@sapient.com>.
Hi Mikhail,

Any luck reproducing the issue?




Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by Michael Cherkasov <mi...@gmail.com>.
OK, I'll try to reproduce this issue locally and will respond tomorrow.


Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by csumi <sc...@sapient.com>.
Yes, if I create the partition using Hive, the select works fine. Please see the
last two bullets of my previous message. Copying them here for quick reference:

- Now insert a new row into the table, targeting the partition created through
code earlier:
        insert into table stocks3 PARTITION (years=2017,months=7,days=4)
        values('AAPL',1501236980,120.34);
- Run the select query again. Now it returns 3 rows: two that were inserted
using the insert command, and one written through code that was not showing
up in the select query earlier.





Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by Michael Cherkasov <mi...@gmail.com>.
Hi,

Could you please clarify: if you run all actions using IGFS, but instead of
fs.append you use Hive, like:

insert into table stocks PARTITION (years=2004,months=12,days=3)
values('AAPL',1501236980,120.34);

does the select work this time?

Thanks,
Mikhail.


Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by csumi <sc...@sapient.com>.
Let me try to clarify with the sequence of steps performed.
-	Created the table with partitions through Hive using the query below. It
creates a directory in HDFS:
	create table stocks3 (stock string, time timestamp, price float)
	PARTITIONED BY (years bigint, months bigint, days bigint) ROW FORMAT
	DELIMITED FIELDS TERMINATED BY ',';
-	Then I receive streaming data, and through IgniteFileSystem's append/create
methods it gets saved to the Ignite-accelerated Hadoop.
-	Ran the select query below. No result returned:
	select * from stocks3;
-	Stopped Ignite and ran the select again on Hive. No result, with the logs below:

hive> select * from stocks3;
17/08/04 14:59:08 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da
17/08/04 14:59:08 INFO session.SessionState: Updating thread name to
b5e3e924-e46a-481c-8aef-30d48605a2da main
17/08/04 14:59:08 WARN operation.Operation: Unable to create operation log
file:
D:\tmp\hive\<user>\operation_logs\b5e3e924-e46a-481c-8aef-30d48605a2da\137adad6-ea23-462c-a414-6ce260e5bd49
java.io.IOException: The system cannot find the path specified
        at java.io.WinNTFileSystem.createFileExclusively(Native Method)
        at java.io.File.createNewFile(File.java:1012)
        at
org.apache.hive.service.cli.operation.Operation.createOperationLog(Operation.java:237)
        at
org.apache.hive.service.cli.operation.Operation.beforeRun(Operation.java:279)
        at
org.apache.hive.service.cli.operation.Operation.run(Operation.java:314)
        at
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:499)
        at
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:486)
        at
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:295)
        at
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:506)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:491)
        at
org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1412)
        at com.sun.proxy.$Proxy21.ExecuteStatement(Unknown Source)
        at
org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:308)
        at
org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:250)
        at
org.apache.hive.beeline.Commands.executeInternal(Commands.java:988)
        at org.apache.hive.beeline.Commands.execute(Commands.java:1160)
        at org.apache.hive.beeline.Commands.sql(Commands.java:1074)
        at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1148)
        at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:976)
        at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:886)
        at org.apache.hive.beeline.cli.HiveCli.runWithArgs(HiveCli.java:35)
        at org.apache.hive.beeline.cli.HiveCli.main(HiveCli.java:29)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:491)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
17/08/04 14:59:08 INFO ql.Driver: Compiling
command(queryId=<user>_20170804145908_b270c978-ab00-4160-a2a6-c19b42eab676):
select * from stocks3
17/08/04 14:59:08 INFO parse.CalcitePlanner: Starting Semantic Analysis
17/08/04 14:59:08 INFO parse.CalcitePlanner: Completed phase 1 of Semantic
Analysis
17/08/04 14:59:08 INFO parse.CalcitePlanner: Get metadata for source tables
17/08/04 14:59:08 INFO metastore.HiveMetaStore: 0: get_table : db=yt
tbl=stocks3
17/08/04 14:59:08 INFO HiveMetaStore.audit: ugi=<user>  ip=unknown-ip-addr     
cmd=get_table : db=yt tbl=stocks3
17/08/04 14:59:08 INFO parse.CalcitePlanner: Get metadata for subqueries
17/08/04 14:59:08 INFO parse.CalcitePlanner: Get metadata for destination
tables
17/08/04 14:59:09 INFO ql.Context: New scratch dir is
hdfs://localhost:9000/tmp/hive/<user>/b5e3e924-e46a-481c-8aef-30d48605a2da/hive_2017-08-04_14-59-08_935_8316159022041430928-1
17/08/04 14:59:09 INFO parse.CalcitePlanner: Completed getting MetaData in
Semantic Analysis
17/08/04 14:59:09 INFO parse.CalcitePlanner: Get metadata for source tables
17/08/04 14:59:09 INFO metastore.HiveMetaStore: 0: get_table : db=yt
tbl=stocks3
17/08/04 14:59:09 INFO HiveMetaStore.audit: ugi=<user>  ip=unknown-ip-addr     
cmd=get_table : db=yt tbl=stocks3
17/08/04 14:59:09 INFO parse.CalcitePlanner: Get metadata for subqueries
17/08/04 14:59:09 INFO parse.CalcitePlanner: Get metadata for destination
tables
17/08/04 14:59:09 INFO ql.Context: New scratch dir is
hdfs://localhost:9000/tmp/hive/<user>/b5e3e924-e46a-481c-8aef-30d48605a2da/hive_2017-08-04_14-59-08_935_8316159022041430928-1
17/08/04 14:59:09 INFO common.FileUtils: Creating directory if it doesn't
exist:
hdfs://localhost:9000/tmp/hive/<user>/b5e3e924-e46a-481c-8aef-30d48605a2da/hive_2017-08-04_14-59-08_935_8316159022041430928-1/-mr-10001/.hive-staging_hive_2017-08-04_14-59-08_935_8316159022041430928-1
17/08/04 14:59:09 INFO parse.CalcitePlanner: CBO Succeeded; optimized
logical plan.
17/08/04 14:59:09 INFO ppd.OpProcFactory: Processing for FS(2)
17/08/04 14:59:09 INFO ppd.OpProcFactory: Processing for SEL(1)
17/08/04 14:59:09 INFO ppd.OpProcFactory: Processing for TS(0)
17/08/04 14:59:09 INFO metastore.HiveMetaStore: 0: get_partitions : db=yt
tbl=stocks3
17/08/04 14:59:09 INFO HiveMetaStore.audit: ugi=<user>  ip=unknown-ip-addr     
cmd=get_partitions : db=yt tbl=stocks3
17/08/04 14:59:09 INFO parse.CalcitePlanner: Completed plan generation
17/08/04 14:59:09 INFO ql.Driver: Semantic Analysis Completed
17/08/04 14:59:09 INFO ql.Driver: Returning Hive schema:
Schema(fieldSchemas:[FieldSchema(name:stocks3.stock, type:string,
comment:null), FieldSchema(name:stocks3.time, type:timestamp, comment:null),
FieldSchema(name:stocks3.price, type:float, comment:null),
FieldSchema(name:stocks3.years, type:bigint, comment:null),
FieldSchema(name:stocks3.months, type:bigint, comment:null),
FieldSchema(name:stocks3.days, type:bigint, comment:null)], properties:null)
17/08/04 14:59:09 INFO exec.TableScanOperator: Initializing operator TS[0]
17/08/04 14:59:09 INFO exec.SelectOperator: Initializing operator SEL[1]
17/08/04 14:59:09 INFO exec.SelectOperator: SELECT
struct<stock:string,time:timestamp,price:float,years:bigint,months:bigint,days:bigint>
17/08/04 14:59:09 INFO exec.ListSinkOperator: Initializing operator
LIST_SINK[3]
17/08/04 14:59:09 INFO ql.Driver: EXPLAIN output for queryid
<user>_20170804145908_b270c978-ab00-4160-a2a6-c19b42eab676 : STAGE
DEPENDENCIES:
  Stage-0 is a root stage [FETCH]

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: stocks3
          GatherStats: false
          Select Operator
            expressions: stock (type: string), time (type: timestamp), price
(type: float), years (type: bigint), months (type: bigint), days (type:
bigint)
            outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
            ListSink


17/08/04 14:59:09 INFO ql.Driver: Completed compiling
command(queryId=<user>_20170804145908_b270c978-ab00-4160-a2a6-c19b42eab676);
Time taken: 0.586 seconds
17/08/04 14:59:09 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da
17/08/04 14:59:09 INFO session.SessionState: Resetting thread name to  main
17/08/04 14:59:09 INFO ql.Driver: Concurrency mode is disabled, not creating
a lock manager
17/08/04 14:59:09 INFO ql.Driver: Executing
command(queryId=<user>_20170804145908_b270c978-ab00-4160-a2a6-c19b42eab676):
select * from stocks3
17/08/04 14:59:09 INFO ql.Driver: Completed executing
command(queryId=<user>_20170804145908_b270c978-ab00-4160-a2a6-c19b42eab676);
Time taken: 0.002 seconds
OK
17/08/04 14:59:09 INFO ql.Driver: OK
17/08/04 14:59:09 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da
17/08/04 14:59:09 INFO session.SessionState: Updating thread name to
b5e3e924-e46a-481c-8aef-30d48605a2da main
17/08/04 14:59:09 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da
17/08/04 14:59:09 INFO session.SessionState: Resetting thread name to  main
17/08/04 14:59:09 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da
17/08/04 14:59:09 INFO session.SessionState: Updating thread name to
b5e3e924-e46a-481c-8aef-30d48605a2da Thread-52
17/08/04 14:59:09 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da
17/08/04 14:59:09 INFO session.SessionState: Resetting thread name to 
Thread-52
17/08/04 14:59:09 WARN thrift.ThriftCLIService: Error fetching results:
org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated
with operation handle: OperationHandle [opType=EXECUTE_STATEMENT,
getHandleIdentifier()=137adad6-ea23-462c-a414-6ce260e5bd49]

        at
org.apache.hive.service.cli.operation.OperationManager.getOperationLogRowSet(OperationManager.java:324)
        at
org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:849)
        at
org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:505)
        at
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:698)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:491)
        at
org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1412)
        at com.sun.proxy.$Proxy21.FetchResults(Unknown Source)
        at
org.apache.hive.jdbc.HiveStatement.getQueryLog(HiveStatement.java:871)
        at
org.apache.hive.jdbc.HiveStatement.getQueryLog(HiveStatement.java:842)
        at
org.apache.hive.beeline.Commands.showRemainingLogsIfAny(Commands.java:1211)
        at org.apache.hive.beeline.Commands.access$200(Commands.java:68)
        at org.apache.hive.beeline.Commands$2.run(Commands.java:1187)
        at java.lang.Thread.run(Thread.java:724)
17/08/04 14:59:09 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da
17/08/04 14:59:09 INFO session.SessionState: Updating thread name to
b5e3e924-e46a-481c-8aef-30d48605a2da main
17/08/04 14:59:09 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da
17/08/04 14:59:09 INFO session.SessionState: Resetting thread name to  main
No rows selected (0.612 seconds)
17/08/04 14:59:09 INFO conf.HiveConf: Using the default value passed in for
log id: b5e3e924-e46a-481c-8aef-30d48605a2da

-	Data created in HDFS
(http://localhost:50070/explorer.html#/usr/hive/warehouse/yt.db/stocks3/years=2017/months=7/days=4)
is as follows:
	-rw-r--r--	<user>	supergroup	44 B	Aug 04 14:48	3	128 MB	1

-	Started Ignite.
-	Ran the insert query below:
	insert into table stocks3 PARTITION (years=2004,months=12,days=3)
	values('AAPL',1501236980,120.34);
-	A new partition was created at
http://localhost:50070/explorer.html#/usr/hive/warehouse/yt.db/stocks3/years=2004/months=12/days=3:
	-rwxr-xr-x	<user>	supergroup	15 B	Aug 04 15:16	1	128 MB	000000_0
-	Ran the select query below, which returns the row inserted by the above
insert:
	select * from stocks3;
-	Now insert a new row into the table, targeting the partition created through
code earlier:
	insert into table stocks3 PARTITION (years=2017,months=7,days=4)
	values('AAPL',1501236980,120.34);
-	Run the select query again. Now it returns 3 rows: two that were inserted
using the insert command, and one written through code that was not showing
up in the select query earlier.
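
(An aside on the above: the partition written through code only shows up after
Hive itself inserts into the table, which makes me suspect Hive only reads
partitions that are registered in its metastore. A rough sketch of registering
an externally written partition over JDBC; the HiveServer2 URL and credentials
are assumptions:)

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RegisterPartitionSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/yt", "", "");
             Statement st = con.createStatement()) {
            // Register the one partition written through code earlier...
            st.execute("ALTER TABLE stocks3 ADD IF NOT EXISTS PARTITION "
                     + "(years=2017, months=7, days=4)");
            // ...or scan the table location and register all missing partitions.
            st.execute("MSCK REPAIR TABLE stocks3");
        }
    }
}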




Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by Jörn Franke <jo...@gmail.com>.
I think it is still not clear what you are doing. What do you mean by using the fs.append function? Can you please provide each query that you execute? From where is the data inserted? Did you check all the log files of Hive and of YARN?

Also, single inserts are highly inefficient. Try to use CREATE TABLE AS SELECT, or INSERT OVERWRITE into a new partition (a rough sketch follows below).

As others have already said: if it does not work even without Ignite, then it is a Hive or HDFS problem.
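
A rough sketch of the bulk-load approach over JDBC (the HiveServer2 URL, the
database, and the staging_stocks table are assumptions for illustration):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BulkLoadSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/yt", "", "");
             Statement st = con.createStatement()) {
            // Rewrite the target partition from a staging table in a single job,
            // instead of launching one job per inserted row.
            st.execute("INSERT OVERWRITE TABLE stocks3 "
                     + "PARTITION (years=2017, months=7, days=4) "
                     + "SELECT stock, time, price FROM staging_stocks");
        }
    }
}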


Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by csumi <sc...@sapient.com>.
Thank you for the response.

If I insert data from Hive directly using the query below, the select query
works fine.

insert into table stocks PARTITION (years=2004,months=12,days=3)
values('AAPL',1501236980,120.34);

I think the issue here is that when we insert data using the IGFS API (the
append method), select fails to return the results, but if we use an insert
query, partitions are created and the select query works fine. How can this
be resolved? Are partitions supported through IGFS?




Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by Mikhail Cherkasov <mc...@gridgain.com>.
Hi again,

So it is correct that Hadoop's own core-site.xml should have "hdfs://localhost:9000".

Hadoop works with the disk. You write to IGFS, and IGFS writes to the
secondary file system, which is HDFS. That means Hadoop itself should not
know anything about IGFS; otherwise it would write to IGFS, which would write
to the secondary file system (HDFS), which would write to IGFS again, and so on.

However, your code and tools like Hive need to work with IGFS, so they should
use a core-site.xml with igfs://localhost:9000.

As you said, it looks like Hive still works with Hadoop directly; see:
https://apacheignite-fs.readme.io/docs/running-apache-hive-over-ignited-hadoop#section-starting-hive

You need to feed Hive a core-site.xml with igfs://localhost:9000 and:
    <property>
        <name>fs.igfs.impl</name>
        <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
    </property>
    <property>
        <name>fs.AbstractFileSystem.igfs.impl</name>
        <value>org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem</value>
    </property>
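
To double-check which file system a client actually resolves with a given
configuration, a quick check like this may help (a sketch; the path to the
client's core-site.xml is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WhichFileSystem {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Load the core-site.xml that the client (e.g. Hive) is using.
        conf.addResource(new Path("file:///D:/hive/conf/core-site.xml"));

        // Resolves fs.defaultFS to a concrete FileSystem implementation:
        // an igfs:// URI means IGFS, an hdfs:// URI means plain HDFS.
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.getUri() + " -> " + fs.getClass().getName());
    }
}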

So, for now, I think your very first question should go to the Hive and
Hadoop communities, because it does not work even when you work with Hadoop
directly.

Thanks,
Mikhail.



Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by csumi <sc...@sapient.com>.
Hi Mikhail,

Thanks for your response.

I changed hdfs://localhost:9000 to igfs://localhost:9000 in the core-site.xml
in the Hive home directory, but the core-site.xml in the Hadoop home was still
pointing to hdfs://localhost:9000. The issue persists.

If I create a partition using the insert command in Hive, I get rows from the
select query. Even after stopping all Ignite nodes I am still getting the data
from the select query. Does this mean that Hive is still pointing to HDFS?

What else can we check here?

Thanks!




Re: how to append data to IGFS so that data gets saved to Hive partitioned table?

Posted by Mikhail Cherkasov <mc...@gridgain.com>.
Hi again,

Try to do this without Ignite; it looks like there is a problem on the Hive side.

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>

This means that you use Hadoop directly. To make Hive work with IGFS, you
need to change hdfs to igfs:

<property>
    <name>fs.defaultFS</name>
    <value>igfs://localhost:9000</value>
</property>

Thanks,
Mikhail.
