Posted to dev@hive.apache.org by "Thomas Nys (JIRA)" <ji...@apache.org> on 2018/05/09 09:37:00 UTC

[jira] [Created] (HIVE-19475) Issue when streaming data to Azure Data Lake Store

Thomas Nys created HIVE-19475:
---------------------------------

             Summary: Issue when streaming data to Azure Data Lake Store
                 Key: HIVE-19475
                 URL: https://issues.apache.org/jira/browse/HIVE-19475
             Project: Hive
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 2.2.0
         Environment: HDInsight 3.6 on Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-1012-azure x86_64)

Java libraries used (sbt dependencies):
{code:java}
libraryDependencies += "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" % "2.2.0"
libraryDependencies += "org.apache.hive.hcatalog" % "hive-hcatalog-core" % "2.2.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.8.0"
{code}
Please let me know if more details are needed.
            Reporter: Thomas Nys


I am trying to stream data from a Java application (Play 2 API) to HDInsight Hive Interactive Query, with Azure Data Lake Store as the storage back-end. The following code is run on one of the head nodes of the cluster.

When fetching a transaction batch:
{code:java}
TransactionBatch txnBatch = this.connection.fetchTransactionBatch(10, (RecordWriter)writer);
{code}
I receive the following error:
{code:java}
play.api.UnexpectedException: Unexpected exception[StreamingIOFailure: Failed creating RecordUpdaterS for adl://home/hive/warehouse/raw_telemetry_data/ingest_date=2018-05-07 txnIds[506,515]]
    at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:251)
    at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:182)
    at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:343)
    at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:341)
    at scala.concurrent.Future.$anonfun$recoverWith$1(Future.scala:414)
    at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:37)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:91)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
Caused by: org.apache.hive.hcatalog.streaming.StreamingIOFailure: Failed creating RecordUpdaterS for adl://home/hive/warehouse/raw_telemetry_data/ingest_date=2018-05-07 txnIds[506,515]
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.newBatch(AbstractRecordWriter.java:208)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:608)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:556)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:442)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:422)
    at hive.HiveRepository.createMany(HiveRepository.java:76)
    at controllers.HiveController.create(HiveController.java:40)
    at router.Routes$$anonfun$routes$1.$anonfun$applyOrElse$2(Routes.scala:70)
    at play.core.routing.HandlerInvokerFactory$$anon$4.resultCall(HandlerInvoker.scala:137)
    at play.core.routing.HandlerInvokerFactory$JavaActionInvokerFactory$$anon$8$$anon$2$$anon$1.invocation(HandlerInvoker.scala:108)
Caused by: java.io.IOException: No FileSystem for scheme: adl
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2798)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2809)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2848)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2830)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
    at org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.<init>(OrcRecordUpdater.java:187)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getRecordUpdater(OrcOutputFormat.java:278)
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.createRecordUpdater(AbstractRecordWriter.java:268)
{code}
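
For reference, the innermost cause in the trace (java.io.IOException: No FileSystem for scheme: adl) is what Hadoop reports when it finds no FileSystem implementation for a URI scheme. The sketch below is a simplified, standard-library-only model of that lookup and is not the actual Hadoop source. The class SchemeLookupSketch, its resolve method, and the two maps are hypothetical stand-ins for the Hadoop Configuration and for the implementations discovered on the classpath. The key fs.adl.impl and the class name org.apache.hadoop.fs.adl.AdlFileSystem are the ones documented for the hadoop-azure-datalake module, which is not among the dependencies listed above. Whether adding that module or that setting resolves this issue is an assumption and has not been verified.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for Hadoop's scheme-to-FileSystem lookup; NOT the real Hadoop code. */
final class SchemeLookupSketch {

    /**
     * Returns the implementation class name for a URI scheme.
     * An explicit "fs.<scheme>.impl" setting is checked first, then the
     * implementations discovered on the classpath (modelled here as a map).
     */
    static String resolve(String scheme,
                          Map<String, String> conf,
                          Map<String, String> discovered) throws IOException {
        String impl = conf.get("fs." + scheme + ".impl");
        if (impl == null) {
            impl = discovered.get(scheme);
        }
        if (impl == null) {
            // Same message as the innermost "Caused by" in the trace above.
            throw new IOException("No FileSystem for scheme: " + scheme);
        }
        return impl;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> conf = new HashMap<>();
        Map<String, String> discovered = new HashMap<>();
        // Hypothetical classpath that only provides an HDFS implementation.
        discovered.put("hdfs", "org.apache.hadoop.hdfs.DistributedFileSystem");

        try {
            resolve("adl", conf, discovered);
        } catch (IOException e) {
            // Nothing is registered for "adl", so the lookup fails.
            System.out.println(e.getMessage());
        }

        // Assumption: registering the ADL implementation lets the lookup succeed.
        conf.put("fs.adl.impl", "org.apache.hadoop.fs.adl.AdlFileSystem");
        System.out.println(resolve("adl", conf, discovered));
    }
}
```

In this model the first lookup fails because nothing is registered for the adl scheme, and the second succeeds once fs.adl.impl is set. In a real client the implementation class would also have to be present on the classpath.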
 

Any help would be greatly appreciated.

 

 
 


