Posted to dev@nifi.apache.org by shweta soni <ss...@gmail.com> on 2020/06/25 18:06:25 UTC

Need help regarding some challenges while using Apache Nifi

Hello Team,



We are using NiFi in our data ingestion process. The version details
are: *NiFi 1.11.4, Cloudera Enterprise 5.16.2 and Hive 1.1*. I posted my
issues on the NiFi Slack channel, but did not get answers for some of my
questions, so I am posting all my queries here in the hope of getting
solutions or workarounds for them. We are facing the issues below:





1.       *SCENARIO*: The RDBMS source has Date/Timestamp columns and the
Hive destination has Date/Timestamp columns, but when we try to ingest
from source to destination we get "IntWritable/LongWritable cannot be
written to Date/Timestamp" errors in Hue. We are using the following
processors: QueryDatabaseTable → UpdateRecord (column mapping and
output schema) → PutHDFS → ReplaceText → PutHiveQL. Below is the Avro
output schema; we use logical types since Avro has no native Date or
Timestamp datatype.

       {"name":"dob","type":["null",{"type":"long","logicalType":"timestamp-millis"}

        {"name":"doA","type":["null",{"type":"int","logicalType":"date"}


       *Q. Please let me know how we can load date/timestamp source
columns into date/timestamp destination columns?*
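
A possible workaround, sketched below in Python, is to convert the raw
epoch values into the literal strings Hive understands before the data
reaches PutHiveQL (for example in the UpdateRecord or ReplaceText step).
Avro's timestamp-millis stores milliseconds since the Unix epoch and its
date type stores days since the epoch; the function names here are only
illustrative:

    from datetime import date, datetime, timedelta, timezone

    # Avro timestamp-millis: milliseconds since the Unix epoch.
    # HiveQL wants a plain 'yyyy-MM-dd HH:mm:ss' literal.
    def millis_to_hive_timestamp(millis):
        dt = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
        return dt.strftime("%Y-%m-%d %H:%M:%S")

    # Avro date: days since the Unix epoch. HiveQL wants 'yyyy-MM-dd'.
    def days_to_hive_date(days):
        return (date(1970, 1, 1) + timedelta(days=days)).isoformat()

    print(millis_to_hive_timestamp(1593108385000))  # 2020-06-25 18:06:25
    print(days_to_hive_date(18438))                 # 2020-06-25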






2.       *SCENARIO*: Decimal data is not being inserted into the ORC table.

Workaround: I load the data into an Avro table and then do an INSERT INTO
the ORC table from it. I found this solution in the Cloudera community.


 *Q. Is there any other solution for loading decimal data into an ORC table?*
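
For reference, a minimal sketch of that staging approach (the table and
column names are hypothetical); in the flow above, ReplaceText would emit
statements like these as flowfile content for PutHiveQL to execute:

    # Hypothetical names; a sketch of the Avro-staging workaround only.
    staging_ddl = (
        "CREATE TABLE IF NOT EXISTS staging_avro (amount DECIMAL(10,2)) "
        "STORED AS AVRO"
    )
    orc_insert = "INSERT INTO TABLE target_orc SELECT amount FROM staging_avro"

    # ReplaceText would place such statements in the flowfile content,
    # and PutHiveQL would run them against Hive.
    print(staging_ddl)
    print(orc_insert)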






3.       *SCENARIO*: We have a one-time full-load flow in NiFi:
QueryDatabaseTable → PutHiveQL → LogAttribute. This acts as a pipeline
in our custom UI and runs only once. In the NiFi UI we can manually
start the processors, and once all the flowfiles are processed and the
success queue of PutHiveQL is empty we can stop them. But now we want
to know programmatically that the flow ended at a particular time, so
that we can show the pipeline status as completed in our custom UI.
How can we detect this?



        *Q. Since NiFi is designed for continuous data transfer, how can
we know that a particular flow has ended?*
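
A possible approach, sketched below in Python, is to poll the NiFi REST
API for the process group's queued flowfile count and treat the pipeline
as completed once it drops to zero. The host, port, and group id are
placeholders, and the endpoint and field names are as we understand the
NiFi 1.x status API:

    import time
    import requests  # third-party; pip install requests

    NIFI_API = "http://localhost:8080/nifi-api"  # placeholder host/port
    PG_ID = "<process-group-id>"                 # placeholder group id

    def flowfiles_queued(pg_id):
        r = requests.get(f"{NIFI_API}/flow/process-groups/{pg_id}/status")
        r.raise_for_status()
        snapshot = r.json()["processGroupStatus"]["aggregateSnapshot"]
        return snapshot["flowFilesQueued"]

    # Poll until every queue in the group has drained, then mark the
    # pipeline as completed in the custom UI.
    while flowfiles_queued(PG_ID) > 0:
        time.sleep(10)
    print("pipeline completed at", time.strftime("%Y-%m-%d %H:%M:%S"))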






4.       *SCENARIO*: I have a Hive table with complex datatypes, i.e.
Array and Map. When I try to get this data via the SelectHiveQL
processor, it outputs all the columns as strings. The next UpdateRecord
processor then gives an error that a string cannot be converted to an
Array or Map.

Avro Output Schema:

{"type": "array", "items": "double"}

{"type": "map", "values": "int"}


 *Q. How do we handle complex datatypes in Hive via NiFi, with a Hive
table as source and another Hive table as destination?*
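
One possible way to handle this, assuming Hive renders complex columns
as JSON-like strings, is to parse the strings back into real structures
before re-serializing them with the array/map schemas above. A small
Python sketch with illustrative column names and values:

    import json

    # A row as SelectHiveQL might hand it downstream, with the complex
    # columns rendered as JSON-like strings (illustrative values).
    row = {"scores": "[1.5,2.5,3.0]", "tags": '{"a":1,"b":2}'}

    scores = json.loads(row["scores"])  # -> [1.5, 2.5, 3.0] (array of double)
    tags = json.loads(row["tags"])      # -> {'a': 1, 'b': 2} (map of int)
    print(scores, tags)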






5.       *SCENARIO*: The QueryDatabaseTable processor has a
Maximum-value Columns property which enables incremental loads, but
there is no such functionality for Hive tables (i.e. SelectHiveQL). I
tried the GenerateTableFetch and QueryDatabaseTable processors with the
Hive 1.1 connection service, but it does not work. I was told on the
NiFi Slack channel to raise a JIRA for a new
GenerateHiveTableFetch/QueryHiveDatabase processor.



*Q. Is there any alternative for handling Hive table incremental loads,
or should I go ahead and raise a JIRA?*
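
One possible workaround until such a processor exists is to track the
maximum value outside NiFi (for example in a small state file or a side
table) and template it into the query that SelectHiveQL runs. A rough
Python sketch; the file, table, and column names are hypothetical:

    import json
    import pathlib

    STATE = pathlib.Path("hive_max_value.json")  # hypothetical state store

    def load_watermark():
        # Seed with the epoch on the first run so the initial query
        # behaves like a full load.
        if STATE.exists():
            return json.loads(STATE.read_text())["max_updated_at"]
        return "1970-01-01 00:00:00"

    def save_watermark(new_max):
        STATE.write_text(json.dumps({"max_updated_at": new_max}))

    # Template the watermark into the HiveQL that SelectHiveQL executes
    # (hypothetical table/column names).
    query = f"SELECT * FROM src_table WHERE updated_at > '{load_watermark()}'"
    print(query)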



We request you to please help us. Thank you in anticipation.





Thanks & Regards,
Shweta Soni

Re: Need help regarding some challenges while using Apache Nifi

Posted by Jason Iannone <br...@gmail.com>.
Additionally, anything other than CDH 6.x poses challenges with datatype
mappings from Avro and Parquet into Hive. I see mention of ORC, which is
more akin to Hortonworks, and I don't believe it will be fully supported
until Cloudera has fully consolidated the platforms. I would take a step
back and look at what you're feeding in, the desired datatypes, and
whether that data format version and Hive (or Impala) support what
you're trying to achieve.


Re: Need help regarding some challenges while using Apache Nifi

Posted by Andy LoPresto <al...@apache.org>.
Shweta, if you need responses within a certain timeframe, you may want to investigate commercial vendor support. The Apache NiFi open source community attempts to answer any questions we can, but makes no guarantee about accuracy, availability, or response time. 


Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


Re: Need help regarding some challenges while using Apache Nifi

Posted by Joe Witt <jo...@gmail.com>.
Shweta (on bcc specifically)

You need to subscribe to the mailing list to see replies to your notes and
to have your notes not get moderated.

http://nifi.apache.org/mailing_lists.html

You've already received two replies.

You also probably want to be asking on the users list.

You might also want to ask about one thing at a time and make progress
with it.

Thanks


Re: Need help regarding some challenges while using Apache Nifi

Posted by shweta soni <ss...@gmail.com>.
Hello Team,

Request you to please help me with the issues mentioned in my original
mail. Thank you in anticipation.

Thanks & Regards
Shweta Soni


Re: Need help regarding some challenges while using Apache Nifi

Posted by shweta soni <ss...@gmail.com>.
Hello Team,

Request you to please help me with the issues mentioned in my original
mail. Thank you in anticipation.

Thanks & Regards
Shweta Soni
