You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@nifi.apache.org by Aruna Sankaralingam <Ar...@Cormac-Corp.com> on 2017/10/16 15:41:18 UTC

UpdateRecord Processor

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

[cid:image004.png@01D34673.218B0A60]   [cid:image005.png@01D34673.218B0A60]

CSVRecordSetWriter:
[cid:image001.png@01D34673.7C26D760]

AVROSchemaRegistry:
[cid:image002.png@01D34673.7C26D760]

[cid:image006.png@01D34673.9FF1E590]


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:

Aruna,

I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.

Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.

The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.

Cheers,

Russ

P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

[cid:image003.png@01D34671.7E47CD80]



Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
I think you need to set "Treat First Line as Header" to true in the CSV
Reader being used by the second UpdateRecord.

The CSV Writer from the first UpdateRecord has "Include Header Line" set to
true, so going into the second UpdateRecord you would need to skip it, or
not write it in the first place.

On Thu, Oct 19, 2017 at 11:47 AM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Bryan,
>
>
>
> I created another Update Record Processor in record path mode. But I keep
> getting this error. I have attached my template. Could you please let me
> know where I am wrong?
>
>
>
>
>
> *15:44:38 UTC*
>
> *ERROR*
>
> *352b83fa-015f-1000-b22f-1ce3e446367f*
>
> UpdateRecord[id=352b83fa-015f-1000-b22f-1ce3e446367f] Failed to process
> StandardFlowFileRecord[uuid=40bc0ac4-56c9-4fbd-a73c-d180ff98dc87,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508425490786-1,
> container=default, section=1], offset=27337, length=2746],offset=0,name=3-
> Address_Sub-_File.csv,size=2746]; will route to failure: java.lang.NumberFormatException:
> For input string: "ZIP_CD"
>
>
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Wednesday, October 18, 2017 3:48 PM
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> When using UpdateRecord you can only use one Replacement Strategy at a
> time....
>
> "Literal Value" means you can use literal text including expression
> language, but cannot use record path.
>
> "Record Path" means the value has to be a record path expression.
>
> In your example you are trying to do both... literal for ACTV_IND and
> SYS_CREATE_TS, and record path for FULL_NAME.
>
> You need to use two different UpdateRecord processors, the first one in
> literal mode that updates ACTV_IND and SYS_CREATE_TS, and the second one in
> record path mode that updates FULL_NAME.
>
> Also, I believe the record path expression does not start with $ so it
> would just be concat(/FIRST_NAME, ' ', /LAST_NAME)
>
>
>
>
>
> On Wed, Oct 18, 2017 at 3:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> I included First Name and Last name in Source CSV file and updated the
> reader schema. In the target table, I added columns for First name, Last
> name and Full Name and updated the writer schema. I am trying to make use
> of concat function but I am not getting the correct usage.
>
> Could you please let me know where I am wrong?
>
> *19:16:10 UTC*
>
> *ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=f1030e37-1b67-4961-807e-9a21a9d5681e,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508352826506-1,
> container=default, section=1], offset=6381, length=2145],offset=0,name=3-
> Address_Sub-_File.csv,size=2145]; will route to failure:
> org.apache.nifi.attribute.expression.language.exception.
> AttributeExpressionLanguageException: Invalid Expression: ${concat(
> /FIRST_NAME, ,/LAST_NAME )} due to Unrecognized token at line 1, column 10.
> Query: ${concat( /FIRST_NAME, ,/LAST_NAME )}
>
> *19:19:22 UTC*
>
> *ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=8562d402-08dd-44ef-821b-ac18a586e7bd,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508352826506-1,
> container=default, section=1], offset=8526, length=2145],offset=0,name=3-
> Address_Sub-_File.csv,size=2145]; will route to failure:
> org.apache.nifi.attribute.expression.language.exception.
> AttributeExpressionLanguageException: Invalid Expression: ${concat(
> /FIRST_NAME,' ',/LAST_NAME )} due to Unrecognized token at line 1, column
> 10. Query: ${concat( /FIRST_NAME,' ',/LAST_NAME )}
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Wednesday, October 18, 2017 1:07 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> A lot of the other functions like substring, concat, and replace already
> exist, its just upper/lower that are missing.
>
> You can see all the functions here:
> https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
>
>
>
> On Wed, Oct 18, 2017 at 1:02 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Thank you Bryan.
>
>
>
> Does this hold good for all string manipulations? Like Substring,
> Concatenation, replace can all be done using Scripted Record Writed/Reader
> and not with UpdateRecord Processor?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Wednesday, October 18, 2017 12:40 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Ideally the way this would be done is by using another UpdateRecord with
> Replacement Strategy set to Record Path, and then providing a record path
> like /CITY_NAME[ toUpper() ]
>
> Unfortunately record path currently does not have upper and lower
> functions, so I created this JIRA:
> https://issues.apache.org/jira/browse/NIFI-4498
>
> I think the only way to do this right now might be using a Scripted Record
> Reader/Writer to do it manually, some examples:
>
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_reader_inline.groovy
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_reader_xml.groovy
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_writer_inline.groovy
>
>
>
>
>
> On Wed, Oct 18, 2017 at 11:29 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> Sys_CREAT_TS and ACTV_IND got updated in the target table successfully.
> Thank you for your help.
>
>
>
> I am trying to convert one of the existing columns in CSV – CITY_NAME to
> Upper Case.  I don’t think my usage is correct. I tried with forward slash,
> quotes but the final value is coming as blank though the CSV file has valid
> values.
>
> I googled yesterday with no luck. Could you please let me know what I am
> missing?
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 5:19 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Ok you have to make a second instance of CSVReader that uses the "write"
> schema that UpdateRecord is using.
>
> If you try to use the same CSVReader that UpdateRecord is using, then it
> is reading records with a schema that doesn't contain your new fields,
> which is why they are blank in your database.
>
>
>
> On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> PutDatabaseRecord was reading the reader schema. Attached is the template
> of my flow.
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:38 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Can you make sure that your PutDatabaseRecord is using a reader with a
> schema that has those fields (i.e. the same schema as the writer of
> UpdateRecord)?
>
> If PutDatabaseRecord was using the reader schema that UpdateRecord is
> using, then it would be reading in the records without those fields.
>
> If that doesn't solve it then I would need to see a template of your whole
> flow.
>
> Thanks,
>
> Bryan
>
>
>
> On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
> I tried both these options just now but sys_creat_ts is not getting
> populated for any record.
>
> Option 1: If your writer schema has long for sys_create_ts then your
> UpdateRecord needs to have ${now():toNumber()}.
>
> Option 2: If your writer schema has the original logicalType, then you
> need ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:14 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Aruna,
>
> If your writer schema has long for sys_create_ts then your UpdateRecord
> needs to have ${now():toNumber()}.
>
> If your writer schema has the original logicalType, then you need
> ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> It looks like you have a mix of both because you said you changed it to
> long, but the error shows "2017-10-16 20:03:58" which isn't a long.
>
>
>
> -Bryan
>
>
>
>
>
> On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> I followed your methods and the flow was successful. However, the
> sys_creat_ts and actv_ind fields were not populated for any of the records.
>
>
>
>
>
> This is what I see in the “Data Provenance” of the PutDatabaseRecord
> processor.
>
>
>
> ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
>
> I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
>
> I20031103000013,HATO REY,917,PR,1508183538000,
>
> I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
>
> I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
>
> I20031103000030,AGUADILLA,6055256,PR,1508183538000,
>
> I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,
>
>
>
> So I changed the writer schema to use long for the sys_creat_ts field and
> started the flow but I got an error
>
> “*20:03:58 UTC**ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508167656905-1,
> container=default, section=1], offset=8591, length=1606],offset=0,name=3-
> Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException:
> For input string: "2017-10-16 20:03:58"
>
>
>
> “
>
>
>
>
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 3:29 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Sticking with the original logicalType of timestamp-millis, I was able to
> get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value
> of the timestamp column.
>
>
>
> On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> If it is helpful, here is a template of the working version where I
> changed to a long:
>
> https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70
> f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-
> timestamp.xml
>
>
>
> On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Aruna,
>
> I think the issue here might be on the reader side of things...
>
> If you're incoming data does not have SYS_CREAT_TS in it, then your schema
> needs to allow null values for this field so you can create records from
> the incoming data, or you can create two versions of your schema and have
> the reader use a version that doesn't have this field, and have the writer
> use a version that does.
>
>
>
> I am currently testing this using these examples:
>
>
>
> {
> "type": "record",
> "name": "schema1",
> "fields": [
>  { "name": "id", "type": "string" }
> ]
> }
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
> "timestamp-millis" } }
> ]
> }
>
>
>
> I have a CSVReader using schema1 and a CSVWriter using schema2.
>
>
>
> I'm able to get past the issue you are having and now it is getting to the
> point where it is trying to take the result ofr ${now():toNumber()} and
> convert it to a timestamp, and I'm running into a different issue:
>
>
>
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> toTimestamp(DataTypeUtils.java:564)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:134)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:84)
>     at org.apache.nifi.serialization.record.MapRecord.setValue(
> MapRecord.java:317)
>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(
> StandardFieldValue.java:132)
>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$
> process$1(UpdateRecord.java:177)
>
>
>
> I haven't figured out the issue, but it has something to do with the logic
> of trying to convert a string to timestamp and whether or not a date format
> is provided.
>
>
>
> If I change schema1 to use a regular long for the timestamp field then it
> works:
>
>
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : "long" }
> ]
> }
>
>
>
> I don't know what the ramifications of this would be in PutDatabaseRecord.
>
>
>
>
>
> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Matt,
>
>
>
> My issue is that when I am trying to assign current date to sys_creat_ts
> for every record, I am getting this error that it cannot be null.
> SYS_CREAT_TS is not coming from the source. It is present in the target
> table alone and I need to load the current datetime in that field for every
> record that is coming from source.
>
>
>
> Mark,
>
>
>
> I am still getting the same error.
>
>
>
> Also, actv_ind must display just “Y” in the target table.
>
>
>
>
>
>
>
>
>
> *From:* Mark Payne [mailto:markap14@hotmail.com]
> *Sent:* Monday, October 16, 2017 12:03 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Hi Aruna,
>
>
>
> I think the issue is likely that you are setting /sys_create_ts but your
> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>
> Would recommend you change the property names to /ACTV_IND and
> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>
> set to "{now()}" but I think what you really are wanting is
> "${now():toNumber()}".
>
>
>
> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the
> literal value '/Y' (with the single quotes) is what will be placed there.
> Is that
>
> the intention? Or did you actually want just /Y to be there?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>
>
>
> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
> <image004.png>   <image005.png>
>
>
>
> CSVRecordSetWriter:
>
> <image001.png>
>
>
>
> AVROSchemaRegistry:
>
> <image002.png>
>
>
>
> <image006.png>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
> <ij...@gmail.com>]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available.*
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
> <image003.png>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

RE: UpdateRecord Processor

Posted by Aruna Sankaralingam <Ar...@Cormac-Corp.com>.
Bryan,

I created another Update Record Processor in record path mode. But I keep getting this error. I have attached my template. Could you please let me know where I am wrong?


15:44:38 UTC
ERROR
352b83fa-015f-1000-b22f-1ce3e446367f
UpdateRecord[id=352b83fa-015f-1000-b22f-1ce3e446367f] Failed to process StandardFlowFileRecord[uuid=40bc0ac4-56c9-4fbd-a73c-d180ff98dc87,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508425490786-1, container=default, section=1], offset=27337, length=2746],offset=0,name=3-Address_Sub-_File.csv,size=2746]; will route to failure: java.lang.NumberFormatException: For input string: "ZIP_CD"



From: Bryan Bende [mailto:bbende@gmail.com]
Sent: Wednesday, October 18, 2017 3:48 PM
To: users@nifi.apache.org
Subject: Re: UpdateRecord Processor

When using UpdateRecord you can only use one Replacement Strategy at a time....
"Literal Value" means you can use literal text including expression language, but cannot use record path.
"Record Path" means the value has to be a record path expression.
In your example you are trying to do both... literal for ACTV_IND and SYS_CREATE_TS, and record path for FULL_NAME.
You need to use two different UpdateRecord processors, the first one in literal mode that updates ACTV_IND and SYS_CREATE_TS, and the second one in record path mode that updates FULL_NAME.
Also, I believe the record path expression does not start with $ so it would just be concat(/FIRST_NAME, ' ', /LAST_NAME)


On Wed, Oct 18, 2017 at 3:23 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

I included First Name and Last name in Source CSV file and updated the reader schema. In the target table, I added columns for First name, Last name and Full Name and updated the writer schema. I am trying to make use of concat function but I am not getting the correct usage.
Could you please let me know where I am wrong?
[cid:image001.png@01D348D0.14B8A900]
19:16:10 UTC
ERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=f1030e37-1b67-4961-807e-9a21a9d5681e,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508352826506-1, container=default, section=1], offset=6381, length=2145],offset=0,name=3-Address_Sub-_File.csv,size=2145]; will route to failure: org.apache.nifi.attribute.expression.language.exception.AttributeExpressionLanguageException: Invalid Expression: ${concat( /FIRST_NAME, ,/LAST_NAME )} due to Unrecognized token at line 1, column 10. Query: ${concat( /FIRST_NAME, ,/LAST_NAME )}
19:19:22 UTC
ERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=8562d402-08dd-44ef-821b-ac18a586e7bd,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508352826506-1, container=default, section=1], offset=8526, length=2145],offset=0,name=3-Address_Sub-_File.csv,size=2145]; will route to failure: org.apache.nifi.attribute.expression.language.exception.AttributeExpressionLanguageException: Invalid Expression: ${concat( /FIRST_NAME,' ',/LAST_NAME )} due to Unrecognized token at line 1, column 10. Query: ${concat( /FIRST_NAME,' ',/LAST_NAME )}


From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Wednesday, October 18, 2017 1:07 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

A lot of the other functions like substring, concat, and replace already exist, its just upper/lower that are missing.

You can see all the functions here:
https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html

On Wed, Oct 18, 2017 at 1:02 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Thank you Bryan.

Does this hold good for all string manipulations? Like Substring, Concatenation, replace can all be done using Scripted Record Writed/Reader and not with UpdateRecord Processor?

Thanks
Aruna

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Wednesday, October 18, 2017 12:40 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Ideally the way this would be done is by using another UpdateRecord with Replacement Strategy set to Record Path, and then providing a record path like /CITY_NAME[ toUpper() ]
Unfortunately record path currently does not have upper and lower functions, so I created this JIRA:
https://issues.apache.org/jira/browse/NIFI-4498
I think the only way to do this right now might be using a Scripted Record Reader/Writer to do it manually, some examples:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_reader_inline.groovy
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_reader_xml.groovy
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_writer_inline.groovy


On Wed, Oct 18, 2017 at 11:29 AM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

Sys_CREAT_TS and ACTV_IND got updated in the target table successfully. Thank you for your help.

I am trying to convert one of the existing columns in CSV – CITY_NAME to Upper Case.  I don’t think my usage is correct. I tried with forward slash, quotes but the final value is coming as blank though the CSV file has valid values.
I googled yesterday with no luck. Could you please let me know what I am missing?

[cid:image002.png@01D348D0.14B8A900]

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 5:19 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Ok you have to make a second instance of CSVReader that uses the "write" schema that UpdateRecord is using.
If you try to use the same CSVReader that UpdateRecord is using, then it is reading records with a schema that doesn't contain your new fields, which is why they are blank in your database.

On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

PutDatabaseRecord was reading the reader schema. Attached is the template of my flow.

Thanks
Aruna

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:38 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Can you make sure that your PutDatabaseRecord is using a reader with a schema that has those fields (i.e. the same schema as the writer of UpdateRecord)?
If PutDatabaseRecord was using the reader schema that UpdateRecord is using, then it would be reading in the records without those fields.
If that doesn't solve it then I would need to see a template of your whole flow.
Thanks,
Bryan

On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,
I tried both these options just now but sys_creat_ts is not getting populated for any record.
Option 1: If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
Option 2: If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:14 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Aruna,
If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

It looks like you have a mix of both because you said you changed it to long, but the error shows "2017-10-16 20:03:58" which isn't a long.

-Bryan


On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

I followed your methods and the flow was successful. However, the sys_creat_ts and actv_ind fields were not populated for any of the records.

[cid:image003.png@01D348D0.14B8A900]

This is what I see in the “Data Provenance” of the PutDatabaseRecord processor.

ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
I20031103000013,HATO REY,917,PR,1508183538000,
I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
I20031103000030,AGUADILLA,6055256,PR,1508183538000,
I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,

So I changed the writer schema to use long for the sys_creat_ts field and started the flow but I got an error
“20:03:58 UTCERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508167656905-1, container=default, section=1], offset=8591, length=1606],offset=0,name=3-Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException: For input string: "2017-10-16 20:03:58"

“

[cid:image004.jpg@01D348D0.14B8A900]



From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 3:29 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Sticking with the original logicalType of timestamp-millis, I was able to get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value of the timestamp column.

On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com>> wrote:
If it is helpful, here is a template of the working version where I changed to a long:

https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-timestamp.xml

On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com>> wrote:
Aruna,
I think the issue here might be on the reader side of things...
If you're incoming data does not have SYS_CREAT_TS in it, then your schema needs to allow null values for this field so you can create records from the incoming data, or you can create two versions of your schema and have the reader use a version that doesn't have this field, and have the writer use a version that does.

I am currently testing this using these examples:

{
"type": "record",
"name": "schema1",
"fields": [
 { "name": "id", "type": "string" }
]
}

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : { "type" : "long", "logicalType" : "timestamp-millis" } }
]
}

I have a CSVReader using schema1 and a CSVWriter using schema2.

I'm able to get past the issue you are having and now it is getting to the point where it is trying to take the result ofr ${now():toNumber()} and convert it to a timestamp, and I'm running into a different issue:

org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
    at org.apache.nifi.serialization.record.util.DataTypeUtils.toTimestamp(DataTypeUtils.java:564)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:134)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:84)
    at org.apache.nifi.serialization.record.MapRecord.setValue(MapRecord.java:317)
    at org.apache.nifi.record.path.St<http://org.apache.nifi.record.path.St>andardFieldValue.updateValue(StandardFieldValue.java:132)
    at org.apache.nifi.processors.standard.UpdateRecord.lambda$process$1(UpdateRecord.java:177)

I haven't figured out the issue, but it has something to do with the logic of trying to convert a string to timestamp and whether or not a date format is provided.

If I change schema1 to use a regular long for the timestamp field then it works:

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : "long" }
]
}

I don't know what the ramifications of this would be in PutDatabaseRecord.


On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Matt,

My issue is that when I am trying to assign current date to sys_creat_ts for every record, I am getting this error that it cannot be null. SYS_CREAT_TS is not coming from the source. It is present in the target table alone and I need to load the current datetime in that field for every record that is coming from source.

Mark,

I am still getting the same error.

Also, actv_ind must display just “Y” in the target table.


[cid:image005.png@01D348D0.14B8A900]

[cid:image006.png@01D348D0.14B8A900]

From: Mark Payne [mailto:markap14@hotmail.com<ma...@hotmail.com>]
Sent: Monday, October 16, 2017 12:03 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:
Aruna,
I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.
Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.
The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.
Cheers,
Russ
P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>











Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
When using UpdateRecord you can only use one Replacement Strategy at a
time....

"Literal Value" means you can use literal text including expression
language, but cannot use record path.

"Record Path" means the value has to be a record path expression.

In your example you are trying to do both... literal for ACTV_IND and
SYS_CREATE_TS, and record path for FULL_NAME.

You need to use two different UpdateRecord processors, the first one in
literal mode that updates ACTV_IND and SYS_CREATE_TS, and the second one in
record path mode that updates FULL_NAME.

Also, I believe the record path expression does not start with $ so it
would just be concat(/FIRST_NAME, ' ', /LAST_NAME)


On Wed, Oct 18, 2017 at 3:23 PM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Bryan,
>
>
>
> I included First Name and Last name in Source CSV file and updated the
> reader schema. In the target table, I added columns for First name, Last
> name and Full Name and updated the writer schema. I am trying to make use
> of concat function but I am not getting the correct usage.
>
> Could you please let me know where I am wrong?
>
> *19:16:10 UTC*
>
> *ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=f1030e37-1b67-4961-807e-9a21a9d5681e,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508352826506-1,
> container=default, section=1], offset=6381, length=2145],offset=0,name=3-
> Address_Sub-_File.csv,size=2145]; will route to failure:
> org.apache.nifi.attribute.expression.language.exception.
> AttributeExpressionLanguageException: Invalid Expression: ${concat(
> /FIRST_NAME, ,/LAST_NAME )} due to Unrecognized token at line 1, column 10.
> Query: ${concat( /FIRST_NAME, ,/LAST_NAME )}
>
> *19:19:22 UTC*
>
> *ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=8562d402-08dd-44ef-821b-ac18a586e7bd,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508352826506-1,
> container=default, section=1], offset=8526, length=2145],offset=0,name=3-
> Address_Sub-_File.csv,size=2145]; will route to failure:
> org.apache.nifi.attribute.expression.language.exception.
> AttributeExpressionLanguageException: Invalid Expression: ${concat(
> /FIRST_NAME,' ',/LAST_NAME )} due to Unrecognized token at line 1, column
> 10. Query: ${concat( /FIRST_NAME,' ',/LAST_NAME )}
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Wednesday, October 18, 2017 1:07 PM
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> A lot of the other functions like substring, concat, and replace already
> exist, its just upper/lower that are missing.
>
> You can see all the functions here:
> https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
>
>
>
> On Wed, Oct 18, 2017 at 1:02 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Thank you Bryan.
>
>
>
> Does this hold good for all string manipulations? Like Substring,
> Concatenation, replace can all be done using Scripted Record Writed/Reader
> and not with UpdateRecord Processor?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Wednesday, October 18, 2017 12:40 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Ideally the way this would be done is by using another UpdateRecord with
> Replacement Strategy set to Record Path, and then providing a record path
> like /CITY_NAME[ toUpper() ]
>
> Unfortunately record path currently does not have upper and lower
> functions, so I created this JIRA:
> https://issues.apache.org/jira/browse/NIFI-4498
>
> I think the only way to do this right now might be using a Scripted Record
> Reader/Writer to do it manually, some examples:
>
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_reader_inline.groovy
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_reader_xml.groovy
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_writer_inline.groovy
>
>
>
>
>
> On Wed, Oct 18, 2017 at 11:29 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> Sys_CREAT_TS and ACTV_IND got updated in the target table successfully.
> Thank you for your help.
>
>
>
> I am trying to convert one of the existing columns in CSV – CITY_NAME to
> Upper Case.  I don’t think my usage is correct. I tried with forward slash,
> quotes but the final value is coming as blank though the CSV file has valid
> values.
>
> I googled yesterday with no luck. Could you please let me know what I am
> missing?
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 5:19 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Ok you have to make a second instance of CSVReader that uses the "write"
> schema that UpdateRecord is using.
>
> If you try to use the same CSVReader that UpdateRecord is using, then it
> is reading records with a schema that doesn't contain your new fields,
> which is why they are blank in your database.
>
>
>
> On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> PutDatabaseRecord was reading the reader schema. Attached is the template
> of my flow.
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:38 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Can you make sure that your PutDatabaseRecord is using a reader with a
> schema that has those fields (i.e. the same schema as the writer of
> UpdateRecord)?
>
> If PutDatabaseRecord was using the reader schema that UpdateRecord is
> using, then it would be reading in the records without those fields.
>
> If that doesn't solve it then I would need to see a template of your whole
> flow.
>
> Thanks,
>
> Bryan
>
>
>
> On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
> I tried both these options just now but sys_creat_ts is not getting
> populated for any record.
>
> Option 1: If your writer schema has long for sys_create_ts then your
> UpdateRecord needs to have ${now():toNumber()}.
>
> Option 2: If your writer schema has the original logicalType, then you
> need ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:14 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Aruna,
>
> If your writer schema has long for sys_create_ts then your UpdateRecord
> needs to have ${now():toNumber()}.
>
> If your writer schema has the original logicalType, then you need
> ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> It looks like you have a mix of both because you said you changed it to
> long, but the error shows "2017-10-16 20:03:58" which isn't a long.
>
>
>
> -Bryan
>
>
>
>
>
> On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> I followed your methods and the flow was successful. However, the
> sys_creat_ts and actv_ind fields were not populated for any of the records.
>
>
>
>
>
> This is what I see in the “Data Provenance” of the PutDatabaseRecord
> processor.
>
>
>
> ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
>
> I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
>
> I20031103000013,HATO REY,917,PR,1508183538000,
>
> I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
>
> I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
>
> I20031103000030,AGUADILLA,6055256,PR,1508183538000,
>
> I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,
>
>
>
> So I changed the writer schema to use long for the sys_creat_ts field and
> started the flow but I got an error
>
> “*20:03:58 UTC**ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508167656905-1,
> container=default, section=1], offset=8591, length=1606],offset=0,name=3-
> Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException:
> For input string: "2017-10-16 20:03:58"
>
>
>
> “
>
>
>
>
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 3:29 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Sticking with the original logicalType of timestamp-millis, I was able to
> get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value
> of the timestamp column.
>
>
>
> On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> If it is helpful, here is a template of the working version where I
> changed to a long:
>
> https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70
> f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-
> timestamp.xml
>
>
>
> On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Aruna,
>
> I think the issue here might be on the reader side of things...
>
> If you're incoming data does not have SYS_CREAT_TS in it, then your schema
> needs to allow null values for this field so you can create records from
> the incoming data, or you can create two versions of your schema and have
> the reader use a version that doesn't have this field, and have the writer
> use a version that does.
>
>
>
> I am currently testing this using these examples:
>
>
>
> {
> "type": "record",
> "name": "schema1",
> "fields": [
>  { "name": "id", "type": "string" }
> ]
> }
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
> "timestamp-millis" } }
> ]
> }
>
>
>
> I have a CSVReader using schema1 and a CSVWriter using schema2.
>
>
>
> I'm able to get past the issue you are having and now it is getting to the
> point where it is trying to take the result ofr ${now():toNumber()} and
> convert it to a timestamp, and I'm running into a different issue:
>
>
>
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> toTimestamp(DataTypeUtils.java:564)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:134)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:84)
>     at org.apache.nifi.serialization.record.MapRecord.setValue(
> MapRecord.java:317)
>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(
> StandardFieldValue.java:132)
>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$
> process$1(UpdateRecord.java:177)
>
>
>
> I haven't figured out the issue, but it has something to do with the logic
> of trying to convert a string to timestamp and whether or not a date format
> is provided.
>
>
>
> If I change schema1 to use a regular long for the timestamp field then it
> works:
>
>
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : "long" }
> ]
> }
>
>
>
> I don't know what the ramifications of this would be in PutDatabaseRecord.
>
>
>
>
>
> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Matt,
>
>
>
> My issue is that when I am trying to assign current date to sys_creat_ts
> for every record, I am getting this error that it cannot be null.
> SYS_CREAT_TS is not coming from the source. It is present in the target
> table alone and I need to load the current datetime in that field for every
> record that is coming from source.
>
>
>
> Mark,
>
>
>
> I am still getting the same error.
>
>
>
> Also, actv_ind must display just “Y” in the target table.
>
>
>
>
>
>
>
>
>
> *From:* Mark Payne [mailto:markap14@hotmail.com]
> *Sent:* Monday, October 16, 2017 12:03 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Hi Aruna,
>
>
>
> I think the issue is likely that you are setting /sys_create_ts but your
> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>
> Would recommend you change the property names to /ACTV_IND and
> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>
> set to "{now()}" but I think what you really are wanting is
> "${now():toNumber()}".
>
>
>
> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the
> literal value '/Y' (with the single quotes) is what will be placed there.
> Is that
>
> the intention? Or did you actually want just /Y to be there?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>
>
>
> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
> <image004.png>   <image005.png>
>
>
>
> CSVRecordSetWriter:
>
> <image001.png>
>
>
>
> AVROSchemaRegistry:
>
> <image002.png>
>
>
>
> <image006.png>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
> <ij...@gmail.com>]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available.*
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
> <image003.png>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

RE: UpdateRecord Processor

Posted by Aruna Sankaralingam <Ar...@Cormac-Corp.com>.
Bryan,

I included First Name and Last name in Source CSV file and updated the reader schema. In the target table, I added columns for First name, Last name and Full Name and updated the writer schema. I am trying to make use of concat function but I am not getting the correct usage.
Could you please let me know where I am wrong?
[cid:image006.png@01D34825.00BEF220]
19:16:10 UTC
ERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=f1030e37-1b67-4961-807e-9a21a9d5681e,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508352826506-1, container=default, section=1], offset=6381, length=2145],offset=0,name=3-Address_Sub-_File.csv,size=2145]; will route to failure: org.apache.nifi.attribute.expression.language.exception.AttributeExpressionLanguageException: Invalid Expression: ${concat( /FIRST_NAME, ,/LAST_NAME )} due to Unrecognized token at line 1, column 10. Query: ${concat( /FIRST_NAME, ,/LAST_NAME )}
19:19:22 UTC
ERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=8562d402-08dd-44ef-821b-ac18a586e7bd,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508352826506-1, container=default, section=1], offset=8526, length=2145],offset=0,name=3-Address_Sub-_File.csv,size=2145]; will route to failure: org.apache.nifi.attribute.expression.language.exception.AttributeExpressionLanguageException: Invalid Expression: ${concat( /FIRST_NAME,' ',/LAST_NAME )} due to Unrecognized token at line 1, column 10. Query: ${concat( /FIRST_NAME,' ',/LAST_NAME )}


From: Bryan Bende [mailto:bbende@gmail.com]
Sent: Wednesday, October 18, 2017 1:07 PM
To: users@nifi.apache.org
Subject: Re: UpdateRecord Processor

A lot of the other functions like substring, concat, and replace already exist, its just upper/lower that are missing.

You can see all the functions here:
https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html

On Wed, Oct 18, 2017 at 1:02 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Thank you Bryan.

Does this hold good for all string manipulations? Like Substring, Concatenation, replace can all be done using Scripted Record Writed/Reader and not with UpdateRecord Processor?

Thanks
Aruna

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Wednesday, October 18, 2017 12:40 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Ideally the way this would be done is by using another UpdateRecord with Replacement Strategy set to Record Path, and then providing a record path like /CITY_NAME[ toUpper() ]
Unfortunately record path currently does not have upper and lower functions, so I created this JIRA:
https://issues.apache.org/jira/browse/NIFI-4498
I think the only way to do this right now might be using a Scripted Record Reader/Writer to do it manually, some examples:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_reader_inline.groovy
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_reader_xml.groovy
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_writer_inline.groovy


On Wed, Oct 18, 2017 at 11:29 AM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

Sys_CREAT_TS and ACTV_IND got updated in the target table successfully. Thank you for your help.

I am trying to convert one of the existing columns in CSV – CITY_NAME to Upper Case.  I don’t think my usage is correct. I tried with forward slash, quotes but the final value is coming as blank though the CSV file has valid values.
I googled yesterday with no luck. Could you please let me know what I am missing?

[cid:image007.png@01D34825.00BEF220]

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 5:19 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Ok you have to make a second instance of CSVReader that uses the "write" schema that UpdateRecord is using.
If you try to use the same CSVReader that UpdateRecord is using, then it is reading records with a schema that doesn't contain your new fields, which is why they are blank in your database.

On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

PutDatabaseRecord was reading the reader schema. Attached is the template of my flow.

Thanks
Aruna

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:38 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Can you make sure that your PutDatabaseRecord is using a reader with a schema that has those fields (i.e. the same schema as the writer of UpdateRecord)?
If PutDatabaseRecord was using the reader schema that UpdateRecord is using, then it would be reading in the records without those fields.
If that doesn't solve it then I would need to see a template of your whole flow.
Thanks,
Bryan

On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,
I tried both these options just now but sys_creat_ts is not getting populated for any record.
Option 1: If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
Option 2: If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:14 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Aruna,
If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

It looks like you have a mix of both because you said you changed it to long, but the error shows "2017-10-16 20:03:58" which isn't a long.

-Bryan


On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

I followed your methods and the flow was successful. However, the sys_creat_ts and actv_ind fields were not populated for any of the records.

[cid:image008.png@01D34825.00BEF220]

This is what I see in the “Data Provenance” of the PutDatabaseRecord processor.

ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
I20031103000013,HATO REY,917,PR,1508183538000,
I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
I20031103000030,AGUADILLA,6055256,PR,1508183538000,
I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,

So I changed the writer schema to use long for the sys_creat_ts field and started the flow but I got an error
“20:03:58 UTCERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508167656905-1, container=default, section=1], offset=8591, length=1606],offset=0,name=3-Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException: For input string: "2017-10-16 20:03:58"

“

[cid:image009.jpg@01D34825.00BEF220]



From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 3:29 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Sticking with the original logicalType of timestamp-millis, I was able to get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value of the timestamp column.

On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com>> wrote:
If it is helpful, here is a template of the working version where I changed to a long:

https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-timestamp.xml

On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com>> wrote:
Aruna,
I think the issue here might be on the reader side of things...
If you're incoming data does not have SYS_CREAT_TS in it, then your schema needs to allow null values for this field so you can create records from the incoming data, or you can create two versions of your schema and have the reader use a version that doesn't have this field, and have the writer use a version that does.

I am currently testing this using these examples:

{
"type": "record",
"name": "schema1",
"fields": [
 { "name": "id", "type": "string" }
]
}

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : { "type" : "long", "logicalType" : "timestamp-millis" } }
]
}

I have a CSVReader using schema1 and a CSVWriter using schema2.

I'm able to get past the issue you are having and now it is getting to the point where it is trying to take the result ofr ${now():toNumber()} and convert it to a timestamp, and I'm running into a different issue:

org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
    at org.apache.nifi.serialization.record.util.DataTypeUtils.toTimestamp(DataTypeUtils.java:564)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:134)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:84)
    at org.apache.nifi.serialization.record.MapRecord.setValue(MapRecord.java:317)
    at org.apache.nifi.record.path.St<http://org.apache.nifi.record.path.St>andardFieldValue.updateValue(StandardFieldValue.java:132)
    at org.apache.nifi.processors.standard.UpdateRecord.lambda$process$1(UpdateRecord.java:177)

I haven't figured out the issue, but it has something to do with the logic of trying to convert a string to timestamp and whether or not a date format is provided.

If I change schema1 to use a regular long for the timestamp field then it works:

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : "long" }
]
}

I don't know what the ramifications of this would be in PutDatabaseRecord.


On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Matt,

My issue is that when I am trying to assign current date to sys_creat_ts for every record, I am getting this error that it cannot be null. SYS_CREAT_TS is not coming from the source. It is present in the target table alone and I need to load the current datetime in that field for every record that is coming from source.

Mark,

I am still getting the same error.

Also, actv_ind must display just “Y” in the target table.


[cid:image010.png@01D34825.00BEF220]

[cid:image011.png@01D34825.00BEF220]

From: Mark Payne [mailto:markap14@hotmail.com<ma...@hotmail.com>]
Sent: Monday, October 16, 2017 12:03 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:
Aruna,
I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.
Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.
The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.
Cheers,
Russ
P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>










Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
A lot of the other functions like substring, concat, and replace already
exist, its just upper/lower that are missing.

You can see all the functions here:
https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html

On Wed, Oct 18, 2017 at 1:02 PM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Thank you Bryan.
>
>
>
> Does this hold good for all string manipulations? Like Substring,
> Concatenation, replace can all be done using Scripted Record Writed/Reader
> and not with UpdateRecord Processor?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Wednesday, October 18, 2017 12:40 PM
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Ideally the way this would be done is by using another UpdateRecord with
> Replacement Strategy set to Record Path, and then providing a record path
> like /CITY_NAME[ toUpper() ]
>
> Unfortunately record path currently does not have upper and lower
> functions, so I created this JIRA:
> https://issues.apache.org/jira/browse/NIFI-4498
>
> I think the only way to do this right now might be using a Scripted Record
> Reader/Writer to do it manually, some examples:
>
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_reader_inline.groovy
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_reader_xml.groovy
> https://github.com/apache/nifi/blob/master/nifi-nar-
> bundles/nifi-scripting-bundle/nifi-scripting-processors/src/
> test/resources/groovy/test_record_writer_inline.groovy
>
>
>
>
>
> On Wed, Oct 18, 2017 at 11:29 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> Sys_CREAT_TS and ACTV_IND got updated in the target table successfully.
> Thank you for your help.
>
>
>
> I am trying to convert one of the existing columns in CSV – CITY_NAME to
> Upper Case.  I don’t think my usage is correct. I tried with forward slash,
> quotes but the final value is coming as blank though the CSV file has valid
> values.
>
> I googled yesterday with no luck. Could you please let me know what I am
> missing?
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 5:19 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Ok you have to make a second instance of CSVReader that uses the "write"
> schema that UpdateRecord is using.
>
> If you try to use the same CSVReader that UpdateRecord is using, then it
> is reading records with a schema that doesn't contain your new fields,
> which is why they are blank in your database.
>
>
>
> On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> PutDatabaseRecord was reading the reader schema. Attached is the template
> of my flow.
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:38 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Can you make sure that your PutDatabaseRecord is using a reader with a
> schema that has those fields (i.e. the same schema as the writer of
> UpdateRecord)?
>
> If PutDatabaseRecord was using the reader schema that UpdateRecord is
> using, then it would be reading in the records without those fields.
>
> If that doesn't solve it then I would need to see a template of your whole
> flow.
>
> Thanks,
>
> Bryan
>
>
>
> On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
> I tried both these options just now but sys_creat_ts is not getting
> populated for any record.
>
> Option 1: If your writer schema has long for sys_create_ts then your
> UpdateRecord needs to have ${now():toNumber()}.
>
> Option 2: If your writer schema has the original logicalType, then you
> need ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:14 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Aruna,
>
> If your writer schema has long for sys_create_ts then your UpdateRecord
> needs to have ${now():toNumber()}.
>
> If your writer schema has the original logicalType, then you need
> ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> It looks like you have a mix of both because you said you changed it to
> long, but the error shows "2017-10-16 20:03:58" which isn't a long.
>
>
>
> -Bryan
>
>
>
>
>
> On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> I followed your methods and the flow was successful. However, the
> sys_creat_ts and actv_ind fields were not populated for any of the records.
>
>
>
>
>
> This is what I see in the “Data Provenance” of the PutDatabaseRecord
> processor.
>
>
>
> ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
>
> I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
>
> I20031103000013,HATO REY,917,PR,1508183538000,
>
> I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
>
> I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
>
> I20031103000030,AGUADILLA,6055256,PR,1508183538000,
>
> I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,
>
>
>
> So I changed the writer schema to use long for the sys_creat_ts field and
> started the flow but I got an error
>
> “*20:03:58 UTC**ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508167656905-1,
> container=default, section=1], offset=8591, length=1606],offset=0,name=3-
> Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException:
> For input string: "2017-10-16 20:03:58"
>
>
>
> “
>
>
>
>
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 3:29 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Sticking with the original logicalType of timestamp-millis, I was able to
> get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value
> of the timestamp column.
>
>
>
> On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> If it is helpful, here is a template of the working version where I
> changed to a long:
>
> https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70
> f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-
> timestamp.xml
>
>
>
> On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Aruna,
>
> I think the issue here might be on the reader side of things...
>
> If you're incoming data does not have SYS_CREAT_TS in it, then your schema
> needs to allow null values for this field so you can create records from
> the incoming data, or you can create two versions of your schema and have
> the reader use a version that doesn't have this field, and have the writer
> use a version that does.
>
>
>
> I am currently testing this using these examples:
>
>
>
> {
> "type": "record",
> "name": "schema1",
> "fields": [
>  { "name": "id", "type": "string" }
> ]
> }
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
> "timestamp-millis" } }
> ]
> }
>
>
>
> I have a CSVReader using schema1 and a CSVWriter using schema2.
>
>
>
> I'm able to get past the issue you are having and now it is getting to the
> point where it is trying to take the result ofr ${now():toNumber()} and
> convert it to a timestamp, and I'm running into a different issue:
>
>
>
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> toTimestamp(DataTypeUtils.java:564)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:134)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:84)
>     at org.apache.nifi.serialization.record.MapRecord.setValue(
> MapRecord.java:317)
>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(
> StandardFieldValue.java:132)
>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$
> process$1(UpdateRecord.java:177)
>
>
>
> I haven't figured out the issue, but it has something to do with the logic
> of trying to convert a string to timestamp and whether or not a date format
> is provided.
>
>
>
> If I change schema1 to use a regular long for the timestamp field then it
> works:
>
>
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : "long" }
> ]
> }
>
>
>
> I don't know what the ramifications of this would be in PutDatabaseRecord.
>
>
>
>
>
> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Matt,
>
>
>
> My issue is that when I am trying to assign current date to sys_creat_ts
> for every record, I am getting this error that it cannot be null.
> SYS_CREAT_TS is not coming from the source. It is present in the target
> table alone and I need to load the current datetime in that field for every
> record that is coming from source.
>
>
>
> Mark,
>
>
>
> I am still getting the same error.
>
>
>
> Also, actv_ind must display just “Y” in the target table.
>
>
>
>
>
>
>
>
>
> *From:* Mark Payne [mailto:markap14@hotmail.com]
> *Sent:* Monday, October 16, 2017 12:03 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Hi Aruna,
>
>
>
> I think the issue is likely that you are setting /sys_create_ts but your
> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>
> Would recommend you change the property names to /ACTV_IND and
> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>
> set to "{now()}" but I think what you really are wanting is
> "${now():toNumber()}".
>
>
>
> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the
> literal value '/Y' (with the single quotes) is what will be placed there.
> Is that
>
> the intention? Or did you actually want just /Y to be there?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>
>
>
> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
> <image004.png>   <image005.png>
>
>
>
> CSVRecordSetWriter:
>
> <image001.png>
>
>
>
> AVROSchemaRegistry:
>
> <image002.png>
>
>
>
> <image006.png>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
> <ij...@gmail.com>]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available.*
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
> <image003.png>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

RE: UpdateRecord Processor

Posted by Aruna Sankaralingam <Ar...@Cormac-Corp.com>.
Thank you Bryan.

Does this hold good for all string manipulations? Like Substring, Concatenation, replace can all be done using Scripted Record Writed/Reader and not with UpdateRecord Processor?

Thanks
Aruna

From: Bryan Bende [mailto:bbende@gmail.com]
Sent: Wednesday, October 18, 2017 12:40 PM
To: users@nifi.apache.org
Subject: Re: UpdateRecord Processor

Ideally the way this would be done is by using another UpdateRecord with Replacement Strategy set to Record Path, and then providing a record path like /CITY_NAME[ toUpper() ]
Unfortunately record path currently does not have upper and lower functions, so I created this JIRA:
https://issues.apache.org/jira/browse/NIFI-4498
I think the only way to do this right now might be using a Scripted Record Reader/Writer to do it manually, some examples:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_reader_inline.groovy
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_reader_xml.groovy
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_writer_inline.groovy



On Wed, Oct 18, 2017 at 11:29 AM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

Sys_CREAT_TS and ACTV_IND got updated in the target table successfully. Thank you for your help.

I am trying to convert one of the existing columns in CSV – CITY_NAME to Upper Case.  I don’t think my usage is correct. I tried with forward slash, quotes but the final value is coming as blank though the CSV file has valid values.
I googled yesterday with no luck. Could you please let me know what I am missing?

[cid:image001.png@01D34811.5A971110]

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 5:19 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Ok you have to make a second instance of CSVReader that uses the "write" schema that UpdateRecord is using.
If you try to use the same CSVReader that UpdateRecord is using, then it is reading records with a schema that doesn't contain your new fields, which is why they are blank in your database.

On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

PutDatabaseRecord was reading the reader schema. Attached is the template of my flow.

Thanks
Aruna

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:38 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Can you make sure that your PutDatabaseRecord is using a reader with a schema that has those fields (i.e. the same schema as the writer of UpdateRecord)?
If PutDatabaseRecord was using the reader schema that UpdateRecord is using, then it would be reading in the records without those fields.
If that doesn't solve it then I would need to see a template of your whole flow.
Thanks,
Bryan

On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,
I tried both these options just now but sys_creat_ts is not getting populated for any record.
Option 1: If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
Option 2: If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:14 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Aruna,
If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

It looks like you have a mix of both because you said you changed it to long, but the error shows "2017-10-16 20:03:58" which isn't a long.

-Bryan


On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

I followed your methods and the flow was successful. However, the sys_creat_ts and actv_ind fields were not populated for any of the records.

[cid:image002.png@01D34811.5A971110]

This is what I see in the “Data Provenance” of the PutDatabaseRecord processor.

ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
I20031103000013,HATO REY,917,PR,1508183538000,
I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
I20031103000030,AGUADILLA,6055256,PR,1508183538000,
I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,

So I changed the writer schema to use long for the sys_creat_ts field and started the flow but I got an error
“20:03:58 UTCERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508167656905-1, container=default, section=1], offset=8591, length=1606],offset=0,name=3-Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException: For input string: "2017-10-16 20:03:58"

“

[cid:image003.jpg@01D34811.5A971110]



From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 3:29 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Sticking with the original logicalType of timestamp-millis, I was able to get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value of the timestamp column.

On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com>> wrote:
If it is helpful, here is a template of the working version where I changed to a long:

https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-timestamp.xml

On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com>> wrote:
Aruna,
I think the issue here might be on the reader side of things...
If you're incoming data does not have SYS_CREAT_TS in it, then your schema needs to allow null values for this field so you can create records from the incoming data, or you can create two versions of your schema and have the reader use a version that doesn't have this field, and have the writer use a version that does.

I am currently testing this using these examples:

{
"type": "record",
"name": "schema1",
"fields": [
 { "name": "id", "type": "string" }
]
}

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : { "type" : "long", "logicalType" : "timestamp-millis" } }
]
}

I have a CSVReader using schema1 and a CSVWriter using schema2.

I'm able to get past the issue you are having and now it is getting to the point where it is trying to take the result ofr ${now():toNumber()} and convert it to a timestamp, and I'm running into a different issue:

org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
    at org.apache.nifi.serialization.record.util.DataTypeUtils.toTimestamp(DataTypeUtils.java:564)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:134)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:84)
    at org.apache.nifi.serialization.record.MapRecord.setValue(MapRecord.java:317)
    at org.apache.nifi.record.path.St<http://org.apache.nifi.record.path.St>andardFieldValue.updateValue(StandardFieldValue.java:132)
    at org.apache.nifi.processors.standard.UpdateRecord.lambda$process$1(UpdateRecord.java:177)

I haven't figured out the issue, but it has something to do with the logic of trying to convert a string to timestamp and whether or not a date format is provided.

If I change schema1 to use a regular long for the timestamp field then it works:

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : "long" }
]
}

I don't know what the ramifications of this would be in PutDatabaseRecord.


On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Matt,

My issue is that when I am trying to assign current date to sys_creat_ts for every record, I am getting this error that it cannot be null. SYS_CREAT_TS is not coming from the source. It is present in the target table alone and I need to load the current datetime in that field for every record that is coming from source.

Mark,

I am still getting the same error.

Also, actv_ind must display just “Y” in the target table.


[cid:image004.png@01D34811.5A971110]

[cid:image005.png@01D34811.5A971110]

From: Mark Payne [mailto:markap14@hotmail.com<ma...@hotmail.com>]
Sent: Monday, October 16, 2017 12:03 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:
Aruna,
I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.
Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.
The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.
Cheers,
Russ
P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>









Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
Ideally the way this would be done is by using another UpdateRecord with
Replacement Strategy set to Record Path, and then providing a record path
like /CITY_NAME[ toUpper() ]

Unfortunately record path currently does not have upper and lower
functions, so I created this JIRA:
https://issues.apache.org/jira/browse/NIFI-4498

I think the only way to do this right now might be using a Scripted Record
Reader/Writer to do it manually, some examples:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_reader_inline.groovy
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_reader_xml.groovy
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/groovy/test_record_writer_inline.groovy




On Wed, Oct 18, 2017 at 11:29 AM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Bryan,
>
>
>
> Sys_CREAT_TS and ACTV_IND got updated in the target table successfully.
> Thank you for your help.
>
>
>
> I am trying to convert one of the existing columns in CSV – CITY_NAME to
> Upper Case.  I don’t think my usage is correct. I tried with forward slash,
> quotes but the final value is coming as blank though the CSV file has valid
> values.
>
> I googled yesterday with no luck. Could you please let me know what I am
> missing?
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 5:19 PM
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Ok you have to make a second instance of CSVReader that uses the "write"
> schema that UpdateRecord is using.
>
> If you try to use the same CSVReader that UpdateRecord is using, then it
> is reading records with a schema that doesn't contain your new fields,
> which is why they are blank in your database.
>
>
>
> On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> PutDatabaseRecord was reading the reader schema. Attached is the template
> of my flow.
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:38 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Can you make sure that your PutDatabaseRecord is using a reader with a
> schema that has those fields (i.e. the same schema as the writer of
> UpdateRecord)?
>
> If PutDatabaseRecord was using the reader schema that UpdateRecord is
> using, then it would be reading in the records without those fields.
>
> If that doesn't solve it then I would need to see a template of your whole
> flow.
>
> Thanks,
>
> Bryan
>
>
>
> On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
> I tried both these options just now but sys_creat_ts is not getting
> populated for any record.
>
> Option 1: If your writer schema has long for sys_create_ts then your
> UpdateRecord needs to have ${now():toNumber()}.
>
> Option 2: If your writer schema has the original logicalType, then you
> need ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:14 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Aruna,
>
> If your writer schema has long for sys_create_ts then your UpdateRecord
> needs to have ${now():toNumber()}.
>
> If your writer schema has the original logicalType, then you need
> ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> It looks like you have a mix of both because you said you changed it to
> long, but the error shows "2017-10-16 20:03:58" which isn't a long.
>
>
>
> -Bryan
>
>
>
>
>
> On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> I followed your methods and the flow was successful. However, the
> sys_creat_ts and actv_ind fields were not populated for any of the records.
>
>
>
>
>
> This is what I see in the “Data Provenance” of the PutDatabaseRecord
> processor.
>
>
>
> ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
>
> I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
>
> I20031103000013,HATO REY,917,PR,1508183538000,
>
> I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
>
> I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
>
> I20031103000030,AGUADILLA,6055256,PR,1508183538000,
>
> I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,
>
>
>
> So I changed the writer schema to use long for the sys_creat_ts field and
> started the flow but I got an error
>
> “*20:03:58 UTC**ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508167656905-1,
> container=default, section=1], offset=8591, length=1606],offset=0,name=3-
> Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException:
> For input string: "2017-10-16 20:03:58"
>
>
>
> “
>
>
>
>
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 3:29 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Sticking with the original logicalType of timestamp-millis, I was able to
> get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value
> of the timestamp column.
>
>
>
> On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> If it is helpful, here is a template of the working version where I
> changed to a long:
>
> https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70
> f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-
> timestamp.xml
>
>
>
> On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Aruna,
>
> I think the issue here might be on the reader side of things...
>
> If you're incoming data does not have SYS_CREAT_TS in it, then your schema
> needs to allow null values for this field so you can create records from
> the incoming data, or you can create two versions of your schema and have
> the reader use a version that doesn't have this field, and have the writer
> use a version that does.
>
>
>
> I am currently testing this using these examples:
>
>
>
> {
> "type": "record",
> "name": "schema1",
> "fields": [
>  { "name": "id", "type": "string" }
> ]
> }
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
> "timestamp-millis" } }
> ]
> }
>
>
>
> I have a CSVReader using schema1 and a CSVWriter using schema2.
>
>
>
> I'm able to get past the issue you are having and now it is getting to the
> point where it is trying to take the result ofr ${now():toNumber()} and
> convert it to a timestamp, and I'm running into a different issue:
>
>
>
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> toTimestamp(DataTypeUtils.java:564)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:134)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:84)
>     at org.apache.nifi.serialization.record.MapRecord.setValue(
> MapRecord.java:317)
>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(
> StandardFieldValue.java:132)
>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$
> process$1(UpdateRecord.java:177)
>
>
>
> I haven't figured out the issue, but it has something to do with the logic
> of trying to convert a string to timestamp and whether or not a date format
> is provided.
>
>
>
> If I change schema1 to use a regular long for the timestamp field then it
> works:
>
>
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : "long" }
> ]
> }
>
>
>
> I don't know what the ramifications of this would be in PutDatabaseRecord.
>
>
>
>
>
> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Matt,
>
>
>
> My issue is that when I am trying to assign current date to sys_creat_ts
> for every record, I am getting this error that it cannot be null.
> SYS_CREAT_TS is not coming from the source. It is present in the target
> table alone and I need to load the current datetime in that field for every
> record that is coming from source.
>
>
>
> Mark,
>
>
>
> I am still getting the same error.
>
>
>
> Also, actv_ind must display just “Y” in the target table.
>
>
>
>
>
>
>
>
>
> *From:* Mark Payne [mailto:markap14@hotmail.com]
> *Sent:* Monday, October 16, 2017 12:03 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Hi Aruna,
>
>
>
> I think the issue is likely that you are setting /sys_create_ts but your
> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>
> Would recommend you change the property names to /ACTV_IND and
> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>
> set to "{now()}" but I think what you really are wanting is
> "${now():toNumber()}".
>
>
>
> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the
> literal value '/Y' (with the single quotes) is what will be placed there.
> Is that
>
> the intention? Or did you actually want just /Y to be there?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>
>
>
> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
> <image004.png>   <image005.png>
>
>
>
> CSVRecordSetWriter:
>
> <image001.png>
>
>
>
> AVROSchemaRegistry:
>
> <image002.png>
>
>
>
> <image006.png>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
> <ij...@gmail.com>]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available.*
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
> <image003.png>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

RE: UpdateRecord Processor

Posted by Aruna Sankaralingam <Ar...@Cormac-Corp.com>.
Bryan,

Sys_CREAT_TS and ACTV_IND got updated in the target table successfully. Thank you for your help.

I am trying to convert one of the existing columns in CSV – CITY_NAME to Upper Case.  I don’t think my usage is correct. I tried with forward slash, quotes but the final value is coming as blank though the CSV file has valid values.
I googled yesterday with no luck. Could you please let me know what I am missing?

[cid:image005.png@01D34804.48352D70]

From: Bryan Bende [mailto:bbende@gmail.com]
Sent: Monday, October 16, 2017 5:19 PM
To: users@nifi.apache.org
Subject: Re: UpdateRecord Processor

Ok you have to make a second instance of CSVReader that uses the "write" schema that UpdateRecord is using.
If you try to use the same CSVReader that UpdateRecord is using, then it is reading records with a schema that doesn't contain your new fields, which is why they are blank in your database.

On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

PutDatabaseRecord was reading the reader schema. Attached is the template of my flow.

Thanks
Aruna

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:38 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Can you make sure that your PutDatabaseRecord is using a reader with a schema that has those fields (i.e. the same schema as the writer of UpdateRecord)?
If PutDatabaseRecord was using the reader schema that UpdateRecord is using, then it would be reading in the records without those fields.
If that doesn't solve it then I would need to see a template of your whole flow.
Thanks,
Bryan

On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,
I tried both these options just now but sys_creat_ts is not getting populated for any record.
Option 1: If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
Option 2: If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:14 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Aruna,
If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

It looks like you have a mix of both because you said you changed it to long, but the error shows "2017-10-16 20:03:58" which isn't a long.

-Bryan


On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

I followed your methods and the flow was successful. However, the sys_creat_ts and actv_ind fields were not populated for any of the records.

[cid:image006.png@01D34804.48352D70]

This is what I see in the “Data Provenance” of the PutDatabaseRecord processor.

ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
I20031103000013,HATO REY,917,PR,1508183538000,
I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
I20031103000030,AGUADILLA,6055256,PR,1508183538000,
I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,

So I changed the writer schema to use long for the sys_creat_ts field and started the flow but I got an error
“20:03:58 UTCERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508167656905-1, container=default, section=1], offset=8591, length=1606],offset=0,name=3-Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException: For input string: "2017-10-16 20:03:58"

“

[cid:image007.jpg@01D34804.48352D70]



From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 3:29 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Sticking with the original logicalType of timestamp-millis, I was able to get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value of the timestamp column.

On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com>> wrote:
If it is helpful, here is a template of the working version where I changed to a long:

https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-timestamp.xml

On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com>> wrote:
Aruna,
I think the issue here might be on the reader side of things...
If you're incoming data does not have SYS_CREAT_TS in it, then your schema needs to allow null values for this field so you can create records from the incoming data, or you can create two versions of your schema and have the reader use a version that doesn't have this field, and have the writer use a version that does.

I am currently testing this using these examples:

{
"type": "record",
"name": "schema1",
"fields": [
 { "name": "id", "type": "string" }
]
}

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : { "type" : "long", "logicalType" : "timestamp-millis" } }
]
}

I have a CSVReader using schema1 and a CSVWriter using schema2.

I'm able to get past the issue you are having and now it is getting to the point where it is trying to take the result ofr ${now():toNumber()} and convert it to a timestamp, and I'm running into a different issue:

org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
    at org.apache.nifi.serialization.record.util.DataTypeUtils.toTimestamp(DataTypeUtils.java:564)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:134)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:84)
    at org.apache.nifi.serialization.record.MapRecord.setValue(MapRecord.java:317)
    at org.apache.nifi.record.path.St<http://org.apache.nifi.record.path.St>andardFieldValue.updateValue(StandardFieldValue.java:132)
    at org.apache.nifi.processors.standard.UpdateRecord.lambda$process$1(UpdateRecord.java:177)

I haven't figured out the issue, but it has something to do with the logic of trying to convert a string to timestamp and whether or not a date format is provided.

If I change schema1 to use a regular long for the timestamp field then it works:

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : "long" }
]
}

I don't know what the ramifications of this would be in PutDatabaseRecord.


On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Matt,

My issue is that when I am trying to assign current date to sys_creat_ts for every record, I am getting this error that it cannot be null. SYS_CREAT_TS is not coming from the source. It is present in the target table alone and I need to load the current datetime in that field for every record that is coming from source.

Mark,

I am still getting the same error.

Also, actv_ind must display just “Y” in the target table.


[cid:image008.png@01D34804.48352D70]

[cid:image009.png@01D34804.48352D70]

From: Mark Payne [mailto:markap14@hotmail.com<ma...@hotmail.com>]
Sent: Monday, October 16, 2017 12:03 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:
Aruna,
I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.
Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.
The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.
Cheers,
Russ
P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>








Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
Ok you have to make a second instance of CSVReader that uses the "write"
schema that UpdateRecord is using.

If you try to use the same CSVReader that UpdateRecord is using, then it is
reading records with a schema that doesn't contain your new fields, which
is why they are blank in your database.

On Mon, Oct 16, 2017 at 5:01 PM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Bryan,
>
>
>
> PutDatabaseRecord was reading the reader schema. Attached is the template
> of my flow.
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:38 PM
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Can you make sure that your PutDatabaseRecord is using a reader with a
> schema that has those fields (i.e. the same schema as the writer of
> UpdateRecord)?
>
> If PutDatabaseRecord was using the reader schema that UpdateRecord is
> using, then it would be reading in the records without those fields.
>
> If that doesn't solve it then I would need to see a template of your whole
> flow.
>
> Thanks,
>
> Bryan
>
>
>
> On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
> I tried both these options just now but sys_creat_ts is not getting
> populated for any record.
>
> Option 1: If your writer schema has long for sys_create_ts then your
> UpdateRecord needs to have ${now():toNumber()}.
>
> Option 2: If your writer schema has the original logicalType, then you
> need ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:14 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Aruna,
>
> If your writer schema has long for sys_create_ts then your UpdateRecord
> needs to have ${now():toNumber()}.
>
> If your writer schema has the original logicalType, then you need
> ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> It looks like you have a mix of both because you said you changed it to
> long, but the error shows "2017-10-16 20:03:58" which isn't a long.
>
>
>
> -Bryan
>
>
>
>
>
> On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> I followed your methods and the flow was successful. However, the
> sys_creat_ts and actv_ind fields were not populated for any of the records.
>
>
>
>
>
> This is what I see in the “Data Provenance” of the PutDatabaseRecord
> processor.
>
>
>
> ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
>
> I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
>
> I20031103000013,HATO REY,917,PR,1508183538000,
>
> I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
>
> I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
>
> I20031103000030,AGUADILLA,6055256,PR,1508183538000,
>
> I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,
>
>
>
> So I changed the writer schema to use long for the sys_creat_ts field and
> started the flow but I got an error
>
> “*20:03:58 UTC**ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508167656905-1,
> container=default, section=1], offset=8591, length=1606],offset=0,name=3-
> Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException:
> For input string: "2017-10-16 20:03:58"
>
>
>
> “
>
>
>
>
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 3:29 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Sticking with the original logicalType of timestamp-millis, I was able to
> get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value
> of the timestamp column.
>
>
>
> On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> If it is helpful, here is a template of the working version where I
> changed to a long:
>
> https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70
> f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-
> timestamp.xml
>
>
>
> On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Aruna,
>
> I think the issue here might be on the reader side of things...
>
> If you're incoming data does not have SYS_CREAT_TS in it, then your schema
> needs to allow null values for this field so you can create records from
> the incoming data, or you can create two versions of your schema and have
> the reader use a version that doesn't have this field, and have the writer
> use a version that does.
>
>
>
> I am currently testing this using these examples:
>
>
>
> {
> "type": "record",
> "name": "schema1",
> "fields": [
>  { "name": "id", "type": "string" }
> ]
> }
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
> "timestamp-millis" } }
> ]
> }
>
>
>
> I have a CSVReader using schema1 and a CSVWriter using schema2.
>
>
>
> I'm able to get past the issue you are having and now it is getting to the
> point where it is trying to take the result ofr ${now():toNumber()} and
> convert it to a timestamp, and I'm running into a different issue:
>
>
>
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> toTimestamp(DataTypeUtils.java:564)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:134)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:84)
>     at org.apache.nifi.serialization.record.MapRecord.setValue(
> MapRecord.java:317)
>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(
> StandardFieldValue.java:132)
>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$
> process$1(UpdateRecord.java:177)
>
>
>
> I haven't figured out the issue, but it has something to do with the logic
> of trying to convert a string to timestamp and whether or not a date format
> is provided.
>
>
>
> If I change schema1 to use a regular long for the timestamp field then it
> works:
>
>
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : "long" }
> ]
> }
>
>
>
> I don't know what the ramifications of this would be in PutDatabaseRecord.
>
>
>
>
>
> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Matt,
>
>
>
> My issue is that when I am trying to assign current date to sys_creat_ts
> for every record, I am getting this error that it cannot be null.
> SYS_CREAT_TS is not coming from the source. It is present in the target
> table alone and I need to load the current datetime in that field for every
> record that is coming from source.
>
>
>
> Mark,
>
>
>
> I am still getting the same error.
>
>
>
> Also, actv_ind must display just “Y” in the target table.
>
>
>
>
>
>
>
>
>
> *From:* Mark Payne [mailto:markap14@hotmail.com]
> *Sent:* Monday, October 16, 2017 12:03 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Hi Aruna,
>
>
>
> I think the issue is likely that you are setting /sys_create_ts but your
> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>
> Would recommend you change the property names to /ACTV_IND and
> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>
> set to "{now()}" but I think what you really are wanting is
> "${now():toNumber()}".
>
>
>
> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the
> literal value '/Y' (with the single quotes) is what will be placed there.
> Is that
>
> the intention? Or did you actually want just /Y to be there?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>
>
>
> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
> <image004.png>   <image005.png>
>
>
>
> CSVRecordSetWriter:
>
> <image001.png>
>
>
>
> AVROSchemaRegistry:
>
> <image002.png>
>
>
>
> <image006.png>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
> <ij...@gmail.com>]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available.*
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
> <image003.png>
>
>
>
>
>
>
>
>
>
>
>
>
>

RE: UpdateRecord Processor

Posted by Aruna Sankaralingam <Ar...@Cormac-Corp.com>.
Bryan,

PutDatabaseRecord was reading the reader schema. Attached is the template of my flow.

Thanks
Aruna

From: Bryan Bende [mailto:bbende@gmail.com]
Sent: Monday, October 16, 2017 4:38 PM
To: users@nifi.apache.org
Subject: Re: UpdateRecord Processor

Can you make sure that your PutDatabaseRecord is using a reader with a schema that has those fields (i.e. the same schema as the writer of UpdateRecord)?
If PutDatabaseRecord was using the reader schema that UpdateRecord is using, then it would be reading in the records without those fields.
If that doesn't solve it then I would need to see a template of your whole flow.
Thanks,
Bryan

On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,
I tried both these options just now but sys_creat_ts is not getting populated for any record.
Option 1: If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
Option 2: If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 4:14 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Aruna,
If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

It looks like you have a mix of both because you said you changed it to long, but the error shows "2017-10-16 20:03:58" which isn't a long.

-Bryan


On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

I followed your methods and the flow was successful. However, the sys_creat_ts and actv_ind fields were not populated for any of the records.

[cid:image001.png@01D346A0.7602C100]

This is what I see in the “Data Provenance” of the PutDatabaseRecord processor.

ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
I20031103000013,HATO REY,917,PR,1508183538000,
I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
I20031103000030,AGUADILLA,6055256,PR,1508183538000,
I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,

So I changed the writer schema to use long for the sys_creat_ts field and started the flow but I got an error
“20:03:58 UTCERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508167656905-1, container=default, section=1], offset=8591, length=1606],offset=0,name=3-Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException: For input string: "2017-10-16 20:03:58"

“

[cid:image002.jpg@01D346A0.7602C100]



From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 3:29 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Sticking with the original logicalType of timestamp-millis, I was able to get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value of the timestamp column.

On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com>> wrote:
If it is helpful, here is a template of the working version where I changed to a long:

https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-timestamp.xml

On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com>> wrote:
Aruna,
I think the issue here might be on the reader side of things...
If you're incoming data does not have SYS_CREAT_TS in it, then your schema needs to allow null values for this field so you can create records from the incoming data, or you can create two versions of your schema and have the reader use a version that doesn't have this field, and have the writer use a version that does.

I am currently testing this using these examples:

{
"type": "record",
"name": "schema1",
"fields": [
 { "name": "id", "type": "string" }
]
}

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : { "type" : "long", "logicalType" : "timestamp-millis" } }
]
}

I have a CSVReader using schema1 and a CSVWriter using schema2.

I'm able to get past the issue you are having and now it is getting to the point where it is trying to take the result ofr ${now():toNumber()} and convert it to a timestamp, and I'm running into a different issue:

org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
    at org.apache.nifi.serialization.record.util.DataTypeUtils.toTimestamp(DataTypeUtils.java:564)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:134)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:84)
    at org.apache.nifi.serialization.record.MapRecord.setValue(MapRecord.java:317)
    at org.apache.nifi.record.path.St<http://org.apache.nifi.record.path.St>andardFieldValue.updateValue(StandardFieldValue.java:132)
    at org.apache.nifi.processors.standard.UpdateRecord.lambda$process$1(UpdateRecord.java:177)

I haven't figured out the issue, but it has something to do with the logic of trying to convert a string to timestamp and whether or not a date format is provided.

If I change schema1 to use a regular long for the timestamp field then it works:

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : "long" }
]
}

I don't know what the ramifications of this would be in PutDatabaseRecord.


On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Matt,

My issue is that when I am trying to assign current date to sys_creat_ts for every record, I am getting this error that it cannot be null. SYS_CREAT_TS is not coming from the source. It is present in the target table alone and I need to load the current datetime in that field for every record that is coming from source.

Mark,

I am still getting the same error.

Also, actv_ind must display just “Y” in the target table.


[cid:image003.png@01D346A0.7602C100]

[cid:image004.png@01D346A0.7602C100]

From: Mark Payne [mailto:markap14@hotmail.com<ma...@hotmail.com>]
Sent: Monday, October 16, 2017 12:03 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:
Aruna,
I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.
Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.
The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.
Cheers,
Russ
P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>







Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
Can you make sure that your PutDatabaseRecord is using a reader with a
schema that has those fields (i.e. the same schema as the writer of
UpdateRecord)?

If PutDatabaseRecord was using the reader schema that UpdateRecord is
using, then it would be reading in the records without those fields.

If that doesn't solve it then I would need to see a template of your whole
flow.

Thanks,

Bryan


On Mon, Oct 16, 2017 at 4:26 PM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Bryan,
>
> I tried both these options just now but sys_creat_ts is not getting
> populated for any record.
>
> Option 1: If your writer schema has long for sys_create_ts then your
> UpdateRecord needs to have ${now():toNumber()}.
>
> Option 2: If your writer schema has the original logicalType, then you
> need ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 4:14 PM
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Aruna,
>
> If your writer schema has long for sys_create_ts then your UpdateRecord
> needs to have ${now():toNumber()}.
>
> If your writer schema has the original logicalType, then you need
> ${now():format('yyyy-MM-dd HH:mm:ss')}.
>
>
>
> It looks like you have a mix of both because you said you changed it to
> long, but the error shows "2017-10-16 20:03:58" which isn't a long.
>
>
>
> -Bryan
>
>
>
>
>
> On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Bryan,
>
>
>
> I followed your methods and the flow was successful. However, the
> sys_creat_ts and actv_ind fields were not populated for any of the records.
>
>
>
>
>
> This is what I see in the “Data Provenance” of the PutDatabaseRecord
> processor.
>
>
>
> ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
>
> I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
>
> I20031103000013,HATO REY,917,PR,1508183538000,
>
> I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
>
> I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
>
> I20031103000030,AGUADILLA,6055256,PR,1508183538000,
>
> I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,
>
>
>
> So I changed the writer schema to use long for the sys_creat_ts field and
> started the flow but I got an error
>
> “*20:03:58 UTC**ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508167656905-1,
> container=default, section=1], offset=8591, length=1606],offset=0,name=3-
> Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException:
> For input string: "2017-10-16 20:03:58"
>
>
>
> “
>
>
>
>
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 3:29 PM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Sticking with the original logicalType of timestamp-millis, I was able to
> get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value
> of the timestamp column.
>
>
>
> On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> If it is helpful, here is a template of the working version where I
> changed to a long:
>
> https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70
> f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-
> timestamp.xml
>
>
>
> On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Aruna,
>
> I think the issue here might be on the reader side of things...
>
> If you're incoming data does not have SYS_CREAT_TS in it, then your schema
> needs to allow null values for this field so you can create records from
> the incoming data, or you can create two versions of your schema and have
> the reader use a version that doesn't have this field, and have the writer
> use a version that does.
>
>
>
> I am currently testing this using these examples:
>
>
>
> {
> "type": "record",
> "name": "schema1",
> "fields": [
>  { "name": "id", "type": "string" }
> ]
> }
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
> "timestamp-millis" } }
> ]
> }
>
>
>
> I have a CSVReader using schema1 and a CSVWriter using schema2.
>
>
>
> I'm able to get past the issue you are having and now it is getting to the
> point where it is trying to take the result ofr ${now():toNumber()} and
> convert it to a timestamp, and I'm running into a different issue:
>
>
>
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> toTimestamp(DataTypeUtils.java:564)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:134)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:84)
>     at org.apache.nifi.serialization.record.MapRecord.setValue(
> MapRecord.java:317)
>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(
> StandardFieldValue.java:132)
>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$
> process$1(UpdateRecord.java:177)
>
>
>
> I haven't figured out the issue, but it has something to do with the logic
> of trying to convert a string to timestamp and whether or not a date format
> is provided.
>
>
>
> If I change schema1 to use a regular long for the timestamp field then it
> works:
>
>
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : "long" }
> ]
> }
>
>
>
> I don't know what the ramifications of this would be in PutDatabaseRecord.
>
>
>
>
>
> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Matt,
>
>
>
> My issue is that when I am trying to assign current date to sys_creat_ts
> for every record, I am getting this error that it cannot be null.
> SYS_CREAT_TS is not coming from the source. It is present in the target
> table alone and I need to load the current datetime in that field for every
> record that is coming from source.
>
>
>
> Mark,
>
>
>
> I am still getting the same error.
>
>
>
> Also, actv_ind must display just “Y” in the target table.
>
>
>
>
>
>
>
>
>
> *From:* Mark Payne [mailto:markap14@hotmail.com]
> *Sent:* Monday, October 16, 2017 12:03 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Hi Aruna,
>
>
>
> I think the issue is likely that you are setting /sys_create_ts but your
> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>
> Would recommend you change the property names to /ACTV_IND and
> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>
> set to "{now()}" but I think what you really are wanting is
> "${now():toNumber()}".
>
>
>
> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the
> literal value '/Y' (with the single quotes) is what will be placed there.
> Is that
>
> the intention? Or did you actually want just /Y to be there?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>
>
>
> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
> <image004.png>   <image005.png>
>
>
>
> CSVRecordSetWriter:
>
> <image001.png>
>
>
>
> AVROSchemaRegistry:
>
> <image002.png>
>
>
>
> <image006.png>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
> <ij...@gmail.com>]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available.*
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
> <image003.png>
>
>
>
>
>
>
>
>
>
>
>

RE: UpdateRecord Processor

Posted by Aruna Sankaralingam <Ar...@Cormac-Corp.com>.
Bryan,
I tried both these options just now but sys_creat_ts is not getting populated for any record.
Option 1: If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
Option 2: If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

From: Bryan Bende [mailto:bbende@gmail.com]
Sent: Monday, October 16, 2017 4:14 PM
To: users@nifi.apache.org
Subject: Re: UpdateRecord Processor

Aruna,
If your writer schema has long for sys_create_ts then your UpdateRecord needs to have ${now():toNumber()}.
If your writer schema has the original logicalType, then you need ${now():format('yyyy-MM-dd HH:mm:ss')}.

It looks like you have a mix of both because you said you changed it to long, but the error shows "2017-10-16 20:03:58" which isn't a long.

-Bryan


On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Bryan,

I followed your methods and the flow was successful. However, the sys_creat_ts and actv_ind fields were not populated for any of the records.

[cid:image001.png@01D3469B.7F4415C0]

This is what I see in the “Data Provenance” of the PutDatabaseRecord processor.

ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
I20031103000013,HATO REY,917,PR,1508183538000,
I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
I20031103000030,AGUADILLA,6055256,PR,1508183538000,
I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,

So I changed the writer schema to use long for the sys_creat_ts field and started the flow but I got an error
“20:03:58 UTCERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508167656905-1, container=default, section=1], offset=8591, length=1606],offset=0,name=3-Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException: For input string: "2017-10-16 20:03:58"

“

[cid:image002.jpg@01D3469B.7F4415C0]



From: Bryan Bende [mailto:bbende@gmail.com<ma...@gmail.com>]
Sent: Monday, October 16, 2017 3:29 PM

To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Sticking with the original logicalType of timestamp-millis, I was able to get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value of the timestamp column.

On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com>> wrote:
If it is helpful, here is a template of the working version where I changed to a long:

https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-timestamp.xml

On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com>> wrote:
Aruna,
I think the issue here might be on the reader side of things...
If you're incoming data does not have SYS_CREAT_TS in it, then your schema needs to allow null values for this field so you can create records from the incoming data, or you can create two versions of your schema and have the reader use a version that doesn't have this field, and have the writer use a version that does.

I am currently testing this using these examples:

{
"type": "record",
"name": "schema1",
"fields": [
 { "name": "id", "type": "string" }
]
}

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : { "type" : "long", "logicalType" : "timestamp-millis" } }
]
}

I have a CSVReader using schema1 and a CSVWriter using schema2.

I'm able to get past the issue you are having and now it is getting to the point where it is trying to take the result ofr ${now():toNumber()} and convert it to a timestamp, and I'm running into a different issue:

org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
    at org.apache.nifi.serialization.record.util.DataTypeUtils.toTimestamp(DataTypeUtils.java:564)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:134)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:84)
    at org.apache.nifi.serialization.record.MapRecord.setValue(MapRecord.java:317)
    at org.apache.nifi.record.path.St<http://org.apache.nifi.record.path.St>andardFieldValue.updateValue(StandardFieldValue.java:132)
    at org.apache.nifi.processors.standard.UpdateRecord.lambda$process$1(UpdateRecord.java:177)

I haven't figured out the issue, but it has something to do with the logic of trying to convert a string to timestamp and whether or not a date format is provided.

If I change schema1 to use a regular long for the timestamp field then it works:

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : "long" }
]
}

I don't know what the ramifications of this would be in PutDatabaseRecord.


On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Matt,

My issue is that when I am trying to assign current date to sys_creat_ts for every record, I am getting this error that it cannot be null. SYS_CREAT_TS is not coming from the source. It is present in the target table alone and I need to load the current datetime in that field for every record that is coming from source.

Mark,

I am still getting the same error.

Also, actv_ind must display just “Y” in the target table.


[cid:image003.png@01D3469B.7F4415C0]

[cid:image004.png@01D3469B.7F4415C0]

From: Mark Payne [mailto:markap14@hotmail.com<ma...@hotmail.com>]
Sent: Monday, October 16, 2017 12:03 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:
Aruna,
I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.
Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.
The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.
Cheers,
Russ
P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>






Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
Aruna,

If your writer schema has long for sys_create_ts then your UpdateRecord
needs to have ${now():toNumber()}.

If your writer schema has the original logicalType, then you need
${now():format('yyyy-MM-dd HH:mm:ss')}.

It looks like you have a mix of both because you said you changed it to
long, but the error shows "2017-10-16 20:03:58" which isn't a long.

-Bryan


On Mon, Oct 16, 2017 at 4:07 PM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Bryan,
>
>
>
> I followed your methods and the flow was successful. However, the
> sys_creat_ts and actv_ind fields were not populated for any of the records.
>
>
>
>
>
> This is what I see in the “Data Provenance” of the PutDatabaseRecord
> processor.
>
>
>
> ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
>
> I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
>
> I20031103000013,HATO REY,917,PR,1508183538000,
>
> I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
>
> I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
>
> I20031103000030,AGUADILLA,6055256,PR,1508183538000,
>
> I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,
>
>
>
> So I changed the writer schema to use long for the sys_creat_ts field and
> started the flow but I got an error
>
> “*20:03:58 UTC**ERROR*
>
> *1221b2fe-015f-1000-9235-11d12a9fdbcc*
>
> UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process
> StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1508167656905-1,
> container=default, section=1], offset=8591, length=1606],offset=0,name=3-
> Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException:
> For input string: "2017-10-16 20:03:58"
>
>
>
> “
>
>
>
>
>
>
>
>
>
> *From:* Bryan Bende [mailto:bbende@gmail.com]
> *Sent:* Monday, October 16, 2017 3:29 PM
>
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Sticking with the original logicalType of timestamp-millis, I was able to
> get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value
> of the timestamp column.
>
>
>
> On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> If it is helpful, here is a template of the working version where I
> changed to a long:
>
> https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70
> f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-
> timestamp.xml
>
>
>
> On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Aruna,
>
> I think the issue here might be on the reader side of things...
>
> If you're incoming data does not have SYS_CREAT_TS in it, then your schema
> needs to allow null values for this field so you can create records from
> the incoming data, or you can create two versions of your schema and have
> the reader use a version that doesn't have this field, and have the writer
> use a version that does.
>
>
>
> I am currently testing this using these examples:
>
>
>
> {
> "type": "record",
> "name": "schema1",
> "fields": [
>  { "name": "id", "type": "string" }
> ]
> }
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
> "timestamp-millis" } }
> ]
> }
>
>
>
> I have a CSVReader using schema1 and a CSVWriter using schema2.
>
>
>
> I'm able to get past the issue you are having and now it is getting to the
> point where it is trying to take the result ofr ${now():toNumber()} and
> convert it to a timestamp, and I'm running into a different issue:
>
>
>
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> toTimestamp(DataTypeUtils.java:564)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:134)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:84)
>     at org.apache.nifi.serialization.record.MapRecord.setValue(
> MapRecord.java:317)
>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(
> StandardFieldValue.java:132)
>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$
> process$1(UpdateRecord.java:177)
>
>
>
> I haven't figured out the issue, but it has something to do with the logic
> of trying to convert a string to timestamp and whether or not a date format
> is provided.
>
>
>
> If I change schema1 to use a regular long for the timestamp field then it
> works:
>
>
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : "long" }
> ]
> }
>
>
>
> I don't know what the ramifications of this would be in PutDatabaseRecord.
>
>
>
>
>
> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Matt,
>
>
>
> My issue is that when I am trying to assign current date to sys_creat_ts
> for every record, I am getting this error that it cannot be null.
> SYS_CREAT_TS is not coming from the source. It is present in the target
> table alone and I need to load the current datetime in that field for every
> record that is coming from source.
>
>
>
> Mark,
>
>
>
> I am still getting the same error.
>
>
>
> Also, actv_ind must display just “Y” in the target table.
>
>
>
>
>
>
>
>
>
> *From:* Mark Payne [mailto:markap14@hotmail.com]
> *Sent:* Monday, October 16, 2017 12:03 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Hi Aruna,
>
>
>
> I think the issue is likely that you are setting /sys_create_ts but your
> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>
> Would recommend you change the property names to /ACTV_IND and
> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>
> set to "{now()}" but I think what you really are wanting is
> "${now():toNumber()}".
>
>
>
> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the
> literal value '/Y' (with the single quotes) is what will be placed there.
> Is that
>
> the intention? Or did you actually want just /Y to be there?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>
>
>
> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
> <image004.png>   <image005.png>
>
>
>
> CSVRecordSetWriter:
>
> <image001.png>
>
>
>
> AVROSchemaRegistry:
>
> <image002.png>
>
>
>
> <image006.png>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
> <ij...@gmail.com>]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available.*
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
> <image003.png>
>
>
>
>
>
>
>
>
>

RE: UpdateRecord Processor

Posted by Aruna Sankaralingam <Ar...@Cormac-Corp.com>.
Bryan,

I followed your methods and the flow was successful. However, the sys_creat_ts and actv_ind fields were not populated for any of the records.

[cid:image003.png@01D34697.E528D1E0]

This is what I see in the “Data Provenance” of the PutDatabaseRecord processor.

ENRLMT_ID,CITY_NAME,ZIP_CD,STATE_CD,SYS_CREAT_TS,ACTV_IND
I20031103000005,MECHANICSBURG,170509414,PA,1508183538000,
I20031103000013,HATO REY,917,PR,1508183538000,
I20031103000015,TOMS RIVER,8757,NJ,1508183538000,
I20031103000028,JERSEY CITY,73062305,NJ,1508183538000,
I20031103000030,AGUADILLA,6055256,PR,1508183538000,
I20031103000032,BINGHAM FARMS,480254514,MI,1508183538000,

So I changed the writer schema to use long for the sys_creat_ts field and started the flow but I got an error
“20:03:58 UTCERROR
1221b2fe-015f-1000-9235-11d12a9fdbcc
UpdateRecord[id=1221b2fe-015f-1000-9235-11d12a9fdbcc] Failed to process StandardFlowFileRecord[uuid=25c48eec-42a0-4766-8605-69c5a0dad3cf,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1508167656905-1, container=default, section=1], offset=8591, length=1606],offset=0,name=3-Address_Sub-_File.csv,size=1606]; will route to failure: java.lang.NumberFormatException: For input string: "2017-10-16 20:03:58"

“

[cid:image006.jpg@01D34698.F1FFE8D0]



From: Bryan Bende [mailto:bbende@gmail.com]
Sent: Monday, October 16, 2017 3:29 PM
To: users@nifi.apache.org
Subject: Re: UpdateRecord Processor

Sticking with the original logicalType of timestamp-millis, I was able to get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value of the timestamp column.

On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com>> wrote:
If it is helpful, here is a template of the working version where I changed to a long:

https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-timestamp.xml

On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com>> wrote:
Aruna,
I think the issue here might be on the reader side of things...
If you're incoming data does not have SYS_CREAT_TS in it, then your schema needs to allow null values for this field so you can create records from the incoming data, or you can create two versions of your schema and have the reader use a version that doesn't have this field, and have the writer use a version that does.

I am currently testing this using these examples:

{
"type": "record",
"name": "schema1",
"fields": [
 { "name": "id", "type": "string" }
]
}

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : { "type" : "long", "logicalType" : "timestamp-millis" } }
]
}

I have a CSVReader using schema1 and a CSVWriter using schema2.

I'm able to get past the issue you are having and now it is getting to the point where it is trying to take the result ofr ${now():toNumber()} and convert it to a timestamp, and I'm running into a different issue:

org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Could not convert value [1508180727622] of type java.lang.String to Timestamp for field timestamp because the value is not in the expected date format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66<ma...@33d51c66>
    at org.apache.nifi.serialization.record.util.DataTypeUtils.toTimestamp(DataTypeUtils.java:564)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:134)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:84)
    at org.apache.nifi.serialization.record.MapRecord.setValue(MapRecord.java:317)
    at org.apache.nifi.record.path.St<http://org.apache.nifi.record.path.St>andardFieldValue.updateValue(StandardFieldValue.java:132)
    at org.apache.nifi.processors.standard.UpdateRecord.lambda$process$1(UpdateRecord.java:177)

I haven't figured out the issue, but it has something to do with the logic of trying to convert a string to timestamp and whether or not a date format is provided.

If I change schema1 to use a regular long for the timestamp field then it works:

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : "long" }
]
}

I don't know what the ramifications of this would be in PutDatabaseRecord.


On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <Ar...@cormac-corp.com>> wrote:
Matt,

My issue is that when I am trying to assign current date to sys_creat_ts for every record, I am getting this error that it cannot be null. SYS_CREAT_TS is not coming from the source. It is present in the target table alone and I need to load the current datetime in that field for every record that is coming from source.

Mark,

I am still getting the same error.

Also, actv_ind must display just “Y” in the target table.


[cid:image004.png@01D34697.E528D1E0]

[cid:image005.png@01D34697.E528D1E0]

From: Mark Payne [mailto:markap14@hotmail.com<ma...@hotmail.com>]
Sent: Monday, October 16, 2017 12:03 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: UpdateRecord Processor

Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:
Aruna,
I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.
Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.
The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.
Cheers,
Russ
P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>





Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
Sticking with the original logicalType of timestamp-millis, I was able to
get it working using ${now():format('yyyy-MM-dd HH:mm:ss')}  as the value
of the timestamp column.

On Mon, Oct 16, 2017 at 3:23 PM, Bryan Bende <bb...@gmail.com> wrote:

> If it is helpful, here is a template of the working version where I
> changed to a long:
>
> https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70
> f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-
> timestamp.xml
>
> On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:
>
>> Aruna,
>>
>> I think the issue here might be on the reader side of things...
>>
>> If you're incoming data does not have SYS_CREAT_TS in it, then your
>> schema needs to allow null values for this field so you can create records
>> from the incoming data, or you can create two versions of your schema and
>> have the reader use a version that doesn't have this field, and have the
>> writer use a version that does.
>>
>> I am currently testing this using these examples:
>>
>> {
>> "type": "record",
>> "name": "schema1",
>> "fields": [
>>  { "name": "id", "type": "string" }
>> ]
>> }
>>
>> {
>> "type": "record",
>> "name": "schema2",
>> "fields": [
>>  { "name": "id", "type": "string" },
>>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
>> "timestamp-millis" } }
>> ]
>> }
>>
>> I have a CSVReader using schema1 and a CSVWriter using schema2.
>>
>> I'm able to get past the issue you are having and now it is getting to
>> the point where it is trying to take the result ofr ${now():toNumber()} and
>> convert it to a timestamp, and I'm running into a different issue:
>>
>> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
>> Could not convert value [1508180727622] of type java.lang.String to
>> Timestamp for field timestamp because the value is not in the expected date
>> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lam
>> bda$414/572291470@33d51c66
>> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
>> Could not convert value [1508180727622] of type java.lang.String to
>> Timestamp for field timestamp because the value is not in the expected date
>> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$Lam
>> bda$414/572291470@33d51c66
>>     at org.apache.nifi.serialization.record.util.DataTypeUtils.toTi
>> mestamp(DataTypeUtils.java:564)
>>     at org.apache.nifi.serialization.record.util.DataTypeUtils.conv
>> ertType(DataTypeUtils.java:134)
>>     at org.apache.nifi.serialization.record.util.DataTypeUtils.conv
>> ertType(DataTypeUtils.java:84)
>>     at org.apache.nifi.serialization.record.MapRecord.setValue(MapR
>> ecord.java:317)
>>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(S
>> tandardFieldValue.java:132)
>>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$proc
>> ess$1(UpdateRecord.java:177)
>>
>> I haven't figured out the issue, but it has something to do with the
>> logic of trying to convert a string to timestamp and whether or not a date
>> format is provided.
>>
>> If I change schema1 to use a regular long for the timestamp field then it
>> works:
>>
>> {
>> "type": "record",
>> "name": "schema2",
>> "fields": [
>>  { "name": "id", "type": "string" },
>>  { "name": "timestamp", "type" : "long" }
>> ]
>> }
>>
>> I don't know what the ramifications of this would be in PutDatabaseRecord.
>>
>>
>> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
>> Aruna.Sankaralingam@cormac-corp.com> wrote:
>>
>>> Matt,
>>>
>>>
>>>
>>> My issue is that when I am trying to assign current date to sys_creat_ts
>>> for every record, I am getting this error that it cannot be null.
>>> SYS_CREAT_TS is not coming from the source. It is present in the target
>>> table alone and I need to load the current datetime in that field for every
>>> record that is coming from source.
>>>
>>>
>>>
>>> Mark,
>>>
>>>
>>>
>>> I am still getting the same error.
>>>
>>>
>>>
>>> Also, actv_ind must display just “Y” in the target table.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *From:* Mark Payne [mailto:markap14@hotmail.com]
>>> *Sent:* Monday, October 16, 2017 12:03 PM
>>> *To:* users@nifi.apache.org
>>> *Subject:* Re: UpdateRecord Processor
>>>
>>>
>>>
>>> Hi Aruna,
>>>
>>>
>>>
>>> I think the issue is likely that you are setting /sys_create_ts but your
>>> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>>>
>>> Would recommend you change the property names to /ACTV_IND and
>>> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>>>
>>> set to "{now()}" but I think what you really are wanting is
>>> "${now():toNumber()}".
>>>
>>>
>>>
>>> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that
>>> the literal value '/Y' (with the single quotes) is what will be placed
>>> there. Is that
>>>
>>> the intention? Or did you actually want just /Y to be there?
>>>
>>>
>>>
>>> Thanks
>>>
>>> -Mark
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
>>> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> I updated to version 1.4 now. I am using Updaterecord processor. I have
>>> assigned values to two of the columns that are there in the target table as
>>> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
>>> I missing something?
>>>
>>> I have provided the screenshots of the CSVReordWriter,
>>> AVROSchemaRegistry and my Nifi Flow below.
>>>
>>>
>>>
>>> <image004.png>   <image005.png>
>>>
>>>
>>>
>>> CSVRecordSetWriter:
>>>
>>> <image001.png>
>>>
>>>
>>>
>>> AVROSchemaRegistry:
>>>
>>> <image002.png>
>>>
>>>
>>>
>>> <image006.png>
>>>
>>>
>>>
>>>
>>>
>>> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
>>> <ij...@gmail.com>]
>>> *Sent:* Thursday, October 12, 2017 8:28 PM
>>> *To:* users@nifi.apache.org
>>> *Subject:* Re: Transformations using Nifi
>>>
>>>
>>>
>>> Hi Aruna,
>>>
>>>
>>>
>>> If you can not upgrade from NiFi 1.2.0, then I think the best bed is
>>> using:
>>>
>>> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and
>>> 3.
>>>
>>> As Russ mentioned, EL might be helpful when you implement the scripted
>>> components.
>>>
>>>
>>>
>>> #4 is a bit harder since it requires a database connection, but doable,
>>> I don't know it works efficiently though..
>>>
>>>
>>>
>>> Alternative approach would be create a temporary table, insert rows as
>>> they are, then perform a insert/update query using the temporary table and
>>> postgresql lookup table such as:
>>>
>>> "insert into X select a, b, c, from T inner join L on X.j = T.j where d
>>> = ..."
>>>
>>> Probably data conversion #1, 2, and 3 can be performed from this query
>>> as well.
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Koji
>>>
>>>
>>>
>>> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
>>> wrote:
>>>
>>> Aruna,
>>>
>>> I don't think there is any generalized NiFi training course yet. I
>>> started writing a book a year ago, but it's a pretty thankless task and
>>> takes a lot of time. I mostly write custom processors so I tend to see the
>>> world differently from most "user" users of NiFi. When I answer questions,
>>> which I don't do too much, I often tell an impractical story that doesn't
>>> answer the question asked.
>>>
>>> Now, I haven't done much database work at all, so I'm not going to be
>>> too much help to you. However, looking at your questions, particularly the
>>> one about obtaining the current date when the flow is run (#2) and also #3,
>>> I would suggest that you study the NiFi Expression Language. This would
>>> allow you to insert the "now" date into your flow at some point.
>>>
>>> The other suggestions I have are to experiment, which I'm sure you're
>>> doing, and Google hard for help. For this forum, which is filled with very
>>> nice and helpful people, you'll tend to get a lot more and better help if
>>> you come in with a very specific question rather than a list of things or a
>>> general, "help me" sort of plea.
>>>
>>> Cheers,
>>>
>>> Russ
>>>
>>> P.S. NiFi absolutely rocks, which you'll see as soon as you get over
>>> your initial hump here. But, you're on the right track.
>>>
>>>
>>>
>>> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>>>
>>> Hi,
>>>
>>>
>>>
>>> Could you someone please help me with these requirements in the email
>>> below?
>>>
>>>
>>>
>>> Thanks
>>>
>>> Aruna
>>>
>>>
>>>
>>> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
>>> <Ar...@Cormac-Corp.com>]
>>> *Sent:* Wednesday, October 11, 2017 11:26 AM
>>> *To:* users@nifi.apache.org
>>> *Subject:* Transformations using Nifi
>>>
>>>
>>>
>>> I am trying to see what kind of transformations that can be done in nifi
>>> and how.
>>>
>>> Now I have a basic flow that takes CSV from the local dir and puts into
>>> s3 and loads into postgres database.
>>>
>>> There are 4 columns in my test file 3 of which are string and one is an
>>> integer field. I would like to do the following before I load the data into
>>> postgres. If someone can help me on how to go about these, it will be great.
>>>
>>> 1.       Convert one of the string columns to upper case
>>>
>>> *For converting to upper case, I was told to use the Update Record
>>> processor but  my version is 1.2.0 and the update record processor is not
>>> available.*
>>>
>>>
>>>
>>> 2.       Postgres has an extra column called “Load_Date” in which I
>>> would like to load the current date with timestamp when the flow is run
>>>
>>> 3.       If the integer column has more than 5 digits, I would like to
>>> take only the first 5 digits and load to the table
>>>
>>> 4.       There is a look up table in postgres. I would like to check if
>>> the first column value is present in the look up table and if yes, proceed
>>> ahead and if not ignore the record
>>>
>>>
>>>
>>> I am trying to learn nifi so I would really appreciate any kind of help
>>> here. Is there any training available online that I can take in order to
>>> understand and do all these?
>>>
>>>
>>>
>>> <image003.png>
>>>
>>>
>>>
>>
>>
>

Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
If it is helpful, here is a template of the working version where I changed
to a long:

https://gist.githubusercontent.com/bbende/3187d3905f868bef3143899841bb70f9/raw/f266b149eb845a371543a1b68f9797d5437c6a2d/update-record-timestamp.xml

On Mon, Oct 16, 2017 at 3:19 PM, Bryan Bende <bb...@gmail.com> wrote:

> Aruna,
>
> I think the issue here might be on the reader side of things...
>
> If you're incoming data does not have SYS_CREAT_TS in it, then your schema
> needs to allow null values for this field so you can create records from
> the incoming data, or you can create two versions of your schema and have
> the reader use a version that doesn't have this field, and have the writer
> use a version that does.
>
> I am currently testing this using these examples:
>
> {
> "type": "record",
> "name": "schema1",
> "fields": [
>  { "name": "id", "type": "string" }
> ]
> }
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
> "timestamp-millis" } }
> ]
> }
>
> I have a CSVReader using schema1 and a CSVWriter using schema2.
>
> I'm able to get past the issue you are having and now it is getting to the
> point where it is trying to take the result ofr ${now():toNumber()} and
> convert it to a timestamp, and I'm running into a different issue:
>
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
> Could not convert value [1508180727622] of type java.lang.String to
> Timestamp for field timestamp because the value is not in the expected date
> format: org.apache.nifi.serialization.record.util.DataTypeUtils$$
> Lambda$414/572291470@33d51c66
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> toTimestamp(DataTypeUtils.java:564)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:134)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.
> convertType(DataTypeUtils.java:84)
>     at org.apache.nifi.serialization.record.MapRecord.setValue(
> MapRecord.java:317)
>     at org.apache.nifi.record.path.StandardFieldValue.updateValue(
> StandardFieldValue.java:132)
>     at org.apache.nifi.processors.standard.UpdateRecord.lambda$
> process$1(UpdateRecord.java:177)
>
> I haven't figured out the issue, but it has something to do with the logic
> of trying to convert a string to timestamp and whether or not a date format
> is provided.
>
> If I change schema1 to use a regular long for the timestamp field then it
> works:
>
> {
> "type": "record",
> "name": "schema2",
> "fields": [
>  { "name": "id", "type": "string" },
>  { "name": "timestamp", "type" : "long" }
> ]
> }
>
> I don't know what the ramifications of this would be in PutDatabaseRecord.
>
>
> On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
>> Matt,
>>
>>
>>
>> My issue is that when I am trying to assign current date to sys_creat_ts
>> for every record, I am getting this error that it cannot be null.
>> SYS_CREAT_TS is not coming from the source. It is present in the target
>> table alone and I need to load the current datetime in that field for every
>> record that is coming from source.
>>
>>
>>
>> Mark,
>>
>>
>>
>> I am still getting the same error.
>>
>>
>>
>> Also, actv_ind must display just “Y” in the target table.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Mark Payne [mailto:markap14@hotmail.com]
>> *Sent:* Monday, October 16, 2017 12:03 PM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: UpdateRecord Processor
>>
>>
>>
>> Hi Aruna,
>>
>>
>>
>> I think the issue is likely that you are setting /sys_create_ts but your
>> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>>
>> Would recommend you change the property names to /ACTV_IND and
>> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>>
>> set to "{now()}" but I think what you really are wanting is
>> "${now():toNumber()}".
>>
>>
>>
>> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that
>> the literal value '/Y' (with the single quotes) is what will be placed
>> there. Is that
>>
>> the intention? Or did you actually want just /Y to be there?
>>
>>
>>
>> Thanks
>>
>> -Mark
>>
>>
>>
>>
>>
>>
>>
>> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
>> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>>
>>
>>
>> Hi,
>>
>>
>>
>> I updated to version 1.4 now. I am using Updaterecord processor. I have
>> assigned values to two of the columns that are there in the target table as
>> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
>> I missing something?
>>
>> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
>> and my Nifi Flow below.
>>
>>
>>
>> <image004.png>   <image005.png>
>>
>>
>>
>> CSVRecordSetWriter:
>>
>> <image001.png>
>>
>>
>>
>> AVROSchemaRegistry:
>>
>> <image002.png>
>>
>>
>>
>> <image006.png>
>>
>>
>>
>>
>>
>> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
>> <ij...@gmail.com>]
>> *Sent:* Thursday, October 12, 2017 8:28 PM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: Transformations using Nifi
>>
>>
>>
>> Hi Aruna,
>>
>>
>>
>> If you can not upgrade from NiFi 1.2.0, then I think the best bed is
>> using:
>>
>> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>>
>> As Russ mentioned, EL might be helpful when you implement the scripted
>> components.
>>
>>
>>
>> #4 is a bit harder since it requires a database connection, but doable, I
>> don't know it works efficiently though..
>>
>>
>>
>> Alternative approach would be create a temporary table, insert rows as
>> they are, then perform a insert/update query using the temporary table and
>> postgresql lookup table such as:
>>
>> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
>> ..."
>>
>> Probably data conversion #1, 2, and 3 can be performed from this query as
>> well.
>>
>>
>>
>> Thanks,
>>
>> Koji
>>
>>
>>
>> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
>> wrote:
>>
>> Aruna,
>>
>> I don't think there is any generalized NiFi training course yet. I
>> started writing a book a year ago, but it's a pretty thankless task and
>> takes a lot of time. I mostly write custom processors so I tend to see the
>> world differently from most "user" users of NiFi. When I answer questions,
>> which I don't do too much, I often tell an impractical story that doesn't
>> answer the question asked.
>>
>> Now, I haven't done much database work at all, so I'm not going to be too
>> much help to you. However, looking at your questions, particularly the one
>> about obtaining the current date when the flow is run (#2) and also #3, I
>> would suggest that you study the NiFi Expression Language. This would allow
>> you to insert the "now" date into your flow at some point.
>>
>> The other suggestions I have are to experiment, which I'm sure you're
>> doing, and Google hard for help. For this forum, which is filled with very
>> nice and helpful people, you'll tend to get a lot more and better help if
>> you come in with a very specific question rather than a list of things or a
>> general, "help me" sort of plea.
>>
>> Cheers,
>>
>> Russ
>>
>> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
>> initial hump here. But, you're on the right track.
>>
>>
>>
>> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>>
>> Hi,
>>
>>
>>
>> Could you someone please help me with these requirements in the email
>> below?
>>
>>
>>
>> Thanks
>>
>> Aruna
>>
>>
>>
>> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
>> <Ar...@Cormac-Corp.com>]
>> *Sent:* Wednesday, October 11, 2017 11:26 AM
>> *To:* users@nifi.apache.org
>> *Subject:* Transformations using Nifi
>>
>>
>>
>> I am trying to see what kind of transformations that can be done in nifi
>> and how.
>>
>> Now I have a basic flow that takes CSV from the local dir and puts into
>> s3 and loads into postgres database.
>>
>> There are 4 columns in my test file 3 of which are string and one is an
>> integer field. I would like to do the following before I load the data into
>> postgres. If someone can help me on how to go about these, it will be great.
>>
>> 1.       Convert one of the string columns to upper case
>>
>> *For converting to upper case, I was told to use the Update Record
>> processor but  my version is 1.2.0 and the update record processor is not
>> available.*
>>
>>
>>
>> 2.       Postgres has an extra column called “Load_Date” in which I
>> would like to load the current date with timestamp when the flow is run
>>
>> 3.       If the integer column has more than 5 digits, I would like to
>> take only the first 5 digits and load to the table
>>
>> 4.       There is a look up table in postgres. I would like to check if
>> the first column value is present in the look up table and if yes, proceed
>> ahead and if not ignore the record
>>
>>
>>
>> I am trying to learn nifi so I would really appreciate any kind of help
>> here. Is there any training available online that I can take in order to
>> understand and do all these?
>>
>>
>>
>> <image003.png>
>>
>>
>>
>
>

Re: UpdateRecord Processor

Posted by Bryan Bende <bb...@gmail.com>.
Aruna,

I think the issue here might be on the reader side of things...

If you're incoming data does not have SYS_CREAT_TS in it, then your schema
needs to allow null values for this field so you can create records from
the incoming data, or you can create two versions of your schema and have
the reader use a version that doesn't have this field, and have the writer
use a version that does.

I am currently testing this using these examples:

{
"type": "record",
"name": "schema1",
"fields": [
 { "name": "id", "type": "string" }
]
}

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : { "type" : "long", "logicalType" :
"timestamp-millis" } }
]
}

I have a CSVReader using schema1 and a CSVWriter using schema2.

I'm able to get past the issue you are having and now it is getting to the
point where it is trying to take the result ofr ${now():toNumber()} and
convert it to a timestamp, and I'm running into a different issue:

org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
Could not convert value [1508180727622] of type java.lang.String to
Timestamp for field timestamp because the value is not in the expected date
format:
org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66
org.apache.nifi.serialization.record.util.IllegalTypeConversionException:
Could not convert value [1508180727622] of type java.lang.String to
Timestamp for field timestamp because the value is not in the expected date
format:
org.apache.nifi.serialization.record.util.DataTypeUtils$$Lambda$414/572291470@33d51c66
    at
org.apache.nifi.serialization.record.util.DataTypeUtils.toTimestamp(DataTypeUtils.java:564)
    at
org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:134)
    at
org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:84)
    at
org.apache.nifi.serialization.record.MapRecord.setValue(MapRecord.java:317)
    at
org.apache.nifi.record.path.StandardFieldValue.updateValue(StandardFieldValue.java:132)
    at
org.apache.nifi.processors.standard.UpdateRecord.lambda$process$1(UpdateRecord.java:177)

I haven't figured out the issue, but it has something to do with the logic
of trying to convert a string to timestamp and whether or not a date format
is provided.

If I change schema1 to use a regular long for the timestamp field then it
works:

{
"type": "record",
"name": "schema2",
"fields": [
 { "name": "id", "type": "string" },
 { "name": "timestamp", "type" : "long" }
]
}

I don't know what the ramifications of this would be in PutDatabaseRecord.


On Mon, Oct 16, 2017 at 2:23 PM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Matt,
>
>
>
> My issue is that when I am trying to assign current date to sys_creat_ts
> for every record, I am getting this error that it cannot be null.
> SYS_CREAT_TS is not coming from the source. It is present in the target
> table alone and I need to load the current datetime in that field for every
> record that is coming from source.
>
>
>
> Mark,
>
>
>
> I am still getting the same error.
>
>
>
> Also, actv_ind must display just “Y” in the target table.
>
>
>
>
>
>
>
>
>
> *From:* Mark Payne [mailto:markap14@hotmail.com]
> *Sent:* Monday, October 16, 2017 12:03 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UpdateRecord Processor
>
>
>
> Hi Aruna,
>
>
>
> I think the issue is likely that you are setting /sys_create_ts but your
> schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
>
> Would recommend you change the property names to /ACTV_IND and
> /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
>
> set to "{now()}" but I think what you really are wanting is
> "${now():toNumber()}".
>
>
>
> Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the
> literal value '/Y' (with the single quotes) is what will be placed there.
> Is that
>
> the intention? Or did you actually want just /Y to be there?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@Cormac-Corp.com> wrote:
>
>
>
> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
> <image004.png>   <image005.png>
>
>
>
> CSVRecordSetWriter:
>
> <image001.png>
>
>
>
> AVROSchemaRegistry:
>
> <image002.png>
>
>
>
> <image006.png>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com
> <ij...@gmail.com>]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available.*
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
> <image003.png>
>
>
>

RE: UpdateRecord Processor

Posted by Aruna Sankaralingam <Ar...@Cormac-Corp.com>.
Matt,

My issue is that when I am trying to assign current date to sys_creat_ts for every record, I am getting this error that it cannot be null. SYS_CREAT_TS is not coming from the source. It is present in the target table alone and I need to load the current datetime in that field for every record that is coming from source.

Mark,

I am still getting the same error.

Also, actv_ind must display just “Y” in the target table.


[cid:image001.png@01D3468A.1F15F440]

[cid:image002.png@01D3468A.1F15F440]

From: Mark Payne [mailto:markap14@hotmail.com]
Sent: Monday, October 16, 2017 12:03 PM
To: users@nifi.apache.org
Subject: Re: UpdateRecord Processor

Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:
Aruna,
I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.
Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.
The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.
Cheers,
Russ
P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.

On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record

I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>


Re: UpdateRecord Processor

Posted by Mark Payne <ma...@hotmail.com>.
Hi Aruna,

I think the issue is likely that you are setting /sys_create_ts but your schema has that field as SYS_CREAT_TS (i.e., it is in upper case).
Would recommend you change the property names to /ACTV_IND and /SYS_CREAT_TS. Also, the value that you have for /sys_creat_ts is
set to "{now()}" but I think what you really are wanting is "${now():toNumber()}".

Also, of note, /actv_ind is set to '/Y' so I just want to clarify that the literal value '/Y' (with the single quotes) is what will be placed there. Is that
the intention? Or did you actually want just /Y to be there?

Thanks
-Mark



On Oct 16, 2017, at 11:41 AM, Aruna Sankaralingam <Ar...@Cormac-Corp.com>> wrote:

Hi,

I updated to version 1.4 now. I am using Updaterecord processor. I have assigned values to two of the columns that are there in the target table as shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am I missing something?
I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry and my Nifi Flow below.

<image004.png>   <image005.png>

CSVRecordSetWriter:
<image001.png>

AVROSchemaRegistry:
<image002.png>

<image006.png>


From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
Sent: Thursday, October 12, 2017 8:28 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Transformations using Nifi

Hi Aruna,

If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
As Russ mentioned, EL might be helpful when you implement the scripted components.

#4 is a bit harder since it requires a database connection, but doable, I don't know it works efficiently though..

Alternative approach would be create a temporary table, insert rows as they are, then perform a insert/update query using the temporary table and postgresql lookup table such as:
"insert into X select a, b, c, from T inner join L on X.j = T.j where d = ..."
Probably data conversion #1, 2, and 3 can be performed from this query as well.

Thanks,
Koji

On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>> wrote:

Aruna,

I don't think there is any generalized NiFi training course yet. I started writing a book a year ago, but it's a pretty thankless task and takes a lot of time. I mostly write custom processors so I tend to see the world differently from most "user" users of NiFi. When I answer questions, which I don't do too much, I often tell an impractical story that doesn't answer the question asked.

Now, I haven't done much database work at all, so I'm not going to be too much help to you. However, looking at your questions, particularly the one about obtaining the current date when the flow is run (#2) and also #3, I would suggest that you study the NiFi Expression Language. This would allow you to insert the "now" date into your flow at some point.

The other suggestions I have are to experiment, which I'm sure you're doing, and Google hard for help. For this forum, which is filled with very nice and helpful people, you'll tend to get a lot more and better help if you come in with a very specific question rather than a list of things or a general, "help me" sort of plea.

Cheers,

Russ

P.S. NiFi absolutely rocks, which you'll see as soon as you get over your initial hump here. But, you're on the right track.


On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
Hi,

Could you someone please help me with these requirements in the email below?

Thanks
Aruna

From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com]
Sent: Wednesday, October 11, 2017 11:26 AM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Transformations using Nifi

I am trying to see what kind of transformations that can be done in nifi and how.
Now I have a basic flow that takes CSV from the local dir and puts into s3 and loads into postgres database.
There are 4 columns in my test file 3 of which are string and one is an integer field. I would like to do the following before I load the data into postgres. If someone can help me on how to go about these, it will be great.

1.       Convert one of the string columns to upper case

For converting to upper case, I was told to use the Update Record processor but  my version is 1.2.0 and the update record processor is not available.



2.       Postgres has an extra column called “Load_Date” in which I would like to load the current date with timestamp when the flow is run

3.       If the integer column has more than 5 digits, I would like to take only the first 5 digits and load to the table

4.       There is a look up table in postgres. I would like to check if the first column value is present in the look up table and if yes, proceed ahead and if not ignore the record


I am trying to learn nifi so I would really appreciate any kind of help here. Is there any training available online that I can take in order to understand and do all these?

<image003.png>


Re: UpdateRecord Processor

Posted by Matt Burgess <ma...@apache.org>.
Aruna,

In an Avro schema you have to explicitly specify that a field can be null.
This is done with a union, so for ENRLMT_ID the "type" entry would look
like this:

{ "name": "ENRLMT_ID", "type": ["null", "string"] }

And for SYS CREATE_TS I believe it would be:

{ "name": "SYS_CREATE_TS", "type": ["null", { "type": "long",
"logicalType": "timestamp-millis" } ] }


Having said that, NiFi is using Avro version 1.8.1, so you may run into the
AVRO-1891 [1] bug. If you do, please let us know and we can write a Jira to
upgrade the version to at least 1.8.3.  A workaround might be to use
QueryRecord to filter out the records that have a null value for that
field, and handle them separately, perhaps using UpdateRecord to supply a
default value.


Regards,
Matt

[1] https://issues.apache.org/jira/browse/AVRO-1891

On Mon, Oct 16, 2017 at 11:41 AM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Hi,
>
>
>
> I updated to version 1.4 now. I am using Updaterecord processor. I have
> assigned values to two of the columns that are there in the target table as
> shown below but I am getting the error that SYS_CREAT_TS cannot be NULL. Am
> I missing something?
>
> I have provided the screenshots of the CSVReordWriter, AVROSchemaRegistry
> and my Nifi Flow below.
>
>
>
>
>
>
>
> CSVRecordSetWriter:
>
>
>
> AVROSchemaRegistry:
>
>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com]
> *Sent:* Thursday, October 12, 2017 8:28 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Transformations using Nifi
>
>
>
> Hi Aruna,
>
>
>
> If you can not upgrade from NiFi 1.2.0, then I think the best bed is using:
>
> ScriptedReader or ScriptedRecordSetWriter for data conversion #1, 2 and 3.
>
> As Russ mentioned, EL might be helpful when you implement the scripted
> components.
>
>
>
> #4 is a bit harder since it requires a database connection, but doable, I
> don't know it works efficiently though..
>
>
>
> Alternative approach would be create a temporary table, insert rows as
> they are, then perform a insert/update query using the temporary table and
> postgresql lookup table such as:
>
> "insert into X select a, b, c, from T inner join L on X.j = T.j where d =
> ..."
>
> Probably data conversion #1, 2, and 3 can be performed from this query as
> well.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Thu, Oct 12, 2017 at 11:50 PM, Russell Bateman <ru...@windofkeltia.com>
> wrote:
>
> Aruna,
>
> I don't think there is any generalized NiFi training course yet. I started
> writing a book a year ago, but it's a pretty thankless task and takes a lot
> of time. I mostly write custom processors so I tend to see the world
> differently from most "user" users of NiFi. When I answer questions, which
> I don't do too much, I often tell an impractical story that doesn't answer
> the question asked.
>
> Now, I haven't done much database work at all, so I'm not going to be too
> much help to you. However, looking at your questions, particularly the one
> about obtaining the current date when the flow is run (#2) and also #3, I
> would suggest that you study the NiFi Expression Language. This would allow
> you to insert the "now" date into your flow at some point.
>
> The other suggestions I have are to experiment, which I'm sure you're
> doing, and Google hard for help. For this forum, which is filled with very
> nice and helpful people, you'll tend to get a lot more and better help if
> you come in with a very specific question rather than a list of things or a
> general, "help me" sort of plea.
>
> Cheers,
>
> Russ
>
> P.S. NiFi absolutely rocks, which you'll see as soon as you get over your
> initial hump here. But, you're on the right track.
>
>
>
> On 10/12/2017 08:16 AM, Aruna Sankaralingam wrote:
>
> Hi,
>
>
>
> Could you someone please help me with these requirements in the email
> below?
>
>
>
> Thanks
>
> Aruna
>
>
>
> *From:* Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Ar...@Cormac-Corp.com>]
> *Sent:* Wednesday, October 11, 2017 11:26 AM
> *To:* users@nifi.apache.org
> *Subject:* Transformations using Nifi
>
>
>
> I am trying to see what kind of transformations that can be done in nifi
> and how.
>
> Now I have a basic flow that takes CSV from the local dir and puts into s3
> and loads into postgres database.
>
> There are 4 columns in my test file 3 of which are string and one is an
> integer field. I would like to do the following before I load the data into
> postgres. If someone can help me on how to go about these, it will be great.
>
> 1.       Convert one of the string columns to upper case
>
> *For converting to upper case, I was told to use the Update Record
> processor but  my version is 1.2.0 and the update record processor is not
> available. *
>
>
>
> 2.       Postgres has an extra column called “Load_Date” in which I would
> like to load the current date with timestamp when the flow is run
>
> 3.       If the integer column has more than 5 digits, I would like to
> take only the first 5 digits and load to the table
>
> 4.       There is a look up table in postgres. I would like to check if
> the first column value is present in the look up table and if yes, proceed
> ahead and if not ignore the record
>
>
>
> I am trying to learn nifi so I would really appreciate any kind of help
> here. Is there any training available online that I can take in order to
> understand and do all these?
>
>
>
>
>
>
>