Posted to users@nifi.apache.org by Jean-Sebastien Vachon <js...@brizodata.com> on 2018/07/18 21:41:15 UTC

Attributes vs JOLTTransformJSON

Hi all,

I'm using a JOLT transformation at the very end of my flow to filter out some fields that I don't want to send to Elasticsearch for indexing. So far it is working great, but I'd also like to include the value of a flowfile attribute (docId) in the transformation output.

My JOLT specs are:
[{
    "operation": "shift",
    "spec": {
        "companyId": "&",
        "companyName": "&",
        "s3Key": "&",
        "runId": "&",
        "urls": "&",
        "urlId": "&",
        "urlLevel": "&",
        "urlAddress": "&",
        "docId": "${docId}"
    }
}]

When I run my flow through this processor, the result is (check the last field):

{
  "companyId" : 1,
  "companyName" : "some company",
  "s3Key" : "1.9fe1cf4d384cd0a4cec3d97f54ae5a8d.json",
  "runId" : 1,
  "urls" : [ {
    "url" : "http://www.somecompany.com",
    "id" : 0,
    "filter_status" : "ok"
  }, {
    "url" : "http://www.somecompany.com/contact",
    "id" : 0,
    "filter_status" : "ok"
  }, {
    "url" : "http://www.somecompany.com/#nav",
    "id" : 0,
    "filter_status" : "ok"
  }, {
    "url" : "http://www.somecompany.com#top",
    "id" : 0,
    "filter_status" : "ok"
  } ],
  "urlId" : 1,
  "urlLevel" : 0,
  "urlAddress" : "http://www.somecompany.com",
  "1001" : "1001"
}

I was expecting the last field to read like "docId": "1001"...
Now, I'm pretty sure this is obvious to someone experienced with JOLT but I googled a bit and could not find good documentation about JOLT's syntax.

Thanks
--
Jean-Sébastien Vachon

vachonjs@gmail.com
jsvachon@brizodata.com
www.brizodata.com


RE: Attributes vs JOLTTransformJSON

Posted by Jean-Sebastien Vachon <js...@brizodata.com>.
Hi,

Thanks for the advice. I don’t really like Javadoc but it did the job 😉

Regards

From: Juan Pablo Gardella <ga...@gmail.com>
Sent: July 18, 2018 5:46 PM
To: users@nifi.apache.org
Subject: Re: Attributes vs JOLTTransformJSON

The best docs for Jolt are its Javadoc. I suggest checking out the code and reading it from there; it also includes examples.



Re: Attributes vs JOLTTransformJSON

Posted by Juan Pablo Gardella <ga...@gmail.com>.
The best docs for Jolt are its Javadoc. I suggest checking out the code and
reading it from there; it also includes examples.
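
For readers who land on this thread, here is a minimal sketch of one way to get "docId": "1001" in the output. It rests on assumptions the thread does not confirm: that NiFi substitutes ${docId} in the spec before Jolt runs (which the "1001" : "1001" result suggests), and that adding a second operation is acceptable in this flow. In a shift spec the right-hand side is an output path, so after substitution the entry reads "docId": "1001" and the input's docId value is written under a key named 1001. A "default" operation instead treats the right-hand side as a literal value, and only fills it in when the key is missing or null:

```json
[
  {
    "operation": "shift",
    "spec": {
      "companyId": "&",
      "companyName": "&",
      "s3Key": "&",
      "runId": "&",
      "urls": "&",
      "urlId": "&",
      "urlLevel": "&",
      "urlAddress": "&"
    }
  },
  {
    "operation": "default",
    "spec": {
      "docId": "${docId}"
    }
  }
]
```

If the incoming JSON already carries a docId field (as the output in the question implies), adding "docId": "&" to the shift spec may be enough on its own, with no second operation.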
