Posted to user@avro.apache.org by Mike Thomsen <mi...@gmail.com> on 2017/10/24 18:18:37 UTC

Is it possible to use $ characters in field names?

I am trying to build an Avro schema for a NiFi flow that is going to insert
data into Mongo, and Mongo extended JSON requires the use of $ characters
in cases like this (to represent a date):

{
    "timestamp": {
        "$date": TIMESTAMP_LONG_HERE
    }
}

I tried building a schema with that, and it failed saying there was an
invalid character in the schema. I just wanted to check and see if there was
a workaround for this or if I'll have to choose another option.

Thanks,

Mike

Re: Is it possible to use $ characters in field names?

Posted by Mike Thomsen <mi...@gmail.com>.
It looks like I might have been overthinking this, as there is a NiFi
Record API capability for handling Date objects that looks like it might be
able to sidestep the problem entirely by converting a string into its Date
representation. I'll explore that route, and if it works I will try to follow
up with findings on the off chance someone else goes down this path.

On Wed, Oct 25, 2017 at 2:21 PM, Mike Thomsen <mi...@gmail.com>
wrote:

> The problem is actually with that processor. When I wrote it, I used a
> naive approach to reading the records and turning them into Mongo Document
> objects.
>
> Now what COULD work is if I could use the "date" logical type to create an
> Avro date that could return a java.util.Date object. Mongo's client API
> will not have a problem with that.
>
> I'll take this over to nifi-dev to see what others think.
>
> Thanks.
>
> On Wed, Oct 25, 2017 at 12:08 PM, Sean Busbey <bu...@cloudera.com> wrote:
>
>> Shoot, my attempt to copy in the NiFi user list failed. Mike, if using the
>> PutMongoRecord processor might work, the folks on that list are more likely
>> to be able to help with edge cases.
>>
>> If you need the intermediate JSON for some reason, I think there's a JSON
>> transforming processor that you could maybe use to rewrite the JSON records
>> with the right field name?
>>
>> On Wed, Oct 25, 2017 at 11:05 AM, Sean Busbey <bu...@cloudera.com>
>> wrote:
>>
>>> +users@nifi.apache.org[1]
>>>
>>> Could you keep the data in Avro and then use NiFi's PutMongoRecord
>>> processor[2] with an AvroReader to insert?
>>>
>>>
>>> [1]: https://lists.apache.org/list.html?users@nifi.apache.org
>>> [2]: https://s.apache.org/MmPG
>>>
>>> On Wed, Oct 25, 2017 at 7:51 AM, Mike Thomsen <mi...@gmail.com>
>>> wrote:
>>>
>>>> No, it doesn't look like it's going to work. It accepts $date into the
>>>> record using the alias, but it doesn't generate $date as the field name
>>>> when writing the object back to JSON.
>>>>
>>>> On Wed, Oct 25, 2017 at 8:19 AM, Nandor Kollar <nk...@cloudera.com>
>>>> wrote:
>>>>
>>>>> Oh yes, you're right, you are running into the limitation on field names
>>>>> <https://avro.apache.org/docs/1.8.0/spec.html#names>. Apart from
>>>>> solving this via a map, you might consider using Avro aliases
>>>>> <https://avro.apache.org/docs/1.8.2/spec.html#Aliases>, since it looks
>>>>> like aliases don't have this limitation. Can you use them?
>>>>>
>>>>> Nandor
>>>>>
>>>>> On Wed, Oct 25, 2017 at 1:40 PM, Mike Thomsen <mi...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Nandor,
>>>>>>
>>>>>> It's not the numeric portion that is the problem for me, but the
>>>>>> $date field name. Mongo apparently requires the structure I provided in the
>>>>>> example, and whenever I use $date as the field name the Java Avro API
>>>>>> throws an exception about an invalid character in the field definition.
>>>>>>
>>>>>> The logical type thing is good to know for future reference.
>>>>>>
>>>>>> I admit that this is likely a really uncommon edge case for Avro. The
>>>>>> workaround I found for defining a schema that is at least compatible with
>>>>>> the Mongo Extended JSON requirements was to do this (one field example):
>>>>>>
>>>>>> {
>>>>>>     "namespace": "test",
>>>>>>     "name": "PutTestRecord",
>>>>>>     "type": "record",
>>>>>>     "fields": [{
>>>>>>         "name": "timestampField",
>>>>>>         "type": {
>>>>>>             "type": "map",
>>>>>>             "values": "long"
>>>>>>         }
>>>>>>     }]
>>>>>> }
>>>>>>
>>>>>> It doesn't give you the full validation that would be ideal if we
>>>>>> could define a field with the name "$date", but it's an 80% solution that
>>>>>> works with NiFi and other tools that have to generate Extended JSON for
>>>>>> Mongo.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Wed, Oct 25, 2017 at 4:48 AM, Nandor Kollar <nk...@cloudera.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mike,
>>>>>>>
>>>>>>> This JSON doesn't seem like a valid Avro schema
>>>>>>> <https://avro.apache.org/docs/1.8.1/spec.html#schemas>. If you'd
>>>>>>> like to use timestamps in your schema, you should use Timestamp
>>>>>>> logical types,
>>>>>>> <https://avro.apache.org/docs/1.8.1/spec.html#Timestamp+%28millisecond+precision%29>
>>>>>>> which annotate Avro longs. In this case the schema of this field should
>>>>>>> look like this:
>>>>>>>
>>>>>>> {
>>>>>>>    "name": "timestamp",
>>>>>>>    "type": {
>>>>>>>       "type": "long",
>>>>>>>       "logicalType": "timestamp-millis"
>>>>>>>    }
>>>>>>> }
>>>>>>>
>>>>>>> If you'd like to create Avro files with this schema, the Avro site
>>>>>>> has a brief tutorial
>>>>>>> <https://avro.apache.org/docs/1.8.1/gettingstartedjava.html#Compiling+the+schema>
>>>>>>> on how to create and write Avro files with this schema in Java.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nandor
>>>>>>>
>>>>>>> On Tue, Oct 24, 2017 at 8:18 PM, Mike Thomsen <
>>>>>>> mikerthomsen@gmail.com> wrote:
>>>>>>>
>>>>>>>> I am trying to build an Avro schema for a NiFi flow that is going
>>>>>>>> to insert data into Mongo, and Mongo extended JSON requires the use of $
>>>>>>>> characters in cases like this (to represent a date):
>>>>>>>>
>>>>>>>> {
>>>>>>>>     "timestamp": {
>>>>>>>>         "$date": TIMESTAMP_LONG_HERE
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> I tried building a schema with that, and it failed saying there was
>>>>>>>> an invalid character in the schema. I just wanted to check and see if there
>>>>>>>> was a workaround for this or if I'll have to choose another option.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> busbey
>>>
>>
>>
>>
>> --
>> busbey
>>
>
>

Re: Is it possible to use $ characters in field names?

Posted by Mike Thomsen <mi...@gmail.com>.
The problem is actually with that processor. When I wrote it, I used a
naive approach to reading the records and turning them into Mongo Document
objects.

Now what COULD work is if I could use the "date" logical type to create an
Avro date that could return a java.util.Date object. Mongo's client API
will not have a problem with that.
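
For reference, the Avro "date" logical type annotates an int holding the
number of days since 1970-01-01. A minimal Python sketch of that conversion
follows; the helper names are made up for illustration, and this is not the
NiFi or Avro API, only the arithmetic behind the idea:

```python
from datetime import date, timedelta

# Avro's "date" logical type stores days elapsed since the Unix epoch.
EPOCH_DAY = date(1970, 1, 1)


def from_avro_date(days: int) -> date:
    """Turn an Avro date value (days since epoch) into a calendar date."""
    return EPOCH_DAY + timedelta(days=days)


def to_avro_date(day: date) -> int:
    """Turn a calendar date into the int an Avro date field would hold."""
    return (day - EPOCH_DAY).days


days = to_avro_date(date(2017, 10, 24))
print(days, from_avro_date(days).isoformat())
```

A client library that receives a native date object can then choose its own
wire representation, which is why no "$date" field name is needed in the
schema on this route.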

I'll take this over to nifi-dev to see what others think.

Thanks.

Re: Is it possible to use $ characters in field names?

Posted by Sean Busbey <bu...@cloudera.com>.
Shoot, my attempt to copy in the NiFi user list failed. Mike, if using the
PutMongoRecord processor might work, the folks on that list are more likely
to be able to help with edge cases.

If you need the intermediate JSON for some reason, I think there's a JSON
transforming processor that you could maybe use to rewrite the JSON records
with the right field name?
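
As a rough illustration of that kind of rewrite (plain Python, not a NiFi
processor), the sketch below renames keys in already-parsed JSON. It assumes
the intermediate records use a placeholder key such as "date" that is legal
in Avro; both that placeholder and the function name are hypothetical:

```python
import json


def rename_keys(value, renames):
    """Return a copy of parsed JSON with dict keys renamed at every depth."""
    if isinstance(value, dict):
        return {renames.get(k, k): rename_keys(v, renames) for k, v in value.items()}
    if isinstance(value, list):
        return [rename_keys(item, renames) for item in value]
    return value


# "date" stands in for the field name the Avro schema was able to accept.
intermediate = json.loads('{"timestamp": {"date": 1508869117000}}')
print(json.dumps(rename_keys(intermediate, {"date": "$date"})))
```

Note that this renames every matching key regardless of where it appears, so
the placeholder should be a name that is not used for anything else.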

-- 
busbey

Re: Is it possible to use $ characters in field names?

Posted by Sean Busbey <bu...@cloudera.com>.
+users@nifi.apache.org[1]

Could you keep the data in Avro and then use NiFi's PutMongoRecord
processor[2] with an AvroReader to insert?


[1]: https://lists.apache.org/list.html?users@nifi.apache.org
[2]: https://s.apache.org/MmPG

-- 
busbey

Re: Is it possible to use $ characters in field names?

Posted by Mike Thomsen <mi...@gmail.com>.
No, it doesn't look like it's going to work. It accepts $date into the
record using the alias, but it doesn't generate $date as the field name
when writing the object back to JSON.
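
The behaviour described above can be mimicked with a toy model in Python.
This is not the Avro library; the schema fragment and both functions are
made up purely to show why an alias helps on the way in but not on the way
out:

```python
import json

# Hypothetical field definition: a legal canonical name plus an alias.
FIELD = {"name": "date", "aliases": ["$date"]}


def read_with_aliases(incoming: dict, field: dict) -> dict:
    """Accept the canonical name or any alias, store under the canonical name."""
    accepted = [field["name"], *field["aliases"]]
    for key in accepted:
        if key in incoming:
            return {field["name"]: incoming[key]}
    raise KeyError(f"none of {accepted} present in input")


def write_json(record: dict) -> str:
    """Serialize with the keys held in the record, i.e. the canonical names."""
    return json.dumps(record)


record = read_with_aliases({"$date": 1508869117000}, FIELD)
print(write_json(record))
```

In this model the alias only affects how input is matched to a field, so the
output carries "date" rather than "$date".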

Re: Is it possible to use $ characters in field names?

Posted by Nandor Kollar <nk...@cloudera.com>.
Oh yes, you're right, you are running into the limitation on field names
<https://avro.apache.org/docs/1.8.0/spec.html#names>. Apart from solving
this via a map, you might consider using Avro aliases
<https://avro.apache.org/docs/1.8.2/spec.html#Aliases>, since it looks like
aliases don't have this limitation. Can you use them?
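
The name rule in the linked spec section says a name must start with a
letter or underscore and continue with letters, digits, or underscores. A
minimal Python sketch of that check (the function name is made up; the Avro
libraries do their own validation) shows why "$date" is rejected:

```python
import re

# Name rule from the Avro specification: first character [A-Za-z_],
# remaining characters [A-Za-z0-9_].
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Return True if `name` satisfies the Avro name rule."""
    return AVRO_NAME.fullmatch(name) is not None


for candidate in ("timestamp", "_date", "$date", "9lives"):
    print(candidate, is_valid_avro_name(candidate))
```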

Nandor

Re: Is it possible to use $ characters in field names?

Posted by Mike Thomsen <mi...@gmail.com>.
Hi Nandor,

It's not the numeric portion that is the problem for me, but the $date
field name. Mongo apparently requires the structure I provided in the
example, and whenever I use $date as the field name the Java Avro API
throws an exception about an invalid character in the field definition.

The logical type thing is good to know for future reference.

I admit that this is likely a really uncommon edge case for Avro. The
workaround I found for defining a schema that is at least compatible with the
Mongo Extended JSON requirements was to do this (one field example):

{
    "namespace": "test",
    "name": "PutTestRecord",
    "type": "record",
    "fields": [{
        "name": "timestampField",
        "type": {
            "type": "map",
            "values": "long"
        }
    }]
}

It doesn't give you the full validation that would be ideal if we could
define a field with the name "$date", but it's an 80% solution that works
with NiFi and other tools that have to generate Extended JSON for Mongo.
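
To make the trade-off concrete, here is a small Python sketch of what a
map-of-long field will and will not check. The validator is a hypothetical
stand-in written for this example, not part of Avro or NiFi:

```python
import json

# Avro longs are 64-bit signed integers.
LONG_MIN, LONG_MAX = -(2**63), 2**63 - 1


def is_map_of_long(value) -> bool:
    """Check the shape {"type": "map", "values": "long"} describes."""
    if not isinstance(value, dict):
        return False
    return all(
        isinstance(k, str)
        and isinstance(v, int)
        and not isinstance(v, bool)  # bool is a subclass of int in Python
        and LONG_MIN <= v <= LONG_MAX
        for k, v in value.items()
    )


record = {"timestampField": {"$date": 1508869117000}}
print(is_map_of_long(record["timestampField"]))  # the "$date" key is accepted
print(is_map_of_long({"anything": 1}))           # but so is any other key
print(json.dumps(record))
```

Map keys are arbitrary strings, so "$date" passes, but nothing forces the key
to be "$date"; that is the validation the map approach gives up.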

Thanks,

Mike

Re: Is it possible to use $ characters in field names?

Posted by Nandor Kollar <nk...@cloudera.com>.
Hi Mike,

This JSON doesn't seem like a valid Avro schema
<https://avro.apache.org/docs/1.8.1/spec.html#schemas>. If you'd like to
use timestamps in your schema, you should use Timestamp logical types,
<https://avro.apache.org/docs/1.8.1/spec.html#Timestamp+%28millisecond+precision%29>
which annotate Avro longs. In this case the schema of this field should
look like this:

{
   "name": "timestamp",
   "type": {
      "type": "long",
      "logicalType": "timestamp-millis"
   }
}
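
A timestamp-millis value is a long counting milliseconds since the Unix
epoch (1970-01-01T00:00:00 UTC). As a rough Python sketch of what such a
field holds (the helper names are made up, and this does not use any Avro
library):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_millis(moment: datetime) -> int:
    """Whole milliseconds since the epoch; `moment` must be timezone-aware."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_timestamp_millis(millis: int) -> datetime:
    """Inverse of to_timestamp_millis, returning a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


posted = datetime(2017, 10, 24, 18, 18, 37, tzinfo=timezone.utc)
millis = to_timestamp_millis(posted)
print(millis)
print(from_timestamp_millis(millis).isoformat())
```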

If you'd like to create Avro files with this schema, the Avro site has a
brief tutorial
<https://avro.apache.org/docs/1.8.1/gettingstartedjava.html#Compiling+the+schema>
on how to create and write Avro files with this schema in Java.

Regards,
Nandor
