Posted to dev@gora.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2013/08/25 03:20:15 UTC
GoraCompiler old and new
Hi,
There are some issues when attempting to use the new GoraCompiler within
the GORA_94 codebase.
I refer specifically to the employee.json schema [0], where we have the
following field in particular:
{"name": "boss", "type":["null","Employee","string"]},
It seems that the new compiler is allergic to the definitions within this
field, specifically that a boss can have >=1 Employees. Compiling the
schema results in the following:
@CEE279Law3-Linux:~/Downloads/asf/GORA_94$ ./bin/gora goracompiler gora-core/src/examples/avro/employee.json .
Exception in thread "main" java.lang.StackOverflowError
    at java.util.Arrays.copyOfRange(Arrays.java:2695)
    at java.lang.String.<init>(String.java:203)
    at java.lang.StringBuilder.toString(StringBuilder.java:405)
    at org.apache.avro.Schema$Name.<init>(Schema.java:436)
    at org.apache.avro.Schema.createRecord(Schema.java:144)
    at org.apache.gora.compiler.GoraCompiler.getRecordSchemaWithDirtySupport(GoraCompiler.java:170)
    at org.apache.gora.compiler.GoraCompiler.getSchemaWithDirtySupport(GoraCompiler.java:128)
If we remove the offending field... all is well.
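For readers who want to see how a self-referencing field can blow the stack, here is a minimal sketch. It assumes the compiler rebuilds each record by recursing into its field types, which is what the trace hints at but does not prove. `Rec`, `copyWithDirtyField` and `overflows` are hypothetical names for illustration; none of this is Avro's or Gora's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; Rec is a hypothetical stand-in, not Avro's Schema.
class NaiveSchemaCopy {
    static final class Rec {
        final String name;
        final List<Rec> fieldTypes = new ArrayList<>(); // record types used by fields
        Rec(String name) { this.name = name; }
    }

    // Naive transformation: rebuild the record and recurse into every field type.
    // A record that refers to itself is revisited forever.
    static Rec copyWithDirtyField(Rec original) {
        Rec copy = new Rec(original.name);
        for (Rec fieldType : original.fieldTypes) {
            copy.fieldTypes.add(copyWithDirtyField(fieldType));
        }
        return copy;
    }

    // Reports whether the naive copy runs out of stack for the given root.
    static boolean overflows(Rec root) {
        try {
            copyWithDirtyField(root);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Rec employee = new Rec("Employee");
        employee.fieldTypes.add(employee); // "boss" may itself be an Employee
        System.out.println("overflow: " + overflows(employee)); // prints overflow: true
    }
}
```

Removing the self-reference from the toy record makes the copy terminate, which matches the observation that dropping the field makes the problem go away.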
Any ideas here?
If we can get this fixed, we can compile the schema and stabilize this
branch.
Thanks
Lewis
[0] http://svn.apache.org/repos/asf/gora/branches/GORA_94/gora-core/src/examples/avro/employee.json
--
*Lewis*
Re: GoraCompiler old and new
Posted by Renato Marroquín Mogrovejo <re...@gmail.com>.
Hi Tejas,
This is great! But could you please provide a patch we can apply and share
with the community?
The link you provided is for Avro SpecificCompiler mate ;) was that
intended?
Thanks Tejas!
Renato M.
2013/8/25 Tejas Patil <te...@gmail.com>
> Hi Lewis,
>
> I think the problem here was having recursive reference to same schema
> (ie. for defining Employee, we need Employee schema). After a closer look
> at how it is handled in Avro [0], I feel that using a queue to tackle this
> recursive-ness would help. Adding the same to GoraCompiler seems to solve
> the same:
> mvn test
> .........
> .........
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Apache Gora ....................................... SUCCESS [0.575s]
> [INFO] Apache Gora :: Compiler ........................... SUCCESS [1.004s]
> [INFO] Apache Gora :: Compiler-CLI ....................... SUCCESS [0.271s]
> [INFO] Apache Gora :: Core ............................... SUCCESS [11.520s]
> [INFO] Apache Gora :: Tutorial ........................... SUCCESS [0.509s]
> [INFO] Apache Gora :: Sources-Dist ....................... SUCCESS [0.052s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
>
> Attached the modified GoraCompiler.java.
>
> [0]
> https://github.com/apache/avro/blob/trunk/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SpecificCompiler.java
>
Re: GoraCompiler old and new
Posted by Tejas Patil <te...@gmail.com>.
Hi Lewis,
I think the problem here is the recursive reference to the same schema
(i.e., to define Employee, we need the Employee schema). After a closer look
at how this is handled in Avro [0], I feel that using a queue to tackle the
recursion would help. Adding the same to GoraCompiler seems to solve the
problem:
mvn test
.........
.........
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Gora ....................................... SUCCESS [0.575s]
[INFO] Apache Gora :: Compiler ........................... SUCCESS [1.004s]
[INFO] Apache Gora :: Compiler-CLI ....................... SUCCESS [0.271s]
[INFO] Apache Gora :: Core ............................... SUCCESS [11.520s]
[INFO] Apache Gora :: Tutorial ........................... SUCCESS [0.509s]
[INFO] Apache Gora :: Sources-Dist ....................... SUCCESS [0.052s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
Attached the modified GoraCompiler.java.
[0] https://github.com/apache/avro/blob/trunk/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SpecificCompiler.java
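The attached GoraCompiler.java is not reproduced in this thread, so the following is only a rough sketch of the queue idea on a toy model. It assumes records are identified by name and that a "seen" set prevents re-queuing. `Rec`, `collectRecordNames` and the `Address` record are hypothetical; this is not the patch and not Avro's SpecificCompiler code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Toy model for illustration only; Rec is a hypothetical stand-in, not Avro's Schema.
class QueueTraversal {
    static final class Rec {
        final String name;
        final List<Rec> fieldTypes = new ArrayList<>(); // record types used by fields
        Rec(String name) { this.name = name; }
    }

    // Visit every record reachable from root exactly once, without recursion.
    static List<String> collectRecordNames(Rec root) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Queue<Rec> queue = new ArrayDeque<>();
        seen.add(root.name);
        queue.add(root);
        while (!queue.isEmpty()) {
            Rec current = queue.remove();
            visited.add(current.name);
            for (Rec fieldType : current.fieldTypes) {
                // Set.add returns false for a name already queued, which breaks the cycle.
                if (seen.add(fieldType.name)) {
                    queue.add(fieldType);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Rec employee = new Rec("Employee");
        Rec address = new Rec("Address");  // hypothetical second record
        employee.fieldTypes.add(employee); // "boss" may itself be an Employee
        employee.fieldTypes.add(address);
        System.out.println(collectRecordNames(employee)); // prints [Employee, Address]
    }
}
```

Because each record name enters the queue at most once, the loop terminates even when Employee refers back to itself.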
Re: GoraCompiler old and new
Posted by Scott Stults <ss...@opensourceconnections.com>.
Oh! Now I get it. Thanks for being patient!
-Scott
On Aug 26, 2013, at 11:43 PM, Tejas Patil <te...@gmail.com> wrote:
> Hey Scott,
> Let me put it this way:
>
> Employee {
> name:..
> ......
> boss: [null / Employee / string]
> }
>
> Each record represents a single employee along with all his/her details.
> Field "boss" points to the boss of the current employee. There is at max
> one boss for an employee.
>
> Thanks,
> Tejas
>
>
> On Mon, Aug 26, 2013 at 6:53 PM, Scott Stults <
> sstults@opensourceconnections.com> wrote:
>
>> Right, but aren't we looking for multiple Employees there? The way I
>> interpret the original schema, a boss may only have a single Employee.
>> Maybe I'm just not understanding the data model.
>>
>> -Scott
>>
>>
>> On Aug 26, 2013, at 7:45 PM, Henry Saputra <he...@gmail.com>
>> wrote:
>>
>>> I think the schema is a valid Union, so it allows null, or Employee, or
>>> string for boss.
>>>
>>>
>>> On Sun, Aug 25, 2013 at 8:34 AM, Scott Stults <
>>> sstults@opensourceconnections.com> wrote:
>>>
>>>> Lewis,
>>>>
>>>> In your schema, should the type for boss be a union of null and an array
>>>> of Employee?
>>>>
>>>>>
>>>>> {"name": "boss", "type":["null","Employee","string"]},
>>>>
>>>>
>>>> -Scott
>>
>>
Re: GoraCompiler old and new
Posted by Tejas Patil <te...@gmail.com>.
Hey Scott,
Let me put it this way:
Employee {
name:..
......
boss: [null / Employee / string]
}
Each record represents a single employee along with all his/her details.
Field "boss" points to the boss of the current employee. There is at most
one boss per employee.
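In plain Java terms, the union could be pictured roughly like this. It is a hand-written toy, not the class the compiler generates; `EmployeeModel`, `describeBoss` and the sample names are made up for illustration.

```java
// Toy model for illustration only; this is not the class GoraCompiler generates.
class EmployeeModel {
    static final class Employee {
        final String name;
        final Object boss; // union: null, an Employee, or a String

        Employee(String name, Object boss) {
            if (boss != null && !(boss instanceof Employee) && !(boss instanceof String)) {
                throw new IllegalArgumentException("boss must be null, an Employee, or a String");
            }
            this.name = name;
            this.boss = boss;
        }
    }

    // Each employee has at most one boss, held in one of the three union branches.
    static String describeBoss(Employee e) {
        if (e.boss == null) {
            return "no boss";
        }
        if (e.boss instanceof Employee) {
            return "boss record: " + ((Employee) e.boss).name;
        }
        return "boss name: " + e.boss;
    }

    public static void main(String[] args) {
        Employee alice = new Employee("Alice", null);
        Employee bob = new Employee("Bob", alice);
        Employee carol = new Employee("Carol", "Dave");
        System.out.println(describeBoss(alice)); // prints no boss
        System.out.println(describeBoss(bob));   // prints boss record: Alice
        System.out.println(describeBoss(carol)); // prints boss name: Dave
    }
}
```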
Thanks,
Tejas
On Mon, Aug 26, 2013 at 6:53 PM, Scott Stults <
sstults@opensourceconnections.com> wrote:
> Right, but aren't we looking for multiple Employees there? The way I
> interpret the original schema, a boss may only have a single Employee.
> Maybe I'm just not understanding the data model.
>
> -Scott
>
>
> On Aug 26, 2013, at 7:45 PM, Henry Saputra <he...@gmail.com>
> wrote:
>
> > I think the schema is a valid Union, so it allows null, or Employee, or
> > string for boss.
> >
> >
> > On Sun, Aug 25, 2013 at 8:34 AM, Scott Stults <
> > sstults@opensourceconnections.com> wrote:
> >
> >> Lewis,
> >>
> >> In your schema, should the type for boss be a union of null and an array
> >> of Employee?
> >>
> >>>
> >>> {"name": "boss", "type":["null","Employee","string"]},
> >>
> >>
> >> -Scott
>
>
Re: GoraCompiler old and new
Posted by Scott Stults <ss...@opensourceconnections.com>.
Right, but aren't we looking for multiple Employees there? The way I interpret the original schema, a boss may only have a single Employee. Maybe I'm just not understanding the data model.
-Scott
On Aug 26, 2013, at 7:45 PM, Henry Saputra <he...@gmail.com> wrote:
> I think the schema is a valid Union, so it allows null, or Employee, or
> string for boss.
>
>
> On Sun, Aug 25, 2013 at 8:34 AM, Scott Stults <
> sstults@opensourceconnections.com> wrote:
>
>> Lewis,
>>
>> In your schema, should the type for boss be a union of null and an array
>> of Employee?
>>
>>>
>>> {"name": "boss", "type":["null","Employee","string"]},
>>
>>
>> -Scott
Re: GoraCompiler old and new
Posted by Henry Saputra <he...@gmail.com>.
I think the schema is a valid Union, so it allows null, or Employee, or
string for boss.
On Sun, Aug 25, 2013 at 8:34 AM, Scott Stults <
sstults@opensourceconnections.com> wrote:
> Lewis,
>
> In your schema, should the type for boss be a union of null and an array
> of Employee?
>
> >
> > {"name": "boss", "type":["null","Employee","string"]},
>
>
> -Scott
Re: GoraCompiler old and new
Posted by Tejas Patil <te...@gmail.com>.
*@Lewis:*
Now that I have thought about it more, it looks like my approach would
cause an issue: I add the dirty field to each original schema object after
queuing the original schemas. As a result, nested appearances would refer
to the original schema rather than to the modified schema with the "dirty"
field. That is certainly something we don't want.
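One possible way around this, offered as a sketch and not as what the attached file does, is to register each modified copy under its record name before descending into its fields, so that any nested reference resolves to the copy that carries the dirty field. `Rec`, `withDirty` and `hasDirtyField` are hypothetical names on a toy model, not Avro or Gora APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; Rec is a hypothetical stand-in, not Avro's Schema.
class DirtyAwareCopy {
    static final class Rec {
        final String name;
        boolean hasDirtyField;                          // stands in for the extra "dirty" field
        final List<Rec> fieldTypes = new ArrayList<>(); // record types used by fields
        Rec(String name) { this.name = name; }
    }

    // Returns the modified copy of a record, creating it at most once per name.
    static Rec withDirty(Rec original, Map<String, Rec> done) {
        Rec existing = done.get(original.name);
        if (existing != null) {
            return existing; // nested reference: reuse the modified copy
        }
        Rec copy = new Rec(original.name);
        copy.hasDirtyField = true;
        done.put(original.name, copy); // register BEFORE recursing into fields
        for (Rec fieldType : original.fieldTypes) {
            copy.fieldTypes.add(withDirty(fieldType, done));
        }
        return copy;
    }

    public static void main(String[] args) {
        Rec employee = new Rec("Employee");
        employee.fieldTypes.add(employee); // "boss" may itself be an Employee
        Rec modified = withDirty(employee, new HashMap<String, Rec>());
        System.out.println(modified.fieldTypes.get(0) == modified);   // prints true
        System.out.println(modified.fieldTypes.get(0).hasDirtyField); // prints true
    }
}
```

The recursion here is bounded by the number of distinct record names, since a name that is already in the map is never descended into again.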
*@Renato:*
> The link you provided is for Avro SpecificCompiler mate ;) was that intended?
Yup. GoraCompiler extends Avro's compiler, so it was worth looking at that
code, since it would have had the same problem!
*@Scott*
The schema seems correct to me. Below is an example from Avro website [0]:
For example, a linked-list of 64-bit values may be defined with:
{
  "type": "record",
  "name": "LongList",
  "aliases": ["LinkedLongs"],                      // old name for this
  "fields" : [
    {"name": "value", "type": "long"},             // each element has a long
    {"name": "next", "type": ["LongList", "null"]} // optional next element
  ]
}
Here LongList refers to itself, which is pretty similar to what Lewis'
example [1] does:
  "type": "record",
  "name": "Employee",
  "fields" : [
    ...........
    ...........
    {"name": "boss", "type":["null","Employee","string"]},
[0] : http://avro.apache.org/docs/current/spec.html#schema_record
[1] : http://svn.apache.org/repos/asf/gora/branches/GORA_94/gora-core/src/examples/avro/employee.json
Thanks,
Tejas
On Sun, Aug 25, 2013 at 8:34 AM, Scott Stults <
sstults@opensourceconnections.com> wrote:
> Lewis,
>
> In your schema, should the type for boss be a union of null and an array
> of Employee?
>
> >
> > {"name": "boss", "type":["null","Employee","string"]},
>
>
> -Scott
Re: GoraCompiler old and new
Posted by Scott Stults <ss...@opensourceconnections.com>.
Lewis,
In your schema, should the type for boss be a union of null and an array of Employee?
>
> {"name": "boss", "type":["null","Employee","string"]},
-Scott