Posted to dev@gora.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2013/08/25 03:20:15 UTC

GoraCompiler old and new

Hi,
There are some issues when attempting to use the new GoraCompiler within
the GORA_94 codebase.
I refer specifically to the employee.json schema [0], where we have the
following field in particular

      {"name": "boss", "type":["null","Employee","string"]},

It seems that the new compiler is allergic to the definition of this
field. Specifically, a boss can have >=1 Employees; however, compiling the
schema results in the following:

@CEE279Law3-Linux:~/Downloads/asf/GORA_94$ ./bin/gora goracompiler gora-core/src/examples/avro/employee.json .
Exception in thread "main" java.lang.StackOverflowError
    at java.util.Arrays.copyOfRange(Arrays.java:2695)
    at java.lang.String.<init>(String.java:203)
    at java.lang.StringBuilder.toString(StringBuilder.java:405)
    at org.apache.avro.Schema$Name.<init>(Schema.java:436)
    at org.apache.avro.Schema.createRecord(Schema.java:144)
    at org.apache.gora.compiler.GoraCompiler.getRecordSchemaWithDirtySupport(GoraCompiler.java:170)
    at org.apache.gora.compiler.GoraCompiler.getSchemaWithDirtySupport(GoraCompiler.java:128)

If we remove the offending field... all is well.

Any ideas here?
If we can get this fixed, we can compile the schema and stabilize this
branch.
Thanks
Lewis

[0]
http://svn.apache.org/repos/asf/gora/branches/GORA_94/gora-core/src/examples/avro/employee.json



-- 
*Lewis*

Re: GoraCompiler old and new

Posted by Renato Marroquín Mogrovejo <re...@gmail.com>.
Hi Tejas,

This is great! But could you please provide a patch we can apply and share
with the community?
The link you provided is for Avro SpecificCompiler mate ;) was that
intended?
Thanks Tejas!


Renato M.


2013/8/25 Tejas Patil <te...@gmail.com>

> Hi Lewis,
>
> I think the problem here was a recursive reference to the same schema
> (i.e. to define Employee, we need the Employee schema). After a closer look
> at how this is handled in Avro [0], I feel that using a queue to tackle the
> recursion would help. Adding the same to GoraCompiler seems to solve
> the problem:
> mvn test
> .........
> .........
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Apache Gora ....................................... SUCCESS [0.575s]
> [INFO] Apache Gora :: Compiler ........................... SUCCESS [1.004s]
> [INFO] Apache Gora :: Compiler-CLI ....................... SUCCESS [0.271s]
> [INFO] Apache Gora :: Core ............................... SUCCESS [11.520s]
> [INFO] Apache Gora :: Tutorial ........................... SUCCESS [0.509s]
> [INFO] Apache Gora :: Sources-Dist ....................... SUCCESS [0.052s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
>
> Attached the modified GoraCompiler.java.
>
> [0]
> https://github.com/apache/avro/blob/trunk/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SpecificCompiler.java
>
>
>

Re: GoraCompiler old and new

Posted by Tejas Patil <te...@gmail.com>.
Hi Lewis,

I think the problem here was a recursive reference to the same schema (i.e.
to define Employee, we need the Employee schema). After a closer look at how
this is handled in Avro [0], I feel that using a queue to tackle the
recursion would help. Adding the same to GoraCompiler seems to solve
the problem:
mvn test
.........
.........
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Gora ....................................... SUCCESS [0.575s]
[INFO] Apache Gora :: Compiler ........................... SUCCESS [1.004s]
[INFO] Apache Gora :: Compiler-CLI ....................... SUCCESS [0.271s]
[INFO] Apache Gora :: Core ............................... SUCCESS [11.520s]
[INFO] Apache Gora :: Tutorial ........................... SUCCESS [0.509s]
[INFO] Apache Gora :: Sources-Dist ....................... SUCCESS [0.052s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

Attached the modified GoraCompiler.java.

[0]
https://github.com/apache/avro/blob/trunk/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SpecificCompiler.java
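
The queue idea described above can be sketched roughly as follows. This is a toy model only, not the attached GoraCompiler.java and not Avro's SpecificCompiler: Node, enqueue and recordNames are hypothetical names invented for illustration, and Node is a stand-in for a real schema object. The point it tries to show is that a record already placed in the queue is never descended into again, so a self-referencing field such as "boss" cannot recurse forever.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: Node is a hypothetical stand-in, not org.apache.avro.Schema.
class SchemaQueueSketch {
    static final class Node {
        final String name;
        final boolean isRecord;
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean isRecord) {
            this.name = name;
            this.isRecord = isRecord;
        }
    }

    // Builds Employee { boss: [null, Employee, string] }, a record that refers to itself.
    static Node employee() {
        Node employee = new Node("Employee", true);
        Node bossUnion = new Node("union", false);
        bossUnion.children.add(new Node("null", false));
        bossUnion.children.add(employee); // the recursive reference
        bossUnion.children.add(new Node("string", false));
        employee.children.add(bossUnion);
        return employee;
    }

    // Collects each record once; a record already in the queue is not visited again.
    static void enqueue(Node node, Set<Node> queue) {
        if (queue.contains(node)) {
            return; // already scheduled, so the cycle stops here
        }
        if (node.isRecord) {
            queue.add(node);
        }
        for (Node child : node.children) {
            enqueue(child, queue);
        }
    }

    // Returns the names of all records reachable from root, in discovery order.
    static List<String> recordNames(Node root) {
        Set<Node> queue = new LinkedHashSet<>();
        enqueue(root, queue);
        List<String> names = new ArrayList<>();
        for (Node n : queue) {
            names.add(n.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(recordNames(employee())); // prints [Employee]
    }
}
```

Without the contains() check, enqueue would bounce between Employee and its boss union until the stack overflowed, which is the same shape of failure as the StackOverflowError in the original report.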




Re: GoraCompiler old and new

Posted by Scott Stults <ss...@opensourceconnections.com>.
Oh! Now I get it. Thanks for being patient!

-Scott

On Aug 26, 2013, at 11:43 PM, Tejas Patil <te...@gmail.com> wrote:

> Hey Scott,
> Let me put it this way:
> 
> Employee {
>  name:..
>  ......
>  boss: [null / Employee / string]
> }
> 
> Each record represents a single employee along with all his/her details.
> The field "boss" points to the boss of the current employee. There is at
> most one boss for an employee.
> 
> Thanks,
> Tejas
> 
> 


Re: GoraCompiler old and new

Posted by Tejas Patil <te...@gmail.com>.
Hey Scott,
Let me put it this way:

Employee {
  name:..
  ......
  boss: [null / Employee / string]
}

Each record represents a single employee along with all his/her details.
The field "boss" points to the boss of the current employee. There is at
most one boss for an employee.
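
As a rough illustration of that data model (a hypothetical sketch, not code generated by GoraCompiler; the class, method and employee names are all invented here), the union can be pictured as a field that holds null, another Employee, or a String. Each employee names at most one boss, while any number of employees may point at the same boss:

```java
// Hypothetical illustration of the data model; not generated Gora code.
class EmployeeModelSketch {
    static final class Employee {
        final String name;
        final Object boss; // null, an Employee, or a String, mirroring the union

        Employee(String name, Object boss) {
            if (boss != null && !(boss instanceof Employee) && !(boss instanceof String)) {
                throw new IllegalArgumentException("boss must be null, Employee, or String");
            }
            this.name = name;
            this.boss = boss;
        }
    }

    // Counts how many Employee records sit above the given one.
    static int chainLength(Employee e) {
        int n = 0;
        Object boss = e.boss;
        while (boss instanceof Employee) {
            n++;
            boss = ((Employee) boss).boss;
        }
        return n;
    }

    public static void main(String[] args) {
        Employee ceo = new Employee("ceo", null);            // no boss
        Employee manager = new Employee("manager", ceo);     // boss is a record
        Employee dev1 = new Employee("dev1", manager);       // dev1 and dev2 share
        Employee dev2 = new Employee("dev2", manager);       // the same boss
        Employee contractor = new Employee("contractor", "external-agency"); // boss as a string
        System.out.println(chainLength(dev1));       // prints 2
        System.out.println(chainLength(dev2));       // prints 2
        System.out.println(chainLength(contractor)); // prints 0
    }
}
```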

Thanks,
Tejas


On Mon, Aug 26, 2013 at 6:53 PM, Scott Stults <
sstults@opensourceconnections.com> wrote:

> Right, but aren't we looking for multiple Employees there? The way I
> interpret the original schema, a boss may only have a single Employee.
> Maybe I'm just not understanding the data model.
>
> -Scott
>
>

Re: GoraCompiler old and new

Posted by Scott Stults <ss...@opensourceconnections.com>.
Right, but aren't we looking for multiple Employees there? The way I interpret the original schema, a boss may only have a single Employee. Maybe I'm just not understanding the data model.

-Scott


On Aug 26, 2013, at 7:45 PM, Henry Saputra <he...@gmail.com> wrote:

> I think the schema is a valid Union, so it allows null, or Employee, or
> string for boss.
> 
> 


Re: GoraCompiler old and new

Posted by Henry Saputra <he...@gmail.com>.
I think the schema is a valid Union, so it allows null, or Employee, or
string for boss.


On Sun, Aug 25, 2013 at 8:34 AM, Scott Stults <
sstults@opensourceconnections.com> wrote:

> Lewis,
>
> In your schema, should the type for boss be a union of null and an array
> of Employee?
>
> >
> >      {"name": "boss", "type":["null","Employee","string"]},
>
>
> -Scott

Re: GoraCompiler old and new

Posted by Tejas Patil <te...@gmail.com>.
*@Lewis:*
Now that I have thought about it more, it looks like my approach would
cause an issue: I add the dirty field to each original schema object after
queuing the original schemas. As a result, the nested appearances would
refer not to the modified schema with "dirty" but to the original schema,
which is certainly not what we want.
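
One possible way around this, sketched here with a toy model, is to register the rewritten record in a lookup map before descending into its fields, so that nested references resolve to the rewritten copy rather than to the original. This is only an assumption about how it could be done, not what GoraCompiler does: Type, Field and rewrite are hypothetical names, and the "dirty" field name and its "bytes" type are placeholders.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these classes are hypothetical stand-ins, not Avro or Gora types.
class DirtyRewriteSketch {
    static final class Type {
        final String name;
        final boolean isRecord;
        final List<Field> fields = new ArrayList<>();  // used by records
        final List<Type> branches = new ArrayList<>(); // used by unions

        Type(String name, boolean isRecord) {
            this.name = name;
            this.isRecord = isRecord;
        }
    }

    static final class Field {
        final String name;
        final Type type;

        Field(String name, Type type) {
            this.name = name;
            this.type = type;
        }
    }

    // Builds Employee { name: string, boss: [null, Employee, string] }.
    static Type employee() {
        Type employee = new Type("Employee", true);
        Type bossUnion = new Type("union", false);
        bossUnion.branches.add(new Type("null", false));
        bossUnion.branches.add(employee); // the recursive reference
        bossUnion.branches.add(new Type("string", false));
        employee.fields.add(new Field("name", new Type("string", false)));
        employee.fields.add(new Field("boss", bossUnion));
        return employee;
    }

    // Copies a type, adding a placeholder "dirty" field to every record.
    static Type rewrite(Type original, Map<Type, Type> done) {
        Type existing = done.get(original);
        if (existing != null) {
            return existing; // nested reference: reuse the rewritten copy
        }
        Type copy = new Type(original.name, original.isRecord);
        done.put(original, copy); // register BEFORE descending into children
        for (Type branch : original.branches) {
            copy.branches.add(rewrite(branch, done));
        }
        if (original.isRecord) {
            copy.fields.add(new Field("dirty", new Type("bytes", false)));
            for (Field f : original.fields) {
                copy.fields.add(new Field(f.name, rewrite(f.type, done)));
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        Type rewritten = rewrite(employee(), new IdentityHashMap<>());
        Type nested = rewritten.fields.get(2).type.branches.get(1); // boss -> Employee branch
        System.out.println(nested == rewritten); // prints true
    }
}
```

Because the copy is registered before its fields are processed, the recursion terminates and the Employee branch inside "boss" is the same object that carries the "dirty" field.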

*@Renato:*
> The link you provided is for Avro SpecificCompiler mate ;) was that
> intended?
Yup. GoraCompiler extends this Avro compiler, so it was worth looking at
that code, since it would have had the same problem.

*@Scott*
The schema seems correct to me. Below is an example from the Avro website [0]:
For example, a linked-list of 64-bit values may be defined with:

{
  "type": "record",
  "name": "LongList",
  "aliases": ["LinkedLongs"],                      // old name for this
  "fields" : [
    {"name": "value", "type": "long"},             // each element has a long
    {"name": "next", "type": ["LongList", "null"]} // optional next element
  ]
}

Here LongList refers to itself, which is pretty similar to what Lewis'
example [1] does:

    "type": "record",
    "name": "Employee",
    "fields" : [
      ...........
      ...........
      {"name": "boss", "type":["null","Employee","string"]},


[0] : http://avro.apache.org/docs/current/spec.html#schema_record
[1] :
http://svn.apache.org/repos/asf/gora/branches/GORA_94/gora-core/src/examples/avro/employee.json

Thanks,
Tejas

On Sun, Aug 25, 2013 at 8:34 AM, Scott Stults <
sstults@opensourceconnections.com> wrote:

> Lewis,
>
> In your schema, should the type for boss be a union of null and an array
> of Employee?
>
> >
> >      {"name": "boss", "type":["null","Employee","string"]},
>
>
> -Scott

Re: GoraCompiler old and new

Posted by Scott Stults <ss...@opensourceconnections.com>.
Lewis,

In your schema, should the type for boss be a union of null and an array of Employee?

> 
>      {"name": "boss", "type":["null","Employee","string"]},


-Scott