Posted to dev@avro.apache.org by "Richard Ahrens (JIRA)" <ji...@apache.org> on 2010/10/27 21:29:19 UTC

[jira] Created: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
-------------------------------------------------------------------------------------------------

                 Key: AVRO-685
                 URL: https://issues.apache.org/jira/browse/AVRO-685
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.4.1
         Environment: all
            Reporter: Richard Ahrens
            Priority: Critical


I am creating a protocol in memory by building up Schema objects, then writing the avpr file to disk and running SpecificCompiler against it to generate Java sources.  My protocol file causes SpecificCompiler to hang.  Running in the debugger, I can see a long stack trace emanating from SpecificCompiler.enqueue() (see debugger stack trace at end of this text).  What appears to be happening is that Schema.RecordSchema.hashCode() is removing itself from the SEEN_HASHCODE map prematurely; schemas with circular references in multiple fields are added to and removed from SEEN_HASHCODE, causing the code to bounce around between fields without ever unwinding to the root object.

I'm unable to include attachments to this Jira issue, but here's a patch that fixes the problem.  If this patch is accepted, I'd like to request an incremental release as this is a showstopper for us.

Index: src/java/org/apache/avro/Schema.java
===================================================================
--- src/java/org/apache/avro/Schema.java	(revision 1028064)
+++ src/java/org/apache/avro/Schema.java	(working copy)
@@ -587,21 +587,27 @@
       Set seen = SEEN_EQUALS.get();
       SeenPair here = new SeenPair(this, o);
       if (seen.contains(here)) return true;       // prevent stack overflow
+      boolean first = seen.isEmpty();
       try {
         seen.add(here);
         return fields.equals(((RecordSchema)o).fields);
       } finally {
-        seen.remove(here);
+          if(first) {
+              seen.clear();
+          }
       }
     }
     public int hashCode() {
       Map seen = SEEN_HASHCODE.get();
       if (seen.containsKey(this)) return 0;       // prevent stack overflow
+      boolean first = seen.isEmpty();
       try {
         seen.put(this, this);
         return super.hashCode() + fields.hashCode();
       } finally {
-        seen.remove(this);
+          if(first) {
+              seen.clear();
+          }
       }
     }
     void toJson(Names names, JsonGenerator gen) throws IOException {



I can also provide a sample avpr file to reproduce the issue; please contact me directly.
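
For readers who want to try the idea in the patch outside of Avro, here is a minimal sketch of the same guard. It is not Avro's code: RecNode and RecursionGuardDemo are hypothetical names, and the per-thread map is only assumed to behave like SEEN_HASHCODE. The point it shows is that the outermost hashCode() call is the only one that resets the map, so a record visited once stays marked for the rest of that traversal.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a record schema; not Avro's RecordSchema.
final class RecNode {
  // Per-thread identity map of records visited during the current traversal.
  private static final ThreadLocal<Map<RecNode, RecNode>> SEEN =
      ThreadLocal.withInitial(IdentityHashMap::new);

  final String name;
  final List<RecNode> fields = new ArrayList<>();

  RecNode(String name) { this.name = name; }

  @Override
  public int hashCode() {
    Map<RecNode, RecNode> seen = SEEN.get();
    if (seen.containsKey(this)) return 0;   // already visited in this traversal
    boolean first = seen.isEmpty();         // true only for the outermost call
    try {
      seen.put(this, this);
      return name.hashCode() + fields.hashCode();
    } finally {
      if (first) seen.clear();              // reset once, when the traversal ends
    }
  }
}

final class RecursionGuardDemo {
  // Builds two records that refer to each other through two fields each.
  static int hashOfCyclicGraph() {
    RecNode a = new RecNode("A");
    RecNode b = new RecNode("B");
    a.fields.add(b);
    a.fields.add(b);
    b.fields.add(a);   // cycle back to the root
    b.fields.add(a);
    return a.hashCode();
  }

  public static void main(String[] args) {
    System.out.println(hashOfCyclicGraph());
  }
}
```

One consequence of this design is that a nested call returns 0 for any record already on the map, so the value a record contributes depends on where the traversal started. Calls made from the outside on the same graph still agree with each other.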

Debugger stack trace referenced above:
org.apache.avro.specific.SpecificCompiler at localhost:3273	
	Thread [main] (Suspended)	
		System.identityHashCode(Object) line: not available [native method]	
		IdentityHashMap<K,V>.hash(Object, int) line: 284	
		IdentityHashMap<K,V>.put(K, V) line: 412	
		Schema$RecordSchema.hashCode() line: 601	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$ArraySchema.hashCode() line: 703	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$UnionSchema.hashCode() line: 781	
		Schema$ArraySchema.hashCode() line: 703	
		Schema$Field.hashCode() line: 421	
		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
		Schema$RecordSchema.hashCode() line: 602	
		HashMap<K,V>.getEntry(Object) line: 344	
		HashMap<K,V>.containsKey(Object) line: 335	
		HashSet<E>.contains(Object) line: 184	
		SpecificCompiler.enqueue(Schema) line: 134	
		SpecificCompiler.<init>(Protocol) line: 70	
		SpecificCompiler.compileProtocol(File, File) line: 114	
		SpecificCompiler.main(String[]) line: 399	


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925596#action_12925596 ] 

Doug Cutting commented on AVRO-685:
-----------------------------------

My current theory is that this is not actually a loop but an exponential explosion.
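
The theory above can be illustrated with a small counting experiment. This is a sketch under assumptions, not a measurement of Avro: ExplosionDemo, Rec and the graph shape (a chain in which every record refers to the next one through two fields, with the last record pointing back to the root) are all hypothetical, and the reporter's test.avpr is not reproduced here. It compares a guard that removes each record on exit with one that keeps records marked until the outermost call finishes.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class ExplosionDemo {
  // Hypothetical record with references to other records.
  static final class Rec {
    final List<Rec> fields = new ArrayList<>();
  }

  static long calls;

  // Guard that forgets a record as soon as its own call returns.
  static void visitRemoving(Rec r, Map<Rec, Rec> seen) {
    calls++;
    if (seen.containsKey(r)) return;
    seen.put(r, r);
    try {
      for (Rec f : r.fields) visitRemoving(f, seen);
    } finally {
      seen.remove(r);
    }
  }

  // Guard that keeps every visited record until the whole traversal is done.
  static void visitKeeping(Rec r, Map<Rec, Rec> seen) {
    calls++;
    if (seen.containsKey(r)) return;
    seen.put(r, r);
    for (Rec f : r.fields) visitKeeping(f, seen);
  }

  // Chain of `depth` records; each refers to the next via two fields,
  // and the last one refers back to the root.
  static Rec chain(int depth) {
    Rec root = new Rec();
    Rec cur = root;
    for (int i = 1; i < depth; i++) {
      Rec next = new Rec();
      cur.fields.add(next);
      cur.fields.add(next);
      cur = next;
    }
    cur.fields.add(root);
    return root;
  }

  static long countRemoving(int depth) {
    calls = 0;
    visitRemoving(chain(depth), new IdentityHashMap<>());
    return calls;
  }

  static long countKeeping(int depth) {
    calls = 0;
    visitKeeping(chain(depth), new IdentityHashMap<>());
    return calls;
  }

  public static void main(String[] args) {
    for (int d = 5; d <= 20; d += 5) {
      System.out.println(d + ": removing=" + countRemoving(d)
          + " keeping=" + countKeeping(d));
    }
  }
}
```

On this graph the removing guard always terminates, but the number of calls roughly doubles with each extra record in the chain (3 * 2^(d-1) - 1 calls for depth d), while the keeping guard needs only 2d calls. A run that doubles in cost per level would look like a hang well before it looks like a loop.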

> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Priority: Critical
>         Attachments: Schema.patch, test.avpr



[jira] Resolved: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting resolved AVRO-685.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 1.5.0
         Assignee: Richard Ahrens
     Hadoop Flags: [Reviewed]

I just committed this.  Thanks, Richard!

> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Assignee: Richard Ahrens
>            Priority: Critical
>             Fix For: 1.5.0
>
>         Attachments: AVRO-685.patch, Schema.patch, test.avpr



[jira] Updated: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Richard Ahrens (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ahrens updated AVRO-685:
--------------------------------

    Description: 
I am creating a protocol in memory by building up Schema objects, then writing the avpr file to disk and running SpecificCompiler against it to generate Java sources.  My protocol file causes SpecificCompiler to hang.  Running in the debugger, I can see a long stack trace emanating from SpecificCompiler.enqueue() (see debugger stack trace at end of this text).  What appears to be happening is that Schema.RecordSchema.hashCode() is removing itself from the SEEN_HASHCODE map prematurely; schemas with circular references in multiple fields are added to and removed from SEEN_HASHCODE, causing the code to bounce around between fields without ever unwinding to the root object.

Attached is a patch that fixes the problem.  If this patch is accepted, I'd like to request an incremental release as this is a showstopper for us.  I've also attached a sample avpr file that reproduces the issue.



> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Priority: Critical



[jira] Updated: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Richard Ahrens (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ahrens updated AVRO-685:
--------------------------------

    Attachment: Schema.patch

> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Priority: Critical
>         Attachments: Schema.patch, test.avpr
>
>
> I am creating a protocol in memory by building up Schema objects, then writing the avpr file to disk and running SpecificCompiler against it to generate Java sources.  My protocol file causes SpecificCompiler to hang.  Running in the debugger, I can see a long stack trace emanating from SpecificCompiler.enqueue() (see debugger stack trace at end of this text).  What appears to be happening is that Schema.RecordSchema.hashCode() is removing itself from the SEEN_HASHCODE map prematurely; schemas with circular references in multiple fields are repeatedly added to and removed from SEEN_HASHCODE, causing the code to bounce around between fields without ever unwinding to the root object.
> Attached is a patch that fixes the problem.  If this patch is accepted, I'd like to request an incremental release as this is a showstopper for us.  I've also attached a sample avpr file that reproduces the issue.
> Debugger stack trace referenced above:
> org.apache.avro.specific.SpecificCompiler at localhost:3273	
> 	Thread [main] (Suspended)	
> 		System.identityHashCode(Object) line: not available [native method]	
> 		IdentityHashMap<K,V>.hash(Object, int) line: 284	
> 		IdentityHashMap<K,V>.put(K, V) line: 412	
> 		Schema$RecordSchema.hashCode() line: 601	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$ArraySchema.hashCode() line: 703	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$UnionSchema.hashCode() line: 781	
> 		Schema$ArraySchema.hashCode() line: 703	
> 		Schema$Field.hashCode() line: 421	
> 		Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527	
> 		Schema$RecordSchema.hashCode() line: 602	
> 		HashMap<K,V>.getEntry(Object) line: 344	
> 		HashMap<K,V>.containsKey(Object) line: 335	
> 		HashSet<E>.contains(Object) line: 184	
> 		SpecificCompiler.enqueue(Schema) line: 134	
> 		SpecificCompiler.<init>(Protocol) line: 70	
> 		SpecificCompiler.compileProtocol(File, File) line: 114	
> 		SpecificCompiler.main(String[]) line: 399	



[jira] Commented: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925891#action_12925891 ] 

Scott Carey commented on AVRO-685:
----------------------------------

Patch looks good and passes tests.  I haven't completely thought through this problem yet, but the change looks safe.


> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Priority: Critical
>         Attachments: AVRO-685.patch, Schema.patch, test.avpr
>
>



[jira] Commented: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Richard Ahrens (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926033#action_12926033 ] 

Richard Ahrens commented on AVRO-685:
-------------------------------------

Terrific!  Much appreciated, Doug.

> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Assignee: Richard Ahrens
>            Priority: Critical
>             Fix For: 1.5.0
>
>         Attachments: AVRO-685.patch, Schema.patch, test.avpr
>
>



[jira] Commented: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925565#action_12925565 ] 

Doug Cutting commented on AVRO-685:
-----------------------------------

Thanks for finding this!

I can't yet see how this can happen, but it does with the protocol you provide.

I'm trying to create a minimal example.  Does anyone have an intuition?


> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Priority: Critical
>         Attachments: Schema.patch, test.avpr
>
>



[jira] Updated: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Richard Ahrens (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ahrens updated AVRO-685:
--------------------------------

    Attachment: test.avpr

> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Priority: Critical
>         Attachments: Schema.patch, test.avpr
>
>



[jira] Updated: (AVRO-685) Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-685:
------------------------------

    Attachment: AVRO-685.patch

Yes, it was an exponential blowup.  I've added a test that illustrates this.

If there are no objections, I'll commit this tomorrow.
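
For intuition, here is a minimal sketch of how such a blowup can arise. It is not Avro's code and not the test from the attached patch; Node, hashRemoveOnExit, hashClearAtRoot and countCalls are hypothetical names invented for this illustration. It models a record as a node whose fields refer to other nodes, and compares a cycle guard that forgets a node as soon as its hashCode returns (the behavior described in the report) with one that keeps visited nodes until the outermost call unwinds (one possible remedy, assumed here for comparison).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class RecursiveHashSketch {

  /** Hypothetical stand-in for a record schema: a name plus fields that refer to other nodes. */
  static final class Node {
    final String name;
    final List<Node> fields = new ArrayList<>();
    Node(String name) { this.name = name; }
  }

  static long calls; // number of hash invocations in the current measurement

  /** Behavior described in the report: a node forgets itself as soon as it returns. */
  static int hashRemoveOnExit(Node n, Set<Node> seen) {
    calls++;
    if (!seen.add(n)) return 0; // already on the current path: break the cycle
    try {
      int h = n.name.hashCode();
      for (Node f : n.fields) h = 31 * h + hashRemoveOnExit(f, seen);
      return h;
    } finally {
      seen.remove(n); // later sibling fields will walk this whole subgraph again
    }
  }

  /** Assumed alternative: remember every visited node until the outermost call unwinds. */
  static int hashClearAtRoot(Node n, Set<Node> seen) {
    calls++;
    boolean outermost = seen.isEmpty();
    if (!seen.add(n)) return 0; // visited earlier in this top-level call
    try {
      int h = n.name.hashCode();
      for (Node f : n.fields) h = 31 * h + hashClearAtRoot(f, seen);
      return h;
    } finally {
      if (outermost) seen.clear();
    }
  }

  /** Builds n nodes that each refer to every other node, hashes the first, returns the call count. */
  static long countCalls(int n, boolean clearAtRoot) {
    List<Node> nodes = new ArrayList<>();
    for (int i = 0; i < n; i++) nodes.add(new Node("R" + i));
    for (Node a : nodes)
      for (Node b : nodes)
        if (a != b) a.fields.add(b);
    // Identity-based set, mirroring the IdentityHashMap visible in the stack trace.
    Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
    calls = 0;
    if (clearAtRoot) hashClearAtRoot(nodes.get(0), seen);
    else hashRemoveOnExit(nodes.get(0), seen);
    return calls;
  }

  public static void main(String[] args) {
    for (int n = 4; n <= 10; n += 2) {
      System.out.println(n + " records: remove-on-exit=" + countCalls(n, false)
          + " calls, clear-at-root=" + countCalls(n, true) + " calls");
    }
  }
}
```

In this fully connected toy graph, the remove-on-exit variant walks every simple path from the root, so its call count grows roughly factorially with the number of records, while the clear-at-root variant expands each record once and makes 1 + n(n-1) calls. Real schemas are less densely connected, but the same effect would explain a hashCode() that appears never to return.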

> Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-685
>                 URL: https://issues.apache.org/jira/browse/AVRO-685
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.1
>         Environment: all
>            Reporter: Richard Ahrens
>            Priority: Critical
>         Attachments: AVRO-685.patch, Schema.patch, test.avpr
>
>
