Posted to dev@gora.apache.org by "Renato Javier Marroquín Mogrovejo (JIRA)" <ji...@apache.org> on 2013/02/26 15:58:14 UTC

[jira] [Commented] (GORA-210) thread safety

    [ https://issues.apache.org/jira/browse/GORA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587179#comment-13587179 ] 

Renato Javier Marroquín Mogrovejo commented on GORA-210:
--------------------------------------------------------

Hi Roland!

This is great, especially if you have nailed this down and found the main problem. In NUTCH-1534 you mention a NullPointerException; does this patch fix that as well? The problem in NUTCH-1534 looks like a different one to me. I have seen that behaviour as well, that is, getting errors when column names were not defined.
Anyway, regarding the patch: I think it makes sense, but we will have to make the necessary changes everywhere we iterate over this key set [1]. What do you think, Roland?

[1] http://docs.oracle.com/javase/6/docs/api/java/util/Collections.html#synchronizedMap%28java.util.Map%29
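
To illustrate the point about iteration: the Javadoc in [1] says that a map returned by Collections.synchronizedMap must still be locked manually by the caller while iterating over any of its collection views. The following is only a minimal sketch of that pattern. The class SynchronizedBufferSketch and its methods are hypothetical names made up for illustration; this is not the code in CassandraStore or in the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a write buffer shared by several threads.
public class SynchronizedBufferSketch {

    // The wrapper makes single calls such as put() and get() thread safe.
    private final Map<String, String> buffer =
        Collections.synchronizedMap(new LinkedHashMap<String, String>());

    public void put(String key, String value) {
        buffer.put(key, value);
    }

    // Iteration is NOT covered by the wrapper: the caller has to hold the
    // map's own monitor for the whole loop, as described in [1].
    public List<String> snapshotKeys() {
        List<String> keys = new ArrayList<String>();
        synchronized (buffer) {
            for (String key : buffer.keySet()) {
                keys.add(key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        SynchronizedBufferSketch sketch = new SynchronizedBufferSketch();
        sketch.put("row1", "a");
        sketch.put("row2", "b");
        System.out.println(sketch.snapshotKeys());
    }
}
```

Copying the keys inside the synchronized block keeps the lock short; the slower work (for example writing to the backend) can then run on the copy without blocking other writers.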
                
> thread safety
> -------------
>
>                 Key: GORA-210
>                 URL: https://issues.apache.org/jira/browse/GORA-210
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: storage-cassandra
>    Affects Versions: 0.2
>         Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / gora-core 0.2.1
> running fetch with parse=true
> fetcher.threads.per.queue>1
>            Reporter: Roland
>            Priority: Critical
>              Labels: patch
>         Attachments: GORA-210.patch
>
>
> This is the result of debugging one of my issues described in NUTCH-1534.
> I think there is a wrong assumption about the thread safety of LinkedHashMap: it is not enough to avoid iterating over the buffer (which is a LinkedHashMap).
> My patch fixes this error for me:
> java.util.ConcurrentModificationException
>         at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:394)
>         at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:405)
>         at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
>         at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:200)
>         at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
>         at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>         at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
>         at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
> It may not be perfect from a performance point of view...
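
For readers unfamiliar with the exception in the trace above: the iterators of LinkedHashMap are fail-fast, and keySet().toArray() uses one internally, so a put() from another thread during that call can trigger it. The sketch below is a single-threaded illustration of the same fail-fast check, chosen because it is deterministic; it is an assumption-laden demonstration and not a reproduction of the Gora or Nutch code path. The class name FailFastSketch is hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

public class FailFastSketch {

    // Returns true if modifying the map while iterating its key set
    // raises ConcurrentModificationException.
    public static boolean modificationDuringIterationFails() {
        Map<String, Integer> buffer = new LinkedHashMap<String, Integer>();
        buffer.put("a", 1);
        buffer.put("b", 2);
        try {
            for (String key : buffer.keySet()) {
                // Adding a new key is a structural modification; the
                // iterator detects it on its next step.
                buffer.put(key + "-new", 0);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(modificationDuringIterationFails()); // prints "true"
    }
}
```

In the multi-threaded case the modification comes from a different thread, so the failure is timing dependent rather than guaranteed, which fits an error that only shows up with more than one fetcher thread per queue.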

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira