Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 01:25:42 UTC

[jira] [Created] (NUTCH-1390) readdb -url $url throws NPE with gora-cassandra

Lewis John McGibbney created NUTCH-1390:
-------------------------------------------

             Summary: readdb -url $url throws NPE with gora-cassandra
                 Key: NUTCH-1390
                 URL: https://issues.apache.org/jira/browse/NUTCH-1390
             Project: Nutch
          Issue Type: Bug
          Components: crawldb
    Affects Versions: nutchgora
            Reporter: Lewis John McGibbney
             Fix For: 2.1


After successfully injecting, generating, fetching (without parsing enabled), parsing, and updating the db, executing a readdb with a particular -url argument gives me a lovely NPE:

{code}
lewis@lewis:~/ASF/nutchgora/runtime/local$ ./bin/nutch readdb -url http://www.trancearoundtheworld.com
WebTableReader: java.lang.NullPointerException
	at org.apache.gora.cassandra.store.CassandraClient.getFamilyMap(CassandraClient.java:220)
	at org.apache.gora.cassandra.store.CassandraStore.execute(CassandraStore.java:108)
	at org.apache.nutch.crawl.WebTableReader.read(WebTableReader.java:234)
	at org.apache.nutch.crawl.WebTableReader.run(WebTableReader.java:476)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.crawl.WebTableReader.main(WebTableReader.java:412)
{code} 
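
The top frame points at a null dereference inside getFamilyMap. The trace alone does not show the exact cause. One common way such a trace arises is a lookup that returns null (for example, a requested field with no column-family mapping, or a query with no field list) being used without a check. The Java sketch below illustrates that pattern and a defensive guard. All names in it (FamilyLookupSketch, FIELD_TO_FAMILY, unguarded, guarded) are hypothetical and are not Gora's actual code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FamilyLookupSketch {
    // Hypothetical field -> column family mapping; NOT Gora's real mapping.
    static final Map<String, String> FIELD_TO_FAMILY = new HashMap<>();
    static {
        FIELD_TO_FAMILY.put("status", "f");
        FIELD_TO_FAMILY.put("fetchTime", "f");
        FIELD_TO_FAMILY.put("outlinks", "ol");
    }

    // Groups fields by family with no null checks.
    // Throws NullPointerException if fields is null or a field is unmapped.
    public static Map<String, List<String>> unguarded(String[] fields) {
        Map<String, List<String>> byFamily = new HashMap<>();
        for (String field : fields) {                    // NPE when fields == null
            String family = FIELD_TO_FAMILY.get(field);  // null for unmapped fields
            byFamily.computeIfAbsent(family.trim(), k -> new ArrayList<>()).add(field);
        }
        return byFamily;
    }

    // Same grouping, but tolerates a null field list and skips unmapped fields.
    public static Map<String, List<String>> guarded(String[] fields) {
        Map<String, List<String>> byFamily = new HashMap<>();
        if (fields == null) {
            return byFamily;
        }
        for (String field : fields) {
            String family = FIELD_TO_FAMILY.get(field);
            if (family == null) {
                continue; // no mapping for this field: skip instead of dereferencing
            }
            byFamily.computeIfAbsent(family, k -> new ArrayList<>()).add(field);
        }
        return byFamily;
    }

    public static void main(String[] args) {
        String[] fields = {"status", "outlinks", "noSuchField"};
        System.out.println(guarded(fields));
        try {
            unguarded(fields);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NPE");
        }
    }
}
```

Whether the real fix belongs in the caller (passing a complete field list) or in the store (tolerating a missing mapping) would need to be decided against the Gora source.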

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] [Commented] (NUTCH-1390) readdb -url $url throws NPE with gora-cassandra

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Thanks

This is most probably a bug in Gora's CassandraClient code then.

Thanks for reporting.

On Wed, Aug 22, 2012 at 3:09 PM, lin weijian <li...@gmail.com> wrote:
>
> I tested this scenario with HBase 0.92.1, and it works fine, both for trancearoundtheworld.com and for other domains.
>
>
>
> On 2012-8-22, at 9:59 PM, Lewis John McGibbney (JIRA) wrote:
>
>>
>>    [ https://issues.apache.org/jira/browse/NUTCH-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439539#comment-13439539 ]
>>
>> Lewis John McGibbney commented on NUTCH-1390:
>> ---------------------------------------------
>>
>> Can anyone confirm whether this is the case with any backend other than Cassandra? If I get no feedback, I'll do test runs on all backends myself. Ta Lewis
>>
>



-- 
Lewis

Re: [jira] [Commented] (NUTCH-1390) readdb -url $url throws NPE with gora-cassandra

Posted by lin weijian <li...@gmail.com>.
I tested this scenario with HBase 0.92.1, and it works fine, both for trancearoundtheworld.com and for other domains.



On 2012-8-22, at 9:59 PM, Lewis John McGibbney (JIRA) wrote:

> 
>    [ https://issues.apache.org/jira/browse/NUTCH-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439539#comment-13439539 ] 
> 
> Lewis John McGibbney commented on NUTCH-1390:
> ---------------------------------------------
> 
> Can anyone confirm whether this is the case with any backend other than Cassandra? If I get no feedback, I'll do test runs on all backends myself. Ta Lewis
> 
> 


[jira] [Commented] (NUTCH-1390) readdb -url $url throws NPE with gora-cassandra

Posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439539#comment-13439539 ] 

Lewis John McGibbney commented on NUTCH-1390:
---------------------------------------------

Can anyone confirm whether this is the case with any backend other than Cassandra? If I get no feedback, I'll do test runs on all backends myself. Ta Lewis
                

        

[jira] [Updated] (NUTCH-1390) readdb -url $url throws NPE with gora-cassandra

Posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-1390:
----------------------------------------

    Fix Version/s:     (was: 2.1)
                   2.2
    