Posted to dev@nutch.apache.org by feng lu <am...@gmail.com> on 2013/01/22 06:59:15 UTC
CrawlDbFilter urlNormalizers NULL pointer
Hi all,
In the map method of the CrawlDbFilter class, if url == null and urlNormalizers is
true, it may throw a NullPointerException:
if (urlNormalizers) {
  try {
    url = normalizers.normalize(url, scope); // normalize the url
  } catch (Exception e) {
    LOG.warn("Skipping " + url + ":" + e);
    url = null;
  }
}
if (url != null && urlFiltering) {
  try {
    url = filters.filter(url); // filter the url
  } catch (Exception e) {
    LOG.warn("Skipping " + url + ":" + e);
    url = null;
  }
}
Maybe we can check url for null before applying the normalizers:
if (url != null && urlNormalizers) {
  ....
}
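To make the proposed guard concrete, here is a minimal, self-contained sketch. It is not the actual CrawlDbFilter code: the class NullSafeUrlPipeline and its process method are hypothetical, and the two UnaryOperator fields are stand-ins for normalizers.normalize(url, scope) and filters.filter(url). It only illustrates that with the extra null check a null url skips both steps instead of reaching the normalizer.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the logic in the map method; not the real Nutch class.
class NullSafeUrlPipeline {
    private final boolean urlNormalizers;
    private final boolean urlFiltering;
    // Assumed stand-ins for normalizers.normalize(url, scope) and filters.filter(url).
    private final UnaryOperator<String> normalizer;
    private final UnaryOperator<String> filter;

    NullSafeUrlPipeline(boolean urlNormalizers, boolean urlFiltering,
                        UnaryOperator<String> normalizer,
                        UnaryOperator<String> filter) {
        this.urlNormalizers = urlNormalizers;
        this.urlFiltering = urlFiltering;
        this.normalizer = normalizer;
        this.filter = filter;
    }

    // Returns the processed url, or null if the url should be skipped.
    String process(String url) {
        if (url != null && urlNormalizers) { // proposed guard: skip null urls
            try {
                url = normalizer.apply(url); // normalize the url
            } catch (Exception e) {
                System.err.println("Skipping " + url + ":" + e);
                url = null;
            }
        }
        if (url != null && urlFiltering) {
            try {
                url = filter.apply(url); // filter the url
            } catch (Exception e) {
                System.err.println("Skipping " + url + ":" + e);
                url = null;
            }
        }
        return url;
    }

    public static void main(String[] args) {
        NullSafeUrlPipeline pipeline = new NullSafeUrlPipeline(
            true, true,
            u -> u.trim().toLowerCase(Locale.ROOT), // dereferences its argument
            u -> u.startsWith("http") ? u : null);  // rejects non-http urls
        System.out.println(pipeline.process(null));
        System.out.println(pipeline.process("  HTTP://Example.COM/ "));
    }
}
```

With the guard in place, the null input never reaches the normalizer, so nothing is dereferenced and no warning is logged for it.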
--
Don't Grow Old, Grow Up... :-)
Re: CrawlDbFilter urlNormalizers NULL pointer
Posted by feng lu <am...@gmail.com>.
OK, I will open an issue and add a trivial test case later.
Thanks, Lewis.
On Tue, Jan 22, 2013 at 2:09 PM, Lewis John Mcgibbney <lewis.mcgibbney@gmail.com> wrote:
> This looks like a good catch.
> Please open a ticket for it if you can. A trivial test case would also be
> great if you are able.
> Lewis
Re: CrawlDbFilter urlNormalizers NULL pointer
Posted by Lewis John Mcgibbney <le...@gmail.com>.
This looks like a good catch.
Please open a ticket for it if you can. A trivial test case would also be
great if you are able.
Lewis
--
*Lewis*