You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@nutch.apache.org by sn...@apache.org on 2019/01/29 10:16:57 UTC

[nutch] branch master updated: NUTCH-2691: Improve logging from scoring-depth plugin

This is an automated email from the ASF dual-hosted git repository.

snagel pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git


The following commit(s) were added to refs/heads/master by this push:
     new 010c2fc  NUTCH-2691: Improve logging from scoring-depth plugin
     new cb580f0  Merge pull request #434 from YossiTamari/patch-3
010c2fc is described below

commit 010c2fc8035525545812ae8acfbeeda1a8bbb96b
Author: YossiTamari <33...@users.noreply.github.com>
AuthorDate: Tue Jan 22 18:03:45 2019 +0200

    NUTCH-2691: Improve logging from scoring-depth plugin
    
    Exit distributeScoreToOutlinks immediately if there are no outlinks. This is a very small performance improvement, but more importantly it prevents the plugin from emitting a "Missing depth, removing all outlinks from url" warn message for every page that failed parsing.
---
 .../src/java/org/apache/nutch/scoring/depth/DepthScoringFilter.java    | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/plugin/scoring-depth/src/java/org/apache/nutch/scoring/depth/DepthScoringFilter.java b/src/plugin/scoring-depth/src/java/org/apache/nutch/scoring/depth/DepthScoringFilter.java
index 07e0e3f..c016030 100644
--- a/src/plugin/scoring-depth/src/java/org/apache/nutch/scoring/depth/DepthScoringFilter.java
+++ b/src/plugin/scoring-depth/src/java/org/apache/nutch/scoring/depth/DepthScoringFilter.java
@@ -73,6 +73,9 @@ public class DepthScoringFilter extends Configured implements ScoringFilter {
   public CrawlDatum distributeScoreToOutlinks(Text fromUrl,
       ParseData parseData, Collection<Entry<Text, CrawlDatum>> targets,
       CrawlDatum adjust, int allCount) throws ScoringFilterException {
+    if (targets.isEmpty()) {
+      return adjust;
+    }
     String depthString = parseData.getMeta(DEPTH_KEY);
     if (depthString == null) {
       LOG.warn("Missing depth, removing all outlinks from url " + fromUrl);