Posted to commits@nutch.apache.org by su...@apache.org on 2016/01/27 17:40:40 UTC

svn commit: r1727126 - /nutch/trunk/conf/stopwords.txt.template

Author: sujen
Date: Wed Jan 27 16:40:40 2016
New Revision: 1727126

URL: http://svn.apache.org/viewvc?rev=1727126&view=rev
Log:
Added missing stopword file for NUTCH-2206

Added:
    nutch/trunk/conf/stopwords.txt.template

Added: nutch/trunk/conf/stopwords.txt.template
URL: http://svn.apache.org/viewvc/nutch/trunk/conf/stopwords.txt.template?rev=1727126&view=auto
==============================================================================
--- nutch/trunk/conf/stopwords.txt.template (added)
+++ nutch/trunk/conf/stopwords.txt.template Wed Jan 27 16:40:40 2016
@@ -0,0 +1,50 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+# Standard english stop words taken from Lucene's StopAnalyzer
+a
+an
+and
+are
+as
+at
+be
+but
+by
+for
+if
+in
+into
+is
+it
+no
+not
+of
+on
+or
+such
+that
+the
+their
+then
+there
+these
+they
+this
+to
+was
+will
+with