Posted to commits@lucene.apache.org by ar...@apache.org on 2017/04/01 17:54:24 UTC
[5/6] lucene-solr:master: SOLR-7383: Replace DIH 'rss' example with 'atom'
The rss example was broken for multiple reasons. The atom example showcases
the same features, and more, and uses the smallest config file needed to
make it work.
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_es.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_es.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_es.txt
deleted file mode 100644
index 487d78c..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_es.txt
+++ /dev/null
@@ -1,356 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/spanish/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
- | A Spanish stop word list. Comments begin with vertical bar. Each stop
- | word is at the start of a line.
-
-
- | The following is a ranked list (commonest to rarest) of stopwords
- | deriving from a large sample of text.
-
- | Extra words have been added at the end.
-
-de | from, of
-la | the, her
-que | who, that
-el | the
-en | in
-y | and
-a | to
-los | the, them
-del | de + el
-se | himself, from him etc
-las | the, them
-por | for, by, etc
-un | a
-para | for
-con | with
-no | no
-una | a
-su | his, her
-al | a + el
- | es from SER
-lo | him
-como | how
-más | more
-pero | pero
-sus | su plural
-le | to him, her
-ya | already
-o | or
- | fue from SER
-este | this
- | ha from HABER
-sí | himself etc
-porque | because
-esta | this
- | son from SER
-entre | between
- | está from ESTAR
-cuando | when
-muy | very
-sin | without
-sobre | on
- | ser from SER
- | tiene from TENER
-también | also
-me | me
-hasta | until
-hay | there is/are
-donde | where
- | han from HABER
-quien | whom, that
- | están from ESTAR
- | estado from ESTAR
-desde | from
-todo | all
-nos | us
-durante | during
- | estados from ESTAR
-todos | all
-uno | a
-les | to them
-ni | nor
-contra | against
-otros | other
- | fueron from SER
-ese | that
-eso | that
- | había from HABER
-ante | before
-ellos | they
-e | and (variant of y)
-esto | this
-mí | me
-antes | before
-algunos | some
-qué | what?
-unos | a
-yo | I
-otro | other
-otras | other
-otra | other
-él | he
-tanto | so much, many
-esa | that
-estos | these
-mucho | much, many
-quienes | who
-nada | nothing
-muchos | many
-cual | who
- | sea from SER
-poco | few
-ella | she
-estar | to be
- | haber from HABER
-estas | these
- | estaba from ESTAR
- | estamos from ESTAR
-algunas | some
-algo | something
-nosotros | we
-
- | other forms
-
-mi | me
-mis | mi plural
-tú | thou
-te | thee
-ti | thee
-tu | thy
-tus | tu plural
-ellas | they
-nosotras | we
-vosotros | you
-vosotras | you
-os | you
-mío | mine
-mía |
-míos |
-mías |
-tuyo | thine
-tuya |
-tuyos |
-tuyas |
-suyo | his, hers, theirs
-suya |
-suyos |
-suyas |
-nuestro | ours
-nuestra |
-nuestros |
-nuestras |
-vuestro | yours
-vuestra |
-vuestros |
-vuestras |
-esos | those
-esas | those
-
- | forms of estar, to be (not including the infinitive):
-estoy
-estás
-está
-estamos
-estáis
-están
-esté
-estés
-estemos
-estéis
-estén
-estaré
-estarás
-estará
-estaremos
-estaréis
-estarán
-estaría
-estarías
-estaríamos
-estaríais
-estarían
-estaba
-estabas
-estábamos
-estabais
-estaban
-estuve
-estuviste
-estuvo
-estuvimos
-estuvisteis
-estuvieron
-estuviera
-estuvieras
-estuviéramos
-estuvierais
-estuvieran
-estuviese
-estuvieses
-estuviésemos
-estuvieseis
-estuviesen
-estando
-estado
-estada
-estados
-estadas
-estad
-
- | forms of haber, to have (not including the infinitive):
-he
-has
-ha
-hemos
-habéis
-han
-haya
-hayas
-hayamos
-hayáis
-hayan
-habré
-habrás
-habrá
-habremos
-habréis
-habrán
-habría
-habrías
-habríamos
-habríais
-habrían
-había
-habías
-habíamos
-habíais
-habían
-hube
-hubiste
-hubo
-hubimos
-hubisteis
-hubieron
-hubiera
-hubieras
-hubiéramos
-hubierais
-hubieran
-hubiese
-hubieses
-hubiésemos
-hubieseis
-hubiesen
-habiendo
-habido
-habida
-habidos
-habidas
-
- | forms of ser, to be (not including the infinitive):
-soy
-eres
-es
-somos
-sois
-son
-sea
-seas
-seamos
-seáis
-sean
-seré
-serás
-será
-seremos
-seréis
-serán
-sería
-serías
-seríamos
-seríais
-serían
-era
-eras
-éramos
-erais
-eran
-fui
-fuiste
-fue
-fuimos
-fuisteis
-fueron
-fuera
-fueras
-fuéramos
-fuerais
-fueran
-fuese
-fueses
-fuésemos
-fueseis
-fuesen
-siendo
-sido
- | sed also means 'thirst'
-
- | forms of tener, to have (not including the infinitive):
-tengo
-tienes
-tiene
-tenemos
-tenéis
-tienen
-tenga
-tengas
-tengamos
-tengáis
-tengan
-tendré
-tendrás
-tendrá
-tendremos
-tendréis
-tendrán
-tendría
-tendrías
-tendríamos
-tendríais
-tendrían
-tenía
-tenías
-teníamos
-teníais
-tenían
-tuve
-tuviste
-tuvo
-tuvimos
-tuvisteis
-tuvieron
-tuviera
-tuvieras
-tuviéramos
-tuvierais
-tuvieran
-tuviese
-tuvieses
-tuviésemos
-tuvieseis
-tuviesen
-teniendo
-tenido
-tenida
-tenidos
-tenidas
-tened
-
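The header of the file removed above describes the Snowball stopword format: comments begin with a vertical bar, and each stop word sits at the start of a line. The following is a minimal sketch of reading that format, not the loader Solr itself uses. It assumes that every whitespace-separated token before the bar counts as a word (some files in this commit, such as stopwords_fi.txt, put several words on one line); the function name `parse_snowball_stopwords` is hypothetical.

```python
def parse_snowball_stopwords(text):
    """Return the stop words found in Snowball-format text, in first-seen order."""
    words = []
    seen = set()
    for line in text.splitlines():
        # Everything from the first vertical bar onward is a comment.
        content = line.split("|", 1)[0]
        # Assumption: each whitespace-separated token before the bar is a word.
        for word in content.split():
            if word not in seen:
                seen.add(word)
                words.append(word)
    return words


sample = """ | A Spanish stop word list.
de | from, of
la | the, her
 | es from SER
estoy
"""

print(parse_snowball_stopwords(sample))
```

Lines that hold only a comment, such as the commented-out `es` entry, contribute no words.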
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_eu.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_eu.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_eu.txt
deleted file mode 100644
index 25f1db9..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_eu.txt
+++ /dev/null
@@ -1,99 +0,0 @@
-# example set of basque stopwords
-al
-anitz
-arabera
-asko
-baina
-bat
-batean
-batek
-bati
-batzuei
-batzuek
-batzuetan
-batzuk
-bera
-beraiek
-berau
-berauek
-bere
-berori
-beroriek
-beste
-bezala
-da
-dago
-dira
-ditu
-du
-dute
-edo
-egin
-ere
-eta
-eurak
-ez
-gainera
-gu
-gutxi
-guzti
-haiei
-haiek
-haietan
-hainbeste
-hala
-han
-handik
-hango
-hara
-hari
-hark
-hartan
-hau
-hauei
-hauek
-hauetan
-hemen
-hemendik
-hemengo
-hi
-hona
-honek
-honela
-honetan
-honi
-hor
-hori
-horiei
-horiek
-horietan
-horko
-horra
-horrek
-horrela
-horretan
-horri
-hortik
-hura
-izan
-ni
-noiz
-nola
-non
-nondik
-nongo
-nor
-nora
-ze
-zein
-zen
-zenbait
-zenbat
-zer
-zergatik
-ziren
-zituen
-zu
-zuek
-zuen
-zuten
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fa.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fa.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fa.txt
deleted file mode 100644
index 723641c..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fa.txt
+++ /dev/null
@@ -1,313 +0,0 @@
-# This file was created by Jacques Savoy and is distributed under the BSD license.
-# See http://members.unine.ch/jacques.savoy/clef/index.html.
-# Also see http://www.opensource.org/licenses/bsd-license.html
-# Note: by default this file is used after normalization, so when adding entries
-# to this file, use the arabic '\u064a' instead of '\u06cc'
-\u0627\u0646\u0627\u0646
-\u0646\u062f\u0627\u0634\u062a\u0647
-\u0633\u0631\u0627\u0633\u0631
-\u062e\u064a\u0627\u0647
-\u0627\u064a\u0634\u0627\u0646
-\u0648\u064a
-\u062a\u0627\u0643\u0646\u0648\u0646
-\u0628\u064a\u0634\u062a\u0631\u064a
-\u062f\u0648\u0645
-\u067e\u0633
-\u0646\u0627\u0634\u064a
-\u0648\u06af\u0648
-\u064a\u0627
-\u062f\u0627\u0634\u062a\u0646\u062f
-\u0633\u067e\u0633
-\u0647\u0646\u06af\u0627\u0645
-\u0647\u0631\u06af\u0632
-\u067e\u0646\u062c
-\u0646\u0634\u0627\u0646
-\u0627\u0645\u0633\u0627\u0644
-\u062f\u064a\u06af\u0631
-\u06af\u0631\u0648\u0647\u064a
-\u0634\u062f\u0646\u062f
-\u0686\u0637\u0648\u0631
-\u062f\u0647
-\u0648
-\u062f\u0648
-\u0646\u062e\u0633\u062a\u064a\u0646
-\u0648\u0644\u064a
-\u0686\u0631\u0627
-\u0686\u0647
-\u0648\u0633\u0637
-\u0647
-\u0643\u062f\u0627\u0645
-\u0642\u0627\u0628\u0644
-\u064a\u0643
-\u0631\u0641\u062a
-\u0647\u0641\u062a
-\u0647\u0645\u0686\u0646\u064a\u0646
-\u062f\u0631
-\u0647\u0632\u0627\u0631
-\u0628\u0644\u0647
-\u0628\u0644\u064a
-\u0634\u0627\u064a\u062f
-\u0627\u0645\u0627
-\u0634\u0646\u0627\u0633\u064a
-\u06af\u0631\u0641\u062a\u0647
-\u062f\u0647\u062f
-\u062f\u0627\u0634\u062a\u0647
-\u062f\u0627\u0646\u0633\u062a
-\u062f\u0627\u0634\u062a\u0646
-\u062e\u0648\u0627\u0647\u064a\u0645
-\u0645\u064a\u0644\u064a\u0627\u0631\u062f
-\u0648\u0642\u062a\u064a\u0643\u0647
-\u0627\u0645\u062f
-\u062e\u0648\u0627\u0647\u062f
-\u062c\u0632
-\u0627\u0648\u0631\u062f\u0647
-\u0634\u062f\u0647
-\u0628\u0644\u0643\u0647
-\u062e\u062f\u0645\u0627\u062a
-\u0634\u062f\u0646
-\u0628\u0631\u062e\u064a
-\u0646\u0628\u0648\u062f
-\u0628\u0633\u064a\u0627\u0631\u064a
-\u062c\u0644\u0648\u06af\u064a\u0631\u064a
-\u062d\u0642
-\u0643\u0631\u062f\u0646\u062f
-\u0646\u0648\u0639\u064a
-\u0628\u0639\u0631\u064a
-\u0646\u0643\u0631\u062f\u0647
-\u0646\u0638\u064a\u0631
-\u0646\u0628\u0627\u064a\u062f
-\u0628\u0648\u062f\u0647
-\u0628\u0648\u062f\u0646
-\u062f\u0627\u062f
-\u0627\u0648\u0631\u062f
-\u0647\u0633\u062a
-\u062c\u0627\u064a\u064a
-\u0634\u0648\u062f
-\u062f\u0646\u0628\u0627\u0644
-\u062f\u0627\u062f\u0647
-\u0628\u0627\u064a\u062f
-\u0633\u0627\u0628\u0642
-\u0647\u064a\u0686
-\u0647\u0645\u0627\u0646
-\u0627\u0646\u062c\u0627
-\u0643\u0645\u062a\u0631
-\u0643\u062c\u0627\u0633\u062a
-\u06af\u0631\u062f\u062f
-\u0643\u0633\u064a
-\u062a\u0631
-\u0645\u0631\u062f\u0645
-\u062a\u0627\u0646
-\u062f\u0627\u062f\u0646
-\u0628\u0648\u062f\u0646\u062f
-\u0633\u0631\u064a
-\u062c\u062f\u0627
-\u0646\u062f\u0627\u0631\u0646\u062f
-\u0645\u06af\u0631
-\u064a\u0643\u062f\u064a\u06af\u0631
-\u062f\u0627\u0631\u062f
-\u062f\u0647\u0646\u062f
-\u0628\u0646\u0627\u0628\u0631\u0627\u064a\u0646
-\u0647\u0646\u06af\u0627\u0645\u064a
-\u0633\u0645\u062a
-\u062c\u0627
-\u0627\u0646\u0686\u0647
-\u062e\u0648\u062f
-\u062f\u0627\u062f\u0646\u062f
-\u0632\u064a\u0627\u062f
-\u062f\u0627\u0631\u0646\u062f
-\u0627\u062b\u0631
-\u0628\u062f\u0648\u0646
-\u0628\u0647\u062a\u0631\u064a\u0646
-\u0628\u064a\u0634\u062a\u0631
-\u0627\u0644\u0628\u062a\u0647
-\u0628\u0647
-\u0628\u0631\u0627\u0633\u0627\u0633
-\u0628\u064a\u0631\u0648\u0646
-\u0643\u0631\u062f
-\u0628\u0639\u0636\u064a
-\u06af\u0631\u0641\u062a
-\u062a\u0648\u064a
-\u0627\u064a
-\u0645\u064a\u0644\u064a\u0648\u0646
-\u0627\u0648
-\u062c\u0631\u064a\u0627\u0646
-\u062a\u0648\u0644
-\u0628\u0631
-\u0645\u0627\u0646\u0646\u062f
-\u0628\u0631\u0627\u0628\u0631
-\u0628\u0627\u0634\u064a\u0645
-\u0645\u062f\u062a\u064a
-\u06af\u0648\u064a\u0646\u062f
-\u0627\u0643\u0646\u0648\u0646
-\u062a\u0627
-\u062a\u0646\u0647\u0627
-\u062c\u062f\u064a\u062f
-\u0686\u0646\u062f
-\u0628\u064a
-\u0646\u0634\u062f\u0647
-\u0643\u0631\u062f\u0646
-\u0643\u0631\u062f\u0645
-\u06af\u0648\u064a\u062f
-\u0643\u0631\u062f\u0647
-\u0643\u0646\u064a\u0645
-\u0646\u0645\u064a
-\u0646\u0632\u062f
-\u0631\u0648\u064a
-\u0642\u0635\u062f
-\u0641\u0642\u0637
-\u0628\u0627\u0644\u0627\u064a
-\u062f\u064a\u06af\u0631\u0627\u0646
-\u0627\u064a\u0646
-\u062f\u064a\u0631\u0648\u0632
-\u062a\u0648\u0633\u0637
-\u0633\u0648\u0645
-\u0627\u064a\u0645
-\u062f\u0627\u0646\u0646\u062f
-\u0633\u0648\u064a
-\u0627\u0633\u062a\u0641\u0627\u062f\u0647
-\u0634\u0645\u0627
-\u0643\u0646\u0627\u0631
-\u062f\u0627\u0631\u064a\u0645
-\u0633\u0627\u062e\u062a\u0647
-\u0637\u0648\u0631
-\u0627\u0645\u062f\u0647
-\u0631\u0641\u062a\u0647
-\u0646\u062e\u0633\u062a
-\u0628\u064a\u0633\u062a
-\u0646\u0632\u062f\u064a\u0643
-\u0637\u064a
-\u0643\u0646\u064a\u062f
-\u0627\u0632
-\u0627\u0646\u0647\u0627
-\u062a\u0645\u0627\u0645\u064a
-\u062f\u0627\u0634\u062a
-\u064a\u0643\u064a
-\u0637\u0631\u064a\u0642
-\u0627\u0634
-\u0686\u064a\u0633\u062a
-\u0631\u0648\u0628
-\u0646\u0645\u0627\u064a\u062f
-\u06af\u0641\u062a
-\u0686\u0646\u062f\u064a\u0646
-\u0686\u064a\u0632\u064a
-\u062a\u0648\u0627\u0646\u062f
-\u0627\u0645
-\u0627\u064a\u0627
-\u0628\u0627
-\u0627\u0646
-\u0627\u064a\u062f
-\u062a\u0631\u064a\u0646
-\u0627\u064a\u0646\u0643\u0647
-\u062f\u064a\u06af\u0631\u064a
-\u0631\u0627\u0647
-\u0647\u0627\u064a\u064a
-\u0628\u0631\u0648\u0632
-\u0647\u0645\u0686\u0646\u0627\u0646
-\u067e\u0627\u0639\u064a\u0646
-\u0643\u0633
-\u062d\u062f\u0648\u062f
-\u0645\u062e\u062a\u0644\u0641
-\u0645\u0642\u0627\u0628\u0644
-\u0686\u064a\u0632
-\u06af\u064a\u0631\u062f
-\u0646\u062f\u0627\u0631\u062f
-\u0636\u062f
-\u0647\u0645\u0686\u0648\u0646
-\u0633\u0627\u0632\u064a
-\u0634\u0627\u0646
-\u0645\u0648\u0631\u062f
-\u0628\u0627\u0631\u0647
-\u0645\u0631\u0633\u064a
-\u062e\u0648\u064a\u0634
-\u0628\u0631\u062e\u0648\u0631\u062f\u0627\u0631
-\u0686\u0648\u0646
-\u062e\u0627\u0631\u062c
-\u0634\u0634
-\u0647\u0646\u0648\u0632
-\u062a\u062d\u062a
-\u0636\u0645\u0646
-\u0647\u0633\u062a\u064a\u0645
-\u06af\u0641\u062a\u0647
-\u0641\u0643\u0631
-\u0628\u0633\u064a\u0627\u0631
-\u067e\u064a\u0634
-\u0628\u0631\u0627\u064a
-\u0631\u0648\u0632\u0647\u0627\u064a
-\u0627\u0646\u0643\u0647
-\u0646\u062e\u0648\u0627\u0647\u062f
-\u0628\u0627\u0644\u0627
-\u0643\u0644
-\u0648\u0642\u062a\u064a
-\u0643\u064a
-\u0686\u0646\u064a\u0646
-\u0643\u0647
-\u06af\u064a\u0631\u064a
-\u0646\u064a\u0633\u062a
-\u0627\u0633\u062a
-\u0643\u062c\u0627
-\u0643\u0646\u062f
-\u0646\u064a\u0632
-\u064a\u0627\u0628\u062f
-\u0628\u0646\u062f\u064a
-\u062d\u062a\u064a
-\u062a\u0648\u0627\u0646\u0646\u062f
-\u0639\u0642\u0628
-\u062e\u0648\u0627\u0633\u062a
-\u0643\u0646\u0646\u062f
-\u0628\u064a\u0646
-\u062a\u0645\u0627\u0645
-\u0647\u0645\u0647
-\u0645\u0627
-\u0628\u0627\u0634\u0646\u062f
-\u0645\u062b\u0644
-\u0634\u062f
-\u0627\u0631\u064a
-\u0628\u0627\u0634\u062f
-\u0627\u0631\u0647
-\u0637\u0628\u0642
-\u0628\u0639\u062f
-\u0627\u06af\u0631
-\u0635\u0648\u0631\u062a
-\u063a\u064a\u0631
-\u062c\u0627\u064a
-\u0628\u064a\u0634
-\u0631\u064a\u0632\u064a
-\u0627\u0646\u062f
-\u0632\u064a\u0631\u0627
-\u0686\u06af\u0648\u0646\u0647
-\u0628\u0627\u0631
-\u0644\u0637\u0641\u0627
-\u0645\u064a
-\u062f\u0631\u0628\u0627\u0631\u0647
-\u0645\u0646
-\u062f\u064a\u062f\u0647
-\u0647\u0645\u064a\u0646
-\u06af\u0630\u0627\u0631\u064a
-\u0628\u0631\u062f\u0627\u0631\u064a
-\u0639\u0644\u062a
-\u06af\u0630\u0627\u0634\u062a\u0647
-\u0647\u0645
-\u0641\u0648\u0642
-\u0646\u0647
-\u0647\u0627
-\u0634\u0648\u0646\u062f
-\u0627\u0628\u0627\u062f
-\u0647\u0645\u0648\u0627\u0631\u0647
-\u0647\u0631
-\u0627\u0648\u0644
-\u062e\u0648\u0627\u0647\u0646\u062f
-\u0686\u0647\u0627\u0631
-\u0646\u0627\u0645
-\u0627\u0645\u0631\u0648\u0632
-\u0645\u0627\u0646
-\u0647\u0627\u064a
-\u0642\u0628\u0644
-\u0643\u0646\u0645
-\u0633\u0639\u064a
-\u062a\u0627\u0632\u0647
-\u0631\u0627
-\u0647\u0633\u062a\u0646\u062f
-\u0632\u064a\u0631
-\u062c\u0644\u0648\u064a
-\u0639\u0646\u0648\u0627\u0646
-\u0628\u0648\u062f
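The header of the Persian file removed above notes that the list is applied after normalization, so new entries should use the Arabic yeh (`\u064a`) rather than the Farsi yeh (`\u06cc`). The sketch below shows only that one substitution, plus a helper that writes a word in the `\uXXXX` form seen in this plain-text view. Both function names are hypothetical, and the actual normalizer may fold other characters as well.

```python
FARSI_YEH = "\u06cc"
ARABIC_YEH = "\u064a"


def normalize_entry(word):
    """Replace Farsi yeh with Arabic yeh, as the file header asks for new entries."""
    return word.replace(FARSI_YEH, ARABIC_YEH)


def to_escaped(word):
    """Render a word as \\uXXXX escapes (assumes Basic Multilingual Plane characters only)."""
    return "".join(f"\\u{ord(ch):04x}" for ch in word)


# A word typed with the Farsi yeh is stored in its normalized, escaped form.
entry = normalize_entry("\u0648\u06cc")
print(to_escaped(entry))
```

Running the check before adding an entry keeps the list consistent with the normalized tokens it is matched against.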
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fi.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fi.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fi.txt
deleted file mode 100644
index 4372c9a..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fi.txt
+++ /dev/null
@@ -1,97 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/finnish/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
-| forms of BE
-
-olla
-olen
-olet
-on
-olemme
-olette
-ovat
-ole | negative form
-
-oli
-olisi
-olisit
-olisin
-olisimme
-olisitte
-olisivat
-olit
-olin
-olimme
-olitte
-olivat
-ollut
-olleet
-
-en | negation
-et
-ei
-emme
-ette
-eivät
-
-|Nom Gen Acc Part Iness Elat Illat Adess Ablat Allat Ess Trans
-minä minun minut minua minussa minusta minuun minulla minulta minulle | I
-sinä sinun sinut sinua sinussa sinusta sinuun sinulla sinulta sinulle | you
-hän hänen hänet häntä hänessä hänestä häneen hänellä häneltä hänelle | he she
-me meidän meidät meitä meissä meistä meihin meillä meiltä meille | we
-te teidän teidät teitä teissä teistä teihin teillä teiltä teille | you
-he heidän heidät heitä heissä heistä heihin heillä heiltä heille | they
-
-tämä tämän tätä tässä tästä tähän tallä tältä tälle tänä täksi | this
-tuo tuon tuotä tuossa tuosta tuohon tuolla tuolta tuolle tuona tuoksi | that
-se sen sitä siinä siitä siihen sillä siltä sille sinä siksi | it
-nämä näiden näitä näissä näistä näihin näillä näiltä näille näinä näiksi | these
-nuo noiden noita noissa noista noihin noilla noilta noille noina noiksi | those
-ne niiden niitä niissä niistä niihin niillä niiltä niille niinä niiksi | they
-
-kuka kenen kenet ketä kenessä kenestä keneen kenellä keneltä kenelle kenenä keneksi| who
-ketkä keiden ketkä keitä keissä keistä keihin keillä keiltä keille keinä keiksi | (pl)
-mikä minkä minkä mitä missä mistä mihin millä miltä mille minä miksi | which what
-mitkä | (pl)
-
-joka jonka jota jossa josta johon jolla jolta jolle jona joksi | who which
-jotka joiden joita joissa joista joihin joilla joilta joille joina joiksi | (pl)
-
-| conjunctions
-
-että | that
-ja | and
-jos | if
-koska | because
-kuin | than
-mutta | but
-niin | so
-sekä | and
-sillä | for
-tai | or
-vaan | but
-vai | or
-vaikka | although
-
-
-| prepositions
-
-kanssa | with
-mukaan | according to
-noin | about
-poikki | across
-yli | over, across
-
-| other
-
-kun | when
-niin | so
-nyt | now
-itse | self
-
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fr.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fr.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fr.txt
deleted file mode 100644
index 749abae..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_fr.txt
+++ /dev/null
@@ -1,186 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/french/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
- | A French stop word list. Comments begin with vertical bar. Each stop
- | word is at the start of a line.
-
-au | a + le
-aux | a + les
-avec | with
-ce | this
-ces | these
-dans | with
-de | of
-des | de + les
-du | de + le
-elle | she
-en | `of them' etc
-et | and
-eux | them
-il | he
-je | I
-la | the
-le | the
-leur | their
-lui | him
-ma | my (fem)
-mais | but
-me | me
-même | same; as in moi-même (myself) etc
-mes | me (pl)
-moi | me
-mon | my (masc)
-ne | not
-nos | our (pl)
-notre | our
-nous | we
-on | one
-ou | where
-par | by
-pas | not
-pour | for
-qu | que before vowel
-que | that
-qui | who
-sa | his, her (fem)
-se | oneself
-ses | his (pl)
-son | his, her (masc)
-sur | on
-ta | thy (fem)
-te | thee
-tes | thy (pl)
-toi | thee
-ton | thy (masc)
-tu | thou
-un | a
-une | a
-vos | your (pl)
-votre | your
-vous | you
-
- | single letter forms
-
-c | c'
-d | d'
-j | j'
-l | l'
-à | to, at
-m | m'
-n | n'
-s | s'
-t | t'
-y | there
-
- | forms of être (not including the infinitive):
-été
-étée
-étées
-étés
-étant
-suis
-es
-est
-sommes
-êtes
-sont
-serai
-seras
-sera
-serons
-serez
-seront
-serais
-serait
-serions
-seriez
-seraient
-étais
-était
-étions
-étiez
-étaient
-fus
-fut
-fûmes
-fûtes
-furent
-sois
-soit
-soyons
-soyez
-soient
-fusse
-fusses
-fût
-fussions
-fussiez
-fussent
-
- | forms of avoir (not including the infinitive):
-ayant
-eu
-eue
-eues
-eus
-ai
-as
-avons
-avez
-ont
-aurai
-auras
-aura
-aurons
-aurez
-auront
-aurais
-aurait
-aurions
-auriez
-auraient
-avais
-avait
-avions
-aviez
-avaient
-eut
-eûmes
-eûtes
-eurent
-aie
-aies
-ait
-ayons
-ayez
-aient
-eusse
-eusses
-eût
-eussions
-eussiez
-eussent
-
- | Later additions (from Jean-Christophe Deschamps)
-ceci | this
-cela | that
-celà | that
-cet | this
-cette | this
-ici | here
-ils | they
-les | the (pl)
-leurs | their (pl)
-quel | which
-quels | which
-quelle | which
-quelles | which
-sans | without
-soi | oneself
-
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ga.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ga.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ga.txt
deleted file mode 100644
index 9ff88d7..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ga.txt
+++ /dev/null
@@ -1,110 +0,0 @@
-
-a
-ach
-ag
-agus
-an
-aon
-ar
-arna
-as
-b'
-ba
-beirt
-bhúr
-caoga
-ceathair
-ceathrar
-chomh
-chtó
-chuig
-chun
-cois
-céad
-cúig
-cúigear
-d'
-daichead
-dar
-de
-deich
-deichniúr
-den
-dhá
-do
-don
-dtí
-dá
-dár
-dó
-faoi
-faoin
-faoina
-faoinár
-fara
-fiche
-gach
-gan
-go
-gur
-haon
-hocht
-i
-iad
-idir
-in
-ina
-ins
-inár
-is
-le
-leis
-lena
-lenár
-m'
-mar
-mo
-mé
-na
-nach
-naoi
-naonúr
-ná
-ní
-níor
-nó
-nócha
-ocht
-ochtar
-os
-roimh
-sa
-seacht
-seachtar
-seachtó
-seasca
-seisear
-siad
-sibh
-sinn
-sna
-sé
-sí
-tar
-thar
-thú
-triúr
-trí
-trína
-trínár
-tríocha
-tú
-um
-ár
-é
-éis
-í
-ó
-ón
-óna
-ónár
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_gl.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_gl.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_gl.txt
deleted file mode 100644
index d8760b1..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_gl.txt
+++ /dev/null
@@ -1,161 +0,0 @@
-# galican stopwords
-a
-aínda
-alí
-aquel
-aquela
-aquelas
-aqueles
-aquilo
-aquí
-ao
-aos
-as
-así
-á
-ben
-cando
-che
-co
-coa
-comigo
-con
-connosco
-contigo
-convosco
-coas
-cos
-cun
-cuns
-cunha
-cunhas
-da
-dalgunha
-dalgunhas
-dalgún
-dalgúns
-das
-de
-del
-dela
-delas
-deles
-desde
-deste
-do
-dos
-dun
-duns
-dunha
-dunhas
-e
-el
-ela
-elas
-eles
-en
-era
-eran
-esa
-esas
-ese
-eses
-esta
-estar
-estaba
-está
-están
-este
-estes
-estiven
-estou
-eu
-é
-facer
-foi
-foron
-fun
-había
-hai
-iso
-isto
-la
-las
-lle
-lles
-lo
-los
-mais
-me
-meu
-meus
-min
-miña
-miñas
-moi
-na
-nas
-neste
-nin
-no
-non
-nos
-nosa
-nosas
-noso
-nosos
-nós
-nun
-nunha
-nuns
-nunhas
-o
-os
-ou
-ó
-ós
-para
-pero
-pode
-pois
-pola
-polas
-polo
-polos
-por
-que
-se
-senón
-ser
-seu
-seus
-sexa
-sido
-sobre
-súa
-súas
-tamén
-tan
-te
-ten
-teñen
-teño
-ter
-teu
-teus
-ti
-tido
-tiña
-tiven
-túa
-túas
-un
-unha
-unhas
-uns
-vos
-vosa
-vosas
-voso
-vosos
-vós
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hi.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hi.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hi.txt
deleted file mode 100644
index 86286bb..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hi.txt
+++ /dev/null
@@ -1,235 +0,0 @@
-# Also see http://www.opensource.org/licenses/bsd-license.html
-# See http://members.unine.ch/jacques.savoy/clef/index.html.
-# This file was created by Jacques Savoy and is distributed under the BSD license.
-# Note: by default this file also contains forms normalized by HindiNormalizer
-# for spelling variation (see section below), such that it can be used whether or
-# not you enable that feature. When adding additional entries to this list,
-# please add the normalized form as well.
-\u0905\u0902\u0926\u0930
-\u0905\u0924
-\u0905\u092a\u0928\u093e
-\u0905\u092a\u0928\u0940
-\u0905\u092a\u0928\u0947
-\u0905\u092d\u0940
-\u0906\u0926\u093f
-\u0906\u092a
-\u0907\u0924\u094d\u092f\u093e\u0926\u093f
-\u0907\u0928
-\u0907\u0928\u0915\u093e
-\u0907\u0928\u094d\u0939\u0940\u0902
-\u0907\u0928\u094d\u0939\u0947\u0902
-\u0907\u0928\u094d\u0939\u094b\u0902
-\u0907\u0938
-\u0907\u0938\u0915\u093e
-\u0907\u0938\u0915\u0940
-\u0907\u0938\u0915\u0947
-\u0907\u0938\u092e\u0947\u0902
-\u0907\u0938\u0940
-\u0907\u0938\u0947
-\u0909\u0928
-\u0909\u0928\u0915\u093e
-\u0909\u0928\u0915\u0940
-\u0909\u0928\u0915\u0947
-\u0909\u0928\u0915\u094b
-\u0909\u0928\u094d\u0939\u0940\u0902
-\u0909\u0928\u094d\u0939\u0947\u0902
-\u0909\u0928\u094d\u0939\u094b\u0902
-\u0909\u0938
-\u0909\u0938\u0915\u0947
-\u0909\u0938\u0940
-\u0909\u0938\u0947
-\u090f\u0915
-\u090f\u0935\u0902
-\u090f\u0938
-\u0910\u0938\u0947
-\u0914\u0930
-\u0915\u0908
-\u0915\u0930
-\u0915\u0930\u0924\u093e
-\u0915\u0930\u0924\u0947
-\u0915\u0930\u0928\u093e
-\u0915\u0930\u0928\u0947
-\u0915\u0930\u0947\u0902
-\u0915\u0939\u0924\u0947
-\u0915\u0939\u093e
-\u0915\u093e
-\u0915\u093e\u095e\u0940
-\u0915\u093f
-\u0915\u093f\u0924\u0928\u093e
-\u0915\u093f\u0928\u094d\u0939\u0947\u0902
-\u0915\u093f\u0928\u094d\u0939\u094b\u0902
-\u0915\u093f\u092f\u093e
-\u0915\u093f\u0930
-\u0915\u093f\u0938
-\u0915\u093f\u0938\u0940
-\u0915\u093f\u0938\u0947
-\u0915\u0940
-\u0915\u0941\u091b
-\u0915\u0941\u0932
-\u0915\u0947
-\u0915\u094b
-\u0915\u094b\u0908
-\u0915\u094c\u0928
-\u0915\u094c\u0928\u0938\u093e
-\u0917\u092f\u093e
-\u0918\u0930
-\u091c\u092c
-\u091c\u0939\u093e\u0901
-\u091c\u093e
-\u091c\u093f\u0924\u0928\u093e
-\u091c\u093f\u0928
-\u091c\u093f\u0928\u094d\u0939\u0947\u0902
-\u091c\u093f\u0928\u094d\u0939\u094b\u0902
-\u091c\u093f\u0938
-\u091c\u093f\u0938\u0947
-\u091c\u0940\u0927\u0930
-\u091c\u0948\u0938\u093e
-\u091c\u0948\u0938\u0947
-\u091c\u094b
-\u0924\u0915
-\u0924\u092c
-\u0924\u0930\u0939
-\u0924\u093f\u0928
-\u0924\u093f\u0928\u094d\u0939\u0947\u0902
-\u0924\u093f\u0928\u094d\u0939\u094b\u0902
-\u0924\u093f\u0938
-\u0924\u093f\u0938\u0947
-\u0924\u094b
-\u0925\u093e
-\u0925\u0940
-\u0925\u0947
-\u0926\u092c\u093e\u0930\u093e
-\u0926\u093f\u092f\u093e
-\u0926\u0941\u0938\u0930\u093e
-\u0926\u0942\u0938\u0930\u0947
-\u0926\u094b
-\u0926\u094d\u0935\u093e\u0930\u093e
-\u0928
-\u0928\u0939\u0940\u0902
-\u0928\u093e
-\u0928\u093f\u0939\u093e\u092f\u0924
-\u0928\u0940\u091a\u0947
-\u0928\u0947
-\u092a\u0930
-\u092a\u0930
-\u092a\u0939\u0932\u0947
-\u092a\u0942\u0930\u093e
-\u092a\u0947
-\u092b\u093f\u0930
-\u092c\u0928\u0940
-\u092c\u0939\u0940
-\u092c\u0939\u0941\u0924
-\u092c\u093e\u0926
-\u092c\u093e\u0932\u093e
-\u092c\u093f\u0932\u0915\u0941\u0932
-\u092d\u0940
-\u092d\u0940\u0924\u0930
-\u092e\u0917\u0930
-\u092e\u093e\u0928\u094b
-\u092e\u0947
-\u092e\u0947\u0902
-\u092f\u0926\u093f
-\u092f\u0939
-\u092f\u0939\u093e\u0901
-\u092f\u0939\u0940
-\u092f\u093e
-\u092f\u093f\u0939
-\u092f\u0947
-\u0930\u0916\u0947\u0902
-\u0930\u0939\u093e
-\u0930\u0939\u0947
-\u0931\u094d\u0935\u093e\u0938\u093e
-\u0932\u093f\u090f
-\u0932\u093f\u092f\u0947
-\u0932\u0947\u0915\u093f\u0928
-\u0935
-\u0935\u0930\u094d\u0917
-\u0935\u0939
-\u0935\u0939
-\u0935\u0939\u093e\u0901
-\u0935\u0939\u0940\u0902
-\u0935\u093e\u0932\u0947
-\u0935\u0941\u0939
-\u0935\u0947
-\u0935\u095a\u0948\u0930\u0939
-\u0938\u0902\u0917
-\u0938\u0915\u0924\u093e
-\u0938\u0915\u0924\u0947
-\u0938\u092c\u0938\u0947
-\u0938\u092d\u0940
-\u0938\u093e\u0925
-\u0938\u093e\u092c\u0941\u0924
-\u0938\u093e\u092d
-\u0938\u093e\u0930\u093e
-\u0938\u0947
-\u0938\u094b
-\u0939\u0940
-\u0939\u0941\u0906
-\u0939\u0941\u0908
-\u0939\u0941\u090f
-\u0939\u0948
-\u0939\u0948\u0902
-\u0939\u094b
-\u0939\u094b\u0924\u093e
-\u0939\u094b\u0924\u0940
-\u0939\u094b\u0924\u0947
-\u0939\u094b\u0928\u093e
-\u0939\u094b\u0928\u0947
-# additional normalized forms of the above
-\u0905\u092a\u0928\u093f
-\u091c\u0947\u0938\u0947
-\u0939\u094b\u0924\u093f
-\u0938\u092d\u093f
-\u0924\u093f\u0902\u0939\u094b\u0902
-\u0907\u0902\u0939\u094b\u0902
-\u0926\u0935\u093e\u0930\u093e
-\u0907\u0938\u093f
-\u0915\u093f\u0902\u0939\u0947\u0902
-\u0925\u093f
-\u0909\u0902\u0939\u094b\u0902
-\u0913\u0930
-\u091c\u093f\u0902\u0939\u0947\u0902
-\u0935\u0939\u093f\u0902
-\u0905\u092d\u093f
-\u092c\u0928\u093f
-\u0939\u093f
-\u0909\u0902\u0939\u093f\u0902
-\u0909\u0902\u0939\u0947\u0902
-\u0939\u0947\u0902
-\u0935\u0917\u0947\u0930\u0939
-\u090f\u0938\u0947
-\u0930\u0935\u093e\u0938\u093e
-\u0915\u094b\u0928
-\u0928\u093f\u091a\u0947
-\u0915\u093e\u092b\u093f
-\u0909\u0938\u093f
-\u092a\u0941\u0930\u093e
-\u092d\u093f\u0924\u0930
-\u0939\u0947
-\u092c\u0939\u093f
-\u0935\u0939\u093e\u0902
-\u0915\u094b\u0907
-\u092f\u0939\u093e\u0902
-\u091c\u093f\u0902\u0939\u094b\u0902
-\u0924\u093f\u0902\u0939\u0947\u0902
-\u0915\u093f\u0938\u093f
-\u0915\u0907
-\u092f\u0939\u093f
-\u0907\u0902\u0939\u093f\u0902
-\u091c\u093f\u0927\u0930
-\u0907\u0902\u0939\u0947\u0902
-\u0905\u0926\u093f
-\u0907\u0924\u092f\u093e\u0926\u093f
-\u0939\u0941\u0907
-\u0915\u094b\u0928\u0938\u093e
-\u0907\u0938\u0915\u093f
-\u0926\u0941\u0938\u0930\u0947
-\u091c\u0939\u093e\u0902
-\u0905\u092a
-\u0915\u093f\u0902\u0939\u094b\u0902
-\u0909\u0928\u0915\u093f
-\u092d\u093f
-\u0935\u0930\u0917
-\u0939\u0941\u0905
-\u091c\u0947\u0938\u093e
-\u0928\u0939\u093f\u0902
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hu.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hu.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hu.txt
deleted file mode 100644
index 37526da..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hu.txt
+++ /dev/null
@@ -1,211 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/hungarian/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
-| Hungarian stop word list
-| prepared by Anna Tordai
-
-a
-ahogy
-ahol
-aki
-akik
-akkor
-alatt
-által
-általában
-amely
-amelyek
-amelyekben
-amelyeket
-amelyet
-amelynek
-ami
-amit
-amolyan
-amíg
-amikor
-át
-abban
-ahhoz
-annak
-arra
-arról
-az
-azok
-azon
-azt
-azzal
-azért
-aztán
-azután
-azonban
-bár
-be
-belül
-benne
-cikk
-cikkek
-cikkeket
-csak
-de
-e
-eddig
-egész
-egy
-egyes
-egyetlen
-egyéb
-egyik
-egyre
-ekkor
-el
-elég
-ellen
-el\u0151
-el\u0151ször
-el\u0151tt
-els\u0151
-én
-éppen
-ebben
-ehhez
-emilyen
-ennek
-erre
-ez
-ezt
-ezek
-ezen
-ezzel
-ezért
-és
-fel
-felé
-hanem
-hiszen
-hogy
-hogyan
-igen
-így
-illetve
-ill.
-ill
-ilyen
-ilyenkor
-ison
-ismét
-itt
-jó
-jól
-jobban
-kell
-kellett
-keresztül
-keressünk
-ki
-kívül
-között
-közül
-legalább
-lehet
-lehetett
-legyen
-lenne
-lenni
-lesz
-lett
-maga
-magát
-majd
-majd
-már
-más
-másik
-meg
-még
-mellett
-mert
-mely
-melyek
-mi
-mit
-míg
-miért
-milyen
-mikor
-minden
-mindent
-mindenki
-mindig
-mint
-mintha
-mivel
-most
-nagy
-nagyobb
-nagyon
-ne
-néha
-nekem
-neki
-nem
-néhány
-nélkül
-nincs
-olyan
-ott
-össze
-\u0151
-\u0151k
-\u0151ket
-pedig
-persze
-rá
-s
-saját
-sem
-semmi
-sok
-sokat
-sokkal
-számára
-szemben
-szerint
-szinte
-talán
-tehát
-teljes
-tovább
-továbbá
-több
-úgy
-ugyanis
-új
-újabb
-újra
-után
-utána
-utolsó
-vagy
-vagyis
-valaki
-valami
-valamint
-val�
-vagyok
-van
-vannak
-volt
-voltam
-voltak
-voltunk
-vissza
-vele
-viszont
-volna
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hy.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hy.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hy.txt
deleted file mode 100644
index 60c1c50..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_hy.txt
+++ /dev/null
@@ -1,46 +0,0 @@
-# example set of Armenian stopwords.
-\u0561\u0575\u0564
-\u0561\u0575\u056c
-\u0561\u0575\u0576
-\u0561\u0575\u057d
-\u0564\u0578\u0582
-\u0564\u0578\u0582\u0584
-\u0565\u0574
-\u0565\u0576
-\u0565\u0576\u0584
-\u0565\u057d
-\u0565\u0584
-\u0567
-\u0567\u056b
-\u0567\u056b\u0576
-\u0567\u056b\u0576\u0584
-\u0567\u056b\u0580
-\u0567\u056b\u0584
-\u0567\u0580
-\u0568\u057d\u057f
-\u0569
-\u056b
-\u056b\u0576
-\u056b\u057d\u056f
-\u056b\u0580
-\u056f\u0561\u0574
-\u0570\u0561\u0574\u0561\u0580
-\u0570\u0565\u057f
-\u0570\u0565\u057f\u0578
-\u0574\u0565\u0576\u0584
-\u0574\u0565\u057b
-\u0574\u056b
-\u0576
-\u0576\u0561
-\u0576\u0561\u0587
-\u0576\u0580\u0561
-\u0576\u0580\u0561\u0576\u0584
-\u0578\u0580
-\u0578\u0580\u0568
-\u0578\u0580\u0578\u0576\u0584
-\u0578\u0580\u057a\u0565\u057d
-\u0578\u0582
-\u0578\u0582\u0574
-\u057a\u056b\u057f\u056b
-\u057e\u0580\u0561
-\u0587
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_id.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_id.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_id.txt
deleted file mode 100644
index 4617f83..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_id.txt
+++ /dev/null
@@ -1,359 +0,0 @@
-# from appendix D of: A Study of Stemming Effects on Information
-# Retrieval in Bahasa Indonesia
-ada
-adanya
-adalah
-adapun
-agak
-agaknya
-agar
-akan
-akankah
-akhirnya
-aku
-akulah
-amat
-amatlah
-anda
-andalah
-antar
-diantaranya
-antara
-antaranya
-diantara
-apa
-apaan
-mengapa
-apabila
-apakah
-apalagi
-apatah
-atau
-ataukah
-ataupun
-bagai
-bagaikan
-sebagai
-sebagainya
-bagaimana
-bagaimanapun
-sebagaimana
-bagaimanakah
-bagi
-bahkan
-bahwa
-bahwasanya
-sebaliknya
-banyak
-sebanyak
-beberapa
-seberapa
-begini
-beginian
-beginikah
-beginilah
-sebegini
-begitu
-begitukah
-begitulah
-begitupun
-sebegitu
-belum
-belumlah
-sebelum
-sebelumnya
-sebenarnya
-berapa
-berapakah
-berapalah
-berapapun
-betulkah
-sebetulnya
-biasa
-biasanya
-bila
-bilakah
-bisa
-bisakah
-sebisanya
-boleh
-bolehkah
-bolehlah
-buat
-bukan
-bukankah
-bukanlah
-bukannya
-cuma
-percuma
-dahulu
-dalam
-dan
-dapat
-dari
-daripada
-dekat
-demi
-demikian
-demikianlah
-sedemikian
-dengan
-depan
-di
-dia
-dialah
-dini
-diri
-dirinya
-terdiri
-dong
-dulu
-enggak
-enggaknya
-entah
-entahlah
-terhadap
-terhadapnya
-hal
-hampir
-hanya
-hanyalah
-harus
-haruslah
-harusnya
-seharusnya
-hendak
-hendaklah
-hendaknya
-hingga
-sehingga
-ia
-ialah
-ibarat
-ingin
-inginkah
-inginkan
-ini
-inikah
-inilah
-itu
-itukah
-itulah
-jangan
-jangankan
-janganlah
-jika
-jikalau
-juga
-justru
-kala
-kalau
-kalaulah
-kalaupun
-kalian
-kami
-kamilah
-kamu
-kamulah
-kan
-kapan
-kapankah
-kapanpun
-dikarenakan
-karena
-karenanya
-ke
-kecil
-kemudian
-kenapa
-kepada
-kepadanya
-ketika
-seketika
-khususnya
-kini
-kinilah
-kiranya
-sekiranya
-kita
-kitalah
-kok
-lagi
-lagian
-selagi
-lah
-lain
-lainnya
-melainkan
-selaku
-lalu
-melalui
-terlalu
-lama
-lamanya
-selama
-selama
-selamanya
-lebih
-terlebih
-bermacam
-macam
-semacam
-maka
-makanya
-makin
-malah
-malahan
-mampu
-mampukah
-mana
-manakala
-manalagi
-masih
-masihkah
-semasih
-masing
-mau
-maupun
-semaunya
-memang
-mereka
-merekalah
-meski
-meskipun
-semula
-mungkin
-mungkinkah
-nah
-namun
-nanti
-nantinya
-nyaris
-oleh
-olehnya
-seorang
-seseorang
-pada
-padanya
-padahal
-paling
-sepanjang
-pantas
-sepantasnya
-sepantasnyalah
-para
-pasti
-pastilah
-per
-pernah
-pula
-pun
-merupakan
-rupanya
-serupa
-saat
-saatnya
-sesaat
-saja
-sajalah
-saling
-bersama
-sama
-sesama
-sambil
-sampai
-sana
-sangat
-sangatlah
-saya
-sayalah
-se
-sebab
-sebabnya
-sebuah
-tersebut
-tersebutlah
-sedang
-sedangkan
-sedikit
-sedikitnya
-segala
-segalanya
-segera
-sesegera
-sejak
-sejenak
-sekali
-sekalian
-sekalipun
-sesekali
-sekaligus
-sekarang
-sekarang
-sekitar
-sekitarnya
-sela
-selain
-selalu
-seluruh
-seluruhnya
-semakin
-sementara
-sempat
-semua
-semuanya
-sendiri
-sendirinya
-seolah
-seperti
-sepertinya
-sering
-seringnya
-serta
-siapa
-siapakah
-siapapun
-disini
-disinilah
-sini
-sinilah
-sesuatu
-sesuatunya
-suatu
-sesudah
-sesudahnya
-sudah
-sudahkah
-sudahlah
-supaya
-tadi
-tadinya
-tak
-tanpa
-setelah
-telah
-tentang
-tentu
-tentulah
-tentunya
-tertentu
-seterusnya
-tapi
-tetapi
-setiap
-tiap
-setidaknya
-tidak
-tidakkah
-tidaklah
-toh
-waduh
-wah
-wahai
-sewaktu
-walau
-walaupun
-wong
-yaitu
-yakni
-yang
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_it.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_it.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_it.txt
deleted file mode 100644
index 1219cc7..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_it.txt
+++ /dev/null
@@ -1,303 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/italian/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
- | An Italian stop word list. Comments begin with vertical bar. Each stop
- | word is at the start of a line.
-
-ad | a (to) before vowel
-al | a + il
-allo | a + lo
-ai | a + i
-agli | a + gli
-all | a + l'
-agl | a + gl'
-alla | a + la
-alle | a + le
-con | with
-col | con + il
-coi | con + i (forms collo, cogli etc are now very rare)
-da | from
-dal | da + il
-dallo | da + lo
-dai | da + i
-dagli | da + gli
-dall | da + l'
-dagl | da + gll'
-dalla | da + la
-dalle | da + le
-di | of
-del | di + il
-dello | di + lo
-dei | di + i
-degli | di + gli
-dell | di + l'
-degl | di + gl'
-della | di + la
-delle | di + le
-in | in
-nel | in + el
-nello | in + lo
-nei | in + i
-negli | in + gli
-nell | in + l'
-negl | in + gl'
-nella | in + la
-nelle | in + le
-su | on
-sul | su + il
-sullo | su + lo
-sui | su + i
-sugli | su + gli
-sull | su + l'
-sugl | su + gl'
-sulla | su + la
-sulle | su + le
-per | through, by
-tra | among
-contro | against
-io | I
-tu | thou
-lui | he
-lei | she
-noi | we
-voi | you
-loro | they
-mio | my
-mia |
-miei |
-mie |
-tuo |
-tua |
-tuoi | thy
-tue |
-suo |
-sua |
-suoi | his, her
-sue |
-nostro | our
-nostra |
-nostri |
-nostre |
-vostro | your
-vostra |
-vostri |
-vostre |
-mi | me
-ti | thee
-ci | us, there
-vi | you, there
-lo | him, the
-la | her, the
-li | them
-le | them, the
-gli | to him, the
-ne | from there etc
-il | the
-un | a
-uno | a
-una | a
-ma | but
-ed | and
-se | if
-perché | why, because
-anche | also
-come | how
-dov | where (as dov')
-dove | where
-che | who, that
-chi | who
-cui | whom
-non | not
-più | more
-quale | who, that
-quanto | how much
-quanti |
-quanta |
-quante |
-quello | that
-quelli |
-quella |
-quelle |
-questo | this
-questi |
-questa |
-queste |
-si | yes
-tutto | all
-tutti | all
-
- | single letter forms:
-
-a | at
-c | as c' for ce or ci
-e | and
-i | the
-l | as l'
-o | or
-
- | forms of avere, to have (not including the infinitive):
-
-ho
-hai
-ha
-abbiamo
-avete
-hanno
-abbia
-abbiate
-abbiano
-avrò
-avrai
-avrà
-avremo
-avrete
-avranno
-avrei
-avresti
-avrebbe
-avremmo
-avreste
-avrebbero
-avevo
-avevi
-aveva
-avevamo
-avevate
-avevano
-ebbi
-avesti
-ebbe
-avemmo
-aveste
-ebbero
-avessi
-avesse
-avessimo
-avessero
-avendo
-avuto
-avuta
-avuti
-avute
-
- | forms of essere, to be (not including the infinitive):
-sono
-sei
-è
-siamo
-siete
-sia
-siate
-siano
-sarò
-sarai
-sarà
-saremo
-sarete
-saranno
-sarei
-saresti
-sarebbe
-saremmo
-sareste
-sarebbero
-ero
-eri
-era
-eravamo
-eravate
-erano
-fui
-fosti
-fu
-fummo
-foste
-furono
-fossi
-fosse
-fossimo
-fossero
-essendo
-
- | forms of fare, to do (not including the infinitive, fa, fat-):
-faccio
-fai
-facciamo
-fanno
-faccia
-facciate
-facciano
-farò
-farai
-farà
-faremo
-farete
-faranno
-farei
-faresti
-farebbe
-faremmo
-fareste
-farebbero
-facevo
-facevi
-faceva
-facevamo
-facevate
-facevano
-feci
-facesti
-fece
-facemmo
-faceste
-fecero
-facessi
-facesse
-facessimo
-facessero
-facendo
-
- | forms of stare, to be (not including the infinitive):
-sto
-stai
-sta
-stiamo
-stanno
-stia
-stiate
-stiano
-starò
-starai
-starà
-staremo
-starete
-staranno
-starei
-staresti
-starebbe
-staremmo
-stareste
-starebbero
-stavo
-stavi
-stava
-stavamo
-stavate
-stavano
-stetti
-stesti
-stette
-stemmo
-steste
-stettero
-stessi
-stesse
-stessimo
-stessero
-stando
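Note on the Snowball layout used by the file above: its header states that comments begin with a vertical bar and that each stop word is at the start of a line. The Python sketch below reads text in that layout. `parse_snowball_stopwords` is a hypothetical helper written for illustration, not a Lucene or Solr API, and it assumes that every whitespace-separated token left before the bar is a stop word.

```python
def parse_snowball_stopwords(text):
    """Return the stop words found in Snowball-style text (hypothetical helper)."""
    words = []
    for line in text.splitlines():
        # Everything from the first vertical bar to the end of the line is a comment.
        content = line.split("|", 1)[0].strip()
        # Assumption: whatever remains is one or more whitespace-separated stop words.
        words.extend(content.split())
    return words


# A few lines in the same layout as the deleted file (without the diff's "-" prefix).
SAMPLE = """\
 | An Italian stop word list. Comments begin with vertical bar.

ad | a (to) before vowel
al | a + il
ho
"""

print(parse_snowball_stopwords(SAMPLE))
```

Comment-only lines and blank lines contribute nothing, so the helper can be run over a whole file without special-casing the license header.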
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ja.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ja.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ja.txt
deleted file mode 100644
index d4321be..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ja.txt
+++ /dev/null
@@ -1,127 +0,0 @@
-#
-# This file defines a stopword set for Japanese.
-#
-# This set is made up of hand-picked frequent terms from segmented Japanese Wikipedia.
-# Punctuation characters and frequent kanji have mostly been left out. See LUCENE-3745
-# for frequency lists, etc. that can be useful for making your own set (if desired)
-#
-# Note that there is an overlap between these stopwords and the terms stopped when used
-# in combination with the JapanesePartOfSpeechStopFilter. When editing this file, note
-# that comments are not allowed on the same line as stopwords.
-#
-# Also note that stopping is done in a case-insensitive manner. Change your StopFilter
-# configuration if you need case-sensitive stopping. Lastly, note that stopping is done
-# using the same character width as the entries in this file. Since this StopFilter is
-# normally done after a CJKWidthFilter in your chain, you would usually want your romaji
-# entries to be in half-width and your kana entries to be in full-width.
-#
-\u306e
-\u306b
-\u306f
-\u3092
-\u305f
-\u304c
-\u3067
-\u3066
-\u3068
-\u3057
-\u308c
-\u3055
-\u3042\u308b
-\u3044\u308b
-\u3082
-\u3059\u308b
-\u304b\u3089
-\u306a
-\u3053\u3068
-\u3068\u3057\u3066
-\u3044
-\u3084
-\u308c\u308b
-\u306a\u3069
-\u306a\u3063
-\u306a\u3044
-\u3053\u306e
-\u305f\u3081
-\u305d\u306e
-\u3042\u3063
-\u3088\u3046
-\u307e\u305f
-\u3082\u306e
-\u3068\u3044\u3046
-\u3042\u308a
-\u307e\u3067
-\u3089\u308c
-\u306a\u308b
-\u3078
-\u304b
-\u3060
-\u3053\u308c
-\u306b\u3088\u3063\u3066
-\u306b\u3088\u308a
-\u304a\u308a
-\u3088\u308a
-\u306b\u3088\u308b
-\u305a
-\u306a\u308a
-\u3089\u308c\u308b
-\u306b\u304a\u3044\u3066
-\u3070
-\u306a\u304b\u3063
-\u306a\u304f
-\u3057\u304b\u3057
-\u306b\u3064\u3044\u3066
-\u305b
-\u3060\u3063
-\u305d\u306e\u5f8c
-\u3067\u304d\u308b
-\u305d\u308c
-\u3046
-\u306e\u3067
-\u306a\u304a
-\u306e\u307f
-\u3067\u304d
-\u304d
-\u3064
-\u306b\u304a\u3051\u308b
-\u304a\u3088\u3073
-\u3044\u3046
-\u3055\u3089\u306b
-\u3067\u3082
-\u3089
-\u305f\u308a
-\u305d\u306e\u4ed6
-\u306b\u95a2\u3059\u308b
-\u305f\u3061
-\u307e\u3059
-\u3093
-\u306a\u3089
-\u306b\u5bfe\u3057\u3066
-\u7279\u306b
-\u305b\u308b
-\u53ca\u3073
-\u3053\u308c\u3089
-\u3068\u304d
-\u3067\u306f
-\u306b\u3066
-\u307b\u304b
-\u306a\u304c\u3089
-\u3046\u3061
-\u305d\u3057\u3066
-\u3068\u3068\u3082\u306b
-\u305f\u3060\u3057
-\u304b\u3064\u3066
-\u305d\u308c\u305e\u308c
-\u307e\u305f\u306f
-\u304a
-\u307b\u3069
-\u3082\u306e\u306e
-\u306b\u5bfe\u3059\u308b
-\u307b\u3068\u3093\u3069
-\u3068\u5171\u306b
-\u3068\u3044\u3063\u305f
-\u3067\u3059
-\u3068\u3082
-\u3068\u3053\u308d
-\u3053\u3053
-##### End of file
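Note on the header of the Japanese file above: it says stopping is case-insensitive and is done using the same character width as the entries in the file. The Python sketch below only illustrates that idea. It uses Unicode NFKC normalization as a stand-in for CJKWidthFilter (an assumption: NFKC folds more than width variants), and `normalize`, `load_stopwords` and `is_stopword` are hypothetical names, not Lucene APIs.

```python
import unicodedata


def normalize(token):
    # Assumption: NFKC approximates width folding (full-width romaji to
    # half-width, half-width kana to full-width); lower() makes the
    # comparison case-insensitive.
    return unicodedata.normalize("NFKC", token).lower()


def load_stopwords(text):
    """Read one entry per line, skipping blank lines and '#' comment lines."""
    stopwords = set()
    for line in text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#"):
            stopwords.add(normalize(entry))
    return stopwords


def is_stopword(token, stopwords):
    return normalize(token) in stopwords


# Illustrative entries only: one romaji entry and one kana entry from the list.
SAMPLE = "# comment line\nabc\n\u3053\u3053\n"
STOPWORDS = load_stopwords(SAMPLE)

# Full-width upper-case romaji matches the half-width lower-case entry.
print(is_stopword("\uff21\uff22\uff23", STOPWORDS))
```

Normalizing both the entries and the incoming tokens the same way is what makes the width of the entries in the file irrelevant in this sketch; a real analysis chain relies on filter order instead.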
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_lv.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_lv.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_lv.txt
deleted file mode 100644
index e21a23c..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_lv.txt
+++ /dev/null
@@ -1,172 +0,0 @@
-# Set of Latvian stopwords from A Stemming Algorithm for Latvian, Karlis Kreslins
-# the original list of over 800 forms was refined:
-# pronouns, adverbs, interjections were removed
-#
-# prepositions
-aiz
-ap
-ar
-apak\u0161
-\u0101rpus
-aug\u0161pus
-bez
-caur
-d\u0113\u013c
-gar
-iek\u0161
-iz
-kop\u0161
-labad
-lejpus
-l\u012bdz
-no
-otrpus
-pa
-par
-p\u0101r
-p\u0113c
-pie
-pirms
-pret
-priek\u0161
-starp
-\u0161aipus
-uz
-vi\u0146pus
-virs
-virspus
-zem
-apak\u0161pus
-# Conjunctions
-un
-bet
-jo
-ja
-ka
-lai
-tom\u0113r
-tikko
-turpret\u012b
-ar\u012b
-kaut
-gan
-t\u0101d\u0113\u013c
-t\u0101
-ne
-tikvien
-vien
-k\u0101
-ir
-te
-vai
-kam\u0113r
-# Particles
-ar
-diezin
-dro\u0161i
-diem\u017e\u0113l
-neb\u016bt
-ik
-it
-ta\u010du
-nu
-pat
-tiklab
-iek\u0161pus
-nedz
-tik
-nevis
-turpretim
-jeb
-iekam
-iek\u0101m
-iek\u0101ms
-kol\u012bdz
-l\u012bdzko
-tikl\u012bdz
-jeb\u0161u
-t\u0101lab
-t\u0101p\u0113c
-nek\u0101
-itin
-j\u0101
-jau
-jel
-n\u0113
-nezin
-tad
-tikai
-vis
-tak
-iekams
-vien
-# modal verbs
-b\u016bt
-biju
-biji
-bija
-bij\u0101m
-bij\u0101t
-esmu
-esi
-esam
-esat
-b\u016b\u0161u
-b\u016bsi
-b\u016bs
-b\u016bsim
-b\u016bsiet
-tikt
-tiku
-tiki
-tika
-tik\u0101m
-tik\u0101t
-tieku
-tiec
-tiek
-tiekam
-tiekat
-tik\u0161u
-tiks
-tiksim
-tiksiet
-tapt
-tapi
-tap\u0101t
-topat
-tap\u0161u
-tapsi
-taps
-tapsim
-tapsiet
-k\u013c\u016bt
-k\u013cuvu
-k\u013cuvi
-k\u013cuva
-k\u013cuv\u0101m
-k\u013cuv\u0101t
-k\u013c\u016bstu
-k\u013c\u016bsti
-k\u013c\u016bst
-k\u013c\u016bstam
-k\u013c\u016bstat
-k\u013c\u016b\u0161u
-k\u013c\u016bsi
-k\u013c\u016bs
-k\u013c\u016bsim
-k\u013c\u016bsiet
-# verbs
-var\u0113t
-var\u0113ju
-var\u0113j\u0101m
-var\u0113\u0161u
-var\u0113sim
-var
-var\u0113ji
-var\u0113j\u0101t
-var\u0113si
-var\u0113siet
-varat
-var\u0113ja
-var\u0113s
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_nl.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_nl.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_nl.txt
deleted file mode 100644
index 47a2aea..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_nl.txt
+++ /dev/null
@@ -1,119 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/dutch/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
- | A Dutch stop word list. Comments begin with vertical bar. Each stop
- | word is at the start of a line.
-
- | This is a ranked list (commonest to rarest) of stopwords derived from
- | a large sample of Dutch text.
-
- | Dutch stop words frequently exhibit homonym clashes. These are indicated
- | clearly below.
-
-de | the
-en | and
-van | of, from
-ik | I, the ego
-te | (1) chez, at etc, (2) to, (3) too
-dat | that, which
-die | that, those, who, which
-in | in, inside
-een | a, an, one
-hij | he
-het | the, it
-niet | not, nothing, naught
-zijn | (1) to be, being, (2) his, one's, its
-is | is
-was | (1) was, past tense of all persons sing. of 'zijn' (to be) (2) wax, (3) the washing, (4) rise of river
-op | on, upon, at, in, up, used up
-aan | on, upon, to (as dative)
-met | with, by
-als | like, such as, when
-voor | (1) before, in front of, (2) furrow
-had | had, past tense all persons sing. of 'hebben' (have)
-er | there
-maar | but, only
-om | round, about, for etc
-hem | him
-dan | then
-zou | should/would, past tense all persons sing. of 'zullen'
-of | or, whether, if
-wat | what, something, anything
-mijn | possessive and noun 'mine'
-men | people, 'one'
-dit | this
-zo | so, thus, in this way
-door | through by
-over | over, across
-ze | she, her, they, them
-zich | oneself
-bij | (1) a bee, (2) by, near, at
-ook | also, too
-tot | till, until
-je | you
-mij | me
-uit | out of, from
-der | Old Dutch form of 'van der' still found in surnames
-daar | (1) there, (2) because
-haar | (1) her, their, them, (2) hair
-naar | (1) unpleasant, unwell etc, (2) towards, (3) as
-heb | present first person sing. of 'to have'
-hoe | how, why
-heeft | present third person sing. of 'to have'
-hebben | 'to have' and various parts thereof
-deze | this
-u | you
-want | (1) for, (2) mitten, (3) rigging
-nog | yet, still
-zal | 'shall', first and third person sing. of verb 'zullen' (will)
-me | me
-zij | she, they
-nu | now
-ge | 'thou', still used in Belgium and south Netherlands
-geen | none
-omdat | because
-iets | something, somewhat
-worden | to become, grow, get
-toch | yet, still
-al | all, every, each
-waren | (1) 'were' (2) to wander, (3) wares, (3)
-veel | much, many
-meer | (1) more, (2) lake
-doen | to do, to make
-toen | then, when
-moet | noun 'spot/mote' and present form of 'to must'
-ben | (1) am, (2) 'are' in interrogative second person singular of 'to be'
-zonder | without
-kan | noun 'can' and present form of 'to be able'
-hun | their, them
-dus | so, consequently
-alles | all, everything, anything
-onder | under, beneath
-ja | yes, of course
-eens | once, one day
-hier | here
-wie | who
-werd | imperfect third person sing. of 'become'
-altijd | always
-doch | yet, but etc
-wordt | present third person sing. of 'become'
-wezen | (1) to be, (2) 'been' as in 'been fishing', (3) orphans
-kunnen | to be able
-ons | us/our
-zelf | self
-tegen | against, towards, at
-na | after, near
-reeds | already
-wil | (1) present tense of 'want', (2) 'will', noun, (3) fender
-kon | could; past tense of 'to be able'
-niets | nothing
-uw | your
-iemand | somebody
-geweest | been; past participle of 'be'
-andere | other
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_no.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_no.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_no.txt
deleted file mode 100644
index a7a2c28..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_no.txt
+++ /dev/null
@@ -1,194 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/norwegian/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
- | A Norwegian stop word list. Comments begin with vertical bar. Each stop
- | word is at the start of a line.
-
- | This stop word list is for the dominant bokmål dialect. Words unique
- | to nynorsk are marked *.
-
- | Revised by Jan Bruusgaard <Ja...@ssb.no>, Jan 2005
-
-og | and
-i | in
-jeg | I
-det | it/this/that
-at | to (w. inf.)
-en | a/an
-et | a/an
-den | it/this/that
-til | to
-er | is/am/are
-som | who/that
-på | on
-de | they / you(formal)
-med | with
-han | he
-av | of
-ikke | not
-ikkje | not *
-der | there
-så | so
-var | was/were
-meg | me
-seg | you
-men | but
-ett | one
-har | have
-om | about
-vi | we
-min | my
-mitt | my
-ha | have
-hadde | had
-hun | she
-nå | now
-over | over
-da | when/as
-ved | by/know
-fra | from
-du | you
-ut | out
-sin | your
-dem | them
-oss | us
-opp | up
-man | you/one
-kan | can
-hans | his
-hvor | where
-eller | or
-hva | what
-skal | shall/must
-selv | self (reflective)
-sjøl | self (reflective)
-her | here
-alle | all
-vil | will
-bli | become
-ble | became
-blei | became *
-blitt | have become
-kunne | could
-inn | in
-når | when
-være | be
-kom | come
-noen | some
-noe | some
-ville | would
-dere | you
-som | who/which/that
-deres | their/theirs
-kun | only/just
-ja | yes
-etter | after
-ned | down
-skulle | should
-denne | this
-for | for/because
-deg | you
-si | hers/his
-sine | hers/his
-sitt | hers/his
-mot | against
-å | to
-meget | much
-hvorfor | why
-dette | this
-disse | these/those
-uten | without
-hvordan | how
-ingen | none
-din | your
-ditt | your
-blir | become
-samme | same
-hvilken | which
-hvilke | which (plural)
-sånn | such a
-inni | inside/within
-mellom | between
-vår | our
-hver | each
-hvem | who
-vors | us/ours
-hvis | whose
-både | both
-bare | only/just
-enn | than
-fordi | as/because
-før | before
-mange | many
-også | also
-slik | just
-vært | been
-være | to be
-båe | both *
-begge | both
-siden | since
-dykk | your *
-dykkar | yours *
-dei | they *
-deira | them *
-deires | theirs *
-deim | them *
-di | your (fem.) *
-då | as/when *
-eg | I *
-ein | a/an *
-eit | a/an *
-eitt | a/an *
-elles | or *
-honom | he *
-hjå | at *
-ho | she *
-hoe | she *
-henne | her
-hennar | her/hers
-hennes | hers
-hoss | how *
-hossen | how *
-ikkje | not *
-ingi | noone *
-inkje | noone *
-korleis | how *
-korso | how *
-kva | what/which *
-kvar | where *
-kvarhelst | where *
-kven | who/whom *
-kvi | why *
-kvifor | why *
-me | we *
-medan | while *
-mi | my *
-mine | my *
-mykje | much *
-no | now *
-nokon | some (masc./neut.) *
-noka | some (fem.) *
-nokor | some *
-noko | some *
-nokre | some *
-si | his/hers *
-sia | since *
-sidan | since *
-so | so *
-somt | some *
-somme | some *
-um | about*
-upp | up *
-vere | be *
-vore | was *
-verte | become *
-vort | become *
-varte | became *
-vart | became *
-
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_pt.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_pt.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_pt.txt
deleted file mode 100644
index acfeb01..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_pt.txt
+++ /dev/null
@@ -1,253 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/portuguese/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
- | A Portuguese stop word list. Comments begin with vertical bar. Each stop
- | word is at the start of a line.
-
-
- | The following is a ranked list (commonest to rarest) of stopwords
- | deriving from a large sample of text.
-
- | Extra words have been added at the end.
-
-de | of, from
-a | the; to, at; her
-o | the; him
-que | who, that
-e | and
-do | de + o
-da | de + a
-em | in
-um | a
-para | for
- | é from SER
-com | with
-não | not, no
-uma | a
-os | the; them
-no | em + o
-se | himself etc
-na | em + a
-por | for
-mais | more
-as | the; them
-dos | de + os
-como | as, like
-mas | but
- | foi from SER
-ao | a + o
-ele | he
-das | de + as
- | tem from TER
-à | a + a
-seu | his
-sua | her
-ou | or
- | ser from SER
-quando | when
-muito | much
- | há from HAV
-nos | em + os; us
-já | already, now
- | está from EST
-eu | I
-também | also
-só | only, just
-pelo | per + o
-pela | per + a
-até | up to
-isso | that
-ela | he
-entre | between
- | era from SER
-depois | after
-sem | without
-mesmo | same
-aos | a + os
- | ter from TER
-seus | his
-quem | whom
-nas | em + as
-me | me
-esse | that
-eles | they
- | estão from EST
-você | you
- | tinha from TER
- | foram from SER
-essa | that
-num | em + um
-nem | nor
-suas | her
-meu | my
-às | a + as
-minha | my
- | têm from TER
-numa | em + uma
-pelos | per + os
-elas | they
- | havia from HAV
- | seja from SER
-qual | which
- | será from SER
-nós | we
- | tenho from TER
-lhe | to him, her
-deles | of them
-essas | those
-esses | those
-pelas | per + as
-este | this
- | fosse from SER
-dele | of him
-
- | other words. There are many contractions such as naquele = em+aquele,
- | mo = me+o, but they are rare.
- | Indefinite article plural forms are also rare.
-
-tu | thou
-te | thee
-vocês | you (plural)
-vos | you
-lhes | to them
-meus | my
-minhas
-teu | thy
-tua
-teus
-tuas
-nosso | our
-nossa
-nossos
-nossas
-
-dela | of her
-delas | of them
-
-esta | this
-estes | these
-estas | these
-aquele | that
-aquela | that
-aqueles | those
-aquelas | those
-isto | this
-aquilo | that
-
- | forms of estar, to be (not including the infinitive):
-estou
-está
-estamos
-estão
-estive
-esteve
-estivemos
-estiveram
-estava
-estávamos
-estavam
-estivera
-estivéramos
-esteja
-estejamos
-estejam
-estivesse
-estivéssemos
-estivessem
-estiver
-estivermos
-estiverem
-
- | forms of haver, to have (not including the infinitive):
-hei
-há
-havemos
-hão
-houve
-houvemos
-houveram
-houvera
-houvéramos
-haja
-hajamos
-hajam
-houvesse
-houvéssemos
-houvessem
-houver
-houvermos
-houverem
-houverei
-houverá
-houveremos
-houverão
-houveria
-houveríamos
-houveriam
-
- | forms of ser, to be (not including the infinitive):
-sou
-somos
-são
-era
-éramos
-eram
-fui
-foi
-fomos
-foram
-fora
-fôramos
-seja
-sejamos
-sejam
-fosse
-fôssemos
-fossem
-for
-formos
-forem
-serei
-será
-seremos
-serão
-seria
-seríamos
-seriam
-
- | forms of ter, to have (not including the infinitive):
-tenho
-tem
-temos
-têm
-tinha
-tínhamos
-tinham
-tive
-teve
-tivemos
-tiveram
-tivera
-tivéramos
-tenha
-tenhamos
-tenham
-tivesse
-tivéssemos
-tivessem
-tiver
-tivermos
-tiverem
-terei
-terá
-teremos
-terão
-teria
-teríamos
-teriam
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ro.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ro.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ro.txt
deleted file mode 100644
index 4fdee90..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ro.txt
+++ /dev/null
@@ -1,233 +0,0 @@
-# This file was created by Jacques Savoy and is distributed under the BSD license.
-# See http://members.unine.ch/jacques.savoy/clef/index.html.
-# Also see http://www.opensource.org/licenses/bsd-license.html
-acea
-aceasta
-aceast\u0103
-aceea
-acei
-aceia
-acel
-acela
-acele
-acelea
-acest
-acesta
-aceste
-acestea
-ace\u015fti
-ace\u015ftia
-acolo
-acum
-ai
-aia
-aib\u0103
-aici
-al
-\u0103la
-ale
-alea
-\u0103lea
-altceva
-altcineva
-am
-ar
-are
-a\u015f
-a\u015fadar
-asemenea
-asta
-\u0103sta
-ast\u0103zi
-astea
-\u0103stea
-\u0103\u015ftia
-asupra
-a\u0163i
-au
-avea
-avem
-ave\u0163i
-azi
-bine
-bucur
-bun\u0103
-ca
-c\u0103
-c\u0103ci
-când
-care
-c\u0103rei
-c\u0103ror
-c\u0103rui
-cât
-câte
-câ\u0163i
-c\u0103tre
-câtva
-ce
-cel
-ceva
-chiar
-cînd
-cine
-cineva
-cît
-cîte
-cî\u0163i
-cîtva
-contra
-cu
-cum
-cumva
-curând
-curînd
-da
-d\u0103
-dac\u0103
-dar
-datorit\u0103
-de
-deci
-deja
-deoarece
-departe
-de\u015fi
-din
-dinaintea
-dintr
-dintre
-drept
-dup\u0103
-ea
-ei
-el
-ele
-eram
-este
-e\u015fti
-eu
-face
-f\u0103r\u0103
-fi
-fie
-fiecare
-fii
-fim
-fi\u0163i
-iar
-ieri
-îi
-îl
-îmi
-împotriva
-în
-înainte
-înaintea
-încât
-încît
-încotro
-între
-întrucât
-întrucît
-î\u0163i
-la
-lâng\u0103
-le
-li
-lîng\u0103
-lor
-lui
-m\u0103
-mâine
-mea
-mei
-mele
-mereu
-meu
-mi
-mine
-mult
-mult\u0103
-mul\u0163i
-ne
-nic\u0103ieri
-nici
-nimeni
-ni\u015fte
-noastr\u0103
-noastre
-noi
-no\u015ftri
-nostru
-nu
-ori
-oricând
-oricare
-oricât
-orice
-oricînd
-oricine
-oricît
-oricum
-oriunde
-pân\u0103
-pe
-pentru
-peste
-pîn\u0103
-poate
-pot
-prea
-prima
-primul
-prin
-printr
-sa
-s\u0103
-s\u0103i
-sale
-sau
-s\u0103u
-se
-\u015fi
-sînt
-sîntem
-sînte\u0163i
-spre
-sub
-sunt
-suntem
-sunte\u0163i
-ta
-t\u0103i
-tale
-t\u0103u
-te
-\u0163i
-\u0163ie
-tine
-toat\u0103
-toate
-tot
-to\u0163i
-totu\u015fi
-tu
-un
-una
-unde
-undeva
-unei
-unele
-uneori
-unor
-v\u0103
-vi
-voastr\u0103
-voastre
-voi
-vo\u015ftri
-vostru
-vou\u0103
-vreo
-vreun
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ru.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ru.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ru.txt
deleted file mode 100644
index 5527140..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_ru.txt
+++ /dev/null
@@ -1,243 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/russian/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
- | a russian stop word list. comments begin with vertical bar. each stop
- | word is at the start of a line.
-
- | this is a ranked list (commonest to rarest) of stopwords derived from
- | a large text sample.
-
- | letter `\u0451' is translated to `\u0435'.
-
-\u0438 | and
-\u0432 | in/into
-\u0432\u043e | alternative form
-\u043d\u0435 | not
-\u0447\u0442\u043e | what/that
-\u043e\u043d | he
-\u043d\u0430 | on/onto
-\u044f | i
-\u0441 | from
-\u0441\u043e | alternative form
-\u043a\u0430\u043a | how
-\u0430 | milder form of `no' (but)
-\u0442\u043e | conjunction and form of `that'
-\u0432\u0441\u0435 | all
-\u043e\u043d\u0430 | she
-\u0442\u0430\u043a | so, thus
-\u0435\u0433\u043e | him
-\u043d\u043e | but
-\u0434\u0430 | yes/and
-\u0442\u044b | thou
-\u043a | towards, by
-\u0443 | around, chez
-\u0436\u0435 | intensifier particle
-\u0432\u044b | you
-\u0437\u0430 | beyond, behind
-\u0431\u044b | conditional/subj. particle
-\u043f\u043e | up to, along
-\u0442\u043e\u043b\u044c\u043a\u043e | only
-\u0435\u0435 | her
-\u043c\u043d\u0435 | to me
-\u0431\u044b\u043b\u043e | it was
-\u0432\u043e\u0442 | here is/are, particle
-\u043e\u0442 | away from
-\u043c\u0435\u043d\u044f | me
-\u0435\u0449\u0435 | still, yet, more
-\u043d\u0435\u0442 | no, there isnt/arent
-\u043e | about
-\u0438\u0437 | out of
-\u0435\u043c\u0443 | to him
-\u0442\u0435\u043f\u0435\u0440\u044c | now
-\u043a\u043e\u0433\u0434\u0430 | when
-\u0434\u0430\u0436\u0435 | even
-\u043d\u0443 | so, well
-\u0432\u0434\u0440\u0443\u0433 | suddenly
-\u043b\u0438 | interrogative particle
-\u0435\u0441\u043b\u0438 | if
-\u0443\u0436\u0435 | already, but homonym of `narrower'
-\u0438\u043b\u0438 | or
-\u043d\u0438 | neither
-\u0431\u044b\u0442\u044c | to be
-\u0431\u044b\u043b | he was
-\u043d\u0435\u0433\u043e | prepositional form of \u0435\u0433\u043e
-\u0434\u043e | up to
-\u0432\u0430\u0441 | you accusative
-\u043d\u0438\u0431\u0443\u0434\u044c | indef. suffix preceded by hyphen
-\u043e\u043f\u044f\u0442\u044c | again
-\u0443\u0436 | already, but homonym of `adder'
-\u0432\u0430\u043c | to you
-\u0441\u043a\u0430\u0437\u0430\u043b | he said
-\u0432\u0435\u0434\u044c | particle `after all'
-\u0442\u0430\u043c | there
-\u043f\u043e\u0442\u043e\u043c | then
-\u0441\u0435\u0431\u044f | oneself
-\u043d\u0438\u0447\u0435\u0433\u043e | nothing
-\u0435\u0439 | to her
-\u043c\u043e\u0436\u0435\u0442 | usually with `\u0431\u044b\u0442\u044c' as `maybe'
-\u043e\u043d\u0438 | they
-\u0442\u0443\u0442 | here
-\u0433\u0434\u0435 | where
-\u0435\u0441\u0442\u044c | there is/are
-\u043d\u0430\u0434\u043e | got to, must
-\u043d\u0435\u0439 | prepositional form of \u0435\u0439
-\u0434\u043b\u044f | for
-\u043c\u044b | we
-\u0442\u0435\u0431\u044f | thee
-\u0438\u0445 | them, their
-\u0447\u0435\u043c | than
-\u0431\u044b\u043b\u0430 | she was
-\u0441\u0430\u043c | self
-\u0447\u0442\u043e\u0431 | in order to
-\u0431\u0435\u0437 | without
-\u0431\u0443\u0434\u0442\u043e | as if
-\u0447\u0435\u043b\u043e\u0432\u0435\u043a | man, person, one
-\u0447\u0435\u0433\u043e | genitive form of `what'
-\u0440\u0430\u0437 | once
-\u0442\u043e\u0436\u0435 | also
-\u0441\u0435\u0431\u0435 | to oneself
-\u043f\u043e\u0434 | beneath
-\u0436\u0438\u0437\u043d\u044c | life
-\u0431\u0443\u0434\u0435\u0442 | will be
-\u0436 | short form of intensifer particle `\u0436\u0435'
-\u0442\u043e\u0433\u0434\u0430 | then
-\u043a\u0442\u043e | who
-\u044d\u0442\u043e\u0442 | this
-\u0433\u043e\u0432\u043e\u0440\u0438\u043b | was saying
-\u0442\u043e\u0433\u043e | genitive form of `that'
-\u043f\u043e\u0442\u043e\u043c\u0443 | for that reason
-\u044d\u0442\u043e\u0433\u043e | genitive form of `this'
-\u043a\u0430\u043a\u043e\u0439 | which
-\u0441\u043e\u0432\u0441\u0435\u043c | altogether
-\u043d\u0438\u043c | prepositional form of `\u0435\u0433\u043e', `\u043e\u043d\u0438'
-\u0437\u0434\u0435\u0441\u044c | here
-\u044d\u0442\u043e\u043c | prepositional form of `\u044d\u0442\u043e\u0442'
-\u043e\u0434\u0438\u043d | one
-\u043f\u043e\u0447\u0442\u0438 | almost
-\u043c\u043e\u0439 | my
-\u0442\u0435\u043c | instrumental/dative plural of `\u0442\u043e\u0442', `\u0442\u043e'
-\u0447\u0442\u043e\u0431\u044b | full form of `in order that'
-\u043d\u0435\u0435 | her (acc.)
-\u043a\u0430\u0436\u0435\u0442\u0441\u044f | it seems
-\u0441\u0435\u0439\u0447\u0430\u0441 | now
-\u0431\u044b\u043b\u0438 | they were
-\u043a\u0443\u0434\u0430 | where to
-\u0437\u0430\u0447\u0435\u043c | why
-\u0441\u043a\u0430\u0437\u0430\u0442\u044c | to say
-\u0432\u0441\u0435\u0445 | all (acc., gen. preposn. plural)
-\u043d\u0438\u043a\u043e\u0433\u0434\u0430 | never
-\u0441\u0435\u0433\u043e\u0434\u043d\u044f | today
-\u043c\u043e\u0436\u043d\u043e | possible, one can
-\u043f\u0440\u0438 | by
-\u043d\u0430\u043a\u043e\u043d\u0435\u0446 | finally
-\u0434\u0432\u0430 | two
-\u043e\u0431 | alternative form of `\u043e', about
-\u0434\u0440\u0443\u0433\u043e\u0439 | another
-\u0445\u043e\u0442\u044c | even
-\u043f\u043e\u0441\u043b\u0435 | after
-\u043d\u0430\u0434 | above
-\u0431\u043e\u043b\u044c\u0448\u0435 | more
-\u0442\u043e\u0442 | that one (masc.)
-\u0447\u0435\u0440\u0435\u0437 | across, in
-\u044d\u0442\u0438 | these
-\u043d\u0430\u0441 | us
-\u043f\u0440\u043e | about
-\u0432\u0441\u0435\u0433\u043e | in all, only, of all
-\u043d\u0438\u0445 | prepositional form of `\u043e\u043d\u0438' (they)
-\u043a\u0430\u043a\u0430\u044f | which, feminine
-\u043c\u043d\u043e\u0433\u043e | lots
-\u0440\u0430\u0437\u0432\u0435 | interrogative particle
-\u0441\u043a\u0430\u0437\u0430\u043b\u0430 | she said
-\u0442\u0440\u0438 | three
-\u044d\u0442\u0443 | this, acc. fem. sing.
-\u043c\u043e\u044f | my, feminine
-\u0432\u043f\u0440\u043e\u0447\u0435\u043c | moreover, besides
-\u0445\u043e\u0440\u043e\u0448\u043e | good
-\u0441\u0432\u043e\u044e | ones own, acc. fem. sing.
-\u044d\u0442\u043e\u0439 | oblique form of `\u044d\u0442\u0430', fem. `this'
-\u043f\u0435\u0440\u0435\u0434 | in front of
-\u0438\u043d\u043e\u0433\u0434\u0430 | sometimes
-\u043b\u0443\u0447\u0448\u0435 | better
-\u0447\u0443\u0442\u044c | a little
-\u0442\u043e\u043c | preposn. form of `that one'
-\u043d\u0435\u043b\u044c\u0437\u044f | one must not
-\u0442\u0430\u043a\u043e\u0439 | such a one
-\u0438\u043c | to them
-\u0431\u043e\u043b\u0435\u0435 | more
-\u0432\u0441\u0435\u0433\u0434\u0430 | always
-\u043a\u043e\u043d\u0435\u0447\u043d\u043e | of course
-\u0432\u0441\u044e | acc. fem. sing of `all'
-\u043c\u0435\u0436\u0434\u0443 | between
-
-
- | b: some paradigms
- |
- | personal pronouns
- |
- | \u044f \u043c\u0435\u043d\u044f \u043c\u043d\u0435 \u043c\u043d\u043e\u0439 [\u043c\u043d\u043e\u044e]
- | \u0442\u044b \u0442\u0435\u0431\u044f \u0442\u0435\u0431\u0435 \u0442\u043e\u0431\u043e\u0439 [\u0442\u043e\u0431\u043e\u044e]
- | \u043e\u043d \u0435\u0433\u043e \u0435\u043c\u0443 \u0438\u043c [\u043d\u0435\u0433\u043e, \u043d\u0435\u043c\u0443, \u043d\u0438\u043c]
- | \u043e\u043d\u0430 \u0435\u0435 \u044d\u0438 \u0435\u044e [\u043d\u0435\u0435, \u043d\u044d\u0438, \u043d\u0435\u044e]
- | \u043e\u043d\u043e \u0435\u0433\u043e \u0435\u043c\u0443 \u0438\u043c [\u043d\u0435\u0433\u043e, \u043d\u0435\u043c\u0443, \u043d\u0438\u043c]
- |
- | \u043c\u044b \u043d\u0430\u0441 \u043d\u0430\u043c \u043d\u0430\u043c\u0438
- | \u0432\u044b \u0432\u0430\u0441 \u0432\u0430\u043c \u0432\u0430\u043c\u0438
- | \u043e\u043d\u0438 \u0438\u0445 \u0438\u043c \u0438\u043c\u0438 [\u043d\u0438\u0445, \u043d\u0438\u043c, \u043d\u0438\u043c\u0438]
- |
- | \u0441\u0435\u0431\u044f \u0441\u0435\u0431\u0435 \u0441\u043e\u0431\u043e\u0439 [\u0441\u043e\u0431\u043e\u044e]
- |
- | demonstrative pronouns: \u044d\u0442\u043e\u0442 (this), \u0442\u043e\u0442 (that)
- |
- | \u044d\u0442\u043e\u0442 \u044d\u0442\u0430 \u044d\u0442\u043e \u044d\u0442\u0438
- | \u044d\u0442\u043e\u0433\u043e \u044d\u0442\u044b \u044d\u0442\u043e \u044d\u0442\u0438
- | \u044d\u0442\u043e\u0433\u043e \u044d\u0442\u043e\u0439 \u044d\u0442\u043e\u0433\u043e \u044d\u0442\u0438\u0445
- | \u044d\u0442\u043e\u043c\u0443 \u044d\u0442\u043e\u0439 \u044d\u0442\u043e\u043c\u0443 \u044d\u0442\u0438\u043c
- | \u044d\u0442\u0438\u043c \u044d\u0442\u043e\u0439 \u044d\u0442\u0438\u043c [\u044d\u0442\u043e\u044e] \u044d\u0442\u0438\u043c\u0438
- | \u044d\u0442\u043e\u043c \u044d\u0442\u043e\u0439 \u044d\u0442\u043e\u043c \u044d\u0442\u0438\u0445
- |
- | \u0442\u043e\u0442 \u0442\u0430 \u0442\u043e \u0442\u0435
- | \u0442\u043e\u0433\u043e \u0442\u0443 \u0442\u043e \u0442\u0435
- | \u0442\u043e\u0433\u043e \u0442\u043e\u0439 \u0442\u043e\u0433\u043e \u0442\u0435\u0445
- | \u0442\u043e\u043c\u0443 \u0442\u043e\u0439 \u0442\u043e\u043c\u0443 \u0442\u0435\u043c
- | \u0442\u0435\u043c \u0442\u043e\u0439 \u0442\u0435\u043c [\u0442\u043e\u044e] \u0442\u0435\u043c\u0438
- | \u0442\u043e\u043c \u0442\u043e\u0439 \u0442\u043e\u043c \u0442\u0435\u0445
- |
- | determinative pronouns
- |
- | (a) \u0432\u0435\u0441\u044c (all)
- |
- | \u0432\u0435\u0441\u044c \u0432\u0441\u044f \u0432\u0441\u0435 \u0432\u0441\u0435
- | \u0432\u0441\u0435\u0433\u043e \u0432\u0441\u044e \u0432\u0441\u0435 \u0432\u0441\u0435
- | \u0432\u0441\u0435\u0433\u043e \u0432\u0441\u0435\u0439 \u0432\u0441\u0435\u0433\u043e \u0432\u0441\u0435\u0445
- | \u0432\u0441\u0435\u043c\u0443 \u0432\u0441\u0435\u0439 \u0432\u0441\u0435\u043c\u0443 \u0432\u0441\u0435\u043c
- | \u0432\u0441\u0435\u043c \u0432\u0441\u0435\u0439 \u0432\u0441\u0435\u043c [\u0432\u0441\u0435\u044e] \u0432\u0441\u0435\u043c\u0438
- | \u0432\u0441\u0435\u043c \u0432\u0441\u0435\u0439 \u0432\u0441\u0435\u043c \u0432\u0441\u0435\u0445
- |
- | (b) \u0441\u0430\u043c (himself etc)
- |
- | \u0441\u0430\u043c \u0441\u0430\u043c\u0430 \u0441\u0430\u043c\u043e \u0441\u0430\u043c\u0438
- | \u0441\u0430\u043c\u043e\u0433\u043e \u0441\u0430\u043c\u0443 \u0441\u0430\u043c\u043e \u0441\u0430\u043c\u0438\u0445
- | \u0441\u0430\u043c\u043e\u0433\u043e \u0441\u0430\u043c\u043e\u0439 \u0441\u0430\u043c\u043e\u0433\u043e \u0441\u0430\u043c\u0438\u0445
- | \u0441\u0430\u043c\u043e\u043c\u0443 \u0441\u0430\u043c\u043e\u0439 \u0441\u0430\u043c\u043e\u043c\u0443 \u0441\u0430\u043c\u0438\u043c
- | \u0441\u0430\u043c\u0438\u043c \u0441\u0430\u043c\u043e\u0439 \u0441\u0430\u043c\u0438\u043c [\u0441\u0430\u043c\u043e\u044e] \u0441\u0430\u043c\u0438\u043c\u0438
- | \u0441\u0430\u043c\u043e\u043c \u0441\u0430\u043c\u043e\u0439 \u0441\u0430\u043c\u043e\u043c \u0441\u0430\u043c\u0438\u0445
- |
- | stems of verbs `to be', `to have', `to do' and modal
- |
- | \u0431\u044b\u0442\u044c \u0431\u044b \u0431\u0443\u0434 \u0431\u044b\u0432 \u0435\u0441\u0442\u044c \u0441\u0443\u0442\u044c
- | \u0438\u043c\u0435
- | \u0434\u0435\u043b
- | \u043c\u043e\u0433 \u043c\u043e\u0436 \u043c\u043e\u0447\u044c
- | \u0443\u043c\u0435
- | \u0445\u043e\u0447 \u0445\u043e\u0442
- | \u0434\u043e\u043b\u0436
- | \u043c\u043e\u0436\u043d
- | \u043d\u0443\u0436\u043d
- | \u043d\u0435\u043b\u044c\u0437\u044f
-
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_sv.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_sv.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_sv.txt
deleted file mode 100644
index 096f87f..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_sv.txt
+++ /dev/null
@@ -1,133 +0,0 @@
- | From svn.tartarus.org/snowball/trunk/website/algorithms/swedish/stop.txt
- | This file is distributed under the BSD License.
- | See http://snowball.tartarus.org/license.php
- | Also see http://www.opensource.org/licenses/bsd-license.html
- | - Encoding was converted to UTF-8.
- | - This notice was added.
- |
- | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
-
- | A Swedish stop word list. Comments begin with vertical bar. Each stop
- | word is at the start of a line.
-
- | This is a ranked list (commonest to rarest) of stopwords derived from
- | a large text sample.
-
- | Swedish stop words occasionally exhibit homonym clashes. For example
- | så = so, but also seed. These are indicated clearly below.
-
-och | and
-det | it, this/that
-att | to (with infinitive)
-i | in, at
-en | a
-jag | I
-hon | she
-som | who, that
-han | he
-på | on
-den | it, this/that
-med | with
-var | where, each
-sig | him(self) etc
-för | for
-så | so (also: seed)
-till | to
-är | is
-men | but
-ett | a
-om | if; around, about
-hade | had
-de | they, these/those
-av | of
-icke | not, no
-mig | me
-du | you
-henne | her
-då | then, when
-sin | his
-nu | now
-har | have
-inte | inte någon = no one
-hans | his
-honom | him
-skulle | 'sake'
-hennes | her
-där | there
-min | my
-man | one (pronoun)
-ej | nor
-vid | at, by, on (also: vast)
-kunde | could
-något | some etc
-från | from, off
-ut | out
-när | when
-efter | after, behind
-upp | up
-vi | we
-dem | them
-vara | be
-vad | what
-över | over
-än | than
-dig | you
-kan | can
-sina | his
-här | here
-ha | have
-mot | towards
-alla | all
-under | under (also: wonder)
-någon | some etc
-eller | or (else)
-allt | all
-mycket | much
-sedan | since
-ju | why
-denna | this/that
-själv | myself, yourself etc
-detta | this/that
-åt | to
-utan | without
-varit | was
-hur | how
-ingen | no
-mitt | my
-ni | you
-bli | to be, become
-blev | from bli
-oss | us
-din | thy
-dessa | these/those
-några | some etc
-deras | their
-blir | from bli
-mina | my
-samma | (the) same
-vilken | who, that
-er | you, your
-sådan | such a
-vår | our
-blivit | from bli
-dess | its
-inom | within
-mellan | between
-sådant | such a
-varför | why
-varje | each
-vilka | who, that
-ditt | thy
-vem | who
-vilket | who, that
-sitta | his
-sådana | such a
-vart | each
-dina | thy
-vars | whose
-vårt | our
-våra | our
-ert | your
-era | your
-vilkas | whose
-
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_th.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_th.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_th.txt
deleted file mode 100644
index 07f0fab..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_th.txt
+++ /dev/null
@@ -1,119 +0,0 @@
-# Thai stopwords from:
-# "Opinion Detection in Thai Political News Columns
-# Based on Subjectivity Analysis"
-# Khampol Sukhum, Supot Nitsuwat, and Choochart Haruechaiyasak
-\u0e44\u0e27\u0e49
-\u0e44\u0e21\u0e48
-\u0e44\u0e1b
-\u0e44\u0e14\u0e49
-\u0e43\u0e2b\u0e49
-\u0e43\u0e19
-\u0e42\u0e14\u0e22
-\u0e41\u0e2b\u0e48\u0e07
-\u0e41\u0e25\u0e49\u0e27
-\u0e41\u0e25\u0e30
-\u0e41\u0e23\u0e01
-\u0e41\u0e1a\u0e1a
-\u0e41\u0e15\u0e48
-\u0e40\u0e2d\u0e07
-\u0e40\u0e2b\u0e47\u0e19
-\u0e40\u0e25\u0e22
-\u0e40\u0e23\u0e34\u0e48\u0e21
-\u0e40\u0e23\u0e32
-\u0e40\u0e21\u0e37\u0e48\u0e2d
-\u0e40\u0e1e\u0e37\u0e48\u0e2d
-\u0e40\u0e1e\u0e23\u0e32\u0e30
-\u0e40\u0e1b\u0e47\u0e19\u0e01\u0e32\u0e23
-\u0e40\u0e1b\u0e47\u0e19
-\u0e40\u0e1b\u0e34\u0e14\u0e40\u0e1c\u0e22
-\u0e40\u0e1b\u0e34\u0e14
-\u0e40\u0e19\u0e37\u0e48\u0e2d\u0e07\u0e08\u0e32\u0e01
-\u0e40\u0e14\u0e35\u0e22\u0e27\u0e01\u0e31\u0e19
-\u0e40\u0e14\u0e35\u0e22\u0e27
-\u0e40\u0e0a\u0e48\u0e19
-\u0e40\u0e09\u0e1e\u0e32\u0e30
-\u0e40\u0e04\u0e22
-\u0e40\u0e02\u0e49\u0e32
-\u0e40\u0e02\u0e32
-\u0e2d\u0e35\u0e01
-\u0e2d\u0e32\u0e08
-\u0e2d\u0e30\u0e44\u0e23
-\u0e2d\u0e2d\u0e01
-\u0e2d\u0e22\u0e48\u0e32\u0e07
-\u0e2d\u0e22\u0e39\u0e48
-\u0e2d\u0e22\u0e32\u0e01
-\u0e2b\u0e32\u0e01
-\u0e2b\u0e25\u0e32\u0e22
-\u0e2b\u0e25\u0e31\u0e07\u0e08\u0e32\u0e01
-\u0e2b\u0e25\u0e31\u0e07
-\u0e2b\u0e23\u0e37\u0e2d
-\u0e2b\u0e19\u0e36\u0e48\u0e07
-\u0e2a\u0e48\u0e27\u0e19
-\u0e2a\u0e48\u0e07
-\u0e2a\u0e38\u0e14
-\u0e2a\u0e4d\u0e32\u0e2b\u0e23\u0e31\u0e1a
-\u0e27\u0e48\u0e32
-\u0e27\u0e31\u0e19
-\u0e25\u0e07
-\u0e23\u0e48\u0e27\u0e21
-\u0e23\u0e32\u0e22
-\u0e23\u0e31\u0e1a
-\u0e23\u0e30\u0e2b\u0e27\u0e48\u0e32\u0e07
-\u0e23\u0e27\u0e21
-\u0e22\u0e31\u0e07
-\u0e21\u0e35
-\u0e21\u0e32\u0e01
-\u0e21\u0e32
-\u0e1e\u0e23\u0e49\u0e2d\u0e21
-\u0e1e\u0e1a
-\u0e1c\u0e48\u0e32\u0e19
-\u0e1c\u0e25
-\u0e1a\u0e32\u0e07
-\u0e19\u0e48\u0e32
-\u0e19\u0e35\u0e49
-\u0e19\u0e4d\u0e32
-\u0e19\u0e31\u0e49\u0e19
-\u0e19\u0e31\u0e01
-\u0e19\u0e2d\u0e01\u0e08\u0e32\u0e01
-\u0e17\u0e38\u0e01
-\u0e17\u0e35\u0e48\u0e2a\u0e38\u0e14
-\u0e17\u0e35\u0e48
-\u0e17\u0e4d\u0e32\u0e43\u0e2b\u0e49
-\u0e17\u0e4d\u0e32
-\u0e17\u0e32\u0e07
-\u0e17\u0e31\u0e49\u0e07\u0e19\u0e35\u0e49
-\u0e17\u0e31\u0e49\u0e07
-\u0e16\u0e49\u0e32
-\u0e16\u0e39\u0e01
-\u0e16\u0e36\u0e07
-\u0e15\u0e49\u0e2d\u0e07
-\u0e15\u0e48\u0e32\u0e07\u0e46
-\u0e15\u0e48\u0e32\u0e07
-\u0e15\u0e48\u0e2d
-\u0e15\u0e32\u0e21
-\u0e15\u0e31\u0e49\u0e07\u0e41\u0e15\u0e48
-\u0e15\u0e31\u0e49\u0e07
-\u0e14\u0e49\u0e32\u0e19
-\u0e14\u0e49\u0e27\u0e22
-\u0e14\u0e31\u0e07
-\u0e0b\u0e36\u0e48\u0e07
-\u0e0a\u0e48\u0e27\u0e07
-\u0e08\u0e36\u0e07
-\u0e08\u0e32\u0e01
-\u0e08\u0e31\u0e14
-\u0e08\u0e30
-\u0e04\u0e37\u0e2d
-\u0e04\u0e27\u0e32\u0e21
-\u0e04\u0e23\u0e31\u0e49\u0e07
-\u0e04\u0e07
-\u0e02\u0e36\u0e49\u0e19
-\u0e02\u0e2d\u0e07
-\u0e02\u0e2d
-\u0e02\u0e13\u0e30
-\u0e01\u0e48\u0e2d\u0e19
-\u0e01\u0e47
-\u0e01\u0e32\u0e23
-\u0e01\u0e31\u0e1a
-\u0e01\u0e31\u0e19
-\u0e01\u0e27\u0e48\u0e32
-\u0e01\u0e25\u0e48\u0e32\u0e27
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/stopwords_tr.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_tr.txt b/solr/example/example-DIH/solr/rss/conf/lang/stopwords_tr.txt
deleted file mode 100644
index 84d9408..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/stopwords_tr.txt
+++ /dev/null
@@ -1,212 +0,0 @@
-# Turkish stopwords from LUCENE-559
-# merged with the list from "Information Retrieval on Turkish Texts"
-# (http://www.users.muohio.edu/canf/papers/JASIST2008offPrint.pdf)
-acaba
-altm\u0131\u015f
-alt\u0131
-ama
-ancak
-arada
-asl\u0131nda
-ayr\u0131ca
-bana
-baz\u0131
-belki
-ben
-benden
-beni
-benim
-beri
-be\u015f
-bile
-bin
-bir
-birçok
-biri
-birkaç
-birkez
-bir\u015fey
-bir\u015feyi
-biz
-bize
-bizden
-bizi
-bizim
-böyle
-böylece
-bu
-buna
-bunda
-bundan
-bunlar
-bunlar\u0131
-bunlar\u0131n
-bunu
-bunun
-burada
-çok
-çünkü
-da
-daha
-dahi
-de
-defa
-de\u011fil
-di\u011fer
-diye
-doksan
-dokuz
-dolay\u0131
-dolay\u0131s\u0131yla
-dört
-edecek
-eden
-ederek
-edilecek
-ediliyor
-edilmesi
-ediyor
-e\u011fer
-elli
-en
-etmesi
-etti
-etti\u011fi
-etti\u011fini
-gibi
-göre
-halen
-hangi
-hatta
-hem
-henüz
-hep
-hepsi
-her
-herhangi
-herkesin
-hiç
-hiçbir
-için
-iki
-ile
-ilgili
-ise
-i\u015fte
-itibaren
-itibariyle
-kadar
-kar\u015f\u0131n
-katrilyon
-kendi
-kendilerine
-kendini
-kendisi
-kendisine
-kendisini
-kez
-ki
-kim
-kimden
-kime
-kimi
-kimse
-k\u0131rk
-milyar
-milyon
-mu
-mü
-m\u0131
-nas\u0131l
-ne
-neden
-nedenle
-nerde
-nerede
-nereye
-niye
-niçin
-o
-olan
-olarak
-oldu
-oldu\u011fu
-oldu\u011funu
-olduklar\u0131n\u0131
-olmad\u0131
-olmad\u0131\u011f\u0131
-olmak
-olmas\u0131
-olmayan
-olmaz
-olsa
-olsun
-olup
-olur
-olursa
-oluyor
-on
-ona
-ondan
-onlar
-onlardan
-onlar\u0131
-onlar\u0131n
-onu
-onun
-otuz
-oysa
-öyle
-pek
-ra\u011fmen
-sadece
-sanki
-sekiz
-seksen
-sen
-senden
-seni
-senin
-siz
-sizden
-sizi
-sizin
-\u015fey
-\u015feyden
-\u015feyi
-\u015feyler
-\u015föyle
-\u015fu
-\u015funa
-\u015funda
-\u015fundan
-\u015funlar\u0131
-\u015funu
-taraf\u0131ndan
-trilyon
-tüm
-üç
-üzere
-var
-vard\u0131
-ve
-veya
-ya
-yani
-yapacak
-yap\u0131lan
-yap\u0131lmas\u0131
-yap\u0131yor
-yapmak
-yapt\u0131
-yapt\u0131\u011f\u0131
-yapt\u0131\u011f\u0131n\u0131
-yapt\u0131klar\u0131
-yedi
-yerine
-yetmi\u015f
-yine
-yirmi
-yoksa
-yüz
-zaten
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/580f6e98/solr/example/example-DIH/solr/rss/conf/lang/userdict_ja.txt
----------------------------------------------------------------------
diff --git a/solr/example/example-DIH/solr/rss/conf/lang/userdict_ja.txt b/solr/example/example-DIH/solr/rss/conf/lang/userdict_ja.txt
deleted file mode 100644
index 6f0368e..0000000
--- a/solr/example/example-DIH/solr/rss/conf/lang/userdict_ja.txt
+++ /dev/null
@@ -1,29 +0,0 @@
-#
-# This is a sample user dictionary for Kuromoji (JapaneseTokenizer)
-#
-# Add entries to this file in order to override the statistical model in terms
-# of segmentation, readings and part-of-speech tags. Notice that entries do
-# not have weights since they are always used when found. This is by-design
-# in order to maximize ease-of-use.
-#
-# Entries are defined using the following CSV format:
-# <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
-#
-# Notice that a single half-width space separates tokens and readings, and
-# that the number tokens and readings must match exactly.
-#
-# Also notice that multiple entries with the same <text> is undefined.
-#
-# Whitespace only lines are ignored. Comments are not allowed on entry lines.
-#
-
-# Custom segmentation for kanji compounds
-\u65e5\u672c\u7d4c\u6e08\u65b0\u805e,\u65e5\u672c \u7d4c\u6e08 \u65b0\u805e,\u30cb\u30db\u30f3 \u30b1\u30a4\u30b6\u30a4 \u30b7\u30f3\u30d6\u30f3,\u30ab\u30b9\u30bf\u30e0\u540d\u8a5e
-\u95a2\u897f\u56fd\u969b\u7a7a\u6e2f,\u95a2\u897f \u56fd\u969b \u7a7a\u6e2f,\u30ab\u30f3\u30b5\u30a4 \u30b3\u30af\u30b5\u30a4 \u30af\u30a6\u30b3\u30a6,\u30ab\u30b9\u30bf\u30e0\u540d\u8a5e
-
-# Custom segmentation for compound katakana
-\u30c8\u30fc\u30c8\u30d0\u30c3\u30b0,\u30c8\u30fc\u30c8 \u30d0\u30c3\u30b0,\u30c8\u30fc\u30c8 \u30d0\u30c3\u30b0,\u304b\u305a\u30ab\u30ca\u540d\u8a5e
-\u30b7\u30e7\u30eb\u30c0\u30fc\u30d0\u30c3\u30b0,\u30b7\u30e7\u30eb\u30c0\u30fc \u30d0\u30c3\u30b0,\u30b7\u30e7\u30eb\u30c0\u30fc \u30d0\u30c3\u30b0,\u304b\u305a\u30ab\u30ca\u540d\u8a5e
-
-# Custom reading for former sumo wrestler
-\u671d\u9752\u9f8d,\u671d\u9752\u9f8d,\u30a2\u30b5\u30b7\u30e7\u30a6\u30ea\u30e5\u30a6,\u30ab\u30b9\u30bf\u30e0\u4eba\u540d