Posted to solr-user@lucene.apache.org by Furkan KAMACI <fu...@gmail.com> on 2013/05/27 15:37:54 UTC
Solr/Lucene Analyzer That Writes To File
Hi,
I want to use Solr for academic research. One step requires storing
tokens in a file (I will move them to a database later), and I don't want
to index them. For this kind of purpose, should I use core Lucene or Solr?
Is there an example of writing a custom analyzer that just stores tokens
in a file?
Re: Solr/Lucene Analyzer That Writes To File
Posted by Roman Chyla <ro...@gmail.com>.
You can store the field values (stored only; they don't need to be
indexed) and then run different analyzer chains over them.
I'd probably use the collector pattern:
    se.search(new MatchAllDocsQuery(), new Collector() {
        private AtomicReader reader;

        @Override
        public boolean acceptsDocsOutOfOrder() {
            return true;
        }

        @Override
        public void collect(int i) {
            try {
                // Load only the stored fields we want to re-analyze.
                Document d = reader.document(i, fieldsToLoad);
                for (String f : fieldsToLoad) {
                    for (String s : d.getValues(f)) {
                        // The first argument of tokenStream() is a field name.
                        TokenStream ts = analyzer.tokenStream(targetAnalyzer,
                                new StringReader(s));
                        ts.reset();
                        while (ts.incrementToken()) {
                            // do something with the analyzed tokens
                        }
                        // Finish and release the stream so the analyzer can reuse it.
                        ts.end();
                        ts.close();
                    }
                }
            } catch (IOException e) {
                // pass
            }
        }

        @Override
        public void setNextReader(AtomicReaderContext context) {
            // Called once per index segment; keep that segment's reader.
            this.reader = context.reader();
        }

        @Override
        public void setScorer(org.apache.lucene.search.Scorer scorer) {
            // Do nothing
        }
    });

    // or persist the data here if one of your components knows to
    // write to disk, but there is no api...
    TokenStream ts = analyzer.tokenStream(data.targetField,
            new StringReader("xxx"));
    ts.reset();
    ts.reset();
    ts.reset();
}
Re: Solr/Lucene Analyzer That Writes To File
Posted by Rafał Kuć <r....@solr.pl>.
Hello!
Take a look at custom postings formats. For example, here is a nice
post showing what you can do with the Lucene SimpleText codec:
http://blog.mikemccandless.com/2010/10/lucenes-simpletext-codec.html
However, please remember that using that codec in a production
environment is not advised.
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
Re: Solr/Lucene Analyzer That Writes To File
Posted by Chris Hostetter <ho...@fucit.org>.
: I want to use Solr for an academical research. One step of my purpose is I
: want to store tokens in a file (I will store it at a database later) and I
You could absolutely write a Java program which accesses the analyzers
directly and does whatever you want with the results of analyzing a piece
of text that you feed in.
Alternatively, you could use something like the
FieldAnalysisRequestHandler in Solr, so that an arbitrary client can send
data to Solr, asking it to analyze the text and break it down into tokens
per your schema.xml...
http://localhost:8983/solr/collection1/analysis/field?analysis.fieldvalue=The%20quick%20brown%20fox%20jumped%20over%20the%20lazy%20dog&analysis.fieldtype=text_en&wt=json&indent=true
(This is exactly how the Analysis page in the admin UI works: the
JavaScript powering that page hits this same URL.)
https://lucene.apache.org/solr/4_3_0/solr-core/org/apache/solr/handler/FieldAnalysisRequestHandler.html
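As a rough illustration of the client side of this approach, here is a minimal Java sketch that builds the same kind of analysis URL and saves the raw JSON response to a file. It uses only the JDK. The class name FieldAnalysisClient, its method names, and the output file name are hypothetical; the base URL and the text_en field type are simply the ones from the example URL above and may differ in your setup. Pulling the individual tokens out of the JSON would need a JSON parser, which is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Solr: calls the field analysis handler over HTTP.
class FieldAnalysisClient {

    // Percent-encodes a query parameter value, using %20 rather than '+' for spaces.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // baseUrl is assumed to point at a core, e.g. http://localhost:8983/solr/collection1
    static String buildUrl(String baseUrl, String fieldType, String text) {
        return baseUrl + "/analysis/field"
                + "?analysis.fieldvalue=" + encode(text)
                + "&analysis.fieldtype=" + encode(fieldType)
                + "&wt=json&indent=true";
    }

    // Sends the request and copies the raw JSON response into the given file.
    static void analyzeToFile(String baseUrl, String fieldType, String text, Path out)
            throws IOException {
        URL url = new URL(buildUrl(baseUrl, fieldType, text));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (InputStream in = conn.getInputStream()) {
            Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: FieldAnalysisClient <fieldType> <text>");
            return;
        }
        // Assumes a local Solr with a core named collection1, as in the URL above.
        analyzeToFile("http://localhost:8983/solr/collection1", args[0], args[1],
                Paths.get("tokens.json"));
    }
}
```

Run it with a field type and a piece of text, for example: java FieldAnalysisClient text_en "The quick brown fox". Nothing is indexed this way; Solr only analyzes the text and returns the result.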
-Hoss