Posted to solr-user@lucene.apache.org by Lee Goddard <le...@gmail.com> on 2011/01/19 09:22:51 UTC
Solr with Unknown Lucene Index?
I have to use some Lucene indexes, and Solr looks like the perfect
solution.
However, all I know about the Lucene indexes is what Luke tells me, and
simply setting the schema to represent all fields as text does not seem
to be working -- though as this is my first Solr, I am not sure if that
is due to some other issue.
Is there some way to ascertain how the Solr schema should describe the
Lucene fields?
Many thanks in anticipation
Lee
Re: Solr with Unknown Lucene Index?
Posted by Chris Hostetter <ho...@fucit.org>.
: Having found some code that searches a Lucene index, the only analyzers
: referenced are Lucene.Net.Analysis.Standard.StandardAnalyzer.
:
: How can I map this in Solr? The example schema doesn't seem to mention this,
: and specifying 'text' or 'string' for every field doesn't seem to help.
1) that analyzer seems to be a Lucene.Net analyzer, so the Java equivalent
would be org.apache.lucene.analysis.standard.StandardAnalyzer
2) the example schema.xml demonstrates how to use an existing Analyzer
implementation...
<!-- One can also specify an existing Analyzer class that has a
default constructor via the class attribute on the analyzer element
<fieldType name="text_greek" class="solr.TextField">
<analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
</fieldType>
-->
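Applied to the analyzer in question, the same pattern might look like the
sketch below. The type name text_std and the field name body are
hypothetical, and the sketch assumes Solr can instantiate StandardAnalyzer
from the class attribute, which the example schema's comment ties to the
class having a default constructor. Whether each field was stored is also an
assumption to check in Luke.

```xml
<!-- Hypothetical type name; the class is the Java StandardAnalyzer -->
<fieldType name="text_std" class="solr.TextField">
  <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
</fieldType>

<!-- Hypothetical field; use the field names that Luke reports for the index -->
<field name="body" type="text_std" indexed="true" stored="true"/>
```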
3) i'm getting the sense from your comments that you aren't very familiar
with lucene/solr in general. An important thing to understand is that
just because the code that created the index only ever uses
"StandardAnalyzer" doesn't mean it will make sense to use that analyzer on
every field when attempting to search that field from solr -- some fields
may have been indexed w/o using any analysis, some may be numeric fields
with special encoding, some may be compressed, etc...
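As a rough sketch of what that can mean in schema.xml: a field that was
indexed without analysis is usually better described with solr.StrField than
with a text type. The field names id and body below are hypothetical, and
which type each real field needs can only be confirmed from the code that
built the index. Numeric and compressed fields are not covered here.

```xml
<types>
  <!-- No analysis: the whole field value is matched as a single term -->
  <fieldType name="string" class="solr.StrField"/>
  <!-- Analyzed text: tokenized by StandardAnalyzer -->
  <fieldType name="text_std" class="solr.TextField">
    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
  </fieldType>
</types>
<fields>
  <!-- Assumption: "id" was indexed un-analyzed, "body" was analyzed -->
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="body" type="text_std" indexed="true" stored="true"/>
</fields>
```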
trying to reverse engineer what the schema should look like to open any
arbitrary index requires a lot of understanding about how that index was
built -- it's easy to just "dump the terms" found in an index w/o knowing
anything about where those terms came from (that's what Luke does) but that
doesn't help you recognize things like "this list of X words were treated
as stop words, and don't appear in the index, so my query analyzer needs
to be configured with those same X words"
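For the stop word case, an explicit analysis chain could be sketched as
below. It assumes the index was built with a standard tokenizer, lowercasing
and a stop list, and that the original stop words have been recovered into a
file. The type name text_legacy and the file name legacy_stopwords.txt are
hypothetical.

```xml
<fieldType name="text_legacy" class="solr.TextField">
  <!-- No type attribute: the same chain is used for indexing and querying -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Hypothetical file, one stop word per line, matching the index-time list -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="legacy_stopwords.txt"/>
  </analyzer>
</fieldType>
```

If the query-time list omits a word that was dropped at index time, the query
keeps a term that no document contains, so queries requiring that term will
not match.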
In short: you can easily make Solr *read* the index (just like Luke) but
that won't necessarily help you *use* the index in a meaningful way.
-Hoss
Re: Solr with Unknown Lucene Index?
Posted by Lee Goddard <le...@gmail.com>.
Having found some code that searches a Lucene index, the only analyzers
referenced are Lucene.Net.Analysis.Standard.StandardAnalyzer.
How can I map this in Solr? The example schema doesn't seem to mention
this, and specifying 'text' or 'string' for every field doesn't seem to
help.
Thanks
Lee
On 22/01/2011 21:50, Erick Erickson wrote:
> Sorry, I was out of town for a while. Luke just reads stuff, it
> doesn't try to interpret any schema.
> Solr makes certain assumptions about what *should* be in the index
> based on the schema.
> So getting Solr to just use a Lucene index really involves knowing
> that Lucene used, say,
> a StandardAnalyzer followed by a LowerCaseFilter followed by for some
> field.... And there's
> no way I know of to find that information out from a raw Lucene index.
>
> If you don't get things to match, your results will...er...vary. But
> perhaps you can guess
> well enough to make it work, although upgrading will be a problem.
>
> I really think your effort would be best spent finding the original
> indexing or querying
> code if at all possible and seeing the way that code defined the
> analysis chain (in the
> code) for each field and using that as a basis for creating a "close
> enough" schema.
>
>
> Best
> Erick
>
> On Thu, Jan 20, 2011 at 3:59 AM, Lee Goddard <leegee@gmail.com> wrote:
>
> Thanks, Erick. I think my question comes down to, 'how does Luke
> know how to read the indexes?' I will try the Luke mailing list.
>
> Cheers
> Lee
>
>
> On 19/01/2011 17:49, Erick Erickson wrote:
>> I don't really think this is possible/reasonable. There's nothing
>> fixed about a Lucene index; you could index a field in different
>> documents with any number of analysis chains. The tricky part here
>> will be, as you've discovered, finding a way to match the Solr schema
>> "closely enough" to get your desired results.
>>
>> Are you sure there's no way to re-index the data? Or find the
>> original code that indexed it?
>>
>> Best
>> Erick
>>
>> On Wed, Jan 19, 2011 at 3:22 AM, Lee Goddard <leegee@gmail.com> wrote:
>>
>> I have to use some Lucene indexes, and Solr looks like the
>> perfect solution.
>>
>> However, all I know about the Lucene indexes is what Luke
>> tells me, and simply setting the schema to represent all
>> fields as text does not seem to be working -- though as this
>> is my first Solr, I am not sure if that is due to some other
>> issue.
>>
>> Is there some way to ascertain how the Solr schema should
>> describe the Lucene fields?
>>
>> Many thanks in anticipation
>> Lee
>>
>>
>
Re: Solr with Unknown Lucene Index?
Posted by Erick Erickson <er...@gmail.com>.
I don't really think this is possible/reasonable. There's nothing fixed
about a Lucene index; you could index a field in different documents with any
number of analysis chains. The tricky part here will be, as you've discovered,
finding a way to match the Solr schema "closely enough" to get your desired
results.
Are you sure there's no way to re-index the data? Or find the original code
that indexed it?
Best
Erick
On Wed, Jan 19, 2011 at 3:22 AM, Lee Goddard <le...@gmail.com> wrote:
> I have to use some Lucene indexes, and Solr looks like the perfect
> solution.
>
> However, all I know about the Lucene indexes is what Luke tells me, and
> simply setting the schema to represent all fields as text does not seem to
> be working -- though as this is my first Solr, I am not sure if that is due
> to some other issue.
>
> Is there some way to ascertain how the Solr schema should describe the
> Lucene fields?
>
> Many thanks in anticipation
> Lee
>