Posted to solr-user@lucene.apache.org by Jayni <ja...@web.de> on 2013/12/13 18:02:31 UTC

Similarity search with Solr

Hi,

I want to do a similarity search on millions of sentences. They are written
in natural language, and I want to find sentences that have a "similar" set
of words.
My aim is a search based on trigrams, or a kind of full-text search, that
finds similar sentences.
I used PostgreSQL before, but it was far too slow.

Do you think it's possible to implement a performant similarity search like
the one described with Solr, and do you think it's the right search engine
for that?
Thanks for your answers!

Janek



--
View this message in context: http://lucene.472066.n3.nabble.com/Similarity-search-with-Solr-tp4106623.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Similarity search with Solr

Posted by Jayni <ja...@web.de>.
Okay, thanks for your help.

Janek




Re: Similarity search with Solr

Posted by Jack Krupansky <ja...@basetechnology.com>.
Do a proof of concept implementation and see for yourself if you find the 
performance acceptable.

I mean, performance should be reasonably decent.

-- Jack Krupansky



Re: Similarity search with Solr

Posted by Jayni <ja...@web.de>.
@kamaci
The sentences are stored in txt files, but I can also import them. The files
include a lot of RTF markup, such as a font table, but I'm only interested in
the sentences, which are enclosed by tags.

@Jack Krupansky-2
Do you think it will be fast enough? I have millions of sentences, and I have
to search for hundreds of sentences many times.
What I want to ask is: will it be significantly faster than an RDBMS
solution, which is rather too slow?

Janek




Re: Similarity search with Solr

Posted by Jack Krupansky <ja...@basetechnology.com>.
Just use the edismax query parser with bigrams and trigrams enabled and the 
default operator set to OR. That will select all sentences that are even 
vaguely similar and will score sentences higher the more words and phrases 
they have that match.
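
A rough sketch of what such a request could look like, in Python. It only
builds the query URL and sends nothing. The Solr URL, the core name
"sentences" and the field name "sentence" are placeholders, not anything
from this thread. Reading "bigrams and trigrams" as the edismax pf2 and pf3
parameters is an assumption as well, so check it against the documentation
for your Solr version.

```python
from urllib.parse import urlencode

# Hypothetical values: replace with your own Solr core and schema field.
SOLR_SELECT_URL = "http://localhost:8983/solr/sentences/select"
SENTENCE_FIELD = "sentence"


def build_similarity_query(sentence, rows=10):
    """Return a select URL that looks for sentences similar to `sentence`."""
    params = {
        "q": sentence,            # the query sentence itself
        "defType": "edismax",     # use the extended dismax query parser
        "q.op": "OR",             # any matching word is enough to select a doc
        "qf": SENTENCE_FIELD,     # field the individual words are matched in
        "pf2": SENTENCE_FIELD,    # boost docs matching adjacent word pairs
        "pf3": SENTENCE_FIELD,    # boost docs matching adjacent word triples
        "fl": "id,score," + SENTENCE_FIELD,
        "rows": rows,
        "wt": "json",
    }
    # urlencode takes care of URL escaping; characters that are query syntax
    # (for example ':' or '+') may still need escaping for Solr itself.
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_similarity_query("the quick brown fox"))
```

With OR as the operator every sentence sharing at least one word is a
candidate, so the ranking (and the rows limit) does the real work here.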

-- Jack Krupansky



Re: Similarity search with Solr

Posted by Furkan KAMACI <fu...@gmail.com>.
Hi;

Could you explain your infrastructure?

Thanks;
Furkan KAMACI

