Posted to user@lucenenet.apache.org by Bogdan Litescu <bo...@gmail.com> on 2011/11/13 16:36:31 UTC
[Lucene.Net] ngram filter
Hello,
I have a requirement to search on parts of words as well as whole words.
While researching this I found that there are n-gram filters that extract
parts of words and index those as well. However, I couldn't find any
reference to them in the Lucene.Net source code. Are they part of a separate
package, or have they not been ported to .NET yet?
Thanks,
Bogdan
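
For readers unfamiliar with the term, here is a minimal sketch of what an n-gram filter produces for a single word. It is plain C# and deliberately does not use any Lucene.Net API; the names NGramSketch and NGrams are hypothetical, and the order in which a real filter emits its tokens may differ.

```csharp
using System;
using System.Collections.Generic;

public static class NGramSketch
{
    // Hypothetical helper (not a Lucene.Net API): returns every substring
    // of `term` whose length is between minGram and maxGram, inclusive.
    public static List<string> NGrams(string term, int minGram, int maxGram)
    {
        var grams = new List<string>();
        for (int size = minGram; size <= maxGram; size++)
        {
            // Slide a window of `size` characters across the term.
            for (int start = 0; start + size <= term.Length; start++)
            {
                grams.Add(term.Substring(start, size));
            }
        }
        return grams;
    }

    public static void Main()
    {
        // Prints: lu, uc, ce, en, ne, luc, uce, cen, ene
        Console.WriteLine(string.Join(", ", NGrams("lucene", 2, 3)));
    }
}
```

Indexing these fragments alongside the whole word is what lets a query such as "cen" match a document containing "lucene", at the cost of a larger index.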
Re: [Lucene.Net] inheritance in lucene .net
Posted by Anders Lybecker <an...@lybecker.com>.
Hi Christian,
This is NHibernate-specific, so I guess you have a better chance on their
forum.
:-)
Anders Lybecker
On Mon, Nov 14, 2011 at 5:24 PM, Christian Setzkorn <ch...@setzkorn.eu> wrote:
> [snip]
[Lucene.Net] inheritance in lucene .net
Posted by Christian Setzkorn <ch...@setzkorn.eu>.
I have a class Publication which has a child PublicationX. I am trying to
index all instances of Publication like this:
public void BuildSearchIndex()
{
    FSDirectory entityDirectory = null;
    IndexWriter writer = null;

    var entityType = typeof(Publication);

    var indexDirectory = new DirectoryInfo(GetIndexDirectory());

    if (indexDirectory.Exists)
    {
        indexDirectory.Delete(true);
    }

    try
    {
        var dir = new DirectoryInfo(Path.Combine(indexDirectory.FullName, entityType.Name));
        entityDirectory = FSDirectory.Open(dir);
        writer = new IndexWriter(entityDirectory,
            new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29), true,
            IndexWriter.MaxFieldLength.UNLIMITED);
    }
    finally
    {
        if (entityDirectory != null)
        {
            entityDirectory.Close();
        }

        if (writer != null)
        {
            writer.Close();
        }
    }

    var fullTextSession = Search.CreateFullTextSession(NHibernateSession.Current);

    int totalCount = NHibernateSession.Current.CreateCriteria(typeof(Publication))
        .SetProjection(Projections.RowCount()).FutureValue<Int32>().Value;

    int pageSize = 500;
    int totalPages = totalCount / pageSize + 1;
    int currentPage = 0;

    do
    {
        IList<Publication> list = NHibernateSession.Current.CreateCriteria(typeof(Publication))
            .SetFirstResult(pageSize * (currentPage - 1))
            .SetMaxResults(pageSize * currentPage)
            .Future<Publication>().ToList();

        foreach (Publication p in list)
        {
            fullTextSession.Index(p);
        }
        currentPage++;
    }
    while (currentPage < totalPages);
}
The idea is to do the indexing in batches (improvement suggestions welcome)
as there are quite a lot of publications already in the relational database.
Unfortunately, I am getting this exception:
NHibernate.Search.Impl.SearchException was unhandled by user code
Message=Unable to open IndexReader for EID2.Domain.PublicationX
Source=NHibernate.Search
I have marked both classes with [Indexed].
Thanks.
Christian
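
A note on the batching arithmetic in the post above, since improvement suggestions were requested. With currentPage starting at 0, the first pass asks for an offset of pageSize * (0 - 1) = -500 and a maximum of pageSize * 0 = 0 rows, and the maximum then grows on every pass. SetFirstResult expects a zero-based row offset and SetMaxResults a row count, so each page would normally use an offset of pageSize * currentPage and a fixed count of pageSize. The sketch below shows only that arithmetic in plain C#; PagingSketch and Pages are hypothetical names, no NHibernate calls are made, and this is not claimed to explain the SearchException.

```csharp
using System;
using System.Collections.Generic;

public static class PagingSketch
{
    // Hypothetical helper: returns (firstResult, maxResults) pairs that
    // cover totalCount rows in batches of at most pageSize rows.
    public static List<KeyValuePair<int, int>> Pages(int totalCount, int pageSize)
    {
        var pages = new List<KeyValuePair<int, int>>();
        for (int first = 0; first < totalCount; first += pageSize)
        {
            // The last batch may be smaller than pageSize.
            int size = Math.Min(pageSize, totalCount - first);
            pages.Add(new KeyValuePair<int, int>(first, size));
        }
        return pages;
    }

    public static void Main()
    {
        // For 1200 rows and a page size of 500 this prints:
        // first=0 max=500
        // first=500 max=500
        // first=1000 max=200
        foreach (var page in Pages(1200, 500))
        {
            Console.WriteLine("first=" + page.Key + " max=" + page.Value);
        }
    }
}
```

Each pair would be passed to SetFirstResult and SetMaxResults respectively. Looping on the offset rather than on a page counter also avoids the extra empty page that totalCount / pageSize + 1 produces when totalCount is an exact multiple of pageSize.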
RE: [Lucene.Net] ngram filter
Posted by Digy <di...@gmail.com>.
I don't know how the other locations (GitHub, NuGet, etc.) are maintained.
The only official location for Lucene.Net source is
https://svn.apache.org/repos/asf/incubator/lucene.net/
DIGY
-----Original Message-----
From: Bogdan Litescu [mailto:bogdan.litescu@avatar-soft.ro]
Sent: Sunday, November 13, 2011 8:47 PM
To: lucene-net-user@lucene.apache.org
Subject: Re: [Lucene.Net] ngram filter
[snip]
Re: [Lucene.Net] ngram filter
Posted by Bogdan Litescu <bo...@avatar-soft.ro>.
Thanks, I found it. I guess I had an older version of the source code.
By the way, I submitted a pull request on GitHub some months ago. Is this
the correct way to contribute?
https://github.com/apache/lucene.net/pulls
Bogdan
On Sun, Nov 13, 2011 at 8:14 PM, Digy <di...@gmail.com> wrote:
> [snip]
--
Bogdan Litescu
Avatar Software <http://www.avatar-soft.ro>
twitter.com/AvtSoft
RE: [Lucene.Net] ngram filter
Posted by Digy <di...@gmail.com>.
It is in contrib, not in the core. You can download the source (using a svn
client) from
http://svn.apache.org/viewvc/incubator/lucene.net/trunk/src/contrib/Analyzers/
or
http://svn.apache.org/viewvc/incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/contrib/Analyzers/
DIGY