Posted to solr-user@lucene.apache.org by bing <JS...@hotmail.com> on 2012/02/24 04:44:58 UTC

TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

Hi, all, 

I am using
org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory
(since Solr3.5.0) to do language detection, and it's cool.
 
My question: if I deploy Solr3.3.0, is it possible to take that factory from
Solr3.5.0 and use it in Solr3.3.0?

I am staying on Solr3.3.0 because I am working on Dspace (discovery), which
calls Solr, and for now the highest version that Solr can be upgraded to there
is 3.3.0.

I would like to do this while changing Dspace + Solr as little as possible. Say,
by importing that factory into Solr3.3.0: is that possible? Does anyone happen
to know a way to solve this?

Best Regards, 
Bing

--
View this message in context: http://lucene.472066.n3.nabble.com/TikaLanguageIdentifierUpdateProcessorFactory-since-Solr3-5-0-to-be-used-in-Solr3-3-0-tp3771620p3771620.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

Posted by bing <JS...@hotmail.com>.
Hi, Erick, 

I get your point. Thank you so much. 

Best Regards, 
Bing


Re: TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

Posted by Erick Erickson <er...@gmail.com>.
It runs any place that has access to the raw files and an HTTP connection
to the Solr server, which is another way of saying "sounds good to me".
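
To illustrate the "HTTP connection to the Solr server" part, here is a minimal
sketch that uses only the JDK. It assumes the Solr XML update handler is
reachable at a URL such as http://localhost:8983/solr/update (host, port and
path are assumptions about the deployment), and the class and method names are
made up for this example. A SolrJ client would normally handle this step itself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: sends one XML update message to Solr over plain HTTP. */
final class SolrHttpPoster {

    /**
     * POSTs the given XML to the update URL and returns the HTTP status code.
     * The URL is an assumption, e.g. "http://localhost:8983/solr/update".
     */
    static int postXml(String updateUrl, String xml) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(updateUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");

        // Write the request body, e.g. an <add>...</add> or <commit/> message.
        byte[] body = xml.getBytes(StandardCharsets.UTF_8);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }

        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }
}
```

Because the only requirement is network access to Solr, such a client can run
on the Dspace host, the Solr host, or any other machine that can read the files.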

Erick

On Mon, Feb 27, 2012 at 9:18 PM, bing <JS...@hotmail.com> wrote:
> Hi, Erick,
>
> I can write a SolrJ client that calls Tika, but I am not certain where to invoke
> the client. In my case, I work on Dspace, which calls Solr, and I suppose the
> client should be invoked in between Dspace and Solr. That is, Dspace invokes the
> SolrJ client when indexing or querying, and the client calls Tika and Solr. Do you
> think that is reasonable?
>
> Best Regards,
> Bing

Re: TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

Posted by bing <JS...@hotmail.com>.
Hi, Erick,

I can write a SolrJ client that calls Tika, but I am not certain where to invoke
the client. In my case, I work on Dspace, which calls Solr, and I suppose the
client should be invoked in between Dspace and Solr. That is, Dspace invokes the
SolrJ client when indexing or querying, and the client calls Tika and Solr. Do you
think that is reasonable?

Best Regards, 
Bing 


Re: TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

Posted by Erick Erickson <er...@gmail.com>.
My *real* suggestion would be to not do it. Write a SolrJ
program that uses whatever version of Tika you want
to download and use *that* to index rather than try to
sort through the various jar dependencies in Solr. It'd be
safer.

Otherwise, you're on your own here.

Here's some example code:

http://www.lucidimagination.com/blog/2012/02/14/indexing-with-solrj/
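
For readers who only want the shape of the idea, here is a minimal sketch. It is
not the code from the linked post. It detects the language on the client side
and builds a Solr XML <add> message carrying that language as an extra field.
The LanguageDetector interface, the SolrXmlBuilder class and the field names
"text" and "language_s" are hypothetical; a real client would presumably back
the detector with Tika's language identification and send the document through
SolrJ, and the field names depend on the schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical hook; a Tika-backed implementation would wrap Tika's detector. */
interface LanguageDetector {
    String detect(String text);
}

/** Hypothetical helper that builds Solr XML update messages as strings. */
final class SolrXmlBuilder {

    /** Escapes the characters that are not allowed verbatim in XML text or attributes. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds a single-document <add> message, keeping the field order of the map. */
    static String addDocument(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(e.getKey())).append("\">")
               .append(escape(e.getValue())).append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    /** Detects the language of the text and adds it as an assumed "language_s" field. */
    static String addWithLanguage(String id, String text, LanguageDetector detector) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", id);
        fields.put("text", text);
        fields.put("language_s", detector.detect(text));
        return addDocument(fields);
    }
}
```

Keeping detection behind an interface means the Tika version used by the client
is independent of whatever jars ship inside the Solr web application.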

Best
Erick

On Sun, Feb 26, 2012 at 9:01 PM, bing <JS...@hotmail.com> wrote:
> Hi, Erick,
>
> My idea is to use Tika0.10 in Dspace1.7.2, in two steps:
>
> 1. Upgrade Solr1.4.1 to Solr3.3.0 in Dspace1.7.2
> According to the following link, the upgrade to Solr & Lucene 3.3.0 has been resolved.
> https://jira.duraspace.org/browse/DS-980
>
> 2. Upgrade to Tika0.10 in Solr3.3.0
> In the following link, people have tried to upgrade Tika0.8 to Tika0.9.
> http://lucene.472066.n3.nabble.com/upgrading-to-Tika-0-9-on-Solr-1-4-1-td2570526.html
>
> I was thinking that if both of the above steps can be achieved, then maybe I
> can get it done. What is your suggestion?
>
> Thank you.
>
> Best Regards,
> Bing

Re: TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

Posted by bing <JS...@hotmail.com>.
Hi, Erick, 

My idea is to use Tika0.10 in Dspace1.7.2, in two steps:

1. Upgrade Solr1.4.1 to Solr3.3.0 in Dspace1.7.2
According to the following link, the upgrade to Solr & Lucene 3.3.0 has been resolved.
https://jira.duraspace.org/browse/DS-980

2. Upgrade to Tika0.10 in Solr3.3.0
In the following link, people have tried to upgrade Tika0.8 to Tika0.9.
http://lucene.472066.n3.nabble.com/upgrading-to-Tika-0-9-on-Solr-1-4-1-td2570526.html

I was thinking that if both of the above steps can be achieved, then maybe I
can get it done. What is your suggestion?

Thank you. 

Best Regards, 
Bing 


Re: TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

Posted by Erick Erickson <er...@gmail.com>.
Well, you can give it a try, I don't know if anyone's done that
before. And you're on your own, I haven't a clue what
the results would be...

Sorry I can't be more help here...
Erick

On Thu, Feb 23, 2012 at 10:44 PM, bing <JS...@hotmail.com> wrote:
> Hi, all,
>
> I am using
> org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory
> (since Solr3.5.0) to do language detection, and it's cool.
>
> My question: if I deploy Solr3.3.0, is it possible to take that factory from
> Solr3.5.0 and use it in Solr3.3.0?
>
> I am staying on Solr3.3.0 because I am working on Dspace (discovery), which
> calls Solr, and for now the highest version that Solr can be upgraded to there
> is 3.3.0.
>
> I would like to do this while changing Dspace + Solr as little as possible. Say,
> by importing that factory into Solr3.3.0: is that possible? Does anyone happen
> to know a way to solve this?
>
> Best Regards,
> Bing