Posted to solr-user@lucene.apache.org by Claudio Martella <cl...@tis.bz.it> on 2010/02/10 21:15:40 UTC

dismax and multi-language corpus

Hello list,

I have a corpus with 3 languages, so I set up a text content field (with
no stemming) and 3 text-[en|it|de] fields with specific Snowball stemmers.
I copyField the text to my language-aware fields. So, I set up this dismax
searchHandler:

<requestHandler name="content" class="solr.SearchHandler" default="true">
<lst name="defaults">
   <str name="defType">dismax</str>
   <str name="pf">title^1.2 content-en^0.8 content-it^0.8 content-de^0.8</str>
   <str name="bf">title^1.2 content-en^0.8 content-it^0.8 content-de^0.8</str>
   <str name="qf">title^1.2 content-en^0.8 content-it^0.8 content-de^0.8</str>
   <float name="tie">0.1</float>
</lst>
</requestHandler>


But I get this error:

HTTP Status 400 - org.apache.lucene.queryParser.ParseException: Expected
',' at position 7 in 'content-en'

type Status report

message org.apache.lucene.queryParser.ParseException: Expected ',' at
position 7 in 'content-en'

description The request sent by the client was syntactically incorrect
(org.apache.lucene.queryParser.ParseException: Expected ',' at position
7 in 'content-en').

Any idea?

TIA

Claudio

-- 
Claudio Martella
Digital Technologies
Unit Research & Development - Analyst

TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fax  +39 0471 068 129
claudio.martella@tis.bz.it http://www.tis.bz.it

Short information regarding use of personal data. According to Section 13 of Italian Legislative Decree no. 196 of 30 June 2003, we inform you that we process your personal data in order to fulfil contractual and fiscal obligations and also to send you information regarding our services and events. Your personal data are processed with and without electronic means and by respecting data subjects' rights, fundamental freedoms and dignity, particularly with regard to confidentiality, personal identity and the right to personal data protection. At any time and without formalities you can write an e-mail to privacy@tis.bz.it in order to object the processing of your personal data for the purpose of sending advertising materials and also to exercise the right to access personal data and other rights referred to in Section 7 of Decree 196/2003. The data controller is TIS Techno Innovation Alto Adige, Siemens Street n. 19, Bolzano. You can find the complete information on the web site www.tis.bz.it.



Re: dismax and multi-language corpus

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Claudio,

Ah, through multilingual indexing/search work (with http://www.sematext.com/products/multilingual-indexer/index.html ) I learned that cross-language search often doesn't really make sense, unless the search involves "universal terms" (e.g. Fiat, BMW, Mercedes, Olivetti, Tomi de Paola, Alberto Tomba...).  If the search involves natural language-specific terms, then searching in the "foreign" language doesn't work so well and doesn't make a ton of sense.  Imagine a search for "ciao ragazzi".  I have no idea what the Italian stemmer does with that, but say it turns it into "cia raga" (it doesn't, but just imagine).  If this was done with Italian docs at index time, you will find the matching docs.  But what happens if "ciao ragazzi" was analyzed by some German analyzer?  Different tokens will be created and indexed, so a "ciao ragazzi" search won't work.  And which Analyzer would you use to analyze that query anyway?  Italian or German?
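
A minimal sketch of the kind of schema this is about, assuming hypothetical type and field names (text_it, text_de, content_it, content_de; none of them come from Claudio's actual schema.xml). Each field type carries its own Snowball stemmer, and because a single <analyzer> element applies at both index and query time, whatever is sent to content_de gets stemmed with the German rules even when the text is Italian:

```xml
<!-- schema.xml fragment, illustration only; all names here are assumptions -->
<fieldType name="text_it" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Italian stemming rules, used for indexing and for querying this type -->
    <filter class="solr.SnowballPorterFilterFactory" language="Italian"/>
  </analyzer>
</fieldType>

<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- German stemming rules; Italian text copied here yields different tokens -->
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
</fieldType>

<!-- the unstemmed source field is copied into every language-specific field -->
<field name="content_it" type="text_it" indexed="true" stored="false"/>
<field name="content_de" type="text_de" indexed="true" stored="false"/>
<copyField source="content" dest="content_it"/>
<copyField source="content" dest="content_de"/>
```

With dismax, the query string is analyzed separately for each field listed in qf, using that field's own query analyzer, so "ciao ragazzi" would pass through the Italian chain for content_it and the German chain for content_de.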

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



----- Original Message ----
> From: Claudio Martella <cl...@tis.bz.it>
> To: solr-user@lucene.apache.org
> Sent: Thu, February 11, 2010 3:21:32 AM
> Subject: Re: dismax and multi-language corpus
> 
> I'll try removing the '-'. I do need to search across all the languages.
> The other option would be to ask the user which language to query, but in
> my region we use Italian and German in equal measure, so it would end up
> querying both languages all the time. Or did you mean a more performant
> way of querying both languages all the time? :)


Re: dismax and multi-language corpus

Posted by Claudio Martella <cl...@tis.bz.it>.
I'll try removing the '-'. I do need to search across all the languages.
The other option would be to ask the user which language to query, but in
my region we use Italian and German in equal measure, so it would end up
querying both languages all the time. Or did you mean a more performant
way of querying both languages all the time? :)


Otis Gospodnetic wrote:
> Claudio - fields with '-' in them can be problematic.
>
> Side comment: do you really want to search across all languages at once?  If not, maybe 3 different dismax configs would make your searches better.



Re: dismax and multi-language corpus

Posted by Otis Gospodnetic <ot...@yahoo.com>.
I agree.  I just didn't have the chance to look at it closely to get enough details for filing in JIRA.

Otis



----- Original Message ----
> From: Jason Rutherglen <ja...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Thu, February 11, 2010 10:47:03 PM
> Subject: Re: dismax and multi-language corpus
> 
> That's a bug, IMO...

Re: dismax and multi-language corpus

Posted by Jason Rutherglen <ja...@gmail.com>.
That's a bug, IMO...

On Thu, Feb 11, 2010 at 1:30 PM, Otis Gospodnetic
<ot...@yahoo.com> wrote:
> I don't know, but the other day I did see an NPE related to fields with '-'.  In a Distributed Search context at least, fields with '-' were causing an NPE.

Re: dismax and multi-language corpus

Posted by Otis Gospodnetic <ot...@yahoo.com>.
I don't know, but the other day I did see an NPE related to fields with '-'.  In a Distributed Search context at least, fields with '-' were causing an NPE.


Otis



----- Original Message ----
> From: Jason Rutherglen <ja...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Thu, February 11, 2010 12:36:00 AM
> Subject: Re: dismax and multi-language corpus
> 
> > Claudio - fields with '-' in them can be problematic.
> 
> Why's that?


Re: dismax and multi-language corpus

Posted by Jason Rutherglen <ja...@gmail.com>.
> Claudio - fields with '-' in them can be problematic.

Why's that?

On Wed, Feb 10, 2010 at 2:38 PM, Otis Gospodnetic
<ot...@yahoo.com> wrote:
> Claudio - fields with '-' in them can be problematic.
>
> Side comment: do you really want to search across all languages at once?  If not, maybe 3 different dismax configs would make your searches better.

Re: dismax and multi-language corpus

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Claudio - fields with '-' in them can be problematic.

Side comment: do you really want to search across all languages at once?  If not, maybe 3 different dismax configs would make your searches better.
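
Both suggestions could be sketched roughly like this, assuming the fields are renamed with underscores (content_it, content_de, content_en) and using made-up handler names; this is an untested illustration, not a verified configuration. bf is left out because in dismax it takes function queries rather than a field^boost list; this thread does not establish whether that contributed to the error.

```xml
<!-- solrconfig.xml fragment, illustration only -->
<!-- handler names and the content_xx field names are assumptions -->
<requestHandler name="dismax_it" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- search and phrase-boost only the title and the Italian-stemmed field -->
    <str name="qf">title^1.2 content_it^0.8</str>
    <str name="pf">title^1.2 content_it^0.8</str>
    <float name="tie">0.1</float>
  </lst>
</requestHandler>

<requestHandler name="dismax_de" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^1.2 content_de^0.8</str>
    <str name="pf">title^1.2 content_de^0.8</str>
    <float name="tie">0.1</float>
  </lst>
</requestHandler>

<requestHandler name="dismax_en" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^1.2 content_en^0.8</str>
    <str name="pf">title^1.2 content_en^0.8</str>
    <float name="tie">0.1</float>
  </lst>
</requestHandler>
```

A client would then pick the handler per request with the qt parameter (for example qt=dismax_it), depending on the language the user is searching in.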

 Otis


