Posted to announce@apache.org by Mark Miller <ma...@apache.org> on 2010/06/25 15:23:13 UTC

[ANN] Solr 1.4.1 Released

Apache Solr 1.4.1 has been released and is now available for public
download!
http://www.apache.org/dyn/closer.cgi/lucene/solr/

Solr is the popular, blazing fast open source enterprise search
platform from the Apache Lucene project.  Its major features include
powerful full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, and rich document (e.g., Word, PDF)
handling.  Solr is highly scalable, providing distributed search and
index replication, and it powers the search and navigation features of
many of the world's largest internet sites.

Solr is written in Java and runs as a standalone full-text search server
within a servlet container such as Tomcat.  Solr uses the Lucene Java
search library at its core for full-text indexing and search, and has
REST-like HTTP/XML and JSON APIs that make it easy to use from virtually
any programming language.  Solr's powerful external configuration allows
it to be tailored to almost any type of application without Java coding,
and it has an extensive plugin architecture when more advanced
customization is required.

Solr 1.4.1 is a bug fix release for Solr 1.4 that includes many Solr bug
fixes as well as Lucene bug fixes from Lucene 2.9.3.

See all of the CHANGES here:
http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.4.1/CHANGES.txt


- Mark Miller on behalf of the Solr team

Re: [ANN] Solr 1.4.1 Released

Posted by Thibaut <th...@free.fr>.
So what is the right XML to pull in Solr 1.4.1 through Maven?

Regards,
Thibaut

On 06/27/2010 02:18 AM, Jason Chaffee wrote:
> Was this change intentional or a mistake?  If it was a mistake, can someone please fix it in maven's central repository?

Re: [ANN] Solr 1.4.1 Released

Posted by Stevo Slavić <ss...@gmail.com>.
Created issue <https://issues.apache.org/jira/browse/SOLR-1977> for this.

Regards,
Stevo.

On Sun, Jun 27, 2010 at 2:54 AM, Ken Krugler <kk...@transpac.com> wrote:

> On Jun 26, 2010, at 5:18pm, Jason Chaffee wrote:
>
>> It appears the 1.4.1 version was deployed with a new maven groupId
>>
>> For example, if you are trying to download solr-core, here are the
>> differences between 1.4.0 and 1.4.1.
>>
>> 1.4.0
>> groupId: org.apache.solr
>> artifactId: solr-core
>>
>> 1.4.1
>> groupId: org.apache.solr.solr
>> artifactId: solr-core
>>
>> Was this change intentional or a mistake?  If it was a mistake, can
>> someone please fix it in maven's central repository?
>
> I believe it was a mistake. From a recent email thread on this list, Mark
> Miller said:
>
>> Can a solr/maven dude look at this? I simply used the copy command on
>> the release to-do wiki (sounds like it should be updated).
>>
>> If no one steps up, I'll try and straighten it out later.
>>
>> On 6/25/10 10:28 AM, Stevo Slavić wrote:
>>
>>> Congrats on the release!
>>>
>>> Something seems to be wrong with solr 1.4.1 maven artifacts, there is an
>>> extra solr in the path. E.g. solr-parent-1.4.1.pom is at
>>> http://repo1.maven.org/maven2/org/apache/solr/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom
>>> while it should be at
>>> http://repo1.maven.org/maven2/org/apache/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom
>>> POMs seem to contain correct maven artifact coordinates.
>>>
>>> Regards,
>>> Stevo.
>
> -- Ken
>
> --------------------------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g

Re: [ANN] Solr 1.4.1 Released

Posted by Ken Krugler <kk...@transpac.com>.
On Jun 26, 2010, at 5:18pm, Jason Chaffee wrote:

> It appears the 1.4.1 version was deployed with a new maven groupId
>
> For example, if you are trying to download solr-core, here are the
> differences between 1.4.0 and 1.4.1.
>
> 1.4.0
> groupId: org.apache.solr
> artifactId: solr-core
>
> 1.4.1
> groupId: org.apache.solr.solr
> artifactId: solr-core
>
> Was this change intentional or a mistake?  If it was a mistake, can
> someone please fix it in maven's central repository?

I believe it was a mistake. From a recent email thread on this list,  
Mark Miller said:

> Can a solr/maven dude look at this? I simply used the copy command on
> the release to-do wiki (sounds like it should be updated).
>
> If no one steps up, I'll try and straighten it out later.
>
> On 6/25/10 10:28 AM, Stevo Slavić wrote:
>> Congrats on the release!
>>
>> Something seems to be wrong with solr 1.4.1 maven artifacts, there is an
>> extra solr in the path. E.g. solr-parent-1.4.1.pom is at
>> http://repo1.maven.org/maven2/org/apache/solr/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom
>> while it should be at
>> http://repo1.maven.org/maven2/org/apache/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom
>> POMs seem to contain correct maven artifact coordinates.
>>
>> Regards,
>> Stevo.

-- Ken

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g





RE: [ANN] Solr 1.4.1 Released

Posted by Jason Chaffee <jc...@ebates.com>.
It appears the 1.4.1 version was deployed with a new maven groupId

For example, if you are trying to download solr-core, here are the differences between 1.4.0 and 1.4.1.

1.4.0
groupId: org.apache.solr
artifactId: solr-core

1.4.1
groupId: org.apache.solr.solr
artifactId: solr-core

Was this change intentional or a mistake?  If it was a mistake, can someone please fix it in maven's central repository?

thanks,

Jason
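
For readers following along, here is a minimal sketch (not part of the original thread) of how the standard Maven repository layout turns coordinates into a path: the dots in the groupId become directory separators, followed by the artifactId and the version. The helper name pom_url is made up for illustration; the repository base URL and the coordinates are the ones quoted in this thread. It suggests why a groupId of org.apache.solr.solr corresponds to the extra "solr" path segment that was reported.

```python
def pom_url(group_id, artifact_id, version,
            repo="http://repo1.maven.org/maven2"):
    """Build the POM URL for a set of Maven coordinates (hypothetical helper)."""
    # Standard Maven layout: groupId dots become path separators.
    group_path = group_id.replace(".", "/")
    return f"{repo}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.pom"


# Coordinates as published for 1.4.0 (and as expected for 1.4.1).
expected = pom_url("org.apache.solr", "solr-parent", "1.4.1")

# Coordinates matching where the 1.4.1 artifacts actually landed.
deployed = pom_url("org.apache.solr.solr", "solr-parent", "1.4.1")

print(expected)
print(deployed)
```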

-----Original Message-----
From: Mark Miller [mailto:markrmiller@apache.org]
Sent: Fri 6/25/2010 6:23 AM
To: solr-user@lucene.apache.org; general@lucene.apache.org; announce@apache.org
Subject: [ANN] Solr 1.4.1 Released
 

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

Posted by Saïd Radhouani <r....@gmail.com>.
Thanks Geert-Jan, this is indeed very helpful.

The delimiters I gave were just for the sake of the example. I will use a less common delimiter.

Cheers,
-Saïd

On Jun 26, 2010, at 1:53 PM, Geert-Jan Brits wrote:

>> If I understand your suggestion correctly, you said that there's NO need to
>> have many Dynamic Fields; instead, we can have one definitive field name,
>> which can store a long string (concatenation of information about tens of
>> pictures), e.g., using "-" and "%" delimiters:
>> pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
>> I don't clearly see the reason of doing this. Is there a gain in terms of
>> performance? Or does this make programming on the client-side easier? Or
>> something else?
> 
> I think you should ask the exact opposite question. If you don't do anything
> with these fields that Solr is particularly good at (searching / filtering
> / faceting / sorting), why go through the trouble of creating dynamic fields?
> (More fields mean more overhead and tracking cost no matter how you look at
> it.)
> 
> Moreover, from a client's point of view it is indeed easier the way I
> suggested, since otherwise:
> - you would have to ask (through SolrJ) for all dynamic fields to be
> returned in the fl parameter (
> http://wiki.apache.org/solr/CommonQueryParameters#fl). This is difficult,
> because a priori you don't know how many dynamic fields to query. In
> other words, you can't just ask Solr (through SolrJ, like you asked) to
> return all dynamic fields beginning with pic_*. (afaik)
> - your client iteration code (looping over the pics) is a bit more involved.
> 
> HTH, Cheers,
> 
> Geert-Jan
> 
> 2010/6/26 Saïd Radhouani <r....@gmail.com>
> 
>> Thanks Geert-Jan for the detailed answer. Actually, I don't search at all
>> on these fields. I'm only filtering (w/ vs. w/o pic) and sorting (based on the
>> number of pictures). Thus, your suggestion of adding an extra field NrOfPics
>> [0,N] would be the best solution.
>> 
>> Regarding the other suggestion:
>> 
>>> If you don't need search at all on these fields, the best thing imo is to
>>> store all pic-related info of all pics together by concatenating them with
>>> some delimiter which you know how to separate at the client-side.
>>> That or just store it in an external RDB since solr is just sitting on the
>>> data and not doing anything intelligent with it.
>> 
>> If I understand your suggestion correctly, you said that there's NO need to
>> have many Dynamic Fields; instead, we can have one definitive field name,
>> which can store a long string (concatenation of information about tens of
>> pictures), e.g., using "-" and "%" delimiters:
>> pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
>> 
>> I don't clearly see the reason of doing this. Is there a gain in terms of
>> performance? Or does this make programming on the client-side easier? Or
>> something else?
>> 
>> 
>> My other question was: in case we use Dynamic Fields, is there any
>> documentation about using SolrJ for this purpose?
>> 
>> Thanks
>> -Saïd
>> 
>> On Jun 26, 2010, at 12:29 PM, Geert-Jan Brits wrote:
>> 
>>> You can treat dynamic fields like any other field, so you can facet, sort,
>>> filter, etc. on these fields (afaik)
>>> 
>>> I believe the confusion arises because the use case for dynamic fields
>>> is sometimes misunderstood, i.e., as a way to do some kind of
>>> wildcard search, e.g., searching for a value in all of the dynamic fields at
>>> once with something like pic_url_*. This, however, is NOT possible.
>>> 
>>> As far as your question goes:
>>> 
>>>> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o
>>>> pic
>>>> To the best of my knowledge, everyone is saying that faceting cannot be
>>>> done on dynamic fields (only on definitive field names). Thus, I tried the
>>>> following and it's working: I assume that the stored pictures have a
>>>> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it
>>>> means that the underlying doc has at least one picture:
>>>> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
>>>> While this is working fine, I'm wondering whether there's a cleaner way to
>>>> do the same thing without assuming that pictures have a sequential number.
>>> 
>>> If I understand your question correctly: faceting on docs with and without
>>> pics could of course be done like you mention; however, it would be more
>>> efficient to define an extra field, hasAtLeastOnePic, with values (0 | 1)
>>> and use that to facet / filter on.
>>> 
>>> You can extend this to NrOfPics [0,N) if you need to filter / facet on docs
>>> with a certain nr of pics.
>>> 
>>> Also, I wondered what else you wanted to do with this pic-related info. Do
>>> you want to search on pic-description / pic-caption, for instance? In that
>>> case the dynamic-fields approach may not be what you want: how would you
>>> know in which dynamic field to search for a particular term? Would it be
>>> pic_desc_1, or pic_desc_x? Of course you could OR over all dynamic fields,
>>> but you would need to know an upper bound for the nr of pics, and it
>>> really doesn't feel right, to me at least.
>>> 
>>> If you need search on pic_description, for instance, but don't mind which pic
>>> matches, you could create a single field pic_description and put in the
>>> concat of all pic-descriptions and search on that, or just make it a
>>> multi-valued field.
>>> 
>>> If you don't need search at all on these fields, the best thing imo is to
>>> store all pic-related info of all pics together by concatenating them with
>>> some delimiter which you know how to separate at the client-side.
>>> That or just store it in an external RDB since solr is just sitting on the
>>> data and not doing anything intelligent with it.
>>> 
>>> I assume btw that you don't want to sort / facet on pic-desc / pic_caption /
>>> pic_url either (I have a hard time thinking of a useful use case for that)
>>> 
>>> HTH,
>>> 
>>> Geert-Jan
>>> 
>>> 
>>> 
>>> 2010/6/26 Saïd Radhouani <r....@gmail.com>
>>> 
>>>> Thanks so much Otis. This is working great.
>>>> 
>>>> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o
>>>> pic
>>>> 
>>>> To the best of my knowledge, everyone is saying that faceting cannot be
>>>> done on dynamic fields (only on definitive field names). Thus, I tried the
>>>> following and it's working: I assume that the stored pictures have a
>>>> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it
>>>> means that the underlying doc has at least one picture:
>>>> 
>>>> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
>>>> 
>>>> While this is working fine, I'm wondering whether there's a cleaner way to
>>>> do the same thing without assuming that pictures have a sequential number.
>>>> 
>>>> Also, do you have any documentation about handling Dynamic Fields using
>>>> SolrJ? So far, I found only issues about that on JIRA, but no documentation.
>>>> 
>>>> Thanks a lot.
>>>> 
>>>> -Saïd
>>>> 
>>>> On Jun 26, 2010, at 1:18 AM, Otis Gospodnetic wrote:
>>>> 
>>>>> Saïd,
>>>>> 
>>>>> Dynamic fields could help here, for example imagine a doc with:
>>>>> id
>>>>> pic_url_*
>>>>> pic_caption_*
>>>>> pic_description_*
>>>>> 
>>>>> See http://wiki.apache.org/solr/SchemaXml#Dynamic_fields
>>>>> 
>>>>> So, for you:
>>>>> 
>>>>> <dynamicField name="pic_url_*"  type="string"  indexed="true" stored="true"/>
>>>>> <dynamicField name="pic_caption_*"  type="text"  indexed="true" stored="true"/>
>>>>> <dynamicField name="pic_description_*"  type="text"  indexed="true" stored="true"/>
>>>>> 
>>>>> Then you can add docs with an unlimited number of
>>>>> pic_(url|caption|description)_* fields, e.g.
>>>>> 
>>>>> id
>>>>> pic_url_1
>>>>> pic_caption_1
>>>>> pic_description_1
>>>>> 
>>>>> id
>>>>> pic_url_2
>>>>> pic_caption_2
>>>>> pic_description_2
>>>>> 
>>>>> 
>>>>> Otis
>>>>> ----
>>>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>>>> Lucene ecosystem search :: http://search-lucene.com/
>>>>> 
>>>>> 
>>>>> 
>>>>> ----- Original Message ----
>>>>>> From: Saïd Radhouani <r....@gmail.com>
>>>>>> To: solr-user@lucene.apache.org
>>>>>> Sent: Fri, June 25, 2010 6:01:13 PM
>>>>>> Subject: Setting many properties for a multivalued field. Schema.xml ? External file?
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm trying to index data containing a multivalued field "picture"
>>>>>> that has three properties: url, caption, and description:
>>>>>>
>>>>>> <picture/>
>>>>>>   <url/>
>>>>>>   <caption/>
>>>>>>   <description/>
>>>>>>
>>>>>> Thus, each indexed document might have many pictures, each of which has
>>>>>> a url, a caption, and a description.
>>>>>>
>>>>>> I wonder whether it's possible to store this data using only schema.xml.
>>>>>> I couldn't figure it out so far. Instead, I'm thinking of using an
>>>>>> external file to store the properties of each picture, but I haven't
>>>>>> tried this solution yet; I'm waiting for your suggestions...
>>>>>>
>>>>>> Thanks,
>>>>>> -Saïd
>>>> 
>>>> 
>> 
>> 


Re: Setting many properties for a multivalued field. Schema.xml ? External file?

Posted by Geert-Jan Brits <gb...@gmail.com>.
btw, be careful with you delimiters: pic_url may possibly contain a '-',
etc.
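
To make the delimiter caveat concrete, here is a small sketch (not from the thread) showing how the "-" / "%" packing from the example breaks on realistic values, and one possible way around it. Note that the second half uses a different technique from the one discussed here: it serializes the picture list as JSON instead of choosing rarer delimiters, on the assumption that the packed string is only stored in a single field and decoded by the client. The sample values are invented.

```python
import json

# Invented sample data; field names follow the thread's url/caption/description.
pics = [
    {"url": "http://example.com/img/sun-set.jpg",
     "caption": "Sunset", "description": "Low-light shot"},
    {"url": "http://example.com/img/2.jpg",
     "caption": "50% off", "description": "Sale banner"},
]

# Naive packing with "-" between properties and "%" between pictures.
naive = "%".join(
    "-".join([p["url"], p["caption"], p["description"]]) for p in pics
)
naive_records = [record.split("-") for record in naive.split("%")]

# The "%" in a caption and the "-" in a URL both split in the wrong places.
print(len(naive_records))      # more "pictures" than were packed
print(len(naive_records[0]))   # more than three "properties"

# JSON escapes whatever it needs to, so the round trip is lossless.
packed = json.dumps(pics)
unpacked = json.loads(packed)
print(unpacked == pics)
```

Picking rarer delimiters, as suggested in the thread, also works as long as the chosen characters can never appear in any url, caption, or description.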

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

Posted by Geert-Jan Brits <gb...@gmail.com>.
> If I understand your suggestion correctly, you said that there's NO need to
> have many Dynamic Fields; instead, we can have one definitive field name,
> which can store a long string (concatenation of information about tens of
> pictures), e.g., using "-" and "%" delimiters:
> pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
> I don't clearly see the reason of doing this. Is there a gain in terms of
> performance? Or does this make programming on the client-side easier? Or
> something else?

I think you should ask the exact opposite question. If you don't do anything
with these fields that Solr is particularly good at (searching / filtering
/ faceting / sorting), why go through the trouble of creating dynamic fields?
(More fields mean more overhead and tracking cost no matter how you look at
it.)

Moreover, from a client's point of view it is indeed easier the way I
suggested, since otherwise:
- you would have to ask (through SolrJ) for all dynamic fields to be
returned in the fl parameter (
http://wiki.apache.org/solr/CommonQueryParameters#fl). This is difficult,
because a priori you don't know how many dynamic fields to query. In
other words, you can't just ask Solr (through SolrJ, like you asked) to
return all dynamic fields beginning with pic_*. (afaik)
- your client iteration code (looping over the pics) is a bit more involved.
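
As a rough illustration of the extra client-side looping (a sketch, not code from the thread): assume the client has already turned one Solr result into a plain mapping of field names to values. How that mapping is obtained depends on the client library and is not shown. The function name group_pictures and the sample document are made up; the field-name pattern is the pic_(url|caption|description)_N scheme from earlier in the thread.

```python
import re

# Matches the dynamic field names used in this thread, e.g. "pic_caption_2".
PIC_FIELD = re.compile(r"^pic_(url|caption|description)_(\d+)$")


def group_pictures(doc):
    """Collect the pic_* dynamic fields of one document into a list of pictures."""
    pictures = {}
    for name, value in doc.items():
        match = PIC_FIELD.match(name)
        if match is None:
            continue  # not a picture field (e.g. "id")
        prop, index = match.group(1), int(match.group(2))
        pictures.setdefault(index, {})[prop] = value
    # Order by numeric suffix so that picture 10 sorts after picture 2.
    return [pictures[i] for i in sorted(pictures)]


# Hypothetical document: a flat mapping of field name to stored value.
doc = {
    "id": "doc-1",
    "pic_url_2": "http://example.com/b.jpg",
    "pic_caption_2": "Second",
    "pic_url_1": "http://example.com/a.jpg",
    "pic_caption_1": "First",
    "pic_description_1": "The first picture",
}

pics = group_pictures(doc)
for pic in pics:
    print(pic)
```

With a single concatenated field, this regrouping step is replaced by one decode of the stored string.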

HTH, Cheers,

Geert-Jan

2010/6/26 Saïd Radhouani <r....@gmail.com>

> Thanks Geert-Jan for the detailed answer. Actually, I don't search at all
> on these fields. I'm only filtering (w/ vs. w/o pic) and sorting (based on the
> number of pictures). Thus, your suggestion of adding an extra field NrOfPics
> [0,N] would be the best solution.
>
> Regarding the other suggestion:
>
> > If you don't need search at all on these fields, the best thing imo is to
> > store all pic-related info of all pics together by concatenating them with
> > some delimiter which you know how to separate at the client-side.
> > That or just store it in an external RDB since solr is just sitting on the
> > data and not doing anything intelligent with it.
>
> If I understand your suggestion correctly, you said that there's NO need to
> have many Dynamic Fields; instead, we can have one definitive field name,
> which can store a long string (concatenation of information about tens of
> pictures), e.g., using "-" and "%" delimiters:
> pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
>
> I don't clearly see the reason of doing this. Is there a gain in terms of
> performance? Or does this make programming on the client-side easier? Or
> something else?
>
>
> My other question was: in case we use Dynamic Fields, is there any
> documentation about using SolrJ for this purpose?
>
> Thanks
> -Saïd
>
> On Jun 26, 2010, at 12:29 PM, Geert-Jan Brits wrote:
>
> > You can treat dynamic fields like any other field, so you can facet, sort,
> > filter, etc. on these fields (afaik)
> >
> > I believe the confusion arises because the use case for dynamic fields
> > is sometimes misunderstood, i.e., as a way to do some kind of
> > wildcard search, e.g., searching for a value in all of the dynamic fields at
> > once with something like pic_url_*. This, however, is NOT possible.
> >
> > As far as your question goes:
> >
> >> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o
> >> pic
> >> To the best of my knowledge, everyone is saying that faceting cannot be
> >> done on dynamic fields (only on definitive field names). Thus, I tried the
> >> following and it's working: I assume that the stored pictures have a
> >> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it
> >> means that the underlying doc has at least one picture:
> >> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
> >> While this is working fine, I'm wondering whether there's a cleaner way to
> >> do the same thing without assuming that pictures have a sequential number.
> >
> > If I understand your question correctly: faceting on docs with and
> without
> > pics could ofcourse by done like you mention, however it  would be more
> > efficient to have an extra field defined:  hasAtLestOnePic with values (0
> |
> > 1)
> > use that to facet / filter on.
> >
> > you can extend this to NrOfPics [0,N)  if you need to filter / facet on
> docs
> > with a certain nr of pics.
> >
> > also I wondered what else you wanted to do with this pic-related info. Do
> > you want to search on pic-description / pic-caption for instance? In that
> > case the dynamic-fields approach may not be what you want: how would you
> > know in which dynamic-field to search for a particular term? Would if be
> > pic_desc_1 , or pic_desc_x?  Of couse you could OR over all dynamic
> fields,
> > but you need to know how many pics an upperbound for the nr of pics and
> it
> > really doesn't feel right, to me at least.
> >
> > If you need search on pic_description for instance, but don't mind what
> pic
> > matches, you could create a single field pic_description and put in the
> > concat of all pic-descriptions and search on that, or just make it a a
> > multi-valued field.
> >
> > If you dont need search at all on these fields, the best thing imo is to
> > store all pic-related info of all pics together by concatenating them
> with
> > some delimiter which you know how to seperate at the client-side.
> > That or just store it in an external RDB since solr is just sitting on
> the
> > data and not doing anything intelligent with it.
> >
> > I assume btw that you don't want to sort/ facet on pic-desc /
> pic_caption/
> > pic_url either ( I have a hard time thinking of a useful usecase for
> that)
> >
> > HTH,
> >
> > Geert-Jan
> >
> >
> >
> > 2010/6/26 Saïd Radhouani <r....@gmail.com>
> >
> >> Thanks so much Otis. This is working great.
> >>
> >> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc
> w/o
> >> pic
> >>
> >> To the best of my knowledge, everyone is saying that faceting cannot be
> >> done on dynamic fields (only on definitive field names). Thus, I tried
> the
> >> following and it's working: I assume that the stored pictures have a
> >> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the
> index, it
> >> means that the underlying doc has at least one picture:
> >>
> >> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
> >>
> >> While this is working fine, I'm wondering whether there's a cleaner way
> to
> >> do the same thing without assuming that pictures have a sequential
> number.
> >>
> >> Also, do you have any documentation about handling Dynamic Fields using
> >> SolrJ. So far, I found only issues about that on JIRA, but no
> documentation.
> >>
> >> Thanks a lot.
> >>
> >> -Saïd
> >>
> >> On Jun 26, 2010, at 1:18 AM, Otis Gospodnetic wrote:
> >>
> >>> Saïd,
> >>>
> >>> Dynamic fields could help here, for example imagine a doc with:
> >>> id
> >>> pic_url_*
> >>> pic_caption_*
> >>> pic_description_*
> >>>
> >>> See http://wiki.apache.org/solr/SchemaXml#Dynamic_fields
> >>>
> >>> So, for you:
> >>>
> >>> <dynamicField name="pic_url_*"  type="string"  indexed="true"
> >> stored="true"/>
> >>> <dynamicField name="pic_caption_*"  type="text"  indexed="true"
> >> stored="true"/>
> >>> <dynamicField name="pic_description_*"  type="text"  indexed="true"
> >> stored="true"/>
> >>>
> >>> Then you can add docs with unlimited number of
> >> pic_(url|caption|description)_* fields, e.g.
> >>>
> >>> id
> >>> pic_url_1
> >>> pic_caption_1
> >>> pic_description_1
> >>>
> >>> id
> >>> pic_url_2
> >>> pic_caption_2
> >>> pic_description_2
> >>>
> >>>
> >>> Otis
> >>> ----
> >>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> >>> Lucene ecosystem search :: http://search-lucene.com/
> >>>
> >>>
> >>>
> >>> ----- Original Message ----
> >>>> From: Saïd Radhouani <r....@gmail.com>
> >>>> To: solr-user@lucene.apache.org
> >>>> Sent: Fri, June 25, 2010 6:01:13 PM
> >>>> Subject: Setting many properties for a multivalued field. Schema.xml ?
> >> External file?
> >>>>
> >>>> Hi,
> >>>
> >>> I'm trying to index data containing a multivalued field "picture",
> >>>> that has three properties: url, caption and description:
> >>>
> >>> <picture/>
> >>>>
> >>>   <url/>
> >>>
> >>>> <caption/>
> >>>   <description/>
> >>>
> >>> Thus, each
> >>>> indexed document might have many pictures, each of them has a url, a
> >> caption,
> >>>> and a description.
> >>>
> >>> I wonder wether it's possible to store this data using
> >>>> only schema.xml. I couldn't figure it out so far. Instead, I'm
> thinking
> >> of using
> >>>> an external file to sore the properties of each picture, but I haven't
> >> tried yet
> >>>> this solution, waiting for your suggestions...
> >>>
> >>> Thanks,
> >>> -Saïd
> >>
> >>
>
>

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

Posted by Saïd Radhouani <r....@gmail.com>.
Thanks, Geert-Jan, for the detailed answer. Actually, I don't search on these fields at all. I'm only filtering (w/ vs. w/o pic) and sorting (based on the number of pictures). Thus, your suggestion of adding an extra field NrOfPics [0,N] would be the best solution.

Regarding the other suggestion:

> If you don't need search at all on these fields, the best thing imo is to
> store all pic-related info of all pics together by concatenating them with
> some delimiter which you know how to separate at the client-side.
> That, or just store it in an external RDB, since Solr is just sitting on the
> data and not doing anything intelligent with it.

If I understand your suggestion correctly, you said that there's NO need to have many Dynamic Fields; instead, we can have one definitive field name, which can store a long string (a concatenation of information about tens of pictures), e.g., using "-" and "%" delimiters: pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...

I don't clearly see the reason for doing this. Is there a gain in terms of performance? Or does this make programming on the client side easier? Or something else?
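
For concreteness, the single-field approach could look roughly like the following sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and it uses a tab and a newline as separators instead of "-" and "%", because URLs commonly contain both of those characters. Values that contain a separator are rejected rather than escaped.

```python
# Hypothetical separators; tab and newline are assumed never to occur
# inside a url, caption or description.
FIELD_SEP = "\t"
RECORD_SEP = "\n"


def pack_pictures(pictures):
    """Join (url, caption, description) tuples into one string for a single stored field."""
    records = []
    for picture in pictures:
        for value in picture:
            if FIELD_SEP in value or RECORD_SEP in value:
                raise ValueError("value contains a separator character")
        records.append(FIELD_SEP.join(picture))
    return RECORD_SEP.join(records)


def unpack_pictures(packed):
    """Split the stored string back into a list of (url, caption, description) tuples."""
    if not packed:
        return []
    return [tuple(record.split(FIELD_SEP)) for record in packed.split(RECORD_SEP)]


pictures = [
    ("http://example.com/a-1.jpg", "Caption one", "First description"),
    ("http://example.com/b%202.jpg", "Caption two", "Second description"),
]
packed = pack_pictures(pictures)
```

The client only has to call unpack_pictures on the stored value it gets back; the index never sees the individual properties.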


My other question was: in case we use Dynamic Fields, is there any documentation about using SolrJ for this purpose?

Thanks
-Saïd



Re: Setting many properties for a multivalued field. Schema.xml ? External file?

Posted by Geert-Jan Brits <gb...@gmail.com>.
You can treat dynamic fields like any other field, so you can facet, sort,
filter, etc. on these fields (afaik).

I believe the confusion arises because the use case for dynamic fields is
sometimes misunderstood: people expect to be able to use them for some kind
of wildcard search, e.g., searching for a value in all of the dynamic fields
at once with something like pic_url_*. This, however, is NOT possible.

As far as your question goes:

> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o
> pic
> To the best of my knowledge, everyone is saying that faceting cannot be
> done on dynamic fields (only on definitive field names). Thus, I tried the
> following and it's working: I assume that the stored pictures have a
> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it
> means that the underlying doc has at least one picture:
> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
> While this is working fine, I'm wondering whether there's a cleaner way to
> do the same thing without assuming that pictures have a sequential number.

If I understand your question correctly: faceting on docs with and without
pics could of course be done like you mention; however, it would be more
efficient to define an extra field, hasAtLeastOnePic, with values (0 | 1),
and use that to facet / filter on.

You can extend this to NrOfPics [0,N) if you need to filter / facet on docs
with a certain nr of pics.
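
A minimal sketch of that idea, assuming the two extra fields are declared in schema.xml and filled in by the indexing client (the flag field is spelled hasAtLeastOnePic here; the helper names are hypothetical and not part of any Solr client):

```python
from urllib.parse import urlencode


def with_picture_counts(doc, pictures):
    """Return a copy of doc with the two derived fields added at index time."""
    enriched = dict(doc)
    enriched["NrOfPics"] = len(pictures)
    enriched["hasAtLeastOnePic"] = 1 if pictures else 0
    return enriched


def facet_params(only_with_pics=False):
    """Build the query string for faceting (and optionally filtering) on the flag field."""
    params = [
        ("q", "*:*"),
        ("facet", "on"),
        ("facet.field", "hasAtLeastOnePic"),
    ]
    if only_with_pics:
        params.append(("fq", "hasAtLeastOnePic:1"))
    return urlencode(params)


doc = with_picture_counts({"id": "doc1"}, ["a.jpg", "b.jpg"])
query_string = facet_params(only_with_pics=True)
```

With this, neither the facet nor the filter depends on how the pictures themselves are numbered.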

Also, I wondered what else you wanted to do with this pic-related info. Do
you want to search on pic-description / pic-caption, for instance? In that
case the dynamic-fields approach may not be what you want: how would you
know in which dynamic field to search for a particular term? Would it be
pic_desc_1, or pic_desc_x? Of course you could OR over all dynamic fields,
but then you need to know an upper bound for the nr of pics, and it
really doesn't feel right, to me at least.
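
To make the drawback concrete, here is a rough sketch of what OR-ing over the numbered fields means on the client. The helper name is hypothetical, and the term is assumed to be a single plain word that needs no query escaping:

```python
def or_over_dynamic_fields(prefix, term, max_pics):
    """Build one clause per numbered field, e.g. pic_description_1 .. pic_description_N."""
    clauses = [f"{prefix}{number}:{term}" for number in range(1, max_pics + 1)]
    return " OR ".join(clauses)


# max_pics is the upper bound the client has to know (or guess) in advance.
query = or_over_dynamic_fields("pic_description_", "sunset", 3)
```

The query grows with the upper bound, and any doc with more pictures than the bound is silently missed.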

If you need search on pic_description, for instance, but don't mind which pic
matches, you could create a single field pic_description and put in the
concat of all pic-descriptions and search on that, or just make it a
multi-valued field.
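
As an illustration of the multi-valued variant, the sketch below builds an XML update message in which pic_description simply appears once per picture. It assumes pic_description is declared with multiValued="true" in schema.xml; the function name is hypothetical:

```python
import xml.etree.ElementTree as ET


def build_add_message(doc_id, descriptions):
    """Build an <add><doc> message with one pic_description field per picture."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    id_field = ET.SubElement(doc, "field", name="id")
    id_field.text = doc_id
    for description in descriptions:
        field = ET.SubElement(doc, "field", name="pic_description")
        field.text = description
    # ElementTree escapes characters such as "&" in the field text.
    return ET.tostring(add, encoding="unicode")


message = build_add_message("doc1", ["A red barn", "Sunset & sea"])
```

A search on pic_description then matches the doc if any of its descriptions matches, without knowing how many pictures there are.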

If you don't need search at all on these fields, the best thing imo is to
store all pic-related info of all pics together by concatenating them with
some delimiter which you know how to separate at the client-side.
That, or just store it in an external RDB, since Solr is just sitting on the
data and not doing anything intelligent with it.

I assume btw that you don't want to sort / facet on pic-desc / pic_caption /
pic_url either (I have a hard time thinking of a useful use case for that).

HTH,

Geert-Jan




Re: Setting many properties for a multivalued field. Schema.xml ? External file?

Posted by Saïd Radhouani <r....@gmail.com>.
Thanks so much Otis. This is working great.

Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o pic

To the best of my knowledge, everyone is saying that faceting cannot be done on dynamic fields (only on definitive field names). Thus, I tried the following and it's working: I assume that the stored pictures have a sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it means that the underlying doc has at least one picture: 

...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*

While this is working fine, I'm wondering whether there's a cleaner way to do the same thing without assuming that pictures have a sequential number.

Also, do you have any documentation about handling Dynamic Fields using SolrJ? So far, I found only issues about that on JIRA, but no documentation.

Thanks a lot.

-Saïd



Re: Setting many properties for a multivalued field. Schema.xml ? External file?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Saïd,

Dynamic fields could help here. For example, imagine a doc with:
id
pic_url_*
pic_caption_*
pic_description_*

See http://wiki.apache.org/solr/SchemaXml#Dynamic_fields

So, for you:

<dynamicField name="pic_url_*"  type="string"  indexed="true"  stored="true"/>
<dynamicField name="pic_caption_*"  type="text"  indexed="true"  stored="true"/>
<dynamicField name="pic_description_*"  type="text"  indexed="true"  stored="true"/>

Then you can add docs with an unlimited number of pic_(url|caption|description)_* fields, e.g.:

id
pic_url_1
pic_caption_1
pic_description_1

id
pic_url_2
pic_caption_2
pic_description_2
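
A small sketch of how a client might generate such field names, assuming each doc's pictures are numbered from 1 and each picture is a (url, caption, description) tuple. The helper name is hypothetical and not tied to any particular Solr client:

```python
def picture_fields(pictures):
    """Map a list of (url, caption, description) tuples to numbered dynamic field names."""
    fields = {}
    for number, (url, caption, description) in enumerate(pictures, start=1):
        fields[f"pic_url_{number}"] = url
        fields[f"pic_caption_{number}"] = caption
        fields[f"pic_description_{number}"] = description
    return fields


doc = {"id": "doc1"}
doc.update(picture_fields([
    ("http://example.com/1.jpg", "First caption", "First description"),
    ("http://example.com/2.jpg", "Second caption", "Second description"),
]))
```

Every generated name matches one of the three dynamicField patterns above, so no schema change is needed when a doc has more pictures.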


Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/




Setting many properties for a multivalued field. Schema.xml ? External file?

Posted by Saïd Radhouani <r....@gmail.com>.
Hi,

I'm trying to index data containing a multivalued field "picture" that has three properties: url, caption and description:

<picture>
	<url/>
	<caption/>
	<description/>
</picture>

Thus, each indexed document might have many pictures, each of which has a url, a caption, and a description.

I wonder whether it's possible to store this data using only schema.xml. I couldn't figure it out so far. Instead, I'm thinking of using an external file to store the properties of each picture, but I haven't tried this solution yet; I'm waiting for your suggestions...

Thanks,
-Saïd


Re: [ANN] Solr 1.4.1 Released

Posted by Mark Miller <ma...@apache.org>.
Can a solr/maven dude look at this? I simply used the copy command on
the release to-do wiki (sounds like it should be updated).

If no one steps up, I'll try and straighten it out later.

On 6/25/10 10:28 AM, Stevo Slavić wrote:
> Congrats on the release!
> 
> Something seems to be wrong with the Solr 1.4.1 Maven artifacts: there is an
> extra solr in the path. E.g., solr-parent-1.4.1.pom is at
> http://repo1.maven.org/maven2/org/apache/solr/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom
> while it should be at
> http://repo1.maven.org/maven2/org/apache/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom.
> POMs seem to contain correct Maven artifact coordinates.
> 
> Regards,
> Stevo.
> 


Re: [ANN] Solr 1.4.1 Released

Posted by Stevo Slavić <ss...@gmail.com>.
Congrats on the release!

Something seems to be wrong with the Solr 1.4.1 Maven artifacts: there is an
extra solr in the path. E.g., solr-parent-1.4.1.pom is at
http://repo1.maven.org/maven2/org/apache/solr/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom
while it should be at
http://repo1.maven.org/maven2/org/apache/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom.
POMs seem to contain correct Maven artifact coordinates.
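
For reference, the expected location follows from the standard Maven 2 repository layout: the groupId with dots turned into slashes, then the artifactId, then the version. A minimal sketch, assuming the coordinates are org.apache.solr:solr-parent:1.4.1 (inferred from the second URL above; the function name is hypothetical):

```python
def maven2_pom_path(group_id, artifact_id, version):
    """Return the repository-relative path of a POM under the Maven 2 layout."""
    group_path = group_id.replace(".", "/")
    return f"{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.pom"


expected = maven2_pom_path("org.apache.solr", "solr-parent", "1.4.1")
# The deployed location corresponds to a groupId with an extra ".solr" segment.
deployed = maven2_pom_path("org.apache.solr.solr", "solr-parent", "1.4.1")
```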

Regards,
Stevo.
