Posted to solr-user@lucene.apache.org by Luís Portela Afonso <me...@gmail.com> on 2013/08/01 18:26:20 UTC
Re: Solr PolyField
Hi,
I have tried the solr.CloneFieldUpdateProcessorFactory suggested earlier in the thread, but the fields are not copied.
My dataconfig.xml
<field column="enclosure_type" xpath="/rss/channel/item/enclosure/@type" />
My schema.xml
<dynamicField name="enclosure_*" type="string" indexed="false" stored="true" multiValued="true" />
<!-- </field> -->
<!-- <dynamicField name="enclosure_*" type="string" indexed="false" stored="true" multiValued="false" /> -->
<field name="enclosure" type="text" indexed="true" stored="true" multiValued="true" />
My solrconfig.xml
<updateRequestProcessorChain name="multiple-clones">
<processor class="solr.CloneFieldUpdateProcessorFactory">
<str name="source">enclosure_title</str>
<str name="dest">enclosure</str>
</processor>
</updateRequestProcessorChain>
and
<requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">rss-data-config.xml</str>
<str name="update.chain">multiple-clones</str>
<str name="update.chain">fixIndexedValues</str>
</lst>
</requestHandler>
Can you help? Thanks ;)
On Jul 31, 2013, at 6:03 PM, Luís Portela Afonso <me...@gmail.com> wrote:
> Ok, thanks. I will check it.
>
> On Jul 31, 2013, at 5:08 PM, "Jack Krupansky" <ja...@basetechnology.com> wrote:
>
>> See:
>> https://builds.apache.org/job/Solr-Artifacts-4.x/javadoc/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
>>
>> I have more examples in my book.
>>
>> -- Jack Krupansky
>>
>> From: Luís Portela Afonso
>> Sent: Wednesday, July 31, 2013 11:41 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr PolyField
>>
>> Hum, ok.
>>
>> It's possible to add to a field, static text? Text that i write on the configuration and then append another field? I saw something like CloneFieldProcessor but when i'm starting solr, it says that could not find the class.
>> I was trying to use processors to move one field to another.
>>
>> I saw this:
>> <processor class="solr.FieldCopyProcessorFactory">
>> <str name="source">lastname firstname</str>
>> <str name="dest">fullname</str>
>> <bool name="append">true</bool>
>> <str name="append.delim">, </str>
>> </processor>
>> But when i try to use it solr says that he cannot find the solr.FieldCopyProcessorFactory. I'm using solr 4.4.0
>>
>> Thanks ;)
>>
>> On Jul 31, 2013, at 4:16 PM, Michael Della Bitta <mi...@appinions.com> wrote:
>>
>>
>> OK,
>>
>> Then I would suggest creating multiValued enclosure_type, etc. tags for
>> searching, and then one string-typed field to store the JSON snippet you've
>> been showing.
>>
>> Michael Della Bitta
>>
>> Applications Developer
>>
>> o: +1 646 532 3062 | c: +1 917 477 7906
>>
>> appinions inc.
>>
>> “The Science of Influence Marketing”
>>
>> 18 East 41st Street
>>
>> New York, NY 10017
>>
>> t: @appinions <https://twitter.com/Appinions> | g+:
>> plus.google.com/appinions<https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts>
>> w: appinions.com <http://www.appinions.com/>
>>
>>
>> On Wed, Jul 31, 2013 at 11:11 AM, Luís Portela Afonso <
>> meligaletiko@gmail.com> wrote:
>>
>>
>> As a single record? Hum, no.
>>
>> So an Rss has /rss/channel/ and then lot of /rss/channel/item, right?
>> Each /rss/channel/item is a new document on Solr. I start with the solr
>> example rss, but i change that to has more fields, other fields and get the
>> feed url from a database.
>>
>> So each /rss/channel/item is a document to the indexing, bue each
>> /rss/channel/item can have more than on enclosure tag.
>>
>> Many thanks
>>
>> On Jul 31, 2013, at 4:05 PM, Michael Della Bitta <
>> michael.della.bitta@appinions.com> wrote:
>>
>>
>> So you're trying to index a RSS feed as a single record, but you want to be
>> able to search for and retrieve individual entries from within the feed? Is
>> that the issue?
>>
>> Michael Della Bitta
>>
>>
>> On Wed, Jul 31, 2013 at 10:59 AM, Luís Portela Afonso <
>> meligaletiko@gmail.com> wrote:
>>
>>
>> This fields can be multiValued.
>> I the rss standart there is not correct to do that, but some sources do
>> and i like to grab it all. Is there any way that make it possible?
>>
>> Once again, Many thanks :)
>>
>> On Jul 31, 2013, at 3:54 PM, Michael Della Bitta <
>> michael.della.bitta@appinions.com> wrote:
>>
>>
>> Luís,
>>
>> Is there a reason why splitting this up into enclosure_type, enclosure_url,
>> and enclosure_length would not work?
>>
>>
>> Michael Della Bitta
>>
>>
>> On Wed, Jul 31, 2013 at 10:43 AM, Luís Portela Afonso <
>> meligaletiko@gmail.com> wrote:
>>
>>
>> Hi,
>>
>> I'm trying to index information of RSS Feeds.
>>
>> So in a more detailed explanation:
>>
>> The RSS feed has something like:
>> <enclosure url="http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3" length="32642192" type="audio/mpeg"/>
>>
>> *With my current configuration, this is working and i get a result like that:*
>>
>>
>> - enclosure:
>> [
>> - "audio/mpeg",
>> - "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
>> - "37521428"
>> ],
>>
>>
>> *BUT,* this is not the result that i'm trying to reach. With that i'm not
>> able to know in a "correct" way, if "audio/mpeg" is the *type*, or the *url*,
>> or the *length*.
>> *I want to reach something like:*
>>
>> - enclosure:
>> {
>> - type: "audio/mpeg",
>> - url: "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
>> - length: "37521428"
>> },
>>
>>
>>
>> So, how i intend this, this should be 3 fields inside of another field, no?
>>
>>
>>
>> Many Thanks for the answer and the help.
>>
>>
>> On Jul 31, 2013, at 3:34 PM, Erick Erickson <er...@gmail.com>
>> wrote:
>>
>> Nope. Solr fields are flat. Why do you want to do this? I'm
>> asking because this might be an XY problems and there
>> may be other possibilities.
>>
>> Best
>> Erick
>>
>> On Wed, Jul 31, 2013 at 5:09 AM, Luís Portela Afonso
>> <me...@gmail.com> wrote:
>>
>> Hi, I'm trying to create a field with multiple fields inside, that is:
>>
>> origin:
>> {
>>
>> htmlUrl: "http://www.gazzetta.it/",
>> streamId: "feed/http://www.gazzetta.it/rss/Home.xml",
>> title: "Gazzetta.it"
>>
>> },
>>
>>
>> Get something like this. Is that possible? I'm using Solr 4.4.0.
>>
>> Thanks
>>
Re: Solr PolyField
Posted by Erick Erickson <er...@gmail.com>.
Been there, done that <G>.
On Thu, Aug 1, 2013 at 4:46 PM, Luís Portela Afonso
<me...@gmail.com> wrote:
> Oh my god. Thanks for notice. The field name its wrong. It should be
> enclosure_type. I'm so sorry.
>
> On Aug 1, 2013, at 6:33 PM, Jack Krupansky <ja...@basetechnology.com>
> wrote:
>
> > Are you sure the “enclosure_title” field is populated?
> >
> > Have you updated the request handler?
> >
> > -- Jack Krupansky
> >
> > From: Luís Portela Afonso
> > Sent: Thursday, August 01, 2013 1:23 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr PolyField
> >
> > So i have merged the two chains in one, and this is not copying. Hum…
> >
> > My solrconfig.xml
> >
> > <updateRequestProcessorChain name="fixIndexedValues">
> > <!-- <processor class="solr.UUIDUpdateProcessorFactory">
> > <str name="fieldName">uuid</str>
> > </processor> -->
> > <processor class="solr.CloneFieldUpdateProcessorFactory">
> > <str name="source">enclosure_title</str>
> > <str name="dest">enclosure</str>
> > </processor>
> > <processor class="solr.RunUpdateProcessorFactory" />
> > </updateRequestProcessorChain>
> >
> > I try too with the UUIDUpdateProcessorFactory commented and nothing
> happens. Weird.
> >
> > On Aug 1, 2013, at 5:37 PM, "Jack Krupansky" <ja...@basetechnology.com>
> wrote:
> >
> >
> > Hmmm... not sure what happens if you have two update chains specified:
> >
> > <str name="update.chain">multiple-clones</str>
> > <str name="update.chain">fixIndexedValues</str>
> >
> > You need to merge them into one.
> >
> > -- Jack Krupansky
> >
> > From: Luís Portela Afonso
> > Sent: Thursday, August 01, 2013 12:26 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr PolyField
> >
> > Hi,
> >
> > I have tried the solr.CloneFieldUpdateProcessorFactory sugested in the
> pool but the fields are not copied.
> >
> > My dataconfig.xml
> >
> > <field column="enclosure_type" xpath="/rss/channel/item/enclosure/@type" />
> >
> > My schema.xml
> >
> > <dynamicField name="enclosure_*" type="string" indexed="false" stored="true" multiValued="true" />
> > <!-- </field> -->
> > <!-- <dynamicField name="enclosure_*" type="string" indexed="false" stored="true" multiValued="false" /> -->
> >
> > <field name="enclosure" type="text" indexed="true" stored="true" multiValued="true" />
> >
> > My solrconfig.xml
> >
> > <updateRequestProcessorChain name="multiple-clones">
> > <processor class="solr.CloneFieldUpdateProcessorFactory">
> > <str name="source">enclosure_title</str>
> > <str name="dest">enclosure</str>
> > </processor>
> > </updateRequestProcessorChain>
> >
> > and
> >
> > <requestHandler name="/dataimport"
> > class="org.apache.solr.handler.dataimport.DataImportHandler">
> > <lst name="defaults">
> > <str name="config">rss-data-config.xml</str>
> > <str name="update.chain">multiple-clones</str>
> > <str name="update.chain">fixIndexedValues</str>
> > </lst>
> >
> > </requestHandler>
> >
> > Can you help? Thanks ;)
> >
> > On Jul 31, 2013, at 6:03 PM, Luís Portela Afonso <
> meligaletiko@gmail.com> wrote:
> >
> >
> > Ok, thanks. I will check it.
> >
> > On Jul 31, 2013, at 5:08 PM, "Jack Krupansky" <ja...@basetechnology.com>
> wrote:
> >
> >
> > See:
> > https://builds.apache.org/job/Solr-Artifacts-4.x/javadoc/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
> >
> > I have more examples in my book.
> >
> > -- Jack Krupansky
> >
> > From: Luís Portela Afonso
> > Sent: Wednesday, July 31, 2013 11:41 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr PolyField
> >
> > Hum, ok.
> >
> > It's possible to add to a field, static text? Text that i write on
> the configuration and then append another field? I saw something like
> CloneFieldProcessor but when i'm starting solr, it says that could not find
> the class.
> > I was trying to use processors to move one field to another.
> >
> > I saw this:
> > <processor class="solr.FieldCopyProcessorFactory">
> > <str name="source">lastname firstname</str>
> > <str name="dest">fullname</str>
> > <bool name="append">true</bool>
> > <str name="append.delim">, </str>
> > </processor>
> > But when i try to use it solr says that he cannot find the
> solr.FieldCopyProcessorFactory. I'm using solr 4.4.0
> >
> > Thanks ;)
> >
> > On Jul 31, 2013, at 4:16 PM, Michael Della Bitta <
> michael.della.bitta@appinions.com> wrote:
> >
> >
> > OK,
> >
> > Then I would suggest creating multiValued enclosure_type, etc. tags
> for
> > searching, and then one string-typed field to store the JSON snippet
> you've
> > been showing.
> >
> > Michael Della Bitta
> >
> >
> > On Wed, Jul 31, 2013 at 11:11 AM, Luís Portela Afonso <
> > meligaletiko@gmail.com> wrote:
> >
> >
> > As a single record? Hum, no.
> >
> > So an Rss has /rss/channel/ and then lot of /rss/channel/item,
> right?
> > Each /rss/channel/item is a new document on Solr. I start with the
> solr
> > example rss, but i change that to has more fields, other fields
> and get the
> > feed url from a database.
> >
> > So each /rss/channel/item is a document to the indexing, bue each
> > /rss/channel/item can have more than on enclosure tag.
> >
> > Many thanks
> >
> > On Jul 31, 2013, at 4:05 PM, Michael Della Bitta <
> > michael.della.bitta@appinions.com> wrote:
> >
> >
> > So you're trying to index a RSS feed as a single record, but you want to be
> > able to search for and retrieve individual entries from within the feed? Is
> > that the issue?
> >
> > Michael Della Bitta
> >
> >
> > On Wed, Jul 31, 2013 at 10:59 AM, Luís Portela Afonso <
> > meligaletiko@gmail.com> wrote:
> >
> >
> > This fields can be multiValued.
> > I the rss standart there is not correct to do that, but some
> sources do
> > and i like to grab it all. Is there any way that make it
> possible?
> >
> > Once again, Many thanks :)
> >
> > On Jul 31, 2013, at 3:54 PM, Michael Della Bitta <
> > michael.della.bitta@appinions.com> wrote:
> >
> >
> > Luís,
> >
> > Is there a reason why splitting this up into enclosure_type, enclosure_url,
> > and enclosure_length would not work?
> >
> >
> > Michael Della Bitta
> >
> >
> > On Wed, Jul 31, 2013 at 10:43 AM, Luís Portela Afonso <
> > meligaletiko@gmail.com> wrote:
> >
> >
> > Hi,
> >
> > I'm trying to index information of RSS Feeds.
> >
> > So in a more detailed explanation:
> >
> > The RSS feed has something like:
> > <enclosure url="http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3" length="32642192" type="audio/mpeg"/>
> >
> > *With my current configuration, this is working and i get a result like that:*
> >
> >
> > - enclosure:
> > [
> > - "audio/mpeg",
> > - "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
> > - "37521428"
> > ],
> >
> >
> > *BUT,* this is not the result that i'm trying to reach. With that i'm not
> > able to know in a "correct" way, if "audio/mpeg" is the *type*, or the *url*,
> > or the *length*.
> > *I want to reach something like:*
> >
> > - enclosure:
> > {
> > - type: "audio/mpeg",
> > - url: "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
> > - length: "37521428"
> > },
> >
> >
> >
> > So, how i intend this, this should be 3 fields inside of another field, no?
> >
> >
> >
> > Many Thanks for the answer and the help.
> >
> >
> > On Jul 31, 2013, at 3:34 PM, Erick Erickson <
> erickerickson@gmail.com>
> > wrote:
> >
> > Nope. Solr fields are flat. Why do you want to do this? I'm
> > asking because this might be an XY problems and there
> > may be other possibilities.
> >
> > Best
> > Erick
> >
> > On Wed, Jul 31, 2013 at 5:09 AM, Luís Portela Afonso
> > <me...@gmail.com> wrote:
> >
> > Hi, I'm trying to create a field with multiple fields
> inside, that is:
> >
> > origin:
> > {
> >
> > htmlUrl: "http://www.gazzetta.it/",
> > streamId: "feed/http://www.gazzetta.it/rss/Home.xml",
> > title: "Gazzetta.it"
> >
> > },
> >
> >
> > Get something like this. Is that possible? I'm using Solr
> 4.4.0.
> >
> > Thanks
> >
Re: Solr PolyField
Posted by Luís Portela Afonso <me...@gmail.com>.
Oh my god. Thanks for noticing. The field name is wrong. It should be enclosure_type. I'm so sorry.
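
For reference, a minimal sketch of the merged chain with the source field corrected to enclosure_type. It reuses the chain name and field names from this thread and assumes enclosure_type is the field the data import config actually populates; it has not been tested against a live Solr 4.4.0 instance.

```xml
<updateRequestProcessorChain name="fixIndexedValues">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <!-- source must name a field that the import really fills in -->
    <str name="source">enclosure_type</str>
    <str name="dest">enclosure</str>
  </processor>
  <!-- RunUpdateProcessorFactory goes last so the document is actually indexed -->
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The chain only takes effect if the /dataimport handler's update.chain default refers to it by this name.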
On Aug 1, 2013, at 6:33 PM, Jack Krupansky <ja...@basetechnology.com> wrote:
> Are you sure the “enclosure_title” field is populated?
>
> Have you updated the request handler?
>
> -- Jack Krupansky
>
> From: Luís Portela Afonso
> Sent: Thursday, August 01, 2013 1:23 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr PolyField
>
> So i have merged the two chains in one, and this is not copying. Hum…
>
> My solrconfig.xml
>
> <updateRequestProcessorChain name="fixIndexedValues">
> <!-- <processor class="solr.UUIDUpdateProcessorFactory">
> <str name="fieldName">uuid</str>
> </processor> -->
> <processor class="solr.CloneFieldUpdateProcessorFactory">
> <str name="source">enclosure_title</str>
> <str name="dest">enclosure</str>
> </processor>
> <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
>
> I try too with the UUIDUpdateProcessorFactory commented and nothing happens. Weird.
>
> On Aug 1, 2013, at 5:37 PM, "Jack Krupansky" <ja...@basetechnology.com> wrote:
>
>
> Hmmm... not sure what happens if you have two update chains specified:
>
> <str name="update.chain">multiple-clones</str>
> <str name="update.chain">fixIndexedValues</str>
>
> You need to merge them into one.
>
> -- Jack Krupansky
>
> From: Luís Portela Afonso
> Sent: Thursday, August 01, 2013 12:26 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr PolyField
>
> Hi,
>
> I have tried the solr.CloneFieldUpdateProcessorFactory sugested in the pool but the fields are not copied.
>
> My dataconfig.xml
>
> <field column="enclosure_type" xpath="/rss/channel/item/enclosure/@type" />
>
> My schema.xml
>
> <dynamicField name="enclosure_*" type="string" indexed="false" stored="true" multiValued="true" />
> <!-- </field> -->
> <!-- <dynamicField name="enclosure_*" type="string" indexed="false" stored="true" multiValued="false" /> -->
>
> <field name="enclosure" type="text" indexed="true" stored="true" multiValued="true" />
>
> My solrconfig.xml
>
> <updateRequestProcessorChain name="multiple-clones">
> <processor class="solr.CloneFieldUpdateProcessorFactory">
> <str name="source">enclosure_title</str>
> <str name="dest">enclosure</str>
> </processor>
> </updateRequestProcessorChain>
>
> and
>
> <requestHandler name="/dataimport"
> class="org.apache.solr.handler.dataimport.DataImportHandler">
> <lst name="defaults">
> <str name="config">rss-data-config.xml</str>
> <str name="update.chain">multiple-clones</str>
> <str name="update.chain">fixIndexedValues</str>
> </lst>
>
> </requestHandler>
>
> Can you help? Thanks ;)
>
> On Jul 31, 2013, at 6:03 PM, Luís Portela Afonso <me...@gmail.com> wrote:
>
>
> Ok, thanks. I will check it.
>
> On Jul 31, 2013, at 5:08 PM, "Jack Krupansky" <ja...@basetechnology.com> wrote:
>
>
> See:
> https://builds.apache.org/job/Solr-Artifacts-4.x/javadoc/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
>
> I have more examples in my book.
>
> -- Jack Krupansky
>
> From: Luís Portela Afonso
> Sent: Wednesday, July 31, 2013 11:41 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr PolyField
>
> Hum, ok.
>
> It's possible to add to a field, static text? Text that i write on the configuration and then append another field? I saw something like CloneFieldProcessor but when i'm starting solr, it says that could not find the class.
> I was trying to use processors to move one field to another.
>
> I saw this:
> <processor class="solr.FieldCopyProcessorFactory">
> <str name="source">lastname firstname</str>
> <str name="dest">fullname</str>
> <bool name="append">true</bool>
> <str name="append.delim">, </str>
> </processor>
> But when i try to use it solr says that he cannot find the solr.FieldCopyProcessorFactory. I'm using solr 4.4.0
>
> Thanks ;)
>
> On Jul 31, 2013, at 4:16 PM, Michael Della Bitta <mi...@appinions.com> wrote:
>
>
> OK,
>
> Then I would suggest creating multiValued enclosure_type, etc. tags for
> searching, and then one string-typed field to store the JSON snippet you've
> been showing.
>
> Michael Della Bitta
>
>
> On Wed, Jul 31, 2013 at 11:11 AM, Luís Portela Afonso <
> meligaletiko@gmail.com> wrote:
>
>
> As a single record? Hum, no.
>
> So an Rss has /rss/channel/ and then lot of /rss/channel/item, right?
> Each /rss/channel/item is a new document on Solr. I start with the solr
> example rss, but i change that to has more fields, other fields and get the
> feed url from a database.
>
> So each /rss/channel/item is a document to the indexing, bue each
> /rss/channel/item can have more than on enclosure tag.
>
> Many thanks
>
> On Jul 31, 2013, at 4:05 PM, Michael Della Bitta <
> michael.della.bitta@appinions.com> wrote:
>
>
> So you're trying to index a RSS feed as a single record, but you want to be
> able to search for and retrieve individual entries from within the feed? Is
> that the issue?
>
> Michael Della Bitta
>
>
> On Wed, Jul 31, 2013 at 10:59 AM, Luís Portela Afonso <
> meligaletiko@gmail.com> wrote:
>
>
> This fields can be multiValued.
> I the rss standart there is not correct to do that, but some sources do
> and i like to grab it all. Is there any way that make it possible?
>
> Once again, Many thanks :)
>
> On Jul 31, 2013, at 3:54 PM, Michael Della Bitta <
> michael.della.bitta@appinions.com> wrote:
>
>
> Luís,
>
> Is there a reason why splitting this up into enclosure_type, enclosure_url,
> and enclosure_length would not work?
>
>
> Michael Della Bitta
>
>
> On Wed, Jul 31, 2013 at 10:43 AM, Luís Portela Afonso <
> meligaletiko@gmail.com> wrote:
>
>
> Hi,
>
> I'm trying to index information of RSS Feeds.
>
> So in a more detailed explanation:
>
> The RSS feed has something like:
> <enclosure url="http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3" length="32642192" type="audio/mpeg"/>
>
> *With my current configuration, this is working and i get a result like that:*
>
>
> - enclosure:
> [
> - "audio/mpeg",
> - "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
> - "37521428"
> ],
>
>
> *BUT,* this is not the result that i'm trying to reach. With that i'm not
> able to know in a "correct" way, if "audio/mpeg" is the *type*, or the *url*,
> or the *length*.
> *I want to reach something like:*
>
> - enclosure:
> {
> - type: "audio/mpeg",
> - url: "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
> - length: "37521428"
> },
>
>
>
> So, how i intend this, this should be 3 fields inside of another field, no?
>
>
>
> Many Thanks for the answer and the help.
>
>
> On Jul 31, 2013, at 3:34 PM, Erick Erickson <er...@gmail.com>
> wrote:
>
> Nope. Solr fields are flat. Why do you want to do this? I'm
> asking because this might be an XY problems and there
> may be other possibilities.
>
> Best
> Erick
>
> On Wed, Jul 31, 2013 at 5:09 AM, Luís Portela Afonso
> <me...@gmail.com> wrote:
>
> Hi, I'm trying to create a field with multiple fields inside, that is:
>
> origin:
> {
>
> htmlUrl: "http://www.gazzetta.it/",
> streamId: "feed/http://www.gazzetta.it/rss/Home.xml",
> title: "Gazzetta.it"
>
> },
>
>
> Get something like this. Is that possible? I'm using Solr 4.4.0.
>
> Thanks
>
Re: Solr PolyField
Posted by Jack Krupansky <ja...@basetechnology.com>.
Are you sure the “enclosure_title” field is populated?
Have you updated the request handler?
-- Jack Krupansky
From: Luís Portela Afonso
Sent: Thursday, August 01, 2013 1:23 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr PolyField
So I have merged the two chains into one, and it is still not copying. Hmm…
My solrconfig.xml
<updateRequestProcessorChain name="fixIndexedValues">
<!-- <processor class="solr.UUIDUpdateProcessorFactory">
<str name="fieldName">uuid</str>
</processor> -->
<processor class="solr.CloneFieldUpdateProcessorFactory">
<str name="source">enclosure_title</str>
<str name="dest">enclosure</str>
</processor>
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
I also tried with the UUIDUpdateProcessorFactory commented out, and nothing happens. Weird.
On Aug 1, 2013, at 5:37 PM, "Jack Krupansky" <ja...@basetechnology.com> wrote:
Hmmm... not sure what happens if you have two update chains specified:
<str name="update.chain">multiple-clones</str>
<str name="update.chain">fixIndexedValues</str>
You need to merge them into one.
-- Jack Krupansky
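
A sketch of what a single update.chain entry in the handler defaults might look like, assuming the processors from multiple-clones have already been moved into the fixIndexedValues chain. The handler name, class, and config file name are the ones quoted in this thread.

```xml
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">rss-data-config.xml</str>
    <!-- exactly one update.chain entry, pointing at the merged chain -->
    <str name="update.chain">fixIndexedValues</str>
  </lst>
</requestHandler>
```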
From: Luís Portela Afonso
Sent: Thursday, August 01, 2013 12:26 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr PolyField
Hi,
I have tried the solr.CloneFieldUpdateProcessorFactory suggested earlier in the thread, but the fields are not copied.
My dataconfig.xml
<field column="enclosure_type" xpath="/rss/channel/item/enclosure/@type" />
My schema.xml
<dynamicField name="enclosure_*" type="string" indexed="false" stored="true" multiValued="true" />
<!-- </field> -->
<!-- <dynamicField name="enclosure_*" type="string" indexed="false" stored="true" multiValued="false" /> -->
<field name="enclosure" type="text" indexed="true" stored="true" multiValued="true" />
My solrconfig.xml
<updateRequestProcessorChain name="multiple-clones">
<processor class="solr.CloneFieldUpdateProcessorFactory">
<str name="source">enclosure_title</str>
<str name="dest">enclosure</str>
</processor>
</updateRequestProcessorChain>
and
<requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">rss-data-config.xml</str>
<str name="update.chain">multiple-clones</str>
<str name="update.chain">fixIndexedValues</str>
</lst>
</requestHandler>
Can you help? Thanks ;)
On Jul 31, 2013, at 6:03 PM, Luís Portela Afonso <me...@gmail.com> wrote:
Ok, thanks. I will check it.
On Jul 31, 2013, at 5:08 PM, "Jack Krupansky" <ja...@basetechnology.com> wrote:
See:
https://builds.apache.org/job/Solr-Artifacts-4.x/javadoc/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
I have more examples in my book.
-- Jack Krupansky
From: Luís Portela Afonso
Sent: Wednesday, July 31, 2013 11:41 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr PolyField
Hum, ok.
Is it possible to add static text to a field? That is, text that I write in the configuration, with another field appended to it? I saw something like CloneFieldProcessor, but when I start Solr it says that it could not find the class.
I was trying to use processors to move one field to another.
I saw this:
<processor class="solr.FieldCopyProcessorFactory">
<str name="source">lastname firstname</str>
<str name="dest">fullname</str>
<bool name="append">true</bool>
<str name="append.delim">, </str>
</processor>
But when I try to use it, Solr says that it cannot find solr.FieldCopyProcessorFactory. I'm using Solr 4.4.0.
Thanks ;)
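
solr.FieldCopyProcessorFactory does not appear to ship with Solr 4.x, which would explain the class-not-found error. A rough equivalent, offered only as a sketch, is to clone both source fields into the destination and then join the values with a delimiter. It assumes the stock CloneFieldUpdateProcessorFactory and ConcatFieldUpdateProcessorFactory behave as their javadocs describe; the chain name concat-fullname is hypothetical, and the field names are taken from the snippet above.

```xml
<updateRequestProcessorChain name="concat-fullname">
  <!-- copy the values of both source fields into fullname -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <arr name="source">
      <str>lastname</str>
      <str>firstname</str>
    </arr>
    <str name="dest">fullname</str>
  </processor>
  <!-- collapse the cloned values into one string, separated by ", " -->
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">fullname</str>
    <str name="delimiter">, </str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```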
On Jul 31, 2013, at 4:16 PM, Michael Della Bitta <mi...@appinions.com> wrote:
OK,
Then I would suggest creating multiValued enclosure_type, etc. tags for
searching, and then one string-typed field to store the JSON snippet you've
been showing.
Michael Della Bitta
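
The suggestion above could be sketched in schema.xml roughly as follows. The three attribute field names come from this thread; enclosure_json is a hypothetical name for the display-only field, and the choice of indexed/stored flags is an assumption rather than something the thread specifies.

```xml
<!-- one multiValued field per enclosure attribute, used for searching -->
<field name="enclosure_type"   type="string" indexed="true" stored="true" multiValued="true" />
<field name="enclosure_url"    type="string" indexed="true" stored="true" multiValued="true" />
<field name="enclosure_length" type="string" indexed="true" stored="true" multiValued="true" />
<!-- hypothetical field that stores the JSON snippet as-is, for retrieval only -->
<field name="enclosure_json"   type="string" indexed="false" stored="true" multiValued="true" />
```

Explicit fields like these would take precedence over the enclosure_* dynamicField shown earlier in the thread.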
On Wed, Jul 31, 2013 at 11:11 AM, Luís Portela Afonso <
meligaletiko@gmail.com> wrote:
As a single record? Hum, no.
So an RSS feed has /rss/channel/ and then a lot of /rss/channel/item, right?
Each /rss/channel/item is a new document in Solr. I started with the Solr
RSS example, but I changed it to have more fields, other fields, and to get the
feed URL from a database.
So each /rss/channel/item is a document for indexing, but each
/rss/channel/item can have more than one enclosure tag.
Many thanks
On Jul 31, 2013, at 4:05 PM, Michael Della Bitta <
michael.della.bitta@appinions.com> wrote:
So you're trying to index an RSS feed as a single record, but you want to be
able to search for and retrieve individual entries from within the feed? Is
that the issue?
Michael Della Bitta
On Wed, Jul 31, 2013 at 10:59 AM, Luís Portela Afonso <
meligaletiko@gmail.com> wrote:
These fields can be multiValued.
In the RSS standard it is not correct to do that, but some sources do,
and I would like to grab it all. Is there any way to make that possible?
Once again, Many thanks :)
On Jul 31, 2013, at 3:54 PM, Michael Della Bitta <
michael.della.bitta@appinions.com> wrote:
Luís,
Is there a reason why splitting this up into enclosure_type, enclosure_url,
and enclosure_length would not work?
Michael Della Bitta
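
Splitting the enclosure into separate fields, as asked above, might look like this in the data import config. The enclosure_type line matches the mapping posted elsewhere in this thread; the url and length lines are assumed by analogy with the attributes of the RSS enclosure element.

```xml
<!-- one column per attribute of /rss/channel/item/enclosure -->
<field column="enclosure_type"   xpath="/rss/channel/item/enclosure/@type" />
<field column="enclosure_url"    xpath="/rss/channel/item/enclosure/@url" />
<field column="enclosure_length" xpath="/rss/channel/item/enclosure/@length" />
```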
On Wed, Jul 31, 2013 at 10:43 AM, Luís Portela Afonso <
meligaletiko@gmail.com> wrote:
Hi,
I'm trying to index information of RSS Feeds.
So in a more detailed explanation:
The RSS feed has something like:
<enclosure url="http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3" length="32642192" type="audio/mpeg"/>
*With my current configuration, this is working and i get a result
like
that:*
- enclosure:
[
- "audio/mpeg",
- "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
- "37521428"
],
*BUT,* this is not the result that i'm trying to reach. With that i'm
not
able to know in a "correct" way, if "audio/mpeg" is the *type*, or
the *
url,* or the *length*.
*
*
*I want to reach something like:*
-
- enclosure:
{
- type: "a <http://www.gazzetta.it/>udio/mpeg",
- url:
"http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
- length: "37521428"
},
So, how i intend this, this should be 3 fields inside of another
field,
no?
Many Thanks for the answer and the help.
On Jul 31, 2013, at 3:34 PM, Erick Erickson <er...@gmail.com> wrote:
Nope. Solr fields are flat. Why do you want to do this? I'm
asking because this might be an XY problem and there
may be other possibilities.
Best
Erick
On Wed, Jul 31, 2013 at 5:09 AM, Luís Portela Afonso
<me...@gmail.com> wrote:
Hi, I'm trying to create a field with multiple fields inside, that is:
origin:
{
htmlUrl: "http://www.gazzetta.it/",
streamId: "feed/http://www.gazzetta.it/rss/Home.xml",
title: "Gazzetta.it"
},
I want to get something like this. Is that possible? I'm using Solr 4.4.0.
Thanks
Re: Solr PolyField
Posted by Luís Portela Afonso <me...@gmail.com>.
So I have merged the two chains into one, and it is still not copying. Hum…
My solrconfig.xml
<updateRequestProcessorChain name="fixIndexedValues">
  <!-- <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">uuid</str>
  </processor> -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">enclosure_title</str>
    <str name="dest">enclosure</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
I also tried with the UUIDUpdateProcessorFactory commented out, and nothing happens. Weird.
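One thing that may be worth checking: the data-config.xml quoted in this thread populates enclosure_type, while the processor above clones enclosure_title, so the source field may simply be empty at index time. A sketch of the same chain with the source switched to enclosure_type (an assumption about the intent, not a confirmed fix):

```xml
<updateRequestProcessorChain name="fixIndexedValues">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <!-- assumed: enclosure_type is the field the DIH mapping actually fills -->
    <str name="source">enclosure_type</str>
    <str name="dest">enclosure</str>
  </processor>
  <!-- kept last: processors placed after it do not affect what is indexed -->
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```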
Re: Solr PolyField
Posted by Jack Krupansky <ja...@basetechnology.com>.
Hmmm... not sure what happens if you have two update chains specified:
<str name="update.chain">multiple-clones</str>
<str name="update.chain">fixIndexedValues</str>
You need to merge them into one.
-- Jack Krupansky
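A sketch of what the merged setup could look like on the handler side, assuming the single remaining chain is named fixIndexedValues. The handler defaults then carry exactly one update.chain entry:

```xml
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">rss-data-config.xml</str>
    <!-- a single chain; the name is assumed to match the merged chain in solrconfig.xml -->
    <str name="update.chain">fixIndexedValues</str>
  </lst>
</requestHandler>
```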