Posted to solr-user@lucene.apache.org by Zheng Lin Edwin Yeo <ed...@gmail.com> on 2018/01/03 10:53:34 UTC

Removing some fields from uprefix

Hi,

I'm using Solr 7.2.0, and I have this /update/extract handler in my solrconfig.xml:

  <requestHandler name="/update/extract"
                  startup="lazy"
                  class="solr.extraction.ExtractingRequestHandler" >
    <lst name="defaults">
      <str name="xpath">/xhtml:html/xhtml:body/descendant:node()</str>
      <str name="capture">content</str>
      <str name="fmap.meta">attr_meta_</str>
      <str name="uprefix">attr_</str>
      <str name="lowernames">true</str>
      <str name="update.chain">dedupe</str>
    </lst>
  </requestHandler>

I understand that <str name="uprefix">attr_</str> causes all
generated fields that aren't defined in the schema to be prefixed with
attr_.

Is there any way that we can remove some of the fields, but keep the rest?
For example, I would like to remove attr_x_parsed_by.

Regards,
Edwin

Re: Removing some fields from uprefix

Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Hi Alex,

Thanks for your advice. It works.

Regards,
Edwin



Re: Removing some fields from uprefix

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
uprefix applies only to fields that do NOT exist in the schema. So you
can define x_parsed_by in the schema, but map it to a field type that has
indexed="false", stored="false" and docValues="false". That way the field is
acknowledged but effectively dropped.
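
A minimal sketch of what those schema entries might look like, assuming
lowernames=true turns the Tika metadata key into x_parsed_by (consistent
with the attr_x_parsed_by field in the question). The type name
ignored_meta is hypothetical, and multiValued="true" is an assumption in
case the extractor emits more than one value:

```xml
<!-- Hypothetical "sink" type: values are accepted but not indexed,
     stored, or kept as docValues, so nothing is retained. -->
<fieldType name="ignored_meta" class="solr.StrField"
           indexed="false" stored="false" docValues="false"
           multiValued="true"/>

<!-- Declaring the field explicitly means uprefix no longer applies to it,
     so no attr_x_parsed_by field is created. -->
<field name="x_parsed_by" type="ignored_meta"/>
```

Some stock configsets already ship a similar type (often named "ignored"),
so it may be enough to point the field at that type instead of defining a
new one.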

Regards,
   Alex.
