Posted to solr-user@lucene.apache.org by Pushkar Raste <pu...@gmail.com> on 2019/05/22 15:11:33 UTC

Is it possible to reconstruct non-stored fields and turn those into stored fields

I know this is a long shot. I am trying to move from Solr 4 to Solr 7.
Reindexing all the data from the source is difficult to do in a reasonable
time. All the fields are of basic types such as int, long, float, double,
boolean, date, and string.

Since these fields don’t have analyzers, I was wondering whether their values
can be recovered by iterating over the index while reading the documents.
-- 
— Pushkar Raste

Re: Is it possible to reconstruct non-stored fields and turn those into stored fields

Posted by Erick Erickson <er...@gmail.com>.
You might get some pointers from the Luke code.

All in all, I’d focus on re-indexing somehow. Unless the original documents are truly impossible to find again, that’s probably easier.

Best,
Erick

> On May 22, 2019, at 3:30 PM, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> On 5/22/2019 3:51 PM, Pushkar Raste wrote:
>> Looks like giving Luke a shot is the answer. Can you point me to an example
>> of extracting the fields from the inverted index using Luke?
> 
> Luke is a GUI application that can view the Lucene index in considerable detail.  To use Luke directly, you'd have to have somebody running it and typing/copying what they find to some kind of system for indexing. It would be a very manual process.
> 
> To do it programmatically, you would have to write code yourself using the Lucene API.  I don't think we'd be able to point you at existing code.
> 
> Thanks,
> Shawn


Re: Is it possible to reconstruct non-stored fields and turn those into stored fields

Posted by Shawn Heisey <ap...@elyograg.org>.
On 5/22/2019 3:51 PM, Pushkar Raste wrote:
> Looks like giving Luke a shot is the answer. Can you point me to an example
> of extracting the fields from the inverted index using Luke?

Luke is a GUI application that can view the Lucene index in considerable 
detail.  To use Luke directly, you'd have to have somebody running it 
and typing/copying what they find to some kind of system for indexing. 
It would be a very manual process.

To do it programmatically, you would have to write code yourself using 
the Lucene API.  I don't think we'd be able to point you at existing code.

Thanks,
Shawn

Re: Is it possible to reconstruct non-stored fields and turn those into stored fields

Posted by Pushkar Raste <pu...@gmail.com>.
We have only a handful of fields that are stored, and many (non-text) fields
that are neither stored nor have docValues :-(

Looks like giving Luke a shot is the answer. Can you point me to an example
of extracting the fields from the inverted index using Luke?

On Wed, May 22, 2019 at 11:52 AM Erick Erickson <er...@gmail.com> wrote:

> Well, if they’re all docValues or stored=true, sure. It’d be kind of
> slow. The short form is “if you can specify fl=f1,f2,f3… for all your
> fields and see all your values, then it’s easy, if slow”.
>
> If that works _and_ you are on Solr 4.7+, cursorMark will help with the
> “deep paging” issue.
>
> If they’re all docValues, you could use the /export handler to dump them
> all to a file and re-index that.
>
> If none of those are possible, you can do this, but it’d be quite painful.
> Luke can reassemble a document (lossily for text fields, but in this case
> it’d be OK since they’re simple types) by examining the inverted index and
> pulling out the values. It would be painfully slow, and you’d probably have
> to write custom code at the Lucene level to make it all work.
>
> Best,
> Erick
>
> > On May 22, 2019, at 8:11 AM, Pushkar Raste <pu...@gmail.com> wrote:
> >
> > I know this is a long shot. I am trying to move from Solr 4 to Solr 7.
> > Reindexing all the data from the source is difficult to do in a reasonable
> > time. All the fields are of basic types such as int, long, float, double,
> > boolean, date, and string.
> >
> > Since these fields don’t have analyzers, I was wondering whether their values
> > can be recovered by iterating over the index while reading the documents.
> > --
> > — Pushkar Raste
>
> --
— Pushkar Raste

Re: Is it possible to reconstruct non-stored fields and turn those into stored fields

Posted by Erick Erickson <er...@gmail.com>.
Well, if they’re all docValues or stored=true, sure. It’d be kind of slow. The short form is “if you can specify fl=f1,f2,f3… for all your fields and see all your values, then it’s easy, if slow”.

If that works _and_ you are on Solr 4.7+, cursorMark will help with the “deep paging” issue.
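
As a rough illustration of the fl-plus-cursorMark approach (a sketch, not code from this thread): the loop below keeps requesting pages sorted on the unique key and stops when Solr returns the same cursor it was given. The names iter_all_docs, http_fetch and fetch_page, the "id" unique key, and the base URL in the usage note are all hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_fetch(base_url):
    """Return a function that runs one /select request against base_url.

    base_url is assumed to look like http://host:8983/solr/<core>.
    """
    def fetch(params):
        with urlopen(f"{base_url}/select?{urlencode(params)}") as resp:
            return json.load(resp)
    return fetch


def iter_all_docs(fetch_page, fields, unique_key="id", rows=1000):
    """Yield every document, following cursorMark from page to page.

    fetch_page takes a dict of request parameters and returns the parsed
    JSON response. The sort must include the unique key for cursorMark
    to be accepted.
    """
    cursor = "*"  # "*" asks Solr to start from the beginning
    while True:
        page = fetch_page({
            "q": "*:*",
            "fl": ",".join(fields),
            "sort": f"{unique_key} asc",
            "rows": str(rows),
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # same cursor back means nothing is left
            break
        cursor = next_cursor
```

Usage would be along the lines of iter_all_docs(http_fetch("http://localhost:8983/solr/collection1"), ["id", "f1", "f2"]), feeding each yielded document to whatever indexes into the new cluster. This only returns values for fields that are stored (or have docValues and are retrievable that way).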

If they’re all docValues, you could use the /export handler to dump them all to a file and re-index that.
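
For the /export route, a request needs a query, a field list, and a sort, and every field involved has to have docValues. A minimal sketch of building such a request URL follows; export_url, the collection URL, and the field names are made up for illustration.

```python
from urllib.parse import urlencode


def export_url(collection_url, fields, sort_field):
    """Build an /export request URL that dumps `fields` for all documents.

    Assumes every field in `fields`, and `sort_field`, has docValues.
    """
    params = {
        "q": "*:*",
        "fl": ",".join(fields),
        "sort": f"{sort_field} asc",
    }
    return f"{collection_url}/export?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical collection URL and field names.
    print(export_url("http://localhost:8983/solr/collection1",
                     ["id", "f1", "f2"], "id"))
```

The response can then be saved to a file and fed back in as documents for the new index.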

If none of those are possible, you can do this, but it’d be quite painful. Luke can reassemble a document (lossily for text fields, but in this case it’d be OK since they’re simple types) by examining the inverted index and pulling out the values. It would be painfully slow, and you’d probably have to write custom code at the Lucene level to make it all work.
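
To make the idea of "examining the inverted index and pulling out the values" concrete, here is a toy model in plain Python. It is not Lucene code and does not use the Lucene API: the inverted index is modeled as a nested dict (field -> term -> list of document numbers), and uninvert is a hypothetical helper that turns it back into per-document field values. The sketch assumes terms have already been decoded to readable values. In a real index, numeric and date terms are stored in an encoded form and would need decoding, and deleted documents would have to be skipped.

```python
from collections import defaultdict


def uninvert(inverted_index):
    """Rebuild {doc_id: {field: [values]}} from {field: {term: [doc_ids]}}.

    Every term of every field is visited once, and the term is appended
    to the field of each document in its posting list.
    """
    docs = defaultdict(lambda: defaultdict(list))
    for field, terms in inverted_index.items():
        for term, doc_ids in terms.items():
            for doc_id in doc_ids:
                docs[doc_id][field].append(term)
    # Convert the nested defaultdicts to plain dicts for the caller.
    return {doc_id: dict(fields) for doc_id, fields in docs.items()}


if __name__ == "__main__":
    toy_index = {
        "color": {"red": [0, 2], "blue": [1]},
        "size": {"10": [0], "20": [1, 2]},
    }
    for doc_id, fields in sorted(uninvert(toy_index).items()):
        print(doc_id, fields)
```

The cost is visible even in the toy version: the whole term dictionary of every field has to be walked to recover any single document, which is why this is slow on a large index.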

Best,
Erick

> On May 22, 2019, at 8:11 AM, Pushkar Raste <pu...@gmail.com> wrote:
> 
> I know this is a long shot. I am trying to move from Solr 4 to Solr 7.
> Reindexing all the data from the source is difficult to do in a reasonable
> time. All the fields are of basic types such as int, long, float, double,
> boolean, date, and string.
> 
> Since these fields don’t have analyzers, I was wondering whether their values
> can be recovered by iterating over the index while reading the documents.
> -- 
> — Pushkar Raste