Posted to solr-user@lucene.apache.org by Kate Kas <ka...@gmail.com> on 2015/12/06 21:56:37 UTC

import file to solr

Hi,

I am trying to import XML files using the data import request handler.

When I import an XML file of 1.4 kB, it works correctly. However, I
cannot import an XML file of 4 GB into Solr. It does not report any
error, but I receive the following response:

*Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.
(Duration: 39s)* Requests: 0 (0/s), Fetched: 0 (0/s), Skipped: 0, Processed:
0

Also, both of these files have the same structure (same
elements/attributes).

I would like to ask whether there are any limits on the size of XML
files that can be imported into Solr.

Thank you!

Best,
Kate

Re: import file to solr

Posted by Erick Erickson <er...@gmail.com>.
Still, 4GB is going to take a lot of resources to
1> hold the whole thing in memory and parse
2> process.

You may simply be hitting a timeout.

But I would ask what practical use there is in indexing a 4GB file.
Likely it'll be found by virtually every search (assuming
there's a huge text field or two in there) and also appear
near the bottom of the list relevance-wise.

Then there's the problem of anyone ever actually being able
to view the file in their browser (although I don't know what
your app is, so maybe that's not a concern).

This somewhat sounds like an XY problem. _Why_ do you
want to index a 4G file? What's the use-case you're
supporting?

Best
Erick
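
One way to check a file of this size outside Solr, without holding it all in memory, is a streaming parse. The following is only a minimal sketch in Python (not something from this thread): the function name count_records and the record element name "doc" are assumptions for illustration, and it says nothing about how the data import handler itself reads the file. If the XML is malformed somewhere, the parse raises xml.etree.ElementTree.ParseError, which may point at a problem that the status message does not show.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, record_tag):
    """Count <record_tag> elements using a streaming parse.

    `source` is a file path or a binary file object. `record_tag` is an
    assumed element name; replace it with whatever wraps one document.
    """
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            count += 1
            root.clear()  # discard elements already seen so memory stays flat
    return count


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real file on disk.
    demo = io.BytesIO(b"<add><doc><f>1</f></doc><doc><f>2</f></doc></add>")
    print(count_records(demo, "doc"))
```

If the count is zero for the large file but not for the small one, the two files probably do not have the same structure after all.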

On Sun, Dec 6, 2015 at 3:26 PM, Alexandre Rafalovitch
<ar...@gmail.com> wrote:
> There should be no limit. Try 100K, 50K sizes. Maybe you have an error
> somewhere. Also check Solr logs, not just DIH messages.

Re: import file to solr

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
There should be no limit. Try 100K, 50K sizes. Maybe you have an error
somewhere. Also check Solr logs, not just DIH messages.
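
To try intermediate sizes, a smaller test file can be cut from the large one. This is a rough Python sketch under stated assumptions: the helper name write_sample is hypothetical, the records are assumed to be direct children of the root element, the element name "doc" is a placeholder, and the XML is assumed to use no namespaces. Attributes on the root element are not copied.

```python
import io
import xml.etree.ElementTree as ET


def write_sample(source, dest, record_tag, max_records):
    """Copy the first `max_records` <record_tag> elements to `dest`.

    `source` is a path or binary file object; `dest` is a text stream.
    Returns the number of records written.
    """
    written = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    dest.write("<%s>\n" % root.tag)
    for event, elem in context:
        if written >= max_records:
            break  # stop reading as soon as the sample is large enough
        if event == "end" and elem.tag == record_tag:
            dest.write(ET.tostring(elem, encoding="unicode"))
            written += 1
            root.clear()  # free records that have already been copied
    dest.write("</%s>\n" % root.tag)
    return written


if __name__ == "__main__":
    src = io.BytesIO(b"<add><doc><f>1</f></doc><doc><f>2</f></doc><doc><f>3</f></doc></add>")
    out = io.StringIO()
    print(write_sample(src, out, "doc", 2))
    print(out.getvalue())
```

Importing samples of increasing size should show whether the import stops working at a particular size or at a particular record.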