Posted to solr-user@lucene.apache.org by Bruno Mannina <bm...@matheo-software.com> on 2018/05/30 14:18:35 UTC

Solr5.4 - Indexing a big file (size = 2.4Go)

Dear Solr users,

I get an "invalid content length" error when I try to index my file (an XML
file with a size of 2.4 GB).

I use the SimplePostTool as described in the documentation, on my Ubuntu server:

>bin/post -port 1234 -c mycollection /home/bruno/2013.xml

It works with smaller files but not with this one. I suppose the size is the problem.

Is there a parameter I can change to allow big files?

In solrconfig.xml I changed formdataUploadLimitInKB to 4096 and
multipartUploadLimitInKB to 4096000, without success.

Do you have any idea?

Many thanks for your help,

Best regards,

Bruno




RE: Solr5.4 - Indexing a big file (size = 2.4Go)

Posted by Bruno Mannina <bm...@free.fr>.
Hi Erick,

I want to index this file because I received it from my boss.

It contains around 1.5M docs.

I think I will split the file and index the pieces.
That will be better.

Thanks

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Wednesday, May 30, 2018 16:50
To: solr-user
Subject: Re: Solr5.4 - Indexing a big file (size = 2.4Go)

Why do you want to index a 2G file in the first place? You can't really do anything with it.

If you deliver it to a browser, the browser will churn forever. If you try to export it, it'll suck up your bandwidth terribly.

If it's a bunch of individual docs (in Solr's XML format), about the only thing that makes sense is to break it up.

This sounds like an XY problem: you've asked how to do X (index a 2G file) without telling us Y (what the use case is).

Best,
Erick


RE: Solr5.4 - Indexing a big file (size = 2.4Go)

Posted by "Leonard, Carl" <CL...@whisolutions.com>.
Is it one document that is 2.4 GB, or is that 2.4 GB spread across several documents?

There are some limits in solrconfig.xml.  Perhaps you are hitting the multipartUploadLimitInKB?

    <requestParsers enableRemoteStreaming="true"
                    multipartUploadLimitInKB="2048000"
                    formdataUploadLimitInKB="2048"
                    addHttpRequestToContext="false"/>
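
To put those numbers next to the file size, here is a small arithmetic sketch. It assumes that "2.4Go" means roughly 2,400,000,000 bytes (the exact size is not given in the thread) and that a *LimitInKB value counts kilobytes of 1024 bytes; both are assumptions, and the helper names are made up for illustration.

```python
# Assumption: "2.4Go" is about 2.4 * 10**9 bytes; the thread gives no exact size.
FILE_SIZE_BYTES = 2_400_000_000


def limit_in_bytes(limit_kb: int) -> int:
    """Convert a *LimitInKB value to bytes, assuming 1 KB = 1024 bytes."""
    return limit_kb * 1024


def fits(file_size_bytes: int, limit_kb: int) -> bool:
    """True if a payload of file_size_bytes is within the given KB limit."""
    return file_size_bytes <= limit_in_bytes(limit_kb)


if __name__ == "__main__":
    settings = [
        ("multipartUploadLimitInKB=2048000 (example above)", 2048000),
        ("multipartUploadLimitInKB=4096000 (Bruno's value)", 4096000),
    ]
    for label, kb in settings:
        verdict = "fits" if fits(FILE_SIZE_BYTES, kb) else "too large"
        print(f"{label}: {limit_in_bytes(kb)} bytes -> {verdict}")
```

Under these assumptions the file is larger than a 2048000 KB limit but smaller than the 4096000 KB value Bruno already set, so this limit alone may not explain the error. Whether the setting took effect, and whether bin/post is subject to this limit at all, is not confirmed in the thread.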



Re: Solr5.4 - Indexing a big file (size = 2.4Go)

Posted by Erick Erickson <er...@gmail.com>.
Why do you want to index a 2G file in the first place? You can't
really do anything with it.

If you deliver it to a browser, the browser will churn forever. If you
try to export it, it'll suck up your bandwidth terribly.

If it's a bunch of individual docs (in Solr's XML format), about the
only thing that makes sense is to break it up.

This sounds like an XY problem: you've asked how to do X (index a 2G
file) without telling us Y (what the use case is).

Best,
Erick
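
For the "break it up" route, here is a minimal sketch of one way to split such a file. It assumes the input is a single <add> element whose direct children are <doc> elements (Solr's XML update format); the function name split_solr_xml, the part-NNNNN.xml naming, and the default chunk size are all made up for illustration and are not part of Solr. Attributes on <add> and any other top-level commands (<delete>, <commit>) are not carried over.

```python
import os
import xml.etree.ElementTree as ET


def split_solr_xml(src_path, out_dir, docs_per_file=50_000):
    """Stream src_path and write <add> files with at most docs_per_file docs each.

    Returns the list of files written. Hypothetical helper, not a Solr API.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    out = None          # currently open output file, if any
    in_current = 0      # number of docs written to the current file
    depth = 0           # nesting depth of the element being parsed
    root = None         # the outer <add> element
    try:
        for event, elem in ET.iterparse(src_path, events=("start", "end")):
            if event == "start":
                depth += 1
                if root is None:
                    root = elem
                continue
            depth -= 1
            # depth == 1 here means elem is a direct child of the root, so
            # nested child <doc> elements stay inside their parent document.
            if elem.tag == "doc" and depth == 1:
                if out is None:
                    path = os.path.join(out_dir, f"part-{len(paths):05d}.xml")
                    out = open(path, "w", encoding="utf-8")
                    out.write("<add>\n")
                    paths.append(path)
                out.write(ET.tostring(elem, encoding="unicode"))
                out.write("\n")
                in_current += 1
                if in_current == docs_per_file:
                    out.write("</add>\n")
                    out.close()
                    out = None
                    in_current = 0
                # Drop the finished doc so memory use stays roughly constant.
                root.clear()
    finally:
        if out is not None:
            out.write("</add>\n")
            out.close()
    return paths
```

A call such as split_solr_xml("/home/bruno/2013.xml", "/home/bruno/2013-parts") would then produce smaller files that can each be posted with the same bin/post command used for the original file. With around 1.5M docs, the default of 50,000 docs per file would give about 30 parts; the right chunk size depends on the documents and is a guess here.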
