Posted to solr-user@lucene.apache.org by David Martin <dm...@netflix.com> on 2012/08/30 00:38:46 UTC

Large XML file sizes error out parsing the file size as an Integer

Folks:

One of our files of XML entities for import is almost 7GB in size.

When trying to import, we error out with the exception below.  6845266984 is the exact size of the input file in bytes.

Shouldn't the file size be a long?  Has anybody else experienced this problem?

We plan on dividing this file into smaller pieces, but if there's another solution I'd love to hear it.

Thanks,

David Martin

From: Desktop <dm...@netflix.com>
Date: Wednesday, August 29, 2012 3:17 PM
Subject: contract item assets exception

Aug 29, 2012 10:04:03 PM org.apache.solr.handler.dataimport.SolrWriter upload
WARNING: Error creating document : SolrInputDocument[{fileSize=fileSize(1.0)={6845266984}, created_by=created_by(1.0)={CHILO}, id=id(1.0)={movie::70018848:country_code-NO:contract_id-9979:ccm_asset_id-369161014}, movie_id=movie_id(1.0)={70018848}, is_required=is_required(1.0)={0}, bcp_47_code=bcp_47_code(1.0)={nn}, element_category_id=element_category_id(1.0)={3}, updated_by=updated_by(1.0)={SYSADMIN}, last_updated=last_updated(1.0)={2012-08-29T19:25:21.585Z}, entity_type=entity_type(1.0)={CONTRACT_ITEM_ASSET}, country_code=country_code(1.0)={NO}, ccm_asset_id=ccm_asset_id(1.0)={369161014}}]
org.apache.solr.common.SolrException: ERROR: [doc=movie::70018848:country_code-NO:contract_id-9979:ccm_asset_id-369161014] Error adding field 'fileSize'='6845266984'
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:333)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:66)
at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:293)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:723)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.NumberFormatException: For input string: "6845266984"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:461)
at java.lang.Integer.parseInt(Integer.java:499)
at org.apache.solr.schema.TrieField.createField(TrieField.java:407)
at org.apache.solr.schema.SchemaField.createField(SchemaField.java:103)
at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:203)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:286)
... 12 more


Re: Large XML file sizes error out parsing the file size as an Integer

Posted by Chris Hostetter <ho...@fucit.org>.
: Shouldn't the file size be a long?  Has anybody else experienced this problem?

Your problem does not appear to be any internal limitation in Solr. Your 
problem appears to be that you have a field in your schema named 
"fileSize" which uses a fieldType that is a "TrieIntField", but you are 
attempting to put a value in that field that is not a legal integer 
(6845266984 is larger than the 32-bit maximum of 2147483647).

Unless I'm missing something: if you want it to be a long, edit your 
schema to make it a long?

: Caused by: java.lang.NumberFormatException: For input string: "6845266984"
: at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
: at java.lang.Integer.parseInt(Integer.java:461)
: at java.lang.Integer.parseInt(Integer.java:499)
: at org.apache.solr.schema.TrieField.createField(TrieField.java:407)
: at org.apache.solr.schema.SchemaField.createField(SchemaField.java:103)
: at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:203)
: at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:286)
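
To make the failure in the trace above concrete, here is a minimal standalone sketch (not Solr code) of what happens when that value goes through Integer.parseInt, as the TrieField.createField frame does for an int-typed field. The class name FileSizeParseDemo and the helper fitsInInt are made up for illustration. The only value taken from the thread is the string "6845266984".

```java
// Hypothetical demo class; it only exercises java.lang parsing, not Solr.
class FileSizeParseDemo {

    // Returns true if the text parses as a 32-bit signed int,
    // false if Integer.parseInt rejects it with NumberFormatException.
    static boolean fitsInInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String fileSize = "6845266984"; // value reported in the stack trace

        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("fits in int: " + fitsInInt(fileSize));

        // The same string parses without error as a 64-bit long.
        long asLong = Long.parseLong(fileSize);
        System.out.println("as long: " + asLong);
    }
}
```

The value only fails because the field is int-typed. Assuming the schema follows the usual pattern of declaring numeric fieldTypes by class, pointing "fileSize" at a fieldType backed by solr.TrieLongField and reindexing should let the same value through without splitting the file.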

-Hoss

Re: Large XML file sizes error out parsing the file size as an Integer

Posted by Walter Underwood <wu...@wunderwood.org>.
Break it up.

You'll need 7GB of RAM for the source, at least that much for the parsed version, at least that much for the indexes, and so on.

Why try to make something work when you aren't going to do it that way in production?

wunder
