Posted to user@nutch.apache.org by crazy <el...@gmail.com> on 2007/11/19 15:40:08 UTC

Re: indexing excel file

Hi, I want to index an Excel file, but I get the following error:

http://dev.torrez.us/public/2006/pundit/java/src/plugin/parse-msexcel/sample/test.xls:
failed(2,0): Can't be handled as Microsoft document.
java.lang.ArrayIndexOutOfBoundsException: No cell at position col1, row 0. 

I have already added msexcel to plugin.includes:

<name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(text|html|htm|js|pdf|
msword|mspowerpoint|msexcel)|index-basic|query-(basic|site|url)|summary-basic|
scoring-opic|urlnormalizer-(pass|regex|basic)</value>

I don't know where the problem is.
Please help.
-- 
View this message in context: http://www.nabble.com/Re%3A-indexing-excel-file-tf4836831.html#a13837469
Sent from the Nutch - User mailing list archive at Nabble.com.


AW: AW: indexing excel file

Posted by P....@Deutschepost.de.
I'm not sure, but it looks like the configuration is incorrect.
Check the paths and the config files, and check whether the plugin is really loaded.
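
One way to check the configuration side: assuming Nutch treats plugin.includes as a regular expression that a plugin id has to match in full, the posted value can be tested outside Nutch. The sketch below is a standalone check, not Nutch code. The class and method names are made up for illustration, and it assumes the line breaks in the posted value come from mail wrapping and are not part of the value.

```java
import java.util.regex.Pattern;

final class PluginIncludesCheck {
    // plugin.includes value from the original post, joined onto one line
    // (assumption: the line breaks in the mail are wrapping artifacts).
    static final String INCLUDES =
        "protocol-http|urlfilter-regex|parse-(text|html|htm|js|pdf|"
        + "msword|mspowerpoint|msexcel)|index-basic|query-(basic|site|url)|summary-basic|"
        + "scoring-opic|urlnormalizer-(pass|regex|basic)";

    // Returns true if the whole plugin id matches the expression.
    static boolean isIncluded(String includes, String pluginId) {
        return Pattern.compile(includes).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // "parse-excel" is a deliberately wrong id, used as a negative case.
        String[] ids = {"parse-msexcel", "parse-msword", "parse-excel"};
        for (String id : ids) {
            System.out.println(id + " included: " + isIncluded(INCLUDES, id));
        }
    }
}
```

If parse-msexcel is reported as included, the expression itself is fine and the cause is more likely the plugin or the file.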

If you can index Word and PowerPoint files but not Excel files, the Excel plugin itself may be broken.

Also check which version of the Excel file format you are using. Maybe the format is too new?
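
A quick way to see which container format a file uses is to look at its first bytes. Legacy .xls files are OLE2 compound documents, while the newer .xlsx files are ZIP archives. The sketch below is a hypothetical helper (SpreadsheetSniffer is not part of Nutch) and assumes the plugin can only read the older OLE2 format.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class SpreadsheetSniffer {
    // Signature of an OLE2 compound document (legacy .xls, .doc, .ppt).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Signature of a ZIP archive, the container used by .xlsx.
    private static final byte[] ZIP = {'P', 'K', 0x03, 0x04};

    // Classifies a file by its leading bytes: "ole2", "zip" or "unknown".
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2)) return "ole2";
        if (startsWith(head, ZIP)) return "zip";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java SpreadsheetSniffer <file>");
            return;
        }
        // Reads the whole file for simplicity; only the first 8 bytes matter.
        byte[] content = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(args[0] + ": " + sniff(content));
    }
}
```

A result of "zip" or "unknown" for a file named .xls would suggest the file is not in the old binary format. A result of "ole2" only confirms the container, not that every record inside it is supported.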

You have given very little information, so it is hard to analyse your problem.

-----Original Message-----
From: crazy [mailto:elhatri.ouidad@gmail.com]
Sent: Monday, 19 November 2007 17:36
To: nutch-user@lucene.apache.org
Subject: Re: AW: indexing excel file



Hi,
so what can I do to resolve this problem?
Every Excel file I index gives me the same error, even a simple one.

Thanks for your help.



Re: AW: indexing excel file

Posted by crazy <el...@gmail.com>.
Hi,
so what can I do to resolve this problem?
Every Excel file I index gives me the same error, even a simple one.

Thanks for your help.

P.Nguyen2 wrote:
> 
> Hi,
> 
> The Excel functionality seems to be very experimental at the moment.
> When I use it to index fairly large Excel files, or Excel files with many
> links and formulas, the thread crashes.
> 
> The Excel plugin only works for me on very simple files.
> It's quite unstable, so I would not recommend it for "normal" use.
> 
> Merc
> 


AW: indexing excel file

Posted by P....@Deutschepost.de.
Hi,

The Excel functionality seems to be very experimental at the moment.
When I use it to index fairly large Excel files, or Excel files with many links and formulas, the thread crashes.

The Excel plugin only works for me on very simple files.
It's quite unstable, so I would not recommend it for "normal" use.

Merc

-----Original Message-----
From: crazy [mailto:elhatri.ouidad@gmail.com]
Sent: Monday, 19 November 2007 15:40
To: nutch-user@lucene.apache.org
Subject: Re: indexing excel file



Hi, I want to index an Excel file, but I get the following error:

http://dev.torrez.us/public/2006/pundit/java/src/plugin/parse-msexcel/sample/test.xls:
failed(2,0): Can't be handled as Microsoft document.
java.lang.ArrayIndexOutOfBoundsException: No cell at position col1, row 0. 

I have already added msexcel to plugin.includes:

<name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(text|html|htm|js|pdf|
msword|mspowerpoint|msexcel)|index-basic|query-(basic|site|url)|summary-basic|
scoring-opic|urlnormalizer-(pass|regex|basic)</value>

I don't know where the problem is.
Please help.