Posted to solr-user@lucene.apache.org by Sandeep Gond <sa...@gmail.com> on 2011/07/02 04:15:59 UTC

Indexing CSV data in Multicore setup

I am trying to index CSV data in a multicore setup using post.jar.

Here is what I have tried so far:
1) Started the server using "java -Dsolr.solr.home=multicore -jar start.jar"

2a) Tried to post to "localhost:8983/solr/core0/update/csv" using "java -Dcommit=no -Durl=http://localhost:8983/solr/core0/update/csv -jar post.jar test.csv"
  Error: SimplePostTool: FATAL: Solr returned an error #404 Not Found

2b) Tried to send CSV data to core0 using "java -Durl=http://localhost:8983/solr/core0/update -jar post.jar test.csv"
  Error: SimplePostTool: FATAL: Solr returned an error #400 Unexpected character 'S' (code 83) in prolog; expected '<' at [row,col {unknown-source}]: [1,1]

I could feed XML files to core0 without any issues.

Am I missing something here?

Re: Indexing CSV data in Multicore setup

Posted by Sandeep Gond <sa...@gmail.com>.
Thanks Stefan.

The /update/csv handler was not defined in my solrconfig.xml. After defining
it there, I could get the CSV files indexed using the following two
commands:

> java -Dcommit=no -Durl=http://localhost:8983/solr/core0/update/csv -jar post.jar books.csv
> java -Dcommit=yes -Durl=http://localhost:8983/solr/core0/update -jar post.jar
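
For reference, a handler definition of the kind described above might look like the following inside the <config> element of core0's solrconfig.xml. This is a sketch modeled on the example solrconfig.xml shipped with Solr releases of that period; the exact entry added here is not shown in the thread, and the path multicore/core0/conf/solrconfig.xml is an assumption based on the stock multicore layout.

```xml
<!-- Assumed location: multicore/core0/conf/solrconfig.xml, inside <config> -->
<!-- Maps the /update/csv path to Solr's CSV update handler for this core. -->
<!-- Without such an entry, requests to /solr/core0/update/csv return 404. -->
<requestHandler name="/update/csv" class="solr.CSVRequestHandler" startup="lazy" />
```

Each core reads its own solrconfig.xml, so the handler has to be defined per core, and the core or server typically needs a reload or restart before the new path responds.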

Thanks,
Sandeep


On Sat, Jul 2, 2011 at 9:15 PM, Stefan Matheis <matheis.stefan@googlemail.com> wrote:

> Sandeep,
>
> did you check that the /update/csv handler is defined in your solrconfig.xml?
> Otherwise it will not work, and you'll get an HTTP 404.
>
> Regards
> Stefan
>
> On 02.07.2011 17:15, sandeep wrote:
>
>>> post.jar is used to post xml files. You can use curl to feed csv.
>>> http://wiki.apache.org/solr/UpdateCSV
>>>
>>
>>
>> I tried using curl as well to post the CSV data, using the following command:
>>
>> curl http://localhost:8983/solr/core0/update/csv --data-binary @books.csv -H 'Content-type:text/plain;charset=utf-8'
>>
>> It errors out saying problem accessing "/solr/core0/update/csv".
>>
>> "<body>
>> HTTP ERROR 404
>>
>> <p>Problem accessing /solr/core0/update/csv. Reason:
>> <pre>     NOT_FOUND</pre></p><hr />/<small>Powered by
>> Jetty://</small>/<br/>"
>>
>> --
>> View this message in context: http://lucene.472066.n3.nabble.com/Indexing-CSV-data-in-Multicore-setup-tp3131252p3132350.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>

Re: Indexing CSV data in Multicore setup

Posted by Stefan Matheis <ma...@googlemail.com>.
Sandeep,

did you check that the /update/csv handler is defined in your solrconfig.xml?
Otherwise it will not work, and you'll get an HTTP 404.

Regards
Stefan

On 02.07.2011 17:15, sandeep wrote:
>> post.jar is used to post xml files. You can use curl to feed csv.
>> http://wiki.apache.org/solr/UpdateCSV
>
>
> I tried using curl as well to post the CSV data, using the following command:
>
> curl http://localhost:8983/solr/core0/update/csv --data-binary @books.csv -H 'Content-type:text/plain;charset=utf-8'
>
> It errors out saying problem accessing "/solr/core0/update/csv".
>
> "<body>
> HTTP ERROR 404
>
> <p>Problem accessing /solr/core0/update/csv. Reason:
> <pre>     NOT_FOUND</pre></p><hr />/<small>Powered by Jetty://</small>/<br/>"
>

Re: Indexing CSV data in Multicore setup

Posted by sandeep <sa...@gmail.com>.
> post.jar is used to post xml files. You can use curl to feed csv. 
> http://wiki.apache.org/solr/UpdateCSV


I tried using curl as well to post the CSV data, using the following command:

curl http://localhost:8983/solr/core0/update/csv --data-binary @books.csv -H 'Content-type:text/plain;charset=utf-8'

It errors out saying problem accessing "/solr/core0/update/csv".

"<body>
HTTP ERROR 404

<p>Problem accessing /solr/core0/update/csv. Reason:
<pre>    NOT_FOUND</pre></p><hr />/<small>Powered by Jetty://</small>/<br/>"


Re: Indexing CSV data in Multicore setup

Posted by Ahmet Arslan <io...@yahoo.com>.
> I am trying to index CSV data in multicore setup using post.jar.
>
> Here is what I have tried so far:
> 1) Started the server using "java -Dsolr.solr.home=multicore -jar start.jar"
>
> 2a) Tried to post to "localhost:8983/solr/core0/update/csv" using "java -Dcommit=no -Durl=http://localhost:8983/solr/core0/update/csv -jar post.jar test.csv"
>   Error: SimplePostTool: FATAL: Solr returned an error #404 Not Found
>
> 2b) Tried to send CSV data to core0 using "java -Durl=http://localhost:8983/solr/core0/update -jar post.jar test.csv"
>   Error: SimplePostTool: FATAL: Solr returned an error #400 Unexpected character 'S' (code 83) in prolog; expected '<' at [row,col {unknown-source}]: [1,1]
>
> I could feed in the xml files to core0 without any issues.
>
> Am I missing something here?

post.jar is used to post xml files. You can use curl to feed csv.
http://wiki.apache.org/solr/UpdateCSV