Posted to users@solr.apache.org by jorge hernandez <jo...@carousel.nyc> on 2022/07/21 19:12:38 UTC
RE: Help with new install
Hello,
I just downloaded Solr, copied the _default configset to my new core,
copied post.jar to the folder with the files I want to index, and created
the new core using the web GUI. Everything seems right, but when I ran:
Java -Dauto -Dc=mynewcore -jar post.jar *.html
It keeps saying:
SimplePostTool: WARNING: IOException while reading response:
java.io.FileNotFoundException:
http://localhost:8983/solr/mynescore/update/extract?resource.name=%3cpath_of_the_files>
I’m new to Solr, so I’m pretty sure I missed something. Can anybody
tell me what it was?
Thanks.
Re: Help with new install
Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/21/22 13:12, jorge hernandez wrote:
> SimplePostTool: WARNING: IOException while reading response:
> java.io.FileNotFoundException:
> http://localhost:8983/solr/mynescore/update/extract?resource.name=%3cpath_of_the_files>
The problem here is that the _default configset does NOT define the
/update/extract handler, which you need to extract data from document
types like HTML, Word, PDF, etc.
This feature (also called SolrCell) requires loading additional jars,
because it is not included in the webapp. It ships in the download as a
module.
Note that the following document is for Solr 9.0 ... earlier versions
will be slightly different.
https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-tika.html
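As a rough illustration of the missing piece, the sketch below registers an /update/extract handler through Solr's Config API rather than by editing solrconfig.xml by hand. The handler class and the "lowernames" and "fmap.content" defaults follow the reference guide linked above; the helper names, the choice of _text_ as the target field, and the core name are assumptions made for illustration. It also assumes the extraction module is already enabled (on Solr 9, for example, by starting Solr with -Dsolr.modules=extraction); without the module's jars the handler class will not load.

```python
import json
import urllib.request


def extract_handler_command(content_field="_text_"):
    """Build a Config API command that registers /update/extract.

    content_field is an assumption: the field that receives the
    extracted body text. Adjust it to match your schema.
    """
    return {
        "add-requesthandler": {
            "name": "/update/extract",
            "class": "solr.extraction.ExtractingRequestHandler",
            "defaults": {
                "lowernames": "true",
                "fmap.content": content_field,
            },
        }
    }


def register_extract_handler(core, base_url="http://localhost:8983/solr"):
    """POST the command to the core's Config API endpoint."""
    body = json.dumps(extract_handler_command()).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{core}/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call against a running Solr (not executed here):
# register_extract_handler("mynewcore")
```

Once the handler is registered, the post.jar command from the original message should no longer get a "not found" response for /update/extract, provided the core name in the URL is also correct.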
One final note ... we STRONGLY recommend not using SolrCell in
production. Tika can be unstable -- some documents can cause it to
consume huge amounts of memory, and even crash. If Tika is running
inside Solr when that happens, then Solr itself will suffer the
effects. Instead, you should run Tika in a separate process with crash
handling, so that Solr remains operational if there is a problem with
extraction.
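One possible shape for that separate process is sketched below: a small client sends each file to a standalone Tika Server and then posts the extracted text to Solr's regular JSON update handler. Everything specific here is an assumption for illustration: the Tika Server address (localhost:9998), the content_txt field name, the use of the file path as the document id, and the helper names. The crash handling is deliberately minimal; a failed extraction is skipped so that one bad document does not stop the run.

```python
import json
import urllib.request

TIKA_URL = "http://localhost:9998/tika"   # assumed Tika Server address
SOLR_URL = "http://localhost:8983/solr"   # Solr address used in this thread


def extract_text(path, tika_url=TIKA_URL, timeout=60):
    """Send one file to Tika Server and return the extracted plain text."""
    with open(path, "rb") as handle:
        payload = handle.read()
    request = urllib.request.Request(
        tika_url,
        data=payload,
        headers={"Accept": "text/plain"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def build_document(path, text):
    """Map a file and its text to a Solr document (field names are assumptions)."""
    return {"id": path, "content_txt": text}


def index_documents(core, docs, solr_url=SOLR_URL):
    """POST a list of documents to the core's JSON update handler and commit."""
    request = urllib.request.Request(
        f"{solr_url}/{core}/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def index_files(core, paths):
    """Extract each file outside Solr; skip files that fail to read or extract."""
    docs = []
    for path in paths:
        try:
            docs.append(build_document(path, extract_text(path)))
        except OSError as error:  # covers missing files, timeouts, and HTTP errors
            print(f"skipping {path}: {error}")
    return index_documents(core, docs) if docs else None
```

With this arrangement, a document that makes Tika run out of memory takes down only the Tika process, and Solr keeps serving queries.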
Thanks,
Shawn
RE: Help with new install
Posted by jorge hernandez <jo...@carousel.nyc>.
I triple-checked it and the name of the core is correct. What I don't
understand is why it is looking for the files in the core's folder when the
files are somewhere else. The post.jar command was run from the folder with
the files.
-----Original Message-----
From: Eric Pugh <ep...@opensourceconnections.com>
Sent: Thursday, July 21, 2022 3:43 PM
To: users@solr.apache.org
Subject: Re: Help with new install
Looks like your core name is wrong in your command, at least, what is coming
back in the message...
Re: Help with new install
Posted by Eric Pugh <ep...@opensourceconnections.com>.
Looks like your core name is wrong in your command, or at least in what is coming back in the message: the URL says "mynescore", not "mynewcore".
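A quick way to check this kind of mismatch is to pull the core name out of the URL in the error message and compare it with the one passed via -Dc. The snippet below is only a sketch; the helper name is made up for illustration, and the query string of the reported URL is left out.

```python
from urllib.parse import urlsplit


def core_from_url(url):
    """Return the path segment that follows /solr/ in a Solr URL."""
    segments = [part for part in urlsplit(url).path.split("/") if part]
    return segments[segments.index("solr") + 1]


# URL from the error message in this thread, query string omitted.
error_url = "http://localhost:8983/solr/mynescore/update/extract"
requested = "mynewcore"  # value given to -Dc in the command as posted

reported = core_from_url(error_url)
if reported != requested:
    print(f"core name mismatch: requested {requested!r}, URL shows {reported!r}")
```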
_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | My Free/Busy <http://tinyurl.com/eric-cal>