Posted to user@nutch.apache.org by Sebastian Nagel <wa...@googlemail.com.INVALID> on 2021/10/22 10:56:49 UTC

Re: [Non-DoD Source] Re: Cant integrate the kerberos enabled solr cloud with nutch (UNCLASSIFIED)

Hi Kris,

please see "Unsubscribe from List" on
   https://nutch.apache.org/mailing_lists.html

Sebastian

On 10/22/21 12:54 PM, Musshorn, Kris T CTR USARMY SEC (USA) wrote:
> CLASSIFICATION: UNCLASSIFIED
> 
> Can I get off the mailing list?
> 
> Thanks,
> Kris T Musshorn CTR Ad Hoc Research
> CECOM SharePoint Team Requirements Analyst
> (443) 861-8623
> APG Bldg 6002 D5101/108
> I am currently teleworking and can be reached at CELL - (860) 670 9494
> 
> -----Original Message-----
> From: Sebastian Nagel <wa...@googlemail.com.INVALID>
> Sent: Friday, October 22, 2021 5:46 AM
> To: user@nutch.apache.org
> Subject: [Non-DoD Source] Re: Cant integrate the kerberos enabled solr cloud with nutch
> 
> Hi Shi Wei,
> 
> could you also share the index writer configuration (conf/index-writers.xml)?
> 
> The default is unauthenticated access to Solr, see the snippet below.
> The file httpclient-auth.xml is not relevant for the Solr indexer. It is only used by the plugin protocol-httpclient, when a crawled web site requires authentication in order to fetch the content.
> 
> Best,
> Sebastian
> 
>     <writer id="indexer_solr_1" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
>       <parameters>
>         <param name="type" value="http"/>
>         <param name="url" value="http://localhost:8983/solr/nutch"/>
>         <param name="collection" value=""/>
>         <param name="weight.field" value=""/>
>         <param name="commitSize" value="1000"/>
>         <param name="auth" value="false"/>
>         <param name="username" value="username"/>
>         <param name="password" value="password"/>
>       </parameters>
>       ...
>     </writer>
> 
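> For comparison, a minimal sketch of the same three parameters with
> authentication switched on. The credentials are placeholders, not real
> values. Note that these parameters configure HTTP Basic authentication,
> which is not the same as Kerberos, so this is only an illustration of the
> setting and not a confirmed fix for a Kerberos-enabled Solr:
> 
>         <!-- assumption: placeholder credentials for HTTP Basic auth -->
>         <param name="auth" value="true"/>
>         <param name="username" value="solr-user"/>
>         <param name="password" value="solr-password"/>
> 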
> On 10/22/21 10:10 AM, sw.ling@quandatics.com wrote:
>> Hi,
>>
>> We are unable to integrate a Kerberos-enabled SolrCloud with Nutch.
>>
>> When we execute the command "nutch index crawl/crawldb/ -linkdb crawl/linkdb/ $s1
>> -filter -normalize", it fails with "HTTP ERROR 401 Problem accessing /solr/admin/collections. Reason: Authentication required", although we are able to curl it with the keytab.
>>
>> Nutch version: 1.18
>>
>> Yours sincerely,
>>
>> Shi Wei
>>