Posted to user@manifoldcf.apache.org by Martijn v Groningen <ma...@gmail.com> on 2010/09/12 11:00:36 UTC

Sharepoint connector question

Hi All,

I've configured the SharePoint connector (to connect to SharePoint
3.0), the Solr connector, and a job that adds documents to Solr. The only
thing I'm missing is the metadata from SharePoint. For each document
I need to know which users can access it. In the metadata tab on the
job page I've configured the metadata to be included, but it doesn't
end up in my Solr index. Does anybody know what I should do to also
get the metadata into my index?
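
As a rough aid for checking what should arrive in Solr: Solr's extracting update handler accepts extra document fields as `literal.<field>=<value>` request parameters. The Python sketch below only builds such a parameter string so it can be compared with the arguments the output connector reports sending. The helper name `build_extract_params`, the field names `Author` and `Title`, the document id, and the use of `id` as the key field are all hypothetical; this is not ManifoldCF's actual code.

```python
from urllib.parse import urlencode, parse_qs

def build_extract_params(doc_id, metadata):
    """Return a query string with one literal.<field> parameter per metadata value."""
    # Assumption: the Solr schema uses "id" as its unique key field.
    params = [("literal.id", doc_id)]
    for field, values in sorted(metadata.items()):
        # A metadata field may carry several values; emit one parameter per value.
        if isinstance(values, str):
            values = [values]
        for value in values:
            params.append(("literal." + field, value))
    return urlencode(params)

query = build_extract_params(
    "http://sharepoint.example/docs/report.doc",  # hypothetical document id
    {"Author": "martijn", "Title": ["Quarterly report"]},
)
print(query)

# Decode the string again to list the fields Solr would receive.
fields = parse_qs(query)
print(sorted(fields))
```

If the connector's logged arguments contain no `literal.` entries other than the id, the metadata was presumably lost before the request was built, not rejected by the Solr schema.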

I also had another issue with the SharePoint connector, which I managed
to solve, but I'm curious to know whether anyone else has encountered a
similar issue.
When I was setting up the SharePoint connector, I always got a 401
message as the status on the connectors page. I was sure I had entered the
correct credentials. After some debugging I noticed that the NTLM data
sent by the connector was different from what was sent when I did an HTTP
POST with the Firefox Poster plugin to a SharePoint web service URL (I
checked this with Wireshark). After writing a little test case with the
httpclient used in acf, I got the same 401 error. I then ran the test with
a clean httpclient (version 3.1), and that ran as expected: I got a
response code 200 back with a SOAP response. I then used this version of
httpclient (with some class files from the acf-provided jar that were
missing in the plain jar file), and the connector worked as expected: I was
able to index documents. Did anyone else have this particular issue? I
noticed that acf is using httpclient 3.1 (according to the manifest file),
but I'm curious to know why httpclient was modified.
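
One way to compare the NTLM exchanges captured from two clients is to decode the `Authorization: NTLM ...` header values and look at the message type and response lengths. The Python sketch below assumes the standard NTLMSSP layout (8-byte signature, 4-byte little-endian message type, and for a Type 3 message the LM and NT response descriptors at offsets 12 and 20). The function name `describe_ntlm_header` and the rule "an NT response longer than 24 bytes suggests NTLMv2" are assumptions made for illustration, not something taken from acf or httpclient.

```python
import base64
import struct

NTLMSSP_SIGNATURE = b"NTLMSSP\x00"

def describe_ntlm_header(header_value):
    """Summarise the NTLM token carried in an 'NTLM <base64>' header value."""
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.upper() != "NTLM" or not token:
        raise ValueError("not an NTLM header value")
    raw = base64.b64decode(token)
    if raw[:8] != NTLMSSP_SIGNATURE:
        raise ValueError("missing NTLMSSP signature")
    # Bytes 8..11: message type (1 = negotiate, 2 = challenge, 3 = authenticate).
    (message_type,) = struct.unpack_from("<I", raw, 8)
    info = {"type": message_type}
    if message_type == 3:
        # Each descriptor is: length (2 bytes), max length (2), buffer offset (4).
        lm_len, _, _ = struct.unpack_from("<HHI", raw, 12)
        nt_len, _, _ = struct.unpack_from("<HHI", raw, 20)
        info["lm_response_len"] = lm_len
        info["nt_response_len"] = nt_len
        # Assumption: an NT response longer than 24 bytes suggests NTLMv2.
        info["looks_like_ntlmv2"] = nt_len > 24
    return info

# Synthetic Type 3 token (descriptors only, no real credentials) for illustration.
sample = (NTLMSSP_SIGNATURE + struct.pack("<I", 3)
          + struct.pack("<HHI", 24, 24, 64)    # LM response descriptor
          + struct.pack("<HHI", 24, 24, 88))   # NT response descriptor
header = "NTLM " + base64.b64encode(sample).decode("ascii")
print(describe_ntlm_header(header))
```

Running the same function over the header values from both Wireshark captures would show whether the two clients differ in the kind of response they send.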

BTW I've been using the latest trunk version (I did a checkout last
Tuesday). I'm also new to SharePoint.

Cheers,

Martijn

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
That's too bad - I would really like to have understood this issue.
If you pick it up again in the future, please let us know.

Karl

On Thu, Sep 16, 2010 at 11:41 AM, Martijn v Groningen
<ma...@gmail.com> wrote:
> Hi Karl,
>
> Unfortunately, in the environment where I'm currently working,
> installing a new SharePoint instance is not an option. Due to time
> pressure we dropped the requirement to crawl documents from
> SharePoint.
>
> Martijn
>
> On 15 September 2010 23:53, Karl Wright <da...@gmail.com> wrote:
>> Any news on this issue?
>> Karl
>>
>> On Mon, Sep 13, 2010 at 3:00 PM, Karl Wright <da...@gmail.com> wrote:
>>>
>>> I would expect that domain administrator privs would be sufficient. ;-)
>>>
>>> Unfortunately, SharePoint (and .NET services in general) often seem to
>>> have unusual security problems with internal communication.  I've seen cases
>>> where some of SharePoint's own web services have this issue.  Never was able
>>> to figure out the problem.  Perhaps a .NET guru could, but that's not me.
>>>
>>> Karl
>>>
>>> On Mon, Sep 13, 2010 at 2:55 PM, Martijn v Groningen
>>> <ma...@gmail.com> wrote:
>>>>
>>>> That could explain the error. I've installed and uninstalled the
>>>> web service extension a few times, but I know for sure that the last
>>>> time I installed it as domain administrator. Last week I used a
>>>> plain-vanilla SharePoint (trial version), and the web service extension
>>>> worked there without any problem.
>>>>
>>>> On 13 September 2010 19:30, Karl Wright <da...@gmail.com> wrote:
>>>> > The key error is the following:
>>>> >
>>>> > >>>>>>
>>>> > <soap:Fault>
>>>> >   <faultcode>soap:Client</faultcode>
>>>> >   <faultstring>The request failed with HTTP status 401: Unauthorized.</faultstring>
>>>> >   <faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>
>>>> >   <detail>
>>>> >     <Error>
>>>> >       <ErrorNumber>1000</ErrorNumber>
>>>> >       <ErrorMessage>The request failed with HTTP status 401: Unauthorized.</ErrorMessage>
>>>> >       <ErrorSource>System.Web.Services</ErrorSource>
>>>> >     </Error>
>>>> >   </detail>
>>>> > </soap:Fault>
>>>> > <<<<<<
>>>> >
>>>> > Clearly the MCPermissions web service does not have sufficient
>>>> > permissions to perform its task.  I don't recall ever having seen
>>>> > this before, but perhaps during installation you were not logged in
>>>> > as a user that has enough permission to perform security lookups.
>>>> >
>>>> > Karl
>>>> >
>>>> > On Mon, Sep 13, 2010 at 12:15 PM, Martijn v Groningen
>>>> > <ma...@gmail.com> wrote:
>>>> >>
>>>> >> Hi Karl,
>>>> >>
>>>> >> Today I'm not at the environment where I can verify this, but I'll
>>>> >> definitely check it. However, I ran into another issue with the
>>>> >> SharePoint connector. In another environment I installed the
>>>> >> MetaCarta SharePoint web service extensions, but when I execute the
>>>> >> following POST:
>>>> >> HTTP POST http://[host]/rotterdamn/_vti_bin/MCPermissions.asmx
>>>> >> <?xml version="1.0" encoding="UTF-8"?>
>>>> >> <soapenv:Envelope
>>>> >>     xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
>>>> >>     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>>>> >>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>>>> >>   <soapenv:Body>
>>>> >>     <GetPermissionCollection
>>>> >>         xmlns="http://microsoft.com/sharepoint/webpartpages/">
>>>> >>       <objectName>/</objectName>
>>>> >>       <objectType>Web</objectType>
>>>> >>     </GetPermissionCollection>
>>>> >>   </soapenv:Body>
>>>> >> </soapenv:Envelope>
>>>> >>
>>>> >> I get back the following response (HTTP 500):
>>>> >> <?xml version="1.0" encoding="utf-8"?>
>>>> >> <soap:Envelope
>>>> >>     xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
>>>> >>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>> >>     xmlns:xsd="http://www.w3.org/2001/XMLSchema">
>>>> >>   <soap:Body>
>>>> >>     <soap:Fault>
>>>> >>       <faultcode>soap:Client</faultcode>
>>>> >>       <faultstring>The request failed with HTTP status 401: Unauthorized.</faultstring>
>>>> >>       <faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>
>>>> >>       <detail>
>>>> >>         <Error>
>>>> >>           <ErrorNumber>1000</ErrorNumber>
>>>> >>           <ErrorMessage>The request failed with HTTP status 401: Unauthorized.</ErrorMessage>
>>>> >>           <ErrorSource>System.Web.Services</ErrorSource>
>>>> >>         </Error>
>>>> >>       </detail>
>>>> >>     </soap:Fault>
>>>> >>   </soap:Body>
>>>> >> </soap:Envelope>
>>>> >>
>>>> >> It seems that I only get an error with the MCPermissions web service
>>>> >> call. Other calls such as GetListCollection work fine. I'm
>>>> >> authenticating with the domain administrator account. This environment
>>>> >> also has SharePoint 3.0 installed. I'm making these posts to
>>>> >> SharePoint with the Firefox Poster plugin. Also, the URL in the
>>>> >> response is without the subsite.
>>>> >>
>>>> >> Also important to note is that browsing to
>>>> >> http://[host]/[subsite]/_vti_bin/MCPermissions.asmx shows the
>>>> >> GetPermissionCollection operation. That is what I checked after
>>>> >> installing the web service extension. Do you have a clue what might be
>>>> >> wrong here?
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >> Martijn
>>>> >>
>>>> >> On 13 September 2010 10:27, Karl Wright <da...@gmail.com> wrote:
>>>> >> > Hi Martijn,
>>>> >> >
>>>> >> > For the 401 error, here's something also worth trying, to remove the
>>>> >> > possibility that your error has anything to do with other recent
>>>> >> > changes. Can you check out the following:
>>>> >> >
>>>> >> > svn co -r987345 https://svn.apache.org/repos/asf/incubator/lcf/trunk/modules/lib
>>>> >> >
>>>> >> > In the checked-out lib area, you will see a jar called
>>>> >> > commons-httpclient-lcf.jar.  Replace commons-httpclient-acf.jar with
>>>> >> > it (renaming it to commons-httpclient-acf.jar, of course), and try
>>>> >> > running with it.  If your 401 error no longer happens, then it means
>>>> >> > something was messed up, and I'll need to do some research.
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Karl
>>>> >> >
>>>> >> > On Sun, Sep 12, 2010 at 5:02 PM, Karl Wright <da...@gmail.com>
>>>> >> > wrote:
>>>> >> >>
>>>> >> >> I confirmed that without any mappings set, the Solr Connector
>>>> >> >> *should* just be passing the metadata through using the metadata's
>>>> >> >> name as the Solr field name.
>>>> >> >>
>>>> >> >> For debugging, if you could post the Solr output from one update
>>>> >> >> operation, I'd love to see if any metadata seems to be in it.
>>>> >> >> Potentially it's there but the Solr schema is not right somehow -
>>>> >> >> that should be the first thing we verify.
>>>> >> >>
>>>> >> >> Karl
>>>> >> >>
>>>> >> >>
>>>> >> >> On Sun, Sep 12, 2010 at 4:50 PM, Martijn v Groningen
>>>> >> >> <ma...@gmail.com> wrote:
>>>> >> >>>
>>>> >> >>> Tomorrow I'll dive into the code and do some more debugging. Last
>>>> >> >>> week I didn't specify any mappings in the mapping tab for the
>>>> >> >>> metadata fields I selected in the metadata tab. But this shouldn't
>>>> >> >>> be the problem, right?
>>>> >> >>>
>>>> >> >>> Thanks,
>>>> >> >>>
>>>> >> >>> Martijn
>>>> >> >>>
>>>> >> >>> On 12 September 2010 22:29, Karl Wright <da...@gmail.com>
>>>> >> >>> wrote:
>>>> >> >>> > Martijn,
>>>> >> >>> >
>>>> >> >>> > (1) The precise svn url for the acf version of httpclient is as
>>>> >> >>> > follows.  My apologies for any earlier confusion - I was away from
>>>> >> >>> > my computer at the time.
>>>> >> >>> >
>>>> >> >>> > https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x
>>>> >> >>> >
>>>> >> >>> > (2) Each time the Solr connector posts into Solr, you should see a
>>>> >> >>> > set of argument names and values dumped to standard out (or the
>>>> >> >>> > log).  So it should be easy to see what is being sent, and whether
>>>> >> >>> > the arguments in fact are the correct ones for the extracting
>>>> >> >>> > update request handler, or not.  Furthermore, the Solr output
>>>> >> >>> > connector recently had a tab added which performs the mapping I
>>>> >> >>> > alluded to.  This mapping is designed to translate metadata coming
>>>> >> >>> > from a connector like SharePoint, into fields that you presumably
>>>> >> >>> > have in your Solr schema.  However, if you don't set anything, the
>>>> >> >>> > fields are not changed, and you should see an argument for every
>>>> >> >>> > metadata field, something like: literal.xxx=yyy.
>>>> >> >>> >
>>>> >> >>> > If you have a document that you *know* has metadata, and you've
>>>> >> >>> > specified that metadata in the job, and you run the job after you
>>>> >> >>> > specify that metadata, but still see no literal.xxx=yyy
>>>> >> >>> > corresponding to it in the Solr output, then we should spend some
>>>> >> >>> > time chasing this problem down.  Be wary because incremental
>>>> >> >>> > crawling means you'll probably not see your document processed
>>>> >> >>> > again unless you either change it in SharePoint, or delete and
>>>> >> >>> > recreate the job.  But be reassured that SharePoint metadata was
>>>> >> >>> > covered by the old MetaCarta tests, and there have been no changes
>>>> >> >>> > of any significance to the SharePoint connector since then, so I
>>>> >> >>> > have no explanation why it would not work for you too.  That's why
>>>> >> >>> > I'm spending time trying to figure out if this is a Solr connector
>>>> >> >>> > issue instead.
>>>> >> >>> >
>>>> >> >>> > Please let me know if this helps you, or whether you need to go
>>>> >> >>> > deeper into debugging.
>>>> >> >>> >
>>>> >> >>> > Karl
>>>> >> >>> >
>>>> >> >>> >
>>>> >> >>> > On Sun, Sep 12, 2010 at 4:05 PM, Martijn v Groningen
>>>> >> >>> > <ma...@gmail.com> wrote:
>>>> >> >>> >>
>>>> >> >>> >> I didn't notice that I was under the upstream-changes directory.
>>>> >> >>> >> Thanks for pointing that out.
>>>> >> >>> >>
>>>> >> >>> >> In Solr I have a wildcard (*) dynamic field, so everything acf
>>>> >> >>> >> sends should end up in my index (or at least that is what I
>>>> >> >>> >> assume). I also did some debugging in the Solr connector and I
>>>> >> >>> >> noticed that no metadata was sent to Solr. I didn't create field
>>>> >> >>> >> mappings in my acf job. Do you always have to create mappings for
>>>> >> >>> >> metadata?
>>>> >> >>> >>
>>>> >> >>> >> Martijn
>>>> >> >>> >>
>>>> >> >>> >> On 12 September 2010 21:09, Karl Wright <da...@gmail.com>
>>>> >> >>> >> wrote:
>>>> >> >>> >> > The source for upstream changes is under
>>>> >> >>> >> > lcf/upstream-changes/httpclient, not under trunk.
>>>> >> >>> >> >
>>>> >> >>> >> > As for the metadata, how are you determining that no metadata
>>>> >> >>> >> > is being indexed?  If this is Solr you are indexing into, have
>>>> >> >>> >> > you set up the appropriate metadata/field mappings?
>>>> >> >>> >> >
>>>> >> >>> >> > Karl
>>>> >> >>> >> >
>>>> >> >>> >> > On 9/12/10, Martijn v Groningen <ma...@gmail.com>
>>>> >> >>> >> > wrote:
>>>> >> >>> >> >> To authenticate with SharePoint I had to include the domain
>>>> >> >>> >> >> as well. Also, the UI reported an error if I didn't specify
>>>> >> >>> >> >> the username in a domain / username format. Maybe this
>>>> >> >>> >> >> httpclient issue was just particular to the SharePoint /
>>>> >> >>> >> >> domain controller installation I was working with. I also
>>>> >> >>> >> >> couldn't find the source of the acf version of httpclient.
>>>> >> >>> >> >> Is it hosted in another source repository?
>>>> >> >>> >> >>
>>>> >> >>> >> >> I still don't understand why the documents I crawled didn't
>>>> >> >>> >> >> have any metadata associated with them. In the job
>>>> >> >>> >> >> configuration I was able to choose which metadata I wanted
>>>> >> >>> >> >> to include. Do you have an idea what might be the cause of
>>>> >> >>> >> >> this?
>>>> >> >>> >> >>
>>>> >> >>> >> >> Regards,
>>>> >> >>> >> >>
>>>> >> >>> >> >> Martijn
>>>> >> >>> >> >>
>>>> >> >>> >> >> On 12 September 2010 18:40, Karl Wright <da...@gmail.com>
>>>> >> >>> >> >> wrote:
>>>> >> >>> >> >>> Hi Martijn,
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> The ACF version of httpclient has support for NTLMv1,
>>>> >> >>> >> >>> NTLMv2, and NTLM2 protocols.  The standard client does not.
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> What this means practically for you depends on how the
>>>> >> >>> >> >>> Windows domain controller you are working with is
>>>> >> >>> >> >>> configured.  You cannot use the off-the-shelf httpclient and
>>>> >> >>> >> >>> still authenticate if the domain controller is configured to
>>>> >> >>> >> >>> not allow LM connections, which is what Microsoft recommends
>>>> >> >>> >> >>> people do.
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> Since the ACF version of httpclient will always try to
>>>> >> >>> >> >>> connect using NTLMv2, this means that you must be more
>>>> >> >>> >> >>> rigorous about setting up your client machine.  First, it
>>>> >> >>> >> >>> must have a name, and it must have a machine account in the
>>>> >> >>> >> >>> domain.  Second, NTLMv2 is much more picky about how you
>>>> >> >>> >> >>> specify user and domain.  The end user documentation
>>>> >> >>> >> >>> provides details that may be helpful to you in this regard.
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> Thanks,
>>>> >> >>> >> >>> Karl
>>>> >> >>> >> >>>
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> On Sun, Sep 12, 2010 at 5:00 AM, Martijn v Groningen
>>>> >> >>> >> >>> <ma...@gmail.com> wrote:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Hi All,
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> I've configured the SharePoint connector (to connect to
>>>> >> >>> >> >>>> SharePoint 3.0), the Solr connector, and a job that adds
>>>> >> >>> >> >>>> documents to Solr. The only thing I'm missing is the
>>>> >> >>> >> >>>> metadata from SharePoint. For each document I need to know
>>>> >> >>> >> >>>> which users can access it. In the metadata tab on the job
>>>> >> >>> >> >>>> page I've configured the metadata to be included, but it
>>>> >> >>> >> >>>> doesn't end up in my Solr index. Does anybody know what I
>>>> >> >>> >> >>>> should do to also get the metadata into my index?
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> I also had another issue with the SharePoint connector,
>>>> >> >>> >> >>>> which I managed to solve, but I'm curious to know whether
>>>> >> >>> >> >>>> anyone else has encountered a similar issue.
>>>> >> >>> >> >>>> When I was setting up the SharePoint connector, I always
>>>> >> >>> >> >>>> got a 401 message as the status on the connectors page. I
>>>> >> >>> >> >>>> was sure I had entered the correct credentials. After some
>>>> >> >>> >> >>>> debugging I noticed that the NTLM data sent by the
>>>> >> >>> >> >>>> connector was different from what was sent when I did an
>>>> >> >>> >> >>>> HTTP POST with the Firefox Poster plugin to a SharePoint
>>>> >> >>> >> >>>> web service URL (I checked this with Wireshark). After
>>>> >> >>> >> >>>> writing a little test case with the httpclient used in
>>>> >> >>> >> >>>> acf, I got the same 401 error. I then ran the test with a
>>>> >> >>> >> >>>> clean httpclient (version 3.1), and that ran as expected:
>>>> >> >>> >> >>>> I got a response code 200 back with a SOAP response. I
>>>> >> >>> >> >>>> then used this version of httpclient (with some class
>>>> >> >>> >> >>>> files from the acf-provided jar that were missing in the
>>>> >> >>> >> >>>> plain jar file), and the connector worked as expected: I
>>>> >> >>> >> >>>> was able to index documents. Did anyone else have this
>>>> >> >>> >> >>>> particular issue? I noticed that acf is using httpclient
>>>> >> >>> >> >>>> 3.1 (according to the manifest file), but I'm curious to
>>>> >> >>> >> >>>> know why httpclient was modified.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> BTW I've been using the latest trunk version (I did a
>>>> >> >>> >> >>>> checkout last Tuesday). I'm also new to SharePoint.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Cheers,
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Martijn
>>>> >> >>> >> >>>
>>>> >> >>> >> >>>
>>>> >> >>> >> >>
>>>> >> >>> >> >
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >> --
>>>> >> >>> >> Kind regards,
>>>> >> >>> >>
>>>> >> >>> >> Martijn van Groningen

Re: Sharepoint connector question

Posted by Martijn v Groningen <ma...@gmail.com>.
Hi Karl,

Unfortunately, in the environment where I'm currently working,
installing a new SharePoint instance is not an option. Due to time
pressure we dropped the requirement to crawl documents from
SharePoint.

Martijn

On 15 September 2010 23:53, Karl Wright <da...@gmail.com> wrote:
> Any news on this issue?
> Karl
>
> On Mon, Sep 13, 2010 at 3:00 PM, Karl Wright <da...@gmail.com> wrote:
>>
>> I would expect that domain administrator privs would be sufficient. ;-)
>>
>> Unfortunately, SharePoint (and .NET services in general) often seem to
>> have unusual security problems with internal communication.  I've seen cases
>> where some of SharePoint's own web services have this issue.  Never was able
>> to figure out the problem.  Perhaps a .NET guru could, but that's not me.
>>
>> Karl
>>
>> On Mon, Sep 13, 2010 at 2:55 PM, Martijn v Groningen
>> <ma...@gmail.com> wrote:
>>>
>>> That could explain the error. I've installed and uninstalled the
>>> web service extension a few times, but I know for sure that the last
>>> time I installed it as domain administrator. Last week I used a
>>> plain-vanilla SharePoint (trial version), and the web service extension
>>> worked there without any problem.
>>>
>>> On 13 September 2010 19:30, Karl Wright <da...@gmail.com> wrote:
>>> > The key error is the following:
>>> >
>>> > >>>>>>
>>> > <soap:Fault>
>>> >   <faultcode>soap:Client</faultcode>
>>> >   <faultstring>The request failed with HTTP status 401: Unauthorized.</faultstring>
>>> >   <faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>
>>> >   <detail>
>>> >     <Error>
>>> >       <ErrorNumber>1000</ErrorNumber>
>>> >       <ErrorMessage>The request failed with HTTP status 401: Unauthorized.</ErrorMessage>
>>> >       <ErrorSource>System.Web.Services</ErrorSource>
>>> >     </Error>
>>> >   </detail>
>>> > </soap:Fault>
>>> > <<<<<<
>>> >
>>> > Clearly the MCPermissions web service does not have sufficient
>>> > permissions to perform its task.  I don't recall ever having seen
>>> > this before, but perhaps during installation you were not logged in
>>> > as a user that has enough permission to perform security lookups.
>>> >
>>> > Karl
>>> >
>>> > On Mon, Sep 13, 2010 at 12:15 PM, Martijn v Groningen
>>> > <ma...@gmail.com> wrote:
>>> >>
>>> >> Hi Karl,
>>> >>
>>> >> Today I'm not at the environment where I can verify this, but I'll
>>> >> definitely check it. However, I ran into another issue with the
>>> >> SharePoint connector. In another environment I installed the
>>> >> MetaCarta SharePoint web service extensions, but when I execute the
>>> >> following POST:
>>> >> HTTP POST http://[host]/rotterdamn/_vti_bin/MCPermissions.asmx
>>> >> <?xml version="1.0" encoding="UTF-8"?>
>>> >> <soapenv:Envelope
>>> >>     xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
>>> >>     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>>> >>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>>> >>   <soapenv:Body>
>>> >>     <GetPermissionCollection
>>> >>         xmlns="http://microsoft.com/sharepoint/webpartpages/">
>>> >>       <objectName>/</objectName>
>>> >>       <objectType>Web</objectType>
>>> >>     </GetPermissionCollection>
>>> >>   </soapenv:Body>
>>> >> </soapenv:Envelope>
>>> >>
>>> >> I get back the following response (HTTP 500):
>>> >> <?xml version="1.0" encoding="utf-8"?>
>>> >> <soap:Envelope
>>> >>     xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
>>> >>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>> >>     xmlns:xsd="http://www.w3.org/2001/XMLSchema">
>>> >>   <soap:Body>
>>> >>     <soap:Fault>
>>> >>       <faultcode>soap:Client</faultcode>
>>> >>       <faultstring>The request failed with HTTP status 401: Unauthorized.</faultstring>
>>> >>       <faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>
>>> >>       <detail>
>>> >>         <Error>
>>> >>           <ErrorNumber>1000</ErrorNumber>
>>> >>           <ErrorMessage>The request failed with HTTP status 401: Unauthorized.</ErrorMessage>
>>> >>           <ErrorSource>System.Web.Services</ErrorSource>
>>> >>         </Error>
>>> >>       </detail>
>>> >>     </soap:Fault>
>>> >>   </soap:Body>
>>> >> </soap:Envelope>
>>> >>
>>> >> It seems that I only get an error with the MCPermissions web service
>>> >> call. Other calls such as GetListCollection work fine. I'm
>>> >> authenticating with the domain administrator account. This environment
>>> >> also has SharePoint 3.0 installed. I'm making these posts to
>>> >> SharePoint with the Firefox Poster plugin. Also, the URL in the
>>> >> response is without the subsite.
>>> >>
>>> >> Also important to note is that browsing to
>>> >> http://[host]/[subsite]/_vti_bin/MCPermissions.asmx shows the
>>> >> GetPermissionCollection operation. That is what I checked after
>>> >> installing the web service extension. Do you have a clue what might be
>>> >> wrong here?
>>> >>
>>> >> Thanks,
>>> >>
>>> >> Martijn
>>> >>
>>> >> On 13 September 2010 10:27, Karl Wright <da...@gmail.com> wrote:
>>> >> > Hi Martijn,
>>> >> >
>>> >> > For the 401 error, here's something also worth trying, to remove the
>>> >> > possibility that your error has anything to do with other recent
>>> >> > changes. Can you check out the following:
>>> >> >
>>> >> > svn co -r987345 https://svn.apache.org/repos/asf/incubator/lcf/trunk/modules/lib
>>> >> >
>>> >> > In the checked-out lib area, you will see a jar called
>>> >> > commons-httpclient-lcf.jar.  Replace commons-httpclient-acf.jar with
>>> >> > it (renaming it to commons-httpclient-acf.jar, of course), and try
>>> >> > running with it.  If your 401 error no longer happens, then it means
>>> >> > something was messed up, and I'll need to do some research.
>>> >> >
>>> >> > Thanks,
>>> >> > Karl
>>> >> >
>>> >> > On Sun, Sep 12, 2010 at 5:02 PM, Karl Wright <da...@gmail.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> I confirmed that without any mappings set, the Solr Connector
>>> >> >> *should* just be passing the metadata through using the metadata's
>>> >> >> name as the Solr field name.
>>> >> >>
>>> >> >> For debugging, if you could post the Solr output from one update
>>> >> >> operation, I'd love to see if any metadata seems to be in it.
>>> >> >> Potentially it's there but the Solr schema is not right somehow -
>>> >> >> that should be the first thing we verify.
>>> >> >>
>>> >> >> Karl
>>> >> >>
>>> >> >>
>>> >> >> On Sun, Sep 12, 2010 at 4:50 PM, Martijn v Groningen
>>> >> >> <ma...@gmail.com> wrote:
>>> >> >>>
>>> >> >>> Tomorrow I'll dive into the code and do some more debugging. Last
>>> >> >>> week I didn't specify any mappings in the mapping tab for the
>>> >> >>> metadata fields I selected in the metadata tab. But this shouldn't
>>> >> >>> be the problem, right?
>>> >> >>>
>>> >> >>> Thanks,
>>> >> >>>
>>> >> >>> Martijn
>>> >> >>>
>>> >> >>> On 12 September 2010 22:29, Karl Wright <da...@gmail.com>
>>> >> >>> wrote:
>>> >> >>> > Martijn,
>>> >> >>> >
>>> >> >>> > (1) The precise svn url for the acf version of httpclient is as
>>> >> >>> > follows.  My apologies for any earlier confusion - I was away from
>>> >> >>> > my computer at the time.
>>> >> >>> >
>>> >> >>> > https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x
>>> >> >>> >
>>> >> >>> > (2) Each time the Solr connector posts into Solr, you should see a
>>> >> >>> > set of argument names and values dumped to standard out (or the
>>> >> >>> > log).  So it should be easy to see what is being sent, and whether
>>> >> >>> > the arguments in fact are the correct ones for the extracting
>>> >> >>> > update request handler, or not.  Furthermore, the Solr output
>>> >> >>> > connector recently had a tab added which performs the mapping I
>>> >> >>> > alluded to.  This mapping is designed to translate metadata coming
>>> >> >>> > from a connector like SharePoint, into fields that you presumably
>>> >> >>> > have in your Solr schema.  However, if you don't set anything, the
>>> >> >>> > fields are not changed, and you should see an argument for every
>>> >> >>> > metadata field, something like: literal.xxx=yyy.
>>> >> >>> >
>>> >> >>> > If you have a document that you *know* has metadata, and you've
>>> >> >>> > specified that metadata in the job, and you run the job after you
>>> >> >>> > specify that metadata, but still see no literal.xxx=yyy
>>> >> >>> > corresponding to it in the Solr output, then we should spend some
>>> >> >>> > time chasing this problem down.  Be wary because incremental
>>> >> >>> > crawling means you'll probably not see your document processed
>>> >> >>> > again unless you either change it in SharePoint, or delete and
>>> >> >>> > recreate the job.  But be reassured that SharePoint metadata was
>>> >> >>> > covered by the old MetaCarta tests, and there have been no changes
>>> >> >>> > of any significance to the SharePoint connector since then, so I
>>> >> >>> > have no explanation why it would not work for you too.  That's why
>>> >> >>> > I'm spending time trying to figure out if this is a Solr connector
>>> >> >>> > issue instead.
>>> >> >>> >
>>> >> >>> > Please let me know if this helps you, or whether you need to go
>>> >> >>> > deeper into debugging.
>>> >> >>> >
>>> >> >>> > Karl
>>> >> >>> >
>>> >> >>> >
>>> >> >>> > On Sun, Sep 12, 2010 at 4:05 PM, Martijn v Groningen
>>> >> >>> > <ma...@gmail.com> wrote:
>>> >> >>> >>
>>> >> >>> >> I didn't notice that I was under the upstream-changes directory.
>>> >> >>> >> Thanks for pointing that out.
>>> >> >>> >>
>>> >> >>> >> In Solr I have a wildcard (*) dynamic field, so everything acf
>>> >> >>> >> sends should end up in my index (or at least that is what I
>>> >> >>> >> assume). I also did some debugging in the Solr connector and I
>>> >> >>> >> noticed that no metadata was sent to Solr. I didn't create field
>>> >> >>> >> mappings in my acf job. Do you always have to create mappings for
>>> >> >>> >> metadata?
>>> >> >>> >>
>>> >> >>> >> Martijn
>>> >> >>> >>
>>> >> >>> >> On 12 September 2010 21:09, Karl Wright <da...@gmail.com>
>>> >> >>> >> wrote:
>>> >> >>> >> > The source for upstream changes is under
>>> >> >>> >> > lcf/upstream-changes/httpclient, not under trunk.
>>> >> >>> >> >
>>> >> >>> >> > As for the metadata, how are you determining that no metadata
>>> >> >>> >> > is being indexed?  If this is Solr you are indexing into, have
>>> >> >>> >> > you set up the appropriate metadata/field mappings?
>>> >> >>> >> >
>>> >> >>> >> > Karl
>>> >> >>> >> >
>>> >> >>> >> > On 9/12/10, Martijn v Groningen <ma...@gmail.com>
>>> >> >>> >> > wrote:
>>> >> >>> >> >> To authenticate with SharePoint I had to include the domain
>>> >> >>> >> >> as well. Also, the UI reported an error if I didn't specify
>>> >> >>> >> >> the username in a domain / username format. Maybe this
>>> >> >>> >> >> httpclient issue was just particular to the SharePoint /
>>> >> >>> >> >> domain controller installation I was working with. I also
>>> >> >>> >> >> couldn't find the source of the acf version of httpclient.
>>> >> >>> >> >> Is it hosted in another source repository?
>>> >> >>> >> >>
>>> >> >>> >> >> I still don't understand why the documents I crawled didn't
>>> >> >>> >> >> have any metadata associated with them. In the job
>>> >> >>> >> >> configuration I was able to choose which metadata I wanted
>>> >> >>> >> >> to include. Do you have an idea what might be the cause of
>>> >> >>> >> >> this?
>>> >> >>> >> >>
>>> >> >>> >> >> Regards,
>>> >> >>> >> >>
>>> >> >>> >> >> Martijn
>>> >> >>> >> >>
>>> >> >>> >> >> On 12 September 2010 18:40, Karl Wright <da...@gmail.com>
>>> >> >>> >> >> wrote:
>>> >> >>> >> >>> Hi Martijn,
>>> >> >>> >> >>>
>>> >> >>> >> >>> The ACF version of httpclient has support for NTLMv1,
>>> >> >>> >> >>> NTLMv2,
>>> >> >>> >> >>> and
>>> >> >>> >> >>> NTLM2
>>> >> >>> >> >>> protocols.  The standard client does not.
>>> >> >>> >> >>>
>>> >> >>> >> >>> What this means practically for you depends on how the
>>> >> >>> >> >>> Windows
>>> >> >>> >> >>> domain
>>> >> >>> >> >>> controller you are working with is configured.  You cannot
>>> >> >>> >> >>> use
>>> >> >>> >> >>> the
>>> >> >>> >> >>> off-the-shelf httpclient and still authenticate if the
>>> >> >>> >> >>> domain
>>> >> >>> >> >>> controller
>>> >> >>> >> >>> is
>>> >> >>> >> >>> configured to not allow LM connections, which is what
>>> >> >>> >> >>> Microsoft
>>> >> >>> >> >>> recommends
>>> >> >>> >> >>> people do.
>>> >> >>> >> >>>
>>> >> >>> >> >>> Since the ACF version of httpclient will always try to
>>> >> >>> >> >>> connect
>>> >> >>> >> >>> using
>>> >> >>> >> >>> NTLMv2,
>>> >> >>> >> >>> this means that you must be more rigorous about setting up
>>> >> >>> >> >>> your
>>> >> >>> >> >>> client
>>> >> >>> >> >>> machine.  First, it must have a name, and it must have a
>>> >> >>> >> >>> machine
>>> >> >>> >> >>> account
>>> >> >>> >> >>> in
>>> >> >>> >> >>> the domain.  Second, NTLMv2 is much more picky about how
>>> >> >>> >> >>> you
>>> >> >>> >> >>> specify
>>> >> >>> >> >>> user
>>> >> >>> >> >>> and domain.  The end user documentation provides details
>>> >> >>> >> >>> that
>>> >> >>> >> >>> may
>>> >> >>> >> >>> be
>>> >> >>> >> >>> helpful
>>> >> >>> >> >>> to you in this regard.
>>> >> >>> >> >>>
>>> >> >>> >> >>> Thanks,
>>> >> >>> >> >>> Karl
>>> >> >>> >> >>>
>>> >> >>> >> >>>
>>> >> >>> >> >>> On Sun, Sep 12, 2010 at 5:00 AM, Martijn v Groningen
>>> >> >>> >> >>> <ma...@gmail.com> wrote:
>>> >> >>> >> >>>>
>>> >> >>> >> >>>> Hi All,
>>> >> >>> >> >>>>
>>> >> >>> >> >>>> I've configured the Sharepoint connector (to connect to
>>> >> >>> >> >>>> sharepoint
>>> >> >>> >> >>>> 3.0), Solr connector and a job that adds documents into
>>> >> >>> >> >>>> Solr.
>>> >> >>> >> >>>> The
>>> >> >>> >> >>>> only
>>> >> >>> >> >>>> thing that I'm missing is the meta data from Sharepoint.
>>> >> >>> >> >>>> Per
>>> >> >>> >> >>>> document
>>> >> >>> >> >>>> I need to know which users can access it. In the metadata
>>> >> >>> >> >>>> tab
>>> >> >>> >> >>>> on
>>> >> >>> >> >>>> the
>>> >> >>> >> >>>> job page I've configured the metadata to be included, but
>>> >> >>> >> >>>> this
>>> >> >>> >> >>>> doesn't
>>> >> >>> >> >>>> end up in my Solr index. Does anybody know what I should
>>> >> >>> >> >>>> do to
>>> >> >>> >> >>>> also
>>> >> >>> >> >>>> have the metadata in my index?
>>> >> >>> >> >>>>
>>> >> >>> >> >>>> I also had another issue with the Sharepoint connector
>>> >> >>> >> >>>> which I
>>> >> >>> >> >>>> managed
>>> >> >>> >> >>>> to solve. But I'm curious to know if someone else
>>> >> >>> >> >>>> encountered
>>> >> >>> >> >>>> a
>>> >> >>> >> >>>> similar issue.
>>> >> >>> >> >>>> When I was setting up the sharepoint connecter I always
>>> >> >>> >> >>>> got a
>>> >> >>> >> >>>> 401
>>> >> >>> >> >>>> message on the connectors page as status. I was sure I
>>> >> >>> >> >>>> entered
>>> >> >>> >> >>>> the
>>> >> >>> >> >>>> correct credentials. After some debugging I noticed that
>>> >> >>> >> >>>> the
>>> >> >>> >> >>>> NLTM
>>> >> >>> >> >>>> data
>>> >> >>> >> >>>> that was send to Solr was different then when I did a http
>>> >> >>> >> >>>> post
>>> >> >>> >> >>>> with
>>> >> >>> >> >>>> Firefox poster plugin to a Sharepoint webservice url (I
>>> >> >>> >> >>>> check
>>> >> >>> >> >>>> this
>>> >> >>> >> >>>> with Wireshark). After writing a little test case with
>>> >> >>> >> >>>> httpclient
>>> >> >>> >> >>>> used
>>> >> >>> >> >>>> in afc, I got the same 401 error. I then ran the test with
>>> >> >>> >> >>>> a
>>> >> >>> >> >>>> clean
>>> >> >>> >> >>>> http client (version 3.1), that ran as expected. I got a
>>> >> >>> >> >>>> response
>>> >> >>> >> >>>> code
>>> >> >>> >> >>>> 200 back with a soap response. I then used this version of
>>> >> >>> >> >>>> http
>>> >> >>> >> >>>> client
>>> >> >>> >> >>>> (with some class filesfrom the afc provided jar that were
>>> >> >>> >> >>>> missing
>>> >> >>> >> >>>> is
>>> >> >>> >> >>>> the plain jar file) and the connector worked as expected
>>> >> >>> >> >>>> as I
>>> >> >>> >> >>>> was
>>> >> >>> >> >>>> able
>>> >> >>> >> >>>> to index documents. Did someone else have this particular
>>> >> >>> >> >>>> issue?
>>> >> >>> >> >>>> I
>>> >> >>> >> >>>> noticed that acf is using httpclient 3.1 (from the
>>> >> >>> >> >>>> manifest
>>> >> >>> >> >>>> file),
>>> >> >>> >> >>>> but
>>> >> >>> >> >>>> I'm curious to know why http client was modified.
>>> >> >>> >> >>>>
>>> >> >>> >> >>>> BTW I've been using the latest trunk version (I did a
>>> >> >>> >> >>>> checkout
>>> >> >>> >> >>>> last
>>> >> >>> >> >>>> tuesday). I'm also new to Sharepoint
>>> >> >>> >> >>>>
>>> >> >>> >> >>>> Cheers,
>>> >> >>> >> >>>>
>>> >> >>> >> >>>> Martijn
>>> >> >>> >> >>>
>>> >> >>> >> >>>
>>> >> >>> >> >>
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>> >>
>>> >> >>> >>
>>> >> >>> >> --
>>> >> >>> >> Met vriendelijke groet,
>>> >> >>> >>
>>> >> >>> >> Martijn van Groningen
>>> >> >>> >
>>> >> >>> >
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>> --
>>> >> >>> Met vriendelijke groet,
>>> >> >>>
>>> >> >>> Martijn van Groningen
>>> >> >>
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Met vriendelijke groet,
>>> >>
>>> >> Martijn van Groningen
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Met vriendelijke groet,
>>>
>>> Martijn van Groningen
>>
>
>



-- 
Met vriendelijke groet,

Martijn van Groningen

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
Any news on this issue?
Karl

On Mon, Sep 13, 2010 at 3:00 PM, Karl Wright <da...@gmail.com> wrote:

> I would expect that domain administrator privs would be sufficient. ;-)
>
> Unfortunately, SharePoint (and .NET services in general) often seem to have
> unusual security problems with internal communication.  I've seen cases
> where some of SharePoint's own web services have this issue.  Never was able
> to figure out the problem.  Perhaps a .NET guru could, but that's not me.
>
> Karl

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
I would expect that domain administrator privs would be sufficient. ;-)

Unfortunately, SharePoint (and .NET services in general) often seem to have
unusual security problems with internal communication.  I've seen cases
where some of SharePoint's own web services have this issue.  Never was able
to figure out the problem.  Perhaps a .NET guru could, but that's not me.
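
In case it helps with narrowing this down, here is a minimal sketch (plain
Python, not part of ACF; the function name parse_soap_fault and the sample
constant are made up for illustration) that pulls the fault fields out of a
response like the one quoted below. The only point is to separate the outer
HTTP 500 from the inner 401 that the service reports in its fault detail.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope, as it appears in the quoted response.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# The fault response from this thread, joined back into one document.
SAMPLE_RESPONSE = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    "<soap:Body><soap:Fault><faultcode>soap:Client</faultcode>"
    "<faultstring>The request failed with HTTP status 401: Unauthorized."
    "</faultstring>"
    "<faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>"
    "<detail><Error><ErrorNumber>1000</ErrorNumber>"
    "<ErrorMessage>The request failed with HTTP status 401: Unauthorized."
    "</ErrorMessage>"
    "<ErrorSource>System.Web.Services</ErrorSource></Error></detail>"
    "</soap:Fault></soap:Body></soap:Envelope>"
)


def parse_soap_fault(xml_text):
    """Return the fault fields as a dict, or None if there is no soap:Fault."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    fault = root.find(".//{%s}Fault" % SOAP_NS)
    if fault is None:
        return None

    def text(path):
        # The children of soap:Fault are unqualified in this response.
        node = fault.find(path)
        return node.text.strip() if node is not None and node.text else None

    return {
        "faultcode": text("faultcode"),
        "faultstring": text("faultstring"),
        "faultactor": text("faultactor"),
        "error_number": text("detail/Error/ErrorNumber"),
        "error_source": text("detail/Error/ErrorSource"),
    }


if __name__ == "__main__":
    for key, value in parse_soap_fault(SAMPLE_RESPONSE).items():
        print(key, "=", value)
```

If the error source comes back as System.Web.Services, as it does here, that
would suggest the 401 is raised by a call the web service itself makes, not
by the initial authentication of the request, although I can't say that for
certain.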

Karl

On Mon, Sep 13, 2010 at 2:55 PM, Martijn v Groningen <
martijn.is.hier@gmail.com> wrote:

> That could explain the error. I've installed and uninstalled the
> webservice extension a few times, but I know for sure that the last
> time I installed it as domain administrator. Last week I used a
> plain-vanilla Sharepoint (trial version), and the webservice extension
> worked there without any problem.
>
> On 13 September 2010 19:30, Karl Wright <da...@gmail.com> wrote:
> > The key error is the following:
> >
> > >>>>>>
> > <soap:Fault><faultcode>soap:Client</faultcode><faultstring>The
> > request failed with HTTP status 401:
> > Unauthorized.</faultstring><faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor><detail><Error><ErrorNumber>1000</ErrorNumber><ErrorMessage>The
> > request failed with HTTP status 401:
> > Unauthorized.</ErrorMessage><ErrorSource>System.Web.Services</ErrorSource></Error></detail></soap:Fault>
> > <<<<<<
> >
> > Clearly the MCPermissions web service does not have sufficient permissions
> > to perform its task.  I don't recall ever having seen this before, but
> > perhaps during installation you were not logged in as a user that has enough
> > permission to perform security lookups.
> >
> > Karl
> >
> > On Mon, Sep 13, 2010 at 12:15 PM, Martijn v Groningen
> > <ma...@gmail.com> wrote:
> >>
> >> Hi Karl,
> >>
> >> Today I'm not at the environment where I can verify this, but I'll
> >> definitely check this. But I ran into another issue with the
> >> Sharepoint connector. In another environment I installed the
> >> Metacarta Sharepoint webservice extensions, but when I execute the
> >> following post:
> >> HTTP POST http://[host]/rotterdamn/_vti_bin/MCPermissions.asmx
> >> <?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope
> >> xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
> >> xmlns:xsd="http://www.w3.org/2001/XMLSchema"
> >> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soapenv:Body><GetPermissionCollection
> >> xmlns="http://microsoft.com/sharepoint/webpartpages/"><objectName>/</objectName><objectType>Web</objectType></GetPermissionCollection></soapenv:Body></soapenv:Envelope>
> >>
> >> I get back the following response (http 500):
> >> <?xml version="1.0" encoding="utf-8"?><soap:Envelope
> >> xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
> >> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >> xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><soap:Fault><faultcode>soap:Client</faultcode><faultstring>The
> >> request failed with HTTP status 401:
> >> Unauthorized.</faultstring><faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor><detail><Error><ErrorNumber>1000</ErrorNumber><ErrorMessage>The
> >> request failed with HTTP status 401:
> >> Unauthorized.</ErrorMessage><ErrorSource>System.Web.Services</ErrorSource></Error></detail></soap:Fault></soap:Body></soap:Envelope>
> >>
> >> It seems that I only get an error with the MCPermissions webservice
> >> call. Other calls such as GetListCollection work fine. I'm
> >> authenticating with the domain administrator account. This environment
> >> also has Sharepoint 3.0 installed. I'm making these posts to
> >> Sharepoint with the Firefox http poster plugin.  Also the url in the
> >> response is without the subsite.
> >>
> >> Also important to note is that browsing to
> >> http://[host]/[subsite]/_vti_bin/MCPermissions.asmx shows the
> >> GetPermissionsCollection operation. That is what I checked after
> >> installing the webservice extension. Do you have a clue what might be
> >> wrong here?
> >>
> >> Thanks,
> >>
> >> Martijn
> >>
> >> On 13 September 2010 10:27, Karl Wright <da...@gmail.com> wrote:
> >> > Hi Martijn,
> >> >
> >> > For the 401 error, here's something also worth trying, to remove the
> >> > possibility that your error has anything to do with other recent
> >> > changes.
> >> > Can you check out the following:
> >> >
> >> > svn co -r987345
> >> > https://svn.apache.org/repos/asf/incubator/lcf/trunk/modules/lib
> >> >
> >> > In the checkout lib area, you will see a jar called
> >> > commons-httpclient-lcf.jar.  Replace commons-httpclient-acf.jar with it
> >> > (renaming to commons-httpclient-acf.jar, of course), and try running with
> >> > it.  If your 401 error no longer happens, then it means something was messed
> >> > up, and I'll need to do some research.
> >> >
> >> > Thanks,
> >> > Karl
> >> >
> >> > On Sun, Sep 12, 2010 at 5:02 PM, Karl Wright <da...@gmail.com> wrote:
> >> >>
> >> >> I confirmed that without any mappings set, the Solr Connector *should*
> >> >> just be passing the metadata through using the metadata's name as the
> >> >> Solr field name.
> >> >>
> >> >> For debugging, if you could post the Solr output from one update
> >> >> operation, I'd love to see if any metadata seems to be in it.  Potentially
> >> >> it's there but the Solr schema is not right somehow - that should be the
> >> >> first thing we verify.
> >> >>
> >> >> Karl
> >> >>
> >> >>
> >> >> On Sun, Sep 12, 2010 at 4:50 PM, Martijn v Groningen
> >> >> <ma...@gmail.com> wrote:
> >> >>>
> >> >>> Tomorrow I'll dive into code and do some more debugging. Last week I
> >> >>> didn't specify any mappings in the mapping tab for the meta data
> >> >>> fields I selected in the metadata tab. But this shouldn't be the
> >> >>> problem, right?
> >> >>>
> >> >>> Thanks,
> >> >>>
> >> >>> Martijn
> >> >>>
> >> >>> On 12 September 2010 22:29, Karl Wright <da...@gmail.com> wrote:
> >> >>> > Martijn,
> >> >>> >
> >> >>> > (1) The precise svn url for the acf version of httpclient is as follows.
> >> >>> > My apologies for any earlier confusion - I was away from my computer at
> >> >>> > the time.
> >> >>> >
> >> >>> > https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x
> >> >>> >
> >> >>> > (2) Each time the Solr connector posts into Solr, you should see a set of
> >> >>> > argument names and values dumped to standard out (or the log).  So it should
> >> >>> > be easy to see what is being sent, and whether the arguments in fact are the
> >> >>> > correct ones for the extracting update request handler, or not.
> >> >>> > Furthermore, the Solr output connector recently had a tab added which
> >> >>> > performs the mapping I alluded to.  This mapping is designed to translate
> >> >>> > metadata coming from a connector like SharePoint, into fields that you
> >> >>> > presumably have in your Solr schema.  However, if you don't set anything,
> >> >>> > the fields are not changed, and you should see an argument for every
> >> >>> > metadata field, something like: literal.xxx=yyy.
> >> >>> >
> >> >>> > If you have a document that you *know* has metadata, and you've specified
> >> >>> > that metadata in the job, and you run the job after you specify that
> >> >>> > metadata, but still see no literal.xxx=yyy corresponding to it in the Solr
> >> >>> > output, then we should spend some time chasing this problem down.  Be wary
> >> >>> > because incremental crawling means you'll probably not see your document
> >> >>> > processed again unless you either change it in SharePoint, or delete and
> >> >>> > recreate the job.  But be reassured that SharePoint metadata was covered by
> >> >>> > the old MetaCarta tests, and there have been no changes of any significance
> >> >>> > to the SharePoint connector since then, so I have no explanation why it
> >> >>> > would not work for you too.  That's why I'm spending time trying to figure
> >> >>> > out if this is a Solr connector issue instead.
> >> >>> >
> >> >>> > Please let me know if this helps you, or whether you need to go deeper into
> >> >>> > debugging.
> >> >>> >
> >> >>> > Karl
> >> >>> >
> >> >>> >
> >> >>> > On Sun, Sep 12, 2010 at 4:05 PM, Martijn v Groningen
> >> >>> > <ma...@gmail.com> wrote:
> >> >>> >>
> >> >>> >> I didn't notice that I was under the upstream-changes directory.
> >> >>> >> Thanks for pointing that out.
> >> >>> >>
> >> >>> >> In Solr I have a wildcard (*) dynamic field, so everything acf
> >> >>> >> sends should end up in my index (or at least that is what I
> >> >>> >> assume). I also did some debugging in the Solr connector and I
> >> >>> >> noticed that no metadata was sent to Solr. I didn't create field
> >> >>> >> mappings in my acf job. Do you always have to make a mapping for
> >> >>> >> metadata?
> >> >>> >>
> >> >>> >> Martijn
> >> >>> >>
> >> >>> >> On 12 September 2010 21:09, Karl Wright <da...@gmail.com> wrote:
> >> >>> >> > The source for upstream changes is under
> >> >>> >> > lcf/upstream-changes/httpclient, not under trunk.
> >> >>> >> >
> >> >>> >> > As for the metadata, how are you determining that no metadata is
> >> >>> >> > being indexed?  If this is Solr you are indexing into, have you
> >> >>> >> > set up the appropriate metadata/field mappings?
> >> >>> >> >
> >> >>> >> > Karl
> >> >>> >> >
> >> >>> >> > On 9/12/10, Martijn v Groningen <ma...@gmail.com>
> >> >>> >> > wrote:
> >> >>> >> >> To authenticate with Sharepoint I had to include the domain as
> >> >>> >> >> well. Also, the ui reported an error if I didn't specify the
> >> >>> >> >> username in a domain / username format. Maybe this http client
> >> >>> >> >> issue was just particular to the Sharepoint / Domain Controller
> >> >>> >> >> installation I was working with. I also couldn't find the
> >> >>> >> >> source of the acf version of http client. Is it hosted in
> >> >>> >> >> another source repository?
> >> >>> >> >>
> >> >>> >> >> I still don't understand why the documents I crawled didn't
> >> >>> >> >> have any metadata associated with them. In the job
> >> >>> >> >> configuration I was able to choose which metadata I wanted to
> >> >>> >> >> include. Do you have an idea what might be the cause of this?
> >> >>> >> >>
> >> >>> >> >> Regards,
> >> >>> >> >>
> >> >>> >> >> Martijn
> >> >>> >> >>
> >> >>> >> >> On 12 September 2010 18:40, Karl Wright <da...@gmail.com>
> >> >>> >> >> wrote:
> >> >>> >> >>> Hi Martijn,
> >> >>> >> >>>
> >> >>> >> >>> The ACF version of httpclient has support for NTLMv1, NTLMv2,
> >> >>> >> >>> and NTLM2 protocols.  The standard client does not.
> >> >>> >> >>>
> >> >>> >> >>> What this means practically for you depends on how the Windows
> >> >>> >> >>> domain controller you are working with is configured.  You
> >> >>> >> >>> cannot use the off-the-shelf httpclient and still authenticate
> >> >>> >> >>> if the domain controller is configured to not allow LM
> >> >>> >> >>> connections, which is what Microsoft recommends people do.
> >> >>> >> >>>
> >> >>> >> >>> Since the ACF version of httpclient will always try to connect
> >> >>> >> >>> using NTLMv2, this means that you must be more rigorous about
> >> >>> >> >>> setting up your client machine.  First, it must have a name,
> >> >>> >> >>> and it must have a machine account in the domain.  Second,
> >> >>> >> >>> NTLMv2 is much more picky about how you specify user and
> >> >>> >> >>> domain.  The end user documentation provides details that may
> >> >>> >> >>> be helpful to you in this regard.
> >> >>> >> >>>
> >> >>> >> >>> Thanks,
> >> >>> >> >>> Karl
> >> >>> >> >>>
> >> >>> >> >>>
> >> >>> >> >>> On Sun, Sep 12, 2010 at 5:00 AM, Martijn v Groningen
> >> >>> >> >>> <ma...@gmail.com> wrote:
> >> >>> >> >>>>
> >> >>> >> >>>> Hi All,
> >> >>> >> >>>>
> >> >>> >> >>>> I've configured the Sharepoint connector (to connect to
> >> >>> >> >>>> Sharepoint 3.0), the Solr connector and a job that adds
> >> >>> >> >>>> documents into Solr. The only thing that I'm missing is the
> >> >>> >> >>>> metadata from Sharepoint. Per document I need to know which
> >> >>> >> >>>> users can access it. In the metadata tab on the job page I've
> >> >>> >> >>>> configured the metadata to be included, but this doesn't end
> >> >>> >> >>>> up in my Solr index. Does anybody know what I should do to
> >> >>> >> >>>> also have the metadata in my index?
> >> >>> >> >>>>
> >> >>> >> >>>> I also had another issue with the Sharepoint connector which I
> >> >>> >> >>>> managed to solve, but I'm curious to know if someone else
> >> >>> >> >>>> encountered a similar issue.
> >> >>> >> >>>> When I was setting up the Sharepoint connector I always got a
> >> >>> >> >>>> 401 message as status on the connectors page. I was sure I
> >> >>> >> >>>> entered the correct credentials. After some debugging I
> >> >>> >> >>>> noticed that the NTLM data that was sent to Solr was different
> >> >>> >> >>>> than when I did an http post with the Firefox Poster plugin to
> >> >>> >> >>>> a Sharepoint webservice url (I checked this with Wireshark).
> >> >>> >> >>>> After writing a little test case with the httpclient used in
> >> >>> >> >>>> acf, I got the same 401 error. I then ran the test with a
> >> >>> >> >>>> clean http client (version 3.1), which ran as expected: I got
> >> >>> >> >>>> a response code 200 back with a soap response. I then used
> >> >>> >> >>>> this version of http client (with some class files from the
> >> >>> >> >>>> acf-provided jar that were missing in the plain jar file) and
> >> >>> >> >>>> the connector worked as expected, as I was able to index
> >> >>> >> >>>> documents. Did someone else have this particular issue? I
> >> >>> >> >>>> noticed that acf is using httpclient 3.1 (from the manifest
> >> >>> >> >>>> file), but I'm curious to know why http client was modified.
> >> >>> >> >>>>
> >> >>> >> >>>> BTW I've been using the latest trunk version (I did a checkout
> >> >>> >> >>>> last Tuesday). I'm also new to Sharepoint.
> >> >>> >> >>>>
> >> >>> >> >>>> Cheers,
> >> >>> >> >>>>
> >> >>> >> >>>> Martijn
> >> >>> >> >>>
> >> >>> >> >>>
> >> >>> >> >>
> >> >>> >> >
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >> --
> >> >>> >> Met vriendelijke groet,
> >> >>> >>
> >> >>> >> Martijn van Groningen
> >> >>> >
> >> >>> >
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Met vriendelijke groet,
> >> >>>
> >> >>> Martijn van Groningen
> >> >>
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Met vriendelijke groet,
> >>
> >> Martijn van Groningen
> >
> >
>
>
>
> --
> Met vriendelijke groet,
>
> Martijn van Groningen
>
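
The pass-through behaviour Karl describes in the quoted message above (one
literal.xxx=yyy argument per metadata field, renamed only when a mapping is
set) can be sketched in a few lines of Python. This is an illustration only,
not the Solr connector's code: build_literal_args, field_map and the sample
field names are made up for the example.

```python
from urllib.parse import urlencode


def build_literal_args(metadata, field_map=None):
    """Turn document metadata into Solr 'literal.<field>' request arguments.

    metadata  -- {source_field_name: [value, ...]}
    field_map -- hypothetical {source_field_name: solr_field_name} mapping;
                 fields without an entry keep their original name.
    """
    field_map = field_map or {}
    args = []
    for name, values in metadata.items():
        target = field_map.get(name, name)  # unmapped fields pass through
        for value in values:
            args.append(("literal." + target, value))
    return args


# Sample metadata; the field names and values are invented for illustration.
metadata = {"Author": ["martijn"], "Title": ["Quarterly report"]}

# No mapping set: every field is sent under its own name.
print(urlencode(build_literal_args(metadata)))

# With a mapping: only the mapped field is renamed.
print(urlencode(build_literal_args(metadata, {"Author": "author_s"})))
```

If a field selected in the job never shows up in output of this shape, the
problem probably lies before Solr; if it does show up but is missing from the
index, the Solr schema is the more likely place to look.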

Re: Sharepoint connector question

Posted by Martijn v Groningen <ma...@gmail.com>.
That could explain the error. I've installed and uninstalled the
webservice extension a few times, but I know for sure that the last
time I installed it as domain administrator. Last week I used a
plain-vanilla Sharepoint (trial version), and the webservice extension
worked there without any problem.

On 13 September 2010 19:30, Karl Wright <da...@gmail.com> wrote:
> The key error is the following:
>
>>>>>>>
> <soap:Fault><faultcode>soap:Client</faultcode><faultstring>The
> request failed with HTTP status 401:
> Unauthorized.</faultstring><faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor><detail><Error><ErrorNumber>1000</ErrorNumber><ErrorMessage>The
> request failed with HTTP status 401:
> Unauthorized.</ErrorMessage><ErrorSource>System.Web.Services</ErrorSource></Error></detail></soap:Fault>
> <<<<<<
>
> Clearly the MCPermissions web service does not have sufficient permissions
> to perform its task.  I don't recall ever having seen this before, but
> perhaps during installation you were not logged in as a user that has enough
> permission to perform security lookups.
>
> Karl



-- 
Met vriendelijke groet,

Martijn van Groningen
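
For anyone comparing responses from several SharePoint instances, the SOAP
fault quoted in this message can be picked apart with the Python standard
library. The sketch below is only an illustration: parse_soap_fault is a
made-up helper name, and the sample response is a shortened copy of the fault
from this thread with the [host] placeholder left as it was posted.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def parse_soap_fault(body):
    """Return the fault fields of a SOAP 1.1 response, or None if no fault."""
    root = ET.fromstring(body)
    fault = root.find(".//{%s}Fault" % SOAP_NS)
    if fault is None:
        return None

    def text(path):
        # The fault's children are unqualified, so plain paths are enough.
        node = fault.find(path)
        return node.text.strip() if node is not None and node.text else None

    return {
        "faultcode": text("faultcode"),
        "faultstring": text("faultstring"),
        "faultactor": text("faultactor"),
        "error_number": text("detail/Error/ErrorNumber"),
        "error_source": text("detail/Error/ErrorSource"),
    }


# Shortened copy of the fault posted in this thread.
response = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body><soap:Fault><faultcode>soap:Client</faultcode>"
    "<faultstring>The request failed with HTTP status 401: Unauthorized."
    "</faultstring>"
    "<faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>"
    "<detail><Error><ErrorNumber>1000</ErrorNumber>"
    "<ErrorSource>System.Web.Services</ErrorSource></Error></detail>"
    "</soap:Fault></soap:Body></soap:Envelope>"
)

fault = parse_soap_fault(response)
print(fault["faultstring"], "| source:", fault["error_source"])
```

Logging faultactor next to the URL that was actually requested makes the
missing subsite in the response easy to spot.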

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
I was able to confirm that the current sharepoint connector with the current
httpclient connects to a plain-vanilla IIS (which is set up for NTLMv1) just
fine.
Due to infrastructure issues here, it will be a bit of time before I can
confirm that NTLMv2 still works, but this is somewhat reassuring.

So now I feel like the right way forward is for you to set up a
plain-vanilla SharePoint 3.0 instance, and get the connector working against
that.  It is much easier to figure out what's broken when you have a working
instance right next to you. ;-)

Karl
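
One way to make that side-by-side comparison repeatable is to build the
GetPermissionCollection request once and point it at both the working and the
failing instance. The Python sketch below only constructs the request; it
does not send it. The function name and the example.com hosts are
placeholders. Authentication is left out because the standard library has no
NTLM support, and no SOAPAction header is set because its value is not given
in this thread, although an .asmx endpoint may well expect one.

```python
import urllib.request
from xml.sax.saxutils import escape

# Envelope modelled on the request posted earlier in this thread.
ENVELOPE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<soapenv:Envelope'
    ' xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soapenv:Body>"
    '<GetPermissionCollection'
    ' xmlns="http://microsoft.com/sharepoint/webpartpages/">'
    "<objectName>{name}</objectName><objectType>{kind}</objectType>"
    "</GetPermissionCollection></soapenv:Body></soapenv:Envelope>"
)


def permission_request(site_url, object_name="/", object_type="Web"):
    """Build (but do not send) a POST to a site's MCPermissions.asmx."""
    body = ENVELOPE.format(name=escape(object_name), kind=escape(object_type))
    return urllib.request.Request(
        site_url.rstrip("/") + "/_vti_bin/MCPermissions.asmx",
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Placeholder hosts: one known-good instance, one that returns the 401 fault.
for site in ("http://working.example.com/subsite",
             "http://broken.example.com/subsite"):
    req = permission_request(site)
    print(req.get_method(), req.full_url, len(req.data), "bytes")
```

Sending the identical body to both instances with the same credentials should
narrow any difference in the responses down to the server side.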



On Mon, Sep 13, 2010 at 1:30 PM, Karl Wright <da...@gmail.com> wrote:

> The key error is the following:
>
> >>>>>>
> <soap:Fault><faultcode>soap:Client</faultcode><faultstring>The
> request failed with HTTP status 401:
> Unauthorized.</faultstring><faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor><detail><Error><ErrorNumber>1000</ErrorNumber><ErrorMessage>The
> request failed with HTTP status 401:
> Unauthorized.</ErrorMessage><ErrorSource>System.Web.Services</ErrorSource></Error></detail></soap:Fault>
> <<<<<<
>
> Clearly the MCPermissions web service does not have sufficient permissions
> to perform its task.  I don't recall ever having seen this before, but
> perhaps during installation you were not logged in as a user that has enough
> permission to perform security lookups.
>
> Karl
>
>
> On Mon, Sep 13, 2010 at 12:15 PM, Martijn v Groningen <
> martijn.is.hier@gmail.com> wrote:
>
>> Hi Karl,
>>
>> I'm not at the environment where I can verify this today, but I'll
>> definitely check it. I did run into another issue with the
>> Sharepoint connector, though. In another environment I installed the
>> Metacarta Sharepoint webservice extensions, but when I execute the
>> following post:
>> HTTP POST http://[host]/rotterdamn/_vti_bin/MCPermissions.asmx
>> <?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope
>> xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
>> xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soapenv:Body><GetPermissionCollection
>> xmlns="http://microsoft.com/sharepoint/webpartpages/"><objectName>/</objectName><objectType>Web</objectType></GetPermissionCollection></soapenv:Body></soapenv:Envelope>
>>
>> I get back the following response (http 500):
>> <?xml version="1.0" encoding="utf-8"?><soap:Envelope
>> xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>> xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><soap:Fault><faultcode>soap:Client</faultcode><faultstring>The
>> request failed with HTTP status 401:
>> Unauthorized.</faultstring><faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor><detail><Error><ErrorNumber>1000</ErrorNumber><ErrorMessage>The
>> request failed with HTTP status 401:
>> Unauthorized.</ErrorMessage><ErrorSource>System.Web.Services</ErrorSource></Error></detail></soap:Fault></soap:Body></soap:Envelope>
>>
>> It seems that I only get an error with the MCPermissions webservice
>> call. Other calls such as GetListCollection work fine. I'm
>> authenticating with the domain administrator account. This environment
>> also has Sharepoint 3.0 installed. I'm making these posts to
>> Sharepoint with the Firefox http poster plugin.  Also, the url in the
>> response is without the subsite.
>>
>> Also important to note is that browsing to
>> http://[host]/[subsite]/_vti_bin/MCPermissions.asmx shows the
>> GetPermissionsCollection operation. That is what I checked after
>> installing the webservice extension. Do you have a clue what might be
>> wrong here?
>>
>> Thanks,
>>
>> Martijn
>>
>> On 13 September 2010 10:27, Karl Wright <da...@gmail.com> wrote:
>> > Hi Martijn,
>> >
>> > For the 401 error, here's something also worth trying, to remove the
>> > possibility that your error has anything to do with other recent
>> > changes.  Can you check out the following:
>> >
>> > svn co -r987345
>> > https://svn.apache.org/repos/asf/incubator/lcf/trunk/modules/lib
>> >
>> > In the checkout lib area, you will see a jar called
>> > commons-httpclient-lcf.jar.  Replace commons-httpclient-acf.jar with it
>> > (renaming to commons-httpclient-acf.jar, of course), and try running
>> > with it.  If your 401 error no longer happens, then it means something
>> > was messed up, and I'll need to do some research.
>> >
>> > Thanks,
>> > Karl
>> >
>> > On Sun, Sep 12, 2010 at 5:02 PM, Karl Wright <da...@gmail.com> wrote:
>> >>
>> >> I confirmed that without any mappings set, the Solr Connector *should*
>> >> just be passing the metadata through using the metadata's name as the
>> >> Solr field name.
>> >>
>> >> For debugging, if you could post the Solr output from one update
>> >> operation, I'd love to see if any metadata seems to be in it.
>> >> Potentially it's there but the Solr schema is not right somehow - that
>> >> should be the first thing we verify.
>> >>
>> >> Karl
>> >>
>> >>
>> >> On Sun, Sep 12, 2010 at 4:50 PM, Martijn v Groningen
>> >> <ma...@gmail.com> wrote:
>> >>>
>> >>> Tomorrow I'll dive into the code and do some more debugging. Last
>> >>> week I didn't specify any mappings in the mapping tab for the
>> >>> metadata fields I selected in the metadata tab. But this shouldn't
>> >>> be the problem, right?
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Martijn
>> >>>
>> >>> On 12 September 2010 22:29, Karl Wright <da...@gmail.com> wrote:
>> >>> > Martijn,
>> >>> >
>> >>> > (1) The precise svn url for the acf version of httpclient is as
>> >>> > follows.  My apologies for any earlier confusion - I was away from
>> >>> > my computer at the time.
>> >>> >
>> >>> > https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x
>> >>> >
>> >>> > (2) Each time the solr connector posts into Solr, you should see a
>> >>> > set of argument names and values dumped to standard out (or the
>> >>> > log).  So it should be easy to see what is being sent, and whether
>> >>> > the arguments in fact are the correct ones for the extracting
>> >>> > update request handler, or not.  Furthermore, the Solr output
>> >>> > connector recently had a tab added which performs the mapping I
>> >>> > alluded to.  This mapping is designed to translate metadata coming
>> >>> > from a connector like SharePoint, into fields that you presumably
>> >>> > have in your Solr schema.  However, if you don't set anything, the
>> >>> > fields are not changed, and you should see an argument for every
>> >>> > metadata field, something like: literal.xxx=yyy.
>> >>> >
>> >>> > If you have a document that you *know* has metadata, and you've
>> >>> > specified that metadata in the job, and you run the job after you
>> >>> > specify that metadata, but still see no literal.xxx=yyy
>> >>> > corresponding to it in the Solr output, then we should spend some
>> >>> > time chasing this problem down.  Be wary because incremental
>> >>> > crawling means you'll probably not see your document processed
>> >>> > again unless you either change it in SharePoint, or delete and
>> >>> > recreate the job.  But be reassured that SharePoint metadata was
>> >>> > covered by the old MetaCarta tests, and there have been no changes
>> >>> > of any significance to the SharePoint connector since then, so I
>> >>> > have no explanation why it would not work for you too.  That's why
>> >>> > I'm spending time trying to figure out if this is a Solr connector
>> >>> > issue instead.
>> >>> >
>> >>> > Please let me know if this helps you, or whether you need to go
>> >>> > deeper into debugging.
>> >>> >
>> >>> > Karl
>> >>> >
>> >>> >
>> >>> > On Sun, Sep 12, 2010 at 4:05 PM, Martijn v Groningen
>> >>> > <ma...@gmail.com> wrote:
>> >>> >>
>> >>> >> I didn't notice that I was under the upstream-changes directory.
>> >>> >> Thanks for pointing that out.
>> >>> >>
>> >>> >> In Solr I have a wildcard (*) dynamic field, so everything acf
>> >>> >> sends should end up in my index (or at least that is what I
>> >>> >> assume). I also did some debugging in the Solr connector and I
>> >>> >> noticed that no metadata was sent to Solr. I didn't create field
>> >>> >> mappings in my acf job. Do you always have to make a mapping for
>> >>> >> metadata?
>> >>> >>
>> >>> >> Martijn
>> >>> >>
>> >>> >> On 12 September 2010 21:09, Karl Wright <da...@gmail.com> wrote:
>> >>> >> > The source for upstream changes is under
>> >>> >> > lcf/upstream-changes/httpclient, not under trunk.
>> >>> >> >
>> >>> >> > As for the metadata, how are you determining that no metadata is
>> >>> >> > being indexed?  If this is Solr you are indexing into, have you
>> >>> >> > set up the appropriate metadata/field mappings?
>> >>> >> >
>> >>> >> > Karl
>> >>> >> >
>> >>> >> > On 9/12/10, Martijn v Groningen <ma...@gmail.com>
>> >>> >> > wrote:
>> >>> >> >> To authenticate with Sharepoint I had to include the domain as
>> >>> >> >> well. Also, the ui reported an error if I didn't specify the
>> >>> >> >> username in a domain / username format. Maybe this http client
>> >>> >> >> issue was just particular to the Sharepoint / Domain Controller
>> >>> >> >> installation I was working with. I also couldn't find the
>> >>> >> >> source of the acf version of http client. Is it hosted in
>> >>> >> >> another source repository?
>> >>> >> >>
>> >>> >> >> I still don't understand why the documents I crawled didn't
>> >>> >> >> have any metadata associated with them. In the job
>> >>> >> >> configuration I was able to choose which metadata I wanted to
>> >>> >> >> include. Do you have an idea what might be the cause of this?
>> >>> >> >>
>> >>> >> >> Regards,
>> >>> >> >>
>> >>> >> >> Martijn
>> >>> >> >>
>> >>> >> >> On 12 September 2010 18:40, Karl Wright <da...@gmail.com>
>> >>> >> >> wrote:
>> >>> >> >>> Hi Martijn,
>> >>> >> >>>
>> >>> >> >>> The ACF version of httpclient has support for NTLMv1, NTLMv2,
>> >>> >> >>> and NTLM2 protocols.  The standard client does not.
>> >>> >> >>>
>> >>> >> >>> What this means practically for you depends on how the Windows
>> >>> >> >>> domain controller you are working with is configured.  You
>> >>> >> >>> cannot use the off-the-shelf httpclient and still authenticate
>> >>> >> >>> if the domain controller is configured to not allow LM
>> >>> >> >>> connections, which is what Microsoft recommends people do.
>> >>> >> >>>
>> >>> >> >>> Since the ACF version of httpclient will always try to connect
>> >>> >> >>> using NTLMv2, this means that you must be more rigorous about
>> >>> >> >>> setting up your client machine.  First, it must have a name,
>> >>> >> >>> and it must have a machine account in the domain.  Second,
>> >>> >> >>> NTLMv2 is much more picky about how you specify user and
>> >>> >> >>> domain.  The end user documentation provides details that may
>> >>> >> >>> be helpful to you in this regard.
>> >>> >> >>>
>> >>> >> >>> Thanks,
>> >>> >> >>> Karl
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>> On Sun, Sep 12, 2010 at 5:00 AM, Martijn v Groningen
>> >>> >> >>> <ma...@gmail.com> wrote:
>> >>> >> >>>>
>> >>> >> >>>> Hi All,
>> >>> >> >>>>
>> >>> >> >>>> I've configured the Sharepoint connector (to connect to
>> >>> >> >>>> sharepoint
>> >>> >> >>>> 3.0), Solr connector and a job that adds documents into Solr.
>> The
>> >>> >> >>>> only
>> >>> >> >>>> thing that I'm missing is the meta data from Sharepoint. Per
>> >>> >> >>>> document
>> >>> >> >>>> I need to know which users can access it. In the metadata tab
>> on
>> >>> >> >>>> the
>> >>> >> >>>> job page I've configured the metadata to be included, but this
>> >>> >> >>>> doesn't
>> >>> >> >>>> end up in my Solr index. Does anybody know what I should do to
>> >>> >> >>>> also
>> >>> >> >>>> have the metadata in my index?
>> >>> >> >>>>
>> >>> >> >>>> I also had another issue with the Sharepoint connector which I
>> >>> >> >>>> managed
>> >>> >> >>>> to solve. But I'm curious to know if someone else encountered
>> a
>> >>> >> >>>> similar issue.
>> >>> >> >>>> When I was setting up the sharepoint connecter I always got a
>> 401
>> >>> >> >>>> message on the connectors page as status. I was sure I entered
>> >>> >> >>>> the
>> >>> >> >>>> correct credentials. After some debugging I noticed that the
>> NLTM
>> >>> >> >>>> data
>> >>> >> >>>> that was send to Solr was different then when I did a http
>> post
>> >>> >> >>>> with
>> >>> >> >>>> Firefox poster plugin to a Sharepoint webservice url (I check
>> >>> >> >>>> this
>> >>> >> >>>> with Wireshark). After writing a little test case with
>> httpclient
>> >>> >> >>>> used
>> >>> >> >>>> in afc, I got the same 401 error. I then ran the test with a
>> >>> >> >>>> clean
>> >>> >> >>>> http client (version 3.1), that ran as expected. I got a
>> response
>> >>> >> >>>> code
>> >>> >> >>>> 200 back with a soap response. I then used this version of
>> http
>> >>> >> >>>> client
>> >>> >> >>>> (with some class filesfrom the afc provided jar that were
>> missing
>> >>> >> >>>> is
>> >>> >> >>>> the plain jar file) and the connector worked as expected as I
>> was
>> >>> >> >>>> able
>> >>> >> >>>> to index documents. Did someone else have this particular
>> issue?
>> >>> >> >>>> I
>> >>> >> >>>> noticed that acf is using httpclient 3.1 (from the manifest
>> >>> >> >>>> file),
>> >>> >> >>>> but
>> >>> >> >>>> I'm curious to know why http client was modified.
>> >>> >> >>>>
>> >>> >> >>>> BTW I've been using the latest trunk version (I did a checkout
>> >>> >> >>>> last
>> >>> >> >>>> tuesday). I'm also new to Sharepoint
>> >>> >> >>>>
>> >>> >> >>>> Cheers,
>> >>> >> >>>>
>> >>> >> >>>> Martijn
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>
>> >>> >> >
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> Met vriendelijke groet,
>> >>> >>
>> >>> >> Martijn van Groningen
>> >>> >
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Met vriendelijke groet,
>> >>>
>> >>> Martijn van Groningen
>> >>
>> >
>> >
>>
>>
>>
>> --
>> Met vriendelijke groet,
>>
>> Martijn van Groningen
>>
>
>

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
The key error is the following:

>>>>>>
<soap:Fault><faultcode>soap:Client</faultcode>
<faultstring>The request failed with HTTP status 401: Unauthorized.</faultstring>
<faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>
<detail><Error><ErrorNumber>1000</ErrorNumber>
<ErrorMessage>The request failed with HTTP status 401: Unauthorized.</ErrorMessage>
<ErrorSource>System.Web.Services</ErrorSource></Error></detail></soap:Fault>
<<<<<<

Clearly the MCPermissions web service does not have sufficient permissions
to perform its task.  I don't recall ever having seen this before, but
perhaps during installation you were not logged in as a user that has enough
permission to perform security lookups.

Karl
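
When comparing faults like the one above across environments, it can help to pull out just the fields that matter. The following is only a rough sketch in Python using the standard library. The function name summarize_soap_fault and the trimmed sample response are assumptions made for illustration; nothing here is part of the connector.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def summarize_soap_fault(xml_text):
    """Return the main fields of the first SOAP 1.1 fault, or None if absent."""
    root = ET.fromstring(xml_text.strip())
    # iter() also visits the root, so a bare <soap:Fault> fragment works too.
    fault = next(root.iter("{%s}Fault" % SOAP_NS), None)
    if fault is None:
        return None

    def text_of(path):
        node = fault.find(path)
        if node is None or node.text is None:
            return None
        # Collapse line breaks that mail clients introduce when wrapping.
        return " ".join(node.text.split())

    return {
        "faultcode": text_of("faultcode"),
        "faultstring": text_of("faultstring"),
        "faultactor": text_of("faultactor"),
        "error_number": text_of("detail/Error/ErrorNumber"),
        "error_source": text_of("detail/Error/ErrorSource"),
    }


# Trimmed copy of the response from this thread ([host] is a placeholder).
SAMPLE_RESPONSE = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body><soap:Fault><faultcode>soap:Client</faultcode>
<faultstring>The request failed with HTTP status 401:
Unauthorized.</faultstring>
<faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>
<detail><Error><ErrorNumber>1000</ErrorNumber>
<ErrorMessage>The request failed with HTTP status 401:
Unauthorized.</ErrorMessage>
<ErrorSource>System.Web.Services</ErrorSource></Error></detail>
</soap:Fault></soap:Body></soap:Envelope>
"""

if __name__ == "__main__":
    for key, value in summarize_soap_fault(SAMPLE_RESPONSE).items():
        print(key, "=", value)
```

The faultactor value is worth watching: in this thread it names the root-level MCPermissions.asmx rather than the subsite URL that was actually posted to.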

On Mon, Sep 13, 2010 at 12:15 PM, Martijn v Groningen <
martijn.is.hier@gmail.com> wrote:

> Hi Karl,
>
> Today I'm not at the environment where I can verify this, but I'll
> definitely check this. But I ran into another issue with the
> Sharepoint connector. In another environment I installed the
> Metacarta Sharepoint webservice extensions, but when I execute the
> following post:
> HTTP POST http://[host]/rotterdamn/_vti_bin/MCPermissions.asmx
> <?xml version="1.0" encoding="UTF-8"?>
> <soapenv:Envelope
>   xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
>   xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
> <soapenv:Body><GetPermissionCollection
>   xmlns="http://microsoft.com/sharepoint/webpartpages/">
> <objectName>/</objectName><objectType>Web</objectType>
> </GetPermissionCollection></soapenv:Body></soapenv:Envelope>
>
> I get back the following response (http 500):
> <?xml version="1.0" encoding="utf-8"?>
> <soap:Envelope
>   xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>   xmlns:xsd="http://www.w3.org/2001/XMLSchema">
> <soap:Body><soap:Fault><faultcode>soap:Client</faultcode>
> <faultstring>The request failed with HTTP status 401: Unauthorized.</faultstring>
> <faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>
> <detail><Error><ErrorNumber>1000</ErrorNumber>
> <ErrorMessage>The request failed with HTTP status 401: Unauthorized.</ErrorMessage>
> <ErrorSource>System.Web.Services</ErrorSource></Error></detail>
> </soap:Fault></soap:Body></soap:Envelope>
>
> It seems that I only get an error with the MCPermissions webservice
> call. Other calls such as GetListCollection work fine. I'm
> authenticating with the domain administrator account. This environment
> also has Sharepoint 3.0 installed. I'm making these posts to
> Sharepoint with the Firefox http poster plugin.  Also, the url in the
> response is without the subsite.
>
> Also important to note is that browsing to
> http://[host]/[subsite]/_vti_bin/MCPermissions.asmx shows the
> GetPermissionsCollection operation. That is what I checked after
> installing the webservice extension. Do you have a clue what might be
> wrong here?
>
> Thanks,
>
> Martijn
>
> On 13 September 2010 10:27, Karl Wright <da...@gmail.com> wrote:
> > Hi Martijn,
> >
> > For the 401 error, here's something also worth trying, to remove the
> > possibility that your error has anything to do with other recent changes.
> > Can you check out the following:
> >
> > svn co -r987345 https://svn.apache.org/repos/asf/incubator/lcf/trunk/modules/lib
> >
> > In the checkout lib area, you will see a jar called
> > commons-httpclient-lcf.jar.  Replace commons-httpclient-acf.jar with it
> > (renaming to commons-httpclient-acf.jar, of course), and try running with
> > it.  If your 401 error no longer happens, then it means something was
> > messed up, and I'll need to do some research.
> >
> > Thanks,
> > Karl
> >
> > On Sun, Sep 12, 2010 at 5:02 PM, Karl Wright <da...@gmail.com> wrote:
> >>
> >> I confirmed that without any mappings set, the Solr Connector *should*
> >> just be passing the metadata through using the metadata's name as the
> >> Solr field name.
> >>
> >> For debugging, if you could post the Solr output from one update
> >> operation, I'd love to see if any metadata seems to be in it.
> >> Potentially it's there but the Solr schema is not right somehow - that
> >> should be the first thing we verify.
> >>
> >> Karl
> >>
> >>
> >> On Sun, Sep 12, 2010 at 4:50 PM, Martijn v Groningen
> >> <ma...@gmail.com> wrote:
> >>>
> >>> Tomorrow I'll dive into code and do some more debugging. Last week I
> >>> didn't specify any mappings in the mapping tab for the meta data
> >>> fields I selected in the metadata tab. But this shouldn't be the
> >>> problem, right?
> >>>
> >>> Thanks,
> >>>
> >>> Martijn
> >>>
> >>> On 12 September 2010 22:29, Karl Wright <da...@gmail.com> wrote:
> >>> > Martijn,
> >>> >
> >>> > (1) The precise svn url for the acf version of httpclient is as
> >>> > follows.  My apologies for any earlier confusion - I was away from
> >>> > my computer at the time.
> >>> >
> >>> > https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x
> >>> >
> >>> > (2) Each time the solr connector posts into Solr, you should see a
> >>> > set of argument names and values dumped to standard out (or the log).
> >>> > So it should be easy to see what is being sent, and whether the
> >>> > arguments in fact are the correct ones for the extracting update
> >>> > request handler, or not.  Furthermore, the Solr output connector
> >>> > recently had a tab added which performs the mapping I alluded to.
> >>> > This mapping is designed to translate metadata coming from a
> >>> > connector like SharePoint, into fields that you presumably have in
> >>> > your Solr schema.  However, if you don't set anything, the fields are
> >>> > not changed, and you should see an argument for every metadata field,
> >>> > something like: literal.xxx=yyy.
> >>> >
> >>> > If you have a document that you *know* has metadata, and you've
> >>> > specified that metadata in the job, and you run the job after you
> >>> > specify that metadata, but still see no literal.xxx=yyy corresponding
> >>> > to it in the Solr output, then we should spend some time chasing this
> >>> > problem down.  Be wary because incremental crawling means you'll
> >>> > probably not see your document processed again unless you either
> >>> > change it in SharePoint, or delete and recreate the job.  But be
> >>> > reassured that SharePoint metadata was covered by the old MetaCarta
> >>> > tests, and there have been no changes of any significance to the
> >>> > SharePoint connector since then, so I have no explanation why it
> >>> > would not work for you too.  That's why I'm spending time trying to
> >>> > figure out if this is a Solr connector issue instead.
> >>> >
> >>> > Please let me know if this helps you, or whether you need to go
> >>> > deeper into debugging.
> >>> >
> >>> > Karl
> >>> >
> >>> >
> >>> > On Sun, Sep 12, 2010 at 4:05 PM, Martijn v Groningen
> >>> > <ma...@gmail.com> wrote:
> >>> >>
> >>> >> I didn't notice that I was under the upstream-changes directory.
> >>> >> Thanks for pointing that out.
> >>> >>
> >>> >> In Solr I have a wildcard (*) dynamic field, so everything acf sends
> >>> >> should end up in my index (or at least that is what I assume). I
> >>> >> also did some debugging in the Solr connector and I noticed that no
> >>> >> metadata was sent to Solr. I didn't create field mappings in my acf
> >>> >> job. Do you always have to create mappings for metadata?
> >>> >>
> >>> >> Martijn
> >>> >>
> >>> >> On 12 September 2010 21:09, Karl Wright <da...@gmail.com> wrote:
> >>> >> > The source for upstream changes is under
> >>> >> > lcf/upstream-changes/httpclient, not under trunk.
> >>> >> >
> >>> >> > As for the metadata, how are you determining that no metadata is
> >>> >> > being
> >>> >> > indexed?  If this is Solr you are indexing into, have you set up
> the
> >>> >> > appropriate metadata/field mappings?
> >>> >> >
> >>> >> > Karl
> >>> >> >
> >>> >> > On 9/12/10, Martijn v Groningen <ma...@gmail.com>
> wrote:
> >>> >> >> To authenticate with Sharepoint I had to include the domain as
> >>> >> >> well. Also the ui reported an error if I didn't specify the
> >>> >> >> username in a domain / username format. Maybe this http client
> >>> >> >> issue was just particular to the Sharepoint / Domain Controller
> >>> >> >> installation I was working with. I also couldn't find the source
> >>> >> >> of the acf version of http client. Is it hosted in another source
> >>> >> >> repository?
> >>> >> >>
> >>> >> >> I still don't understand why the documents I crawled didn't have
> >>> >> >> any metadata associated with them. In the job configuration I was
> >>> >> >> able to choose which metadata I wanted to include. Do you have an
> >>> >> >> idea what might be the cause of this?
> >>> >> >>
> >>> >> >> Regards,
> >>> >> >>
> >>> >> >> Martijn
> >>> >> >>
> >>> >> >> On 12 September 2010 18:40, Karl Wright <da...@gmail.com>
> wrote:
> >>> >> >>> Hi Martijn,
> >>> >> >>>
> >>> >> >>> The ACF version of httpclient has support for NTLMv1, NTLMv2, and
> >>> >> >>> NTLM2 protocols.  The standard client does not.
> >>> >> >>>
> >>> >> >>> What this means practically for you depends on how the Windows
> >>> >> >>> domain controller you are working with is configured.  You cannot
> >>> >> >>> use the off-the-shelf httpclient and still authenticate if the
> >>> >> >>> domain controller is configured to not allow LM connections, which
> >>> >> >>> is what Microsoft recommends people do.
> >>> >> >>>
> >>> >> >>> Since the ACF version of httpclient will always try to connect
> >>> >> >>> using NTLMv2, this means that you must be more rigorous about
> >>> >> >>> setting up your client machine.  First, it must have a name, and
> >>> >> >>> it must have a machine account in the domain.  Second, NTLMv2 is
> >>> >> >>> much more picky about how you specify user and domain.  The end
> >>> >> >>> user documentation provides details that may be helpful to you in
> >>> >> >>> this regard.
> >>> >> >>>
> >>> >> >>> Thanks,
> >>> >> >>> Karl
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> On Sun, Sep 12, 2010 at 5:00 AM, Martijn v Groningen
> >>> >> >>> <ma...@gmail.com> wrote:
> >>> >> >>>>
> >>> >> >>>> Hi All,
> >>> >> >>>>
> >>> >> >>>> I've configured the Sharepoint connector (to connect to
> >>> >> >>>> Sharepoint 3.0), Solr connector and a job that adds documents
> >>> >> >>>> into Solr. The only thing that I'm missing is the metadata from
> >>> >> >>>> Sharepoint. Per document I need to know which users can access
> >>> >> >>>> it. In the metadata tab on the job page I've configured the
> >>> >> >>>> metadata to be included, but this doesn't end up in my Solr
> >>> >> >>>> index. Does anybody know what I should do to also have the
> >>> >> >>>> metadata in my index?
> >>> >> >>>>
> >>> >> >>>> I also had another issue with the Sharepoint connector which I
> >>> >> >>>> managed to solve. But I'm curious to know if someone else
> >>> >> >>>> encountered a similar issue.
> >>> >> >>>> When I was setting up the Sharepoint connector I always got a
> >>> >> >>>> 401 message on the connectors page as status. I was sure I
> >>> >> >>>> entered the correct credentials. After some debugging I noticed
> >>> >> >>>> that the NTLM data that was sent to Solr was different than when
> >>> >> >>>> I did a http post with the Firefox poster plugin to a Sharepoint
> >>> >> >>>> webservice url (I checked this with Wireshark). After writing a
> >>> >> >>>> little test case with the httpclient used in acf, I got the same
> >>> >> >>>> 401 error. I then ran the test with a clean http client (version
> >>> >> >>>> 3.1), and that ran as expected. I got a response code 200 back
> >>> >> >>>> with a soap response. I then used this version of http client
> >>> >> >>>> (with some class files from the acf provided jar that were
> >>> >> >>>> missing in the plain jar file) and the connector worked as
> >>> >> >>>> expected as I was able to index documents. Did someone else have
> >>> >> >>>> this particular issue? I noticed that acf is using httpclient
> >>> >> >>>> 3.1 (from the manifest file), but I'm curious to know why http
> >>> >> >>>> client was modified.
> >>> >> >>>>
> >>> >> >>>> BTW I've been using the latest trunk version (I did a checkout
> >>> >> >>>> last Tuesday). I'm also new to Sharepoint.
> >>> >> >>>>
> >>> >> >>>> Cheers,
> >>> >> >>>>
> >>> >> >>>> Martijn
> >>> >> >>>
> >>> >> >>>
> >>> >> >>
> >>> >> >
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Met vriendelijke groet,
> >>> >>
> >>> >> Martijn van Groningen
> >>> >
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Met vriendelijke groet,
> >>>
> >>> Martijn van Groningen
> >>
> >
> >
>
>
>
> --
> Met vriendelijke groet,
>
> Martijn van Groningen
>
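
The point in the quoted discussion above about literal.xxx=yyy arguments can be sketched roughly as follows. This is not ManifoldCF's code, only a small Python illustration of the behaviour described there: each metadata field becomes a literal.<field> argument for Solr's extracting update handler, and with no mapping configured the field names pass through unchanged. The function name build_extract_params, the field_map argument, and the use of "id" as the key field are all assumptions for illustration.

```python
from urllib.parse import urlencode


def build_extract_params(doc_id, metadata, field_map=None):
    """Build (name, value) pairs in the literal.<field>=<value> style.

    metadata  maps a source field name to a string or a list of strings.
    field_map optionally renames source fields to Solr fields; names that
              are not listed pass through unchanged.
    """
    field_map = field_map or {}
    # Assumption: the unique key field in the Solr schema is called "id".
    params = [("literal.id", doc_id)]
    for name in sorted(metadata):
        values = metadata[name]
        if isinstance(values, str):
            values = [values]
        target = field_map.get(name, name)
        for value in values:
            params.append(("literal." + target, value))
    return params


if __name__ == "__main__":
    meta = {"Title": "Hello world", "Author": ["Martijn"]}
    # No mapping configured: names are passed through as they are.
    print(urlencode(build_extract_params("doc-1", meta)))
    # With a mapping: Title is renamed, Author passes through.
    print(urlencode(build_extract_params("doc-1", meta, {"Title": "title_s"})))
```

If output like this never shows a literal.* argument for a field selected in the job, the metadata is being lost before the Solr connector runs, which is the distinction the discussion above is trying to establish.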

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
Martijn, if you have access to MSDN, perhaps you might want to install a
plain-vanilla embedded-MSDE version of SharePoint 3.0 somewhere, and see if
the connector works against that. I'm going to try the same thing this
afternoon, as a sanity check.

Karl



Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
The key error is the following:

>>>>>>
<soap:Fault><faultcode>soap:Client</faultcode>
<faultstring>The request failed with HTTP status 401: Unauthorized.</faultstring>
<faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>
<detail><Error><ErrorNumber>1000</ErrorNumber>
<ErrorMessage>The request failed with HTTP status 401: Unauthorized.</ErrorMessage>
<ErrorSource>System.Web.Services</ErrorSource></Error></detail></soap:Fault>
<<<<<<

Clearly the MCPermissions web service does not have sufficient permissions
to perform its task in this case.  I don't recall ever having seen this
before, but perhaps during installation you were not logged in as a user
that has enough permission to perform security lookups?
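
For anyone comparing fault responses from several environments, the fields of a fault like the one above can be pulled out with a few lines of Python. This is only a sketch using the standard library; the function name parse_soap_fault and the trimmed SAMPLE string are illustrative assumptions and are not part of the connector or of the MCPermissions service.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace, as seen in the response quoted above.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def parse_soap_fault(xml_text):
    """Return the fault fields of a SOAP 1.1 response, or None if it has no fault."""
    root = ET.fromstring(xml_text)
    fault = root.find(f".//{{{SOAP_NS}}}Fault")
    if fault is None:
        return None
    # The children of soap:Fault are unqualified in SOAP 1.1.
    return {
        "faultcode": fault.findtext("faultcode"),
        "faultstring": fault.findtext("faultstring"),
        "faultactor": fault.findtext("faultactor"),
        "error_number": fault.findtext("detail/Error/ErrorNumber"),
        "error_source": fault.findtext("detail/Error/ErrorSource"),
    }


# Trimmed copy of the fault from this thread, wrapped in an envelope.
SAMPLE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body><soap:Fault><faultcode>soap:Client</faultcode>"
    "<faultstring>The request failed with HTTP status 401: Unauthorized.</faultstring>"
    "<faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor>"
    "<detail><Error><ErrorNumber>1000</ErrorNumber>"
    "<ErrorSource>System.Web.Services</ErrorSource></Error></detail>"
    "</soap:Fault></soap:Body></soap:Envelope>"
)

if __name__ == "__main__":
    print(parse_soap_fault(SAMPLE))
```

The error source System.Web.Services in the parsed result suggests the 401 was raised by a call made from inside the web service, which fits the reading that the service itself lacks permission.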

Karl



Re: Sharepoint connector question

Posted by Martijn v Groningen <ma...@gmail.com>.
Hi Karl,

Today I'm not at the environment where I can verify this, but I'll
definitely check this. But I ran into another issue with the
Sharepoint connector. In another environment I installed the
Metacarta Sharepoint webservice extensions, but when I execute the
following post:
HTTP POST http://[host]/rotterdamn/_vti_bin/MCPermissions.asmx
<?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soapenv:Body><GetPermissionCollection
xmlns="http://microsoft.com/sharepoint/webpartpages/"><objectName>/</objectName><objectType>Web</objectType></GetPermissionCollection></soapenv:Body></soapenv:Envelope>

I get back the following response (http 500):
<?xml version="1.0" encoding="utf-8"?><soap:Envelope
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><soap:Fault><faultcode>soap:Client</faultcode><faultstring>The
request failed with HTTP status 401:
Unauthorized.</faultstring><faultactor>http://[host]/_vti_bin/MCPermissions.asmx</faultactor><detail><Error><ErrorNumber>1000</ErrorNumber><ErrorMessage>The
request failed with HTTP status 401:
Unauthorized.</ErrorMessage><ErrorSource>System.Web.Services</ErrorSource></Error></detail></soap:Fault></soap:Body></soap:Envelope>

It seems that I only get an error with the MCPermissions webservice
call. Other calls such as GetListCollection work fine. I'm
authenticating with the domain administrator account. This environment
also has Sharepoint 3.0 installed. I'm making these posts to
Sharepoint with the Firefox http poster plugin.  Also, the url in the
response (the faultactor) is without the subsite.

Also important to note is that browsing to
http://[host]/[subsite]/_vti_bin/MCPermissions.asmx shows the
GetPermissionCollection operation. That is what I checked after
installing the webservice extension. Do you have a clue what might be
wrong here?
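
For readers who want to reproduce the request body without a browser plugin, the following is a rough sketch in Python (standard library only). It builds the same GetPermissionCollection envelope as above and checks whether a fault actor URL has lost the subsite segment compared with the request URL. The function names and the host sharepoint.example are hypothetical, and the sketch deliberately does not send anything, since NTLM authentication would need a third-party library.

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape


def build_get_permission_collection(object_name, object_type):
    """Build the SOAP 1.1 body for the GetPermissionCollection call shown above."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" '
        'xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
        'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
        "<soapenv:Body>"
        '<GetPermissionCollection xmlns="http://microsoft.com/sharepoint/webpartpages/">'
        f"<objectName>{escape(object_name)}</objectName>"
        f"<objectType>{escape(object_type)}</objectType>"
        "</GetPermissionCollection>"
        "</soapenv:Body></soapenv:Envelope>"
    )


def subsite_dropped(request_url, faultactor_url):
    """True if the fault actor path is the tail of the request path minus leading segments."""
    req = urlsplit(request_url).path.strip("/").split("/")
    act = urlsplit(faultactor_url).path.strip("/").split("/")
    return len(act) < len(req) and req[-len(act):] == act


if __name__ == "__main__":
    # Hypothetical host standing in for [host]; the subsite is the one from the post above.
    request_url = "http://sharepoint.example/rotterdamn/_vti_bin/MCPermissions.asmx"
    actor_url = "http://sharepoint.example/_vti_bin/MCPermissions.asmx"
    print(build_get_permission_collection("/", "Web"))
    print(subsite_dropped(request_url, actor_url))
```

A True result from subsite_dropped corresponds to the observation that the url in the response is without the subsite. Whether that is actually related to the 401 is not established here.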

Thanks,

Martijn


-- 
Kind regards,

Martijn van Groningen

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
Hi Martijn,

For the 401 error, here's something also worth trying, to remove the
possibility that your error has anything to do with other recent changes.
Can you check out the following:

svn co -r987345 https://svn.apache.org/repos/asf/incubator/lcf/trunk/modules/lib

In the checked-out lib area, you will see a jar called
commons-httpclient-lcf.jar.  Replace commons-httpclient-acf.jar with it
(renaming it to commons-httpclient-acf.jar, of course), and try running with
it.  If your 401 error no longer happens, then it means something was messed
up by the recent changes, and I'll need to do some research.
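
The jar swap described above can be scripted so that the current jar is kept as a backup and the test is easy to undo. This is only a sketch in Python; the function name swap_jar, the .bak suffix, and the directory layout in the usage note are assumptions and not something the build provides.

```python
import shutil
from pathlib import Path


def swap_jar(lib_dir, replacement_jar, target_name="commons-httpclient-acf.jar"):
    """Copy replacement_jar over lib_dir/target_name, keeping one backup of the original.

    Returns the path of the backup file (which exists only if there was a jar to back up).
    """
    lib_dir = Path(lib_dir)
    target = lib_dir / target_name
    backup = lib_dir / (target_name + ".bak")
    # Keep the first backup only, so repeated runs never overwrite the original jar.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(replacement_jar, target)
    return backup
```

For example, with the r987345 checkout in a hypothetical directory old-lib and the running installation's jars in a hypothetical directory lib, the call would be swap_jar("lib", "old-lib/commons-httpclient-lcf.jar"). Copying the .bak file back over commons-httpclient-acf.jar restores the previous state.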

Thanks,
Karl

On Sun, Sep 12, 2010 at 5:02 PM, Karl Wright <da...@gmail.com> wrote:

> I confirmed that without any mappings set, the Solr Connector *should* just
> be passing the metadata through using the metadata's name as the Solr field
> name.
>
> For debugging, if you could post the Solr output from one update operation,
> I'd love to see if any metadata seems to be in it.  Potentially it's there
> but the Solr schema is not right somehow - that should be the first thing we
> verify.
>
> Karl
>
>
>
> On Sun, Sep 12, 2010 at 4:50 PM, Martijn v Groningen <
> martijn.is.hier@gmail.com> wrote:
>
>> Tomorrow I'll dive into code and do some more debugging. Last week I
>> didn't specify any mappings in the mapping tab for the meta data
>> fields I selected in the metadata tab. But this shouldn't be the
>> problem, right?
>>
>> Thanks,
>>
>> Martijn
>>
>> On 12 September 2010 22:29, Karl Wright <da...@gmail.com> wrote:
>> > Martijn,
>> >
>> > (1) The precise svn url for the acf version of httpclient is as follows.
>> > My apologies for any earlier confusion - I was away from my computer at
>> > the time.
>> >
>> > https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x
>> >
>> > (2) Each time the solr connector posts into Solr, you should see a set of
>> > argument names and values dumped to standard out (or the log).  So it
>> > should be easy to see what is being sent, and whether the arguments in
>> > fact are the correct ones for the extracting update request handler, or
>> > not.  Furthermore, the Solr output connector recently had a tab added
>> > which performs the mapping I alluded to.  This mapping is designed to
>> > translate metadata coming from a connector like SharePoint, into fields
>> > that you presumably have in your Solr schema.  However, if you don't set
>> > anything, the fields are not changed, and you should see an argument for
>> > every metadata field, something like: literal.xxx=yyy.
>> >
>> > If you have a document that you *know* has metadata, and you've specified
>> > that metadata in the job, and you run the job after you specify that
>> > metadata, but still see no literal.xxx=yyy corresponding to it in the Solr
>> > output, then we should spend some time chasing this problem down.  Be wary
>> > because incremental crawling means you'll probably not see your document
>> > processed again unless you either change it in SharePoint, or delete and
>> > recreate the job.  But be reassured that SharePoint metadata was covered
>> > by the old MetaCarta tests, and there have been no changes of any
>> > significance to the SharePoint connector since then, so I have no
>> > explanation why it would not work for you too.  That's why I'm spending
>> > time trying to figure out if this is a Solr connector issue instead.
>> >
>> > Please let me know if this helps you, or whether you need to go deeper
>> > into debugging.
>> >
>> > Karl
>> >
>> >
>> > On Sun, Sep 12, 2010 at 4:05 PM, Martijn v Groningen
>> > <ma...@gmail.com> wrote:
>> >>
>> >> I didn't notice that I was under the upstream-changes directory.
>> >> Thanks for pointing that out.
>> >>
>> >> In Solr I have a wildcard (*) dynamic field, so everything acf sends
>> >> should end up in my index (or at least that is what I assume). I also
>> >> did some debugging in the Solr connector and I noticed that no
>> >> metadata was sent to Solr. I didn't create field mappings in my acf
>> >> job. Do you always have to make mappings for metadata?
>> >>
>> >> Martijn
>> >>
>> >> On 12 September 2010 21:09, Karl Wright <da...@gmail.com> wrote:
>> >> > The source for upstream changes is under
>> >> > lcf/upstream-changes/httpclient, not under trunk.
>> >> >
>> >> > As for the metadata, how are you determining that no metadata is being
>> >> > indexed?  If this is Solr you are indexing into, have you set up the
>> >> > appropriate metadata/field mappings?
>> >> >
>> >> > Karl
>> >> >
>> >> > On 9/12/10, Martijn v Groningen <ma...@gmail.com> wrote:
>> >> >> To authenticate with Sharepoint I had to include the domain as well.
>> >> >> Also the ui reported an error if I didn't specify the username in a
>> >> >> domain / username format. Maybe this http client issue was just
>> >> >> particular to the Sharepoint / Domain Controller installation I was
>> >> >> working with. I also couldn't find the source of the acf version of
>> >> >> http client. Is it hosted in another source repository?
>> >> >>
>> >> >> I still don't understand why the documents I crawled didn't have any
>> >> >> metadata associated with them. In the job configuration I was able to
>> >> >> choose which metadata I wanted to include. Do you have an idea what
>> >> >> might be the cause of this?
>> >> >>
>> >> >> Regards,
>> >> >>
>> >> >> Martijn
>> >> >>
>> >> >> On 12 September 2010 18:40, Karl Wright <da...@gmail.com> wrote:
>> >> >>> Hi Martijn,
>> >> >>>
>> >> >>> The ACF version of httpclient has support for NTLMv1, NTLMv2, and
>> >> >>> NTLM2 protocols.  The standard client does not.
>> >> >>>
>> >> >>> What this means practically for you depends on how the Windows domain
>> >> >>> controller you are working with is configured.  You cannot use the
>> >> >>> off-the-shelf httpclient and still authenticate if the domain
>> >> >>> controller is configured to not allow LM connections, which is what
>> >> >>> Microsoft recommends people do.
>> >> >>>
>> >> >>> Since the ACF version of httpclient will always try to connect using
>> >> >>> NTLMv2, this means that you must be more rigorous about setting up
>> >> >>> your client machine.  First, it must have a name, and it must have a
>> >> >>> machine account in the domain.  Second, NTLMv2 is much more picky
>> >> >>> about how you specify user and domain.  The end user documentation
>> >> >>> provides details that may be helpful to you in this regard.
>> >> >>>
>> >> >>> Thanks,
>> >> >>> Karl
>> >> >>>
>> >> >>>
>> >> >>> On Sun, Sep 12, 2010 at 5:00 AM, Martijn v Groningen
>> >> >>> <ma...@gmail.com> wrote:
>> >> >>>>
>> >> >>>> Hi All,
>> >> >>>>
>> >> >>>> I've configured the Sharepoint connector (to connect to sharepoint
>> >> >>>> 3.0), Solr connector and a job that adds documents into Solr. The
>> >> >>>> only
>> >> >>>> thing that I'm missing is the meta data from Sharepoint. Per
>> document
>> >> >>>> I need to know which users can access it. In the metadata tab on
>> the
>> >> >>>> job page I've configured the metadata to be included, but this
>> >> >>>> doesn't
>> >> >>>> end up in my Solr index. Does anybody know what I should do to
>> also
>> >> >>>> have the metadata in my index?
>> >> >>>>
>> >> >>>> I also had another issue with the Sharepoint connector which I
>> >> >>>> managed
>> >> >>>> to solve. But I'm curious to know if someone else encountered a
>> >> >>>> similar issue.
>> >> >>>> When I was setting up the sharepoint connecter I always got a 401
>> >> >>>> message on the connectors page as status. I was sure I entered the
>> >> >>>> correct credentials. After some debugging I noticed that the NLTM
>> >> >>>> data
>> >> >>>> that was send to Solr was different then when I did a http post
>> with
>> >> >>>> Firefox poster plugin to a Sharepoint webservice url (I check this
>> >> >>>> with Wireshark). After writing a little test case with httpclient
>> >> >>>> used
>> >> >>>> in afc, I got the same 401 error. I then ran the test with a clean
>> >> >>>> http client (version 3.1), that ran as expected. I got a response
>> >> >>>> code
>> >> >>>> 200 back with a soap response. I then used this version of http
>> >> >>>> client
>> >> >>>> (with some class filesfrom the afc provided jar that were missing
>> is
>> >> >>>> the plain jar file) and the connector worked as expected as I was
>> >> >>>> able
>> >> >>>> to index documents. Did someone else have this particular issue? I
>> >> >>>> noticed that acf is using httpclient 3.1 (from the manifest file),
>> >> >>>> but
>> >> >>>> I'm curious to know why http client was modified.
>> >> >>>>
>> >> >>>> BTW I've been using the latest trunk version (I did a checkout
>> last
>> >> >>>> tuesday). I'm also new to Sharepoint
>> >> >>>>
>> >> >>>> Cheers,
>> >> >>>>
>> >> >>>> Martijn
>> >> >>>
>> >> >>>
>> >> >>
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Kind regards,
>> >>
>> >> Martijn van Groningen
>> >
>> >
>>
>>
>>
>> --
>> Kind regards,
>>
>> Martijn van Groningen
>>
>
>

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
I confirmed that without any mappings set, the Solr Connector *should* just
be passing the metadata through using the metadata's name as the Solr field
name.

For debugging, if you could post the Solr output from one update operation,
I'd love to see if any metadata seems to be in it.  Potentially it's there
but the Solr schema is not right somehow - that should be the first thing we
verify.
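
One way to check a captured update request for metadata is sketched below. This is only an illustration, not code from the Solr connector: the class and method names are hypothetical, and it assumes the logged arguments can be read as a query-string style line (name=value pairs joined by "&"), which may not match the real log format.

```java
import java.util.ArrayList;
import java.util.List;

public class LiteralParamCheck {
    // Hypothetical helper: returns the field names of every "literal.<field>=<value>"
    // argument in a query-string style line. The input format is an assumption.
    public static List<String> literalFields(String loggedArgs) {
        List<String> fields = new ArrayList<>();
        String prefix = "literal.";
        for (String pair : loggedArgs.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            // Keep only arguments that carry a non-empty field name after the prefix.
            if (name.startsWith(prefix) && name.length() > prefix.length()) {
                fields.add(name.substring(prefix.length()));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample line; real field names depend on the job's metadata selection.
        String line = "literal.id=doc1&literal.Author=martijn&commit=false";
        System.out.println(literalFields(line));
    }
}
```

If the returned list holds nothing beyond the document id, the metadata is probably not leaving the connector at all, and the Solr schema is not the place to look.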

Karl


On Sun, Sep 12, 2010 at 4:50 PM, Martijn v Groningen <ma...@gmail.com> wrote:

> Tomorrow I'll dive into code and do some more debugging. Last week I
> didn't specify any mappings in the mapping tab for the meta data
> fields I selected in the metadata tab. But this shouldn't be the
> problem, right?
>
> Thanks,
>
> Martijn
>
> --
> Kind regards,
>
> Martijn van Groningen
>

Re: Sharepoint connector question

Posted by Martijn v Groningen <ma...@gmail.com>.
Tomorrow I'll dive into the code and do some more debugging. Last week I
didn't specify any mappings in the mapping tab for the metadata
fields I selected in the metadata tab. But this shouldn't be the
problem, right?

Thanks,

Martijn

On 12 September 2010 22:29, Karl Wright <da...@gmail.com> wrote:
> Martijn,
>
> (1) The precise svn url for the acf version of httpclient is as follows.  My
> apologies for any earlier confusion - I was away from my computer at the
> time.
>
> https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x
>
> (2) Each time the solr connector posts into Solr, you should see a set of
> argument names and values dumped to standard out (or the log).  So it should
> be easy to see what is being sent, and whether the arguments in fact are the
> correct ones for the extracting update request handler, or not.
> Furthermore, the Solr output connector recently had a tab added which
> performs the mapping I alluded to.  This mapping is designed to translate
> metadata coming from a connector like SharePoint, into fields that you
> presumably have in your Solr schema.  However, if you don't set anything,
> the fields are not changed, and you should see an argument for every
> metadata field, something like: literal.xxx=yyy.
>
> If you have a document that you *know* has metadata, and you've specified
> that metadata in the job, and you run the job after you specify that
> metadata, but still see no literal.xxx=yyy corresponding to it in the Solr
> output, then we should spend some time chasing this problem down.  Be wary
> because incremental crawling means you'll probably not see your document
> processed again unless you either change it in SharePoint, or delete and
> recreate the job.  But be reassured that SharePoint metadata was covered by
> the old MetaCarta tests, and there have been no changes of any significance
> to the SharePoint connector since then, so I have no explanation why it
> would not work for you too.  That's why I'm spending time trying to figure
> out if this is a Solr connector issue instead.
>
> Please let me know if this helps you, or whether you need to go deeper into
> debugging.
>
> Karl
>
>



-- 
Kind regards,

Martijn van Groningen

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
Martijn,

(1) The precise SVN URL for the ACF version of httpclient is as follows.  My
apologies for any earlier confusion - I was away from my computer at the
time.

https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x

(2) Each time the Solr connector posts into Solr, you should see a set of
argument names and values dumped to standard out (or the log).  So it should
be easy to see what is being sent, and whether or not the arguments are in
fact the correct ones for the extracting update request handler.
Furthermore, the Solr output connector recently had a tab added which
performs the mapping I alluded to.  This mapping is designed to translate
metadata coming from a connector like SharePoint into fields that you
presumably have in your Solr schema.  However, if you don't set anything,
the field names are not changed, and you should see an argument for every
metadata field, something like: literal.xxx=yyy.
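
The pass-through behaviour described in (2) can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the actual Solr connector code: the class name, the method name and the sample field names are all hypothetical, and metadata is simplified to one value per field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LiteralParamsSketch {
    // Hypothetical helper: builds "literal.<field>" arguments from document metadata.
    // A metadata name without an entry in fieldMappings keeps its own name.
    public static Map<String, String> toLiteralParams(Map<String, String> metadata,
                                                      Map<String, String> fieldMappings) {
        Map<String, String> params = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            String target = fieldMappings.getOrDefault(entry.getKey(), entry.getKey());
            params.put("literal." + target, entry.getValue());
        }
        return params;
    }

    public static void main(String[] args) {
        // Made-up metadata; real names come from the SharePoint library columns.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("Author", "martijn");
        metadata.put("Title", "Quarterly report");

        // Only "Author" is mapped; "Title" should pass through unchanged.
        Map<String, String> mappings = new LinkedHashMap<>();
        mappings.put("Author", "author_s");

        System.out.println(toLiteralParams(metadata, mappings));
    }
}
```

With an empty mapping table every field keeps its SharePoint name, which is why a job without mappings should still produce one literal.xxx argument per selected metadata field.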

If you have a document that you *know* has metadata, and you've specified
that metadata in the job, and you run the job after you specify that
metadata, but still see no literal.xxx=yyy corresponding to it in the Solr
output, then we should spend some time chasing this problem down.  Be wary
because incremental crawling means you'll probably not see your document
processed again unless you either change it in SharePoint, or delete and
recreate the job.  But be reassured that SharePoint metadata was covered by
the old MetaCarta tests, and there have been no changes of any significance
to the SharePoint connector since then, so I have no explanation why it
would not work for you too.  That's why I'm spending time trying to figure
out if this is a Solr connector issue instead.

Please let me know if this helps you, or whether you need to go deeper into
debugging.

Karl


On Sun, Sep 12, 2010 at 4:05 PM, Martijn v Groningen <ma...@gmail.com> wrote:

> I didn't notice that I was under the upstream-changes directory.
> Thanks for pointing that out.
>
> In Solr I have a wildcard (*) dynamic field, so everything acf sends
> should end up in my index (or at least that is what I assume). I also
> did some debugging in the Solr connecter and I noticed that no
> metadata was send to Solr. I didn't create field mappings in my acf
> job. Do you always have to make mapping for metadata?
>
> Martijn
>
> --
> Kind regards,
>
> Martijn van Groningen
>

Re: Sharepoint connector question

Posted by Martijn v Groningen <ma...@gmail.com>.
I didn't notice that I was under the upstream-changes directory.
Thanks for pointing that out.

In Solr I have a wildcard (*) dynamic field, so everything ACF sends
should end up in my index (or at least that is what I assume). I also
did some debugging in the Solr connector and I noticed that no
metadata was sent to Solr. I didn't create field mappings in my ACF
job. Do you always have to create a mapping for metadata?

Martijn

On 12 September 2010 21:09, Karl Wright <da...@gmail.com> wrote:
> The source for upstream changes is under
> lcf/upstream-changes/httpclient, not under trunk.
>
> As for the metadata, how are you determining that no metadata is being
> indexed?  If this is Solr you are indexing into, have you set up the
> appropriate metadata/field mappings?
>
> Karl
>



-- 
Kind regards,

Martijn van Groningen

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
The source for upstream changes is under
lcf/upstream-changes/httpclient, not under trunk.

As for the metadata, how are you determining that no metadata is being
indexed?  If this is Solr you are indexing into, have you set up the
appropriate metadata/field mappings?

Karl

On 9/12/10, Martijn v Groningen <ma...@gmail.com> wrote:
> To authenticate with Share point I had to include the domain as well.
> Also the ui reported an error if I didn't specify the username in a
> domain / username format. Maybe this http client issue was just
> particular with the Sharepoint / Domain Controller installation I was
> working with. I also couldn't find the source of afc version of http
> client. Is it hosted in another source repository?
>
> I still don't understand why for the documents I crawled, I didn't
> have any metadata associated with it. In the job configuration I was
> able to choose which metadata I wanted to include. You have an idea
> what might be the cause of this?
>
> Regards,
>
> Martijn
>

Re: Sharepoint connector question

Posted by Martijn v Groningen <ma...@gmail.com>.
To authenticate with SharePoint I had to include the domain as well.
Also, the UI reported an error if I didn't specify the username in a
domain / username format. Maybe this http client issue was just
particular to the SharePoint / domain controller installation I was
working with. I also couldn't find the source of the ACF version of
http client. Is it hosted in another source repository?

I still don't understand why the documents I crawled didn't have any
metadata associated with them. In the job configuration I was able to
choose which metadata I wanted to include. Do you have any idea what
might be the cause of this?

Regards,

Martijn

Re: Sharepoint connector question

Posted by Karl Wright <da...@gmail.com>.
Hi Martijn,

The ACF version of httpclient has support for NTLMv1, NTLMv2, and NTLM2
protocols.  The standard client does not.

What this means practically for you depends on how the Windows domain
controller you are working with is configured.  You cannot use the
off-the-shelf httpclient and still authenticate if the domain controller is
configured to not allow LM connections, which is what Microsoft recommends
people do.

Since the ACF version of httpclient will always try to connect using NTLMv2,
this means that you must be more rigorous about setting up your client
machine.  First, it must have a name, and it must have a machine account in
the domain.  Second, NTLMv2 is much more picky about how you specify user
and domain.  The end user documentation provides details that may be helpful
to you in this regard.
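
As a rough illustration of why the user and domain need to be kept
separate, here is a minimal Python sketch that splits a combined login
string into its two parts before they are handed to an NTLM-capable
client. The function name split_login and the accepted separators
(backslash or forward slash) are assumptions made for this example; they
are not taken from the ACF code or its documentation, which remains the
authority on the exact format.

```python
def split_login(login):
    """Split a combined login such as 'CORP\\jdoe' into (domain, user).

    Hypothetical helper: the accepted separators are an assumption,
    not the format the connector itself enforces.
    """
    for sep in ("\\", "/"):
        domain, found, user = login.partition(sep)
        if found:
            break
    else:
        # No separator at all, so no domain was supplied.
        raise ValueError("expected DOMAIN\\user, got %r" % login)

    domain, user = domain.strip(), user.strip()
    if not domain or not user:
        raise ValueError("domain and user must both be non-empty: %r" % login)
    return domain, user
```

A login without a domain part is rejected outright rather than guessed
at, which mirrors the stricter behaviour described above.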

Thanks,
Karl

