Posted to user@manifoldcf.apache.org by Paul Farrell <pf...@funnelback.com> on 2015/10/19 13:43:41 UTC

Manifold/Alfresco seeding and security

Hi Everyone,

Hoping someone may be able to advise. 

I am currently using Manifold, together with a CMIS connector, to retrieve and index content from an Alfresco repository. 

All is going well apart from what I would call the ‘incremental crawl’.

The main issue I am having is that the modification of a document’s security settings in Alfresco is not being picked up in the next Manifold crawl. As an example, I have a document ‘TestDoc1’ which has User A and User B as Consumers. I run a crawl in Manifold and it picks up the documents fine. The security is set as expected. I then remove ‘User A’ from the security of that document and re-run the Manifold crawl. User A can still see the document in the local search engine.

It is as if Manifold is not treating the security update as a ‘modification’ and is therefore not refreshing it. Note that if I go into the Output Connections, edit and save the relevant output connection and then click ‘Remove all associated documents’, the next time I crawl, the changes are picked up. It is clear that Manifold is just not updating whatever internal record it has for this item.
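
To illustrate that suspicion, here is a minimal sketch of how a crawler could decide whether to re-index a document by comparing a version string between crawls. It is not ManifoldCF's or the CMIS connector's actual code: the function names and the fields (a last-modified date and an ACL list) are hypothetical, and it assumes that changing permissions in Alfresco leaves the modification date untouched. It only shows that if the version string leaves out the permissions, removing a user changes nothing the crawler can see.

```python
import hashlib


def version_string(last_modified, acl, include_acl):
    """Build a comparison string for one document (hypothetical layout)."""
    parts = [last_modified]
    if include_acl:
        # Sort so that the order in which users are listed does not matter.
        parts.extend(sorted(acl))
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


def needs_reindex(previous, current):
    """A document is re-indexed only when its version string differs."""
    return previous != current


# Same modification date, but 'User A' has been removed from the ACL.
before = ("2015-10-19T11:00:00Z", ["User A", "User B"])
after = ("2015-10-19T11:00:00Z", ["User B"])

without_acl = needs_reindex(version_string(*before, include_acl=False),
                            version_string(*after, include_acl=False))
with_acl = needs_reindex(version_string(*before, include_acl=True),
                         version_string(*after, include_acl=True))
print("re-index when ACL is left out of the version string:", without_acl)
print("re-index when ACL is part of the version string:", with_acl)
```

Under those assumptions, the first comparison sees no change while the second does, which would match the behaviour described above.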

Any ideas?

Many thanks.

Re: Manifold/Alfresco seeding and security

Posted by pf...@funnelback.com.
My apologies Karl/Maurizio,

With everything that was going on yesterday I must have glossed over that ticket without realising its significance.

I will check on the versions when I am in the office later this morning. 

Thanks for the follow-up.

-----Original Message-----
From: "Karl Wright" <da...@gmail.com>
Sent: Tuesday, October 20, 2015 9:23pm
To: "user@manifoldcf.apache.org" <us...@manifoldcf.apache.org>
Subject: Re: Manifold/Alfresco seeding and security

Hi Paul,
Looking at Issue 3, I think that Maurizio has indeed pointed you in the
right direction.  Can you check your version of the plugin to be sure that
/api/node/ is NOT present in the described line of code?
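
One way to make that check without decompiling anything is to search the uncompressed contents of the plugin archive for the string. The following is a minimal sketch, not part of ManifoldCF or the alfresco-indexer project: the helper name `entries_containing` is made up for illustration, and a path that is assembled at runtime from smaller pieces would not be found this way.

```python
import io
import zipfile


def entries_containing(archive, needle=b"/api/node", prefix=""):
    """List entries whose uncompressed bytes contain needle.

    archive may be a file path or a file-like object. Nested .jar, .amp
    and .war entries are opened and searched as well.
    """
    hits = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            data = zf.read(name)
            nested = name.endswith((".jar", ".amp", ".war"))
            if nested and zipfile.is_zipfile(io.BytesIO(data)):
                # Recurse into the nested archive, keeping track of where we are.
                hits += entries_containing(io.BytesIO(data), needle,
                                           prefix + name + "!/")
            elif needle in data:
                hits.append(prefix + name)
    return hits
```

Calling `entries_containing("/path/to/plugin.amp")` (the path is a placeholder) returns an empty list when the string is absent. A plain text search over the archive file itself is not equivalent, because the entries inside are normally compressed.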

Karl


On Tue, Oct 20, 2015 at 5:00 PM, <pf...@funnelback.com> wrote:

> Hi Maurizio,
>
> I will be available all day tomorrow (Wednesday) to help out as much as I
> can. If it's possible for you to look into this I can take whatever steps
> you need.
>
> Many thanks,
>
> Paul
>
> -----Original Message-----
> From: "Karl Wright" <da...@gmail.com>
> Sent: Tuesday, October 20, 2015 12:34pm
> To: "user@manifoldcf.apache.org" <us...@manifoldcf.apache.org>
> Subject: Re: Manifold/Alfresco seeding and security
>
> Hi Maurizio,
>
> This is the third time we've seen this; can you use Paul's help to chase
> down what the issue is?
>
> Karl
>
>
> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pf...@funnelback.com>
> wrote:
>
> > Hi,
> >
> > I am using Alfresco Community 5.0.
> >
> > Having taken that AMP file (version 0.7.1) and then installed it into
> > Alfresco and restarted the services, the issue is still present.
> >
> > I suspect that this is probably more to do with the Manifold end than the
> > Alfresco end. It seems it is Manifold that is automatically appending the
> > “/api/node” string into the path whenever I use “/alfresco/service” as the
> > Context in the repository connection configuration.
> >
> > If it is of interest, this is the output in the manifoldcf.log file when I
> > use the repo connection config I mentioned earlier.
> >
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 0 of 2; total allocated: 0 of 20]
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 1 of 2; total allocated: 1 of 20]
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
> > http://54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
> > 54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
> > 172.31.23.90:58712<->54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
> > UNCHALLENGED
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Accept: application/json
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Host: 54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Connection: Keep-Alive
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Accept-Encoding: gzip,deflate
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Accept: application/json[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Host: 54.165.85.140:8080[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Connection: Keep-Alive[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Accept-Encoding: gzip,deflate[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "HTTP/1.1 404 Not Found[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Server: Apache-Coyote/1.1[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Cache-Control: no-cache[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Pragma: no-cache[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Transfer-Encoding: chunked[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "630[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > <head>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <title>Web Script Status 404 - Not Found</title>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
> > type="text/css" />[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > </head>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > <body>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <div>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
> > alt="Alfresco" /></td>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >             <td><span class="title">Web Script Status 404 - Not
> > Found</span></td>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          </tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <br/>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td>The Web Script <a
> > href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <br/>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>404 Description:</b></td><td> Requested resource is
> > not available.</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td> </td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Message:</b></td><td>Cannot find object for
> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
> > schema 8,001</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td></td><td> </td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Diagnostics</b>:</td><td><a
> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    </div>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > </body>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "</html>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > HTTP/1.1 404 Not Found
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Server: Apache-Coyote/1.1
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Cache-Control: no-cache
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Pragma: no-cache
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Content-Type: text/html;charset=UTF-8
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Transfer-Encoding: chunked
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Date: Tue, 20 Oct 2015 16:18:47 GMT
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
> > alive indefinitely
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
> > Shutdown connection
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
> > connection
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 0 of 2; total allocated: 0 of 20]
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
> > shutting down
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
> > down
> >
> > *Paul Farrell*
> > Senior Search Consultant
> >
> > 109-123 Clifton Street, London EC2A 4LD
> > *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
> >
> > *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
> >
> > Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
> > Twitter <https://twitter.com/funnelback>
> >
> > Funnelback UK Ltd is a limited liability company registered in England &
> > Wales. Registered address: Zetland House 109-123, Clifton Street, London.
> > EC2A 4LD. Company registration number: 07004264.
> >
> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <ma...@apache.org> wrote:
> >
> > Hi Paul,
> >
> > it looks like you're hitting
> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
> > alfresco-indexer are you using? Can you try using
> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp
> > (or the pre-built WAR file -
> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
> > )
> >
> > HTH
> >   mao
> >
> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pf...@funnelback.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Having had to go back to basics and re-install my Alfresco instance, I
> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
> >> actually install without error. There must have been an issue with my
> >> previous Alfresco instance.
> >>
> >> Having said that, the Alfresco WebScript connector fails. The failure is
> >> down to the ‘Context’ setting (see below):
> >>
> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
> >>
> >> When you attempt to save the configuration of the WebScript connector,
> >> Manifold clearly tries to check the connection. It seems to do this by
> >> making an API call (/auth/resolve/admin). The issue is with what Manifold
> >> prepends to the start of that path.
> >> If I leave the setting as above then Manifold reports:
> >>
> >> <tr><td>The Web Script <a
> >> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> >> has responded with a status of 404 - Not Found.</td></tr>
> >>
> >> In other words, it builds the full path as
> >> “/alfresco/service/api/node/auth/resolve/admin”.
> >>
> >> For my Alfresco Community 5.0 instance, I get to that same web script via
> >> the URL “/alfresco/service/auth/resolve/admin”, i.e. without the
> >> ‘/api/node’.
> >>
> >> Somewhere, Manifold is assuming that ‘/api/node’ is a correct path
> >> inclusion. As far as I can tell, there is nothing I can put into that box
> >> to prevent it.
> >>
> >> Paul
> >>
> >> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
> >>
> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
> >> feel certain he'd want to know.
> >>
> >> Karl
> >>
> >>
> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pf...@funnelback.com>
> >> wrote:
> >>
> >>> Hi guys,
> >>>
> >>> Just to let you know what’s going on - for informational purposes more
> >>> than anything.
> >>>
> >>> I initially tried taking the AMP file provided in the MCF plugins
> >>> directory (0.7.0) and tried to install it into Alfresco but got a
> >>> message saying a file was missing.
> >>>
> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
> >>> project and then built it on my local machine. This generated the AMP
> >>> file (0.7.2).
> >>>
> >>> I was able to successfully install the AMP file onto my Alfresco
> >>> instance.
> >>>
> >>> As it happens I now cannot log into Alfresco Share ('bad credentials or
> >>> server not available' message) but that is something I can work on.
> >>> Apparently the installation of some AMP files has been known to cause
> >>> this issue.
> >>>
> >>> So, progress to a point!
> >>>
> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> At the Alfresco side, hope this helps:
> >>>
> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
> >>>
> >>> Cheers
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com> wrote:
> >>>
> >>>> The AMP file is actually shipped as part of the binary MCF
> >>>> distribution.  You can find it under "plugins".
> >>>>
> >>>> Karl
> >>>>
> >>>>
> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com>
> >>>> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> Hopefully this will be my only request for information today.
> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector.
> >>>>> The only bit I am missing now is to install the AMP file in Alfresco.
> >>>>>
> >>>>> I realise that this is slightly outside of the Manifold remit but I
> >>>>> wondered if anyone can advise how I build the AMP file from the URL (
> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
> >>>>> repository to my local drive but, having never worked with Maven, am
> >>>>> at a loss as to how to generate the AMP file that I then need to
> >>>>> install into Alfresco.
> >>>>>
> >>>>> Many thanks,
> >>>>>
> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
> >>>>>
> >>>>> The only way you can have such a reduced list of connectors is if
> >>>>> somebody commented out many connectors in your connectors.xml, or
> >>>>> removed them from the database table where they are registered by hand.
> >>>>>
> >>>>> Karl
> >>>>>
> >>>>>
> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
> >>>>> pfarrell@funnelback.com> wrote:
> >>>>>
> >>>>>> After a good deal of time clicking around I came to the same
> >>>>>> conclusion - that there is no way of telling from the UI!!
> >>>>>>
> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in
> >>>>>> the ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
> >>>>>>
> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line:
> >>>>>>
> >>>>>> <repositoryconnector name="Alfresco Webscript"
> >>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
> >>>>>>
> >>>>>> You can imagine my excitement!
> >>>>>>
> >>>>>> The only thing I am missing is the option in the UI. When I click to
> >>>>>> create a new repo connection I get: CMIS, Dropbox, Generic, GoogleDrive,
> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
> >>>>>>
> >>>>>> Perhaps it is too much to hope that I can make a simple
> >>>>>> change to enable this repo connection?
> >>>>>>
> >>>>>> Thanks for all the help everyone
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
> >>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
> >>>>>> connection types, you've got a version that supports that connector.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Karl
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
> >>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>
> >>>>>>> Thanks Rafa.
> >>>>>>>
> >>>>>>> As an aside, is there an easy way to identify which version of
> >>>>>>> ManifoldCF you are on?
> >>>>>>>
> >>>>>>> Cheers
> >>>>>>>
> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
> >>>>>>>
> >>>>>>> Hi Paul,
> >>>>>>>
> >>>>>>> All you need to do is to install this webscript
> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
> >>>>>>> instance. The connector itself is already part of the most recent
> >>>>>>> versions of ManifoldCF.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Rafa
> >>>>>>>
> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
> >>>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>>
> >>>>>>>> Ok, thanks again guys.
> >>>>>>>>
> >>>>>>>> The Webscript connector it is.
> >>>>>>>>
> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
> >>>>>>>> guidelines on how to get this Webscript connector installed?  I see
> >>>>>>>> there is a GitHub page here (
> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
> >>>>>>>> which discusses it (although it directs you to a repository of files).
> >>>>>>>>
> >>>>>>>> I am just keen to make sure that any steps I follow to try and get
> >>>>>>>> this Webscript connector installed and working are up-to-date,
> >>>>>>>> reliable steps. I would hate to waste time with out-of-date information.
> >>>>>>>>
> >>>>>>>> Thanks all
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi Paul,
> >>>>>>>>
> >>>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl
> >>>>>>>> mentioned. Web services are very slow compared to other services, and
> >>>>>>>> I've also checked that Alfresco's CMIS web services do not return a
> >>>>>>>> change token (maybe there is something that I don't know).
> >>>>>>>>
> >>>>>>>> By the way, the current version of the CMIS connector is not aware of
> >>>>>>>> the change token. I would write a patch for you if Alfresco supported
> >>>>>>>> the change token property.
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>> Muhammed
> >>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Paul,
> >>>>>>>>>
> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
> >>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
> >>>>>>>>> webscript plugin to be installed on your Alfresco server to work,
> >>>>>>>>> though.
> >>>>>>>>>
> >>>>>>>>> Hope that helps.
> >>>>>>>>>
> >>>>>>>>> Karl
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
> >>>>>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Muhammed/Karl,
> >>>>>>>>>>
> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
> >>>>>>>>>> very much appreciated.
> >>>>>>>>>>
> >>>>>>>>>> Currently I am using AtomPub for my CMIS repository
> >>>>>>>>>> connection. I have just read something which may shed a little light
> >>>>>>>>>> on this. The post said that change tokens are not passed via AtomPub
> >>>>>>>>>> connections
> >>>>>>>>>> (https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to detect a
> >>>>>>>>>> change in Alfresco.
> >>>>>>>>>>
> >>>>>>>>>> It looks like I have two possible options left open to me
> >>>>>>>>>> (correct me if I’m wrong):
> >>>>>>>>>>
> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
> >>>>>>>>>> connection mechanism
> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection
> >>>>>>>>>> mentioned above?)
> >>>>>>>>>>
> >>>>>>>>>> Thanks again,
> >>>>>>>>>>
> >>>>>>>>>> Paul
> >>>>>>>>>>
> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Paul,
> >>>>>>>>>>
> >>>>>>>>>> Repositories should tell ManifoldCF when their documents are
> >>>>>>>>>> updated. The current CMIS connector reindexes a document only if the
> >>>>>>>>>> latest version of the document has changed, not when the document is
> >>>>>>>>>> merely updated.
> >>>>>>>>>>
> >>>>>>>>>> There is a change token property in the CMIS specification, and it
> >>>>>>>>>> should change when a document is updated so that ManifoldCF can tell
> >>>>>>>>>> that the document was updated, but implementing the change token
> >>>>>>>>>> property is optional. I've checked Alfresco's CMIS web site and seen
> >>>>>>>>>> that they don't set the change token.
> >>>>>>>>>>
> >>>>>>>>>> I think, there is nothing we can do at this point.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Paul,
> >>>>>>>>>>>
> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
> >>>>>>>>>>> document version string the connector constructs should be adequate
> >>>>>>>>>>> to detect all changes.  Can you create a ticket?
> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may in
> >>>>>>>>>>> fact be a bug in the Alfresco CMIS implementation, but we'll have to
> >>>>>>>>>>> have some back and forth before I can determine that for sure.
> >>>>>>>>>>>
> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
> >>>>>>>>>>> although there have been issues reported having to do with running it
> >>>>>>>>>>> on some configurations of Alfresco.  I'm not entirely sure what the
> >>>>>>>>>>> problem is there; maybe a version dependency of some kind.
> >>>>>>>>>>>
> >>>>>>>>>>> Karl
> >>>>>>>>>>>
> >>>>>>>>>>>



Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi guys,

With regard to the recent thread on the new Alfresco WebScript connector for Manifold, I wanted to share an update on a recent issue as well as ask a further question. 

With regard to the issue I was having with the update to the alfresco-indexer.webscript.jar file, it seems that the issue had nothing to do with Manifold and lay with the Amazon AWS instance I was using. Essentially, even a process/server restart was not rebuilding the classes, which meant the new JAR was not getting picked up. I have not had a chance to look into why.

Q. My new question is regarding the install of the ‘alfresco-indexer-webscript.amp’ file. I would like to know what I can look at to confirm that the installation of this AMP has been successful. I had imagined I should be able to log into Alfresco Share, navigate to Repository -> Data Dictionary -> Web Scripts Extensions and see some custom scripts. After the previous install, I was unable to see anything changed. 

NOTE: I install the AMP via:

    java -jar alfresco-mmt.jar install /opt/Software/alfresco-indexer-webscript.amp /opt/alfresco/tomcat/webapps/alfresco.war

and then restart the Alfresco services.
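
One check that does not depend on the Share UI would be to request the indexer's web script directly, using the path that worked earlier in this thread (/alfresco/service/auth/resolve/admin, without ‘/api/node’). The following is only a sketch under assumptions: the host, port and credentials are placeholders, the function names are made up for illustration, and a 200 response is taken to suggest that the web script is registered rather than to prove the whole AMP installed correctly.

```python
import base64
import urllib.error
import urllib.request


def probe_url(base, context="/alfresco/service", user="admin"):
    """Build the URL of the auth-resolve web script under the given context."""
    return base.rstrip("/") + context + "/auth/resolve/" + user


def web_script_status(url, username, password, timeout=10):
    """Return the HTTP status code for a GET with basic authentication."""
    token = base64.b64encode(
        f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, headers={
        "Authorization": "Basic " + token,
        "Accept": "application/json",
    })
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses arrive as exceptions; report their code.
        return error.code


# Placeholder host; only the URL is built here, no request is sent.
print(probe_url("http://localhost:8080"))
```

Calling `web_script_status(probe_url("http://localhost:8080"), "admin", "<password>")` against a running instance would then return the status code, with a 404 pointing to the web script not being deployed at that path.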

Thanks all

Paul

> On 21 Oct 2015, at 16:57, Paul Farrell <pf...@funnelback.com> wrote:
> 
> Hi Karl,
> 
> Yes, I know what you mean. As well as restarting the server I have attempted numerous manual restarts of the app/web server together with the termination of all java processes and the purging of the server’s ‘work’ directory. Nothing I do causes the new code to be picked up.
> 
> I have now asked someone else in here to cast their eyes over it as I was getting a little too close to this to see things afresh.
> 
> Thanks
> 
> 
>> On 21 Oct 2015, at 15:42, Karl Wright <daddywri@gmail.com> wrote:
>> 
>> Hi Paul,
>> 
>> If you are starting and stopping a whole virtual machine, that will NOT cause jars within each process to be reloaded.  You have to start/stop processes.
>> 
>> Karl
>> 
>> 
>> On Wed, Oct 21, 2015 at 10:19 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>> I’m quite fortunate to be running all of this on a personal AWS Virtual Machine so have been able to actually stop and start the server. 
>> 
>> Having run that command line I sent below, I can confirm that the string “api/node” does not exist in any .jar file or regular file. I am at a loss to explain how the Manifold repository connection test process is still trying to access :
>> 
>> <tr><td>The Web Script <a href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>> 
>> It has got to the point now where I may just have to bite the bullet and tell the client that we cannot support nightly Alfresco crawls i.e. crawls that take into account the change log. Tough thing to do but I can’t see I have much choice right now. 
>> 
>> Really appreciate the help
>> 
>> 
>>> On 21 Oct 2015, at 14:47, Karl Wright <daddywri@gmail.com> wrote:
>>> 
>>> Hi Paul,
>>> 
>>> I can't answer that question until I know how you've deployed things.  I'm presuming that you are using a multiprocess deployment?  If so, for the web applications, recycling the application server should be sufficient, but you really want to check to be sure what properties.xml file the application server is pointing at, so you change the jar in the right place.  In a multiprocess setup, there are also agents processes (at least one), which you would also need to cycle.
>>> 
>>> Thanks,
>>> Karl
>>> 
>>> 
>>> On Wed, Oct 21, 2015 at 9:41 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>> Thanks Karl.
>>> 
>>> Can you clarify what you mean by ‘recycle Manifold processes’? My fallback position in anything like this is to restart whatever app/web server is hosting Manifold. Is that not sufficient?
>>> 
>>> As for this path being defined elsewhere, I have just finished constructing a one-liner that lets me search through the classes within JARs. Quite useful:
>>> 
>>> find . -iname '*.jar' -printf "unzip -c %p | grep -q 'stringToSearchFor' && echo %p\n" | sh
>>> 
>>> Going to see if that original ‘api/node’ string exists anywhere else. 
>>> 
>>> Cheers
>>> 
>>> 
>>> 
>>> 
>>>> On 21 Oct 2015, at 14:36, Karl Wright <daddywri@gmail.com> wrote:
>>>> 
>>>> Hi Paul,
>>>> 
>>>> The indexer jar should appear in only one place, in the connector-lib directory that is referenced by your properties.xml file.  However, if you replace that, you will need to recycle all ManifoldCF processes or they will not be able to pick it up.
>>>> 
>>>> I would also check the URL that's being logged to be sure it matches the pattern Maurizio pointed out.  If it doesn't, there's a possibility that some other place in the connector has a similar problem that hasn't been fixed.
>>>> 
>>>> Thanks,
>>>> Karl
>>>> 
>>>> 
>>>> On Wed, Oct 21, 2015 at 8:48 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>> Hi Karl/Maurizio,
>>>> 
>>>> I have a very very odd circumstance at present. This may or may not be related to the Alfresco WebScript plugin OR the environment in which I am running Manifold but thought I would raise the question. 
>>>> 
>>>> I have cloned the repo for the Alfresco Webscript connector and can see that there is a ‘alfresco-indexer-client.jar’ file in the ‘target’ directory. 
>>>> 
>>>> I have taken that jar and have replaced the jar that existed in the Manifold instance. This was at a path called ‘apache-manifoldcf/connector-lib’. This path is referenced in an ‘mcf-properties.xml’ file which may or may not be specific to our environment. 
>>>> 
>>>> Anyway, as I say I have replaced the existing jar but the strangest thing is that the same path is being used when I ‘Save’ the repository connection. In other words, the path ‘….api/node…’ is still being used despite the jar file saying otherwise. 
>>>> 
>>>> NOTE: the way I am testing this is to apply the jar, restart Jetty (our app server), open Manifold, navigate to the Alfresco WebScript Repository connection, hit ‘Save’ and then open the ‘manifold.log’ file. It is in this file that I see the HTTP request and the 404 error. It is in this HTTP request that it stipulates the path it is using - the old path. 
>>>> 
>>>> —
>>>> 
>>>> I have even gone to the extreme of removing this jar file and restarting the app server to see if this jar is ignored by Manifold. If I do this Manifold does not even start so it is clearly expecting that jar to exist. This is even more strange. It is clearly reliant on the jar but it is not using the content of that jar. 
>>>> 
>>>> Can I ask if you guys can think of any reason at all that this might be happening. It is starting to drive me mad!
>>>> 
>>>> Thanks
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 21 Oct 2015, at 02:23, Karl Wright <daddywri@gmail.com> wrote:
>>>>> 
>>>>> Hi Paul,
>>>>> Looking at Issue 3, I think that Maurizio has indeed pointed you in the right direction.  Can you check your version of the plugin to be sure that /api/node/ is NOT present in the described line of code?
>>>>> 
>>>>> Karl
>>>>> 
>>>>> 
>>>>> On Tue, Oct 20, 2015 at 5:00 PM, <pfarrell@funnelback.com> wrote:
>>>>> Hi Maurizio,
>>>>> 
>>>>> I will be available all day tomorrow (Wednesday) to help out as much as I can. If it's possible for you to look into this I can take whatever steps you need.
>>>>> 
>>>>> Many thanks,
>>>>> 
>>>>> Paul
>>>>> 
>>>>> -----Original Message-----
>>>>> From: "Karl Wright" <daddywri@gmail.com>
>>>>> Sent: Tuesday, October 20, 2015 12:34pm
>>>>> To: "user@manifoldcf.apache.org" <user@manifoldcf.apache.org>
>>>>> Subject: Re: Manifold/Alfresco seeding and security
>>>>> 
>>>>> Hi Maurizio,
>>>>> 
>>>>> This is the third time we've seen this; can you use Paul's help to chase
>>>>> down what the issue is?
>>>>> 
>>>>> Karl
>>>>> 
>>>>> 
>>>>> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pfarrell@funnelback.com>
>>>>> wrote:
>>>>> 
>>>>> > Hi,
>>>>> >
>>>>> > I am using Alfresco Community 5.0.
>>>>> >
>>>>> > Having taken that AMP file (version 0.7.1) and then installed it into
>>>>> > Alfresco and restarted the services, the issue is still present.
>>>>> >
>>>>> > I suspect that this is probably more to do with the Manifold end than the
>>>>> > Alfresco end. It seems it is Manifold that is automatically appending the
>>>>> > “/api/node” string into the path whenever I use “/alfresco/service” as the
>>>>> > Context in the repository connection configuration.
>>>>> >
>>>>> > If it is of interest, this is the output in the manifoldcf.log file when I
>>>>> > use the repo connection config I mentioned earlier.
>>>>> >
>>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
>>>>> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
>>>>> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>>>> > allocated: 1 of 2; total allocated: 1 of 20]
>>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
>>>>> > http://54.165.85.140:8080
>>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
>>>>> > 54.165.85.140:8080
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
>>>>> > 172.31.23.90:58712<->54.165.85.140:8080
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
>>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
>>>>> > UNCHALLENGED
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
>>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > Accept: application/json
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > Host: 54.165.85.140:8080
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > Connection: Keep-Alive
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > Accept-Encoding: gzip,deflate
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET
>>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > "Accept: application/json[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > "Host: 54.165.85.140:8080[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > "Connection: Keep-Alive[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > "Accept-Encoding: gzip,deflate[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>>> > "[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "HTTP/1.1 404 Not Found[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "Server: Apache-Coyote/1.1[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "Cache-Control: no-cache[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "Pragma: no-cache[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "Transfer-Encoding: chunked[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "630[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
>>>>> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> > <head>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >    <title>Web Script Status 404 - Not Found</title>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
>>>>> > type="text/css" />[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> > </head>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> > <body>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >    <div>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >       <table>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
>>>>> > alt="Alfresco" /></td>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >             <td><span class="title">Web Script Status 404 - Not
>>>>> > Found</span></td>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          </tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >       </table>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >       <br/>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >       <table>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr><td>The Web Script <a
>>>>> > href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>>>> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >       </table>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >       <br/>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >       <table>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr><td><b>404 Description:</b></td><td> Requested resource is not
>>>>> > available.</td></tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr><td> </td></tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr><td><b>Message:</b></td><td>Cannot find object for
>>>>> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
>>>>> > schema 8,001</td></tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr><td></td><td> </td></tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >          <tr><td><b>Diagnostics</b>:</td><td><a
>>>>> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
>>>>> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >       </table>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> >    </div>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>>> > </body>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "</html>[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "[\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > "[\r][\n]"
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > HTTP/1.1 404 Not Found
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > Server: Apache-Coyote/1.1
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > Cache-Control: no-cache
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > Pragma: no-cache
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > Content-Type: text/html;charset=UTF-8
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > Transfer-Encoding: chunked
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>>> > Date: Tue, 20 Oct 2015 16:18:47 GMT
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
>>>>> > alive indefinitely
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>>>>> > Shutdown connection
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
>>>>> > connection
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
>>>>> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
>>>>> > shutting down
>>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
>>>>> > down
>>>>> >
>>>>> > *Paul Farrell*
>>>>> > Senior Search Consultant
>>>>> >
>>>>> > 109-123 Clifton Street, London EC2A 4LD
>>>>> > *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
>>>>> >
>>>>> > *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>>>>> >
>>>>> > Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
>>>>> > Twitter <https://twitter.com/funnelback>
>>>>> >
>>>>> > Funnelback UK Ltd is a limited liability company registered in England &
>>>>> > Wales. Registered address: Zetland House 109-123, Clifton Street, London.
>>>>> > EC2A 4LD. Company registration number: 07004264.
>>>>> >
>>>>> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <maoo@apache.org> wrote:
>>>>> >
>>>>> > Hi Paul,
>>>>> >
>>>>> > it looks like you're hitting
>>>>> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
>>>>> > alfresco-indexer are you using? Can you try using
>>>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp (or
>>>>> > the pre-built WAR file -
>>>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>>>>> >  )
>>>>> >
>>>>> > HTH
>>>>> >   mao
>>>>> >
>>>>> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pfarrell@funnelback.com>
>>>>> > wrote:
>>>>> >
>>>>> >> Hi,
>>>>> >>
>>>>> >> Having had to go back to basics and re-install my Alfresco instance, I
>>>>> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
>>>>> >> actually install without error. There must have been an issue with my
>>>>> >> previous Alfresco instance.
>>>>> >>
>>>>> >> Having said that, the Alfresco WebScript connector fails. The failure is
>>>>> >> down to the ‘Context’ setting (see below):
>>>>> >>
>>>>> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
>>>>> >>
>>>>> >> When you attempt to save the configuration of the WebScript connector,
>>>>> >> Manifold clearly tries to check the connection. It seems to do this by
>>>>> >> making an API call (/auth/resolve/admin). The issue is with what Manifold
>>>>> >> prepends to the start of that path.
>>>>> >> If I leave the setting as above then Manifold reports   :
>>>>> >>
>>>>> >> <tr><td>The Web Script <a
>>>>> >> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>>>> >> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>>>>> >>
>>>>> >> In other words, it builds the full path as
>>>>> >> “alfresco/service/api/node/auth/resolve/admin”.
>>>>> >>
>>>>> >> For my Alfresco Community 5.0 instance, I get to that same web script via
>>>>> >> the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
>>>>> >>
>>>>> >> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
>>>>> >> inclusion. In other words, there is nothing I can put into that box to
>>>>> >> prevent it.
>>>>> >>
>>>>> >> Paul
>>>>> >>
>>>>> >> On 20 Oct 2015, at 12:56, Karl Wright <daddywri@gmail.com> wrote:
>>>>> >>
>>>>> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
>>>>> >> feel certain he'd want to know.
>>>>> >>
>>>>> >> Karl
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com>
>>>>> >> wrote:
>>>>> >>
>>>>> >>> Hi guys,
>>>>> >>>
>>>>> >>> Just to let you know what’s going on - for informational purposes more
>>>>> >>> than anything.
>>>>> >>>
>>>>> >>> I initially tried taking the AMP file provided in the MCF plugins
>>>>> >>> directory (0.7.0) and tried to install it into Alfresco but got a message
>>>>> >>> saying a file was missing.
>>>>> >>>
>>>>> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>>>>> >>> project and then built it on my local machine. This generated the AMP file
>>>>> >>> (0.7.2).
>>>>> >>>
>>>>> >>> I was able to successfully install the AMP file onto my Alfresco
>>>>> >>> instance.
>>>>> >>>
>>>>> >>> As it happens I now cannot log into Alfresco Share ('bad credentials or
>>>>> >>> server not available' message) but that is something I can work on.
>>>>> >>> Apparently the installation of some AMP files has been known to cause this
>>>>> >>> issue.
>>>>> >>>
>>>>> >>> So, progress to a point!
>>>>> >>>
>>>>> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com> wrote:
>>>>> >>>
>>>>> >>> Hi,
>>>>> >>>
>>>>> >>> At the Alfresco side, hope this helps:
>>>>> >>>
>>>>> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>>>> >>>
>>>>> >>> Cheers
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>>> >>>
>>>>> >>>> The AMP file is actually shipped as part of the binary MCF
>>>>> >>>> distribution.  You can find it under "plugins".
>>>>> >>>>
>>>>> >>>> Karl
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com>
>>>>> >>>> wrote:
>>>>> >>>>
>>>>> >>>>> Hi all,
>>>>> >>>>>
>>>>> >>>>> Hopefully this will be my only request for information today.
>>>>> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
>>>>> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
>>>>> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
>>>>> >>>>>
>>>>> >>>>> I realise that this is slightly outside of the Manifold remit but I
>>>>> >>>>> wondered if anyone can advise how I build the AMP file from the URL (
>>>>> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>>>>> >>>>> repository to my local drive but, having never worked with Maven, am at a
>>>>> >>>>> loss at how to generate the AMP file that I then need to install into
>>>>> >>>>> Alfresco.
>>>>> >>>>>
>>>>> >>>>> Many thanks,
>>>>> >>>>>
>>>>> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com> wrote:
>>>>> >>>>>
>>>>> >>>>> The only way you can have such a reduced list of connectors is if
>>>>> >>>>> somebody commented out many connectors in your connectors.xml, or removed
>>>>> >>>>> them from the database table where they are registered by hand.
>>>>> >>>>>
>>>>> >>>>> Karl
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>>>>> >>>>> pfarrell@funnelback.com> wrote:
>>>>> >>>>>
>>>>> >>>>>> After a good deal of time clicking around I came to the same
>>>>> >>>>>> conclusion - that there is no way of telling from the UI!!
>>>>> >>>>>>
>>>>> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>>>> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>>>>> >>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>>> >>>>>>
>>>>> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>>>> >>>>>>
>>>>> >>>>>> <repositoryconnector name="Alfresco Webscript"
>>>>> >>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>>> >>>>>>
>>>>> >>>>>> You can imagine my excitement!
>>>>> >>>>>>
>>>>> >>>>>> The only thing I am missing is the option in the UI. When I click to
>>>>> >>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>>>>> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>>>> >>>>>>
>>>>> >>>>>> Perhaps I am hoping for too much to hope that I can make a simple
>>>>> >>>>>> change to enable this repo connection?
>>>>> >>>>>>
>>>>> >>>>>> Thanks for all the help everyone
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>>>>> >>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
>>>>> >>>>>> connection types, you've got a version that supports that connector.
>>>>> >>>>>>
>>>>> >>>>>> Thanks,
>>>>> >>>>>> Karl
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>>>>> >>>>>> pfarrell@funnelback.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>>> Thanks Rafa.
>>>>> >>>>>>>
>>>>> >>>>>>> As an aside, is there an easy way to identify which version of
>>>>> >>>>>>> ManifoldCF you are on?
>>>>> >>>>>>>
>>>>> >>>>>>> Cheers
>>>>> >>>>>>>
>>>>> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> Hi Paul,
>>>>> >>>>>>>
>>>>> >>>>>>> All you need to do is to install this webscript
>>>>> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>>>> >>>>>>> instance. The connector itself is already part of the most recent versions
>>>>> >>>>>>> of ManifoldCF
>>>>> >>>>>>>
>>>>> >>>>>>> Cheers,
>>>>> >>>>>>> Rafa
>>>>> >>>>>>>
>>>>> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>>>>> >>>>>>> pfarrell@funnelback.com> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>>> Ok, thanks again guys.
>>>>> >>>>>>>>
>>>>> >>>>>>>> The Webscript connector it is.
>>>>> >>>>>>>>
>>>>> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>>>> >>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
>>>>> >>>>>>>> is a GitHub page here (
>>>>> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>>>>> >>>>>>>> which discusses it (although it directs you to a repository of files).
>>>>> >>>>>>>>
>>>>> >>>>>>>> I am just keen to make sure that any steps I follow to try and get
>>>>> >>>>>>>> this Webscript connector installed and working are updated, reliable steps.
>>>>> >>>>>>>> I would hate to waste time with out of date information.
>>>>> >>>>>>>>
>>>>> >>>>>>>> Thanks all
>>>>> >>>>>>>>
>>>>> >>>>>>>>
>>>>> >>>>>>>>
>>>>> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com>
>>>>> >>>>>>>> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>> Hi Paul,
>>>>> >>>>>>>>
>>>>> >>>>>>>> I suggest that you use Alfresco Webscript, as Karl mentioned.
>>>>> >>>>>>>> Web services are very slow compared to other services, and I've also checked
>>>>> >>>>>>>> that the Alfresco CMIS web services do not return a change token (maybe there
>>>>> >>>>>>>> is something that I don't know).
>>>>> >>>>>>>>
>>>>> >>>>>>>> By the way, the current version of the CMIS connector is not aware of the change
>>>>> >>>>>>>> token. I would write a patch for you if Alfresco supported the change token
>>>>> >>>>>>>> property.
>>>>> >>>>>>>>
>>>>> >>>>>>>> Thanks!
>>>>> >>>>>>>> Muhammed
>>>>> >>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>>> Hi Paul,
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>>>>> >>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>>>>> >>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Hope that helps.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Karl
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>>> >>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>> Hi Muhammed/Karl,
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
>>>>> >>>>>>>>>> very much appreciated.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Currently I am using AtomPub for my CMIS repository
>>>>> >>>>>>>>>> connection. I have just read something which may shed a little light on
>>>>> >>>>>>>>>> this. The post said that change tokens are not passed via AtomPub
>>>>> >>>>>>>>>> connections (
>>>>> >>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>>> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to detect a
>>>>> >>>>>>>>>> change in Alfresco.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> It looks like I have two possible options left open to me
>>>>> >>>>>>>>>> (correct me if I’m wrong):
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>>> >>>>>>>>>> connection mechanism
>>>>> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>>> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>>>> >>>>>>>>>> above?)
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Thanks again,
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Paul
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> *Paul Farrell*
>>>>> >>>>>>>>>> Senior Search Consultant
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> 109-123 Clifton Street, London EC2A 4LD
>>>>> >>>>>>>>>> *T* +44 (0) 207 183 6865 <tel:%2B44%20%280%29%20207%20183%206865> | funnelback.com <http://funnelback.com/>
>>>>> >>>>>>>>>> <http://www.funnelback.com/ <http://www.funnelback.com/>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED
>>>>> >>>>>>>>>> STATES
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Connect with us: LinkedIn
>>>>> >>>>>>>>>> <http://www.linkedin.com/company/funnelback <http://www.linkedin.com/company/funnelback>> - Twitter
>>>>> >>>>>>>>>> <https://twitter.com/funnelback <https://twitter.com/funnelback>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Funnelback UK Ltd is a limited liability company registered in
>>>>> >>>>>>>>>> England & Wales. Registered address: Zetland House 109-123, Clifton Street,
>>>>> >>>>>>>>>> London. EC2A 4LD. Company registration number: 07004264.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com>
>>>>> >>>>>>>>>> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Hi Paul,
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Repositories should give information to ManifoldCF when documents are
>>>>> >>>>>>>>>> updated. The current CMIS connector reindexes a document only if the latest
>>>>> >>>>>>>>>> version of the document has changed, not when it has merely been updated.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> There is a change token property in the CMIS specification, and it
>>>>> >>>>>>>>>> should change when a document is updated so that ManifoldCF can tell that the
>>>>> >>>>>>>>>> document was updated, but implementing the change token property is optional.
>>>>> >>>>>>>>>> I've checked Alfresco's CMIS web site and seen that they do not set the
>>>>> >>>>>>>>>> change token.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> I think there is nothing we can do at this point.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>> Hi Paul,
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>> >>>>>>>>>>> document version string the connector constructs should be adequate to
>>>>> >>>>>>>>>>> detect all changes.  Can you create a ticket?
>>>>> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>>>> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>>>> >>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>>>> >>>>>>>>>>> and forth before I can determine that for sure.
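Karl's point about the version string can be illustrated with a minimal sketch. This is not the CMIS connector's actual code: the function names, the inputs, and the use of a hash are all assumptions made for illustration. The idea is only that a crawler which compares version strings will miss a permissions change unless the access tokens are folded into the string alongside the modification timestamp.

```python
import hashlib


def version_string(last_modified, allow_tokens, deny_tokens=()):
    """Hypothetical version string covering both content and permissions."""
    parts = [
        last_modified,
        "+" + ",".join(sorted(allow_tokens)),  # sorted so token order is irrelevant
        "-" + ",".join(sorted(deny_tokens)),
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def needs_reindex(stored_version, current_version):
    """The document is re-sent only when the two strings differ."""
    return stored_version != current_version


# Same timestamp, but 'UserA' has been removed from the allowed readers.
before = version_string("2015-10-19T07:43:00Z", ["UserA", "UserB"])
after = version_string("2015-10-19T07:43:00Z", ["UserB"])
print(needs_reindex(before, after))  # True
```

If the string were built from the timestamp alone, the same comparison would report no change, which matches the behaviour described in the original report.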
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
>>>>> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
>>>>> >>>>>>>>>>> although there have been issues reported having to do with running it on
>>>>> >>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what the problem is
>>>>> >>>>>>>>>>> there; maybe a version dependency of some kind.
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> Karl
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>>> >>>>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>> Hi Everyone,
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Hoping someone may be able to advise.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
>>>>> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> All is going well apart from, what I would call, the
>>>>> >>>>>>>>>>>> ‘incremental crawl’.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> The main issue I am having is that the modification of a
>>>>> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in the next
>>>>> >>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
>>>>> >>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
>>>>> >>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
>>>>> >>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
>>>>> >>>>>>>>>>>> see the document in the local search engine.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>>>>> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>>> >>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>>>> >>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>>> >>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>>> >>>>>>>>>>>> whatever internal record it has for this item.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Any ideas?
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Many thanks.


Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi Karl,

Yes, I know what you mean. As well as restarting the server, I have attempted numerous manual restarts of the app/web server, together with terminating all Java processes and purging the server’s ‘work’ directory. Nothing I do causes the new code to be picked up.

I have now asked someone else in here to cast their eyes over it as I was getting a little too close to this to see things afresh.

Thanks


> On 21 Oct 2015, at 15:42, Karl Wright <da...@gmail.com> wrote:
> 
> Hi Paul,
> 
> If you are starting and stopping a whole virtual machine, that will NOT cause jars within each process to be reloaded.  You have to start/stop processes.
> 
> Karl
> 
> 
> On Wed, Oct 21, 2015 at 10:19 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
> I’m quite fortunate to be running all of this on a personal AWS Virtual Machine so have been able to actually stop and start the server. 
> 
> Having run that command line I sent below, I can confirm that the string “api/node” does not exist in any .jar file or regular file. I am at a loss to explain how the Manifold repository connection test process is still trying to access :
> 
> <tr><td>The Web Script <a href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a> has responded with a status of 404 - Not Found.</td></tr>[\n]”
> 
> It has got to the point now where I may just have to bite the bullet and tell the client that we cannot support nightly Alfresco crawls i.e. crawls that take into account the change log. Tough thing to do but I can’t see I have much choice right now. 
> 
> Really appreciate the help
> 
> 
>> On 21 Oct 2015, at 14:47, Karl Wright <daddywri@gmail.com> wrote:
>> 
>> Hi Paul,
>> 
>> I can't answer that question until I know how you've deployed things.  I'm presuming that you are using a multiprocess deployment?  If so, for the web applications, recycling the application server should be sufficient, but you really want to check to be sure what properties.xml file the application server is pointing at, so you change the jar in the right place.  In a multiprocess setup, there are also agents processes (at least one), which you would also need to cycle.
>> 
>> Thanks,
>> Karl
>> 
>> 
>> On Wed, Oct 21, 2015 at 9:41 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>> Thanks Karl.
>> 
>> Can you clarify what you mean by ‘recycle Manifold processes’? My fallback position in anything like this is to restart whatever app/web server is hosting Manifold. Is that not sufficient?
>> 
>> As for this path being defined elsewhere, I have just finished constructing a one-liner that lets me search through the classes within jars. Quite useful:
>> 
>> find . -iname '*.jar' -printf "unzip -c %p | grep -q 'stringToSearchFor' && echo %p\n" | sh
>> 
>> Going to see if that original ‘api/node’ string exists anywhere else. 
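For anyone without GNU `find` (the `-printf` option is a GNU extension), the same search can be sketched in Python using only the standard library. The function name is made up for illustration. Unlike the one-liner's `-iname`, this version matches the `.jar` suffix case-sensitively, and like the one-liner it does not look inside jars nested within other archives.

```python
import zipfile
from pathlib import Path


def jars_containing(root, needle):
    """Yield .jar files under root that have an entry whose bytes contain needle."""
    needle_bytes = needle.encode("utf-8")
    for jar_path in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar_path) as jar:
                # Read each entry in turn and stop at the first match.
                if any(needle_bytes in jar.read(name) for name in jar.namelist()):
                    yield jar_path
        except zipfile.BadZipFile:
            # A file named *.jar that is not a valid archive is skipped.
            continue
```

A call such as `list(jars_containing("apache-manifoldcf", "api/node"))` would list every jar under that directory that still embeds the old path, which helps confirm whether a stale copy of the connector is on disk.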
>> 
>> Cheers
>> 
>> 
>> 
>> 
>>> On 21 Oct 2015, at 14:36, Karl Wright <daddywri@gmail.com> wrote:
>>> 
>>> Hi Paul,
>>> 
>>> The indexer jar should appear in only one place, in the connector-lib directory that is referenced by your properties.xml file.  However, if you replace that, you will need to recycle all ManifoldCF processes or they will not be able to pick it up.
>>> 
>>> I would also check the URL that's being logged to be sure it matches the pattern Maurizio pointed out.  If it doesn't, there's a possibility that some other place in the connector has a similar problem that hasn't been fixed.
>>> 
>>> Thanks,
>>> Karl
>>> 
>>> 
>>> On Wed, Oct 21, 2015 at 8:48 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>> Hi Karl/Maurizio,
>>> 
>>> I have a very very odd circumstance at present. This may or may not be related to the Alfresco WebScript plugin OR the environment in which I am running Manifold but thought I would raise the question. 
>>> 
>>> I have cloned the repo for the Alfresco Webscript connector and can see that there is an ‘alfresco-indexer-client.jar’ file in the ‘target’ directory. 
>>> 
>>> I have taken that jar and have replaced the jar that existed in the Manifold instance. This was at a path called ‘apache-manifoldcf/connector-lib’. This path is referenced in an ‘mcf-properties.xml’ file which may or may not be specific to our environment. 
>>> 
>>> Anyway, as I say I have replaced the existing jar but the strangest thing is that the same path is being used when I ‘Save’ the repository connection. In other words, the path ‘….api/node…’ is still being used despite the jar file saying otherwise. 
>>> 
>>> NOTE: the way I am testing this is to apply the jar, restart Jetty (our app server), open Manifold, navigate to the Alfresco WebScript Repository connection, hit ‘Save’ and then open the ‘manifold.log’ file. It is in this file that I see the HTTP request and the 404 error. It is in this HTTP request that it stipulates the path it is using - the old path. 
>>> 
>>> —
>>> 
>>> I have even gone to the extreme of removing this jar file and restarting the app server to see if this jar is ignored by Manifold. If I do this Manifold does not even start so it is clearly expecting that jar to exist. This is even more strange. It is clearly reliant on the jar but it is not using the content of that jar. 
>>> 
>>> Can I ask if you guys can think of any reason at all that this might be happening? It is starting to drive me mad!
>>> 
>>> Thanks
>>> 
>>> 
>>> 
>>> 
>>>> On 21 Oct 2015, at 02:23, Karl Wright <daddywri@gmail.com> wrote:
>>>> 
>>>> Hi Paul,
>>>> Looking at Issue 3, I think that Maurizio has indeed pointed you in the right direction.  Can you check your version of the plugin to be sure that /api/node/ is NOT present in the described line of code?
>>>> 
>>>> Karl
>>>> 
>>>> 
>>>> On Tue, Oct 20, 2015 at 5:00 PM, <pfarrell@funnelback.com> wrote:
>>>> Hi Maurizio,
>>>> 
>>>> I will be available all day tomorrow (Wednesday) to help out as much as I can. If it's possible for you to look into this I can take whatever steps you need.
>>>> 
>>>> Many thanks,
>>>> 
>>>> Paul
>>>> 
>>>> -----Original Message-----
>>>> From: "Karl Wright" <daddywri@gmail.com>
>>>> Sent: Tuesday, October 20, 2015 12:34pm
>>>> To: "user@manifoldcf.apache.org" <user@manifoldcf.apache.org>
>>>> Subject: Re: Manifold/Alfresco seeding and security
>>>> 
>>>> Hi Maurizio,
>>>> 
>>>> This is the third time we've seen this; can you use Paul's help to chase
>>>> down what the issue is?
>>>> 
>>>> Karl
>>>> 
>>>> 
>>>> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pfarrell@funnelback.com>
>>>> wrote:
>>>> 
>>>> > Hi,
>>>> >
>>>> > I am using Alfresco Community 5.0.
>>>> >
>>>> > Having taken that AMP file (version 0.7.1) and then installed it into
>>>> > Alfresco and restarted the services, the issue is still present.
>>>> >
>>>> > I suspect that this is probably more to do with the Manifold end than the
>>>> > Alfresco end. It seems it is Manifold that is automatically appending the
>>>> > “/api/node” string into the path whenever I use “/alfresco/service” as the
>>>> > Context in the repository connection configuration.
>>>> >
>>>> > If it is of interest, this is the output in the manifoldcf.log file when I
>>>> > use the repo connection config I mentioned earlier.
>>>> >
>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
>>>> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
>>>> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>>> > allocated: 1 of 2; total allocated: 1 of 20]
>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
>>>> > http://54.165.85.140:8080
>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
>>>> > 54.165.85.140:8080
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
>>>> > 172.31.23.90:58712<->54.165.85.140:8080
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
>>>> > UNCHALLENGED
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Accept: application/json
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Host: 54.165.85.140:8080
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Connection: Keep-Alive
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Accept-Encoding: gzip,deflate
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET
>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Accept: application/json[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Host: 54.165.85.140:8080[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Connection: Keep-Alive[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Accept-Encoding: gzip,deflate[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "HTTP/1.1 404 Not Found[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Server: Apache-Coyote/1.1[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Cache-Control: no-cache[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Pragma: no-cache[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Transfer-Encoding: chunked[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "630[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
>>>> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> > <head>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >    <title>Web Script Status 404 - Not Found</title>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
>>>> > type="text/css" />[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> > </head>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> > <body>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >    <div>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >       <table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
>>>> > alt="Alfresco" /></td>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >             <td><span class="title">Web Script Status 404 - Not
>>>> > Found</span></td>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          </tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >       </table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >       <br/>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >       <table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr><td>The Web Script <a
>>>> > href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>>> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >       </table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >       <br/>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >       <table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr><td><b>404 Description:</b></td><td> Requested resource is not
>>>> > available.</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr><td> </td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr><td><b>Message:</b></td><td>Cannot find object for
>>>> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
>>>> > schema 8,001</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr><td></td><td> </td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >          <tr><td><b>Diagnostics</b>:</td><td><a
>>>> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
>>>> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >       </table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> >    </div>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>>> > </body>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "</html>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > HTTP/1.1 404 Not Found
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Server: Apache-Coyote/1.1
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Cache-Control: no-cache
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Pragma: no-cache
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Content-Type: text/html;charset=UTF-8
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Transfer-Encoding: chunked
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Date: Tue, 20 Oct 2015 16:18:47 GMT
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
>>>> > alive indefinitely
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>>>> > Shutdown connection
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
>>>> > connection
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
>>>> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
>>>> > shutting down
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
>>>> > down
>>>> >
>>>> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <maoo@apache.org> wrote:
>>>> >
>>>> > Hi Paul,
>>>> >
>>>> > it looks like you're hitting
>>>> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
>>>> > alfresco-indexer are you using? Can you try using
>>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp (or
>>>> > the pre-built WAR file -
>>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>>>> >  )
>>>> >
>>>> > HTH
>>>> >   mao
>>>> >
>>>> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pfarrell@funnelback.com>
>>>> > wrote:
>>>> >
>>>> >> Hi,
>>>> >>
>>>> >> Having had to go back to basics and re-install my Alfresco instance, I
>>>> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
>>>> >> actually install without error. There must have been an issue with my
>>>> >> previous Alfresco instance.
>>>> >>
>>>> >> Having said that, the Alfresco WebScript connector fails. The failure is
>>>> >> down to the ‘Context’ setting (see below):
>>>> >>
>>>> >> [Attached screenshot of the ‘Context’ setting, 4a6db6238cff01e7ff77cdaf7e6ea050.png, not preserved in this archive]
>>>> >>
>>>> >> When you attempt to save the configuration of the WebScript connector,
>>>> >> Manifold clearly tries to check the connection. It seems to do this by
>>>> >> making an API call (/auth/resolve/admin). The issue is with what Manifold
>>>> >> prepends to the start of that path.
>>>> >> If I leave the setting as above then Manifold reports:
>>>> >>
>>>> >> <tr><td>The Web Script <a
>>>> >> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>>> >> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>>>> >>
>>>> >> In other words, it builds the full path as
>>>> >> “/alfresco/service/api/node/auth/resolve/admin”.
>>>> >>
>>>> >> For my Alfresco Community 5.0 instance, I get to that same web script via
>>>> >> the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
>>>> >>
>>>> >> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
>>>> >> inclusion. In other words, there is nothing I can put into that box to
>>>> >> prevent it.
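The difference between the two paths can be made concrete with a small sketch. This is not the connector's code: the function, its parameters, and the `localhost` base URL are placeholders for illustration. Only the two resulting paths come from the thread.

```python
from urllib.parse import urljoin


def resolve_url(base, context, username, extra_segment=""):
    """Join the configured context, an optional extra segment, and the auth endpoint."""
    path = "/".join(
        part.strip("/")
        for part in (context, extra_segment, "auth/resolve", username)
        if part.strip("/")  # drop empty segments such as a missing extra_segment
    )
    return urljoin(base, "/" + path)


base = "http://localhost:8080"  # placeholder host, not taken from the thread
# Path seen in the log (returns 404 on the Alfresco Community 5.0 instance):
print(resolve_url(base, "/alfresco/service", "admin", "api/node"))
# Path that reaches the web script when requested directly:
print(resolve_url(base, "/alfresco/service", "admin"))
```

Because the extra segment is inserted between the context and the endpoint, no value typed into the Context field can remove it, which is consistent with what is described above.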
>>>> >>
>>>> >> Paul
>>>> >>
>>>> >> On 20 Oct 2015, at 12:56, Karl Wright <daddywri@gmail.com> wrote:
>>>> >>
>>>> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
>>>> >> feel certain he'd want to know.
>>>> >>
>>>> >> Karl
>>>> >>
>>>> >>
>>>> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com>
>>>> >> wrote:
>>>> >>
>>>> >>> Hi guys,
>>>> >>>
>>>> >>> Just to let you know what’s going on - for informational purposes more
>>>> >>> than anything.
>>>> >>>
>>>> >>> I initially tried taking the AMP file provided in the MCF plugins
>>>> >>> directory (0.7.0) and tried to install it into Alfresco but got a message
>>>> >>> saying a file was missing.
>>>> >>>
>>>> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>>>> >>> project and then built it on my local machine. This generated the AMP file
>>>> >>> (0.7.2).
>>>> >>>
>>>> >>> I was able to successfully install the AMP file onto my Alfresco
>>>> >>> instance.
>>>> >>>
>>>> >>> As it happens, I now cannot log into Alfresco Share ('bad credentials or
>>>> >>> server not available' message), but that is something I can work on.
>>>> >>> Apparently the installation of some AMP files has been known to cause this
>>>> >>> issue.
>>>> >>>
>>>> >>> So, progress to a point!
>>>> >>>
>>>> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com> wrote:
>>>> >>>
>>>> >>> Hi,
>>>> >>>
>>>> >>> On the Alfresco side, I hope this helps:
>>>> >>>
>>>> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>>> >>>
>>>> >>> Cheers
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>> >>>
>>>> >>>> The AMP file is actually shipped as part of the binary MCF
>>>> >>>> distribution.  You can find it under "plugins".
>>>> >>>>
>>>> >>>> Karl
>>>> >>>>
>>>> >>>>
>>>> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com>
>>>> >>>> wrote:
>>>> >>>>
>>>> >>>>> Hi all,
>>>> >>>>>
>>>> >>>>> Hopefully this will be my only request for information today.
>>>> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
>>>> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
>>>> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
>>>> >>>>>
>>>> >>>>> I realise that this is slightly outside of the Manifold remit but I
>>>> >>>>> wondered if anyone can advise how I build the AMP file from the URL (
>>>> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>>>> >>>>> repository to my local drive but, having never worked with Maven, am at a
>>>> >>>>> loss as to how to generate the AMP file that I then need to install into
>>>> >>>>> Alfresco.
>>>> >>>>>
>>>> >>>>> Many thanks,
>>>> >>>>>
>>>> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com> wrote:
>>>> >>>>>
>>>> >>>>> The only way you can have such a reduced list of connectors is if
>>>> >>>>> somebody commented out many connectors in your connectors.xml, or removed
>>>> >>>>> them from the database table where they are registered by hand.
>>>> >>>>>
>>>> >>>>> Karl
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>>>> >>>>> pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>> >>>>>
>>>> >>>>>> After a good deal of time clicking around I came to the same
>>>> >>>>>> conclusion - that there is no way of telling from the UI!!
>>>> >>>>>>
>>>> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>>> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>>>> >>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>> >>>>>>
>>>> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>>> >>>>>>
>>>> >>>>>> <repositoryconnector name="Alfresco Webscript"
>>>> >>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>> >>>>>>
>>>> >>>>>> You can imagine my excitement!
>>>> >>>>>>
>>>> >>>>>> The only thing I am missing is the option in the UI. When I click to
>>>> >>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>>>> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>>> >>>>>>
>>>> >>>>>> Perhaps I am hoping for too much to hope that I can make a simple
>>>> >>>>>> change to enable this repo connection?
>>>> >>>>>>
>>>> >>>>>> Thanks for all the help everyone
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>>> >>>>>>
>>>> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>>>> >>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
>>>> >>>>>> connection types, you've got a version that supports that connector.
>>>> >>>>>>
>>>> >>>>>> Thanks,
>>>> >>>>>> Karl
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>>>> >>>>>> pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>> >>>>>>
>>>> >>>>>>> Thanks Rafa.
>>>> >>>>>>>
>>>> >>>>>>> As an aside, is there an easy way to identify which version of
>>>> >>>>>>> ManifoldCF you are on?
>>>> >>>>>>>
>>>> >>>>>>> Cheers
>>>> >>>>>>>
>>>> >>>>>>> *Paul Farrell*
>>>> >>>>>>>
>>>> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org <ma...@apache.org>> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Hi Paul,
>>>> >>>>>>>
>>>> >>>>>>> All you need to do is to install this webscript
>>>> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>>> >>>>>>> instance. The connector itself is already part of the most recent versions
>>>> >>>>>>> of ManifoldCF
>>>> >>>>>>>
>>>> >>>>>>> Cheers,
>>>> >>>>>>> Rafa
>>>> >>>>>>>
>>>> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>>>> >>>>>>> pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>> >>>>>>>
>>>> >>>>>>>> Ok, thanks again guys.
>>>> >>>>>>>>
>>>> >>>>>>>> The Webscript connector it is.
>>>> >>>>>>>>
>>>> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>>> >>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
>>>> >>>>>>>> is a GitHub page here (
>>>> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>>>> >>>>>>>> which discusses it (although it directs you to a repository of files).
>>>> >>>>>>>>
>>>> >>>>>>>> I am just keen to make sure that any steps I follow to try and get
>>>> >>>>>>>> this Webscript connector installed and working are updated, reliable steps.
>>>> >>>>>>>> I would hate to waste time with out of date information.
>>>> >>>>>>>>
>>>> >>>>>>>> Thanks all
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com <ma...@gmail.com>>
>>>> >>>>>>>> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> Hi Paul,
>>>> >>>>>>>>
>>>> >>>>>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned.
>>>> >>>>>>>> Web Services is slow compared to the other options, and when I checked,
>>>> >>>>>>>> Alfresco's CMIS Web Services did not return a change token (though there
>>>> >>>>>>>> may be something I don't know).
>>>> >>>>>>>>
>>>> >>>>>>>> By the way, the current version of the CMIS connector is not aware of the
>>>> >>>>>>>> change token. I would write a patch for you if Alfresco supported the
>>>> >>>>>>>> change token property.
>>>> >>>>>>>>
>>>> >>>>>>>> Thanks!
>>>> >>>>>>>> Muhammed
>>>> >>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <
>>>> >>>>>>>> daddywri@gmail.com <ma...@gmail.com>> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>>> Hi Paul,
>>>> >>>>>>>>>
>>>> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>>>> >>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>>>> >>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>>>> >>>>>>>>>
>>>> >>>>>>>>> Hope that helps.
>>>> >>>>>>>>>
>>>> >>>>>>>>> Karl
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>> >>>>>>>>> pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>>> Hi Muhammed/Karl,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
>>>> >>>>>>>>>> very much appreciated.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
>>>> >>>>>>>>>> connection. I have just read something which may shed a little light on
>>>> >>>>>>>>>> this. The post read that change tokens are not passed via AtomPub
>>>> >>>>>>>>>> connections (
>>>> >>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>>> >>>>>>>>>> change in Alfresco.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> It looks like I have two possible options left open to me
>>>> >>>>>>>>>> (correct me if I’m wrong):
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>> >>>>>>>>>> connection mechanism
>>>> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>>> >>>>>>>>>> above?)
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Thanks again,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Paul
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> *Paul Farrell*
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com <ma...@gmail.com>>
>>>> >>>>>>>>>> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Hi Paul,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Repositories need to tell ManifoldCF when a document has been
>>>> >>>>>>>>>> updated. The current CMIS connector reindexes a document only if its
>>>> >>>>>>>>>> latest version has changed, not when it has merely been updated.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> There is a change token property in the CMIS specification. It
>>>> >>>>>>>>>> should change whenever a document is updated, so that ManifoldCF can tell
>>>> >>>>>>>>>> the document has been updated, but implementing the change token property
>>>> >>>>>>>>>> is optional. I've checked Alfresco's CMIS web site and seen that they
>>>> >>>>>>>>>> don't set the change token.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> I think there is nothing we can do at this point.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com <ma...@gmail.com>>
>>>> >>>>>>>>>> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>> Hi Paul,
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>> >>>>>>>>>>> document version string the connector constructs should be adequate to
>>>> >>>>>>>>>>> detect all changes.  Can you create a ticket?
>>>> >>>>>>>>>>> https://issues.apache.org/jira, project ManifoldCF.  Please
>>>> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>>> >>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>>> >>>>>>>>>>> and forth before I can determine that for sure.
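
To illustrate why a permissions-only change can go unnoticed: a crawler that decides whether to re-index by comparing a stored version string with a freshly computed one will only see changes to the fields that go into that string. The following is a minimal Python sketch of that idea, not the CMIS connector's actual code. The function names and the choice of fields (a modification date and a list of allowed users) are hypothetical, and it assumes the modification date does not move when only the permissions are edited.

```python
import hashlib


def version_string(last_modified, allowed_users=None):
    """Build a comparison key for a document (hypothetical scheme).

    If allowed_users is None the ACL is left out of the key, so a
    permissions-only change produces the same string as before.
    """
    parts = [last_modified]
    if allowed_users is not None:
        # Sort so that the key does not depend on ACL ordering.
        parts.append(",".join(sorted(allowed_users)))
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


def needs_reindex(stored, current):
    """A document is re-indexed only when its version string changes."""
    return stored != current


if __name__ == "__main__":
    # Assumed to stay the same when only the permissions are edited.
    modified = "2015-10-19T11:00:00Z"

    # Key built from the modification date only: the ACL edit is invisible.
    print(needs_reindex(version_string(modified), version_string(modified)))

    # Key that also covers the ACL: removing user A changes the key.
    before = version_string(modified, ["userA", "userB"])
    after = version_string(modified, ["userB"])
    print(needs_reindex(before, after))
```

Run as a script, this prints False for the date-only key and True for the key that includes the ACL.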
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
>>>> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
>>>> >>>>>>>>>>> although there have been issues reported having to do with running it on
>>>> >>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what the problem is
>>>> >>>>>>>>>>> there; maybe a version dependency of some kind.
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> Karl
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>> >>>>>>>>>>> pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>> Hi Everyone,
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Hoping someone may be able to advise.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
>>>> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> All is going well apart from, what I would call, the
>>>> >>>>>>>>>>>> ‘incremental crawl’.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> The main issue I am having is that the modification of a
>>>> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in next
>>>> >>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
>>>> >>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
>>>> >>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
>>>> >>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
>>>> >>>>>>>>>>>> see the document in the local search engine.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>>>> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>> >>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>>> >>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>> >>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>> >>>>>>>>>>>> whatever internal record it has for this item.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Any ideas?
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Many thanks.


Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hi Paul,

If you are starting and stopping a whole virtual machine, that will NOT
cause jars within each process to be reloaded.  You have to start/stop
processes.

Karl


On Wed, Oct 21, 2015 at 10:19 AM, Paul Farrell <pf...@funnelback.com>
wrote:

> I’m quite fortunate to be running all of this on a personal AWS Virtual
> Machine so have been able to actually stop and start the server.
>
> Having run that command line I sent below, I can confirm that the string
> “api/node” does not exist in any .jar file or regular file. I am at a loss
> to explain how the Manifold repository connection test process is still
> trying to access :
>
> <tr><td>The Web Script <a
> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>
> It has got to the point now where I may just have to bite the bullet and
> tell the client that we cannot support nightly Alfresco crawls i.e. crawls
> that take into account the change log. Tough thing to do but I can’t see I
> have much choice right now.
>
> Really appreciate the help
>
>
> On 21 Oct 2015, at 14:47, Karl Wright <da...@gmail.com> wrote:
>
> Hi Paul,
>
> I can't answer that question until I know how you've deployed things.  I'm
> presuming that you are using a multiprocess deployment?  If so, for the web
> applications, recycling the application server should be sufficient, but
> you really want to check to be sure what properties.xml file the
> application server is pointing at, so you change the jar in the right
> place.  In a multiprocess setup, there are also agents processes (at least
> one), which you would also need to cycle.
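
One way to check which properties file each running ManifoldCF JVM was started with is sketched below in Python, for Linux only. It assumes the processes were launched with the `-Dorg.apache.manifoldcf.configfile=...` system property (the usual way of pointing ManifoldCF at its properties file as far as I know; a given deployment may do it differently) and that `/proc` is readable by the current user. The function names are illustrative.

```python
import os

# Assumed JVM flag; adjust if your start scripts pass the file another way.
CONFIG_FLAG = "-Dorg.apache.manifoldcf.configfile="


def config_file_from_args(args):
    """Return the properties file named on a JVM command line, or None."""
    for arg in args:
        if arg.startswith(CONFIG_FLAG):
            return arg[len(CONFIG_FLAG):]
    return None


def running_config_files(proc_root="/proc"):
    """Map pid -> properties file for every process that carries the flag."""
    found = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no /proc on this platform
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # process exited or is not readable by this user
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        args = raw.decode("utf-8", "replace").split("\0")
        path = config_file_from_args(args)
        if path is not None:
            found[int(entry)] = path
    return found


if __name__ == "__main__":
    for pid, path in sorted(running_config_files().items()):
        print(pid, path)
```

Every process listed is one that would need to be cycled after a jar is replaced. If the web application and the agents process report different files, it is worth checking whether they also point at different connector-lib directories.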
>
> Thanks,
> Karl
>
>
> On Wed, Oct 21, 2015 at 9:41 AM, Paul Farrell <pf...@funnelback.com>
> wrote:
>
>> Thanks Karl.
>>
>> Can you clarify what you mean by ‘recycle Manifold processes’? My
>> fallback position in anything like this is to restart whatever app/web
>> server is hosting Manifold. Is that not sufficient?
>>
>> As for this path being defined elsewhere, I have just finished
>> constructing a one-liner that lets me search through the classes within
>> jars. Quite useful:
>>
>> find . -iname '*.jar' -exec sh -c 'unzip -c "$1" | grep -q "stringToSearchFor" && echo "$1"' _ {} \;
>>
>> Going to see if that original ‘api/node’ string exists anywhere else.
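
The same search as the one-liner above can be sketched in Python, which also reports which entry inside the jar matched. This is only an illustration: the function name is made up, the match on `*.jar` is case-sensitive on Linux (unlike `-iname`), and, like the one-liner, it only looks at `*.jar` files, so a jar bundled inside a `.war` would not be searched.

```python
import sys
import zipfile
from pathlib import Path


def jars_containing(root, needle):
    """Yield (jar path, entry name) for each jar entry whose bytes contain needle."""
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                for name in archive.namelist():
                    # String constants in .class files are stored as bytes,
                    # so an ASCII needle can be matched directly.
                    if needle in archive.read(name):
                        yield jar, name
        except (zipfile.BadZipFile, OSError):
            continue  # skip unreadable or corrupt archives


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    text = sys.argv[2] if len(sys.argv) > 2 else "api/node"
    for jar, name in jars_containing(root, text.encode("utf-8")):
        print(f"{jar}: {name}")
```

Saved under a name of your choosing, it takes the directory to search and the string to look for as its two arguments, defaulting to the current directory and `api/node`.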
>>
>> Cheers
>>
>>
>>
>>
>> On 21 Oct 2015, at 14:36, Karl Wright <da...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> The indexer jar should appear in only one place, in the connector-lib
>> directory that is referenced by your properties.xml file.  However, if you
>> replace that, you will need to recycle all ManifoldCF processes or they
>> will not be able to pick it up.
>>
>> I would also check the URL that's being logged to be sure it matches the
>> pattern Maurizio pointed out.  If it doesn't, there's a possibility that
>> some other place in the connector has a similar problem that hasn't been
>> fixed.
>>
>> Thanks,
>> Karl
>>
>>
>> On Wed, Oct 21, 2015 at 8:48 AM, Paul Farrell <pf...@funnelback.com>
>> wrote:
>>
>>> Hi Karl/Maurizio,
>>>
>>> I have a very very odd circumstance at present. This may or may not be
>>> related to the Alfresco WebScript plugin OR the environment in which I am
>>> running Manifold but thought I would raise the question.
>>>
>>> I have cloned the repo for the Alfresco Webscript connector and can see
>>> that there is an ‘alfresco-indexer-client.jar’ file in the ‘target’
>>> directory.
>>>
>>> I have taken that jar and have replaced the jar that existed in the
>>> Manifold instance. This was at a path called
>>> ‘apache-manifoldcf/connector-lib’. This path is referenced in an
>>> ‘mcf-properties.xml’ file which may or may not be specific to our
>>> environment.
>>>
>>> Anyway, as I say I have replaced the existing jar but the strangest
>>> thing is that the same path is being used when I ‘Save’ the repository
>>> connection. In other words, the path ‘….api/node…’ is still being used
>>> despite the jar file saying otherwise.
>>>
>>> NOTE: the way I am testing this is to apply the jar, restart Jetty (our
>>> app server), open Manifold, navigate to the Alfresco WebScript Repository
>>> connection, hit ‘Save’ and then open the ‘manifold.log’ file. It is in this
>>> file that I see the HTTP request and the 404 error. It is in this HTTP
>>> request that it stipulates the path it is using - the old path.
>>>
>>> —
>>>
>>> I have even gone to the extreme of removing this jar file and restarting
>>> the app server to see if this jar is ignored by Manifold. If I do this
>>> Manifold does not even start so it is clearly expecting that jar to exist.
>>> This is even more strange. It is clearly reliant on the jar but it is not
>>> using the content of that jar.
>>>
>>> Can I ask if you guys can think of any reason at all that this might be
>>> happening. It is starting to drive me mad!
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>> On 21 Oct 2015, at 02:23, Karl Wright <da...@gmail.com> wrote:
>>>
>>> Hi Paul,
>>> Looking at Issue 3, I think that Maurizio has indeed pointed you in the
>>> right direction.  Can you check your version of the plugin to be sure that
>>> /api/node/ is NOT present in the described line of code?
>>>
>>> Karl
>>>
>>>
>>> On Tue, Oct 20, 2015 at 5:00 PM, <pf...@funnelback.com> wrote:
>>>
>>>> Hi Maurizio,
>>>>
>>>> I will be available all day tomorrow (Wednesday) to help out as much as
>>>> I can. If it's possible for you to look into this I can take whatever steps
>>>> you need.
>>>>
>>>> Many thanks,
>>>>
>>>> Paul
>>>>
>>>> -----Original Message-----
>>>> From: "Karl Wright" <da...@gmail.com>
>>>> Sent: Tuesday, October 20, 2015 12:34pm
>>>> To: "user@manifoldcf.apache.org" <us...@manifoldcf.apache.org>
>>>> Subject: Re: Manifold/Alfresco seeding and security
>>>>
>>>> Hi Maurizio,
>>>>
>>>> This is the third time we've seen this; can you use Paul's help to chase
>>>> down what the issue is?
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pfarrell@funnelback.com>
>>>> wrote:
>>>>
>>>> > Hi,
>>>> >
>>>> > I am using Alfresco Community 5.0.
>>>> >
>>>> > Having taken that AMP file (version 0.7.1) and then installed it into
>>>> > Alfresco and restarted the services, the issue is still present.
>>>> >
>>>> > I suspect that this is probably more to do with the Manifold end than the
>>>> > Alfresco end. It seems it is Manifold that is automatically appending the
>>>> > “/api/node” string into the path whenever I use “/alfresco/service” as the
>>>> > Context in the repository connection configuration.
>>>> >
>>>> > If it is of interest, this is the output in the manifoldcf.log file when I
>>>> > use the repo connection config I mentioned earlier.
>>>> >
>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
>>>> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased:
>>>> [id:
>>>> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>>> > allocated: 1 of 2; total allocated: 1 of 20]
>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection
>>>> {}->
>>>> > http://54.165.85.140:8080
>>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
>>>> > 54.165.85.140:8080
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection
>>>> established
>>>> > 172.31.23.90:58712<->54.165.85.140:8080
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request
>>>> GET
>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
>>>> > UNCHALLENGED
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> GET
>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Accept: application/json
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Host: 54.165.85.140:8080
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Connection: Keep-Alive
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > Accept-Encoding: gzip,deflate
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> "GET
>>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Accept: application/json[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Host: 54.165.85.140:8080[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Connection: Keep-Alive[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "Accept-Encoding: gzip,deflate[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>>> > "[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "HTTP/1.1 404 Not Found[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Server: Apache-Coyote/1.1[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Cache-Control: no-cache[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Pragma: no-cache[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Transfer-Encoding: chunked[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "630[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
>>>> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> > <head>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >    <title>Web Script Status 404 - Not Found</title>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
>>>> > type="text/css" />[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> > </head>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> > <body>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >    <div>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >       <table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
>>>> > alt="Alfresco" /></td>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >             <td><span class="title">Web Script Status 404 - Not
>>>> > Found</span></td>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          </tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >       </table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >       <br/>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >       <table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr><td>The Web Script <a
>>>> >
>>>> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>>> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >       </table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >       <br/>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >       <table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr><td><b>404 Description:</b></td><td> Requested resource
>>>> is not
>>>> > available.</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr><td> </td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr><td><b>Message:</b></td><td>Cannot find object for
>>>> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
>>>> > schema 8,001</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47
>>>> PM</td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr><td></td><td> </td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >          <tr><td><b>Diagnostics</b>:</td><td><a
>>>> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
>>>> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >       </table>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> >    </div>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> "
>>>> > </body>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "</html>[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "[\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > "[\r][\n]"
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > HTTP/1.1 404 Not Found
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Server: Apache-Coyote/1.1
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Cache-Control: no-cache
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Pragma: no-cache
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Content-Type: text/html;charset=UTF-8
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Transfer-Encoding: chunked
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>>> > Date: Tue, 20 Oct 2015 16:18:47 GMT
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be
>>>> kept
>>>> > alive indefinitely
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>>>> > Shutdown connection
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>>>> Close
>>>> > connection
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
>>>> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0;
>>>> route
>>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager
>>>> is
>>>> > shutting down
>>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager
>>>> shut
>>>> > down
>>>> >
>>>> > *Paul Farrell*
>>>> >
>>>> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <ma...@apache.org> wrote:
>>>> >
>>>> > Hi Paul,
>>>> >
>>>> > it looks like you're hitting
>>>> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
>>>> > alfresco-indexer are you using? Can you try using
>>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp
>>>> > (or the pre-built WAR file -
>>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>>>> > )
>>>> >
>>>> > HTH
>>>> >   mao
>>>> >
>>>> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pfarrell@funnelback.com>
>>>> > wrote:
>>>> >
>>>> >> Hi,
>>>> >>
>>>> >> Having had to go back to basics and re-install my Alfresco instance, I
>>>> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
>>>> >> actually install without error. There must have been an issue with my
>>>> >> previous Alfresco instance.
>>>> >>
>>>> >> Having said that, the Alfresco WebScript connector fails. The failure is
>>>> >> down to the ‘Context’ setting (see below):
>>>> >>
>>>> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
>>>> >>
>>>> >> When you attempt to save the configuration of the WebScript
>>>> connector,
>>>> >> Manifold clearly tries to check the connection. It seems to do this
>>>> by
>>>> >> making an API call (/auth/resolve/admin). The issue is with what
>>>> Manifold
>>>> >> prepends to the start of that path.
>>>> >> If I leave the setting as above then Manifold reports   :
>>>> >>
>>>> >> <tr><td>The Web Script <a
>>>> >>
>>>> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>>> >> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>>>> >>
>>>> >> In other words, it builds the full path as
>>>> >> “alfresco/service/api/node/auth/resolve/admin”.
>>>> >>
>>>> >> For my Alfresco Community 5.0 instance, I get to that same web
>>>> script via
>>>> >> the URL “/alfresco/service/auth/resolve/admin” i.e. without the
>>>> ‘/api/node’.
>>>> >>
>>>> >> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct
>>>> path
>>>> >> inclusion. In other words, there is nothing I can put into that box
>>>> to
>>>> >> prevent it.
>>>> >>
>>>> >> Paul
>>>> >>
>>>> >> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
>>>> >>
>>>> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin;
>>>> I
>>>> >> feel certain he'd want to know.
>>>> >>
>>>> >> Karl
>>>> >>
>>>> >>
>>>> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <
>>>> pfarrell@funnelback.com>
>>>> >> wrote:
>>>> >>
>>>> >>> Hi guys,
>>>> >>>
>>>> >>> Just to let you know what’s going on - for informational purposes
>>>> more
>>>> >>> than anything.
>>>> >>>
>>>> >>> I initially tried taking the AMP file provided in the MCF plugins
>>>> >>> directory (0.7.0) and tried to install it into Alfresco but got a
>>>> message
>>>> >>> saying a file was missing.
>>>> >>>
>>>> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>>>> >>> project and then built it on my local machine. This generated the
>>>> AMP file
>>>> >>> (0.7.2).
>>>> >>>
>>>> >>> I was able to successfully install the AMP file onto my Alfresco
>>>> >>> instance.
>>>> >>>
>>>> >>> As it happens I now cannot log into Alfresco Share ('bad
>>>> credentials or
>>>> >>> server not available' message) but that is something I can work on.
>>>> >>> Apparently the installation of some AMP files has been known to cause
>>>> >>> this issue.
>>>> >>>
>>>> >>> So, progress to a point!
>>>> >>>
>>>> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
>>>> >>>
>>>> >>> Hi,
>>>> >>>
>>>> >>> At the Alfresco side, hope this helps:
>>>> >>>
>>>> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>>> >>>
>>>> >>> Cheers
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com>
>>>> wrote:
>>>> >>>
>>>> >>>> The AMP file is actually shipped as part of the binary MCF
>>>> >>>> distribution.  You can find it under "plugins".
>>>> >>>>
>>>> >>>> Karl
>>>> >>>>
>>>> >>>>
>>>> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <
>>>> pfarrell@funnelback.com>
>>>> >>>> wrote:
>>>> >>>>
>>>> >>>>> Hi all,
>>>> >>>>>
>>>> >>>>> Hopefully this will be my only request for information today.
>>>> >>>>> I’m afraid this is a bit of a newbie question but I have managed
>>>> to
>>>> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a
>>>> connector. The
>>>> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
>>>> >>>>>
>>>> >>>>> I realise that this is slightly outside of the Manifold remit but
>>>> I
>>>> >>>>> wondered if anyone can advise how I build the AMP file from the
>>>> URL (
>>>> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>>>> >>>>> repository to my local drive but, having never worked with Maven,
>>>> am at a
>>>> >>>>> loss at how to generate the AMP file that I then need to install
>>>> into
>>>> >>>>> Alfresco.
>>>> >>>>>
>>>> >>>>> Many thanks,
>>>> >>>>>
>>>> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>>>> >>>>>
>>>> >>>>> The only way you can have such a reduced list of connectors is if
>>>> >>>>> somebody commented out many connectors in your connectors.xml, or
>>>> removed
>>>> >>>>> them from the database table where they are registered by hand.
>>>> >>>>>
>>>> >>>>> Karl
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>>>> >>>>> pfarrell@funnelback.com> wrote:
>>>> >>>>>
>>>> >>>>>> After a good deal of time clicking around I came to the same
>>>> >>>>>> conclusion - that there is no way of telling from the UI!!
>>>> >>>>>>
>>>> >>>>>> Having dug a bit deeper I believe I may actually have the
>>>> Alfresco
>>>> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I
>>>> notice in the
>>>> >>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>> >>>>>>
>>>> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>>> >>>>>>
>>>> >>>>>> <repositoryconnector name="Alfresco Webscript"
>>>> >>>>>>   class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>> >>>>>>
>>>> >>>>>> You can imagine my excitement!
>>>> >>>>>>
>>>> >>>>>> The only thing I am missing is the option in the UI. When I
>>>> click to
>>>> >>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic,
>>>> GoogleDrive,
>>>> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>>> >>>>>>
>>>> >>>>>> Perhaps I am hoping for too much to hope that I can make a simple
>>>> >>>>>> change to enable this repo connection?
>>>> >>>>>>
>>>> >>>>>> Thanks for all the help everyone
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com>
>>>> wrote:
>>>> >>>>>>
>>>> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>>>> >>>>>> mean.  But if you see "Alfresco webscript" in the list of
>>>> repository
>>>> >>>>>> connection types, you've got a version that supports that
>>>> connector.
>>>> >>>>>>
>>>> >>>>>> Thanks,
>>>> >>>>>> Karl
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>>>> >>>>>> pfarrell@funnelback.com> wrote:
>>>> >>>>>>
>>>> >>>>>>> Thanks Rafa.
>>>> >>>>>>>
>>>> >>>>>>> As an aside, is there an easy way to identify which version of
>>>> >>>>>>> ManifoldCF you are on?
>>>> >>>>>>>
>>>> >>>>>>> Cheers
>>>> >>>>>>>
>>>> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Hi Paul,
>>>> >>>>>>>
>>>> >>>>>>> All you need to do is to install this webscript
>>>> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>>> >>>>>>> instance. The connector itself is already part of the most
>>>> recent versions
>>>> >>>>>>> of ManifoldCF
>>>> >>>>>>>
>>>> >>>>>>> Cheers,
>>>> >>>>>>> Rafa
>>>> >>>>>>>
>>>> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>>>> >>>>>>> pfarrell@funnelback.com> wrote:
>>>> >>>>>>>
>>>> >>>>>>>> Ok, thanks again guys.
>>>> >>>>>>>>
>>>> >>>>>>>> The Webscript connector it is.
>>>> >>>>>>>>
>>>> >>>>>>>> I realise I am asking a lot here but are there any
>>>> easy-to-follow
>>>> >>>>>>>> guidelines on how to get this Webscript connector installed?
>>>> I see there
>>>> >>>>>>>> is a GitHub page here (
>>>> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>>>> >>>>>>>> which discusses it (although it directs you to a repository of
>>>> files).
>>>> >>>>>>>>
>>>> >>>>>>>> I am just keen to make sure that any steps I follow to try and
>>>> get
>>>> >>>>>>>> this Webscript connector installed and working are updated,
>>>> reliable steps.
>>>> >>>>>>>> I would hate to waste time with out of date information.
>>>> >>>>>>>>
>>>> >>>>>>>> Thanks all
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com>
>>>> >>>>>>>> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> Hi Paul,
>>>> >>>>>>>>
>>>> >>>>>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned.
>>>> >>>>>>>> Web services are slow compared to the other services, and I have
>>>> >>>>>>>> also checked that the Alfresco CMIS web services do not return a
>>>> >>>>>>>> change token (though there may be something I don't know).
>>>> >>>>>>>>
>>>> >>>>>>>> By the way, the current version of the CMIS connector is not aware of
>>>> >>>>>>>> the change token. I would write a patch for you if Alfresco supported
>>>> >>>>>>>> the change token property.
>>>> >>>>>>>>
>>>> >>>>>>>> Thanks!
>>>> >>>>>>>> Muhammed
>>>> >>>>>>>> 19 Eki 2015 Pzt, saat 18:11 tarihinde Karl Wright <
>>>> >>>>>>>> daddywri@gmail.com> şunu yazdı:
>>>> >>>>>>>>
>>>> >>>>>>>>> Hi Paul,
>>>> >>>>>>>>>
>>>> >>>>>>>>> The Alfresco Webscript connector is a wholly different
>>>> connector
>>>> >>>>>>>>> that has no relation to the CMIS connector.  It requires an
>>>> Alfresco
>>>> >>>>>>>>> webscript plugin be installed on your Alfresco server to
>>>> work, though.
>>>> >>>>>>>>>
>>>> >>>>>>>>> Hope that helps.
>>>> >>>>>>>>>
>>>> >>>>>>>>> Karl
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>> >>>>>>>>> pfarrell@funnelback.com> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>>> Hi Muhammed/Karl,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It
>>>> is
>>>> >>>>>>>>>> very much appreciated.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
>>>> >>>>>>>>>> connection. I have just read something which may shed a
>>>> little light on
>>>> >>>>>>>>>> this. The post read that change tokens are not passed via
>>>> AtomPub
>>>> >>>>>>>>>> connections (
>>>> >>>>>>>>>>
>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758
>>>> ).
>>>> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to
>>>> determine a
>>>> >>>>>>>>>> change in Alfresco.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> It looks like I have two possible options left open to me
>>>> >>>>>>>>>> (correct me if I’m wrong):
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>> >>>>>>>>>> connection mechanism
>>>> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’
>>>> connection mentioned
>>>> >>>>>>>>>> above?)
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Thanks again,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Paul
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com
>>>> >
>>>> >>>>>>>>>> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Hi Paul,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Repositories should tell ManifoldCF when their documents are
>>>> >>>>>>>>>> updated. The current CMIS connector reindexes a document only
>>>> >>>>>>>>>> when its latest version has changed, not whenever it is updated.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> The CMIS specification defines a change token property that
>>>> >>>>>>>>>> should change whenever a document is updated, so that ManifoldCF
>>>> >>>>>>>>>> can tell the document has been updated, but implementing the
>>>> >>>>>>>>>> change token property is optional. I've checked Alfresco's CMIS
>>>> >>>>>>>>>> web site and seen that they don't set the change token.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> I think, there is nothing we can do at this point.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> 19 Eki 2015 Pzt, 15:59 tarihinde, Karl Wright <
>>>> daddywri@gmail.com>
>>>> >>>>>>>>>> şunu yazdı:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>> Hi Paul,
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually
>>>> the
>>>> >>>>>>>>>>> document version string the connector constructs should be
>>>> adequate to
>>>> >>>>>>>>>>> detect all changes.  Can you create a ticket?
>>>> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.
>>>> Please
>>>> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this
>>>> may be in fact
>>>> >>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have
>>>> to have some back
>>>> >>>>>>>>>>> and forth before I can determine that for sure.
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
>>>> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco
>>>> indexing,
>>>> >>>>>>>>>>> although there have been issues reported having to do with
>>>> running it on
>>>> >>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure
>>>> what the problem is
>>>> >>>>>>>>>>> there; maybe a version dependency of some kind.
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> Karl
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>> >>>>>>>>>>> pfarrell@funnelback.com> wrote:
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>> Hi Everyone,
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Hoping someone may be able to advise.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS
>>>> connector,
>>>> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> All is going well apart from, what I would call, the
>>>> >>>>>>>>>>>> ‘incremental crawl’.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> The main issue I am having is that the modification of a
>>>> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being
>>>> picked up in next
>>>> >>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’
>>>> which has user A
>>>> >>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks
>>>> up the documents
>>>> >>>>>>>>>>>> fine.  The security is set as expected. I then remove
>>>> ‘User A’ from the
>>>> >>>>>>>>>>>> security of that document and re-run the Manifold crawl.
>>>> User A can still
>>>> >>>>>>>>>>>> see the document in the local search engine.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> It is as if Manifold is not treating the security update
>>>> as a
>>>> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note
>>>> that if I go into
>>>> >>>>>>>>>>>> the Output Connections, edit and save the relevant output
>>>> connection and
>>>> >>>>>>>>>>>> then click ‘Remove all associated documents’, the next
>>>> time I crawl, the
>>>> >>>>>>>>>>>> changes are picked up. It is clear that Manifold is just
>>>> not updating
>>>> >>>>>>>>>>>> whatever internal record it has for this item.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Any ideas?
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Many thanks.

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
I’m quite fortunate to be running all of this on a personal AWS virtual machine, so I have been able to actually stop and start the server.

Having run the command line I sent below, I can confirm that the string “api/node” does not exist in any .jar file or regular file. I am at a loss to explain how the Manifold repository connection test is still trying to access:

<tr><td>The Web Script <a href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a> has responded with a status of 404 - Not Found.</td></tr>
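A broader version of that search can be sketched in Python. This is an illustrative sketch only, not part of ManifoldCF or the alfresco-indexer project: the names `scan_archive` and `find_in_archives` are made up for this example, and it assumes the string of interest is stored as plain bytes inside an archive entry. Unlike a search limited to `*.jar` files on disk, it also opens `.war` files and any archives nested inside them.

```python
import io
import zipfile
from pathlib import Path

# Archive types to open; extend this tuple if other zip-based packages matter.
ARCHIVE_SUFFIXES = (".jar", ".war")


def scan_archive(source, label, needle, hits):
    """Append to `hits` every entry of one archive whose bytes contain `needle`.

    `source` is a file path or a file-like object; `label` is used for reporting.
    """
    try:
        with zipfile.ZipFile(source) as archive:
            for name in archive.namelist():
                payload = archive.read(name)
                entry = f"{label}!{name}"
                if name.lower().endswith(ARCHIVE_SUFFIXES):
                    # Nested archive, e.g. WEB-INF/lib/*.jar inside a WAR.
                    scan_archive(io.BytesIO(payload), entry, needle, hits)
                elif needle in payload:
                    hits.append(entry)
    except zipfile.BadZipFile:
        pass  # Not a readable zip archive; skip it.


def find_in_archives(root, text):
    """Search every .jar/.war under `root` (recursively) for `text`."""
    hits = []
    needle = text.encode("utf-8")
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in ARCHIVE_SUFFIXES:
            scan_archive(str(path), str(path), needle, hits)
    return hits
```

Calling `find_in_archives(".", "api/node")` from the ManifoldCF install directory returns the matching entries as `archive!entry` strings. An empty result does not prove the string is absent: a path assembled at run time from smaller pieces would not show up as one contiguous byte sequence.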

It has got to the point now where I may just have to bite the bullet and tell the client that we cannot support nightly Alfresco crawls, i.e. crawls that take the change log into account. It is a tough thing to do, but I can’t see that I have much choice right now.

Really appreciate the help.

> On 21 Oct 2015, at 14:47, Karl Wright <da...@gmail.com> wrote:
> 
> Hi Paul,
> 
> I can't answer that question until I know how you've deployed things.  I'm presuming that you are using a multiprocess deployment?  If so, for the web applications, recycling the application server should be sufficient, but you really want to check to be sure what properties.xml file the application server is pointing at, so you change the jar in the right place.  In a multiprocess setup, there are also agents processes (at least one), which you would also need to cycle.
> 
> Thanks,
> Karl
> 
> 
> On Wed, Oct 21, 2015 at 9:41 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
> Thanks Karl.
> 
> Can you clarify what you mean by ‘recycle Manifold processes’? My fallback position in anything like this is to restart whatever app/web server is hosting Manifold. Is that not sufficient?
> 
> As for this path being defined elsewhere, I have just finished constructing a one-liner that lets me search through the classes within JAR files. Quite useful:
> 
> find . -iname '*.jar' -printf "unzip -c %p | grep -q 'stringToSearchFor' && echo %p\n" | sh
> 
> Going to see if that original ‘api/node’ string exists anywhere else. 
> 
> Cheers
> 
> 
> 
> 
>> On 21 Oct 2015, at 14:36, Karl Wright <daddywri@gmail.com> wrote:
>> 
>> Hi Paul,
>> 
>> The indexer jar should appear in only one place, in the connector-lib directory that is referenced by your properties.xml file.  However, if you replace that, you will need to recycle all ManifoldCF processes or they will not be able to pick it up.
>> 
>> I would also check the URL that's being logged to be sure it matches the pattern Maurizio pointed out.  If it doesn't, there's a possibility that some other place in the connector has a similar problem that hasn't been fixed.
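Checking the logged URL by eye is tedious in a DEBUG log of this size. As a rough aid, the sketch below pulls each request path and the status of its response out of HttpClient debug lines like the ones quoted in this thread. It assumes that the log file holds each entry on a single line (the e-mail wrapping is not present in the file) and that the line layout matches the quoted excerpt; `summarize` and the two patterns are hypothetical helpers, not part of ManifoldCF.

```python
import re

# Matches: "(thread) - Executing request GET /some/path HTTP/1.1"
REQUEST = re.compile(
    r"\((?P<thread>[^)]+)\) - Executing request (?P<method>[A-Z]+) (?P<path>\S+) HTTP/"
)
# Matches the unquoted header line: "(thread) - http-outgoing-10 << HTTP/1.1 404 Not Found"
STATUS = re.compile(
    r"\((?P<thread>[^)]+)\) - \S+ << HTTP/\d\.\d (?P<code>\d{3})"
)


def summarize(lines):
    """Return (method, path, status) for each request/response pair, matched per thread."""
    pending = {}
    results = []
    for line in lines:
        match = REQUEST.search(line)
        if match:
            # Remember the most recent request issued by this thread.
            pending[match["thread"]] = (match["method"], match["path"])
            continue
        match = STATUS.search(line)
        if match and match["thread"] in pending:
            method, path = pending.pop(match["thread"])
            results.append((method, path, int(match["code"])))
    return results
```

Passing an open log file to `summarize` (for example `summarize(open("manifoldcf.log"))`) yields one tuple per request, which makes it easy to confirm whether `/api/node/` is still part of the path after a jar has been swapped.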
>> 
>> Thanks,
>> Karl
>> 
>> 
>> On Wed, Oct 21, 2015 at 8:48 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>> Hi Karl/Maurizio,
>> 
>> I have a very very odd circumstance at present. This may or may not be related to the Alfresco WebScript plugin OR the environment in which I am running Manifold but thought I would raise the question. 
>> 
>> I have cloned the repo for the Alfresco Webscript connector and can see that there is an ‘alfresco-indexer-client.jar’ file in the ‘target’ directory.
>> 
>> I have taken that jar and have replaced the jar that existed in the Manifold instance. This was at a path called ‘apache-manifoldcf/connector-lib’. This path is referenced in an ‘mcf-properties.xml’ file which may or may not be specific to our environment. 
>> 
>> Anyway, as I say I have replaced the existing jar but the strangest thing is that the same path is being used when I ‘Save’ the repository connection. In other words, the path ‘….api/node…’ is still being used despite the jar file saying otherwise. 
>> 
>> NOTE: the way I am testing this is to apply the jar, restart Jetty (our app server), open Manifold, navigate to the Alfresco WebScript Repository connection, hit ‘Save’ and then open the ‘manifold.log’ file. It is in this file that I see the HTTP request and the 404 error. It is in this HTTP request that it stipulates the path it is using - the old path. 
>> 
>> —
>> 
>> I have even gone to the extreme of removing this jar file and restarting the app server to see if this jar is ignored by Manifold. If I do this Manifold does not even start so it is clearly expecting that jar to exist. This is even more strange. It is clearly reliant on the jar but it is not using the content of that jar. 
>> 
>> Can I ask if you guys can think of any reason at all that this might be happening. It is starting to drive me mad!
>> 
>> Thanks
>> 
>> 
>> 
>> 
>>> On 21 Oct 2015, at 02:23, Karl Wright <daddywri@gmail.com> wrote:
>>> 
>>> Hi Paul,
>>> Looking at Issue 3, I think that Maurizio has indeed pointed you in the right direction.  Can you check your version of the plugin to be sure that /api/node/ is NOT present in the described line of code?
>>> 
>>> Karl
>>> 
>>> 
>>> On Tue, Oct 20, 2015 at 5:00 PM, <pfarrell@funnelback.com> wrote:
>>> Hi Maurizio,
>>> 
>>> I will be available all day tomorrow (Wednesday) to help out as much as I can. If it's possible for you to look into this I can take whatever steps you need.
>>> 
>>> Many thanks,
>>> 
>>> Paul
>>> 
>>> -----Original Message-----
>>> From: "Karl Wright" <daddywri@gmail.com>
>>> Sent: Tuesday, October 20, 2015 12:34pm
>>> To: "user@manifoldcf.apache.org" <user@manifoldcf.apache.org>
>>> Subject: Re: Manifold/Alfresco seeding and security
>>> 
>>> Hi Maurizio,
>>> 
>>> This is the third time we've seen this; can you use Paul's help to chase
>>> down what the issue is?
>>> 
>>> Karl
>>> 
>>> 
>>> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pfarrell@funnelback.com>
>>> wrote:
>>> 
>>> > Hi,
>>> >
>>> > I am using Alfresco Community 5.0.
>>> >
>>> > Having taken that AMP file (version 0.7.1) and then installed it into
>>> > Alfresco and restarted the services, the issue is still present.
>>> >
>>> > I suspect that this is probably more to do with the Manifold end than the
>>> > Alfresco end. It seems it is Manifold that is automatically appending the
>>> > “/api/node” string into the path whenever I use “/alfresco/service” as the
>>> > Context in the repository connection configuration.
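To make the path arithmetic concrete, here is a small Python sketch of how the two candidate URLs differ. It only illustrates the string composition described above: `join_path` and `candidate_paths` are hypothetical helpers, and nothing here reflects how the connector is actually implemented.

```python
def join_path(*parts):
    """Join URL path segments with single slashes, skipping empty segments."""
    cleaned = [part.strip("/") for part in parts if part.strip("/")]
    return "/" + "/".join(cleaned)


def candidate_paths(context, script="/auth/resolve/admin", prefixes=("", "/api/node")):
    """Build the web script path once without a prefix and once with '/api/node'."""
    return [join_path(context, prefix, script) for prefix in prefixes]


for path in candidate_paths("/alfresco/service"):
    print(path)
```

The first path printed is the one reported in this thread as reachable on Alfresco Community 5.0; the second is the one that appears in the log and returns 404. The 404 page in the log names `org/alfresco/cmis/item.get` and a `NodeIdReference` with `storeRef=auth://resolve`, which suggests (the thread does not confirm it) that Alfresco matched the request against one of its own `/api/node` scripts rather than the indexer's.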
>>> >
>>> > If it is of interest, this is the output in the manifoldcf.log file when I
>>> > use the repo connection config I mentioned earlier.
>>> >
>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
>>> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
>>> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>> > allocated: 1 of 2; total allocated: 1 of 20]
>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
>>> > http://54.165.85.140:8080
>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
>>> > 54.165.85.140:8080
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
>>> > 172.31.23.90:58712<->54.165.85.140:8080
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
>>> > UNCHALLENGED
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Accept: application/json
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Host: 54.165.85.140:8080
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Connection: Keep-Alive
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Accept-Encoding: gzip,deflate
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET
>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Accept: application/json[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Host: 54.165.85.140:8080[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Connection: Keep-Alive[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Accept-Encoding: gzip,deflate[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "HTTP/1.1 404 Not Found[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Server: Apache-Coyote/1.1[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Cache-Control: no-cache[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Pragma: no-cache[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Transfer-Encoding: chunked[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "630[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>>> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> > <head>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >    <title>Web Script Status 404 - Not Found</title>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
>>> > type="text/css" />[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> > </head>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> > <body>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >    <div>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
>>> > alt="Alfresco" /></td>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >             <td><span class="title">Web Script Status 404 - Not
>>> > Found</span></td>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          </tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       </table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <br/>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td>The Web Script <a
>>> > href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       </table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <br/>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>404 Description:</b></td><td> Requested resource is not
>>> > available.</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td> </td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>Message:</b></td><td>Cannot find object for
>>> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
>>> > schema 8,001</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td></td><td> </td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>Diagnostics</b>:</td><td><a
>>> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
>>> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       </table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >    </div>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> > </body>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "</html>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > HTTP/1.1 404 Not Found
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Server: Apache-Coyote/1.1
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Cache-Control: no-cache
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Pragma: no-cache
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Content-Type: text/html;charset=UTF-8
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Transfer-Encoding: chunked
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Date: Tue, 20 Oct 2015 16:18:47 GMT
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
>>> > alive indefinitely
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>>> > Shutdown connection
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
>>> > connection
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
>>> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
>>> > shutting down
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
>>> > down
>>> >
>>> > *Paul Farrell*
>>> > Senior Search Consultant
>>> >
>>> > 109-123 Clifton Street, London EC2A 4LD
>>> > *T* +44 (0) 207 183 6865 <tel:%2B44%20%280%29%20207%20183%206865> | funnelback.com <http://funnelback.com/> <http://www.funnelback.com/ <http://www.funnelback.com/>>
>>> >
>>> > *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>>> >
>>> > Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback <http://www.linkedin.com/company/funnelback>> -
>>> > Twitter <https://twitter.com/funnelback <https://twitter.com/funnelback>>
>>> >
>>> > Funnelback UK Ltd is a limited liability company registered in England &
>>> > Wales. Registered address: Zetland House 109-123, Clifton Street, London.
>>> > EC2A 4LD. Company registration number: 07004264.
>>> >
>>> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <maoo@apache.org> wrote:
>>> >
>>> > Hi Paul,
>>> >
>>> > it looks like you're hitting
>>> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
>>> > alfresco-indexer are you using? Can you try using
>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp (or
>>> > the pre-built WAR file -
>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>>> >  )
>>> >
>>> > HTH
>>> >   mao
>>> >
>>> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pfarrell@funnelback.com>
>>> > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> Having had to go back to basics and re-install my Alfresco instance, I
>>> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
>>> >> actually install without error. There must have been an issue with my
>>> >> previous Alfresco instance.
>>> >>
>>> >> Having said that, the Alfresco WebScript connector fails. The failure is
>>> >> down to the ‘Context’ setting (see below):
>>> >>
>>> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
>>> >>
>>> >> When you attempt to save the configuration of the WebScript connector,
>>> >> Manifold clearly tries to check the connection. It seems to do this by
>>> >> making an API call (/auth/resolve/admin). The issue is with what Manifold
>>> >> prepends to the start of that path.
>>> >> If I leave the setting as above then Manifold reports   :
>>> >>
>>> >> <tr><td>The Web Script <a
>>> >> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>> >> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>>> >>
>>> >> In other words, it builds the full path as
>>> >> “alfresco/service/api/node/auth/resolve/admin”.
>>> >>
>>> >> For my Alfresco Community 5.0 instance, I get to that same web script via
>>> >> the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
>>> >>
>>> >> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
>>> >> inclusion. In other words, there is nothing I can put into that box to
>>> >> prevent it.
>>> >>
>>> >> Paul
>>> >>
>>> >> On 20 Oct 2015, at 12:56, Karl Wright <daddywri@gmail.com> wrote:
>>> >>
>>> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
>>> >> feel certain he'd want to know.
>>> >>
>>> >> Karl
>>> >>
>>> >>
>>> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com>
>>> >> wrote:
>>> >>
>>> >>> Hi guys,
>>> >>>
>>> >>> Just to let you know what’s going on - for informational purposes more
>>> >>> than anything.
>>> >>>
>>> >>> I initially tried taking the AMP file provided in the MCF plugins
>>> >>> directory (0.7.0) and tried to install it into Alfresco but got a message
>>> >>> saying a file was missing.
>>> >>>
>>> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>>> >>> project and then built it on my local machine. This generated the AMP file
>>> >>> (0.7.2).
>>> >>>
>>> >>> I was able to successfully install the AMP file onto my Alfresco
>>> >>> instance.
>>> >>>
>>> >>> As it happens I now cannot log into Alfresco Share ('bad credentials or
>>> >>> server not available' message) but that is something I can work on.
>>> >>> Apparently the installation of some AMP files has been known to cause this
>>> >>> issue.
>>> >>>
>>> >>> So, progress to a point!
>>> >>>
>>> >>> *Paul Farrell*
>>> >>>
>>> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com> wrote:
>>> >>>
>>> >>> Hi,
>>> >>>
>>> >>> At the Alfresco side, hope this helps:
>>> >>>
>>> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>> >>>
>>> >>> Cheers
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>>> >>>
>>> >>>> The AMP file is actually shipped as part of the binary MCF
>>> >>>> distribution.  You can find it under "plugins".
>>> >>>>
>>> >>>> Karl
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com>
>>> >>>> wrote:
>>> >>>>
>>> >>>>> Hi all,
>>> >>>>>
>>> >>>>> Hopefully this will be my only request for information today.
>>> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
>>> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
>>> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
>>> >>>>>
>>> >>>>> I realise that this is slightly outside of the Manifold remit but I
>>> >>>>> wondered if anyone can advise how I build the AMP file from the URL (
>>> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>>> >>>>> repository to my local drive but, having never worked with Maven, am at a
>>> >>>>> loss as to how to generate the AMP file that I then need to install into
>>> >>>>> Alfresco.
>>> >>>>>
>>> >>>>> Many thanks,
>>> >>>>>
>>> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com> wrote:
>>> >>>>>
>>> >>>>> The only way you can have such a reduced list of connectors is if
>>> >>>>> somebody commented out many connectors in your connectors.xml, or removed
>>> >>>>> them from the database table where they are registered by hand.
>>> >>>>>
>>> >>>>> Karl
>>> >>>>>
>>> >>>>>
>>> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>>> >>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>
>>> >>>>>> After a good deal of time clicking around I came to the same
>>> >>>>>> conclusion - that there is no way of telling from the UI!!
>>> >>>>>>
>>> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>>> >>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>> >>>>>>
>>> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>> >>>>>>
>>> >>>>>> <repositoryconnector name="Alfresco Webscript"
>>> >>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>> >>>>>>
>>> >>>>>> You can imagine my excitement!
>>> >>>>>>
>>> >>>>>> The only thing I am missing is the option in the UI. When I click to
>>> >>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>>> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>> >>>>>>
>>> >>>>>> Perhaps I am hoping for too much to hope that I can make a simple
>>> >>>>>> change to enable this repo connection?
>>> >>>>>>
>>> >>>>>> Thanks for all the help everyone
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>>> >>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
>>> >>>>>> connection types, you've got a version that supports that connector.
>>> >>>>>>
>>> >>>>>> Thanks,
>>> >>>>>> Karl
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>>> >>>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>>
>>> >>>>>>> Thanks Rafa.
>>> >>>>>>>
>>> >>>>>>> As an aside, is there an easy way to identify which version of
>>> >>>>>>> ManifoldCF you are on?
>>> >>>>>>>
>>> >>>>>>> Cheers
>>> >>>>>>>
>>> >>>>>>> *Paul Farrell*
>>> >>>>>>>
>>> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org> wrote:
>>> >>>>>>>
>>> >>>>>>> Hi Paul,
>>> >>>>>>>
>>> >>>>>>> All you need to do is install this webscript
>>> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>> >>>>>>> instance. The connector itself is already part of the most recent versions
>>> >>>>>>> of ManifoldCF.
>>> >>>>>>>
>>> >>>>>>> Cheers,
>>> >>>>>>> Rafa
>>> >>>>>>>
>>> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>>> >>>>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>>>
>>> >>>>>>>> Ok, thanks again guys.
>>> >>>>>>>>
>>> >>>>>>>> The Webscript connector it is.
>>> >>>>>>>>
>>> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>> >>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
>>> >>>>>>>> is a GitHub page here (
>>> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>>> >>>>>>>> which discusses it (although it directs you to a repository of files).
>>> >>>>>>>>
>>> >>>>>>>> I am just keen to make sure that any steps I follow to try and get
>>> >>>>>>>> this Webscript connector installed and working are updated, reliable steps.
>>> >>>>>>>> I would hate to waste time with out of date information.
>>> >>>>>>>>
>>> >>>>>>>> Thanks all
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com>
>>> >>>>>>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>> Hi Paul,
>>> >>>>>>>>
>>> >>>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl mentioned.
>>> >>>>>>>> Web services are slow compared to other services, and I've also checked
>>> >>>>>>>> that Alfresco's CMIS web services do not return a change token (maybe there
>>> >>>>>>>> is something that I don't know).
>>> >>>>>>>>
>>> >>>>>>>> By the way, the current version of the CMIS connector is not aware of the
>>> >>>>>>>> change token. I would write a patch for you if Alfresco supported the change
>>> >>>>>>>> token property.
>>> >>>>>>>>
>>> >>>>>>>> Thanks!
>>> >>>>>>>> Muhammed
>>> >>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> Hi Paul,
>>> >>>>>>>>>
>>> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>>> >>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>>> >>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>>> >>>>>>>>>
>>> >>>>>>>>> Hope that helps.
>>> >>>>>>>>>
>>> >>>>>>>>> Karl
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>> >>>>>>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>> Hi Muhammed/Karl,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
>>> >>>>>>>>>> very much appreciated.
>>> >>>>>>>>>>
>>> >>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
>>> >>>>>>>>>> connection. I have just read something which may shed a little light on
>>> >>>>>>>>>> this. The post read that change tokens are not passed via AtomPub
>>> >>>>>>>>>> connections (
>>> >>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>> >>>>>>>>>> change in Alfresco.
>>> >>>>>>>>>>
>>> >>>>>>>>>> It looks like I have two possible options left open to me
>>> >>>>>>>>>> (correct me if I’m wrong):
>>> >>>>>>>>>>
>>> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>> >>>>>>>>>> connection mechanism
>>> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>> >>>>>>>>>> above?)
>>> >>>>>>>>>>
>>> >>>>>>>>>> Thanks again,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Paul
>>> >>>>>>>>>>
>>> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com>
>>> >>>>>>>>>> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>> Hi Paul,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Repositories should give information to ManifoldCF when documents are
>>> >>>>>>>>>> updated. The current CMIS connector reindexes a document only if the latest
>>> >>>>>>>>>> version of the document has changed, not whenever it is updated.
>>> >>>>>>>>>>
>>> >>>>>>>>>> There is a change token property in the CMIS specification, and it
>>> >>>>>>>>>> should change when a document is updated so that ManifoldCF can tell that
>>> >>>>>>>>>> the document has been updated, but implementing the change token property
>>> >>>>>>>>>> is optional. I've checked Alfresco's CMIS web site and seen that they don't
>>> >>>>>>>>>> set the change token.
>>> >>>>>>>>>>
>>> >>>>>>>>>> I think, there is nothing we can do at this point.
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>> Hi Paul,
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>> >>>>>>>>>>> document version string the connector constructs should be adequate to
>>> >>>>>>>>>>> detect all changes.  Can you create a ticket?
>>> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>> >>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>> >>>>>>>>>>> and forth before I can determine that for sure.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
>>> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
>>> >>>>>>>>>>> although there have been issues reported having to do with running it on
>>> >>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what the problem is
>>> >>>>>>>>>>> there; maybe a version dependency of some kind.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Karl
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>> >>>>>>>>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> Hi Everyone,
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Hoping someone may be able to advise.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
>>> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> All is going well apart from, what I would call, the
>>> >>>>>>>>>>>> ‘incremental crawl’.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> The main issue I am having is that the modification of a
>>> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in next
>>> >>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
>>> >>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
>>> >>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
>>> >>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
>>> >>>>>>>>>>>> see the document in the local search engine.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>>> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>> >>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>> >>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>> >>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>> >>>>>>>>>>>> whatever internal record it has for this item.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Any ideas?
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Many thanks.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>
>>> >
>>> 
>>> 
>>> 
>> 
>> 
> 
> 


Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hi Paul,

I can't answer that question until I know how you've deployed things.  I'm
presuming that you are using a multiprocess deployment?  If so, for the web
applications, recycling the application server should be sufficient, but
you really want to check to be sure what properties.xml file the
application server is pointing at, so you change the jar in the right
place.  In a multiprocess setup, there are also agents processes (at least
one), which you would also need to cycle.

Thanks,
Karl
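
[Editorial aside, not part of Karl's reply.] One way to confirm which jar copies exist under a deployment, and whether any of them still carries the old path, is to scan every archive for the string. The following is a minimal sketch only: the function name jars_containing and the search string "api/node" are illustrative assumptions, and it presumes the path is stored as a plain ASCII string constant in a compiled class (a path assembled from smaller pieces at runtime would not be found this way).

```python
import zipfile
from pathlib import Path


def jars_containing(root, needle):
    """Return sorted paths of .jar files under root with an entry containing needle.

    Hypothetical helper for illustration; it is not part of ManifoldCF.
    """
    target = needle.encode("utf-8")
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                # read() decompresses each entry, so string constants inside
                # compiled classes become searchable as raw bytes.
                if any(target in zf.read(name) for name in zf.namelist()):
                    hits.append(jar)
        except zipfile.BadZipFile:
            # Not a readable archive; skip it rather than abort the scan.
            continue
    return hits
```

Calling jars_containing(".", "api/node") from the ManifoldCF install directory would list every jar that still contains the string. More than one hit, or a hit outside the connector-lib directory named in properties.xml, would suggest that a stale copy is being loaded.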


On Wed, Oct 21, 2015 at 9:41 AM, Paul Farrell <pf...@funnelback.com>
wrote:

> Thanks Karl.
>
> Can you clarify what you mean by ‘recycle Manifold processes’? My fallback
> position in anything like this is to restart whatever app/web server is
> hosting Manifold. Is that not sufficient?
>
> As for this path being defined elsewhere, I have just finished
> constructing a one-liner that lets me search through the classes within
> jars. Quite useful:
>
> find . -iname '*.jar' -printf "unzip -c %p | grep -q 'stringToSearchFor' && echo %p\n" | sh
>
> Going to see if that original ‘api/node’ string exists anywhere else.
>
> Cheers
>
>
>
>
> On 21 Oct 2015, at 14:36, Karl Wright <da...@gmail.com> wrote:
>
> Hi Paul,
>
> The indexer jar should appear in only one place, in the connector-lib
> directory that is referenced by your properties.xml file.  However, if you
> replace that, you will need to recycle all ManifoldCF processes or they
> will not be able to pick it up.
>
> I would also check the URL that's being logged to be sure it matches the
> pattern Maurizio pointed out.  If it doesn't, there's a possibility that
> some other place in the connector has a similar problem that hasn't been
> fixed.
>
> Thanks,
> Karl
>
>
> On Wed, Oct 21, 2015 at 8:48 AM, Paul Farrell <pf...@funnelback.com>
> wrote:
>
>> Hi Karl/Maurizio,
>>
>> I have a very very odd circumstance at present. This may or may not be
>> related to the Alfresco WebScript plugin OR the environment in which I am
>> running Manifold but thought I would raise the question.
>>
>> I have cloned the repo for the Alfresco Webscript connector and can see
>> that there is an ‘alfresco-indexer-client.jar’ file in the ‘target’
>> directory.
>>
>> I have taken that jar and have replaced the jar that existed in the
>> Manifold instance. This was at a path called
>> ‘apache-manifoldcf/connector-lib’. This path is referenced in an
>> ‘mcf-properties.xml’ file which may or may not be specific to our
>> environment.
>>
>> Anyway, as I say I have replaced the existing jar but the strangest thing
>> is that the same path is being used when I ‘Save’ the repository
>> connection. In other words, the path ‘….api/node…’ is still being used
>> despite the jar file saying otherwise.
>>
>> NOTE: the way I am testing this is to apply the jar, restart Jetty (our
>> app server), open Manifold, navigate to the Alfresco WebScript Repository
>> connection, hit ‘Save’ and then open the ‘manifold.log’ file. It is in this
>> file that I see the HTTP request and the 404 error. It is in this HTTP
>> request that it stipulates the path it is using - the old path.
>>
>> —
>>
>> I have even gone to the extreme of removing this jar file and restarting
>> the app server to see if this jar is ignored by Manifold. If I do this
>> Manifold does not even start so it is clearly expecting that jar to exist.
>> This is even more strange. It is clearly reliant on the jar but it is not
>> using the content of that jar.
>>
>> Can I ask if you guys can think of any reason at all that this might be
>> happening. It is starting to drive me mad!
>>
>> Thanks
>>
>>
>>
>>
>> On 21 Oct 2015, at 02:23, Karl Wright <da...@gmail.com> wrote:
>>
>> Hi Paul,
>> Looking at Issue 3, I think that Maurizio has indeed pointed you in the
>> right direction.  Can you check your version of the plugin to be sure that
>> /api/node/ is NOT present in the described line of code?
>>
>> Karl
>>
>>
>> On Tue, Oct 20, 2015 at 5:00 PM, <pf...@funnelback.com> wrote:
>>
>>> Hi Maurizio,
>>>
>>> I will be available all day tomorrow (Wednesday) to help out as much as
>>> I can. If it's possible for you to look into this I can take whatever steps
>>> you need.
>>>
>>> Many thanks,
>>>
>>> Paul
>>>
>>> -----Original Message-----
>>> From: "Karl Wright" <da...@gmail.com>
>>> Sent: Tuesday, October 20, 2015 12:34pm
>>> To: "user@manifoldcf.apache.org" <us...@manifoldcf.apache.org>
>>> Subject: Re: Manifold/Alfresco seeding and security
>>>
>>> Hi Maurizio,
>>>
>>> This is the third time we've seen this; can you use Paul's help to chase
>>> down what the issue is?
>>>
>>> Karl
>>>
>>>
>>> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pf...@funnelback.com>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > I am using Alfresco Community 5.0.
>>> >
>>> > Having taken that AMP file (version 0.7.1) and then installed it into
>>> > Alfresco and restarted the services, the issue is still present.
>>> >
>>> > I suspect that this is probably more to do with the Manifold end than
>>> the
>>> > Alfresco end. It seems it is Manifold that is automatically appending
>>> the
>>> > “/api/node” string into the path whenever I use “/alfresco/service” as
>>> the
>>> > Context in the repository connection configuration.
>>> >
>>> > If it is of interest, this is the output in the manifoldcf.log file
>>> when I
>>> > use the repo connection config I mentioned earlier.
>>> >
>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
>>> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased:
>>> [id:
>>> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>> > allocated: 1 of 2; total allocated: 1 of 20]
>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection
>>> {}->
>>> > http://54.165.85.140:8080
>>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
>>> > 54.165.85.140:8080
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection
>>> established
>>> > 172.31.23.90:58712<->54.165.85.140:8080
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
>>> > UNCHALLENGED
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> GET
>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Accept: application/json
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Host: 54.165.85.140:8080
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Connection: Keep-Alive
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > Accept-Encoding: gzip,deflate
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> "GET
>>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Accept: application/json[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Host: 54.165.85.140:8080[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Connection: Keep-Alive[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "Accept-Encoding: gzip,deflate[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>>> > "[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "HTTP/1.1 404 Not Found[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Server: Apache-Coyote/1.1[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Cache-Control: no-cache[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Pragma: no-cache[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Transfer-Encoding: chunked[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "630[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
>>> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> > <head>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >    <title>Web Script Status 404 - Not Found</title>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
>>> > type="text/css" />[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> > </head>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> > <body>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >    <div>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
>>> > alt="Alfresco" /></td>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >             <td><span class="title">Web Script Status 404 - Not
>>> > Found</span></td>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          </tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       </table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <br/>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td>The Web Script <a
>>> > href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       </table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <br/>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       <table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>404 Description:</b></td><td> Requested resource
>>> > is not available.</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td> </td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>Message:</b></td><td>Cannot find object for
>>> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
>>> > schema 8,001</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47
>>> > PM</td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td></td><td> </td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >          <tr><td><b>Diagnostics</b>:</td><td><a
>>> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
>>> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >       </table>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> >    </div>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>>> > </body>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "</html>[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "[\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > "[\r][\n]"
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > HTTP/1.1 404 Not Found
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Server: Apache-Coyote/1.1
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Cache-Control: no-cache
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Pragma: no-cache
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Content-Type: text/html;charset=UTF-8
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Transfer-Encoding: chunked
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>>> > Date: Tue, 20 Oct 2015 16:18:47 GMT
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
>>> > alive indefinitely
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>>> > Shutdown connection
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
>>> > connection
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
>>> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>>> > allocated: 0 of 2; total allocated: 0 of 20]
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
>>> > shutting down
>>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
>>> > down
>>> >
>>> > *Paul Farrell*
>>> > Senior Search Consultant
>>> >
>>> > 109-123 Clifton Street, London EC2A 4LD
>>> > *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
>>> >
>>> > *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>>> >
>>> > Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
>>> > Twitter <https://twitter.com/funnelback>
>>> >
>>> > Funnelback UK Ltd is a limited liability company registered in England &
>>> > Wales. Registered address: Zetland House 109-123, Clifton Street, London.
>>> > EC2A 4LD. Company registration number: 07004264.
>>> >
>>> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <ma...@apache.org> wrote:
>>> >
>>> > Hi Paul,
>>> >
>>> > it looks like you're hitting
>>> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
>>> > alfresco-indexer are you using? Can you try using
>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp
>>> > (or the pre-built WAR file -
>>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>>> > )
>>> >
>>> > HTH
>>> >   mao
>>> >
>>> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pf...@funnelback.com>
>>> > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> Having had to go back to basics and re-install my Alfresco instance, I
>>> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
>>> >> actually install without error. There must have been an issue with my
>>> >> previous Alfresco instance.
>>> >>
>>> >> Having said that, the Alfresco WebScript connector fails. The failure is
>>> >> down to the ‘Context’ setting (see below):
>>> >>
>>> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
>>> >>
>>> >> When you attempt to save the configuration of the WebScript connector,
>>> >> Manifold clearly tries to check the connection. It seems to do this by
>>> >> making an API call (/auth/resolve/admin). The issue is with what Manifold
>>> >> prepends to the start of that path.
>>> >> If I leave the setting as above then Manifold reports:
>>> >>
>>> >> <tr><td>The Web Script <a
>>> >> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>>> >> has responded with a status of 404 - Not Found.</td></tr>[\n]"
>>> >>
>>> >> In other words, it builds the full path as
>>> >> “/alfresco/service/api/node/auth/resolve/admin”.
>>> >>
>>> >> For my Alfresco Community 5.0 instance, I get to that same web script via
>>> >> the URL “/alfresco/service/auth/resolve/admin”, i.e. without the
>>> >> ‘/api/node’.
>>> >>
>>> >> Somewhere, Manifold is assuming that ‘/api/node’ is a correct part of the
>>> >> path. In other words, there is nothing I can put into that box to
>>> >> prevent it.
>>> >>
>>> >> Paul
>>> >>
>>> >> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
>>> >>
>>> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
>>> >> feel certain he'd want to know.
>>> >>
>>> >> Karl
>>> >>
>>> >>
>>> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com>
>>> >> wrote:
>>> >>
>>> >>> Hi guys,
>>> >>>
>>> >>> Just to let you know what’s going on - for informational purposes more
>>> >>> than anything.
>>> >>>
>>> >>> I initially tried taking the AMP file provided in the MCF plugins
>>> >>> directory (0.7.0) and tried to install it into Alfresco but got a message
>>> >>> saying a file was missing.
>>> >>>
>>> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>>> >>> project and then built it on my local machine. This generated the AMP file
>>> >>> (0.7.2).
>>> >>>
>>> >>> I was able to successfully install the AMP file onto my Alfresco
>>> >>> instance.
>>> >>>
>>> >>> As it happens I now cannot log into Alfresco Share ('bad credentials or
>>> >>> server not available' message) but that is something I can work on.
>>> >>> Apparently the installation of some AMP files has been known to cause this
>>> >>> issue.
>>> >>>
>>> >>> So, progress to a point!
>>> >>>
>>> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
>>> >>>
>>> >>> Hi,
>>> >>>
>>> >>> At the Alfresco side, hope this helps:
>>> >>>
>>> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>> >>>
>>> >>> Cheers
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com> wrote:
>>> >>>
>>> >>>> The AMP file is actually shipped as part of the binary MCF
>>> >>>> distribution.  You can find it under "plugins".
>>> >>>>
>>> >>>> Karl
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com>
>>> >>>> wrote:
>>> >>>>
>>> >>>>> Hi all,
>>> >>>>>
>>> >>>>> Hopefully this will be my only request for information today.
>>> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
>>> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
>>> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
>>> >>>>>
>>> >>>>> I realise that this is slightly outside of the Manifold remit but I
>>> >>>>> wondered if anyone can advise how I build the AMP file from the URL
>>> >>>>> (https://github.com/maoo/alfresco-indexer)? I have cloned the
>>> >>>>> repository to my local drive but, having never worked with Maven, am at a
>>> >>>>> loss as to how to generate the AMP file that I then need to install into
>>> >>>>> Alfresco.
>>> >>>>>
>>> >>>>> Many thanks,
>>> >>>>>
>>> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>>> >>>>>
>>> >>>>> The only way you can have such a reduced list of connectors is if
>>> >>>>> somebody commented out many connectors in your connectors.xml, or removed
>>> >>>>> them from the database table where they are registered by hand.
>>> >>>>>
>>> >>>>> Karl
>>> >>>>>
>>> >>>>>
>>> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>>> >>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>
>>> >>>>>> After a good deal of time clicking around I came to the same
>>> >>>>>> conclusion - that there is no way of telling from the UI!!
>>> >>>>>>
>>> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in
>>> >>>>>> the ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>> >>>>>>
>>> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line:
>>> >>>>>>
>>> >>>>>> <repositoryconnector name="Alfresco Webscript"
>>> >>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>> >>>>>>
>>> >>>>>> You can imagine my excitement!
>>> >>>>>>
>>> >>>>>> The only thing I am missing is the option in the UI. When I click to
>>> >>>>>> create a new repo connection I get: CMIS, Dropbox, Generic, GoogleDrive,
>>> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>> >>>>>>
>>> >>>>>> Perhaps it is too much to hope that I can make a simple
>>> >>>>>> change to enable this repo connection?
>>> >>>>>>
>>> >>>>>> Thanks for all the help everyone
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>>> >>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
>>> >>>>>> connection types, you've got a version that supports that connector.
>>> >>>>>>
>>> >>>>>> Thanks,
>>> >>>>>> Karl
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>>> >>>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>>
>>> >>>>>>> Thanks Rafa.
>>> >>>>>>>
>>> >>>>>>> As an aside, is there an easy way to identify which version of
>>> >>>>>>> ManifoldCF you are on?
>>> >>>>>>>
>>> >>>>>>> Cheers
>>> >>>>>>>
>>> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>>> >>>>>>>
>>> >>>>>>> Hi Paul,
>>> >>>>>>>
>>> >>>>>>> All you need to do is to install this webscript
>>> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>> >>>>>>> instance. The connector itself is already part of the most recent
>>> >>>>>>> versions of ManifoldCF.
>>> >>>>>>>
>>> >>>>>>> Cheers,
>>> >>>>>>> Rafa
>>> >>>>>>>
>>> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>>> >>>>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>>>
>>> >>>>>>>> Ok, thanks again guys.
>>> >>>>>>>>
>>> >>>>>>>> The Webscript connector it is.
>>> >>>>>>>>
>>> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>> >>>>>>>> guidelines on how to get this Webscript connector installed?  I see
>>> >>>>>>>> there is a GitHub page here (
>>> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>>> >>>>>>>> which discusses it (although it directs you to a repository of files).
>>> >>>>>>>>
>>> >>>>>>>> I am just keen to make sure that any steps I follow to try and get
>>> >>>>>>>> this Webscript connector installed and working are up-to-date, reliable
>>> >>>>>>>> steps. I would hate to waste time with out-of-date information.
>>> >>>>>>>>
>>> >>>>>>>> Thanks all
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com>
>>> >>>>>>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>> Hi Paul,
>>> >>>>>>>>
>>> >>>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl
>>> >>>>>>>> mentioned. Web services are very slow compared to the other services,
>>> >>>>>>>> and I've also checked that the Alfresco CMIS web services do not return
>>> >>>>>>>> a change token (maybe there is something that I don't know).
>>> >>>>>>>>
>>> >>>>>>>> By the way, the current version of the CMIS connector is not aware of
>>> >>>>>>>> the change token. I would write a patch for you if Alfresco supported
>>> >>>>>>>> the change token property.
>>> >>>>>>>>
>>> >>>>>>>> Thanks!
>>> >>>>>>>> Muhammed
>>> >>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> Hi Paul,
>>> >>>>>>>>>
>>> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>>> >>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>>> >>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>>> >>>>>>>>>
>>> >>>>>>>>> Hope that helps.
>>> >>>>>>>>>
>>> >>>>>>>>> Karl
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>> >>>>>>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>> Hi Muhammed/Karl,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
>>> >>>>>>>>>> very much appreciated.
>>> >>>>>>>>>>
>>> >>>>>>>>>> Currently I am using AtomPub for my CMIS repository
>>> >>>>>>>>>> connection. I have just read something which may shed a little light
>>> >>>>>>>>>> on this. The post says that change tokens are not passed via AtomPub
>>> >>>>>>>>>> connections (
>>> >>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758
>>> >>>>>>>>>> ).
>>> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to detect a
>>> >>>>>>>>>> change in Alfresco.
>>> >>>>>>>>>>
>>> >>>>>>>>>> It looks like I have two possible options left open to me
>>> >>>>>>>>>> (correct me if I’m wrong):
>>> >>>>>>>>>>
>>> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>> >>>>>>>>>> connection mechanism
>>> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection
>>> >>>>>>>>>> mentioned above?)
>>> >>>>>>>>>>
>>> >>>>>>>>>> Thanks again,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Paul
>>> >>>>>>>>>>
>>> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com>
>>> >>>>>>>>>> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>> Hi Paul,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Repositories should give information to ManifoldCF when they are
>>> >>>>>>>>>> updated. The current CMIS connector reindexes a document only if the
>>> >>>>>>>>>> latest version of the document has changed, not when the document has
>>> >>>>>>>>>> merely been updated.
>>> >>>>>>>>>>
>>> >>>>>>>>>> There is a change token property in the CMIS specification, and it
>>> >>>>>>>>>> should change when a document is updated so that ManifoldCF can tell
>>> >>>>>>>>>> that the document was updated, but implementing the change token
>>> >>>>>>>>>> property is optional. I've checked Alfresco's CMIS web site and seen
>>> >>>>>>>>>> that they didn't set the change token.
>>> >>>>>>>>>>
>>> >>>>>>>>>> I think there is nothing we can do at this point.
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>> Hi Paul,
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>> >>>>>>>>>>> document version string the connector constructs should be adequate
>>> >>>>>>>>>>> to detect all changes.  Can you create a ticket?
>>> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may in
>>> >>>>>>>>>>> fact be a bug in the Alfresco CMIS implementation, but we'll have to
>>> >>>>>>>>>>> have some back and forth before I can determine that for sure.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
>>> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
>>> >>>>>>>>>>> although there have been issues reported having to do with running it
>>> >>>>>>>>>>> on some configurations of Alfresco.  I'm not entirely sure what the
>>> >>>>>>>>>>> problem is there; maybe a version dependency of some kind.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Karl
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>> >>>>>>>>>>> pfarrell@funnelback.com> wrote:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> Hi Everyone,
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Hoping someone may be able to advise.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
>>> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> All is going well apart from, what I would call, the
>>> >>>>>>>>>>>> ‘incremental crawl’.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> The main issue I am having is that the modification of a
>>> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in
>>> >>>>>>>>>>>> next Manifold crawl. As an example I have a document ‘TestDoc1’ which
>>> >>>>>>>>>>>> has user A and B as Consumers. I run a crawl in Manifold and it picks
>>> >>>>>>>>>>>> up the documents fine.  The security is set as expected. I then
>>> >>>>>>>>>>>> remove ‘User A’ from the security of that document and re-run the
>>> >>>>>>>>>>>> Manifold crawl. User A can still see the document in the local search
>>> >>>>>>>>>>>> engine.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>>> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go
>>> >>>>>>>>>>>> into the Output Connections, edit and save the relevant output
>>> >>>>>>>>>>>> connection and then click ‘Remove all associated documents’, the next
>>> >>>>>>>>>>>> time I crawl, the changes are picked up. It is clear that Manifold is
>>> >>>>>>>>>>>> just not updating whatever internal record it has for this item.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Any ideas?
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Many thanks.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>
>>> >
>>>
>>>
>>>
>>
>>
>
>

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Thanks Karl.

Can you clarify what you mean by ‘recycle Manifold processes’? My fallback position in anything like this is to restart whatever app/web server is hosting Manifold. Is that not sufficient?

As for this path being defined elsewhere, I have just finished constructing a one-liner that lets me search through the classes within jars. Quite useful:

find . -iname '*.jar' -exec sh -c 'unzip -c "$1" | grep -q "stringToSearchFor" && echo "$1"' _ {} \;

Going to see if that original ‘api/node’ string exists anywhere else. 
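
For machines where a shell pipeline is awkward, the same jar search can be sketched in Python. This is only an illustration: the function name jars_containing is made up for this example, and it assumes the string of interest is plain ASCII, so that it appears as raw bytes inside the compiled classes. Unlike -iname it matches the .jar extension case-sensitively on most systems, and it does not look inside jars nested within other jars.

```python
import zipfile
from pathlib import Path


def jars_containing(root, needle):
    """Return paths of .jar files under root that have an entry whose bytes contain needle.

    Hypothetical helper for illustration; not part of ManifoldCF or alfresco-indexer.
    """
    needle_bytes = needle.encode("utf-8")
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                # Read each entry fully and look for the raw byte sequence.
                if any(needle_bytes in archive.read(name) for name in archive.namelist()):
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that are named .jar but are not valid zip archives.
            continue
    return hits
```

Calling jars_containing(".", "api/node") from the Manifold install directory would list every jar that still carries the old path fragment.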

Cheers



> On 21 Oct 2015, at 14:36, Karl Wright <da...@gmail.com> wrote:
> 
> Hi Paul,
> 
> The indexer jar should appear in only one place, in the connector-lib directory that is referenced by your properties.xml file.  However, if you replace that, you will need to recycle all ManifoldCF processes or they will not be able to pick it up.
> 
> I would also check the URL that's being logged to be sure it matches the pattern Maurizio pointed out.  If it doesn't, there's a possibility that some other place in the connector has a similar problem that hasn't been fixed.
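
One way to do that check is sketched below in Python. It assumes the log contains HttpClient debug lines of the form "Executing request GET <path> HTTP/1.1", as in the excerpt quoted further down this thread (unwrapped onto one line). The function names are invented for this example.

```python
import re

# Matches HttpClient debug lines such as:
#   ... - Executing request GET /alfresco/service/auth/resolve/admin HTTP/1.1
REQUEST_RE = re.compile(r"Executing request (\w+) (\S+) HTTP/\d\.\d")


def requested_paths(log_lines):
    """Yield (method, path) for every 'Executing request' line found."""
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            yield match.group(1), match.group(2)


def paths_with_segment(log_lines, segment="/api/node/"):
    """Return the request paths that still contain the given segment."""
    return [path for _, path in requested_paths(log_lines) if segment in path]
```

Passing the open log file to paths_with_segment (for example, with open("manifoldcf.log") as f: paths_with_segment(f)) returns an empty list once no request goes through /api/node/ any more.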
> 
> Thanks,
> Karl
> 
> 
> On Wed, Oct 21, 2015 at 8:48 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
> Hi Karl/Maurizio,
> 
> I have a very very odd circumstance at present. This may or may not be related to the Alfresco WebScript plugin OR the environment in which I am running Manifold but thought I would raise the question. 
> 
> I have cloned the repo for the Alfresco Webscript connector and can see that there is an ‘alfresco-indexer-client.jar’ file in the ‘target’ directory. 
> 
> I have taken that jar and have replaced the jar that existed in the Manifold instance. This was at a path called ‘apache-manifoldcf/connector-lib’. This path is referenced in an ‘mcf-properties.xml’ file which may or may not be specific to our environment. 
> 
> Anyway, as I say I have replaced the existing jar but the strangest thing is that the same path is being used when I ‘Save’ the repository connection. In other words, the path ‘….api/node…’ is still being used despite the jar file saying otherwise. 
> 
> NOTE: the way I am testing this is to apply the jar, restart Jetty (our app server), open Manifold, navigate to the Alfresco WebScript Repository connection, hit ‘Save’ and then open the ‘manifold.log’ file. It is in this file that I see the HTTP request and the 404 error. It is in this HTTP request that it stipulates the path it is using - the old path. 
> 
> —
> 
> I have even gone to the extreme of removing this jar file and restarting the app server to see if this jar is ignored by Manifold. If I do this Manifold does not even start so it is clearly expecting that jar to exist. This is even more strange. It is clearly reliant on the jar but it is not using the content of that jar. 
> 
> Can I ask if you guys can think of any reason at all that this might be happening? It is starting to drive me mad!
> 
> Thanks
> 
> 
> 
> 
>> On 21 Oct 2015, at 02:23, Karl Wright <daddywri@gmail.com> wrote:
>> 
>> Hi Paul,
>> Looking at Issue 3, I think that Maurizio has indeed pointed you in the right direction.  Can you check your version of the plugin to be sure that /api/node/ is NOT present in the described line of code?
>> 
>> Karl
>> 
>> 
>> On Tue, Oct 20, 2015 at 5:00 PM, <pfarrell@funnelback.com> wrote:
>> Hi Maurizio,
>> 
>> I will be available all day tomorrow (Wednesday) to help out as much as I can. If it's possible for you to look into this I can take whatever steps you need.
>> 
>> Many thanks,
>> 
>> Paul
>> 
>> -----Original Message-----
>> From: "Karl Wright" <daddywri@gmail.com>
>> Sent: Tuesday, October 20, 2015 12:34pm
>> To: "user@manifoldcf.apache.org" <user@manifoldcf.apache.org>
>> Subject: Re: Manifold/Alfresco seeding and security
>> 
>> Hi Maurizio,
>> 
>> This is the third time we've seen this; can you use Paul's help to chase
>> down what the issue is?
>> 
>> Karl
>> 
>> 
>> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pfarrell@funnelback.com>
>> wrote:
>> 
>> > Hi,
>> >
>> > I am using Alfresco Community 5.0.
>> >
>> > Having taken that AMP file (version 0.7.1) and then installed it into
>> > Alfresco and restarted the services, the issue is still present.
>> >
>> > I suspect that this is probably more to do with the Manifold end than the
>> > Alfresco end. It seems it is Manifold that is automatically appending the
>> > “/api/node” string into the path whenever I use “/alfresco/service” as the
>> > Context in the repository connection configuration.
>> >
>> > If it is of interest, this is the output in the manifoldcf.log file when I
>> > use the repo connection config I mentioned earlier.
>> >
>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
>> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>> > allocated: 0 of 2; total allocated: 0 of 20]
>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
>> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>> > allocated: 1 of 2; total allocated: 1 of 20]
>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
>> > http://54.165.85.140:8080
>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
>> > 54.165.85.140:8080
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
>> > 172.31.23.90:58712<->54.165.85.140:8080
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
>> > UNCHALLENGED
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Accept: application/json
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Host: 54.165.85.140:8080
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Connection: Keep-Alive
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Accept-Encoding: gzip,deflate
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET
>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Accept: application/json[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Host: 54.165.85.140:8080[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Connection: Keep-Alive[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Accept-Encoding: gzip,deflate[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "HTTP/1.1 404 Not Found[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Server: Apache-Coyote/1.1[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Cache-Control: no-cache[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Pragma: no-cache[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Transfer-Encoding: chunked[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "630[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
>> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> > <head>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >    <title>Web Script Status 404 - Not Found</title>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
>> > type="text/css" />[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> > </head>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> > <body>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >    <div>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
>> > alt="Alfresco" /></td>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >             <td><span class="title">Web Script Status 404 - Not
>> > Found</span></td>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          </tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       </table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <br/>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td>The Web Script <a
>> > href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       </table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <br/>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>404 Description:</b></td><td> Requested resource is not
>> > available.</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td> </td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>Message:</b></td><td>Cannot find object for
>> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
>> > schema 8,001</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td></td><td> </td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>Diagnostics</b>:</td><td><a
>> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
>> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       </table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >    </div>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> > </body>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "</html>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > HTTP/1.1 404 Not Found
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Server: Apache-Coyote/1.1
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Cache-Control: no-cache
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Pragma: no-cache
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Content-Type: text/html;charset=UTF-8
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Transfer-Encoding: chunked
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Date: Tue, 20 Oct 2015 16:18:47 GMT
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
>> > alive indefinitely
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>> > Shutdown connection
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
>> > connection
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
>> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>> > allocated: 0 of 2; total allocated: 0 of 20]
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
>> > shutting down
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
>> > down
>> >
>> > *Paul Farrell*
>> > Senior Search Consultant
>> >
>> > 109-123 Clifton Street, London EC2A 4LD
>> > *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
>> >
>> > *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>> >
>> > Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
>> > Twitter <https://twitter.com/funnelback>
>> >
>> > Funnelback UK Ltd is a limited liability company registered in England &
>> > Wales. Registered address: Zetland House 109-123, Clifton Street, London.
>> > EC2A 4LD. Company registration number: 07004264.
>> >
>> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <maoo@apache.org> wrote:
>> >
>> > Hi Paul,
>> >
>> > it looks like you're hitting
>> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
>> > alfresco-indexer are you using? Can you try using
>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp (or
>> > the pre-built WAR file -
>> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>> >  )
>> >
>> > HTH
>> >   mao
>> >
>> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pfarrell@funnelback.com>
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> Having had to go back to basics and re-install my Alfresco instance, I
>> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
>> >> actually install without error. There must have been an issue with my
>> >> previous Alfresco instance.
>> >>
>> >> Having said that, the Alfresco WebScript connector fails. The failure is
>> >> down to the ‘Context’ setting (see below):
>> >>
>> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
>> >>
>> >> When you attempt to save the configuration of the WebScript connector,
>> >> Manifold clearly tries to check the connection. It seems to do this by
>> >> making an API call (/auth/resolve/admin). The issue is with what Manifold
>> >> prepends to the start of that path.
>> >> If I leave the setting as above then Manifold reports   :
>> >>
>> >> <tr><td>The Web Script <a
>> >> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>> >> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>> >>
>> >> In other words, it builds the full path as
>> >> “alfresco/service/api/node/auth/resolve/admin”.
>> >>
>> >> For my Alfresco Community 5.0 instance, I get to that same web script via
>> >> the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
>> >>
>> >> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
>> >> inclusion. In other words, there is nothing I can put into that box to
>> >> prevent it.
>> >>
>> >> Paul
>> >>
>> >> On 20 Oct 2015, at 12:56, Karl Wright <daddywri@gmail.com> wrote:
>> >>
>> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
>> >> feel certain he'd want to know.
>> >>
>> >> Karl
>> >>
>> >>
>> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com>
>> >> wrote:
>> >>
>> >>> Hi guys,
>> >>>
>> >>> Just to let you know what’s going on - for informational purposes more
>> >>> than anything.
>> >>>
>> >>> I initially tried taking the AMP file provided in the MCF plugins
>> >>> directory (0.7.0) and tried to install it into Alfresco but got a message
>> >>> saying a file was missing.
>> >>>
>> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>> >>> project and then built it on my local machine. This generated the AMP file
>> >>> (0.7.2).
>> >>>
>> >>> I was able to successfully install the AMP file onto my Alfresco
>> >>> instance.
>> >>>
>> >>> As it happens I now cannot log into Alfresco Share ('bad credentials or
>> >>> server not available' message) but that is something I can work on.
>> >>> Apparently the installation of some AMP files has been known to cause this
>> >>> issue.
>> >>>
>> >>> So, progress to a point!
>> >>>
>> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> At the Alfresco side, hope this helps:
>> >>>
>> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>> >>>
>> >>> Cheers
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>> >>>
>> >>>> The AMP file is actually shipped as part of the binary MCF
>> >>>> distribution.  You can find it under "plugins".
>> >>>>
>> >>>> Karl
>> >>>>
>> >>>>
>> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> Hopefully this will be my only request for information today.
>> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
>> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
>> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
>> >>>>>
>> >>>>> I realise that this is slightly outside of the Manifold remit but I
>> >>>>> wondered if anyone can advise how I build the AMP file from the URL (
>> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>> >>>>> repository to my local drive but, having never worked with Maven, am at a
>> >>>>> loss at how to generate the AMP file that I then need to install into
>> >>>>> Alfresco.
>> >>>>>
>> >>>>> Many thanks,
>> >>>>>
>> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com> wrote:
>> >>>>>
>> >>>>> The only way you can have such a reduced list of connectors is if
>> >>>>> somebody commented out many connectors in your connectors.xml, or removed
>> >>>>> them from the database table where they are registered by hand.
>> >>>>>
>> >>>>> Karl
>> >>>>>
>> >>>>>
>> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>> >>>>> pfarrell@funnelback.com> wrote:
>> >>>>>
>> >>>>>> After a good deal of time clicking around I came to the same
>> >>>>>> conclusion - that there is no way of telling from the UI!!
>> >>>>>>
>> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>> >>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>> >>>>>>
>> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>> >>>>>>
>> >>>>>> <repositoryconnector name="Alfresco Webscript"
>> >>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>> >>>>>>
>> >>>>>> You can imagine my excitement!
>> >>>>>>
>> >>>>>> The only thing I am missing is the option in the UI. When I click to
>> >>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>> >>>>>>
>> >>>>>> Perhaps I am hoping for too much to hope that I can make a simple
>> >>>>>> change to enable this repo connection?
>> >>>>>>
>> >>>>>> Thanks for all the help everyone
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com> wrote:
>> >>>>>>
>> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>> >>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
>> >>>>>> connection types, you've got a version that supports that connector.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Karl
>> >>>>>>
>> >>>>>>
>> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>> >>>>>> pfarrell@funnelback.com> wrote:
>> >>>>>>
>> >>>>>>> Thanks Rafa.
>> >>>>>>>
>> >>>>>>> As an aside, is there an easy way to identify which version of
>> >>>>>>> ManifoldCF you are on?
>> >>>>>>>
>> >>>>>>> Cheers
>> >>>>>>>
>> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org> wrote:
>> >>>>>>>
>> >>>>>>> Hi Paul,
>> >>>>>>>
>> >>>>>>> All you need to do is to install this webscript
>> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>> >>>>>>> instance. The connector itself is already part of the most recent versions
>> >>>>>>> of ManifoldCF.
>> >>>>>>>
>> >>>>>>> Cheers,
>> >>>>>>> Rafa
>> >>>>>>>
>> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>> >>>>>>> pfarrell@funnelback.com> wrote:
>> >>>>>>>
>> >>>>>>>> Ok, thanks again guys.
>> >>>>>>>>
>> >>>>>>>> The Webscript connector it is.
>> >>>>>>>>
>> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>> >>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
>> >>>>>>>> is a GitHub page here (
>> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>> >>>>>>>> which discusses it (although it directs you to a repository of files).
>> >>>>>>>>
>> >>>>>>>> I am just keen to make sure that any steps I follow to try and get
>> >>>>>>>> this Webscript connector installed and working are updated, reliable steps.
>> >>>>>>>> I would hate to waste time with out of date information.
>> >>>>>>>>
>> >>>>>>>> Thanks all
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>> Hi Paul,
>> >>>>>>>>
>> >>>>>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned.
>> >>>>>>>> Web Services is very slow compared to the other services, and I've also
>> >>>>>>>> checked that Alfresco's CMIS Web Services do not return a change token
>> >>>>>>>> (maybe there is something that I don't know).
>> >>>>>>>>
>> >>>>>>>> By the way, the current version of the CMIS connector is not aware of the
>> >>>>>>>> change token. I would write a patch for you if Alfresco supported the
>> >>>>>>>> change token property.
>> >>>>>>>>
>> >>>>>>>> Thanks!
>> >>>>>>>> Muhammed
>> >>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>>> Hi Paul,
>> >>>>>>>>>
>> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>> >>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>> >>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>> >>>>>>>>>
>> >>>>>>>>> Hope that helps.
>> >>>>>>>>>
>> >>>>>>>>> Karl
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>> >>>>>>>>> pfarrell@funnelback.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi Muhammed/Karl,
>> >>>>>>>>>>
>> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
>> >>>>>>>>>> very much appreciated.
>> >>>>>>>>>>
>> >>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
>> >>>>>>>>>> connection. I have just read something which may shed a little light on
>> >>>>>>>>>> this. The post read that change tokens are not passed via AtomPub
>> >>>>>>>>>> connections (
>> >>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>> >>>>>>>>>> change in Alfresco.
>> >>>>>>>>>>
>> >>>>>>>>>> It looks like I have two possible options left open to me
>> >>>>>>>>>> (correct me if I’m wrong):
>> >>>>>>>>>>
>> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>> >>>>>>>>>> connection mechanism
>> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>> >>>>>>>>>> above?)
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks again,
>> >>>>>>>>>>
>> >>>>>>>>>> Paul
>> >>>>>>>>>>
>> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> Hi Paul,
>> >>>>>>>>>>
>> >>>>>>>>>> Repositories should give information to ManifoldCF when documents are
>> >>>>>>>>>> updated. The current CMIS connector reindexes a document only if the
>> >>>>>>>>>> latest version of the document has changed, not when it is merely updated.
>> >>>>>>>>>>
>> >>>>>>>>>> There is a change token property in the CMIS specification, and it
>> >>>>>>>>>> should change when a document is updated so that ManifoldCF can tell the
>> >>>>>>>>>> document was updated, but implementing the change token property is
>> >>>>>>>>>> optional. I've checked Alfresco's CMIS web site and seen that they don't
>> >>>>>>>>>> set the change token.
>> >>>>>>>>>>
>> >>>>>>>>>> I think, there is nothing we can do at this point.
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Hi Paul,
>> >>>>>>>>>>>
>> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>> >>>>>>>>>>> document version string the connector constructs should be adequate to
>> >>>>>>>>>>> detect all changes.  Can you create a ticket?
>> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>> >>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>> >>>>>>>>>>> and forth before I can determine that for sure.
>> >>>>>>>>>>>
>> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
>> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
>> >>>>>>>>>>> although there have been issues reported having to do with running it on
>> >>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what the problem is
>> >>>>>>>>>>> there; maybe a version dependency of some kind.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Karl
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>> >>>>>>>>>>> pfarrell@funnelback.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi Everyone,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Hoping someone may be able to advise.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
>> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> All is going well apart from, what I would call, the
>> >>>>>>>>>>>> ‘incremental crawl’.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The main issue I am having is that the modification of a
>> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in next
>> >>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
>> >>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
>> >>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
>> >>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
>> >>>>>>>>>>>> see the document in the local search engine.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>> >>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>> >>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>> >>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>> >>>>>>>>>>>> whatever internal record it has for this item.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Any ideas?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Many thanks.


Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hi Paul,

The indexer jar should appear in only one place, in the connector-lib
directory that is referenced by your properties.xml file.  However, if you
replace that, you will need to recycle all ManifoldCF processes or they
will not be able to pick it up.
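
If the old behaviour persists after the jar has been swapped, one thing worth ruling out is a second, stale copy of the jar somewhere else under the install. What follows is only a rough Python sketch of that check, not part of ManifoldCF: the function name `find_jars` and the `alfresco-indexer` file-name fragment are illustrative assumptions. It walks a directory tree and returns every matching jar with its SHA-256 digest, so that duplicate or differing copies stand out.

```python
import hashlib
import os


def find_jars(root, name_fragment="alfresco-indexer"):
    """Return sorted (path, sha256 hex digest) pairs for every *.jar file
    under `root` whose file name contains `name_fragment`.

    `name_fragment` is an illustrative default, not a ManifoldCF setting.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".jar") and name_fragment in filename:
                path = os.path.join(dirpath, filename)
                # Hash the file contents so two copies can be compared.
                with open(path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
                matches.append((path, digest))
    return sorted(matches)
```

Calling `find_jars` on the top-level install directory and finding more than one entry, or a digest that differs from the freshly built jar, would suggest that the processes are still loading an older copy.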

I would also check the URL that's being logged to be sure it matches the
pattern Maurizio pointed out.  If it doesn't, there's a possibility that
some other place in the connector has a similar problem that hasn't been
fixed.
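
As a small aid for that URL check, here is a hedged Python sketch that pulls the request paths out of the debug log and flags any that still contain `/api/node/`. It assumes each "Executing request GET ... HTTP/1.1" entry sits on a single line in the log file itself (the copies quoted in this thread were wrapped by the mail client). The helper names `requested_paths` and `uses_old_pattern` are made up for illustration.

```python
import re

# Matches the HttpClient debug line that records the request being sent,
# e.g. "... - Executing request GET /alfresco/service/... HTTP/1.1".
REQUEST_RE = re.compile(r"Executing request (?:GET|POST) (\S+) HTTP/1\.[01]")


def requested_paths(log_lines):
    """Return the request paths found in an iterable of log lines, in order."""
    paths = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths


def uses_old_pattern(path):
    """True if the path still carries the '/api/node/' segment that the
    thread identifies as the cause of the 404."""
    return "/api/node/" in path
```

Feeding the lines of the log file to `requested_paths` and filtering the result with `uses_old_pattern` shows at a glance whether the running connector is still building the old path after the jar swap.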

Thanks,
Karl


On Wed, Oct 21, 2015 at 8:48 AM, Paul Farrell <pf...@funnelback.com>
wrote:

> Hi Karl/Maurizio,
>
> I have a very very odd circumstance at present. This may or may not be
> related to the Alfresco WebScript plugin OR the environment in which I am
> running Manifold but thought I would raise the question.
>
> I have cloned the repo for the Alfresco Webscript connector and can see
> that there is an ‘alfresco-indexer-client.jar’ file in the ‘target’
> directory.
>
> I have taken that jar and have replaced the jar that existed in the
> Manifold instance. This was at a path called
> ‘apache-manifoldcf/connector-lib’. This path is referenced in an
> ‘mcf-properties.xml’ file which may or may not be specific to our
> environment.
>
> Anyway, as I say I have replaced the existing jar but the strangest thing
> is that the same path is being used when I ‘Save’ the repository
> connection. In other words, the path ‘….api/node…’ is still being used
> despite the jar file saying otherwise.
>
> NOTE: the way I am testing this is to apply the jar, restart Jetty (our
> app server), open Manifold, navigate to the Alfresco WebScript Repository
> connection, hit ‘Save’ and then open the ‘manifold.log’ file. It is in this
> file that I see the HTTP request and the 404 error. It is in this HTTP
> request that it stipulates the path it is using - the old path.
>
> —
>
> I have even gone to the extreme of removing this jar file and restarting
> the app server to see if this jar is ignored by Manifold. If I do this
> Manifold does not even start so it is clearly expecting that jar to exist.
> This is even more strange. It is clearly reliant on the jar but it is not
> using the content of that jar.
>
> Can I ask if you guys can think of any reason at all that this might be
> happening? It is starting to drive me mad!
>
> Thanks
>
>
>
>
> On 21 Oct 2015, at 02:23, Karl Wright <da...@gmail.com> wrote:
>
> Hi Paul,
> Looking at Issue 3, I think that Maurizio has indeed pointed you in the
> right direction.  Can you check your version of the plugin to be sure that
> /api/node/ is NOT present in the described line of code?
>
> Karl
>
>
> On Tue, Oct 20, 2015 at 5:00 PM, <pf...@funnelback.com> wrote:
>
>> Hi Maurizio,
>>
>> I will be available all day tomorrow (Wednesday) to help out as much as I
>> can. If it's possible for you to look into this I can take whatever steps
>> you need.
>>
>> Many thanks,
>>
>> Paul
>>
>> -----Original Message-----
>> From: "Karl Wright" <da...@gmail.com>
>> Sent: Tuesday, October 20, 2015 12:34pm
>> To: "user@manifoldcf.apache.org" <us...@manifoldcf.apache.org>
>> Subject: Re: Manifold/Alfresco seeding and security
>>
>> Hi Maurizio,
>>
>> This is the third time we've seen this; can you use Paul's help to chase
>> down what the issue is?
>>
>> Karl
>>
>>
>> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pf...@funnelback.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I am using Alfresco Community 5.0.
>> >
>> > Having taken that AMP file (version 0.7.1) and then installed it into
>> > Alfresco and restarted the services, the issue is still present.
>> >
>> > I suspect that this is probably more to do with the Manifold end than
>> the
>> > Alfresco end. It seems it is Manifold that is automatically appending
>> the
>> > “/api/node” string into the path whenever I use “/alfresco/service” as
>> the
>> > Context in the repository connection configuration.
>> >
>> > If it is of interest, this is the output in the manifoldcf.log file
>> when I
>> > use the repo connection config I mentioned earlier.
>> >
>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
>> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>> > allocated: 0 of 2; total allocated: 0 of 20]
>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased:
>> [id:
>> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
>> > allocated: 1 of 2; total allocated: 1 of 20]
>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection
>> {}->
>> > http://54.165.85.140:8080
>> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
>> > 54.165.85.140:8080
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
>> > 172.31.23.90:58712<->54.165.85.140:8080
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
>> > UNCHALLENGED
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> GET
>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Accept: application/json
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Host: 54.165.85.140:8080
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Connection: Keep-Alive
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > Accept-Encoding: gzip,deflate
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> "GET
>> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Accept: application/json[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Host: 54.165.85.140:8080[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Connection: Keep-Alive[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "Accept-Encoding: gzip,deflate[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
>> > "[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "HTTP/1.1 404 Not Found[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Server: Apache-Coyote/1.1[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Cache-Control: no-cache[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Pragma: no-cache[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Transfer-Encoding: chunked[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "630[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
>> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> > <head>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >    <title>Web Script Status 404 - Not Found</title>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
>> > type="text/css" />[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> > </head>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> > <body>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >    <div>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
>> > alt="Alfresco" /></td>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >             <td><span class="title">Web Script Status 404 - Not
>> > Found</span></td>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          </tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       </table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <br/>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td>The Web Script <a
>> >
>> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       </table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <br/>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       <table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>404 Description:</b></td><td> Requested resource is
>> not
>> > available.</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td> </td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>Message:</b></td><td>Cannot find object for
>> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
>> > schema 8,001</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47
>> PM</td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td></td><td> </td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >          <tr><td><b>Diagnostics</b>:</td><td><a
>> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
>> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >       </table>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> >    </div>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>> > </body>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "</html>[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "[\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > "[\r][\n]"
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > HTTP/1.1 404 Not Found
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Server: Apache-Coyote/1.1
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Cache-Control: no-cache
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Pragma: no-cache
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Content-Type: text/html;charset=UTF-8
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Transfer-Encoding: chunked
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
>> > Date: Tue, 20 Oct 2015 16:18:47 GMT
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
>> > alive indefinitely
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>> > Shutdown connection
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
>> Close
>> > connection
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
>> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0;
>> route
>> > allocated: 0 of 2; total allocated: 0 of 20]
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
>> > shutting down
>> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager
>> shut
>> > down
>> >
>> > *Paul Farrell*
>> > Senior Search Consultant
>> >
>> > 109-123 Clifton Street, London EC2A 4LD
>> > *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
>> >
>> > *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>> >
>> > Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback>
>> -
>> > Twitter <https://twitter.com/funnelback>
>> >
>> > Funnelback UK Ltd is a limited liability company registered in England &
>> > Wales. Registered address: Zetland House 109-123, Clifton Street,
>> London.
>> > EC2A 4LD. Company registration number: 07004264.
>> >
>> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <ma...@apache.org> wrote:
>> >
>> > Hi Paul,
>> >
>> > it looks like you're hitting
>> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
>> > alfresco-indexer are you using? Can you try using
>> >
>> http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp
>> (or
>> > the pre-built WAR file -
>> >
>> http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>> >  )
>> >
>> > HTH
>> >   mao
>> >
>> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pf...@funnelback.com>
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> Having had to go back to basics and re-install my Alfresco instance, I
>> >> can confirm that the AMP file for the alfresco indexer web scripts
>> *does*
>> >> actually install without error. There must have been an issue with my
>> >> previous Alfresco instance.
>> >>
>> >> Having said that, the Alfresco WebScript connector fails. The failure
>> is
>> >> down to the ‘Context’ setting (see below):
>> >>
>> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
>> >>
>> >> When you attempt to save the configuration of the WebScript connector,
>> >> Manifold clearly tries to check the connection. It seems to do this by
>> >> making an API call (/auth/resolve/admin). The issue is with what
>> Manifold
>> >> prepends to the start of that path.
>> >> If I leave the setting as above then Manifold reports   :
>> >>
>> >> <tr><td>The Web Script <a
>> >>
>> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>> >> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>> >>
>> >> In other words, it builds the full path as
>> >> “alfresco/service/api/node/auth/resolve/admin”.
>> >>
>> >> For my Alfresco Community 5.0 instance, I get to that same web script
>> via
>> >> the URL “/alfresco/service/auth/resolve/admin” i.e. without the
>> ‘/api/node’.
>> >>
>> >> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
>> >> inclusion. In other words, there is nothing I can put into that box to
>> >> prevent it.
>> >>
>> >> Paul
>> >>
>> >> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
>> >>
>> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
>> >> feel certain he'd want to know.
>> >>
>> >> Karl
>> >>
>> >>
>> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com
>> >
>> >> wrote:
>> >>
>> >>> Hi guys,
>> >>>
>> >>> Just to let you know what’s going on - for informational purposes more
>> >>> than anything.
>> >>>
>> >>> I initially tried taking the AMP file provided in the MCF plugins
>> >>> directory (0.7.0) and tried to install it into Alfresco but got a
>> message
>> >>> saying a file was missing.
>> >>>
>> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>> >>> project and then built it on my local machine. This generated the AMP
>> file
>> >>> (0.7.2).
>> >>>
>> >>> I was able to successfully install the AMP file onto my Alfresco
>> >>> instance.
>> >>>
>> >>> As it happens I now cannot log into Alfresco Share ('bad credentials
>> or
>> >>> server not available' message) but that is something I can work on.
>> >>> Apparently the installation of some AMP files has been known to
>> >>> cause this issue.
>> >>>
>> >>> So, progress to a point!
>> >>>
>> >>>
>> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> At the Alfresco side, hope this helps:
>> >>>
>> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>> >>>
>> >>> Cheers
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com>
>> wrote:
>> >>>
>> >>>> The AMP file is actually shipped as part of the binary MCF
>> >>>> distribution.  You can find it under "plugins".
>> >>>>
>> >>>> Karl
>> >>>>
>> >>>>
>> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <
>> pfarrell@funnelback.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> Hopefully this will be my only request for information today.
>> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
>> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a
>> connector. The
>> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
>> >>>>>
>> >>>>> I realise that this is slightly outside of the Manifold remit but I
>> >>>>> wondered if anyone can advise how I build the AMP file from the URL
>> (
>> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>> >>>>> repository to my local drive but, having never worked with Maven,
>> am at a
>> >>>>> loss at how to generate the AMP file that I then need to install
>> into
>> >>>>> Alfresco.
>> >>>>>
>> >>>>> Many thanks,
>> >>>>>
>> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>> >>>>>
>> >>>>> The only way you can have such a reduced list of connectors is if
>> >>>>> somebody commented out many connectors in your connectors.xml, or
>> removed
>> >>>>> them from the database table where they are registered by hand.
>> >>>>>
>> >>>>> Karl
>> >>>>>
>> >>>>>
>> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>> >>>>> pfarrell@funnelback.com> wrote:
>> >>>>>
>> >>>>>> After a good deal of time clicking around I came to the same
>> >>>>>> conclusion - that there is no way of telling from the UI!!
>> >>>>>>
>> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I
>> notice in the
>> >>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>> >>>>>>
>> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>> >>>>>>
>> >>>>>> <repositoryconnector name="Alfresco Webscript"
>> >>>>>>
>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>> >>>>>>
>> >>>>>> You can imagine my excitement!
>> >>>>>>
>> >>>>>> The only thing I am missing is the option in the UI. When I click
>> to
>> >>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic,
>> GoogleDrive,
>> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>> >>>>>>
>> >>>>>> Perhaps I am hoping for too much to hope that I can make a simple
>> >>>>>> change to enable this repo connection?
>> >>>>>>
>> >>>>>> Thanks for all the help everyone
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
>> >>>>>>
>> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>> >>>>>> mean.  But if you see "Alfresco webscript" in the list of
>> repository
>> >>>>>> connection types, you've got a version that supports that
>> connector.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Karl
>> >>>>>>
>> >>>>>>
>> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>> >>>>>> pfarrell@funnelback.com> wrote:
>> >>>>>>
>> >>>>>>> Thanks Rafa.
>> >>>>>>>
>> >>>>>>> As an aside, is there an easy way to identify which version of
>> >>>>>>> ManifoldCF you are on?
>> >>>>>>>
>> >>>>>>> Cheers
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>> >>>>>>>
>> >>>>>>> Hi Paul,
>> >>>>>>>
>> >>>>>>> All you need to do is to install this webscript
>> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>> >>>>>>> instance. The connector itself is already part of the most recent
>> versions
>> >>>>>>> of ManifoldCF
>> >>>>>>>
>> >>>>>>> Cheers,
>> >>>>>>> Rafa
>> >>>>>>>
>> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>> >>>>>>> pfarrell@funnelback.com> wrote:
>> >>>>>>>
>> >>>>>>>> Ok, thanks again guys.
>> >>>>>>>>
>> >>>>>>>> The Webscript connector it is.
>> >>>>>>>>
>> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>> >>>>>>>> guidelines on how to get this Webscript connector installed?  I
>> see there
>> >>>>>>>> is a GitHub page here (
>> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>> >>>>>>>> which discusses it (although it directs you to a repository of
>> files).
>> >>>>>>>>
>> >>>>>>>> I am just keen to make sure that any steps I follow to try and
>> get
>> >>>>>>>> this Webscript connector installed and working are updated,
>> reliable steps.
>> >>>>>>>> I would hate to waste time with out of date information.
>> >>>>>>>>
>> >>>>>>>> Thanks all
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>> Hi Paul,
>> >>>>>>>>
>> >>>>>>>> I suggest that you should use Alfresco Webscript as Karl
>> mentioned.
>> >>>>>>>> Web services is so slow compared to other services and I've also
>> checked
>> >>>>>>>> that Alfresco CMIS web services does not return change token(may
>> be there
>> >>>>>>>> is something that I don't know).
>> >>>>>>>>
>> >>>>>>>> By the way current version of CMIS connector is not aware of
>> change
>> >>>>>>>> token. I would write a patch for you if alfresco supports change
>> token
>> >>>>>>>> property.
>> >>>>>>>>
>> >>>>>>>> Thanks!
>> >>>>>>>> Muhammed
>> >>>>>>>> 19 Eki 2015 Pzt, saat 18:11 tarihinde Karl Wright <
>> >>>>>>>> daddywri@gmail.com> şunu yazdı:
>> >>>>>>>>
>> >>>>>>>>> Hi Paul,
>> >>>>>>>>>
>> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>> >>>>>>>>> that has no relation to the CMIS connector.  It requires an
>> Alfresco
>> >>>>>>>>> webscript plugin be installed on your Alfresco server to work,
>> though.
>> >>>>>>>>>
>> >>>>>>>>> Hope that helps.
>> >>>>>>>>>
>> >>>>>>>>> Karl
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>> >>>>>>>>> pfarrell@funnelback.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi Muhammed/Karl,
>> >>>>>>>>>>
>> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
>> >>>>>>>>>> very much appreciated.
>> >>>>>>>>>>
>> >>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
>> >>>>>>>>>> connection. I have just read something which may shed a little
>> light on
>> >>>>>>>>>> this. The post read that change tokens are not passed via
>> AtomPub
>> >>>>>>>>>> connections (
>> >>>>>>>>>>
>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758
>> ).
>> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to
>> determine a
>> >>>>>>>>>> change in Alfresco.
>> >>>>>>>>>>
>> >>>>>>>>>> It looks like I have two possible options left open to me
>> >>>>>>>>>> (correct me if I’m wrong):
>> >>>>>>>>>>
>> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>> >>>>>>>>>> connection mechanism
>> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’
>> connection mentioned
>> >>>>>>>>>> above?)
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks again,
>> >>>>>>>>>>
>> >>>>>>>>>> Paul
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> Hi Paul,
>> >>>>>>>>>>
>> >>>>>>>>>> Repositories should give information to ManifoldCF when they are
>> >>>>>>>>>> updated. The current CMIS connector reindexes a document only if the
>> >>>>>>>>>> latest version of the document has changed, not when it is merely
>> >>>>>>>>>> updated.
>> >>>>>>>>>>
>> >>>>>>>>>> There is a change token property in CMIS specification and it
>> >>>>>>>>>> should change when document is updated so ManifoldCF can
>> understand that
>> >>>>>>>>>> document is updated but implementing change token property is
>> optional.
>> >>>>>>>>>> I've checked Alfresco's CMIS web site and seen that they
>> didn't set the
>> >>>>>>>>>> change token.
>> >>>>>>>>>>
>> >>>>>>>>>> I think, there is nothing we can do at this point.
>> >>>>>>>>>>
>> >>>>>>>>>> 19 Eki 2015 Pzt, 15:59 tarihinde, Karl Wright <
>> daddywri@gmail.com>
>> >>>>>>>>>> şunu yazdı:
>> >>>>>>>>>>
>> >>>>>>>>>>> Hi Paul,
>> >>>>>>>>>>>
>> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>> >>>>>>>>>>> document version string the connector constructs should be
>> adequate to
>> >>>>>>>>>>> detect all changes.  Can you create a ticket?
>> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this
>> may be in fact
>> >>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to
>> have some back
>> >>>>>>>>>>> and forth before I can determine that for sure.
>> >>>>>>>>>>>
>> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
>> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco
>> indexing,
>> >>>>>>>>>>> although there have been issues reported having to do with
>> running it on
>> >>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what
>> the problem is
>> >>>>>>>>>>> there; maybe a version dependency of some kind.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Karl
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>> >>>>>>>>>>> pfarrell@funnelback.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi Everyone,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Hoping someone may be able to advise.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS
>> connector,
>> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> All is going well apart from, what I would call, the
>> >>>>>>>>>>>> ‘incremental crawl’.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The main issue I am having is that the modification of a
>> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being
>> picked up in next
>> >>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’
>> which has user A
>> >>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks
>> up the documents
>> >>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User
>> A’ from the
>> >>>>>>>>>>>> security of that document and re-run the Manifold crawl.
>> User A can still
>> >>>>>>>>>>>> see the document in the local search engine.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that
>> if I go into
>> >>>>>>>>>>>> the Output Connections, edit and save the relevant output
>> connection and
>> >>>>>>>>>>>> then click ‘Remove all associated documents’, the next time
>> I crawl, the
>> >>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not
>> updating
>> >>>>>>>>>>>> whatever internal record it has for this item.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Any ideas?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Many thanks.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>>
>> >>
>> >
>>
>>
>>
>
>

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi Karl/Maurizio,

I have a very odd situation at present. It may or may not be related to the Alfresco WebScript plugin or to the environment in which I am running Manifold, but I thought I would raise it.

I have cloned the repo for the Alfresco Webscript connector and can see that there is an ‘alfresco-indexer-client.jar’ file in the ‘target’ directory.

I have taken that jar and used it to replace the one that existed in the Manifold instance, which was in ‘apache-manifoldcf/connector-lib’. That path is referenced in an ‘mcf-properties.xml’ file, which may or may not be specific to our environment.

Anyway, having replaced the existing jar, the strangest thing is that the same path is still being used when I ‘Save’ the repository connection. In other words, the path ‘….api/node…’ is still being requested even though the code in the new jar says otherwise.

NOTE: the way I am testing this is to apply the jar, restart Jetty (our app server), open Manifold, navigate to the Alfresco WebScript repository connection, hit ‘Save’ and then open the ‘manifoldcf.log’ file. That is where I see the HTTP request and the 404 error, and the request shows the path being used: the old one.
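
To take Manifold out of the picture, the two URL shapes discussed in this thread can be requested directly. The following is only a minimal Python sketch, not something shipped with ManifoldCF or alfresco-indexer: the function names `candidate_urls` and `probe` are made up for illustration, the host, context and credentials are whatever applies to your own instance, and it assumes the endpoint accepts HTTP Basic auth, as the logged request suggests.

```python
import base64
import urllib.error
import urllib.request


def candidate_urls(base, context, user):
    """Build the two URL shapes seen in this thread for the auth check."""
    root = base.rstrip("/") + "/" + context.strip("/")
    return {
        # Shape that appears in the Manifold debug log (returns 404 here).
        "with_api_node": f"{root}/api/node/auth/resolve/{user}",
        # Shape reported to work when requested by hand.
        "without_api_node": f"{root}/auth/resolve/{user}",
    }


def probe(url, user, password, timeout=10):
    """Send a GET with HTTP Basic auth and return the HTTP status code."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        headers={"Accept": "application/json", "Authorization": f"Basic {token}"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return err.code
```

Calling `probe` on each value returned by `candidate_urls` for your own host and the ‘/alfresco/service’ context should show which of the two paths the installed web script actually answers on.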

—

I have even gone to the extreme of removing this jar file and restarting the app server to see whether Manifold ignores it. If I do this, Manifold does not even start, so it is clearly expecting that jar to exist. This is stranger still: Manifold clearly relies on the jar, yet it does not appear to be using the contents of the jar I put there.
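
One way to check whether an older copy of the client classes is still being picked up from somewhere else is to search every jar under the Manifold install for the literal path fragment. A rough Python sketch follows; `jars_containing` is a hypothetical helper name, and it assumes the fragment is stored as a plain string constant in the compiled classes, which would not hold if the path is assembled from smaller pieces at run time.

```python
import pathlib
import zipfile


def jars_containing(root, needle):
    """Return (jar_path, entry_name) pairs for .class entries containing needle.

    root   -- directory to search recursively for *.jar files
    needle -- bytes to look for, e.g. b"/api/node"
    """
    hits = []
    for jar in sorted(pathlib.Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                for name in archive.namelist():
                    # Only compiled classes are of interest here.
                    if name.endswith(".class") and needle in archive.read(name):
                        hits.append((str(jar), name))
        except zipfile.BadZipFile:
            # Skip files that are named .jar but are not valid archives.
            continue
    return hits
```

Running something like `jars_containing("apache-manifoldcf", b"/api/node")` and finding a hit in a jar other than the one that was replaced would suggest that a second copy of the classes is being loaded.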

Can I ask whether you guys can think of any reason at all why this might be happening? It is starting to drive me mad!

Thanks



> On 21 Oct 2015, at 02:23, Karl Wright <da...@gmail.com> wrote:
> 
> Hi Paul,
> Looking at Issue 3, I think that Maurizio has indeed pointed you in the right direction.  Can you check your version of the plugin to be sure that /api/node/ is NOT present in the described line of code?
> 
> Karl
> 
> 
> On Tue, Oct 20, 2015 at 5:00 PM, <pfarrell@funnelback.com> wrote:
> Hi Maurizio,
> 
> I will be available all day tomorrow (Wednesday) to help out as much as I can. If it's possible for you to look into this I can take whatever steps you need.
> 
> Many thanks,
> 
> Paul
> 
> -----Original Message-----
> From: "Karl Wright" <daddywri@gmail.com>
> Sent: Tuesday, October 20, 2015 12:34pm
> To: "user@manifoldcf.apache.org" <user@manifoldcf.apache.org>
> Subject: Re: Manifold/Alfresco seeding and security
> 
> Hi Maurizio,
> 
> This is the third time we've seen this; can you use Paul's help to chase
> down what the issue is?
> 
> Karl
> 
> 
> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pfarrell@funnelback.com>
> wrote:
> 
> > Hi,
> >
> > I am using Alfresco Community 5.0.
> >
> > Having taken that AMP file (version 0.7.1) and then installed it into
> > Alfresco and restarted the services, the issue is still present.
> >
> > I suspect that this is probably more to do with the Manifold end than the
> > Alfresco end. It seems it is Manifold that is automatically appending the
> > “/api/node” string into the path whenever I use “/alfresco/service” as the
> > Context in the repository connection configuration.
> >
> > If it is of interest, this is the output in the manifoldcf.log file when I
> > use the repo connection config I mentioned earlier.
> >
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 0 of 2; total allocated: 0 of 20]
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 1 of 2; total allocated: 1 of 20]
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
> > http://54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
> > 54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
> > 172.31.23.90:58712<->54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
> > UNCHALLENGED
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Accept: application/json
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Host: 54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Connection: Keep-Alive
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Accept-Encoding: gzip,deflate
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET
> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Accept: application/json[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Host: 54.165.85.140:8080[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Connection: Keep-Alive[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Accept-Encoding: gzip,deflate[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "HTTP/1.1 404 Not Found[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Server: Apache-Coyote/1.1[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Cache-Control: no-cache[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Pragma: no-cache[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Transfer-Encoding: chunked[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "630[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > <head>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <title>Web Script Status 404 - Not Found</title>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
> > type="text/css" />[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > </head>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > <body>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <div>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
> > alt="Alfresco" /></td>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >             <td><span class="title">Web Script Status 404 - Not
> > Found</span></td>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          </tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <br/>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td>The Web Script <a
> > href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <br/>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>404 Description:</b></td><td> Requested resource is not
> > available.</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td> </td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Message:</b></td><td>Cannot find object for
> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
> > schema 8,001</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td></td><td> </td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Diagnostics</b>:</td><td><a
> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    </div>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > </body>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "</html>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > HTTP/1.1 404 Not Found
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Server: Apache-Coyote/1.1
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Cache-Control: no-cache
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Pragma: no-cache
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Content-Type: text/html;charset=UTF-8
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Transfer-Encoding: chunked
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Date: Tue, 20 Oct 2015 16:18:47 GMT
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
> > alive indefinitely
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
> > Shutdown connection
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
> > connection
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 0 of 2; total allocated: 0 of 20]
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
> > shutting down
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
> > down
> >
> > *Paul Farrell*
> > Senior Search Consultant
> >
> > 109-123 Clifton Street, London EC2A 4LD
> > *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
> >
> > *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
> >
> > Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
> > Twitter <https://twitter.com/funnelback>
> >
> > Funnelback UK Ltd is a limited liability company registered in England &
> > Wales. Registered address: Zetland House 109-123, Clifton Street, London.
> > EC2A 4LD. Company registration number: 07004264.
> >
> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <maoo@apache.org> wrote:
> >
> > Hi Paul,
> >
> > it looks like you're hitting
> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
> > alfresco-indexer are you using? Can you try using
> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp
> > (or the pre-built WAR file -
> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
> > )
> >
> > HTH
> >   mao
> >
> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pfarrell@funnelback.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Having had to go back to basics and re-install my Alfresco instance, I
> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
> >> actually install without error. There must have been an issue with my
> >> previous Alfresco instance.
> >>
> >> Having said that, the Alfresco WebScript connector fails. The failure is
> >> down to the ‘Context’ setting (see below):
> >>
> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
> >>
> >> When you attempt to save the configuration of the WebScript connector,
> >> Manifold clearly tries to check the connection. It seems to do this by
> >> making an API call (/auth/resolve/admin). The issue is with what Manifold
> >> prepends to the start of that path.
> >> If I leave the setting as above then Manifold reports:
> >>
> >> <tr><td>The Web Script <a
> >> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> >> has responded with a status of 404 - Not Found.</td></tr>[\n]”
> >>
> >> In other words, it builds the full path as
> >> “/alfresco/service/api/node/auth/resolve/admin”.
> >>
> >> For my Alfresco Community 5.0 instance, I get to that same web script via
> >> the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
> >>
> >> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
> >> inclusion. In other words, there is nothing I can put into that box to
> >> prevent it.
> >>
> >> Paul
> >>
> >> On 20 Oct 2015, at 12:56, Karl Wright <daddywri@gmail.com> wrote:
> >>
> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
> >> feel certain he'd want to know.
> >>
> >> Karl
> >>
> >>
> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com>
> >> wrote:
> >>
> >>> Hi guys,
> >>>
> >>> Just to let you know what’s going on - for informational purposes more
> >>> than anything.
> >>>
> >>> I initially tried taking the AMP file provided in the MCF plugins
> >>> directory (0.7.0) and tried to install it into Alfresco but got a message
> >>> saying a file was missing.
> >>>
> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
> >>> project and then built it on my local machine. This generated the AMP file
> >>> (0.7.2).
> >>>
> >>> I was able to successfully install the AMP file onto my Alfresco
> >>> instance.
> >>>
> >>> As it happens I now cannot log into Alfresco Share ('bad credentials or
> >>> server not available' message) but that is something I can work on.
> >>> Apparently the installation of some AMP files has been known to cause this
> >>> issue.
> >>>
> >>> So, progress to a point!
> >>>
> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> At the Alfresco side, hope this helps:
> >>>
> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
> >>>
> >>> Cheers
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
> >>>
> >>>> The AMP file is actually shipped as part of the binary MCF
> >>>> distribution.  You can find it under "plugins".
> >>>>
> >>>> Karl
> >>>>
> >>>>
> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com>
> >>>> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> Hopefully this will be my only request for information today.
> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
> >>>>>
> >>>>> I realise that this is slightly outside of the Manifold remit but I
> >>>>> wondered if anyone can advise how I build the AMP file from the URL (
> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
> >>>>> repository to my local drive but, having never worked with Maven, am at a
> >>>>> loss as to how to generate the AMP file that I then need to install into
> >>>>> Alfresco.
> >>>>>
> >>>>> Many thanks,
> >>>>>
> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com> wrote:
> >>>>>
> >>>>> The only way you can have such a reduced list of connectors is if
> >>>>> somebody commented out many connectors in your connectors.xml, or removed
> >>>>> them from the database table where they are registered by hand.
> >>>>>
> >>>>> Karl
> >>>>>
> >>>>>
> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
> >>>>> pfarrell@funnelback.com> wrote:
> >>>>>
> >>>>>> After a good deal of time clicking around I came to the same
> >>>>>> conclusion - that there is no way of telling from the UI!!
> >>>>>>
> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
> >>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
> >>>>>>
> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line:
> >>>>>>
> >>>>>> <repositoryconnector name="Alfresco Webscript"
> >>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
> >>>>>>
> >>>>>> You can imagine my excitement!
> >>>>>>
> >>>>>> The only thing I am missing is the option in the UI. When I click to
> >>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
> >>>>>>
> >>>>>> Perhaps I am hoping for too much to hope that I can make a simple
> >>>>>> change to enable this repo connection?
> >>>>>>
> >>>>>> Thanks for all the help everyone
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com> wrote:
> >>>>>>
> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
> >>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
> >>>>>> connection types, you've got a version that supports that connector.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Karl
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
> >>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>
> >>>>>>> Thanks Rafa.
> >>>>>>>
> >>>>>>> As an aside, is there an easy way to identify which version of
> >>>>>>> ManifoldCF you are on?
> >>>>>>>
> >>>>>>> Cheers
> >>>>>>>
> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org> wrote:
> >>>>>>>
> >>>>>>> Hi Paul,
> >>>>>>>
> >>>>>>> All you need to do is to install this webscript
> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
> >>>>>>> instance. The connector itself is already part of the most recent versions
> >>>>>>> of ManifoldCF.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Rafa
> >>>>>>>
> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
> >>>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>>
> >>>>>>>> Ok, thanks again guys.
> >>>>>>>>
> >>>>>>>> The Webscript connector it is.
> >>>>>>>>
> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
> >>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
> >>>>>>>> is a GitHub page here (
> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
> >>>>>>>> which discusses it (although it directs you to a repository of files).
> >>>>>>>>
> >>>>>>>> I am just keen to make sure that any steps I follow to try and get
> >>>>>>>> this Webscript connector installed and working are updated, reliable steps.
> >>>>>>>> I would hate to waste time with out of date information.
> >>>>>>>>
> >>>>>>>> Thanks all
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi Paul,
> >>>>>>>>
> >>>>>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned.
> >>>>>>>> Web services are very slow compared to other services, and I've also
> >>>>>>>> checked that Alfresco's CMIS web services do not return a change token
> >>>>>>>> (maybe there is something that I don't know).
> >>>>>>>>
> >>>>>>>> By the way, the current version of the CMIS connector is not aware of the
> >>>>>>>> change token. I would write a patch for you if Alfresco supported the
> >>>>>>>> change token property.
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>> Muhammed
> >>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Paul,
> >>>>>>>>>
> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
> >>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
> >>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
> >>>>>>>>>
> >>>>>>>>> Hope that helps.
> >>>>>>>>>
> >>>>>>>>> Karl
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
> >>>>>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Muhammed/Karl,
> >>>>>>>>>>
> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
> >>>>>>>>>> very much appreciated.
> >>>>>>>>>>
> >>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
> >>>>>>>>>> connection. I have just read something which may shed a little light on
> >>>>>>>>>> this. The post read that change tokens are not passed via AtomPub
> >>>>>>>>>> connections (
> >>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
> >>>>>>>>>> change in Alfresco.
> >>>>>>>>>>
> >>>>>>>>>> It looks like I have two possible options left open to me
> >>>>>>>>>> (correct me if I’m wrong):
> >>>>>>>>>>
> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
> >>>>>>>>>> connection mechanism
> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
> >>>>>>>>>> above?)
> >>>>>>>>>>
> >>>>>>>>>> Thanks again,
> >>>>>>>>>>
> >>>>>>>>>> Paul
> >>>>>>>>>>
> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Paul,
> >>>>>>>>>>
> >>>>>>>>>> Repositories should tell ManifoldCF when their documents are updated.
> >>>>>>>>>> The current CMIS connector reindexes a document only if its latest
> >>>>>>>>>> version has changed, not when the document has merely been updated.
> >>>>>>>>>>
> >>>>>>>>>> There is a change token property in the CMIS specification, and it
> >>>>>>>>>> should change whenever a document is updated so that ManifoldCF can tell
> >>>>>>>>>> the document was updated, but implementing the change token property is
> >>>>>>>>>> optional. I've checked Alfresco's CMIS web site and seen that they
> >>>>>>>>>> didn't set the change token.
> >>>>>>>>>>
> >>>>>>>>>> I think, there is nothing we can do at this point.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Paul,
> >>>>>>>>>>>
> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
> >>>>>>>>>>> document version string the connector constructs should be adequate to
> >>>>>>>>>>> detect all changes.  Can you create a ticket?
> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
> >>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
> >>>>>>>>>>> and forth before I can determine that for sure.
> >>>>>>>>>>>
> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
> >>>>>>>>>>> although there have been issues reported having to do with running it on
> >>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what the problem is
> >>>>>>>>>>> there; maybe a version dependency of some kind.
> >>>>>>>>>>>
> >>>>>>>>>>> Karl
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
> >>>>>>>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Everyone,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hoping someone may be able to advise.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
> >>>>>>>>>>>>
> >>>>>>>>>>>> All is going well apart from, what I would call, the
> >>>>>>>>>>>> ‘incremental crawl’.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The main issue I am having is that the modification of a
> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in the next
> >>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
> >>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
> >>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
> >>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
> >>>>>>>>>>>> see the document in the local search engine.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It is as if Manifold is not treating the security update as a
> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
> >>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
> >>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
> >>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
> >>>>>>>>>>>> whatever internal record it has for this item.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Any ideas?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Many thanks.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
> 
> 
> 


Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hi Paul,
Looking at Issue 3, I think that Maurizio has indeed pointed you in the
right direction.  Can you check your version of the plugin to be sure that
/api/node/ is NOT present in the described line of code?

Karl
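
One way to check from outside the connector whether the extra /api/node
segment is what breaks the lookup is to request both candidate paths and
compare the status codes. The following is only a minimal sketch, not the
connector's actual code: the helper names resolve_urls and probe, the host,
and the credentials are assumptions made for illustration. Only the two
paths themselves come from this thread.

```python
import base64
import urllib.error
import urllib.request
from urllib.parse import quote


def resolve_urls(base, context, user):
    """Build the two authority-resolution URLs discussed in this thread.

    base    -- scheme, host and port, e.g. "http://localhost:8080" (placeholder)
    context -- the connector's Context setting, e.g. "/alfresco/service"
    user    -- the user name to resolve, e.g. "admin"
    """
    root = base.rstrip("/") + "/" + context.strip("/")
    user = quote(user, safe="")
    return {
        # Path seen in the manifoldcf.log output (answered with 404).
        "with_api_node": f"{root}/api/node/auth/resolve/{user}",
        # Path reported to work on the Alfresco Community 5.0 instance.
        "without_api_node": f"{root}/auth/resolve/{user}",
    }


def probe(url, username, password, timeout=10):
    """Send a GET with HTTP Basic auth and return the HTTP status code.

    Connection-level failures (host unreachable, timeout) are not handled
    here and surface as urllib.error.URLError.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "Authorization": "Basic " + token,
        },
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return err.code
```

Calling probe() on each URL returned by resolve_urls() with your own host
and credentials shows which of the two paths the installed plugin actually
serves; a 404 on one and a 200 on the other would point at a version
mismatch between the connector and the plugin.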


On Tue, Oct 20, 2015 at 5:00 PM, <pf...@funnelback.com> wrote:

> Hi Maurizio,
>
> I will be available all day tomorrow (Wednesday) to help out as much as I
> can. If it's possible for you to look into this I can take whatever steps
> you need.
>
> Many thanks,
>
> Paul
>
> -----Original Message-----
> From: "Karl Wright" <da...@gmail.com>
> Sent: Tuesday, October 20, 2015 12:34pm
> To: "user@manifoldcf.apache.org" <us...@manifoldcf.apache.org>
> Subject: Re: Manifold/Alfresco seeding and security
>
> Hi Maurizio,
>
> This is the third time we've seen this; can you use Paul's help to chase
> down what the issue is?
>
> Karl
>
>
> On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pf...@funnelback.com>
> wrote:
>
> > Hi,
> >
> > I am using Alfresco Community 5.0.
> >
> > Having taken that AMP file (version 0.7.1) and then installed it into
> > Alfresco and restarted the services, the issue is still present.
> >
> > I suspect that this is probably more to do with the Manifold end than the
> > Alfresco end. It seems it is Manifold that is automatically appending the
> > “/api/node” string into the path whenever I use “/alfresco/service” as the
> > Context in the repository connection configuration.
> >
> > If it is of interest, this is the output in the manifoldcf.log file when I
> > use the repo connection config I mentioned earlier.
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
> > [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 0 of 2; total allocated: 0 of 20]
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
> > 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 1 of 2; total allocated: 1 of 20]
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
> > http://54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
> > 54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
> > 172.31.23.90:58712<->54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
> > UNCHALLENGED
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Accept: application/json
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Host: 54.165.85.140:8080
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Connection: Keep-Alive
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > Accept-Encoding: gzip,deflate
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET
> > /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Accept: application/json[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Host: 54.165.85.140:8080[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Connection: Keep-Alive[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "Accept-Encoding: gzip,deflate[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "HTTP/1.1 404 Not Found[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Server: Apache-Coyote/1.1[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Cache-Control: no-cache[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Pragma: no-cache[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Content-Type: text/html;charset=UTF-8[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Transfer-Encoding: chunked[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "630[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
> > http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > <head>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <title>Web Script Status 404 - Not Found</title>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
> > type="text/css" />[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > </head>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > <body>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    <div>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
> > alt="Alfresco" /></td>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >             <td><span class="title">Web Script Status 404 - Not
> > Found</span></td>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          </tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <br/>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td>The Web Script <a
> > href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> > has responded with a status of 404 - Not Found.</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <br/>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       <table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>404 Description:</b></td><td> Requested resource is not
> > available.</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td> </td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Message:</b></td><td>Cannot find object for
> > NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
> > schema 8,001</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td></td><td> </td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >          <tr><td><b>Diagnostics</b>:</td><td><a
> > href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
> > Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >       </table>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> >    </div>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> > </body>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "</html>[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > "[\r][\n]"
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > HTTP/1.1 404 Not Found
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Server: Apache-Coyote/1.1
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Cache-Control: no-cache
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Expires: Thu, 01 Jan 1970 00:00:00 GMT
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Pragma: no-cache
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Content-Type: text/html;charset=UTF-8
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Transfer-Encoding: chunked
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> > Date: Tue, 20 Oct 2015 16:18:47 GMT
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
> > alive indefinitely
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
> > Shutdown connection
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
> > connection
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
> > [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> > allocated: 0 of 2; total allocated: 0 of 20]
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
> > shutting down
> > DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
> > down
> >
> > *Paul Farrell*
> > Senior Search Consultant
> >
> > 109-123 Clifton Street, London EC2A 4LD
> > *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
> >
> > *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
> >
> > Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
> > Twitter <https://twitter.com/funnelback>
> >
> > Funnelback UK Ltd is a limited liability company registered in England &
> > Wales. Registered address: Zetland House 109-123, Clifton Street, London.
> > EC2A 4LD. Company registration number: 07004264.
> >
> > On 20 Oct 2015, at 16:50, Maurizio Pillitu <ma...@apache.org> wrote:
> >
> > Hi Paul,
> >
> > it looks like you're hitting
> > https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
> > alfresco-indexer are you using? Can you try using
> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp
> > (or the pre-built WAR file -
> > http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
> > )
> >
> > HTH
> >   mao
> >
> > On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pf...@funnelback.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Having had to go back to basics and re-install my Alfresco instance, I
> >> can confirm that the AMP file for the alfresco indexer web scripts *does*
> >> actually install without error. There must have been an issue with my
> >> previous Alfresco instance.
> >>
> >> Having said that, the Alfresco WebScript connector fails. The failure is
> >> down to the ‘Context’ setting (see below):
> >>
> >> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
> >>
> >> When you attempt to save the configuration of the WebScript connector,
> >> Manifold clearly tries to check the connection. It seems to do this by
> >> making an API call (/auth/resolve/admin). The issue is with what Manifold
> >> prepends to the start of that path.
> >> If I leave the setting as above then Manifold reports:
> >>
> >> <tr><td>The Web Script <a
> >> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> >> has responded with a status of 404 - Not Found.</td></tr>[\n]"
> >>
> >> In other words, it builds the full path as
> >> “/alfresco/service/api/node/auth/resolve/admin”.
> >>
> >> For my Alfresco Community 5.0 instance, I get to that same web script via
> >> the URL “/alfresco/service/auth/resolve/admin”, i.e. without the ‘/api/node’.
> >>
> >> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
> >> inclusion. In other words, there is nothing I can put into that box to
> >> prevent it.
> >>
> >> Paul
> >>
> >> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
> >>
> >> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
> >> feel certain he'd want to know.
> >>
> >> Karl
> >>
> >>
> >> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pf...@funnelback.com>
> >> wrote:
> >>
> >>> Hi guys,
> >>>
> >>> Just to let you know what’s going on - for informational purposes more
> >>> than anything.
> >>>
> >>> I initially tried taking the AMP file provided in the MCF plugins
> >>> directory (0.7.0) and tried to install it into Alfresco but got a message
> >>> saying a file was missing.
> >>>
> >>> Instead, I cloned the repository on GitHub for the alfresco-indexer
> >>> project and then built it on my local machine. This generated the AMP file
> >>> (0.7.2).
> >>>
> >>> I was able to successfully install the AMP file onto my Alfresco
> >>> instance.
> >>>
> >>> As it happens I now cannot log into Alfresco Share ('bad credentials or
> >>> server not available' message) but that is something I can work on.
> >>> Apparently the installation of some AMP files has been known to cause this
> >>> issue.
> >>>
> >>> So, progress to a point!
> >>>
> >>> *Paul Farrell*
> >>> Senior Search Consultant
> >>>
> >>> 109-123 Clifton Street, London EC2A 4LD
> >>> *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
> >>>
> >>> *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
> >>>
> >>> Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
> >>> Twitter <https://twitter.com/funnelback>
> >>>
> >>> Funnelback UK Ltd is a limited liability company registered in England &
> >>> Wales. Registered address: Zetland House 109-123, Clifton Street, London.
> >>> EC2A 4LD. Company registration number: 07004264.
> >>>
> >>> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> At the Alfresco side, hope this helps:
> >>>
> >>> http://docs.alfresco.com/4.1/tasks/amp-install.html
> >>>
> >>> Cheers
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com> wrote:
> >>>
> >>>> The AMP file is actually shipped as part of the binary MCF
> >>>> distribution.  You can find it under "plugins".
> >>>>
> >>>> Karl
> >>>>
> >>>>
> >>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com>
> >>>> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> Hopefully this will be my only request for information today.
> >>>>> I’m afraid this is a bit of a newbie question but I have managed to
> >>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
> >>>>> only bit I am missing now is to install the AMP file in Alfresco.
> >>>>>
> >>>>> I realise that this is slightly outside of the Manifold remit but I
> >>>>> wondered if anyone can advise how I build the AMP file from the URL (
> >>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
> >>>>> repository to my local drive but, having never worked with Maven, am at a
> >>>>> loss as to how to generate the AMP file that I then need to install into
> >>>>> Alfresco.
> >>>>>
> >>>>> Many thanks,
> >>>>>
> >>>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
> >>>>>
> >>>>> The only way you can have such a reduced list of connectors is if
> >>>>> somebody commented out many connectors in your connectors.xml, or removed
> >>>>> them from the database table where they are registered by hand.
> >>>>>
> >>>>> Karl
> >>>>>
> >>>>>
> >>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
> >>>>> pfarrell@funnelback.com> wrote:
> >>>>>
> >>>>>> After a good deal of time clicking around I came to the same
> >>>>>> conclusion - that there is no way of telling from the UI!!
> >>>>>>
> >>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
> >>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
> >>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
> >>>>>>
> >>>>>> Looking in the ‘connectors.xml’ file I can also see the line:
> >>>>>>
> >>>>>> <repositoryconnector name="Alfresco Webscript"
> >>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
> >>>>>>
> >>>>>> You can imagine my excitement!
> >>>>>>
> >>>>>> The only thing I am missing is the option in the UI. When I click to
> >>>>>> create a new repo connection I get: CMIS, Dropbox, Generic, GoogleDrive,
> >>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
> >>>>>>
> >>>>>> Perhaps I am hoping for too much to hope that I can make a simple
> >>>>>> change to enable this repo connection?
> >>>>>>
> >>>>>> Thanks for all the help everyone
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hah; there's not a way to inquire in the UI, if that's what you
> >>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
> >>>>>> connection types, you've got a version that supports that connector.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Karl
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
> >>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>
> >>>>>>> Thanks Rafa.
> >>>>>>>
> >>>>>>> As an aside, is there an easy way to identify which version of
> >>>>>>> ManifoldCF you are on?
> >>>>>>>
> >>>>>>> Cheers
> >>>>>>>
> >>>>>>> *Paul Farrell*
> >>>>>>> Senior Search Consultant
> >>>>>>>
> >>>>>>> 109-123 Clifton Street, London EC2A 4LD
> >>>>>>> *T* +44 (0) 207 183 6865 | funnelback.com
> >>>>>>> <http://www.funnelback.com/>
> >>>>>>>
> >>>>>>> *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
> >>>>>>>
> >>>>>>> Connect with us: LinkedIn
> >>>>>>> <http://www.linkedin.com/company/funnelback> - Twitter
> >>>>>>> <https://twitter.com/funnelback>
> >>>>>>>
> >>>>>>> Funnelback UK Ltd is a limited liability company registered in
> >>>>>>> England & Wales. Registered address: Zetland House 109-123, Clifton Street,
> >>>>>>> London. EC2A 4LD. Company registration number: 07004264.
> >>>>>>>
> >>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
> >>>>>>>
> >>>>>>> Hi Paul,
> >>>>>>>
> >>>>>>> All you need to do is to install this webscript
> >>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
> >>>>>>> instance. The connector itself is already part of the most recent versions
> >>>>>>> of ManifoldCF.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Rafa
> >>>>>>>
> >>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
> >>>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>>
> >>>>>>>> Ok, thanks again guys.
> >>>>>>>>
> >>>>>>>> The Webscript connector it is.
> >>>>>>>>
> >>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
> >>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
> >>>>>>>> is a GitHub page here (
> >>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
> >>>>>>>> which discusses it (although it directs you to a repository of files).
> >>>>>>>>
> >>>>>>>> I am just keen to make sure that any steps I follow to try and get
> >>>>>>>> this Webscript connector installed and working are up-to-date, reliable steps.
> >>>>>>>> I would hate to waste time with out-of-date information.
> >>>>>>>>
> >>>>>>>> Thanks all
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi Paul,
> >>>>>>>>
> >>>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl mentioned.
> >>>>>>>> Web Services is very slow compared to other services, and I've also checked
> >>>>>>>> that Alfresco's CMIS web services do not return a change token (maybe there
> >>>>>>>> is something that I don't know).
> >>>>>>>>
> >>>>>>>> By the way, the current version of the CMIS connector is not aware of the change
> >>>>>>>> token. I would write a patch for you if Alfresco supported the change token
> >>>>>>>> property.
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>> Muhammed
> >>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <
> >>>>>>>> daddywri@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Paul,
> >>>>>>>>>
> >>>>>>>>> The Alfresco Webscript connector is a wholly different connector
> >>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
> >>>>>>>>> webscript plugin to be installed on your Alfresco server to work, though.
> >>>>>>>>>
> >>>>>>>>> Hope that helps.
> >>>>>>>>>
> >>>>>>>>> Karl
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
> >>>>>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Muhammed/Karl,
> >>>>>>>>>>
> >>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
> >>>>>>>>>> very much appreciated.
> >>>>>>>>>>
> >>>>>>>>>> Currently I am using AtomPub for my CMIS repository
> >>>>>>>>>> connection. I have just read something which may shed a little light on
> >>>>>>>>>> this. The post said that change tokens are not passed via AtomPub
> >>>>>>>>>> connections (
> >>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758
> >>>>>>>>>> ).
> >>>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
> >>>>>>>>>> change in Alfresco.
> >>>>>>>>>>
> >>>>>>>>>> It looks like I have two possible options left open to me
> >>>>>>>>>> (correct me if I’m wrong):
> >>>>>>>>>>
> >>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
> >>>>>>>>>> connection mechanism
> >>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
> >>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
> >>>>>>>>>> above?)
> >>>>>>>>>>
> >>>>>>>>>> Thanks again,
> >>>>>>>>>>
> >>>>>>>>>> Paul
> >>>>>>>>>>
> >>>>>>>>>> *Paul Farrell*
> >>>>>>>>>> Senior Search Consultant
> >>>>>>>>>>
> >>>>>>>>>> 109-123 Clifton Street, London EC2A 4LD
> >>>>>>>>>> *T* +44 (0) 207 183 6865 | funnelback.com
> >>>>>>>>>> <http://www.funnelback.com/>
> >>>>>>>>>>
> >>>>>>>>>> *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED
> >>>>>>>>>> STATES
> >>>>>>>>>>
> >>>>>>>>>> Connect with us: LinkedIn
> >>>>>>>>>> <http://www.linkedin.com/company/funnelback> - Twitter
> >>>>>>>>>> <https://twitter.com/funnelback>
> >>>>>>>>>>
> >>>>>>>>>> Funnelback UK Ltd is a limited liability company registered in
> >>>>>>>>>> England & Wales. Registered address: Zetland House 109-123, Clifton Street,
> >>>>>>>>>> London. EC2A 4LD. Company registration number: 07004264.
> >>>>>>>>>>
> >>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Paul,
> >>>>>>>>>>
> >>>>>>>>>> Repositories should give information to ManifoldCF when they are
> >>>>>>>>>> updated. The current CMIS connector reindexes a document only if the latest
> >>>>>>>>>> version of the document has changed, not when it is merely updated.
> >>>>>>>>>>
> >>>>>>>>>> There is a change token property in the CMIS specification, and it
> >>>>>>>>>> should change when a document is updated so that ManifoldCF can understand that
> >>>>>>>>>> the document is updated, but implementing the change token property is optional.
> >>>>>>>>>> I've checked Alfresco's CMIS web site and seen that they didn't set the
> >>>>>>>>>> change token.
> >>>>>>>>>>
> >>>>>>>>>> I think there is nothing we can do at this point.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Paul,
> >>>>>>>>>>>
> >>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
> >>>>>>>>>>> document version string the connector constructs should be adequate to
> >>>>>>>>>>> detect all changes.  Can you create a ticket?
> >>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
> >>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
> >>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
> >>>>>>>>>>> and forth before I can determine that for sure.
> >>>>>>>>>>>
> >>>>>>>>>>> In the meantime, have you considered using the Alfresco
> >>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
> >>>>>>>>>>> although there have been issues reported having to do with running it on
> >>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what the problem is
> >>>>>>>>>>> there; maybe a version dependency of some kind.
> >>>>>>>>>>>
> >>>>>>>>>>> Karl
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
> >>>>>>>>>>> pfarrell@funnelback.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Everyone,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hoping someone may be able to advise.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
> >>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
> >>>>>>>>>>>>
> >>>>>>>>>>>> All is going well apart from, what I would call, the
> >>>>>>>>>>>> ‘incremental crawl’.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The main issue I am having is that the modification of a
> >>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in the next
> >>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
> >>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
> >>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
> >>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
> >>>>>>>>>>>> see the document in the local search engine.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It is as if Manifold is not treating the security update as a
> >>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
> >>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
> >>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
> >>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
> >>>>>>>>>>>> whatever internal record it has for this item.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Any ideas?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Many thanks.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>
>

Re: Manifold/Alfresco seeding and security

Posted by pf...@funnelback.com.
Hi Maurizio,

I will be available all day tomorrow (Wednesday) to help out as much as I can. If it's possible for you to look into this I can take whatever steps you need.

Many thanks,

Paul

-----Original Message-----
From: "Karl Wright" <da...@gmail.com>
Sent: Tuesday, October 20, 2015 12:34pm
To: "user@manifoldcf.apache.org" <us...@manifoldcf.apache.org>
Subject: Re: Manifold/Alfresco seeding and security

Hi Maurizio,

This is the third time we've seen this; can you use Paul's help to chase
down what the issue is?

Karl


On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pf...@funnelback.com>
wrote:

> Hi,
>
> I am using Alfresco Community 5.0.
>
> Having taken that AMP file (version 0.7.1) and then installed it into
> Alfresco and restarted the services, the issue is still present.
>
> I suspect that this is probably more to do with the Manifold end than the
> Alfresco end. It seems it is Manifold that is automatically appending the
> “/api/node” string into the path whenever I use “/alfresco/service” as the
> Context in the repository connection configuration.
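
The path difference described above can be illustrated with a small sketch. This is not the connector's actual code: the join helper and the class name are hypothetical, and the assumption that the request path is formed as Context + "/api/node" + web script path is inferred only from the request line in manifoldcf.log.

```java
class ContextPathSketch {
    // Hypothetical helper: joins path segments with exactly one slash between them.
    static String join(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            // Strip leading and trailing slashes so segments can be written either way.
            String trimmed = p.replaceAll("^/+|/+$", "");
            if (!trimmed.isEmpty()) {
                sb.append('/').append(trimmed);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String context = "/alfresco/service";   // value entered in the Context field
        String script = "/auth/resolve/admin";  // web script used for the connection check

        // Path that appears in manifoldcf.log and returns 404 on this instance.
        String requested = join(context, "/api/node", script);
        // Path reported to reach the web script on Alfresco Community 5.0.
        String reachable = join(context, script);

        System.out.println(requested);
        System.out.println(reachable);
    }
}
```

Because the extra segment sits between the Context and the script path, no Context value can remove it, which matches the observation that nothing entered in that box prevents the 404.
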
>
> If it is of interest, this is the output in the manifoldcf.log file when I
> use the repo connection config I mentioned earlier.
>
> DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
> [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> allocated: 0 of 2; total allocated: 0 of 20]
> DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
> 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> allocated: 1 of 2; total allocated: 1 of 20]
> DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
> http://54.165.85.140:8080
> DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
> 54.165.85.140:8080
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
> 172.31.23.90:58712<->54.165.85.140:8080
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
> /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
> UNCHALLENGED
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
> /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Accept: application/json
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Host: 54.165.85.140:8080
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Connection: Keep-Alive
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Accept-Encoding: gzip,deflate
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET
> /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Accept: application/json[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Host: 54.165.85.140:8080[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Connection: Keep-Alive[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Accept-Encoding: gzip,deflate[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "HTTP/1.1 404 Not Found[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Server: Apache-Coyote/1.1[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Cache-Control: no-cache[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Pragma: no-cache[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Content-Type: text/html;charset=UTF-8[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Transfer-Encoding: chunked[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "630[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
> http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> <head>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>    <title>Web Script Status 404 - Not Found</title>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
> type="text/css" />[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> </head>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> <body>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>    <div>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
> alt="Alfresco" /></td>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>             <td><span class="title">Web Script Status 404 - Not
> Found</span></td>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          </tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       </table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <br/>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td>The Web Script <a
> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> has responded with a status of 404 - Not Found.</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       </table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <br/>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>404 Description:</b></td><td> Requested resource is not
> available.</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td> </td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>Message:</b></td><td>Cannot find object for
> NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
> schema 8,001</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td></td><td> </td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>Diagnostics</b>:</td><td><a
> href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
> Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       </table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>    </div>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> </body>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "</html>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> HTTP/1.1 404 Not Found
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Server: Apache-Coyote/1.1
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Cache-Control: no-cache
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Pragma: no-cache
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Content-Type: text/html;charset=UTF-8
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Transfer-Encoding: chunked
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Date: Tue, 20 Oct 2015 16:18:47 GMT
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
> alive indefinitely
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
> Shutdown connection
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
> connection
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
> [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> allocated: 0 of 2; total allocated: 0 of 20]
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
> shutting down
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
> down
>
> *Paul Farrell*
> Senior Search Consultant
>
> 109-123 Clifton Street, London EC2A 4LD
> *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
>
> *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>
> Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
> Twitter <https://twitter.com/funnelback>
>
> Funnelback UK Ltd is a limited liability company registered in England &
> Wales. Registered address: Zetland House 109-123, Clifton Street, London.
> EC2A 4LD. Company registration number: 07004264.
>
> On 20 Oct 2015, at 16:50, Maurizio Pillitu <ma...@apache.org> wrote:
>
> Hi Paul,
>
> it looks like you're hitting
> https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
> alfresco-indexer are you using? Can you try using
> http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp (or
> the pre-built WAR file -
> http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>  )
>
> HTH
>   mao
>
> On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pf...@funnelback.com>
> wrote:
>
>> Hi,
>>
>> Having had to go back to basics and re-install my Alfresco instance, I
>> can confirm that the AMP file for the alfresco indexer web scripts *does*
>> actually install without error. There must have been an issue with my
>> previous Alfresco instance.
>>
>> Having said that, the Alfresco WebScript connector fails. The failure is
>> down to the ‘Context’ setting (see below):
>>
>> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
>>
>> When you attempt to save the configuration of the WebScript connector,
>> Manifold clearly tries to check the connection. It seems to do this by
>> making an API call (/auth/resolve/admin). The issue is with what Manifold
>> prepends to the start of that path.
>> If I leave the setting as above then Manifold reports   :
>>
>> <tr><td>The Web Script <a
>> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>>
>> In other words, it builds the full path as
>> “/alfresco/service/api/node/auth/resolve/admin”.
>>
>> For my Alfresco Community 5.0 instance, I get to that same web script via
>> the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
>>
>> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
>> inclusion. In other words, there is nothing I can put into that box to
>> prevent it.
>>
>> Paul
>>
>> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
>>
>> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
>> feel certain he'd want to know.
>>
>> Karl
>>
>>
>> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pf...@funnelback.com>
>> wrote:
>>
>>> Hi guys,
>>>
>>> Just to let you know what’s going on - for informational purposes more
>>> than anything.
>>>
>>> I initially tried taking the AMP file provided in the MCF plugins
>>> directory (0.7.0) and tried to install it into Alfresco but got a message
>>> saying a file was missing.
>>>
>>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>>> project and then built it on my local machine. This generated the AMP file
>>> (0.7.2).
>>>
>>> I was able to successfully install the AMP file onto my Alfresco
>>> instance.
>>>
>>> As it happens I now cannot log into Alfresco Share ('bad credentials or
>>> server not available' message) but that is something I can work on.
>>> Apparently the installation of some AMP files has been known to cause this
>>> issue.
>>>
>>> So, progress to a point!
>>>
>>> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> At the Alfresco side, hope this helps:
>>>
>>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>>
>>> Cheers
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com> wrote:
>>>
>>>> The AMP file is actually shipped as part of the binary MCF
>>>> distribution.  You can find it under "plugins".
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pf...@funnelback.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Hopefully this will be my only request for information today.
>>>>> I’m afraid this is a bit of a newbie question but I have managed to
>>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
>>>>> only bit I am missing now is to install the AMP file in Alfresco.
>>>>>
>>>>> I realise that this is slightly outside of the Manifold remit but I
>>>>> wondered if anyone can advise how I build the AMP file from the URL (
>>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>>>>> repository to my local drive but, having never worked with Maven, am at a
>>>>> loss as to how to generate the AMP file that I then need to install into
>>>>> Alfresco.
>>>>>
>>>>> Many thanks,
>>>>>
>>>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>>>>>
>>>>> The only way you can have such a reduced list of connectors is if
>>>>> somebody commented out many connectors in your connectors.xml, or removed
>>>>> them from the database table where they are registered by hand.
>>>>>
>>>>> Karl
>>>>>
>>>>>
>>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>>>>> pfarrell@funnelback.com> wrote:
>>>>>
>>>>>> After a good deal of time clicking around I came to the same
>>>>>> conclusion - that there is no way of telling from the UI!!
>>>>>>
>>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>>>>
>>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>>>>>
>>>>>> <repositoryconnector name="Alfresco Webscript"
>>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>>>>
>>>>>> You can imagine my excitement!
>>>>>>
>>>>>> The only thing I am missing is the option in the UI. When I click to
>>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>>>>>
>>>>>> Perhaps it is too much to hope that I can make a simple
>>>>>> change to enable this repo connection?
>>>>>>
>>>>>> Thanks for all the help everyone
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
>>>>>>
>>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
>>>>>> connection types, you've got a version that supports that connector.
>>>>>>
>>>>>> Thanks,
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>
>>>>>>> Thanks Rafa.
>>>>>>>
>>>>>>> As an aside, is there an easy way to identify which version of
>>>>>>> ManifoldCF you are on?
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> All you need to do is to install this webscript
>>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>>>>>> instance. The connector itself is already part of the most recent versions
>>>>>>> of ManifoldCF
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Rafa
>>>>>>>
>>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>
>>>>>>>> Ok, thanks again guys.
>>>>>>>>
>>>>>>>> The Webscript connector it is.
>>>>>>>>
>>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
>>>>>>>> is a GitHub page here (
>>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>>>>>>>> which discusses it (although it directs you to a repository of files).
>>>>>>>>
>>>>>>>> I am just keen to make sure that any steps I follow to try and get
>>>>>>>> this Webscript connector installed and working are updated, reliable steps.
>>>>>>>> I would hate to waste time with out of date information.
>>>>>>>>
>>>>>>>> Thanks all
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Paul,
>>>>>>>>
>>>>>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned.
>>>>>>>> Web services are very slow compared to other services, and I've also
>>>>>>>> checked that Alfresco's CMIS web services do not return a change token
>>>>>>>> (though there may be something I don't know).
>>>>>>>>
>>>>>>>> By the way, the current version of the CMIS connector is not aware of the
>>>>>>>> change token. I could write a patch for you if Alfresco supported the
>>>>>>>> change token property.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Muhammed
>>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <
>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Paul,
>>>>>>>>>
>>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>>>>>>>>>
>>>>>>>>> Hope that helps.
>>>>>>>>>
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Muhammed/Karl,
>>>>>>>>>>
>>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
>>>>>>>>>> very much appreciated.
>>>>>>>>>>
>>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
>>>>>>>>>> connection. I have just read something which may shed a little light on
>>>>>>>>>> this. The post read that change tokens are not passed via AtomPub
>>>>>>>>>> connections (
>>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>>>>>>>>> change in Alfresco.
>>>>>>>>>>
>>>>>>>>>> It looks like I have two possible options left open to me
>>>>>>>>>> (correct me if I’m wrong):
>>>>>>>>>>
>>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>>>>>>>> connection mechanism
>>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>>>>>>>>> above?)
>>>>>>>>>>
>>>>>>>>>> Thanks again,
>>>>>>>>>>
>>>>>>>>>> Paul
>>>>>>>>>>
>>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Paul,
>>>>>>>>>>
>>>>>>>>>> Repositories should tell ManifoldCF when their content is updated. The
>>>>>>>>>> current CMIS connector reindexes a document only if the latest version of
>>>>>>>>>> the document has changed, not whenever the document is updated.
>>>>>>>>>>
>>>>>>>>>> There is a change token property in the CMIS specification, and it
>>>>>>>>>> should change when a document is updated so that ManifoldCF can tell that
>>>>>>>>>> the document was updated, but implementing the change token property is
>>>>>>>>>> optional. I've checked Alfresco's CMIS web site and seen that they didn't
>>>>>>>>>> set the change token.
>>>>>>>>>>
>>>>>>>>>> I think, there is nothing we can do at this point.
>>>>>>>>>>
>>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Paul,
>>>>>>>>>>>
>>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>>>>>>>> document version string the connector constructs should be adequate to
>>>>>>>>>>> detect all changes.  Can you create a ticket?
>>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>>>>>>>>>> and forth before I can determine that for sure.
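Editorial sketch of the version-string idea mentioned above: a crawler stores one version string per document and re-indexes only when the string changes. The class, method names, and string layout below are hypothetical and are not taken from the CMIS connector; the point is only that a version string which leaves out the access list cannot notice a security-only change.

```java
import java.util.Arrays;

// Hypothetical illustration; not the CMIS connector's actual code.
final class VersionStringSketch {

    // Builds a version string from the modification stamp plus the sorted reader list.
    static String versionString(String lastModified, String... readers) {
        String[] sorted = readers.clone();
        Arrays.sort(sorted); // order-independent, so the same ACL always yields the same string
        return lastModified + "|" + String.join(",", sorted);
    }

    // A document needs re-indexing when its current version string differs from the stored one.
    static boolean needsReindex(String storedVersion, String currentVersion) {
        return !currentVersion.equals(storedVersion);
    }

    public static void main(String[] args) {
        // Same modification stamp, but user A has been removed from the readers.
        String before = versionString("2015-10-19T11:00:00Z", "userA", "userB");
        String after = versionString("2015-10-19T11:00:00Z", "userB");
        System.out.println("ACL included in version string, re-index needed: "
                + needsReindex(before, after));

        // If only the modification stamp were used, the change would go unnoticed.
        System.out.println("ACL omitted from version string, re-index needed: "
                + needsReindex("2015-10-19T11:00:00Z", "2015-10-19T11:00:00Z"));
    }
}
```

Whether the connector or the repository is at fault then comes down to which inputs are available when the version string is built.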
>>>>>>>>>>>
>>>>>>>>>>> In the meantime, have you considered using the Alfresco
>>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
>>>>>>>>>>> although there have been issues reported having to do with running it on
>>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what the problem is
>>>>>>>>>>> there; maybe a version dependency of some kind.
>>>>>>>>>>>
>>>>>>>>>>> Karl
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> Hoping someone may be able to advise.
>>>>>>>>>>>>
>>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
>>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>>>>>>>>>>>>
>>>>>>>>>>>> All is going well apart from, what I would call, the
>>>>>>>>>>>> ‘incremental crawl’.
>>>>>>>>>>>>
>>>>>>>>>>>> The main issue I am having is that the modification of a
>>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in next
>>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
>>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
>>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
>>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
>>>>>>>>>>>> see the document in the local search engine.
>>>>>>>>>>>>
>>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>>>>>>>>>> whatever internal record it has for this item.
>>>>>>>>>>>>
>>>>>>>>>>>> Any ideas?
>>>>>>>>>>>>
>>>>>>>>>>>> Many thanks.



Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hi Maurizio,

This is the third time we've seen this; can you use Paul's help to chase
down what the issue is?

Karl


On Tue, Oct 20, 2015 at 12:19 PM, Paul Farrell <pf...@funnelback.com>
wrote:

> Hi,
>
> I am using Alfresco Community 5.0.
>
> Having taken that AMP file (version 0.7.1) and then installed it into
> Alfresco and restarted the services, the issue is still present.
>
> I suspect that this is probably more to do with the Manifold end than the
> Alfresco end. It seems it is Manifold that is automatically appending the
> “/api/node” string into the path whenever I use “/alfresco/service” as the
> Context in the repository connection configuration.
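Editorial sketch of the path composition described above. The segment values come from the log output in this thread; the join helper is hypothetical and is not ManifoldCF's actual code. It only shows how a fixed "/api/node" segment placed after the configured Context yields the URL that returns 404, whatever is typed into the Context field.

```java
// Hypothetical illustration; not ManifoldCF's actual code.
final class WebscriptPathSketch {

    // Joins path segments with exactly one slash between them.
    static String join(String... segments) {
        StringBuilder path = new StringBuilder();
        for (String segment : segments) {
            // Trim leading and trailing slashes so segments can be given either way.
            String trimmed = segment.replaceAll("^/+|/+$", "");
            if (!trimmed.isEmpty()) {
                path.append('/').append(trimmed);
            }
        }
        return path.toString();
    }

    public static void main(String[] args) {
        String context = "/alfresco/service"; // value entered in the Context field

        // Path seen in manifoldcf.log: an extra "/api/node" follows the context.
        System.out.println(join(context, "api/node", "auth/resolve/admin"));

        // Path reported in this thread as reachable on Alfresco Community 5.0.
        System.out.println(join(context, "auth/resolve/admin"));
    }
}
```

Requesting both printed paths against the Alfresco host, for example from a browser, would show which of the two the installed web script actually answers on.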
>
> If it is of interest, this is the output in the manifoldcf.log file when I
> use the repo connection config I mentioned earlier.
>
> DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request:
> [route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> allocated: 0 of 2; total allocated: 0 of 20]
> DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id:
> 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> allocated: 1 of 2; total allocated: 1 of 20]
> DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->
> http://54.165.85.140:8080
> DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /
> 54.165.85.140:8080
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established
> 172.31.23.90:58712<->54.165.85.140:8080
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET
> /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state:
> UNCHALLENGED
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET
> /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Accept: application/json
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Host: 54.165.85.140:8080
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Connection: Keep-Alive
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> Accept-Encoding: gzip,deflate
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET
> /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Accept: application/json[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Host: 54.165.85.140:8080[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Connection: Keep-Alive[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "Accept-Encoding: gzip,deflate[\r][\n]"
> DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >>
> "[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "HTTP/1.1 404 Not Found[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Server: Apache-Coyote/1.1[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Cache-Control: no-cache[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Pragma: no-cache[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Content-Type: text/html;charset=UTF-8[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Transfer-Encoding: chunked[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "630[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
> http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> <head>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>    <title>Web Script Status 404 - Not Found</title>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>    <link rel="stylesheet" href="/alfresco/css/webscripts.css"
> type="text/css" />[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> </head>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> <body>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>    <div>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>             <td><img src="/alfresco/images/logo/AlfrescoLogo32.png"
> alt="Alfresco" /></td>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>             <td><span class="title">Web Script Status 404 - Not
> Found</span></td>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          </tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       </table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <br/>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td>The Web Script <a
> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> has responded with a status of 404 - Not Found.</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       </table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <br/>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       <table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>404 Description:</b></td><td> Requested resource is not
> available.</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td>&nbsp;</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>Message:</b></td><td>Cannot find object for
> NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23)
> schema 8,001</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td></td><td>&nbsp;</td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>          <tr><td><b>Diagnostics</b>:</td><td><a
> href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web
> Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>       </table>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
>    </div>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "
> </body>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "</html>[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "[\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> "[\r][\n]"
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> HTTP/1.1 404 Not Found
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Server: Apache-Coyote/1.1
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Cache-Control: no-cache
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Pragma: no-cache
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Content-Type: text/html;charset=UTF-8
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Transfer-Encoding: chunked
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 <<
> Date: Tue, 20 Oct 2015 16:18:47 GMT
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept
> alive indefinitely
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10:
> Shutdown connection
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close
> connection
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released:
> [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route
> allocated: 0 of 2; total allocated: 0 of 20]
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is
> shutting down
> DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut
> down
>
> On 20 Oct 2015, at 16:50, Maurizio Pillitu <ma...@apache.org> wrote:
>
> Hi Paul,
>
> it looks like you're hitting
> https://github.com/maoo/alfresco-indexer/issues/3 ; which version of
> alfresco-indexer are you using? Can you try using
> http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp (or
> the pre-built WAR file -
> http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
>  )
>
> HTH
>   mao
>
> On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pf...@funnelback.com>
> wrote:
>
>> Hi,
>>
>> Having had to go back to basics and re-install my Alfresco instance, I
>> can confirm that the AMP file for the alfresco indexer web scripts *does*
>> actually install without error. There must have been an issue with my
>> previous Alfresco instance.
>>
>> Having said that, the Alfresco WebScript connector fails. The failure is
>> down to the ‘Context’ setting (see below):
>>
>> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
>>
>> When you attempt to save the configuration of the WebScript connector,
>> Manifold clearly tries to check the connection. It seems to do this by
>> making an API call (/auth/resolve/admin). The issue is with what Manifold
>> prepends to the start of that path.
>> If I leave the setting as above then Manifold reports   :
>>
>> <tr><td>The Web Script <a
>> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
>> has responded with a status of 404 - Not Found.</td></tr>[\n]”
>>
>> In other words, it builds the full path as
>> “/alfresco/service/api/node/auth/resolve/admin”.
>>
>> For my Alfresco Community 5.0 instance, I get to that same web script via
>> the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
>>
>> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
>> inclusion. In other words, there is nothing I can put into that box to
>> prevent it.
>>
>> Paul
>>
>> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
>>
>> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
>> feel certain he'd want to know.
>>
>> Karl
>>
>>
>> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pf...@funnelback.com>
>> wrote:
>>
>>> Hi guys,
>>>
>>> Just to let you know what’s going on - for informational purposes more
>>> than anything.
>>>
>>> I initially tried taking the AMP file provided in the MCF plugins
>>> directory (0.7.0) and tried to install it into Alfresco but got a message
>>> saying a file was missing.
>>>
>>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>>> project and then built it on my local machine. This generated the AMP file
>>> (0.7.2).
>>>
>>> I was able to successfully install the AMP file onto my Alfresco
>>> instance.
>>>
>>> As it happens I now cannot log into Alfresco Share ('bad credentials or
>>> server not available' message) but that is something I can work on.
>>> Apparently the installation of some AMP files has been known to cause this
>>> issue.
>>>
>>> So, progress to a point!
>>>
>>> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> At the Alfresco side, hope this helps:
>>>
>>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>>
>>> Cheers
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com> wrote:
>>>
>>>> The AMP file is actually shipped as part of the binary MCF
>>>> distribution.  You can find it under "plugins".
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pf...@funnelback.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Hopefully this will be my only request for information today.
>>>>> I’m afraid this is a bit of a newbie question but I have managed to
>>>>> get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The
>>>>> only bit I am missing now is to install the AMP file in Alfresco.
>>>>>
>>>>> I realise that this is slightly outside of the Manifold remit but I
>>>>> wondered if anyone can advise how I build the AMP file from the URL (
>>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>>>>> repository to my local drive but, having never worked with Maven, am at a
>>>>> loss as to how to generate the AMP file that I then need to install into
>>>>> Alfresco.
>>>>>
>>>>> Many thanks,
>>>>>
>>>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>>>>>
>>>>> The only way you can have such a reduced list of connectors is if
>>>>> somebody commented out many connectors in your connectors.xml, or removed
>>>>> them from the database table where they are registered by hand.
>>>>>
>>>>> Karl
>>>>>
>>>>>
>>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <
>>>>> pfarrell@funnelback.com> wrote:
>>>>>
>>>>>> After a good deal of time clicking around I came to the same
>>>>>> conclusion - that there is no way of telling from the UI!!
>>>>>>
>>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>>>>
>>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>>>>>
>>>>>> <repositoryconnector name="Alfresco Webscript"
>>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>>>>
>>>>>> You can imagine my excitement!
>>>>>>
>>>>>> The only thing I am missing is the option in the UI. When I click to
>>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>>>>>
>>>>>> Perhaps I am hoping for too much to hope that I can make a simple
>>>>>> change to enable this repo connection?
>>>>>>
>>>>>> Thanks for all the help everyone
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
>>>>>>
>>>>>> Hah; there's not a way to inquire in the UI, if that's what you
>>>>>> mean.  But if you see "Alfresco webscript" in the list of repository
>>>>>> connection types, you've got a version that supports that connector.
>>>>>>
>>>>>> Thanks,
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>
>>>>>>> Thanks Rafa.
>>>>>>>
>>>>>>> As an aside, is there an easy way to identify which version of
>>>>>>> ManifoldCF you are on?
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> All you need to do is to install this webscript
>>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>>>>>> instance. The connector itself is already part of the most recent versions
>>>>>>> of ManifoldCF
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Rafa
>>>>>>>
>>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>
>>>>>>>> Ok, thanks again guys.
>>>>>>>>
>>>>>>>> The Webscript connector it is.
>>>>>>>>
>>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
>>>>>>>> is a GitHub page here (
>>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>>>>>>>> which discusses it (although it directs you to a repository of files).
>>>>>>>>
>>>>>>>> I am just keen to make sure that any steps I follow to try and get
>>>>>>>> this Webscript connector installed and working are updated, reliable steps.
>>>>>>>> I would hate to waste time with out of date information.
>>>>>>>>
>>>>>>>> Thanks all
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Paul,
>>>>>>>>
>>>>>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned.
>>>>>>>> Web Services is slow compared to other services, and I've also checked
>>>>>>>> that Alfresco's CMIS Web Services does not return a change token (though
>>>>>>>> there may be something I don't know).
>>>>>>>>
>>>>>>>> By the way, the current version of the CMIS connector is not aware of the
>>>>>>>> change token. I would write a patch for you if Alfresco supported the
>>>>>>>> change token property.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Muhammed
>>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <
>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Paul,
>>>>>>>>>
>>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>>>>>>>>>
>>>>>>>>> Hope that helps.
>>>>>>>>>
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Muhammed/Karl,
>>>>>>>>>>
>>>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is
>>>>>>>>>> very much appreciated.
>>>>>>>>>>
>>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
>>>>>>>>>> connection. I have just read something which may shed a little light on
>>>>>>>>>> this. The post read that change tokens are not passed via AtomPub
>>>>>>>>>> connections (
>>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>>>>>>>>> change in Alfresco.
>>>>>>>>>>
>>>>>>>>>> It looks like I have two possible options left open to me
>>>>>>>>>> (correct me if I’m wrong):
>>>>>>>>>>
>>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>>>>>>>> connection mechanism
>>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>>>>>>>>> above?)
>>>>>>>>>>
>>>>>>>>>> Thanks again,
>>>>>>>>>>
>>>>>>>>>> Paul
>>>>>>>>>>
>>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Paul,
>>>>>>>>>>
>>>>>>>>>> Repositories should inform ManifoldCF when documents are updated. The
>>>>>>>>>> current CMIS connector reindexes a document only if its latest version
>>>>>>>>>> has changed, not when the document is merely updated.
>>>>>>>>>>
>>>>>>>>>> The CMIS specification defines a change token property that should change
>>>>>>>>>> whenever a document is updated, so that ManifoldCF can tell the document
>>>>>>>>>> was updated, but implementing the change token property is optional.
>>>>>>>>>> I've checked Alfresco's CMIS web site and seen that they don't set the
>>>>>>>>>> change token.
>>>>>>>>>>
>>>>>>>>>> I think there is nothing we can do at this point.
>>>>>>>>>>
>>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Paul,
>>>>>>>>>>>
>>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>>>>>>>> document version string the connector constructs should be adequate to
>>>>>>>>>>> detect all changes.  Can you create a ticket?
>>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>>>>>>>>>> and forth before I can determine that for sure.
>>>>>>>>>>>
>>>>>>>>>>> In the meantime, have you considered using the Alfresco
>>>>>>>>>>> Webscript connector?  It's the preferred way to do Alfresco indexing,
>>>>>>>>>>> although there have been issues reported having to do with running it on
>>>>>>>>>>> some configurations of Alfresco.  I'm not entirely sure what the problem is
>>>>>>>>>>> there; maybe a version dependency of some kind.
>>>>>>>>>>>
>>>>>>>>>>> Karl
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> Hoping someone may be able to advise.
>>>>>>>>>>>>
>>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
>>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>>>>>>>>>>>>
>>>>>>>>>>>> All is going well apart from, what I would call, the
>>>>>>>>>>>> ‘incremental crawl’.
>>>>>>>>>>>>
>>>>>>>>>>>> The main issue I am having is that the modification of a
>>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in next
>>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
>>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
>>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
>>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
>>>>>>>>>>>> see the document in the local search engine.
>>>>>>>>>>>>
>>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>>>>>>>>>> whatever internal record it has for this item.
>>>>>>>>>>>>
>>>>>>>>>>>> Any ideas?
>>>>>>>>>>>>
>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi,

I am using Alfresco Community 5.0. 

I took that AMP file (version 0.7.1), installed it into Alfresco and restarted the services, but the issue is still present. 

I suspect that this is probably more to do with the Manifold end than the Alfresco end. It seems it is Manifold that is automatically appending the “/api/node” string into the path whenever I use “/alfresco/service” as the Context in the repository connection configuration. 
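
To make the difference between the two paths concrete, here is a minimal sketch. It is not ManifoldCF code; the helper name build_path and the idea of an "infix" argument are assumptions made purely for illustration. It only shows how the path in the 404 message differs from the one that works in a browser.

```python
# Hypothetical helper, NOT part of ManifoldCF: it only illustrates the two
# paths discussed in this thread.

def build_path(context, resource, infix=""):
    """Join context, optional infix and resource into one absolute path."""
    parts = [context, infix, resource]
    # Drop empty parts and surrounding slashes so the join is predictable.
    segments = [p.strip("/") for p in parts if p.strip("/")]
    return "/" + "/".join(segments)

CONTEXT = "/alfresco/service"      # value entered in the 'Context' box
RESOURCE = "/auth/resolve/admin"   # web script used for the connection check

# Path that appears in the 404 response (with the extra '/api/node').
requested = build_path(CONTEXT, RESOURCE, infix="/api/node")
# Path that responds on my Alfresco Community 5.0 instance.
expected = build_path(CONTEXT, RESOURCE)

print(requested)
print(expected)
```

Whatever is typed into the Context box only changes the first argument, which is why no Context value can remove the extra segment if the connector adds it itself.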

If it is of interest, this is the output in the manifoldcf.log file when I use the repo connection config I mentioned earlier. 

DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection request: [route: {}->http://54.165.85.140:8080][total kept alive: 0; route allocated: 0 of 2; total allocated: 0 of 20]
DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connection leased: [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route allocated: 1 of 2; total allocated: 1 of 20]
DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Opening connection {}->http://54.165.85.140:8080
DEBUG 2015-10-20 12:18:46,869 (qtp182259421-40) - Connecting to /54.165.85.140:8080
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Connection established 172.31.23.90:58712<->54.165.85.140:8080
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Executing request GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - Proxy auth state: UNCHALLENGED
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> Accept: application/json
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> Authorization: Basic YWRtaW46RnVubmVsYmFjazE=
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> Host: 54.165.85.140:8080
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> Connection: Keep-Alive
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> Accept-Encoding: gzip,deflate
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "Accept: application/json[\r][\n]"
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "Authorization: Basic YWRtaW46RnVubmVsYmFjazE=[\r][\n]"
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "Host: 54.165.85.140:8080[\r][\n]"
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "Connection: Keep-Alive[\r][\n]"
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "Accept-Encoding: gzip,deflate[\r][\n]"
DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> "[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "HTTP/1.1 404 Not Found[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "Server: Apache-Coyote/1.1[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "Cache-Control: no-cache[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "Pragma: no-cache[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "Content-Type: text/html;charset=UTF-8[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "Transfer-Encoding: chunked[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "Date: Tue, 20 Oct 2015 16:18:47 GMT[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "630[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "   <head>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "      <title>Web Script Status 404 - Not Found</title>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "      <link rel="stylesheet" href="/alfresco/css/webscripts.css" type="text/css" />[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "   </head>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "   <body>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "      <div>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "         <table>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "               <td><img src="/alfresco/images/logo/AlfrescoLogo32.png" alt="Alfresco" /></td>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "               <td><span class="title">Web Script Status 404 - Not Found</span></td>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            </tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "         </table>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "         <br/>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "         <table>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr><td>The Web Script <a href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a> has responded with a status of 404 - Not Found.</td></tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "         </table>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "         <br/>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "         <table>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr><td><b>404 Description:</b></td><td> Requested resource is not available.</td></tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr><td>&nbsp;</td></tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr><td><b>Message:</b></td><td>Cannot find object for NodeIdReference[storeRef=auth://resolve,id=admin]</td></tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr><td><b>Server</b>:</td><td>Community v5.0.0 (r75118-b23) schema 8,001</td></tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr><td><b>Time</b>:</td><td>Oct 20, 2015 4:18:47 PM</td></tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr><td></td><td>&nbsp;</td></tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "            <tr><td><b>Diagnostics</b>:</td><td><a href="/alfresco/service/script/org/alfresco/cmis/item.get">Inspect Web Script (org/alfresco/cmis/item.get)</a></td></tr>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "         </table>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "      </div>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "   </body>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "</html>[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "[\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "[\r][\n]"
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << HTTP/1.1 404 Not Found
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << Server: Apache-Coyote/1.1
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << Cache-Control: no-cache
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << Expires: Thu, 01 Jan 1970 00:00:00 GMT
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << Pragma: no-cache
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << Content-Type: text/html;charset=UTF-8
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << Transfer-Encoding: chunked
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << Date: Tue, 20 Oct 2015 16:18:47 GMT
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection can be kept alive indefinitely
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Shutdown connection
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection discarded
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10: Close connection
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection released: [id: 10][route: {}->http://54.165.85.140:8080][total kept alive: 0; route allocated: 0 of 2; total allocated: 0 of 20]
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager is shutting down
DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - Connection manager shut down
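
For anyone reading a log like the one above, the following is a rough sketch of how the request path and response status could be pulled out of it. It assumes the HttpClient wire/header line layout shown here ('>>' for outgoing, '<<' for incoming); the function name summarize and the regular expressions are my own and are not part of ManifoldCF or HttpClient.

```python
import re

# Outgoing request line, with or without the wire-log opening quote.
REQUEST_RE = re.compile(r'>> "?(GET|POST|PUT|DELETE|HEAD) (\S+) HTTP/1\.[01]')
# Incoming status line, with or without the wire-log opening quote.
STATUS_RE = re.compile(r'<< "?HTTP/1\.[01] (\d{3})')

def summarize(lines):
    """Pair each request line with the next status line that follows it."""
    results = []
    pending = None
    for line in lines:
        req = REQUEST_RE.search(line)
        if req:
            # The request is logged twice (headers and wire); keep the latest.
            pending = (req.group(1), req.group(2))
            continue
        status = STATUS_RE.search(line)
        if status and pending:
            results.append((pending[0], pending[1], int(status.group(1))))
            pending = None  # ignore the duplicate status line that follows
    return results

# Two lines copied from the log above.
SAMPLE = [
    r'DEBUG 2015-10-20 12:18:46,870 (qtp182259421-40) - http-outgoing-10 >> GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1',
    r'DEBUG 2015-10-20 12:18:46,883 (qtp182259421-40) - http-outgoing-10 << "HTTP/1.1 404 Not Found[\r][\n]"',
]

print(summarize(SAMPLE))
```

Run against the full manifoldcf.log, this would give one (method, path, status) entry per request, which makes it easier to see which path the connector actually requested.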

Paul Farrell
Senior Search Consultant
 
109-123 Clifton Street, London EC2A 4LD
T +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>

UNITED KINGDOM | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES

Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> - Twitter <https://twitter.com/funnelback>

Funnelback UK Ltd is a limited liability company registered in England & Wales. Registered address: Zetland House 109-123, Clifton Street, London. EC2A 4LD. Company registration number: 07004264.

> On 20 Oct 2015, at 16:50, Maurizio Pillitu <ma...@apache.org> wrote:
> 
> Hi Paul,
> 
> It looks like you're hitting https://github.com/maoo/alfresco-indexer/issues/3 ; which version of alfresco-indexer are you using? Can you try using http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp (or the pre-built WAR file - http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar )?
> 
> HTH
>   mao
> 
> On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pfarrell@funnelback.com> wrote:
> Hi,
> 
> Having had to go back to basics and re-install my Alfresco instance, I can confirm that the AMP file for the alfresco indexer web scripts does actually install without error. There must have been an issue with my previous Alfresco instance. 
> 
> Having said that, the Alfresco WebScript connector fails. The failure is down to the ‘Context’ setting (see below):
> 
> <4a6db6238cff01e7ff77cdaf7e6ea050.png>
> 
> When you attempt to save the configuration of the WebScript connector, Manifold clearly tries to check the connection. It seems to do this by making an API call (/auth/resolve/admin). The issue is with what Manifold prepends to the start of that path. 
> If I leave the setting as above then Manifold reports   :   
> 
> <tr><td>The Web Script <a href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a> has responded with a status of 404 - Not Found.</td></tr>[\n]”
> 
> In other words, it builds the full path as “alfresco/service/api/node/auth/resolve/admin”.
> 
> For my Alfresco Community 5.0 instance, I get to that same web script via the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
> 
> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path inclusion. In other words, there is nothing I can put into that box to prevent it. 
> 
> Paul
>> On 20 Oct 2015, at 12:56, Karl Wright <daddywri@gmail.com> wrote:
>> 
> 
>> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I feel certain he'd want to know.
>> 
>> Karl
>> 
>> 
>> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>> Hi guys,
>> 
>> Just to let you know what’s going on - for informational purposes more than anything.
>> 
>> I initially tried taking the AMP file provided in the MCF plugins directory (0.7.0) and tried to install it into Alfresco but got a message saying a file was missing.
>> 
>> Instead, I cloned the repository on GitHub for the alfresco-indexer project and then built it on my local machine. This generated the AMP file (0.7.2). 
>> 
>> I was able to successfully install the AMP file onto my Alfresco instance. 
>> 
>> As it happens, I now cannot log into Alfresco Share ('bad credentials or server not available' message), but that is something I can work on. Apparently the installation of some AMP files has been known to cause this issue. 
>> 
>> So, progress to a point!
>> 
>>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com> wrote:
>>> 
>>> Hi, 
>>> 
>>> At the Alfresco side, hope this helps:
>>> 
>>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>> 
>>> Cheers
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>>> 
>>> The AMP file is actually shipped as part of the binary MCF distribution.  You can find it under "plugins".
>>> 
>>> Karl
>>> 
>>> 
>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>> Hi all,
>>> 
>>> Hopefully this will be my only request for information today. 
>>> I’m afraid this is a bit of a newbie question but I have managed to get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The only bit I am missing now is to install the AMP file in Alfresco. 
>>> 
>>> I realise that this is slightly outside of the Manifold remit but I wondered if anyone can advise how I build the AMP file from the URL (https://github.com/maoo/alfresco-indexer)? I have cloned the repository to my local drive but, having never worked with Maven, am at a loss as to how to generate the AMP file that I then need to install into Alfresco. 
>>> 
>>> Many thanks,
>>> 
>>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com> wrote:
>>>> 
>>>> The only way you can have such a reduced list of connectors is if somebody commented out many connectors in your connectors.xml, or removed them from the database table where they are registered by hand.
>>>> 
>>>> Karl
>>>> 
>>>> 
>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>> After a good deal of time clicking around I came to the same conclusion - that there is no way of telling from the UI!!
>>>> 
>>>> Having dug a bit deeper I believe I may actually have the Alfresco WebScript connectors installed. At least the 0.7.0 version. I notice in the ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>> 
>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>>> 
>>>> <repositoryconnector name="Alfresco Webscript" class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>> 
>>>> You can imagine my excitement!
>>>> 
>>>> The only thing I am missing is the option in the UI. When I click to create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive, HDFS, Jira, Meridio, RSS, Sharepoint. 
>>>> 
>>>> Perhaps I am hoping for too much to hope that I can make a simple change to enable this repo connection?
>>>> 
>>>> Thanks for all the help everyone 
>>>> 
>>>> 
>>>> 
>>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com> wrote:
>>>>> 
>>>>> Hah; there's not a way to inquire in the UI, if that's what you mean.  But if you see "Alfresco webscript" in the list of repository connection types, you've got a version that supports that connector.
>>>>> 
>>>>> Thanks,
>>>>> Karl
>>>>> 
>>>>> 
>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>> Thanks Rafa.
>>>>> 
>>>>> As an aside, is there an easy way to identify which version of ManifoldCF you are on?
>>>>> 
>>>>> Cheers
>>>>> 
>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org> wrote:
>>>>>> 
>>>>>> Hi Paul, 
>>>>>> 
>>>>>> All you need to do is to install this webscript <https://github.com/maoo/alfresco-indexer> within your Alfresco instance. The connector itself is already part of the most recent versions of ManifoldCF
>>>>>> 
>>>>>> Cheers,
>>>>>> Rafa
>>>>>> 
>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>>> Ok, thanks again guys. 
>>>>>> 
>>>>>> The Webscript connector it is. 
>>>>>> 
>>>>>> I realise I am asking a lot here but are there any easy-to-follow guidelines on how to get this Webscript connector installed?  I see there is a GitHub page here (https://github.com/maoo/alfresco-webscript-manifold-connector) which discusses it (although it directs you to a repository of files). 
>>>>>> 
>>>>>> I am just keen to make sure that any steps I follow to try and get this Webscript connector installed and working are updated, reliable steps. I would hate to waste time with out of date information. 
>>>>>> 
>>>>>> Thanks all
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com <ma...@gmail.com>> wrote:
>>>>>>> 
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl mentioned. Web Services is very slow compared to other services, and I've also checked that Alfresco CMIS Web Services does not return a change token (maybe there is something that I don't know). 
>>>>>>> 
>>>>>>> By the way, the current version of the CMIS connector is not aware of the change token. I would write a patch for you if Alfresco supported the change token property.
>>>>>>> 
>>>>>>> Thanks!
>>>>>>> Muhammed 
>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> The Alfresco Webscript connector is a wholly different connector that has no relation to the CMIS connector.  It requires an Alfresco webscript plugin be installed on your Alfresco server to work, though.
>>>>>>> 
>>>>>>> Hope that helps.
>>>>>>> 
>>>>>>> Karl
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>>>>> Hi Muhammed/Karl,
>>>>>>> 
>>>>>>> Firstly, thank you so much for taking the time to reply. It is very much appreciated. 
>>>>>>> 
>>>>>>> Currently I am using the AtomPub for my CMIS repository connection. I have just read something which may shed a little light on this. The post read that change tokens are not passed via AtomPub connections (https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758). If true, this would explain why ManifoldCF may be unable to determine a change in Alfresco.
>>>>>>> 
>>>>>>> It looks like I have two possible options left open to me (correct me if I’m wrong):
>>>>>>> 
>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the connection mechanism
>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’ connector?  (or is this the same as the ‘Web Services’ connection mentioned above?)
>>>>>>> 
>>>>>>> Thanks again,
>>>>>>> 
>>>>>>> Paul
>>>>>>> 
>>>>>>> Paul Farrell
>>>>>>> 
>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com <ma...@gmail.com>> wrote:
>>>>>>>> 
>>>>>>>> Hi Paul,
>>>>>>>> 
>>>>>>>> Repositories should tell ManifoldCF when their content is updated. The current CMIS connector reindexes a document only when its latest version has changed, not when the document has merely been updated. 
>>>>>>>> 
>>>>>>>> The CMIS specification defines a change token property that should change whenever a document is updated, so that ManifoldCF can tell the document has been updated, but implementing the change token property is optional.  I've checked Alfresco's CMIS web site and seen that they do not set the change token.
>>>>>>>> 
>>>>>>>> I think there is nothing we can do at this point.
>>>>>>>> 
>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>>>>>>> Hi Paul,
>>>>>>>> 
>>>>>>>> This looks like a bug in the CMIS connector to me; usually the document version string the connector constructs should be adequate to detect all changes.  Can you create a ticket?  https://issues.apache.org/jira , project ManifoldCF.  Please include what version of MCF you are using here.  FWIW, this may be in fact a bug in the Alfresco CMIS implementation, but we'll have to have some back and forth before I can determine that for sure.
>>>>>>>> 
>>>>>>>> In the meantime, have you considered using the Alfresco Webscript connector?  It's the preferred way to do Alfresco indexing, although there have been issues reported having to do with running it on some configurations of Alfresco.  I'm not entirely sure what the problem is there; maybe a version dependency of some kind.
>>>>>>>> 
>>>>>>>> Karl
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>>>>>> Hi Everyone,
>>>>>>>> 
>>>>>>>> Hoping someone may be able to advise.
>>>>>>>> 
>>>>>>>> I am currently using Manifold, together with a CMIS connector, to retrieve and index content from an Alfresco repository.
>>>>>>>> 
>>>>>>>> All is going well apart from, what I would call, the ‘incremental crawl’.
>>>>>>>> 
>>>>>>>> The main issue I am having is that the modification of a document’s security settings, in Alfresco, is not being picked up in the next Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A and B as Consumers. I run a crawl in Manifold and it picks up the documents fine.  The security is set as expected. I then remove ‘User A’ from the security of that document and re-run the Manifold crawl. User A can still see the document in the local search engine.
>>>>>>>> 
>>>>>>>> It is as if Manifold is not treating the security update as a ‘modification’ and is therefore not refreshing it. Note that if I go into the Output Connections, edit and save the relevant output connection and then click ‘Remove all associated documents’, the next time I crawl, the changes are picked up. It is clear that Manifold is just not updating whatever internal record it has for this item.
>>>>>>>> 
>>>>>>>> Any ideas?
>>>>>>>> 
>>>>>>>> Many thanks.


Re: Manifold/Alfresco seeding and security

Posted by Maurizio Pillitu <ma...@apache.org>.
Hi Paul,

It looks like you're hitting
https://github.com/maoo/alfresco-indexer/issues/3. Which version of
alfresco-indexer are you using? Can you try using the 0.7.1 AMP:
http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts%7C0.7.1%7Camp
or the pre-built WAR file:
http://search.maven.org/#artifactdetails%7Ccom.github.maoo.indexer%7Calfresco-indexer-webscripts-war%7C0.7.1%7Cwar
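
For anyone who prefers a direct download over the search.maven.org pages, here is a minimal Python sketch of how the coordinates in the two links above map onto the standard Maven 2 repository layout. The base URL (repo1.maven.org) and the helper name maven_artifact_url are assumptions for illustration, not something stated in this thread, so check that the file really exists at the resulting URL before relying on it.

```python
# Sketch only: maps Maven coordinates onto the standard Maven 2 repository
# layout (group path / artifact / version / artifact-version.packaging).
# ASSUMPTION: the artifacts are served from repo1.maven.org; the thread itself
# only links to search.maven.org.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"  # assumed base URL


def maven_artifact_url(group_id, artifact_id, version, packaging, base=MAVEN_CENTRAL):
    """Return the download URL for an artifact in a Maven 2 layout repository."""
    group_path = group_id.replace(".", "/")  # dots in the group id become directories
    file_name = f"{artifact_id}-{version}.{packaging}"
    return f"{base}/{group_path}/{artifact_id}/{version}/{file_name}"


# Coordinates taken from the two search.maven.org links above.
amp_url = maven_artifact_url(
    "com.github.maoo.indexer", "alfresco-indexer-webscripts", "0.7.1", "amp"
)
war_url = maven_artifact_url(
    "com.github.maoo.indexer", "alfresco-indexer-webscripts-war", "0.7.1", "war"
)

if __name__ == "__main__":
    print(amp_url)
    print(war_url)
```

The function only builds the URL; it does not download anything or verify that the artifact is published there.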

HTH
  mao

On Tue, Oct 20, 2015 at 5:36 PM Paul Farrell <pf...@funnelback.com>
wrote:

> Hi,
>
> Having had to go back to basics and re-install my Alfresco instance, I can
> confirm that the AMP file for the alfresco indexer web scripts *does*
> actually install without error. There must have been an issue with my
> previous Alfresco instance.
>
> Having said that, the Alfresco WebScript connector fails. The failure is
> down to the ‘Context’ setting (see below):
>
>
> When you attempt to save the configuration of the WebScript connector,
> Manifold clearly tries to check the connection. It seems to do this by
> making an API call (/auth/resolve/admin). The issue is with what Manifold
> prepends to the start of that path.
> If I leave the setting as above then Manifold reports   :
>
> <tr><td>The Web Script <a
> href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a>
> has responded with a status of 404 - Not Found.</td></tr>
>
> In other words, it builds the full path as
> “/alfresco/service/api/node/auth/resolve/admin”.
>
> For my Alfresco Community 5.0 instance, I get to that same web script via
> the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.
>
> Somewhere, Manifold is assuming that the ‘/api/node’ is a correct path
> inclusion. In other words, there is nothing I can put into that box to
> prevent it.
>
> Paul
>
> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
>
> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I
> feel certain he'd want to know.
>
> Karl
>
>
> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pf...@funnelback.com>
> wrote:
>
>> Hi guys,
>>
>> Just to let you know what’s going on - for informational purposes more
>> than anything.
>>
>> I initially tried taking the AMP file provided in the MCF plugins
>> directory (0.7.0) and tried to install it into Alfresco but got a message
>> saying a file was missing.
>>
>> Instead, I cloned the repository on GitHub for the alfresco-indexer
>> project and then built it on my local machine. This generated the AMP file
>> (0.7.2).
>>
>> I was able to successfully install the AMP file onto my Alfresco
>> instance.
>>
>> As it happens I now cannot log into Alfresco Share ('bad credentials or
>> server not available' message) but that is something I can work on.
>> Apparently the installation of some AMP files has been known to cause this
>> issue.
>>
>> So, progress to a point!
>>
>> *Paul Farrell*
>>
>> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
>>
>> Hi,
>>
>> At the Alfresco side, hope this helps:
>>
>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>>
>> Cheers
>>
>>
>>
>>
>>
>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com> wrote:
>>
>>> The AMP file is actually shipped as part of the binary MCF
>>> distribution.  You can find it under "plugins".
>>>
>>> Karl
>>>
>>>
>>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pf...@funnelback.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Hopefully this will be my only request for information today.
>>>> I’m afraid this is a bit of a newbie question but I have managed to get
>>>> the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The only
>>>> bit I am missing now is to install the AMP file in Alfresco.
>>>>
>>>> I realise that this is slightly outside of the Manifold remit but I
>>>> wondered if anyone can advise how I build the AMP file from the URL (
>>>> https://github.com/maoo/alfresco-indexer)? I have cloned the
>>>> repository to my local drive but, having never worked with Maven, am at a
>>>> loss as to how to generate the AMP file that I then need to install into
>>>> Alfresco.
>>>>
>>>> Many thanks,
>>>>
>>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>>>>
>>>> The only way you can have such a reduced list of connectors is if
>>>> somebody commented out many connectors in your connectors.xml, or removed
>>>> them from the database table where they are registered by hand.
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pfarrell@funnelback.com
>>>> > wrote:
>>>>
>>>>> After a good deal of time clicking around I came to the same
>>>>> conclusion - that there is no way of telling from the UI!!
>>>>>
>>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>>>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>>>
>>>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>>>>
>>>>> <repositoryconnector name="Alfresco Webscript"
>>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>>>
>>>>> You can imagine my excitement!
>>>>>
>>>>> The only thing I am missing is the option in the UI. When I click to
>>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>>>>
>>>>> Perhaps I am hoping for too much to hope that I can make a simple
>>>>> change to enable this repo connection?
>>>>>
>>>>> Thanks for all the help everyone
>>>>>
>>>>>
>>>>>
>>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
>>>>>
>>>>> Hah; there's not a way to inquire in the UI, if that's what you mean.
>>>>> But if you see "Alfresco webscript" in the list of repository connection
>>>>> types, you've got a version that supports that connector.
>>>>>
>>>>> Thanks,
>>>>> Karl
>>>>>
>>>>>
>>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <
>>>>> pfarrell@funnelback.com> wrote:
>>>>>
>>>>>> Thanks Rafa.
>>>>>>
>>>>>> As an aside, is there an easy way to identify which version of
>>>>>> ManifoldCF you are on?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> *Paul Farrell*
>>>>>>
>>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>>>>>>
>>>>>> Hi Paul,
>>>>>>
>>>>>> All you need to do is to install this webscript
>>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>>>>> instance. The connector itself is already part of the most recent versions
>>>>>> of ManifoldCF
>>>>>>
>>>>>> Cheers,
>>>>>> Rafa
>>>>>>
>>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <
>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>
>>>>>>> Ok, thanks again guys.
>>>>>>>
>>>>>>> The Webscript connector it is.
>>>>>>>
>>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>>>>>> guidelines on how to get this Webscript connector installed?  I see there
>>>>>>> is a GitHub page here (
>>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector)
>>>>>>> which discusses it (although it directs you to a repository of files).
>>>>>>>
>>>>>>> I am just keen to make sure that any steps I follow to try and get
>>>>>>> this Webscript connector installed and working are updated, reliable steps.
>>>>>>> I would hate to waste time with out of date information.
>>>>>>>
>>>>>>> Thanks all
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl
>>>>>>> mentioned. Web Services is very slow compared to other services, and
>>>>>>> I've also checked that Alfresco CMIS Web Services does not return a
>>>>>>> change token (maybe there is something that I don't know).
>>>>>>>
>>>>>>> By the way, the current version of the CMIS connector is not aware of
>>>>>>> the change token. I would write a patch for you if Alfresco supported
>>>>>>> the change token property.
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Muhammed
>>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Paul,
>>>>>>>>
>>>>>>>> The Alfresco Webscript connector is a wholly different connector
>>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>>>>>>>>
>>>>>>>> Hope that helps.
>>>>>>>>
>>>>>>>> Karl
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Muhammed/Karl,
>>>>>>>>>
>>>>>>>>> Firstly, thank you so much for taking the time to reply. It is
>>>>>>>>> very much appreciated.
>>>>>>>>>
>>>>>>>>> Currently I am using the AtomPub for my CMIS repository
>>>>>>>>> connection. I have just read something which may shed a little light on
>>>>>>>>> this. The post read that change tokens are not passed via AtomPub
>>>>>>>>> connections (
>>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>>>>>>>> change in Alfresco.
>>>>>>>>>
>>>>>>>>> It looks like I have two possible options left open to me (correct
>>>>>>>>> me if I’m wrong):
>>>>>>>>>
>>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>>>>>>> connection mechanism
>>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>>>>>>>> above?)
>>>>>>>>>
>>>>>>>>> Thanks again,
>>>>>>>>>
>>>>>>>>> Paul
>>>>>>>>>
>>>>>>>>> *Paul Farrell*
>>>>>>>>>
>>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Paul,
>>>>>>>>>
>>>>>>>>> Repositories should tell ManifoldCF when their content is updated.
>>>>>>>>> The current CMIS connector reindexes a document only when its latest
>>>>>>>>> version has changed, not when the document has merely been updated.
>>>>>>>>>
>>>>>>>>> The CMIS specification defines a change token property that should
>>>>>>>>> change whenever a document is updated, so that ManifoldCF can tell the
>>>>>>>>> document has been updated, but implementing the change token property
>>>>>>>>> is optional. I've checked Alfresco's CMIS web site and seen that they
>>>>>>>>> do not set the change token.
>>>>>>>>>
>>>>>>>>> I think there is nothing we can do at this point.
>>>>>>>>>
>>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Paul,
>>>>>>>>>>
>>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>>>>>>> document version string the connector constructs should be adequate to
>>>>>>>>>> detect all changes.  Can you create a ticket?
>>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>>>>>>>>> and forth before I can determine that for sure.
>>>>>>>>>>
>>>>>>>>>> In the meantime, have you considered using the Alfresco Webscript
>>>>>>>>>> connector?  It's the preferred way to do Alfresco indexing, although there
>>>>>>>>>> have been issues reported having to do with running it on some
>>>>>>>>>> configurations of Alfresco.  I'm not entirely sure what the problem is
>>>>>>>>>> there; maybe a version dependency of some kind.
>>>>>>>>>>
>>>>>>>>>> Karl
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>
>>>>>>>>>>> Hoping someone may be able to advise.
>>>>>>>>>>>
>>>>>>>>>>> I am currently using Manifold, together with a CMIS connector,
>>>>>>>>>>> to retrieve and index content from an Alfresco repository.
>>>>>>>>>>>
>>>>>>>>>>> All is going well apart from, what I would call, the
>>>>>>>>>>> ‘incremental crawl’.
>>>>>>>>>>>
>>>>>>>>>>> The main issue I am having is that the modification of a
>>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in the next
>>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
>>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
>>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
>>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
>>>>>>>>>>> see the document in the local search engine.
>>>>>>>>>>>
>>>>>>>>>>> It is as if Manifold is not treating the security update as a
>>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>>>>>>>>> whatever internal record it has for this item.
>>>>>>>>>>>
>>>>>>>>>>> Any ideas?
>>>>>>>>>>>
>>>>>>>>>>> Many thanks.

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi,

Having had to go back to basics and re-install my Alfresco instance, I can confirm that the AMP file for the alfresco indexer web scripts does actually install without error. There must have been an issue with my previous Alfresco instance. 

Having said that, the Alfresco WebScript connector fails. The failure is down to the ‘Context’ setting (see below):



When you attempt to save the configuration of the WebScript connector, Manifold clearly tries to check the connection. It seems to do this by making an API call (/auth/resolve/admin). The issue is with what Manifold prepends to the start of that path. 
If I leave the setting as above, Manifold reports:

<tr><td>The Web Script <a href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a> has responded with a status of 404 - Not Found.</td></tr>

In other words, it builds the full path as “/alfresco/service/api/node/auth/resolve/admin”.

For my Alfresco Community 5.0 instance, I get to that same web script via the URL “/alfresco/service/auth/resolve/admin” i.e. without the ‘/api/node’.

Somewhere, Manifold is assuming that ‘/api/node’ belongs in the path, so there is nothing I can put into that box to prevent it. 
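
To make the comparison reproducible outside the Manifold UI, here is a minimal Python sketch that builds both URLs and reports the HTTP status of each. The host and port (http://localhost:8080) and the helper names build_url and probe are hypothetical; only the two context paths and the /auth/resolve/admin web script path come from the description above. This is not how Manifold itself assembles the URL, just a way to check which path the Alfresco instance answers on.

```python
# Sketch only: compare the path Manifold reported (404) with the path that
# works on this Alfresco Community 5.0 instance.
# ASSUMPTIONS: the host/port and the helper names are hypothetical.
import urllib.error
import urllib.request

SCRIPT_PATH = "/auth/resolve/admin"  # web script named in the message above


def build_url(base, context, script_path=SCRIPT_PATH):
    """Join base URL, context prefix and web script path with single slashes."""
    parts = [base.rstrip("/"), context.strip("/"), script_path.strip("/")]
    return "/".join(p for p in parts if p)


def probe(url, timeout=10):
    """Return the HTTP status code for a GET on url (for example 200, 401 or 404)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the status code is what we want.
        return err.code


CANDIDATES = {
    "reported by Manifold": "/alfresco/service/api/node",
    "works in the browser": "/alfresco/service",
}

if __name__ == "__main__":
    base = "http://localhost:8080"  # hypothetical Alfresco host
    for label, context in CANDIDATES.items():
        url = build_url(base, context)
        print(label, url, probe(url))
```

An unauthenticated request may well come back as 401 rather than 200; the point is only whether each path resolves to something other than 404. If the host is unreachable, urlopen raises URLError, which this sketch deliberately does not catch.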

Paul

> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
> 
> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I feel certain he'd want to know.
> 
> Karl
> 
> 
> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
> Hi guys,
> 
> Just to let you know what’s going on - for informational purposes more than anything.
> 
> I initially tried taking the AMP file provided in the MCF plugins directory (0.7.0) and tried to install it into Alfresco but got a message saying a file was missing.
> 
> Instead, I cloned the repository on GitHub for the alfresco-indexer project and then built it on my local machine. This generated the AMP file (0.7.2). 
> 
> I was able to successfully install the AMP file onto my Alfresco instance. 
> 
> As it happens I now cannot log into Alfresco Share ('bad credentials or server not available' message) but that is something I can work on. Apparently the installation of some AMP files has been known to cause this issue. 
> 
> So, progress to a point!
> 
> Paul Farrell
> 
>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com <ma...@gmail.com>> wrote:
>> 
>> Hi, 
>> 
>> At the Alfresco side, hope this helps:
>> 
>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>> 
>> Cheers
>> 
>> 
>> 
>> 
>> 
>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>> 
>> The AMP file is actually shipped as part of the binary MCF distribution.  You can find it under "plugins".
>> 
>> Karl
>> 
>> 
>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>> Hi all,
>> 
>> Hopefully this will be my only request for information today. 
>> I’m afraid this is a bit of a newbie question but I have managed to get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The only bit I am missing now is to install the AMP file in Alfresco. 
>> 
>> I realise that this is slightly outside of the Manifold remit but I wondered if anyone can advise how I build the AMP file from the URL (https://github.com/maoo/alfresco-indexer)? I have cloned the repository to my local drive but, having never worked with Maven, am at a loss as to how to generate the AMP file that I then need to install into Alfresco. 
>> 
>> Many thanks,
>> 
>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>> 
>>> The only way you can have such a reduced list of connectors is if somebody commented out many connectors in your connectors.xml, or removed them from the database table where they are registered by hand.
>>> 
>>> Karl
>>> 
>>> 
>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>> After a good deal of time clicking around I came to the same conclusion - that there is no way of telling from the UI!!
>>> 
>>> Having dug a bit deeper I believe I may actually have the Alfresco WebScript connectors installed. At least the 0.7.0 version. I notice in the ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>> 
>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>> 
>>> <repositoryconnector name="Alfresco Webscript" class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>> 
>>> You can imagine my excitement!
>>> 
>>> The only thing I am missing is the option in the UI. When I click to create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive, HDFS, Jira, Meridio, RSS, Sharepoint. 
>>> 
>>> Perhaps I am hoping for too much to hope that I can make a simple change to enable this repo connection?
>>> 
>>> Thanks for all the help everyone 
>>> 
>>> 
>>> 
>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>>> 
>>>> Hah; there's not a way to inquire in the UI, if that's what you mean.  But if you see "Alfresco webscript" in the list of repository connection types, you've got a version that supports that connector.
>>>> 
>>>> Thanks,
>>>> Karl
>>>> 
>>>> 
>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>> Thanks Rafa.
>>>> 
>>>> As an aside, is there an easy way to identify which version of ManifoldCF you are on?
>>>> 
>>>> Cheers
>>>> 
>>>> Paul Farrell
>>>> 
>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org <ma...@apache.org>> wrote:
>>>>> 
>>>>> Hi Paul, 
>>>>> 
>>>>> All you need to do is to install this webscript <https://github.com/maoo/alfresco-indexer> within your Alfresco instance. The connector itself is already part of the most recent versions of ManifoldCF
>>>>> 
>>>>> Cheers,
>>>>> Rafa
>>>>> 
>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>>> Ok, thanks again guys. 
>>>>> 
>>>>> The Webscript connector it is. 
>>>>> 
>>>>> I realise I am asking a lot here but are there any easy-to-follow guidelines on how to get this Webscript connector installed?  I see there is a GitHub page here (https://github.com/maoo/alfresco-webscript-manifold-connector) which discusses it (although it directs you to a repository of files). 
>>>>> 
>>>>> I am just keen to make sure that any steps I follow to try and get this Webscript connector installed and working are updated, reliable steps. I would hate to waste time with out of date information. 
>>>>> 
>>>>> Thanks all
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com <ma...@gmail.com>> wrote:
>>>>>> 
>>>>>> Hi Paul,
>>>>>> 
>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl mentioned. Web Services is very slow compared to other services, and I've also checked that Alfresco CMIS Web Services does not return a change token (maybe there is something that I don't know). 
>>>>>> 
>>>>>> By the way, the current version of the CMIS connector is not aware of the change token. I would write a patch for you if Alfresco supported the change token property.
>>>>>> 
>>>>>> Thanks!
>>>>>> Muhammed 
>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>>>>> Hi Paul,
>>>>>> 
>>>>>> The Alfresco Webscript connector is a wholly different connector that has no relation to the CMIS connector.  It requires an Alfresco webscript plugin be installed on your Alfresco server to work, though.
>>>>>> 
>>>>>> Hope that helps.
>>>>>> 
>>>>>> Karl
>>>>>> 
>>>>>> 
>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>>>> Hi Muhammed/Karl,
>>>>>> 
>>>>>> Firstly, thank you so much for taking the time to reply. It is very much appreciated. 
>>>>>> 
>>>>>> Currently I am using the AtomPub for my CMIS repository connection. I have just read something which may shed a little light on this. The post read that change tokens are not passed via AtomPub connections (https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758). If true, this would explain why ManifoldCF may be unable to determine a change in Alfresco.
>>>>>> 
>>>>>> It looks like I have two possible options left open to me (correct me if I’m wrong):
>>>>>> 
>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the connection mechanism
>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’ connector?  (or is this the same as the ‘Web Services’ connection mentioned above?)
>>>>>> 
>>>>>> Thanks again,
>>>>>> 
>>>>>> Paul
>>>>>> 
>>>>>> Paul Farrell
>>>>>> 
>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> Repositories should tell ManifoldCF when their content is updated. The current CMIS connector reindexes a document only if the latest version of the document has changed, not when it is merely updated.
>>>>>>> 
>>>>>>> There is a change token property in the CMIS specification, and it should change when a document is updated so that ManifoldCF can tell that the document was updated, but implementing the change token property is optional.  I've checked Alfresco's CMIS web site and seen that they don't set the change token.
>>>>>>> 
>>>>>>> I think, there is nothing we can do at this point.
>>>>>>> 
>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> This looks like a bug in the CMIS connector to me; usually the document version string the connector constructs should be adequate to detect all changes.  Can you create a ticket at https://issues.apache.org/jira, project ManifoldCF?  Please include what version of MCF you are using.  FWIW, this may in fact be a bug in the Alfresco CMIS implementation, but we'll have to have some back and forth before I can determine that for sure.
>>>>>>> 
>>>>>>> In the meantime, have you considered using the Alfresco Webscript connector?  It's the preferred way to do Alfresco indexing, although there have been issues reported having to do with running it on some configurations of Alfresco.  I'm not entirely sure what the problem is there; maybe a version dependency of some kind.
>>>>>>> 
>>>>>>> Karl
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>>>> Hi Everyone,
>>>>>>> 
>>>>>>> Hoping someone may be able to advise.
>>>>>>> 
>>>>>>> I am currently using Manifold, together with a CMIS connector, to retrieve and index content from an Alfresco repository.
>>>>>>> 
>>>>>>> All is going well apart from, what I would call, the ‘incremental crawl’.
>>>>>>> 
>>>>>>> The main issue I am having is that the modification of a document’s security settings, in Alfresco, is not being picked up in next Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A and B as Consumers. I run a crawl in Manifold and it picks up the documents fine.  The security is set as expected. I then remove ‘User A’ from the security of that document and re-run the Manifold crawl. User A can still see the document in the local search engine.
>>>>>>> 
>>>>>>> It is as if Manifold is not treating the security update as a ‘modification’ and is therefore not refreshing it. Note that if I go into the Output Connections, edit and save the relevant output connection and then click ‘Remove all associated documents’, the next time I crawl, the changes are picked up. It is clear that Manifold is just not updating whatever internal record it has for this item.
>>>>>>> 
>>>>>>> Any ideas?
>>>>>>> 
>>>>>>> Many thanks.
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> 
> 
> 
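
The exchange above turns on how a connector decides that a document has changed. As a minimal sketch of that idea (not the CMIS connector's actual code), the Python below builds a version string per document and compares it with the one stored at the previous crawl. The function names, the fields that go into the string and the sample values are all hypothetical, and it assumes the repository reports the same modification date and version label after a permissions-only change.

```python
import hashlib


def version_string(last_modified, version_label, acl=None):
    """Build an opaque version string for one document (hypothetical helper).

    When acl is None the permissions are left out of the string, so an
    ACL-only change yields exactly the same value as before.
    """
    parts = [last_modified, version_label]
    if acl is not None:
        # Sort so that a mere reordering of entries is not seen as a change.
        parts.extend(sorted(acl))
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def needs_reindex(stored_version, current_version):
    """A document is re-fetched only when its version string differs."""
    return stored_version != current_version


# Hypothetical sample data: User A is removed between the two crawls,
# while the modification date and version label stay the same.
first_crawl = {
    "last_modified": "2015-10-19T07:00:00Z",
    "version_label": "1.0",
    "acl": ["userA:Consumer", "userB:Consumer"],
}
second_crawl = dict(first_crawl, acl=["userB:Consumer"])

# Version string built without the ACL: the security change goes unnoticed.
changed_without_acl = needs_reindex(
    version_string(first_crawl["last_modified"], first_crawl["version_label"]),
    version_string(second_crawl["last_modified"], second_crawl["version_label"]),
)

# Version string that includes the ACL: the same change triggers a reindex.
changed_with_acl = needs_reindex(
    version_string(**first_crawl),
    version_string(**second_crawl),
)

print(changed_without_acl, changed_with_acl)
```

If the version string leaves out the permissions, the second crawl sees nothing new, which matches the behaviour Paul describes; including them is one way such a change could be detected.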


Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi Karl,

I will try to verify which file it was later today. I know it was a .properties file that it reported.

Right now, the application of the built AMP file has rendered Alfresco inaccessible so I am having to try and debug and diagnose that. Kind of wish I hadn’t started on this now! :)

Paul Farrell
Senior Search Consultant
 

> On 20 Oct 2015, at 12:56, Karl Wright <da...@gmail.com> wrote:
> 
> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I feel certain he'd want to know.
> 
> Karl
> 
> 
> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
> Hi guys,
> 
> Just to let you know what’s going on - for informational purposes more than anything.
> 
> I initially tried taking the AMP file provided in the MCF plugins directory (0.7.0) and tried to install it into Alfresco but got a message saying a file was missing.
> 
> Instead, I cloned the repository on GitHub for the alfresco-indexer project and then built it on my local machine. This generated the AMP file (0.7.2). 
> 
> I was able to successfully install the AMP file onto my Alfresco instance. 
> 
> As it happens I now cannot log into Alfresco Share ('bad credentials or server not available' message) but that is something I can work on. Apparently the installation of some AMP files has been known to cause this issue. 
> 
> So, progress to a point!
> 
> Paul Farrell
> Senior Search Consultant
>  
> 
>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com> wrote:
>> 
>> Hi, 
>> 
>> At the Alfresco side, hope this helps:
>> 
>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>> 
>> Cheers
>> 
>> 
>> 
>> 
>> 
>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>> 
>> The AMP file is actually shipped as part of the binary MCF distribution.  You can find it under "plugins".
>> 
>> Karl
>> 
>> 
>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>> Hi all,
>> 
>> Hopefully this will be my only request for information today. 
>> I’m afraid this is a bit of a newbie question but I have managed to get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The only bit I am missing now is to install the AMP file in Alfresco. 
>> 
>> I realise that this is slightly outside of the Manifold remit but I wondered if anyone can advise how I build the AMP file from the URL (https://github.com/maoo/alfresco-indexer)? I have cloned the repository to my local drive but, having never worked with Maven, am at a loss as to how to generate the AMP file that I then need to install into Alfresco. 
>> 
>> Many thanks,
>> 
>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com> wrote:
>>> 
>>> The only way you can have such a reduced list of connectors is if somebody commented out many connectors in your connectors.xml, or removed them from the database table where they are registered by hand.
>>> 
>>> Karl
>>> 
>>> 
>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>> After a good deal of time clicking around I came to the same conclusion - that there is no way of telling from the UI!!
>>> 
>>> Having dug a bit deeper I believe I may actually have the Alfresco WebScript connector installed, or at least the 0.7.0 version. I notice in the ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>> 
>>> Looking in the ‘connectors.xml’ file I can also see the line:
>>> 
>>> <repositoryconnector name="Alfresco Webscript" class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>> 
>>> You can imagine my excitement!
>>> 
>>> The only thing I am missing is the option in the UI. When I click to create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive, HDFS, Jira, Meridio, RSS, Sharepoint. 
>>> 
>>> Perhaps I am hoping for too much to hope that I can make a simple change to enable this repo connection?
>>> 
>>> Thanks for all the help everyone 
>>> 
>>> 
>>> 
>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com> wrote:
>>>> 
>>>> Hah; there's not a way to inquire in the UI, if that's what you mean.  But if you see "Alfresco webscript" in the list of repository connection types, you've got a version that supports that connector.
>>>> 
>>>> Thanks,
>>>> Karl
>>>> 
>>>> 
>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>> Thanks Rafa.
>>>> 
>>>> As an aside, is there an easy way to identify which version of ManifoldCF you are on?
>>>> 
>>>> Cheers
>>>> 
>>>> Paul Farrell
>>>> Senior Search Consultant
>>>>  
>>>> 
>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org> wrote:
>>>>> 
>>>>> Hi Paul, 
>>>>> 
>>>>> All you need to do is to install this webscript <https://github.com/maoo/alfresco-indexer> within your Alfresco instance. The connector itself is already part of the most recent versions of ManifoldCF
>>>>> 
>>>>> Cheers,
>>>>> Rafa
>>>>> 
>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>> Ok, thanks again guys. 
>>>>> 
>>>>> The Webscript connector it is. 
>>>>> 
>>>>> I realise I am asking a lot here but are there any easy-to-follow guidelines on how to get this Webscript connector installed?  I see there is a GitHub page here (https://github.com/maoo/alfresco-webscript-manifold-connector) which discusses it (although it directs you to a repository of files). 
>>>>> 
>>>>> I am just keen to make sure that any steps I follow to try and get this Webscript connector installed and working are up-to-date, reliable steps. I would hate to waste time with out-of-date information. 
>>>>> 
>>>>> Thanks all
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com> wrote:
>>>>>> 
>>>>>> Hi Paul,
>>>>>> 
>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl mentioned. Web services are slow compared to other services, and I've also checked that Alfresco's CMIS web services do not return a change token (maybe there is something that I don't know). 
>>>>>> 
>>>>>> By the way, the current version of the CMIS connector is not aware of the change token. I would write a patch for you if Alfresco supported the change token property.
>>>>>> 
>>>>>> Thanks!
>>>>>> Muhammed 
>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com> wrote:
>>>>>> Hi Paul,
>>>>>> 
>>>>>> The Alfresco Webscript connector is a wholly different connector that has no relation to the CMIS connector.  It requires an Alfresco webscript plugin be installed on your Alfresco server to work, though.
>>>>>> 
>>>>>> Hope that helps.
>>>>>> 
>>>>>> Karl
>>>>>> 
>>>>>> 
>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>>> Hi Muhammed/Karl,
>>>>>> 
>>>>>> Firstly, thank-you so much for taking the time to reply. It is very much appreciated. 
>>>>>> 
>>>>>> Currently I am using AtomPub for my CMIS repository connection. I have just read something which may shed a little light on this. The post says that change tokens are not passed via AtomPub connections (https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758). If true, this would explain why ManifoldCF may be unable to detect a change in Alfresco.
>>>>>> 
>>>>>> It looks like I have two possible options left open to me (correct me if I’m wrong):
>>>>>> 
>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the connection mechanism
>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’ connector?  (or is this the same as the ‘Web Services’ connection mentioned above?)
>>>>>> 
>>>>>> Thanks again,
>>>>>> 
>>>>>> Paul
>>>>>> 
>>>>>> Paul Farrell
>>>>>> Senior Search Consultant
>>>>>>  
>>>>>> 
>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> Repositories should tell ManifoldCF when their content is updated. The current CMIS connector reindexes a document only if the latest version of the document has changed, not when it is merely updated.
>>>>>>> 
>>>>>>> There is a change token property in the CMIS specification, and it should change when a document is updated so that ManifoldCF can tell that the document was updated, but implementing the change token property is optional.  I've checked Alfresco's CMIS web site and seen that they don't set the change token.
>>>>>>> 
>>>>>>> I think, there is nothing we can do at this point.
>>>>>>> 
>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> This looks like a bug in the CMIS connector to me; usually the document version string the connector constructs should be adequate to detect all changes.  Can you create a ticket at https://issues.apache.org/jira, project ManifoldCF?  Please include what version of MCF you are using.  FWIW, this may in fact be a bug in the Alfresco CMIS implementation, but we'll have to have some back and forth before I can determine that for sure.
>>>>>>> 
>>>>>>> In the meantime, have you considered using the Alfresco Webscript connector?  It's the preferred way to do Alfresco indexing, although there have been issues reported having to do with running it on some configurations of Alfresco.  I'm not entirely sure what the problem is there; maybe a version dependency of some kind.
>>>>>>> 
>>>>>>> Karl
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>>>> Hi Everyone,
>>>>>>> 
>>>>>>> Hoping someone may be able to advise.
>>>>>>> 
>>>>>>> I am currently using Manifold, together with a CMIS connector, to retrieve and index content from an Alfresco repository.
>>>>>>> 
>>>>>>> All is going well apart from, what I would call, the ‘incremental crawl’.
>>>>>>> 
>>>>>>> The main issue I am having is that the modification of a document’s security settings, in Alfresco, is not being picked up in next Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A and B as Consumers. I run a crawl in Manifold and it picks up the documents fine.  The security is set as expected. I then remove ‘User A’ from the security of that document and re-run the Manifold crawl. User A can still see the document in the local search engine.
>>>>>>> 
>>>>>>> It is as if Manifold is not treating the security update as a ‘modification’ and is therefore not refreshing it. Note that if I go into the Output Connections, edit and save the relevant output connection and then click ‘Remove all associated documents’, the next time I crawl, the changes are picked up. It is clear that Manifold is just not updating whatever internal record it has for this item.
>>>>>>> 
>>>>>>> Any ideas?
>>>>>>> 
>>>>>>> Many thanks.
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> 
> 
> 
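
Karl's point about a reduced connector list can be checked from the file side. As a rough sketch, the Python below lists the repository connectors that are active in a connectors.xml-style file; entries wrapped in XML comments are skipped by the parser, so a commented-out connector does not appear. The sample document, its `connectors` root element and the "Example Connector" entry are assumptions made for illustration; only the Alfresco Webscript line comes from the thread. The sketch does not look at the database table where connectors are registered.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of connectors.xml. The Alfresco Webscript
# entry is commented out here, the situation Karl suggests may have occurred.
SAMPLE = """<connectors>
  <repositoryconnector name="Example Connector" class="example.HypotheticalConnector"/>
  <!-- <repositoryconnector name="Alfresco Webscript" class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/> -->
</connectors>
"""


def active_repository_connectors(xml_text):
    """Return the names of repositoryconnector elements that are not commented out."""
    root = ET.fromstring(xml_text)
    # The default parser discards comments, so commented-out entries never
    # show up as elements.
    return [elem.get("name") for elem in root.iter("repositoryconnector")]


names = active_repository_connectors(SAMPLE)
print(names)
print("Alfresco Webscript" in names)
```

Running the same function against the real file (read with `open(...).read()`) would show whether the Alfresco Webscript entry is present as a live element or only inside a comment.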


Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I feel
certain he'd want to know.

Karl


On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pf...@funnelback.com>
wrote:

> Hi guys,
>
> Just to let you know what’s going on - for informational purposes more
> than anything.
>
> I initially tried taking the AMP file provided in the MCF plugins
> directory (0.7.0) and tried to install it into Alfresco but got a message
> saying a file was missing.
>
> Instead, I cloned the repository on GitHub for the alfresco-indexer
> project and then built it on my local machine. This generated the AMP file
> (0.7.2).
>
> I was able to successfully install the AMP file onto my Alfresco instance.
>
> As it happens I now cannot log into Alfresco Share ('bad credentials or
> server not available' message) but that is something I can work on.
> Apparently the installation of some AMP files has been known to cause this
> issue.
>
> So, progress to a point!
>
> *Paul Farrell*
> Senior Search Consultant
>
>
> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
>
> Hi,
>
> At the Alfresco side, hope this helps:
>
> http://docs.alfresco.com/4.1/tasks/amp-install.html
>
> Cheers
>
>
>
>
>
> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com> wrote:
>
>> The AMP file is actually shipped as part of the binary MCF distribution.
>> You can find it under "plugins".
>>
>> Karl
>>
>>
>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pf...@funnelback.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> Hopefully this will be my only request for information today.
>>> I’m afraid this is a bit of a newbie question but I have managed to get
>>> the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The only
>>> bit I am missing now is to install the AMP file in Alfresco.
>>>
>>> I realise that this is slightly outside of the Manifold remit but I
>>> wondered if anyone can advise how I build the AMP file from the URL (
>>> https://github.com/maoo/alfresco-indexer)? I have cloned the repository
>>> to my local drive but, having never worked with Maven, am at a loss as to how
>>> to generate the AMP file that I then need to install into Alfresco.
>>>
>>> Many thanks,
>>>
>>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>>>
>>> The only way you can have such a reduced list of connectors is if
>>> somebody commented out many connectors in your connectors.xml, or removed
>>> them from the database table where they are registered by hand.
>>>
>>> Karl
>>>
>>>
>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pf...@funnelback.com>
>>> wrote:
>>>
>>>> After a good deal of time clicking around I came to the same conclusion
>>>> - that there is no way of telling from the UI!!
>>>>
>>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>>> WebScript connector installed, or at least the 0.7.0 version. I notice in
>>>> the ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>>
>>>> Looking in the ‘connectors.xml’ file I can also see the line:
>>>>
>>>> <repositoryconnector name="Alfresco Webscript"
>>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>>
>>>> You can imagine my excitement!
>>>>
>>>> The only thing I am missing is the option in the UI. When I click to
>>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>>>
>>>> Perhaps I am hoping for too much to hope that I can make a simple
>>>> change to enable this repo connection?
>>>>
>>>> Thanks for all the help everyone
>>>>
>>>>
>>>>
>>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
>>>>
>>>> Hah; there's not a way to inquire in the UI, if that's what you mean.
>>>> But if you see "Alfresco webscript" in the list of repository connection
>>>> types, you've got a version that supports that connector.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pfarrell@funnelback.com
>>>> > wrote:
>>>>
>>>>> Thanks Rafa.
>>>>>
>>>>> As an aside, is there an easy way to identify which version of
>>>>> ManifoldCF you are on?
>>>>>
>>>>> Cheers
>>>>>
>>>>> *Paul Farrell*
>>>>> Senior Search Consultant
>>>>>
>>>>>
>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>>>>>
>>>>> Hi Paul,
>>>>>
>>>>> All you need to do is to install this webscript
>>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>>>> instance. The connector itself is already part of the most recent versions
>>>>> of ManifoldCF
>>>>>
>>>>> Cheers,
>>>>> Rafa
>>>>>
>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pfarrell@funnelback.com
>>>>> > wrote:
>>>>>
>>>>>> Ok, thanks again guys.
>>>>>>
>>>>>> The Webscript connector it is.
>>>>>>
>>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>>>>> guidelines on how to get this Webscript connector installed?  I see there
>>>>>> is a GitHub page here (
>>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector) which
>>>>>> discusses it (although it directs you to a repository of files).
>>>>>>
>>>>>> I am just keen to make sure that any steps I follow to try and get
>>>>>> this Webscript connector installed and working are updated, reliable steps.
>>>>>> I would hate to waste time with out of date information.
>>>>>>
>>>>>> Thanks all
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Paul,
>>>>>>
>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl
>>>>>> mentioned. Web services are slow compared to other services, and I've
>>>>>> also checked that Alfresco's CMIS web services do not return a change
>>>>>> token (maybe there is something that I don't know).
>>>>>>
>>>>>> By the way, the current version of the CMIS connector is not aware of
>>>>>> the change token. I would write a patch for you if Alfresco supported
>>>>>> the change token property.
>>>>>>
>>>>>> Thanks!
>>>>>> Muhammed
>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <da...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> The Alfresco Webscript connector is a wholly different connector
>>>>>>> that has no relation to the CMIS connector.  It requires an Alfresco
>>>>>>> webscript plugin be installed on your Alfresco server to work, though.
>>>>>>>
>>>>>>> Hope that helps.
>>>>>>>
>>>>>>> Karl
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>
>>>>>>>> Hi Muhammed/Karl,
>>>>>>>>
>>>>>>>> Firstly, thank-you so much for taking the time to reply. It is very
>>>>>>>> much appreciated.
>>>>>>>>
>>>>>>>> Currently I am using the AtomPub for my CMIS repository connection.
>>>>>>>> I have just read something which may shed a little light on this. The post
>>>>>>>> read that change tokens are not passed via AtomPub connections (
>>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>>>>>>> change in Alfresco.
>>>>>>>>
>>>>>>>> It looks like I have two possible options left open to me (correct
>>>>>>>> me if I’m wrong):
>>>>>>>>
>>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>>>>>> connection mechanism
>>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>>>>>>> above?)
>>>>>>>>
>>>>>>>> Thanks again,
>>>>>>>>
>>>>>>>> Paul
>>>>>>>>
>>>>>>>> *Paul Farrell*
>>>>>>>> Senior Search Consultant
>>>>>>>>
>>>>>>>>
>>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Paul,
>>>>>>>>
>>>>>>>> Repositories should tell ManifoldCF when their content is updated.
>>>>>>>> The current CMIS connector reindexes a document only if the latest
>>>>>>>> version of the document has changed, not when it is merely updated.
>>>>>>>>
>>>>>>>> There is a change token property in the CMIS specification, and it
>>>>>>>> should change when a document is updated so that ManifoldCF can tell
>>>>>>>> that the document was updated, but implementing the change token
>>>>>>>> property is optional. I've checked Alfresco's CMIS web site and seen
>>>>>>>> that they don't set the change token.
>>>>>>>>
>>>>>>>> I think, there is nothing we can do at this point.
>>>>>>>>
>>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Paul,
>>>>>>>>>
>>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>>>>>> document version string the connector constructs should be adequate to
>>>>>>>>> detect all changes.  Can you create a ticket?
>>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>>>>>>>> and forth before I can determine that for sure.
>>>>>>>>>
>>>>>>>>> In the meantime, have you considered using the Alfresco Webscript
>>>>>>>>> connector?  It's the preferred way to do Alfresco indexing, although there
>>>>>>>>> have been issues reported having to do with running it on some
>>>>>>>>> configurations of Alfresco.  I'm not entirely sure what the problem is
>>>>>>>>> there; maybe a version dependency of some kind.
>>>>>>>>>
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Everyone,
>>>>>>>>>>
>>>>>>>>>> Hoping someone may be able to advise.
>>>>>>>>>>
>>>>>>>>>> I am currently using Manifold, together with a CMIS connector, to
>>>>>>>>>> retrieve and index content from an Alfresco repository.
>>>>>>>>>>
>>>>>>>>>> All is going well apart from, what I would call, the ‘incremental
>>>>>>>>>> crawl’.
>>>>>>>>>>
>>>>>>>>>> The main issue I am having is that the modification of a
>>>>>>>>>> document’s security settings, in Alfresco, is not being picked up in next
>>>>>>>>>> Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A
>>>>>>>>>> and B as Consumers. I run a crawl in Manifold and it picks up the documents
>>>>>>>>>> fine.  The security is set as expected. I then remove ‘User A’ from the
>>>>>>>>>> security of that document and re-run the Manifold crawl. User A can still
>>>>>>>>>> see the document in the local search engine.
>>>>>>>>>>
>>>>>>>>>> It is as if Manifold is not treating the security update as a
>>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>>>>>>>> whatever internal record it has for this item.
>>>>>>>>>>
>>>>>>>>>> Any ideas?
>>>>>>>>>>
>>>>>>>>>> Many thanks.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
>

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi guys,

Just to let you know what’s going on - for informational purposes more than anything.

I initially tried taking the AMP file provided in the MCF plugins directory (0.7.0) and tried to install it into Alfresco but got a message saying a file was missing.

Instead, I cloned the repository on GitHub for the alfresco-indexer project and then built it on my local machine. This generated the AMP file (0.7.2). 

I was able to successfully install the AMP file onto my Alfresco instance. 

As it happens I now cannot log into Alfresco Share ('bad credentials or server not available' message) but that is something I can work on. Apparently the installation of some AMP files has been known to cause this issue. 

So, progress to a point!

Paul Farrell
Senior Search Consultant
 

> On 20 Oct 2015, at 12:36, Rafa Haro <rh...@gmail.com> wrote:
> 
> Hi, 
> 
> At the Alfresco side, hope this helps:
> 
> http://docs.alfresco.com/4.1/tasks/amp-install.html
> 
> Cheers
> 
> 
> 
> 
> 
> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
> 
> The AMP file is actually shipped as part of the binary MCF distribution.  You can find it under "plugins".
> 
> Karl
> 
> 
> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
> Hi all,
> 
> Hopefully this will be my only request for information today. 
> I’m afraid this is a bit of a newbie question but I have managed to get the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The only bit I am missing now is to install the AMP file in Alfresco. 
> 
> I realise that this is slightly outside of the Manifold remit but I wondered if anyone can advise how I build the AMP file from the URL (https://github.com/maoo/alfresco-indexer)? I have cloned the repository to my local drive but, having never worked with Maven, am at a loss as to how to generate the AMP file that I then need to install into Alfresco. 
> 
> Many thanks,
> 
>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>> 
>> The only way you can have such a reduced list of connectors is if somebody commented out many connectors in your connectors.xml, or removed them from the database table where they are registered by hand.
>> 
>> Karl
>> 
>> 
>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>> After a good deal of time clicking around I came to the same conclusion - that there is no way of telling from the UI!!
>> 
>> Having dug a bit deeper I believe I may actually have the Alfresco WebScript connectors installed. At least the 0.7.0 version. I notice in the ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>> 
>> Looking in the ‘connectors.xml’ file I can also see the line:
>> 
>> <repositoryconnector name="Alfresco Webscript" class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>> 
>> You can imagine my excitement!
>> 
>> The only thing I am missing is the option in the UI. When I click to create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive, HDFS, Jira, Meridio, RSS, Sharepoint. 
>> 
>> Perhaps I am hoping for too much to hope that I can make a simple change to enable this repo connection?
>> 
>> Thanks for all the help everyone 
>> 
>> 
>> 
>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>> 
>>> Hah; there's not a way to inquire in the UI, if that's what you mean.  But if you see "Alfresco webscript" in the list of repository connection types, you've got a version that supports that connector.
>>> 
>>> Thanks,
>>> Karl
>>> 
>>> 
>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>> Thanks Rafa.
>>> 
>>> As an aside, is there an easy way to identify which version of ManifoldCF you are on?
>>> 
>>> Cheers
>>> 
>>> Paul Farrell
>>> 
>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org <ma...@apache.org>> wrote:
>>>> 
>>>> Hi Paul, 
>>>> 
>>>> All you need to do is to install this webscript <https://github.com/maoo/alfresco-indexer> within your Alfresco instance. The connector itself is already part of the most recent versions of ManifoldCF
>>>> 
>>>> Cheers,
>>>> Rafa
>>>> 
>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>> Ok, thanks again guys. 
>>>> 
>>>> The Webscript connector it is. 
>>>> 
>>>> I realise I am asking a lot here but are there any easy-to-follow guidelines on how to get this Webscript connector installed?  I see there is a GitHub page here (https://github.com/maoo/alfresco-webscript-manifold-connector) which discusses it (although it directs you to a repository of files). 
>>>> 
>>>> I am just keen to make sure that any steps I follow to try and get this Webscript connector installed and working are up-to-date, reliable steps. I would hate to waste time with out-of-date information. 
>>>> 
>>>> Thanks all
>>>> 
>>>> 
>>>> 
>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com <ma...@gmail.com>> wrote:
>>>>> 
>>>>> Hi Paul,
>>>>> 
>>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned. Web services are slow compared to other services, and I've also checked that Alfresco's CMIS web services do not return a change token (though there may be something I don't know). 
>>>>> 
>>>>> By the way, the current version of the CMIS connector is not aware of the change token. I would write a patch for you if Alfresco supported the change token property.
>>>>> 
>>>>> Thanks!
>>>>> Muhammed 
>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>>>> Hi Paul,
>>>>> 
>>>>> The Alfresco Webscript connector is a wholly different connector that has no relation to the CMIS connector.  It requires an Alfresco webscript plugin be installed on your Alfresco server to work, though.
>>>>> 
>>>>> Hope that helps.
>>>>> 
>>>>> Karl
>>>>> 
>>>>> 
>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>>> Hi Muhammed/Karl,
>>>>> 
>>>>> Firstly, thank-you so much for taking the time to reply. It is very much appreciated. 
>>>>> 
>>>>> Currently I am using the AtomPub for my CMIS repository connection. I have just read something which may shed a little light on this. The post read that change tokens are not passed via AtomPub connections (https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758). If true, this would explain why ManifoldCF may be unable to determine a change in Alfresco.
>>>>> 
>>>>> It looks like I have two possible options left open to me (correct me if I’m wrong):
>>>>> 
>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the connection mechanism
>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’ connector?  (or is this the same as the ‘Web Services’ connection mentioned above?)
>>>>> 
>>>>> Thanks again,
>>>>> 
>>>>> Paul
>>>>> 
>>>>> Paul Farrell
>>>>> 
>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com <ma...@gmail.com>> wrote:
>>>>>> 
>>>>>> Hi Paul,
>>>>>> 
>>>>>> Repositories should tell ManifoldCF when their content is updated. The current CMIS connector reindexes a document only if the latest version of the document has changed, not when it is merely updated. 
>>>>>> 
>>>>>> There is a change token property in the CMIS specification, and it should change whenever a document is updated so that ManifoldCF can tell the document was updated, but implementing the change token property is optional.  I've checked Alfresco's CMIS web site and seen that they don't set the change token.
>>>>>> 
>>>>>> I think there is nothing we can do at this point.
>>>>>> 
>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com <ma...@gmail.com>> wrote:
>>>>>> Hi Paul,
>>>>>> 
>>>>>> This looks like a bug in the CMIS connector to me; usually the document version string the connector constructs should be adequate to detect all changes.  Can you create a ticket?  https://issues.apache.org/jira , project ManifoldCF.  Please include what version of MCF you are using here.  FWIW, this may be in fact a bug in the Alfresco CMIS implementation, but we'll have to have some back and forth before I can determine that for sure.
>>>>>> 
>>>>>> In the meantime, have you considered using the Alfresco Webscript connector?  It's the preferred way to do Alfresco indexing, although there have been issues reported having to do with running it on some configurations of Alfresco.  I'm not entirely sure what the problem is there; maybe a version dependency of some kind.
>>>>>> 
>>>>>> Karl
>>>>>> 
>>>>>> 
>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pfarrell@funnelback.com <ma...@funnelback.com>> wrote:
>>>>>> Hi Everyone,
>>>>>> 
>>>>>> Hoping someone may be able to advise.
>>>>>> 
>>>>>> I am currently using Manifold, together with a CMIS connector, to retrieve and index content from an Alfresco repository.
>>>>>> 
>>>>>> All is going well apart from, what I would call, the ‘incremental crawl’.
>>>>>> 
>>>>>> The main issue I am having is that the modification of a document’s security settings, in Alfresco, is not being picked up in next Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A and B as Consumers. I run a crawl in Manifold and it picks up the documents fine.  The security is set as expected. I then remove ‘User A’ from the security of that document and re-run the Manifold crawl. User A can still see the document in the local search engine.
>>>>>> 
>>>>>> It is as if Manifold is not treating the security update as a ‘modification’ and is therefore not refreshing it. Note that if I go into the Output Connections, edit and save the relevant output connection and then click ‘Remove all associated documents’, the next time I crawl, the changes are picked up. It is clear that Manifold is just not updating whatever internal record it has for this item.
>>>>>> 
>>>>>> Any ideas?
>>>>>> 
>>>>>> Many thanks.
>>>>>> 


Re: Manifold/Alfresco seeding and security

Posted by Rafa Haro <rh...@gmail.com>.
Hi,

On the Alfresco side, hope this helps:

http://docs.alfresco.com/4.1/tasks/amp-install.html

Cheers

On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <da...@gmail.com> wrote:

> The AMP file is actually shipped as part of the binary MCF distribution.
> You can find it under "plugins".
> Karl
> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pf...@funnelback.com>
> wrote:
>> Hi all,
>>
>> Hopefully this will be my only request for information today.
>> I’m afraid this is a bit of a newbie question but I have managed to get
>> the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The only
>> bit I am missing now is to install the AMP file in Alfresco.
>>
>> I realise that this is slightly outside of the Manifold remit but I
>> wondered if anyone can advise how I build the AMP file from the URL (
>> https://github.com/maoo/alfresco-indexer)? I have cloned the repository
>> to my local drive but, having never worked with Maven, am at a loss as to
>> how to generate the AMP file that I then need to install into Alfresco.
>>
>> Many thanks,
>>
>> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>>
>> The only way you can have such a reduced list of connectors is if somebody
>> commented out many connectors in your connectors.xml, or removed them from
>> the database table where they are registered by hand.
>>
>> Karl
>>
>>
>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pf...@funnelback.com>
>> wrote:
>>
>>> After a good deal of time clicking around I came to the same conclusion -
>>> that there is no way of telling from the UI!!
>>>
>>> Having dug a bit deeper I believe I may actually have the Alfresco
>>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>>
>>> Looking in the ‘connectors.xml’ file I can also see the line:
>>>
>>> <repositoryconnector name="Alfresco Webscript"
>>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>>
>>> You can imagine my excitement!
>>>
>>> The only thing I am missing is the option in the UI. When I click to
>>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>>
>>> Perhaps I am hoping for too much to hope that I can make a simple change
>>> to enable this repo connection?
>>>
>>> Thanks for all the help everyone
>>>
>>>
>>>
>>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
>>>
>>> Hah; there's not a way to inquire in the UI, if that's what you mean.
>>> But if you see "Alfresco webscript" in the list of repository connection
>>> types, you've got a version that supports that connector.
>>>
>>> Thanks,
>>> Karl
>>>
>>>
>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pf...@funnelback.com>
>>> wrote:
>>>
>>>> Thanks Rafa.
>>>>
>>>> As an aside, is there an easy way to identify which version of
>>>> ManifoldCF you are on?
>>>>
>>>> Cheers
>>>>
>>>> *Paul Farrell*
>>>>
>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>>>>
>>>> Hi Paul,
>>>>
>>>> All you need to do is to install this webscript
>>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>>> instance. The connector itself is already part of the most recent versions
>>>> of ManifoldCF
>>>>
>>>> Cheers,
>>>> Rafa
>>>>
>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pf...@funnelback.com>
>>>> wrote:
>>>>
>>>>> Ok, thanks again guys.
>>>>>
>>>>> The Webscript connector it is.
>>>>>
>>>>> I realise I am asking a lot here but are there any easy-to-follow
>>>>> guidelines on how to get this Webscript connector installed?  I see there
>>>>> is a GitHub page here (
>>>>> https://github.com/maoo/alfresco-webscript-manifold-connector) which
>>>>> discusses it (although it directs you to a repository of files).
>>>>>
>>>>> I am just keen to make sure that any steps I follow to try and get this
>>>>> Webscript connector installed and working are up-to-date, reliable steps.
>>>>> I would hate to waste time with out-of-date information.
>>>>>
>>>>> Thanks all
>>>>>
>>>>>
>>>>>
>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com> wrote:
>>>>>
>>>>> Hi Paul,
>>>>>
>>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned.
>>>>> Web services are slow compared to other services, and I've also checked
>>>>> that Alfresco's CMIS web services do not return a change token (though
>>>>> there may be something I don't know).
>>>>>
>>>>> By the way, the current version of the CMIS connector is not aware of
>>>>> the change token. I would write a patch for you if Alfresco supported
>>>>> the change token property.
>>>>>
>>>>> Thanks!
>>>>> Muhammed
>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <da...@gmail.com> wrote:
>>>>>
>>>>>> Hi Paul,
>>>>>>
>>>>>> The Alfresco Webscript connector is a wholly different connector that
>>>>>> has no relation to the CMIS connector.  It requires an Alfresco webscript
>>>>>> plugin be installed on your Alfresco server to work, though.
>>>>>>
>>>>>> Hope that helps.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>
>>>>>>> Hi Muhammed/Karl,
>>>>>>>
>>>>>>> Firstly, thank-you so much for taking the time to reply. It is very
>>>>>>> much appreciated.
>>>>>>>
>>>>>>> Currently I am using the AtomPub for my CMIS repository connection. I
>>>>>>> have just read something which may shed a little light on this. The post
>>>>>>> read that change tokens are not passed via AtomPub connections (
>>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>>>>>> change in Alfresco.
>>>>>>>
>>>>>>> It looks like I have two possible options left open to me (correct me
>>>>>>> if I’m wrong):
>>>>>>>
>>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>>>>> connection mechanism
>>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>>>>>> above?)
>>>>>>>
>>>>>>> Thanks again,
>>>>>>>
>>>>>>> Paul
>>>>>>>
>>>>>>> *Paul Farrell*
>>>>>>>
>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> Repositories should tell ManifoldCF when their content is updated.
>>>>>>> The current CMIS connector reindexes a document only if the latest
>>>>>>> version of the document has changed, not when it is merely updated.
>>>>>>>
>>>>>>> There is a change token property in the CMIS specification, and it
>>>>>>> should change whenever a document is updated so that ManifoldCF can
>>>>>>> tell the document was updated, but implementing the change token
>>>>>>> property is optional.  I've checked Alfresco's CMIS web site and seen
>>>>>>> that they don't set the change token.
>>>>>>>
>>>>>>> I think there is nothing we can do at this point.
>>>>>>>
>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Paul,
>>>>>>>>
>>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>>>>> document version string the connector constructs should be adequate to
>>>>>>>> detect all changes.  Can you create a ticket?
>>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>>>>>>> and forth before I can determine that for sure.
>>>>>>>>
>>>>>>>> In the meantime, have you considered using the Alfresco Webscript
>>>>>>>> connector?  It's the preferred way to do Alfresco indexing, although there
>>>>>>>> have been issues reported having to do with running it on some
>>>>>>>> configurations of Alfresco.  I'm not entirely sure what the problem is
>>>>>>>> there; maybe a version dependency of some kind.
>>>>>>>>
>>>>>>>> Karl
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Everyone,
>>>>>>>>>
>>>>>>>>> Hoping someone may be able to advise.
>>>>>>>>>
>>>>>>>>> I am currently using Manifold, together with a CMIS connector, to
>>>>>>>>> retrieve and index content from an Alfresco repository.
>>>>>>>>>
>>>>>>>>> All is going well apart from, what I would call, the ‘incremental
>>>>>>>>> crawl’.
>>>>>>>>>
>>>>>>>>> The main issue I am having is that the modification of a document’s
>>>>>>>>> security settings, in Alfresco, is not being picked up in next Manifold
>>>>>>>>> crawl. As an example I have a document ‘TestDoc1’ which has user A and B as
>>>>>>>>> Consumers. I run a crawl in Manifold and it picks up the documents fine.
>>>>>>>>> The security is set as expected. I then remove ‘User A’ from the security
>>>>>>>>> of that document and re-run the Manifold crawl. User A can still see the
>>>>>>>>> document in the local search engine.
>>>>>>>>>
>>>>>>>>> It is as if Manifold is not treating the security update as a
>>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>>>>>>> whatever internal record it has for this item.
>>>>>>>>>
>>>>>>>>> Any ideas?
>>>>>>>>>
>>>>>>>>> Many thanks.

Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
The AMP file is actually shipped as part of the binary MCF distribution.
You can find it under "plugins".

Karl
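
If it is unclear where the AMP sits inside an unpacked distribution, a short search can help. This is a minimal sketch that assumes only that the file ends in .amp and lives somewhere below the directory you point it at; the function name is made up for illustration, and the exact layout under "plugins" is not confirmed in this thread.

```python
from pathlib import Path
from typing import List


def find_amp_files(mcf_home: Path) -> List[Path]:
    """Return every *.amp file found anywhere below mcf_home, sorted by path."""
    return sorted(p for p in mcf_home.rglob("*.amp") if p.is_file())
```

Calling find_amp_files(Path("/path/to/your/manifoldcf")) with the real install directory lists candidate files; the path shown here is a placeholder.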


On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pf...@funnelback.com>
wrote:

> Hi all,
>
> Hopefully this will be my only request for information today.
> I’m afraid this is a bit of a newbie question but I have managed to get
> the Manifold UI to now show ‘Alfresco Webscripts’ as a connector. The only
> bit I am missing now is to install the AMP file in Alfresco.
>
> I realise that this is slightly outside of the Manifold remit but I
> wondered if anyone can advise how I build the AMP file from the URL (
> https://github.com/maoo/alfresco-indexer)? I have cloned the repository
> to my local drive but, having never worked with Maven, am at a loss as to
> how to generate the AMP file that I then need to install into Alfresco.
>
> Many thanks,
>
> On 19 Oct 2015, at 17:36, Karl Wright <da...@gmail.com> wrote:
>
> The only way you can have such a reduced list of connectors is if somebody
> commented out many connectors in your connectors.xml, or removed them from
> the database table where they are registered by hand.
>
> Karl
>
>
> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pf...@funnelback.com>
> wrote:
>
>> After a good deal of time clicking around I came to the same conclusion -
>> that there is no way of telling from the UI!!
>>
>> Having dug a bit deeper I believe I may actually have the Alfresco
>> WebScript connectors installed. At least the 0.7.0 version. I notice in the
>> ‘lib’ directory that I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>
>> Looking in the ‘connectors.xml’ file I can also see the line:
>>
>> <repositoryconnector name="Alfresco Webscript"
>> class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>
>> You can imagine my excitement!
>>
>> The only thing I am missing is the option in the UI. When I click to
>> create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive,
>> HDFS, Jira, Meridio, RSS, Sharepoint.
>>
>> Perhaps I am hoping for too much to hope that I can make a simple change
>> to enable this repo connection?
>>
>> Thanks for all the help everyone
>>
>>
>>
>> On 19 Oct 2015, at 17:26, Karl Wright <da...@gmail.com> wrote:
>>
>> Hah; there's not a way to inquire in the UI, if that's what you mean.
>> But if you see "Alfresco webscript" in the list of repository connection
>> types, you've got a version that supports that connector.
>>
>> Thanks,
>> Karl
>>
>>
>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pf...@funnelback.com>
>> wrote:
>>
>>> Thanks Rafa.
>>>
>>> As an aside, is there an easy way to identify which version of
>>> ManifoldCF you are on?
>>>
>>> Cheers
>>>
>>> *Paul Farrell*
>>>
>>> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>>>
>>> Hi Paul,
>>>
>>> All you need to do is to install this webscript
>>> <https://github.com/maoo/alfresco-indexer> within your Alfresco
>>> instance. The connector itself is already part of the most recent versions
>>> of ManifoldCF
>>>
>>> Cheers,
>>> Rafa
>>>
>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pf...@funnelback.com>
>>> wrote:
>>>
>>>> Ok, thanks again guys.
>>>>
>>>> The Webscript connector it is.
>>>>
>>>> I realise I am asking a lot here but are there any easy-to-follow
>>>> guidelines on how to get this Webscript connector installed?  I see there
>>>> is a GitHub page here (
>>>> https://github.com/maoo/alfresco-webscript-manifold-connector) which
>>>> discusses it (although it directs you to a repository of files).
>>>>
>>>> I am just keen to make sure that any steps I follow to try and get this
>>>> Webscript connector installed and working are up-to-date, reliable steps.
>>>> I would hate to waste time with out-of-date information.
>>>>
>>>> Thanks all
>>>>
>>>>
>>>>
>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com> wrote:
>>>>
>>>> Hi Paul,
>>>>
>>>> I suggest you use the Alfresco Webscript connector, as Karl mentioned.
>>>> Web services are slow compared to other services, and I've also checked
>>>> that Alfresco's CMIS web services do not return a change token (though
>>>> there may be something I don't know).
>>>>
>>>> By the way, the current version of the CMIS connector is not aware of
>>>> the change token. I would write a patch for you if Alfresco supported
>>>> the change token property.
>>>>
>>>> Thanks!
>>>> Muhammed
>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <da...@gmail.com> wrote:
>>>>
>>>>> Hi Paul,
>>>>>
>>>>> The Alfresco Webscript connector is a wholly different connector that
>>>>> has no relation to the CMIS connector.  It requires an Alfresco webscript
>>>>> plugin be installed on your Alfresco server to work, though.
>>>>>
>>>>> Hope that helps.
>>>>>
>>>>> Karl
>>>>>
>>>>>
>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <
>>>>> pfarrell@funnelback.com> wrote:
>>>>>
>>>>>> Hi Muhammed/Karl,
>>>>>>
>>>>>> Firstly, thank-you so much for taking the time to reply. It is very
>>>>>> much appreciated.
>>>>>>
>>>>>> Currently I am using the AtomPub for my CMIS repository connection. I
>>>>>> have just read something which may shed a little light on this. The post
>>>>>> read that change tokens are not passed via AtomPub connections (
>>>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>>>>> change in Alfresco.
>>>>>>
>>>>>> It looks like I have two possible options left open to me (correct me
>>>>>> if I’m wrong):
>>>>>>
>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the
>>>>>> connection mechanism
>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’
>>>>>> connector?  (or is this the same as the ‘Web Services’ connection mentioned
>>>>>> above?)
>>>>>>
>>>>>> Thanks again,
>>>>>>
>>>>>> Paul
>>>>>>
>>>>>> *Paul Farrell*
>>>>>>
>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Paul,
>>>>>>
>>>>>> Repositories should tell ManifoldCF when their content is updated.
>>>>>> The current CMIS connector reindexes a document only if the latest
>>>>>> version of the document has changed, not when it is merely updated.
>>>>>>
>>>>>> There is a change token property in the CMIS specification, and it
>>>>>> should change whenever a document is updated so that ManifoldCF can
>>>>>> tell the document was updated, but implementing the change token
>>>>>> property is optional.  I've checked Alfresco's CMIS web site and seen
>>>>>> that they don't set the change token.
>>>>>>
>>>>>> I think there is nothing we can do at this point.
>>>>>>
>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>>>> document version string the connector constructs should be adequate to
>>>>>>> detect all changes.  Can you create a ticket?
>>>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please
>>>>>>> include what version of MCF you are using here.  FWIW, this may be in fact
>>>>>>> a bug in the Alfresco CMIS implementation, but we'll have to have some back
>>>>>>> and forth before I can determine that for sure.
>>>>>>>
>>>>>>> In the meantime, have you considered using the Alfresco Webscript
>>>>>>> connector?  It's the preferred way to do Alfresco indexing, although there
>>>>>>> have been issues reported having to do with running it on some
>>>>>>> configurations of Alfresco.  I'm not entirely sure what the problem is
>>>>>>> there; maybe a version dependency of some kind.
>>>>>>>
>>>>>>> Karl
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <
>>>>>>> pfarrell@funnelback.com> wrote:
>>>>>>>
>>>>>>>> Hi Everyone,
>>>>>>>>
>>>>>>>> Hoping someone may be able to advise.
>>>>>>>>
>>>>>>>> I am currently using Manifold, together with a CMIS connector, to
>>>>>>>> retrieve and index content from an Alfresco repository.
>>>>>>>>
>>>>>>>> All is going well apart from, what I would call, the ‘incremental
>>>>>>>> crawl’.
>>>>>>>>
>>>>>>>> The main issue I am having is that the modification of a document’s
>>>>>>>> security settings, in Alfresco, is not being picked up in next Manifold
>>>>>>>> crawl. As an example I have a document ‘TestDoc1’ which has user A and B as
>>>>>>>> Consumers. I run a crawl in Manifold and it picks up the documents fine.
>>>>>>>> The security is set as expected. I then remove ‘User A’ from the security
>>>>>>>> of that document and re-run the Manifold crawl. User A can still see the
>>>>>>>> document in the local search engine.
>>>>>>>>
>>>>>>>> It is as if Manifold is not treating the security update as a
>>>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>>>>>> the Output Connections, edit and save the relevant output connection and
>>>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>>>>>> whatever internal record it has for this item.
>>>>>>>>
>>>>>>>> Any ideas?
>>>>>>>>
>>>>>>>> Many thanks.

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi all,

Hopefully this will be my only request for information today.
I’m afraid this is a bit of a newbie question, but I have now managed to get the Manifold UI to show ‘Alfresco Webscripts’ as a connector. The only step I am missing is installing the AMP file in Alfresco.

I realise that this is slightly outside the Manifold remit, but can anyone advise how I build the AMP file from the repository (https://github.com/maoo/alfresco-indexer)? I have cloned the repository to my local drive but, having never worked with Maven, I am at a loss as to how to generate the AMP file that I then need to install into Alfresco.

Many thanks,



Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
The only way you can have such a reduced list of connectors is if somebody
commented out many connectors in your connectors.xml, or removed them from
the database table where they are registered by hand.

Karl
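
One way to check the first possibility (entries commented out in connectors.xml) is a short script that lists the repository connectors the file actually declares. This is only a sketch: the sample XML below is hypothetical, the "example.CmisConnector" class name is a placeholder, and the real file should be read from wherever your ManifoldCF install keeps it. It does not cover the second possibility, connectors removed from the database table by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical connectors.xml content for illustration only.
# "example.CmisConnector" is a placeholder class name; the Alfresco Webscript
# class name is the one quoted elsewhere in this thread.
SAMPLE = """<connectors>
  <repositoryconnector name="CMIS" class="example.CmisConnector"/>
  <!-- <repositoryconnector name="Alfresco Webscript" class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/> -->
</connectors>
"""


def active_repository_connectors(xml_text):
    """Return the names of repositoryconnector elements that are not commented out.

    The default ElementTree parser discards XML comments, so any connector
    wrapped in <!-- ... --> does not appear in the result.
    """
    root = ET.fromstring(xml_text)
    return [el.get("name") for el in root.iter("repositoryconnector")]


print(active_repository_connectors(SAMPLE))
```

If "Alfresco Webscript" is missing from the printed list, its entry is either absent or commented out in the file that was parsed.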



Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
After a good deal of time clicking around I came to the same conclusion - that there is no way of telling from the UI!!

Having dug a bit deeper, I believe I may actually have the Alfresco Webscript connector installed, or at least the 0.7.0 version of it. I notice that the ‘lib’ directory contains ‘alfresco-indexer-webscripts-0.7.0.amp’.

Looking in the ‘connectors.xml’ file I can also see the line:

<repositoryconnector name="Alfresco Webscript" class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>

You can imagine my excitement!

The only thing I am missing is the option in the UI. When I click to create a new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive, HDFS, Jira, Meridio, RSS, Sharepoint. 

Perhaps it is too much to hope that I can make a simple change to enable this repo connection?

Thanks for all the help everyone 





Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hah; there's not a way to inquire in the UI, if that's what you mean.  But
if you see "Alfresco webscript" in the list of repository connection types,
you've got a version that supports that connector.

Thanks,
Karl


On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pf...@funnelback.com>
wrote:

> Thanks Rafa.
>
> As an aside, is there an easy way to identify which version of ManifoldCF
> you are on?
>
> Cheers
>
> *Paul Farrell*
> Senior Search Consultant
>
> 109-123 Clifton Street, London EC2A 4LD
> *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
>
> *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>
> Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
> Twitter <https://twitter.com/funnelback>
>
> Funnelback UK Ltd is a limited liability company registered in England &
> Wales. Registered address: Zetland House 109-123, Clifton Street, London.
> EC2A 4LD. Company registration number: 07004264.
>
> On 19 Oct 2015, at 16:54, Rafa Haro <rh...@apache.org> wrote:
>
> Hi Paul,
>
> All you need to do is to install this webscript
> <https://github.com/maoo/alfresco-indexer> within your Alfresco instance.
> The connector itself is already part of the most recent versions of
> ManifoldCF
>
> Cheers,
> Rafa
>
> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pf...@funnelback.com>
> wrote:
>
>> Ok, thanks again guys.
>>
>> The Webscript connector it is.
>>
>> I realise I am asking a lot here but are there any easy-to-follow
>> guidelines on how to get this Webscript connector installed?  I see there
>> is a GitHub page here (
>> https://github.com/maoo/alfresco-webscript-manifold-connector) which
>> discusses it (although it directs you to a repository of files).
>>
>> I am just keen to make sure that any steps I follow to try and get this
>> Webscript connector installed and working are updated, reliable steps. I
>> would hate to waste time with out of date information.
>>
>> Thanks all
>>
>>
>>
>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> I suggest you use the Alfresco Webscript connector, as Karl mentioned. Web
>> services are very slow compared to other services, and I've also checked
>> that Alfresco's CMIS web services do not return a change token (maybe there
>> is something that I don't know).
>>
>> By the way, the current version of the CMIS connector is not aware of the
>> change token. I would write a patch for you if Alfresco supported the
>> change token property.
>>
>> Thanks!
>> Muhammed
>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <da...@gmail.com> wrote:
>>
>>> Hi Paul,
>>>
>>> The Alfresco Webscript connector is a wholly different connector that
>>> has no relation to the CMIS connector.  It requires an Alfresco webscript
>>> plugin be installed on your Alfresco server to work, though.
>>>
>>> Hope that helps.
>>>
>>> Karl
>>>
>>>
>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <pf...@funnelback.com>
>>> wrote:
>>>
>>>> Hi Muhammed/Karl,
>>>>
>>>> Firstly, thank-you so much for taking the time to reply. It is very
>>>> much appreciated.
>>>>
>>>> Currently I am using the AtomPub for my CMIS repository connection. I
>>>> have just read something which may shed a little light on this. The post
>>>> read that change tokens are not passed via AtomPub connections (
>>>> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>> If true, this would explain why ManifoldCF may be unable to determine a
>>>> change in Alfresco.
>>>>
>>>> It looks like I have two possible options left open to me (correct me
>>>> if I’m wrong):
>>>>
>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the connection
>>>> mechanism
>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’ connector?
>>>>  (or is this the same as the ‘Web Services’ connection mentioned above?)
>>>>
>>>> Thanks again,
>>>>
>>>> Paul
>>>>
>>>> *Paul Farrell*
>>>> Senior Search Consultant
>>>>
>>>> 109-123 Clifton Street, London EC2A 4LD
>>>> *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
>>>>
>>>> *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>>>>
>>>> Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback>
>>>> - Twitter <https://twitter.com/funnelback>
>>>>
>>>> Funnelback UK Ltd is a limited liability company registered in England
>>>> & Wales. Registered address: Zetland House 109-123, Clifton Street, London.
>>>> EC2A 4LD. Company registration number: 07004264.
>>>>
>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com> wrote:
>>>>
>>>> Hi Paul,
>>>>
>>>> Repositories should tell ManifoldCF when their documents are updated.
>>>> The current CMIS connector reindexes a document if its latest version
>>>> has changed, not if it has merely been updated.
>>>>
>>>> There is a change token property in the CMIS specification, and it should
>>>> change when a document is updated so that ManifoldCF can tell that the
>>>> document has been updated, but implementing the change token property is
>>>> optional.  I've checked Alfresco's CMIS web site and seen that they do not
>>>> set the change token.
>>>>
>>>> I think there is nothing we can do at this point.
>>>>
>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com> wrote:
>>>>
>>>>> Hi Paul,
>>>>>
>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>> document version string the connector constructs should be adequate to
>>>>> detect all changes.  Can you create a ticket?
>>>>> https://issues.apache.org/jira , project ManifoldCF.  Please include
>>>>> what version of MCF you are using here.  FWIW, this may be in fact a bug in
>>>>> the Alfresco CMIS implementation, but we'll have to have some back and
>>>>> forth before I can determine that for sure.
>>>>>
>>>>> In the meantime, have you considered using the Alfresco Webscript
>>>>> connector?  It's the preferred way to do Alfresco indexing, although there
>>>>> have been issues reported having to do with running it on some
>>>>> configurations of Alfresco.  I'm not entirely sure what the problem is
>>>>> there; maybe a version dependency of some kind.
>>>>>
>>>>> Karl
>>>>>
>>>>>
>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pfarrell@funnelback.com
>>>>> > wrote:
>>>>>
>>>>>> Hi Everyone,
>>>>>>
>>>>>> Hoping someone may be able to advise.
>>>>>>
>>>>>> I am currently using Manifold, together with a CMIS connector, to
>>>>>> retrieve and index content from an Alfresco repository.
>>>>>>
>>>>>> All is going well apart from, what I would call, the ‘incremental
>>>>>> crawl’.
>>>>>>
>>>>>> The main issue I am having is that the modification of a document’s
>>>>>> security settings, in Alfresco, is not being picked up in next Manifold
>>>>>> crawl. As an example I have a document ‘TestDoc1’ which has user A and B as
>>>>>> Consumers. I run a crawl in Manifold and it picks up the documents fine.
>>>>>> The security is set as expected. I then remove ‘User A’ from the security
>>>>>> of that document and re-run the Manifold crawl. User A can still see the
>>>>>> document in the local search engine.
>>>>>>
>>>>>> It is as if Manifold is not treating the security update as a
>>>>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>>>>> the Output Connections, edit and save the relevant output connection and
>>>>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>>>>> changes are picked up. It is clear that Manifold is just not updating
>>>>>> whatever internal record it has for this item.
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Many thanks.
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Thanks Rafa.

As an aside, is there an easy way to identify which version of ManifoldCF you are on?

Cheers

Paul Farrell
Senior Search Consultant
 
109-123 Clifton Street, London EC2A 4LD
T +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>

UNITED KINGDOM | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES

Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> - Twitter <https://twitter.com/funnelback>

Funnelback UK Ltd is a limited liability company registered in England & Wales. Registered address: Zetland House 109-123, Clifton Street, London. EC2A 4LD. Company registration number: 07004264.



Re: Manifold/Alfresco seeding and security

Posted by Rafa Haro <rh...@apache.org>.
Hi Paul,

All you need to do is install this webscript
<https://github.com/maoo/alfresco-indexer> within your Alfresco instance.
The connector itself is already part of the most recent versions of
ManifoldCF.

Cheers,
Rafa


Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Ok, thanks again guys. 

The Webscript connector it is. 

I realise I am asking a lot here, but are there any easy-to-follow guidelines on how to get this Webscript connector installed?  I see there is a GitHub page here (https://github.com/maoo/alfresco-webscript-manifold-connector) which discusses it (although it directs you to a repository of files).

I am just keen to make sure that any steps I follow to try and get this Webscript connector installed and working are up-to-date, reliable steps. I would hate to waste time with out-of-date information.

Thanks all




Re: Manifold/Alfresco seeding and security

Posted by Muhammed Olgun <mh...@gmail.com>.
Hi Paul,

I suggest you use the Alfresco Webscript connector, as Karl mentioned. Web
services are very slow compared to other services, and I've also checked
that Alfresco's CMIS web services do not return a change token (maybe there
is something that I don't know).

By the way, the current version of the CMIS connector is not aware of the
change token. I would write a patch for you if Alfresco supported the change
token property.

Thanks!
Muhammed
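
Editor's sketch: the thread turns on how a connector decides that a document has changed. Karl notes that the version string the connector constructs should be adequate to detect all changes, and Muhammed notes that the CMIS connector does not look at the change token. The following is only an illustration of that idea, not the actual ManifoldCF CMIS connector code. The class name, method name and field layout are hypothetical, and it assumes the connector can read the document's last-modified date, its (optional) change token and its list of permitted principals.

```java
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical helper, not part of ManifoldCF: it shows how folding the ACL
// into a version string makes a permission-only change visible to a crawler.
final class VersionStringSketch {

    /**
     * Builds a string that differs whenever the modification date, the
     * change token (may be null if the repository does not supply one),
     * or the set of permitted principals differs.
     */
    static String buildVersionString(String lastModified,
                                     String changeToken,
                                     Collection<String> aclPrincipals) {
        StringBuilder sb = new StringBuilder();
        sb.append("mod=").append(lastModified == null ? "" : lastModified);
        sb.append("|token=").append(changeToken == null ? "" : changeToken);
        sb.append("|acl=");
        // Sort the principals so that ordering alone never looks like a change.
        // Assumes principal names contain no commas.
        for (String principal : new TreeSet<>(aclPrincipals)) {
            sb.append(principal).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same document, same modification date, no change token available.
        String before = buildVersionString("2015-10-19T07:43:00Z", null,
                java.util.Arrays.asList("userA", "userB"));
        String after = buildVersionString("2015-10-19T07:43:00Z", null,
                java.util.Arrays.asList("userB"));
        // A crawler comparing these strings would re-index the document.
        System.out.println(before.equals(after) ? "unchanged" : "changed");
    }
}
```

With a scheme like this, removing 'User A' from TestDoc1 alters the version string even when the modification date and change token stay the same, which is the case Paul describes.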

Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hi Paul,

The Alfresco Webscript connector is a wholly different connector that has
no relation to the CMIS connector.  It requires an Alfresco webscript
plugin be installed on your Alfresco server to work, though.

Hope that helps.

Karl


On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <pf...@funnelback.com>
wrote:

> Hi Muhammed/Karl,
>
> Firstly, thank-you so much for taking the time to reply. It is very much
> appreciated.
>
> Currently I am using the AtomPub for my CMIS repository connection. I have
> just read something which may shed a little light on this. The post read
> that change tokens are not passed via AtomPub connections (
> https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
> If true, this would explain why ManifoldCF may be unable to determine a
> change in Alfresco.
>
> It looks like I have two possible options left open to me (correct me if
> I’m wrong):
>
> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the connection
> mechanism
> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’ connector?
>  (or is this the same as the ‘Web Services’ connection mentioned above?)
>
> Thanks again,
>
> Paul
>
> *Paul Farrell*
> Senior Search Consultant
>
> 109-123 Clifton Street, London EC2A 4LD
> *T* +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
>
> *UNITED KINGDOM* | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
>
> Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> -
> Twitter <https://twitter.com/funnelback>
>
> Funnelback UK Ltd is a limited liability company registered in England &
> Wales. Registered address: Zetland House 109-123, Clifton Street, London.
> EC2A 4LD. Company registration number: 07004264.
>
> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com> wrote:
>
> Hi Paul,
>
> Repositories should tell ManifoldCF when their documents are updated.
> The current CMIS connector reindexes a document if its latest version
> has changed, not if it has merely been updated.
>
> There is a change token property in the CMIS specification, and it should
> change when a document is updated so that ManifoldCF can tell that the
> document has been updated, but implementing the change token property is
> optional.  I've checked Alfresco's CMIS web site and seen that they do not
> set the change token.
>
> I think there is nothing we can do at this point.
>
> 19 Eki 2015 Pzt, 15:59 tarihinde, Karl Wright <da...@gmail.com> şunu
> yazdı:
>
>> Hi Paul,
>>
>> This looks like a bug in the CMIS connector to me; usually the document
>> version string the connector constructs should be adequate to detect all
>> changes.  Can you create a ticket?  https://issues.apache.org/jira ,
>> project ManifoldCF.  Please include what version of MCF you are using
>> here.  FWIW, this may be in fact a bug in the Alfresco CMIS implementation,
>> but we'll have to have some back and forth before I can determine that for
>> sure.
>>
>> In the meantime, have you considered using the Alfresco Webscript
>> connector?  It's the preferred way to do Alfresco indexing, although there
>> have been issues reported having to do with running it on some
>> configurations of Alfresco.  I'm not entirely sure what the problem is
>> there; maybe a version dependency of some kind.
>>
>> Karl
>>
>>
>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pf...@funnelback.com>
>> wrote:
>>
>>> Hi Everyone,
>>>
>>> Hoping someone may be able to advise.
>>>
>>> I am currently using Manifold, together with a CMIS connector, to
>>> retrieve and index content from an Alfresco repository.
>>>
>>> All is going well apart from, what I would call, the ‘incremental crawl’.
>>>
>>> The main issue I am having is that the modification of a document’s
>>> security settings, in Alfresco, is not being picked up in next Manifold
>>> crawl. As an example I have a document ‘TestDoc1’ which has user A and B as
>>> Consumers. I run a crawl in Manifold and it picks up the documents fine.
>>> The security is set as expected. I then remove ‘User A’ from the security
>>> of that document and re-run the Manifold crawl. User A can still see the
>>> document in the local search engine.
>>>
>>> It is as if Manifold is not treating the security update as a
>>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>>> the Output Connections, edit and save the relevant output connection and
>>> then click ‘Remove all associated documents’, the next time I crawl, the
>>> changes are picked up. It is clear that Manifold is just not updating
>>> whatever internal record it has for this item.
>>>
>>> Any ideas?
>>>
>>> Many thanks.
>>
>>
>>
>

Re: Manifold/Alfresco seeding and security

Posted by Paul Farrell <pf...@funnelback.com>.
Hi Muhammed/Karl,

Firstly, thank you so much for taking the time to reply. It is very much appreciated.

Currently I am using the AtomPub binding for my CMIS repository connection. I have just read something which may shed a little light on this. The post said that change tokens are not passed via AtomPub connections (https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758). If true, this would explain why ManifoldCF may be unable to detect a change in Alfresco.

It looks like I have two possible options left open to me (correct me if I’m wrong):

1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the connection mechanism
2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’ connector?  (or is this the same as the ‘Web Services’ connection mentioned above?)

Thanks again,

Paul

Paul Farrell
Senior Search Consultant
 
109-123 Clifton Street, London EC2A 4LD
T +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>

UNITED KINGDOM | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES

Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> - Twitter <https://twitter.com/funnelback>

Funnelback UK Ltd is a limited liability company registered in England & Wales. Registered address: Zetland House 109-123, Clifton Street, London. EC2A 4LD. Company registration number: 07004264.

> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh...@gmail.com> wrote:
> 
> Hi Paul,
> 
> Repositories should inform ManifoldCF when their content is updated. The current CMIS connector reindexes a document only when its latest version has changed, not when the document is merely updated.
> 
> There is a change token property in the CMIS specification, and it should change when a document is updated so that ManifoldCF can tell the document was updated, but implementing the change token property is optional.  I've checked Alfresco's CMIS web site and seen that they don't set the change token.
> 
> I think there is nothing we can do at this point.
> 
> On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com> wrote:
> Hi Paul,
> 
> This looks like a bug in the CMIS connector to me; usually the document version string the connector constructs should be adequate to detect all changes.  Can you create a ticket?  https://issues.apache.org/jira , project ManifoldCF.  Please include what version of MCF you are using here.  FWIW, this may be in fact a bug in the Alfresco CMIS implementation, but we'll have to have some back and forth before I can determine that for sure.
> 
> In the meantime, have you considered using the Alfresco Webscript connector?  It's the preferred way to do Alfresco indexing, although there have been issues reported having to do with running it on some configurations of Alfresco.  I'm not entirely sure what the problem is there; maybe a version dependency of some kind.
> 
> Karl
> 
> 
> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pf...@funnelback.com> wrote:
> Hi Everyone,
> 
> Hoping someone may be able to advise.
> 
> I am currently using Manifold, together with a CMIS connector, to retrieve and index content from an Alfresco repository.
> 
> All is going well apart from, what I would call, the ‘incremental crawl’.
> 
> The main issue I am having is that the modification of a document’s security settings, in Alfresco, is not being picked up in next Manifold crawl. As an example I have a document ‘TestDoc1’ which has user A and B as Consumers. I run a crawl in Manifold and it picks up the documents fine.  The security is set as expected. I then remove ‘User A’ from the security of that document and re-run the Manifold crawl. User A can still see the document in the local search engine.
> 
> It is as if Manifold is not treating the security update as a ‘modification’ and is therefore not refreshing it. Note that if I go into the Output Connections, edit and save the relevant output connection and then click ‘Remove all associated documents’, the next time I crawl, the changes are picked up. It is clear that Manifold is just not updating whatever internal record it has for this item.
> 
> Any ideas?
> 
> Many thanks.
> 


Re: Manifold/Alfresco seeding and security

Posted by Muhammed Olgun <mh...@gmail.com>.
Hi Paul,

Repositories should inform ManifoldCF when their content is updated. The
current CMIS connector reindexes a document only when its latest version
has changed, not when the document is merely updated.

There is a change token property in the CMIS specification, and it should
change when a document is updated so that ManifoldCF can tell the document
was updated, but implementing the change token property is optional.  I've
checked Alfresco's CMIS web site and seen that they don't set the change
token.
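
To illustrate why an optional change token matters, here is a minimal Java
sketch. It is not taken from the CMIS connector; the class, enum, and method
names are hypothetical, and the only assumption is that a repository which
does not implement the property hands back a null token.

```java
public class ChangeTokenSketch {
    // Possible outcomes when comparing tokens between two crawls.
    enum Decision { UNCHANGED, CHANGED, UNKNOWN }

    // Compare the token stored at the previous crawl with the one read now.
    // A null on either side means the repository gave us nothing to compare.
    static Decision compare(String storedToken, String currentToken) {
        if (storedToken == null || currentToken == null) {
            return Decision.UNKNOWN;
        }
        return storedToken.equals(currentToken)
                ? Decision.UNCHANGED
                : Decision.CHANGED;
    }

    public static void main(String[] args) {
        System.out.println(compare("t1", "t2"));  // token moved on
        System.out.println(compare("t1", null));  // repository sets no token
    }
}
```

With a null token the crawler cannot tell an updated document from an
untouched one, so it has to rely on some other signal.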

I think there is nothing we can do at this point.

On Mon, 19 Oct 2015 at 15:59, Karl Wright <da...@gmail.com> wrote:

> Hi Paul,
>
> This looks like a bug in the CMIS connector to me; usually the document
> version string the connector constructs should be adequate to detect all
> changes.  Can you create a ticket?  https://issues.apache.org/jira ,
> project ManifoldCF.  Please include what version of MCF you are using
> here.  FWIW, this may be in fact a bug in the Alfresco CMIS implementation,
> but we'll have to have some back and forth before I can determine that for
> sure.
>
> In the meantime, have you considered using the Alfresco Webscript
> connector?  It's the preferred way to do Alfresco indexing, although there
> have been issues reported having to do with running it on some
> configurations of Alfresco.  I'm not entirely sure what the problem is
> there; maybe a version dependency of some kind.
>
> Karl
>
>
> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pf...@funnelback.com>
> wrote:
>
>> Hi Everyone,
>>
>> Hoping someone may be able to advise.
>>
>> I am currently using Manifold, together with a CMIS connector, to
>> retrieve and index content from an Alfresco repository.
>>
>> All is going well apart from, what I would call, the ‘incremental crawl’.
>>
>> The main issue I am having is that the modification of a document’s
>> security settings, in Alfresco, is not being picked up in next Manifold
>> crawl. As an example I have a document ‘TestDoc1’ which has user A and B as
>> Consumers. I run a crawl in Manifold and it picks up the documents fine.
>> The security is set as expected. I then remove ‘User A’ from the security
>> of that document and re-run the Manifold crawl. User A can still see the
>> document in the local search engine.
>>
>> It is as if Manifold is not treating the security update as a
>> ‘modification’ and is therefore not refreshing it. Note that if I go into
>> the Output Connections, edit and save the relevant output connection and
>> then click ‘Remove all associated documents’, the next time I crawl, the
>> changes are picked up. It is clear that Manifold is just not updating
>> whatever internal record it has for this item.
>>
>> Any ideas?
>>
>> Many thanks.
>
>
>

Re: Manifold/Alfresco seeding and security

Posted by Karl Wright <da...@gmail.com>.
Hi Paul,

This looks like a bug in the CMIS connector to me; usually the document
version string the connector constructs should be adequate to detect all
changes.  Can you create a ticket?  https://issues.apache.org/jira ,
project ManifoldCF.  Please include what version of MCF you are using
here.  FWIW, this may be in fact a bug in the Alfresco CMIS implementation,
but we'll have to have some back and forth before I can determine that for
sure.
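
As a rough illustration of the version-string idea (a sketch only, not the
CMIS connector's actual code; the class, the method names, and the
"principal:role" ACL format are all hypothetical): if the version string
folds in the access control entries as well as the modification time, then
removing a user changes the string even when the content is untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class VersionStringSketch {
    // Build a version string from the last-modified time plus the ACL entries.
    // Entries are sorted so that ordering alone never looks like a change.
    static String versionString(long lastModifiedMillis, List<String> aclEntries) {
        List<String> sorted = new ArrayList<>(aclEntries);
        Collections.sort(sorted);
        return lastModifiedMillis + "|" + String.join(",", sorted);
    }

    // Reindex when there is no stored version or the two strings differ.
    static boolean needsReindex(String storedVersion, String currentVersion) {
        return storedVersion == null || !storedVersion.equals(currentVersion);
    }

    public static void main(String[] args) {
        long modified = 1445254980000L; // same timestamp before and after
        String before = versionString(modified,
                Arrays.asList("userA:Consumer", "userB:Consumer"));
        String after = versionString(modified,
                Arrays.asList("userB:Consumer"));
        System.out.println(needsReindex(before, after)); // prints true
    }
}
```

If the ACL were left out of the string, the same comparison would report no
change, which matches the behaviour described in the original report.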

In the meantime, have you considered using the Alfresco Webscript
connector?  It's the preferred way to do Alfresco indexing, although there
have been issues reported having to do with running it on some
configurations of Alfresco.  I'm not entirely sure what the problem is
there; maybe a version dependency of some kind.

Karl


On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pf...@funnelback.com>
wrote:

> Hi Everyone,
>
> Hoping someone may be able to advise.
>
> I am currently using Manifold, together with a CMIS connector, to retrieve
> and index content from an Alfresco repository.
>
> All is going well apart from, what I would call, the ‘incremental crawl’.
>
> The main issue I am having is that the modification of a document’s
> security settings, in Alfresco, is not being picked up in next Manifold
> crawl. As an example I have a document ‘TestDoc1’ which has user A and B as
> Consumers. I run a crawl in Manifold and it picks up the documents fine.
> The security is set as expected. I then remove ‘User A’ from the security
> of that document and re-run the Manifold crawl. User A can still see the
> document in the local search engine.
>
> It is as if Manifold is not treating the security update as a
> ‘modification’ and is therefore not refreshing it. Note that if I go into
> the Output Connections, edit and save the relevant output connection and
> then click ‘Remove all associated documents’, the next time I crawl, the
> changes are picked up. It is clear that Manifold is just not updating
> whatever internal record it has for this item.
>
> Any ideas?
>
> Many thanks.