Posted to dev@manifoldcf.apache.org by "Piergiorgio Lucidi (Created) (JIRA)" <ji...@apache.org> on 2011/11/09 16:15:51 UTC

[jira] [Created] (CONNECTORS-288) An ElasticSearch connector would be helpful

An ElasticSearch connector would be helpful
-------------------------------------------

                 Key: CONNECTORS-288
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
             Project: ManifoldCF
          Issue Type: New Feature
            Reporter: Piergiorgio Lucidi
            Assignee: Piergiorgio Lucidi


An ElasticSearch connector could be very useful to spread the use of ManifoldCF

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

RE: [jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by ka...@nokia.com.
That is the same thing I see.  But if you browse the output connection while the test is running you will see the error I reported.  The test could be modified to report the error directly by fetching the connection status.



________________________________
From: ext Luca Stancapiano (Commented) (JIRA)
Sent: 2/25/2012 12:34 PM
To: connectors-dev@incubator.apache.org
Subject: [jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful


    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216504#comment-13216504 ]

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

How do you start the test? Inside the mcf-elasticsearch-test project I run:

mvn integration-test

and I still get the same error when the test starts:

org.apache.manifoldcf.core.interfaces.ManifoldCFException: ManifoldCF did not terminate in the allotted time of 120000 milliseconds
        at org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT.waitJobInactive(APISanityDerbyIT.java:576)
        at org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT.sanityCheck(APISanityDerbyIT.java:412)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

I can also confirm that the ElasticSearch connector is never called.

> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF




[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221639#comment-13221639 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Luca - I am away from internet access at the moment, but I will be back tomorrow.  So I am having to relay this from memory.

The test infrastructure works fine when I run it here.  The stopAgentsRun flag is cleared when the agents process is shut down, I think.  Nevertheless, both Piergiorgio and I have had no problem running the test and having it work fine, except for the fact that it cannot delete documents.

Can you do the following exactly as stated and let me know what happens for you:

(1) Create a fresh checkout of https://svn.apache.org/repos/asf/incubator/lcf/branches/CONNECTORS-288
(2) Type "ant run-elasticsearch-tests-derby"
(3) Paste the results you get into a comment in this ticket.

Thanks,
Karl

                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209673#comment-13209673 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

I modified the ant build to properly build and run the tests.  Unfortunately the compilation of the IT tests fails.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200568#comment-13200568 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

I find it interesting... ready to start!
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>             Fix For: ManifoldCF next
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204115#comment-13204115 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Committed second patch to the branch.


                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211039#comment-13211039 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

I still see the ElasticSearchSchema class in SVN:

http://svn.apache.org/repos/asf/incubator/lcf/branches/CONNECTORS-288/connectors/elasticsearch/connector/src/main/java/org/apache/manifoldcf/agents/output/elasticsearch/ElasticSearchSchema.java

That class should be deleted; it is empty.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217490#comment-13217490 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

I just ran the test again, and I think the tests now run correctly, but it seems that the job cannot be deleted from ManifoldCF at the end of the test:
{code}
sanityCheck(org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT): ManifoldCF did not delete in the allotted time of 120000 milliseconds
  sanityCheck(org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT): Can't delete repository connection 'CMIS Connection': existing jobs refer to it
{code}
This is the last part of the test, so we are close to having a complete integration test implementation.
                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206229#comment-13206229 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Latest patch committed.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205062#comment-13205062 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

I committed the latest patch.
One thing I noticed is that you are not using HTTP connection pooling.  In a real system this would lead to problems.  I'd suggest you have a look at MultiThreadedHttpConnectionManager; there are examples online that demonstrate reasonable usage.
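
On a current JDK (11 or later), the same idea of sharing one pooled client across requests can be sketched with the standard java.net.http.HttpClient, which is thread-safe and reuses connections internally when a single instance is shared. This is a stand-in used only for illustration, not the Commons HttpClient MultiThreadedHttpConnectionManager named above. The SharedHttp class, its method names, and the timeout values are hypothetical; nothing here is taken from the connector's source.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical holder for one process-wide HTTP client (illustration only). */
final class SharedHttp {
    // Created once and shared by all threads, so connections can be reused
    // instead of being opened and closed for every request.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10)) // assumed value
            .build();

    private SharedHttp() {}

    /** Returns the single shared client instance. */
    static HttpClient client() {
        return CLIENT;
    }

    /** Builds a GET request for the given URL; sending it is left to the caller. */
    static HttpRequest get(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(30)) // assumed value
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Every caller sees the same client, which is the point of pooling.
        System.out.println(client() == client());
    }
}
```

The design choice is simply that the client outlives any single request; a connector that builds a new client per call gives up that reuse.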
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213840#comment-13213840 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

Sorry for the delay. I think we only need to add the dependencies required to run these integration tests:
* ElasticSearch dependencies
* Lucene dependencies

I hope to finish implementing the integration tests in the next few days.
                

        

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217519#comment-13217519 ] 

Karl Wright edited comment on CONNECTORS-288 at 2/27/12 8:25 PM:
-----------------------------------------------------------------

bq. I just ran the test again, and I think the tests now run correctly, but it seems that the job cannot be deleted from ManifoldCF at the end of the test:

Right, the problem is that the job deletion hangs, because it's trying to delete the documents from the index and something goes wrong with that.  I posted earlier the manifoldcf.log output associated with this failure:

{code}
ERROR 2012-02-26 16:09:35,903 (Document delete thread '7') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
	at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
{code}

The issue is that the "Server/page not found" error seems to occur intermittently on many different requests.  These are usually retried, but at the end during the delete phase the delete threads wait 5 minutes before retrying, which is why the test fails, because it only waits 2 minutes.  The real problem is that we should not be getting these intermittent random errors at all, which is why I think we need to look at data that is kept around in the connector from request to request, namely the cached data structures.  I am certain these are the source of the problem.
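
The timing mismatch described above can be checked with a little arithmetic. The sketch below uses the two figures from this thread (the test's 120000 ms wait and the 5-minute delete retry delay); the class and method names are hypothetical and are not part of ManifoldCF.

```java
/** Hypothetical helper comparing the test's wait window with the delete retry delay. */
final class DeleteTimeoutCheck {
    // "allotted time of 120000 milliseconds" from the test failure message.
    static final long TEST_WAIT_MS = 120_000L;
    // "the delete threads wait 5 minutes before retrying".
    static final long DELETE_RETRY_MS = 5L * 60L * 1000L;

    private DeleteTimeoutCheck() {}

    /** Number of retries that can begin before the wait window closes. */
    static long retriesWithinWait(long waitMs, long retryDelayMs) {
        return waitMs / retryDelayMs;
    }

    public static void main(String[] args) {
        // 120000 / 300000 == 0: the test gives up before the first retry starts.
        System.out.println(retriesWithinWait(TEST_WAIT_MS, DELETE_RETRY_MS)); // prints 0
    }
}
```

With these numbers, a single failed delete is enough to make the test time out, whether or not the retry would have succeeded.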


                
      was (Author: kwright@metacarta.com):
    bq. I just ran the test again, and I think the tests now run correctly, but it seems that the job cannot be deleted from ManifoldCF at the end of the test:

Right, the problem is that the job deletion hangs, because it's trying to delete the documents from the index and something goes wrong with that.  I posted earlier the manifoldcf.log output associated with this failure:

{code}
ERROR 2012-02-26 16:09:35,903 (Document delete thread '7') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
	at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
{code}

The issue is that the "Server/page not found" error seems to occur intermittently on many different requests.  These are usually retried, but at the end during the delete phase they wait 5 minutes before being retried, which is why the test fails.  The real problem is that we should not be getting intermittent random errors at all, which is why I think we need to look at data that is kept around in the connector from request to request, namely the cached data structures.  I am certain these are the source of the problem.


                  

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207253#comment-13207253 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

Yes... and the Velocity part too.
                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218282#comment-13218282 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

I added a fix that lets the connector work when no index is defined in ElasticSearch. We still have to discuss whether or not we need the check for the index specified in the configuration.
                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207015#comment-13207015 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

It now builds too.  I guess the main thing we still need is an integration test and a UI test.

                

        

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206413#comment-13206413 ] 

Piergiorgio Lucidi edited comment on CONNECTORS-288 at 2/14/12 4:39 PM:
------------------------------------------------------------------------

{quote}1 - indent{quote}
{quote}2 - add the velocity parameters{quote}
Luca, I could work on this if you want ;)
Let me know!

{quote}
4 - integration tests
{quote}
Here I can help Luca by implementing the integration test module; that way Luca can continue working directly on the connector.

I published a post in the ElasticSearch group to ask for suggestions about integration tests:
http://groups.google.com/group/elasticsearch/browse_thread/thread/5d568a7f5803acd0

This is just to get their opinion on how to implement generic integration tests.

Anyway, I have an idea: we could download the latest stable distribution, extract it and then start the search server with the bash command. But I would like to understand whether we can use the Java API directly, as I saw in their integration test code; that could give us better control over the instance parameters.

I'm looking forward to their reply, and then let's see how we can continue :)

Anyway, here is the integration test workflow that I would like to implement:
* download, extract and run the ElasticSearch server (the standard port is 9200)
* start Jetty with ManifoldCF
* create the test area on the file system
* configure the File System Repository Connection, the ElasticSearch Output Connection and the job for ManifoldCF using the REST API
* start the ManifoldCF job to crawl the File System connection and index the content in the ElasticSearch server
* check the results by running a standard query against the search server to verify that the output connector works
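
The last step of the workflow could look roughly like the sketch below. It is only an illustration: the class and method names (SearchCheckSketch, searchUrl, fetch), the index name and the query text are assumptions and are not taken from the connector or the test module. It relies on ElasticSearch's URI search form (/<index>/_search?q=...) on the standard port 9200.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical helper for the "check results" step of an integration test.
final class SearchCheckSketch {
  private SearchCheckSketch() {}

  // Builds a URI-search request such as http://localhost:9200/index/_search?q=...
  static String searchUrl(String baseUrl, String indexName, String query) {
    try {
      return baseUrl + "/" + indexName + "/_search?q=" + URLEncoder.encode(query, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  // Performs the GET and returns the raw JSON response body as a string.
  static String fetch(String url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    conn.setRequestMethod("GET");
    InputStream in = conn.getInputStream();
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
      return out.toString("UTF-8");
    } finally {
      in.close();
      conn.disconnect();
    }
  }
}
```

A test would call fetch(searchUrl("http://localhost:9200", "index", "some text")) once the job has finished and then inspect the returned JSON for the expected number of hits; how the hits are counted is left open here.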

WDYT?
                
      was (Author: piergiorgiolucidi@gmail.com):
    {quote}1 - indent{quote}
{quote}2 - add the velocity parameters{quote}
Luca, I could work on this if you want ;)
Let me know!

{quote}
4 - integration tests
{quote}
Here I can help Luca by implementing the integration test module; that way Luca can continue working directly on the connector.

I published a post in the ElasticSearch group to ask for suggestions about integration tests:
http://groups.google.com/group/elasticsearch/browse_thread/thread/5d568a7f5803acd0

This is just to get their opinion on how to implement generic integration tests.

Anyway, I have an idea: we could download the latest stable distribution, extract it and then start the search server with the bash command. But I would like to understand whether we can use the Java API directly, as I saw in their integration test code; that could give us better control over the instance parameters.

I'm looking forward to their reply, and then let's see how we can continue :)

I suggest using the OpenCMIS InMemory Server to test this new connector; I used it in the CMIS Connector integration tests.

Anyway, here is the integration test workflow that I would like to implement:
* download, extract and run the ElasticSearch server (the standard port is 9200)
* start Jetty with ManifoldCF and the OpenCMIS InMemory Server web app (the standard port is 8085; for the CMIS server I will set 9090)
* create the test area on the CMIS server
* configure the CMIS Repository Connection, the ElasticSearch Output Connection and the job for ManifoldCF using the REST API
* start the ManifoldCF job to crawl the CMIS server and index the content in the ElasticSearch server
* check the results by running a standard query against the search server to verify that the output connector works

WDYT?
                  
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221639#comment-13221639 ] 

Karl Wright edited comment on CONNECTORS-288 at 3/4/12 9:50 PM:
----------------------------------------------------------------

Luca - I am away from internet access at the moment, but I will be back tomorrow.  So I am having to relay this from memory.

The test infrastructure works fine when I run it here.  The stopAgentsRun flag is cleared when the agents process is shut down, I think.  Nevertheless, both Piergiorgio and I have had no problem running the test; it works fine except for the fact that it cannot delete documents.

Can you do the following exactly as stated and let me know what happens for you:

(1) Create a fresh checkout of https://svn.apache.org/repos/asf/incubator/lcf/branches/CONNECTORS-288
(2) Type "ant download-dependencies"
(3) Type "ant run-elasticsearch-tests-derby"
(4) Paste the results you get into a comment in this ticket.

I'd like to do this to see whether you have the same setup I do.

Thanks,
Karl

                
      was (Author: kwright@metacarta.com):
    Luca - I am away from internet access at the moment, but I will be back tomorrow.  So I am having to relay this from memory.

The test infrastructure works fine when I run it here.  The stopAgentsRun flag is cleared when the agents process is shut down, I think.  Nevertheless, both Piergiorgio and I have had no problem running the test; it works fine except for the fact that it cannot delete documents.

Can you do the following exactly as stated and let me know what happens for you:

(1) Create a fresh checkout of https://svn.apache.org/repos/asf/incubator/lcf/branches/CONNECTORS-288
(2) Type "ant run-elasticsearch-tests-derby"
(3) Paste the results you get into a comment in this ticket.

Thanks,
Karl

                  

        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifold-elasticsearch-velocity-patch

This patch covers:

1 - a bug in editConfiguration: there was a malformed field, $SERVERLOCATION:A

2 - alignment with https://issues.apache.org/jira/browse/CONNECTORS-413
                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214670#comment-13214670 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

The testing infrastructure starts jetty and deploys the war files during test setup.  If that hasn't happened then the test is not defined correctly.

I'm happy to give this a look, but I can't run with Maven here at work.  Is the test expected to work under Ant yet?

                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227269#comment-13227269 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Hi Luca,

An output connector cannot assume anything about a URL.  The URL may come from any repository connector, and they all have different forms.  For example, file system and SharePoint URLs are basically full paths.  You would not want to confuse a/b/cd with x/y/cd, would you?  So please don't think that the only kind of URL an Elastic Search connector is ever going to see will be from CMIS.  So the resolution of CONNECTORS-417 is immaterial here.

The change I made, instead of using the last part of the URL path as a "file name", tries to use the entire URL and encode it, so that the whole thing is interpreted as a "file name".  I changed the delete code and also the index code to make this consistent.  But it did not work, as I have said earlier.
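
To make the difference between the two ID schemes concrete, here is a minimal sketch. The names DocumentIds, lastSegment and encodeWhole are hypothetical and do not come from the connector code, and the sketch does not try to explain why the change did not work; it only shows why the last path segment alone is not a safe identifier.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper comparing two ways of deriving a document ID from a URL.
final class DocumentIds {
  private DocumentIds() {}

  // Keeps only the last path segment, so a/b/cd and x/y/cd both become "cd".
  static String lastSegment(String documentUri) {
    int slash = documentUri.lastIndexOf('/');
    return slash < 0 ? documentUri : documentUri.substring(slash + 1);
  }

  // Encodes the whole URL so that it can be treated as one opaque "file name".
  static String encodeWhole(String documentUri) {
    try {
      return URLEncoder.encode(documentUri, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }
}
```

With these helpers, lastSegment("a/b/cd") and lastSegment("x/y/cd") collide, while encodeWhole yields "a%2Fb%2Fcd" and "x%2Fy%2Fcd". Whichever scheme is used, the index path and the delete path need to apply the same one.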

                

        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifold-elasticsearch-patch

Sorry, I forgot to complete the last update. I have now updated the ManifoldCF check method to use the _status command of ElasticSearch.
                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216825#comment-13216825 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

Another thing I noticed is that org.apache.manifoldcf.crawler.system.WorkerThread and org.apache.manifoldcf.crawler.system.StartupThread are not active when the test starts. I suppose they are needed to support the jobs when they start.
                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216851#comment-13216851 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Looking at the actual test run, the history reports the following at the end:

{code}
02-26-2012 16:09:25.129	Indexation (ElasticSearch)	http://localhost:9200/index/_optimize	OK	0	69
02-26-2012 16:09:24.939	job end	1330290457146(Test Job)		0	1
02-26-2012 16:09:14.909	Deletion (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null	OK	0	7
02-26-2012 16:09:07.787	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null	OK	27	6
02-26-2012 16:09:07.778	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null	OK	27	7
02-26-2012 16:09:07.769	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null	OK	27	15
02-26-2012 16:09:05.278	job start	1330290457146(Test Job)		0	1
02-26-2012 16:08:55.020	Indexation (ElasticSearch)	http://localhost:9200/index/_optimize	OK	0	93
02-26-2012 16:08:54.926	job end	1330290457146(Test Job)		0	1
02-26-2012 16:08:47.678	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null	OK	27	10
02-26-2012 16:08:47.666	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null	OK	27	6
02-26-2012 16:08:47.652	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null	OK	27	11
02-26-2012 16:08:47.646	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null	OK	27	13
02-26-2012 16:08:45.192	job start	1330290457146(Test Job)		0	1
02-26-2012 16:08:34.940	Indexation (ElasticSearch)	http://localhost:9200/index/_optimize	OK	0	75
02-26-2012 16:08:34.917	job end	1330290457146(Test Job)		0	1
02-26-2012 16:08:29.502	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null	OK	27	10
02-26-2012 16:08:29.491	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null	OK	27	8
02-26-2012 16:08:29.412	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null	OK	27	66
02-26-2012 16:08:29.404	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null	OK	27	68
02-26-2012 16:08:25.097	job start	1330290457146(Test Job)		0	1
02-26-2012 16:08:24.846	Indexation (ElasticSearch)	http://localhost:9200/index/_optimize	OK	0	88
02-26-2012 16:08:14.890	job stop	1330290457146(Test Job)		0	1
02-26-2012 16:08:09.041	Optimize (ElasticSearch)	http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null	OK	27	868
02-26-2012 16:08:04.900	job start	1330290457146(Test Job)		0	1
{code}

The job at the end is stuck in the "Cleaning up" state, which indicates that it is trying to delete the documents from the index, but is not succeeding for some reason.

                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216481#comment-13216481 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

I was working on it too, but it seems you solved it before I did. I can continue with it; you can leave it to me, don't worry :)
                

        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifold-elasticsearch-patch

About the tests, I think it's a good starting point.

This new patch contains two updates:

1 - indentation

2 - HTTP connection pooling is enabled
                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206872#comment-13206872 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Tried building this morning, but got a build error:

{code}
    [javac] C:\wip\mcf\CONNECTORS-288\connectors\elasticsearch\connector\src\main\java\org\apache\manifoldcf\agents\output\elasticsearch\ElasticSearchConnector.java:382: cannot access org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchSchema
    [javac] bad class file: C:\wip\mcf\CONNECTORS-288\connectors\elasticsearch\connector\src\main\java\org\apache\manifoldcf\agents\output\elasticsearch\ElasticSearchSchema.java
    [javac] file does not contain class org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchSchema
    [javac] Please remove or make sure it appears in the correct subdirectory of the classpath.
    [javac]     ElasticSearchSchema oss = new ElasticSearchSchema(getConfigParameters(null));
    [javac]     ^
    [javac] 1 error
{code}


                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216971#comment-13216971 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

More code review:

The architecture you are using to cache specifications could use some improvement, I think.  The method getOutputDescription() is not meant to perform a blind conversion of the output specification to a string, but to include only those parameters that, if changed, would change what was indexed.  Furthermore, it is expected that the format of the string be such that it is quickly unpackable, so that no caching should be necessary even if parameters need to be parsed from the string.  To help, there is a set of pack/unpack methods available for your use from the base class that are reasonably performant and meant for this purpose.  See the Solr connector for an idea of how these are used.  Or, you can continue to use JSON, but when you go back and forth to JSON I suspect you're doing more work than the pack/unpack methods would do.
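
To illustrate the kind of string meant here, the following is a stand-alone sketch of a delimiter-based pack/unpack pair. It is not the ManifoldCF base-class API; those helpers have their own signatures, and the Solr connector shows the real calls. The class name OutputDescriptionSketch, the '+' delimiter and the example values are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a compact, quickly unpackable description string.
final class OutputDescriptionSketch {
  private static final char DELIMITER = '+';
  private static final char ESCAPE = '\\';

  private OutputDescriptionSketch() {}

  // Joins the values, escaping the delimiter and the escape character itself.
  static String pack(List<String> values) {
    StringBuilder sb = new StringBuilder();
    for (String value : values) {
      for (int i = 0; i < value.length(); i++) {
        char c = value.charAt(i);
        if (c == DELIMITER || c == ESCAPE) {
          sb.append(ESCAPE);
        }
        sb.append(c);
      }
      sb.append(DELIMITER);
    }
    return sb.toString();
  }

  // Reverses pack() in a single pass over the string.
  static List<String> unpack(String packed) {
    List<String> values = new ArrayList<String>();
    StringBuilder current = new StringBuilder();
    for (int i = 0; i < packed.length(); i++) {
      char c = packed.charAt(i);
      if (c == ESCAPE && i + 1 < packed.length()) {
        current.append(packed.charAt(++i)); // take the escaped character literally
      } else if (c == DELIMITER) {
        values.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    return values;
  }
}
```

Only the parameters that affect what gets indexed would go into the packed list, so an unchanged string means the documents do not need to be reindexed.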

If you do decide to cache things for whatever reason, I would urge you to use the ICacheManager construct, since that will be guaranteed to be maintained over the long run.  Ideally, your code when done should not have any synchronized blocks in it at all, since synchronization is managed largely by the framework.

Another subject we should talk about is managing the HTTP connection pool.  I noted that you put pool management into one of the subclasses (ElasticSearchConnection).  The problem with that is that you want the lifetime of the pool to be the lifetime of the ElasticSearchConnector class instance, otherwise the pool is not going to do you much good.  So I would move the MultiThreadedHttpConnectionManager instance to the main ElasticSearchConnector class, and provide an ElasticSearchConnector method that fetches an HttpClient object from that instance - or just pass it in when you construct ElasticSearchConnection.  Also, don't forget to hook up the poll() method to the MultiThreadedHttpConnectionManager instance so that connections will be closed when idle.  See the SharePoint connector for an idea of how this is done.
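
The ownership described above can be sketched as follows. This is only an outline under stated assumptions: PoolSketch stands in for the real HTTP connection manager and merely tracks idle timestamps, and the names ConnectorSketch, ConnectionSketch and the 60-second idle limit are hypothetical. The sketch is single-threaded and leaves out everything the framework would normally handle.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in for a real connection pool: remembers when each connection went idle.
final class PoolSketch {
  private final List<Long> idleSince = new ArrayList<Long>();

  void release(long nowMillis) {
    idleSince.add(nowMillis);
  }

  int idleCount() {
    return idleSince.size();
  }

  // Drops every connection that has been idle for at least maxIdleMillis.
  void closeIdle(long nowMillis, long maxIdleMillis) {
    for (Iterator<Long> it = idleSince.iterator(); it.hasNext();) {
      if (nowMillis - it.next() >= maxIdleMillis) {
        it.remove();
      }
    }
  }
}

// Short-lived per-request object: it borrows the pool and never creates one.
final class ConnectionSketch {
  private final PoolSketch pool;

  ConnectionSketch(PoolSketch pool) {
    this.pool = pool;
  }

  void finishRequest(long nowMillis) {
    pool.release(nowMillis);
  }
}

// Long-lived connector: it owns the pool, so the pool lives as long as the connector does.
final class ConnectorSketch {
  private static final long MAX_IDLE_MILLIS = 60000L; // assumed idle limit
  private final PoolSketch pool = new PoolSketch();

  ConnectionSketch newConnection() {
    return new ConnectionSketch(pool);
  }

  // Called periodically; this is where idle connections get closed.
  void poll(long nowMillis) {
    pool.closeIdle(nowMillis, MAX_IDLE_MILLIS);
  }

  int idleConnections() {
    return pool.idleCount();
  }
}
```

Because every ConnectionSketch shares the connector's pool, connections released by one request remain available to the next, and poll() is the single place where idle ones are cleaned up.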

                

        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216504#comment-13216504 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

How do you start the test? Inside the mcf-elasticsearch-test project I run:

mvn integration-test

and I keep getting the same error when the test starts:

org.apache.manifoldcf.core.interfaces.ManifoldCFException: ManifoldCF did not terminate in the allotted time of 120000 milliseconds
	at org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT.waitJobInactive(APISanityDerbyIT.java:576)
	at org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT.sanityCheck(APISanityDerbyIT.java:412)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

and I confirm that the elasticsearch connector is never called
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216472#comment-13216472 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

I got the UI test working, and made some progress on the integration test.  It now runs, but the connection that it sets up fails with "Threw exception: 'Server/page not found'".  I'll look further, but this may be more up Luca's alley to figure out.

You can figure out what is going on by opening a ManifoldCF UI browser window connected to localhost:8346/mcf-crawler-ui while the test is running.


                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208698#comment-13208698 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

I added an initial version of integration tests.

I also found that now Maven doesn't execute integration tests... that sounds very strange to me :(
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217519#comment-13217519 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

bq. I tried to execute the test now and I think that the tests now run correctly, but it seems that it can't delete the job from ManifoldCF at the end of the test:

Right, the problem is that the job deletion hangs, because it's trying to delete the documents from the index and something goes wrong with that.  I posted earlier the manifoldcf.log output associated with this failure:

{code}
ERROR 2012-02-26 16:09:35,903 (Document delete thread '7') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
	at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
{code}

The issue is that the "Server/page not found" error seems to occur intermittently on many different requests.  These are usually retried, but at the end during the delete phase they wait 5 minutes before being retried, which is why the test fails.  The real problem is that we should not be getting intermittent random errors at all, which is why I think we need to look at data that is kept around in the connector from request to request, namely the cached data structures.  I am certain these are the source of the problem.


                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifold-elasticsearch-patch

This new patch resolves the parameters using Velocity. The parameter format is now, for example, $SERVERLOCATION_A instead of ${SERVERLOCATION:A}.
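
To illustrate the format change, here is a minimal sketch that rewrites the old placeholder syntax into the new one. It is not part of the patch: the class PlaceholderFormatSketch and the method toVelocityStyle are hypothetical, and the sketch assumes that both parts of an old-style placeholder consist only of letters, digits and underscores.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the patch: converts ${NAME:SUFFIX} into $NAME_SUFFIX.
class PlaceholderFormatSketch {

    // Matches the old format, e.g. ${SERVERLOCATION:A}; group 1 = name, group 2 = suffix.
    private static final Pattern OLD_FORMAT = Pattern.compile("\\$\\{(\\w+):(\\w+)\\}");

    static String toVelocityStyle(String template) {
        Matcher matcher = OLD_FORMAT.matcher(template);
        // "\\$" emits a literal dollar sign; $1 and $2 are the captured groups.
        return matcher.replaceAll("\\$$1_$2");
    }

    public static void main(String[] args) {
        System.out.println(toVelocityStyle("location=${SERVERLOCATION:A}"));
    }
}
```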
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214883#comment-13214883 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Please continue!  It's a busy day here and I won't be able to look at this again until evening, if even then...

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215658#comment-13215658 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

The problem is that the test expects a JSON response from the ElasticSearch connector plugin, but no request is ever made to it. I would expect the request to be made through the startJob call.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206413#comment-13206413 ] 

Piergiorgio Lucidi edited comment on CONNECTORS-288 at 2/12/12 2:36 PM:
------------------------------------------------------------------------

{quote}1 - indent{quote}
{quote}2 - add the velocity parameters{quote}
Luca, I could work on this if you want ;)
Let me know!

{quote}
4 - integration tests
{quote}
Here I can help Luca by implementing the integration test module; that way Luca can continue working directly on the connector.

I posted to the ElasticSearch group to ask for suggestions about integration tests:
http://groups.google.com/group/elasticsearch/browse_thread/thread/5d568a7f5803acd0

This is just to get their opinion on how to implement generic integration tests.

Anyway, I have an idea: we could download the latest stable distribution, extract it and then start the search server with the bash command. But I would like to understand whether we can use the Java API directly, as I saw in their integration test code; that could give better control over the instance parameters.

I'm looking forward to their reply, and then we can see how to continue :)

I suggest using the OpenCMIS InMemory Server to test this new connector; I used it in the CMIS Connector integration tests.

Anyway, here is the integration test workflow that I would like to implement:
* download, extract and run ElasticSearch server (the standard port is 9200)
* start Jetty with ManifoldCF and OpenCMIS InMemory Server web app (the standard port is 8085, for CMIS server I will set 9090)
* create the test area on the CMIS server
* configure CMIS Repository Connection, ElasticSearch Output Connection and the job for ManifoldCF using the REST API
* start the ManifoldCF job for crawling the CMIS server and putting content indexes in ElasticSearch server
* check the results by running a standard query against the search server, to verify that the output connector works

WDYT?
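
As a rough illustration of the last step of the workflow, the check could start from a search URL like the one built below. This is only a sketch under assumptions: the class SearchCheckSketch, the index name "index" and the query text are placeholders, and it relies on the standard ElasticSearch URI search endpoint (_search?q=...) on the default port 9200.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch of the final verification step: build the query URL
// that the integration test would call on the ElasticSearch server.
class SearchCheckSketch {

    static String searchUrl(String host, int port, String index, String query) {
        try {
            // The query text is URL-encoded so it is safe inside the q= parameter.
            return "http://" + host + ":" + port + "/" + index
                    + "/_search?q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Index name and query are placeholders for whatever the test indexes.
        System.out.println(searchUrl("localhost", 9200, "index", "test"));
    }
}
```

The test would then fetch this URL and assert on the number of hits in the JSON response.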
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211051#comment-13211051 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Removed

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226945#comment-13226945 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

OK, by deleting properties.xml after each test I can now run and fix the test. The delete problem is tied to a recent update:

class org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete, line 34

{code}
String fileName = java.net.URLEncoder.encode(documentURI,"utf-8");
{code}

It must be a file name, not a URI, because it is the ID used for the delete. The delete issued by the ElasticSearch connector is equivalent to the curl command below:

{code}
curl -XDELETE http://localhost:9200/index/generictype/gtgt
{code}

where gtgt is the 'fileName' variable, so it would become:

{code}
curl -XDELETE http://localhost:9200/index/generictype/http%3A%2F%2Flocalhost%3A9090%2Fchemistry-opencmis-server-inmemory%2Fatom%2F139%2Fnull 
{code}

If you change the line back to what it was before:

{code}
String fileName = FilenameUtils.getName(documentURI);
{code}

the delete works.
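
The difference between the two variants can be reproduced in isolation. The sketch below is an illustration only: the class DeleteIdSketch and its methods are hypothetical, the document URI is reconstructed from the encoded curl example, and lastSegment is a plain-JDK stand-in for FilenameUtils.getName (Commons IO) that is assumed to behave the same way for URIs of this form.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch comparing the two ways of deriving the document ID for the delete.
class DeleteIdSketch {

    // Variant from the recent update: the whole document URI, URL-encoded.
    static String encodedUri(String documentURI) {
        try {
            return URLEncoder.encode(documentURI, "utf-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for FilenameUtils.getName: the text after the last '/'.
    static String lastSegment(String documentURI) {
        return documentURI.substring(documentURI.lastIndexOf('/') + 1);
    }

    // URL used for the delete; index and type names are taken from the curl examples.
    static String deleteUrl(String id) {
        return "http://localhost:9200/index/generictype/" + id;
    }

    public static void main(String[] args) {
        String documentURI = "http://localhost:9090/chemistry-opencmis-server-inmemory/atom/139/null";
        System.out.println(deleteUrl(encodedUri(documentURI)));  // ID is the encoded URI
        System.out.println(deleteUrl(lastSegment(documentURI))); // ID is just the file name
    }
}
```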

Once that is done, there is a new problem tied to https://issues.apache.org/jira/browse/CONNECTORS-417:

our test creates multiple versions of the document called 'null'.

When the delete is called, the test succeeds in deleting the first 'null' document, because the delete operation deletes all versions of the document. After that, the 'null' document is no longer there, but it tries to delete it anyway and goes into a loop.

I suppose this problem occurs because the fix for https://issues.apache.org/jira/browse/CONNECTORS-417 has not been committed to the branch. Let me know.





                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226945#comment-13226945 ] 

Luca Stancapiano edited comment on CONNECTORS-288 at 3/10/12 9:42 PM:
----------------------------------------------------------------------

OK, by deleting properties.xml after each test I can now run and fix the test. The delete problem is tied to a recent update:

class org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete, line 34

{code}
String fileName = java.net.URLEncoder.encode(documentURI,"utf-8");
{code}

It must be a file name, not a URI, because it is the ID used for the delete. The delete issued by the ElasticSearch connector is equivalent to the curl command below:

{code}
curl -XDELETE http://localhost:9200/index/generictype/gtgt
{code}

where gtgt is the 'fileName' variable, so it would become:

{code}
curl -XDELETE http://localhost:9200/index/generictype/http%3A%2F%2Flocalhost%3A9090%2Fchemistry-opencmis-server-inmemory%2Fatom%2F139%2Fnull 
{code}

If you change the line back to what it was before:

{code}
String fileName = FilenameUtils.getName(documentURI);
{code}

the delete works.

Once that is done, there is a new problem tied to https://issues.apache.org/jira/browse/CONNECTORS-417:

our test creates multiple versions of the document called 'null'.

When the delete is called, the test succeeds in deleting the first 'null' document, because the delete operation deletes all versions of the document. After that, the 'null' document is no longer there, but it tries to delete it anyway and goes into a loop.

I suppose this problem occurs because the fix for https://issues.apache.org/jira/browse/CONNECTORS-417 has not been committed to the branch. Let me know.





                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215737#comment-13215737 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

I don't follow you, but maybe a little later I can delve into this further.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209841#comment-13209841 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Committed latest patch.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214776#comment-13214776 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

bq. I suggest downloading the ElasticSearch binary package; then you can easily take all the dependencies from the lib folder

Sounds reasonable.  Hopefully I will be able to look at this more tonight.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifoldcf-elasticsearch-project-patct
    
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217679#comment-13217679 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

I attempted to make the filename be the full URL, but that failed.  Looking at the way the elastic search delete method works, that's not a surprise; it looks like it is REST-style, so slashes are not likely to be welcome.  So the next thing I tried was URL-encoding the whole URL, which should have fixed the slash problem.  But that did not work either; not sure why.  Is it possible that Elastic Search is URL-decoding the filename field?
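
A minimal Java sketch of the encoding step described above, using only java.net.URLEncoder and java.net.URLDecoder. The class and method names are hypothetical and the sample URI is the CMIS one quoted elsewhere in this thread. It only shows that encoding removes every slash from the ID, and that any component which URL-decodes the path once would get the slashes back; whether Elastic Search actually does that is not established here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper for illustration only; not part of the connector.
class UrlEncodedIdSketch {

  // URL-encode the whole document URI so it contains no '/' characters.
  static String encodeId(String documentURI) {
    try {
      return URLEncoder.encode(documentURI, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a compliant JVM.
      throw new IllegalStateException(e);
    }
  }

  // What a receiver would see if it URL-decoded the ID one time.
  static String decodeOnce(String encodedId) {
    try {
      return URLDecoder.decode(encodedId, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String uri = "http://localhost:9090/chemistry-opencmis-server-inmemory/atom/138/null";
    String id = encodeId(uri);
    System.out.println("encoded: " + id);
    System.out.println("decoded: " + decodeOnce(id));
  }
}
```

If the decoded form is what ends up in the REST path, the slashes are back and the request no longer addresses a single document ID.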


                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207826#comment-13207826 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

OK, so we could probably follow the approach used for CMIS, so that some parts of the integration tests are already implemented and can be taken from another test module.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217664#comment-13217664 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

I committed some cleanup, which makes things clearer.  I also found out what the issue is.

Basically, the document URI as calculated by ManifoldCF is supposed to be the document's key in the index.  The elasticsearch connector is taking this URI and stripping off all but the last part of the path.  For example, for the CMIS path 'http://localhost:9090/chemistry-opencmis-server-inmemory/atom/138/null' (Piergiorgio, what is that 'null' at the end??  Sounds like it may be a bug to me.), the output connector turns this into filename 'null'.  That's just plain wrong, since 'null' is by no means unique.  In fact, with CMIS, ALL urls end in 'null'.  So the first delete works - and all subsequent deletes fail with a 404 error because they are trying to delete a file that was already deleted.

The design of the connector is therefore seriously flawed.  If the elastic search "filename" field is supposed to be the key, then either the full URL should be used, or if that is not possible, the SHA1 of the full URL should be used.
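
A minimal Java sketch of the two options discussed above, under the assumption that the key only needs to be unique and free of slashes. The class and method names are hypothetical and this is not the connector's code: lastSegment() mimics keeping only the last part of the path, and sha1Hex() derives a key from the full URL with the JDK's MessageDigest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of the connector.
class DocumentKeySketch {

  // Keep only the text after the last '/', as the flawed key derivation does.
  static String lastSegment(String documentURI) {
    return documentURI.substring(documentURI.lastIndexOf('/') + 1);
  }

  // Hex-encoded SHA-1 of the full URI: 40 characters, no slashes.
  static String sha1Hex(String documentURI) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-1");
      byte[] digest = md.digest(documentURI.getBytes(StandardCharsets.UTF_8));
      StringBuilder sb = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 is required to be present in every Java platform implementation.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String a = "http://localhost:9090/chemistry-opencmis-server-inmemory/atom/138/null";
    String b = "http://localhost:9090/chemistry-opencmis-server-inmemory/atom/139/null";
    System.out.println(lastSegment(a) + " vs " + lastSegment(b)); // both keys collide
    System.out.println(sha1Hex(a) + " vs " + sha1Hex(b));         // distinct keys
  }
}
```

With the last-segment approach both CMIS documents map to the same key, while the hashed keys differ and are safe to place in a REST-style path.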

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206413#comment-13206413 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

{quote}1 - indent{quote}
{quote}2 - add the velocity parameters{quote}
Luca, I could work on this if you want ;)
Let me know!

{quote}
4 - integration tests
{quote}
Here I can help Luca by implementing the integration test module; that way Luca can continue working directly on the connector.

I posted to the ElasticSearch group to ask for suggestions about integration tests:
http://groups.google.com/group/elasticsearch/browse_thread/thread/5d568a7f5803acd0

This is just to get their opinion on how to implement generic integration tests.

Anyway, I have an idea: we could download the latest stable distribution, extract it and then start the search server with the bash command. But I would like to understand whether we can use the Java API directly, as I saw in their integration test code; that could be a better way to control the instance parameters.

I'm looking forward to their reply, and then we can see how to continue :)
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206456#comment-13206456 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Could you do an svn update and then resubmit the latest patch?  I'm getting conflicts; it looks like you haven't updated.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

               Labels: elasticsearch  (was: )
    Affects Version/s: ManifoldCF 0.5
               Status: Patch Available  (was: Open)
    
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207569#comment-13207569 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

For the integration tests module the ElasticSearch Community replied to my message:
https://groups.google.com/group/elasticsearch/msg/f48036470f3931b7

So it seems that it is possible to create an instance of ElasticSearch using the Java API and then we can configure the connector against it. We can try this way, otherwise we can follow the standard way (download, extract and start).

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226945#comment-13226945 ] 

Luca Stancapiano edited comment on CONNECTORS-288 at 3/10/12 9:34 PM:
----------------------------------------------------------------------

OK. By deleting the properties.xml after each test, I can now run the test and work on fixing it. The delete problem is tied to a recent update:

class org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete, line 34

{code}
String fileName = java.net.URLEncoder.encode(documentURI,"utf-8");
{code}

It must be a file name, not a URI, because it is the ID used for the delete. The delete method in the ElasticSearch connector is equivalent to the following curl command:

{code}
curl -XDELETE http://localhost:9200/index/generictype/gtgt
{code}

where gtgt is the value of the 'fileName' variable, so with the URL-encoded URI it would become:

{code}
curl -XDELETE http://localhost:9200/index/generictype/http%3A%2F%2Flocalhost%3A9090%2Fchemistry-opencmis-server-inmemory%2Fatom%2F139%2Fnull 
{code}

If you change the line back to what it was before:

{code}
String fileName = FilenameUtils.getName(documentURI);
{code}

the delete works.

Once that is done, there is a new problem related to https://issues.apache.org/jira/browse/CONNECTORS-417:

our test creates multiple versions of the document called 'null'.

When the delete is called, the test succeeds in deleting the first 'null' document, because the delete operation deletes all versions of the document. After that, the 'null' document is no longer there, but the test tries to delete it anyway and goes into a loop.

I suppose this problem occurs because the resolution of https://issues.apache.org/jira/browse/CONNECTORS-417 has not been committed to the branch. Let me know.





                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202415#comment-13202415 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

Yes, no problem. I'll work on it this week. I will post here and on the mailing list if I run into problems.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214682#comment-13214682 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

{quote}
I take it the ElasticSearch dependency come from the sonatype repository?
{quote}
I suggest to download the ElasticSearch binary package and then you can take easily all the dependencies from the lib folder, which is easier. Here is the address of the binary package:
https://github.com/downloads/elasticsearch/elasticsearch/elasticsearch-0.18.7.tar.gz
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifold-elasticsearch-patch

OK. In this new patch I completed the support for insert, update and delete operations on the indexes.

What remains:

1 - indent
2 - add the velocity parameters
3 - enable the HTTP connection pooling
4 - integration tests


                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207010#comment-13207010 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Committed this latest patch.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221632#comment-13221632 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

I still have the WorkerThread problem. It never starts, so the test never calls the API of the ElasticSearch connector. Here is what I found in my latest test run.

In org.apache.manifoldcf.crawler.tests.BaseITDerby there is:

{code}
  @Before
  public void setUp()
    throws Exception
  {
    super.setUp();
    mcfInstance.start();
  }
{code}

The setUp method starts a cleanup of the services. This cleanup calls the org.apache.manifoldcf.agents.system.ManifoldCF.doCleanup() method:

{code}
    public void doCleanup()
      throws ManifoldCFException
    {
      // Shutting down in this way must prevent startup from taking place.
      synchronized (runningHash)
      {
        stopAgentsRun = true;
      }
      IThreadContext tc = ThreadContextFactory.make();
      stopAgents(tc);
    }
{code}

From this point on, the stopAgentsRun variable is true!

So, when ManifoldCF starts, we hit this check in the org.apache.manifoldcf.agents.system.ManifoldCF class:

{code}
  public static void startAgents(IThreadContext threadContext)
    throws ManifoldCFException
  {
    // Get agent manager
    IAgentManager manager = AgentManagerFactory.make(threadContext);
    String[] classes = manager.getAllAgents();
    ManifoldCFException problem = null;
    synchronized (runningHash)
    {
      // DO NOT permit this method to do anything if stopAgents() has ever been called for this JVM! 
      // (If it has, it means that the JVM is trying to shut down.)
      if (stopAgentsRun)
        return;
      .............
{code}

stopAgentsRun is always true, so the agents are never launched!
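
A minimal, self-contained Java sketch of the behaviour described above, assuming the flag acts as a one-way latch. The class below is hypothetical and only imitates the stopAgentsRun check; it is not the ManifoldCF code and it starts no real agents.

```java
// Hypothetical model of a one-way shutdown flag; not the ManifoldCF class.
class AgentLatchSketch {
  private final Object lock = new Object();
  private boolean stopAgentsRun = false;
  private boolean agentsRunning = false;

  // Mirrors the guard: once cleanup has run, starting is silently skipped.
  void startAgents() {
    synchronized (lock) {
      if (stopAgentsRun) {
        return;
      }
      agentsRunning = true;
    }
  }

  // Mirrors the cleanup: sets the flag and stops the agents.
  void doCleanup() {
    synchronized (lock) {
      stopAgentsRun = true;
      agentsRunning = false;
    }
  }

  boolean isRunning() {
    synchronized (lock) {
      return agentsRunning;
    }
  }

  public static void main(String[] args) {
    AgentLatchSketch mcf = new AgentLatchSketch();
    mcf.doCleanup();   // what the test setUp triggers first
    mcf.startAgents(); // has no effect, because the flag is never reset
    System.out.println("agents running: " + mcf.isRunning());
  }
}
```

In this model, calling the cleanup before the start in the same JVM leaves the agents stopped for the rest of the run, which matches the symptom reported here.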
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214658#comment-13214658 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

The test tries to connect here:

http://localhost:8346/mcf-api-service/json/outputconnections/ElasticSearch%20Connection

but nothing is listening on that port during the test. Maybe ManifoldCF is slow to start?
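
One way to check the slow-start theory would be to wait for the port before the first API call. The following is a minimal Java sketch using only java.net.Socket; the class name, method name and timeouts are assumptions for illustration, and nothing like it is known to exist in the test module.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical test helper for illustration only.
class PortWaitSketch {

  // Polls host:port until a TCP connection succeeds or the timeout expires.
  static boolean waitForPort(String host, int port, long timeoutMillis) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (System.currentTimeMillis() < deadline) {
      try (Socket socket = new Socket()) {
        socket.connect(new InetSocketAddress(host, port), 500);
        return true; // something is listening
      } catch (IOException e) {
        try {
          Thread.sleep(200); // not up yet, retry shortly
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          return false;
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // 8346 is the port from the URL above; 30 seconds is an arbitrary choice.
    boolean up = waitForPort("localhost", 8346, 30000);
    System.out.println("port 8346 reachable: " + up);
  }
}
```

If the port never becomes reachable even with a generous timeout, the cause is more likely that the server was not started at all than that it started late.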
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Tommaso Teofili (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200692#comment-13200692 ] 

Tommaso Teofili commented on CONNECTORS-288:
--------------------------------------------

Nice thing Luca, looking forward to a patch for it :)
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>             Fix For: ManifoldCF next
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifold-elasticsearch-patch

Yes, you are right. I have now updated the patch and improved the indentation too.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200881#comment-13200881 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

I am attaching a first patch here. It is a skeleton for the project and docs, so it does not work yet, but I prefer to maintain a history of the work. The next thing to do is to add the JSON calls.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214874#comment-13214874 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

The problem seems to be caused by the test configuration. Manifold expects the same value for the name of the outputConnection:

org.apache.manifoldcf.elasticsearch_tests.APISanityIT: 291

and the output_connection field of the job:

org.apache.manifoldcf.elasticsearch_tests.APISanityIT: 367

I corrected it and the test now gets further.

If you want, I can continue to test it.
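
The naming constraint described above can be sketched as follows. This is only an illustration, not the code of APISanityIT: the class ConnectionNameSketch and its helper methods are hypothetical, the "name" key for the connection is an assumption, and only the "output_connection" field name and the connection name "ElasticSearch Connection" come from this thread. The idea is to define the name once and reuse it in both places.

```java
import java.util.HashMap;
import java.util.Map;

public class ConnectionNameSketch {
    // Single definition of the name, reused for the connection and for the job.
    static final String OUTPUT_CONNECTION_NAME = "ElasticSearch Connection";

    // Hypothetical stand-in for the output connection definition ("name" key is an assumption).
    static Map<String, String> outputConnection(String name) {
        Map<String, String> connection = new HashMap<String, String>();
        connection.put("name", name);
        return connection;
    }

    // Hypothetical stand-in for the job definition; "output_connection" is the field named above.
    static Map<String, String> job(String outputConnectionName) {
        Map<String, String> job = new HashMap<String, String>();
        job.put("output_connection", outputConnectionName);
        return job;
    }

    // True only when the job refers to the connection by exactly the same name.
    static boolean namesMatch(Map<String, String> connection, Map<String, String> job) {
        String name = connection.get("name");
        return name != null && name.equals(job.get("output_connection"));
    }

    public static void main(String[] args) {
        Map<String, String> connection = outputConnection(OUTPUT_CONNECTION_NAME);
        System.out.println(namesMatch(connection, job(OUTPUT_CONNECTION_NAME)));
        System.out.println(namesMatch(connection, job("ElasticSearch")));
    }
}
```

A check like namesMatch could run before the job is submitted, so that a mismatch shows up as a clear test failure rather than as an HTTP error from the API.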
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218349#comment-13218349 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

{quote}
For example, for the CMIS path 'http://localhost:9090/chemistry-opencmis-server-inmemory/atom/138/null' (Piergiorgio, what is that 'null' at the end?? Sounds like it may be a bug to me.), the output connector turns this into filename 'null'.
{quote}
Karl, you're right. I found a bug in the CMIS Connector, which generates that URL: if no version label is defined for the node, the OpenCMIS InMemory Server returns null, and that null becomes the last part of the documentURI.

From the CMIS Connector code:
{code}
String version = document.getVersionLabel();
String endpoint = protocol+"://"+server+":"+port+path;
String documentURI = endpoint+"/"+id+"/"+version;
activities.ingestDocument(id, version, documentURI, rd);
{code}
That's why we have that null in the documentURI. Fixing it may also solve the problem with the delete feature in the integration test, because at the moment there is a single filename "null", which is meaningless.

So I need to create a new ticket and a patch for this issue.
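
To illustrate where the trailing "null" comes from: in Java, concatenating a null reference into a String produces the literal text "null". The following is a minimal sketch and not the connector's actual code. The class DocumentUriSketch and the methods buildNaive and buildGuarded are hypothetical, and leaving out the version segment is only one possible fix, assumed here for illustration.

```java
public class DocumentUriSketch {
    // Mirrors the concatenation quoted above: a null version label becomes the text "null".
    static String buildNaive(String endpoint, String id, String version) {
        return endpoint + "/" + id + "/" + version;
    }

    // Hypothetical fix: append the version segment only when a label is present.
    static String buildGuarded(String endpoint, String id, String version) {
        String base = endpoint + "/" + id;
        if (version == null || version.isEmpty()) {
            return base;
        }
        return base + "/" + version;
    }

    public static void main(String[] args) {
        String endpoint = "http://localhost:9090/chemistry-opencmis-server-inmemory/atom";
        System.out.println(buildNaive(endpoint, "138", null));
        System.out.println(buildGuarded(endpoint, "138", null));
        System.out.println(buildGuarded(endpoint, "138", "1.0"));
    }
}
```

With a null version label, buildNaive ends in "/138/null" while buildGuarded ends in "/138", so each document keeps a distinct last path segment.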
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-288:
-----------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: ManifoldCF next)
                   ManifoldCF 0.5
           Status: Resolved  (was: Patch Available)

r1299942

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF 0.5
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifold-elasticsearch-patch

Here is a new patch with these updates:

1 - resolved the conflict with opensearchserver in the ant file

2 - tested the optimize and refresh actions

3 - created a JSON-compatible message to send in the indexing operation

Still to resolve:

1 - indentation

2 - add the velocity parameters

3 - Currently I get a 500 error when I try to index binary values
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202374#comment-13202374 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

Luca will work on this task this week; I think he will solve all the issues ;)
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227517#comment-13227517 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

I agree with Karl: we should merge this branch into trunk, and then I think it should work correctly.
In any case, we can fix new issues quickly, without any specific issue related to the CMIS server.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227446#comment-13227446 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Luca and I ironed this out.  Basically the problem was that the id (the URL-encoded documentURI) needs to be part of the URL for both the index and the delete operation.  Once that was done, the test passes.

I'm now in the process of merging the branch into trunk.  That should be completed this evening, with luck.  Once complete, the only outstanding issue I can see is that metadata is not being indexed.
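
The fix described above (putting the URL-encoded documentURI into the request URL for both operations) can be sketched as follows. This is an illustration under stated assumptions, not the connector's code: the class ElasticSearchUrlSketch, the method documentUrl and the type name "generic" are hypothetical, and the /{index}/{type}/{id} layout is assumed. Only the server URL, the index name "index" and the CMIS document URI appear in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ElasticSearchUrlSketch {
    // Builds the per-document URL; the same URL would be used for the index
    // request and for the delete request, so both address the same document id.
    static String documentUrl(String serverUrl, String index, String type, String documentUri) {
        try {
            // The id is the URL-encoded documentURI, so ':' and '/' cannot break the path.
            String id = URLEncoder.encode(documentUri, "UTF-8");
            return serverUrl + "/" + index + "/" + type + "/" + id;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String documentUri = "http://localhost:9090/chemistry-opencmis-server-inmemory/atom/138";
        System.out.println(documentUrl("http://localhost:9200", "index", "generic", documentUri));
    }
}
```

Because indexing and deletion derive the URL from the same function, a document indexed under a given documentURI is deleted under exactly the same id.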

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214783#comment-13214783 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

I tried the test using only lucene-core 3.5.0 and elasticsearch 0.18.7 as dependencies and I got the same error. Maybe the other libraries are unnecessary.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216851#comment-13216851 ] 

Karl Wright edited comment on CONNECTORS-288 at 2/26/12 9:16 PM:
-----------------------------------------------------------------

Looking at the actual test run, the history reports the following at the end:

{code}
02-26-2012 16:09:25.129 	Indexation (ElasticSearch) 	http://localhost:9200/index/_optimize
	OK 	0 	69 	
02-26-2012 16:09:24.939 	job end 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:09:14.909 	Deletion (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/138/null
	OK 	0 	7 	
02-26-2012 16:09:07.787 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/140/null
	OK 	27 	6 	
02-26-2012 16:09:07.778 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/139/null
	OK 	27 	7 	
02-26-2012 16:09:07.769 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/137/null
	OK 	27 	15 	
02-26-2012 16:09:05.278 	job start 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:55.020 	Indexation (ElasticSearch) 	http://localhost:9200/index/_optimize
	OK 	0 	93 	
02-26-2012 16:08:54.926 	job end 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:47.678 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/138/null
	OK 	27 	10 	
02-26-2012 16:08:47.666 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/140/null
	OK 	27 	6 	
02-26-2012 16:08:47.652 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/139/null
	OK 	27 	11 	
02-26-2012 16:08:47.646 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/137/null
	OK 	27 	13 	
02-26-2012 16:08:45.192 	job start 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:34.940 	Indexation (ElasticSearch) 	http://localhost:9200/index/_optimize
	OK 	0 	75 	
02-26-2012 16:08:34.917 	job end 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:29.502 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/139/null
	OK 	27 	10 	
02-26-2012 16:08:29.491 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/137/null
	OK 	27 	8 	
02-26-2012 16:08:29.412 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/140/null
	OK 	27 	66 	
02-26-2012 16:08:29.404 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/138/null
	OK 	27 	68 	
02-26-2012 16:08:25.097 	job start 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:24.846 	Indexation (ElasticSearch) 	http://localhost:9200/index/_optimize
	OK 	0 	88 	
02-26-2012 16:08:14.890 	job stop 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:09.041 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/138/null
	OK 	27 	868 	
02-26-2012 16:08:04.900 	job start 	1330290457146(Test Job)
		0 	1 	
{code}

The job at the end is stuck in the "Cleaning up" state, which indicates that it is trying to delete the documents from the index, but is not succeeding for some reason.  The jobstatus reports 4 documents at that time.

The CMIS connector is not helping here because it does not seem to record ANY activities.  It also looks like the activities being recorded for the ElasticSearch connector are backwards; it records "Optimize" when it should record "Indexation", and vice versa.


                
      was (Author: kwright@metacarta.com):
    Looking at the actual test run, the history reports the following at the end:

{code}
02-26-2012 16:09:25.129 	Indexation (ElasticSearch) 	http://localhost:9200/index/_optimize
	OK 	0 	69 	
02-26-2012 16:09:24.939 	job end 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:09:14.909 	Deletion (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/138/null
	OK 	0 	7 	
02-26-2012 16:09:07.787 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/140/null
	OK 	27 	6 	
02-26-2012 16:09:07.778 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/139/null
	OK 	27 	7 	
02-26-2012 16:09:07.769 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/137/null
	OK 	27 	15 	
02-26-2012 16:09:05.278 	job start 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:55.020 	Indexation (ElasticSearch) 	http://localhost:9200/index/_optimize
	OK 	0 	93 	
02-26-2012 16:08:54.926 	job end 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:47.678 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/138/null
	OK 	27 	10 	
02-26-2012 16:08:47.666 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/140/null
	OK 	27 	6 	
02-26-2012 16:08:47.652 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/139/null
	OK 	27 	11 	
02-26-2012 16:08:47.646 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/137/null
	OK 	27 	13 	
02-26-2012 16:08:45.192 	job start 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:34.940 	Indexation (ElasticSearch) 	http://localhost:9200/index/_optimize
	OK 	0 	75 	
02-26-2012 16:08:34.917 	job end 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:29.502 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/139/null
	OK 	27 	10 	
02-26-2012 16:08:29.491 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/137/null
	OK 	27 	8 	
02-26-2012 16:08:29.412 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/140/null
	OK 	27 	66 	
02-26-2012 16:08:29.404 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/138/null
	OK 	27 	68 	
02-26-2012 16:08:25.097 	job start 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:24.846 	Indexation (ElasticSearch) 	http://localhost:9200/index/_optimize
	OK 	0 	88 	
02-26-2012 16:08:14.890 	job stop 	1330290457146(Test Job)
		0 	1 	
02-26-2012 16:08:09.041 	Optimize (ElasticSearch) 	http://localhost:9090/chemistry-opencmis-server-inmemory/atom...
/138/null
	OK 	27 	868 	
02-26-2012 16:08:04.900 	job start 	1330290457146(Test Job)
		0 	1 	
{code}

The job at the end is stuck in the "Cleaning up" state, which indicates that it is trying to delete the documents from the index, but is not succeeding for some reason.

                  
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-288:
-----------------------------------

    Fix Version/s: ManifoldCF next
    
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>             Fix For: ManifoldCF next
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214886#comment-13214886 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

no problem!
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213889#comment-13213889 ] 

Piergiorgio Lucidi commented on CONNECTORS-288:
-----------------------------------------------

I updated the code with the following changes:

- updated the Lucene dependencies to version 3.5 in the pom.xml
- an initial fix for the integration test implementation: the Manifold, OpenCMIS and ElasticSearch servers now start correctly

So it is now possible to run the integration tests for this connector with Maven; the run currently fails with this exception:
{code}
OpenCMIS InMemory server is starting...
2012-02-22 20:12:25.157:INFO::jetty-6.1.26
2012-02-22 20:12:25.164:INFO::Extract ../dependency/chemistry-opencmis-server-inmemory.war to /var/folders/PT/PTaGVJVfF8ChNLT0YTb4tk+++TI/-Tmp-/Jetty_0_0_0_0_9090_chemistry.opencmis.server.inmemory.war__chemistry.opencmis.server.inmemory__80aygt/webapp
22-feb-2012 20.12.27 com.sun.xml.ws.transport.http.servlet.WSServletContextListener contextInitialized
INFO: WSSERVLET12: JAX-WS context listener initializing
22-feb-2012 20.12.33 com.sun.xml.ws.transport.http.servlet.WSServletDelegate <init>
INFO: WSSERVLET14: JAX-WS servlet initializing
2012-02-22 20:12:33.390:INFO::Started SocketConnector@0.0.0.0:9090
OpenCMIS InMemory server is started listening on port 9090
ElasticSearch is starting...
ElasticSearch is started on port 9200
PooledConnection.guardConnection(): found closed Connection. Statement information follows. Attempting to recover.
PooledConnection.guardConnection: statement was null
PooledConnection.guardConnection(): Recovered connection
java.lang.Exception: API http error; expected 201, saw 400: 
	at org.apache.manifoldcf.crawler.tests.ManifoldCFInstance.performAPIPutOperation(ManifoldCFInstance.java:314)
	at org.apache.manifoldcf.crawler.tests.ManifoldCFInstance.performAPIPutOperationViaNodes(ManifoldCFInstance.java:377)
	at org.apache.manifoldcf.crawler.tests.BaseITDerby.performAPIPutOperationViaNodes(BaseITDerby.java:166)
	at org.apache.manifoldcf.elasticsearch_tests.APISanityIT.sanityCheck(APISanityIT.java:345)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
	at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
	at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
	at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
	at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:81)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
{code}

I think one of the settings in the configuration sent to the ManifoldCF REST API probably has a wrong value.

Anyway, this is a small step forward :D

So I suggest fixing this settings issue first and then working on the Ant script to add all the needed dependencies.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216966#comment-13216966 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Just checked the manifoldcf.log file from the test crawl.  Here's a snippet:

{code}
ERROR 2012-02-26 16:08:09,921 (Worker thread '4') - Exception tossed: 
org.apache.manifoldcf.core.interfaces.ManifoldCFException: 
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchIndex.<init>(ElasticSearchIndex.java:100)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.addOrReplaceDocument(ElasticSearchConnector.java:357)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
	at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1577)
	at org.apache.manifoldcf.crawler.connectors.cmis.CmisRepositoryConnector.processDocuments(CmisRepositoryConnector.java:1162)
	at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
	at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:561)
ERROR 2012-02-26 16:09:35,903 (Document delete thread '7') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
	at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
ERROR 2012-02-26 16:09:36,908 (Document delete thread '9') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
	at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
ERROR 2012-02-26 16:09:37,907 (Document delete thread '8') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
	at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
{code}

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214680#comment-13214680 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Just verified that we're not there under ant yet.

So let me ask where all the test dependencies come from, so I can code them in Ant.

The pom includes:

  <repositories>
    <repository>
      <id>sonatype</id>
      <url>https://oss.sonatype.org/content/repositories/releases</url>
    </repository>
  </repositories>


The dependencies look like:

    <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>elasticsearch</artifactId>
      <version>0.18.7</version>
    </dependency>
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-core</artifactId>
      <version>3.5.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-analyzers</artifactId>
      <version>3.5.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-snowball</artifactId>
      <version>3.0.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-fast-vector-highlighter</artifactId>
      <version>3.0.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-highlighter</artifactId>
      <version>2.4.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-queries</artifactId>
      <version>2.4.0</version>
    </dependency>

I take it the ElasticSearch dependency comes from the sonatype repository?




                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200931#comment-13200931 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

I've created a branch for this work and committed the first patch against it: branches/CONNECTORS-288.  Please submit subsequent diffs against this branch.  Thanks!

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213896#comment-13213896 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

A 400 return typically means invalid arguments...  But good work so far.


                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227272#comment-13227272 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

Hi Karl,

It's wrong anyway.

That code appends one URL to another. If the base URL is:

http://localhost:9090/index/generictype/

and you append this to it:

http://localhost:8543/chemistry/filename

the final URL becomes:

http://localhost:9090/index/generictype/http://localhost:8543/chemistry/filename

That makes no sense. The curl operation would become:

curl -XDELETE http://localhost:9090/index/generictype/http://localhost:8543/chemistry/filename

instead of:

curl -XDELETE http://localhost:9090/index/generictype/filename

Please let me know.
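
For illustration only, here is a minimal Java sketch of one possible way to keep a full document URI as the identifier without producing a nested URL: percent-encode the identifier before appending it to the index/type URL. The class and method names (DocumentUrlSketch, documentUrl) are hypothetical and are not taken from the connector's code; whether the connector should encode the URI or use only the file name is exactly the open question in this comment.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the actual connector.
final class DocumentUrlSketch {

    // Appends a document identifier to an index/type URL as a single,
    // percent-encoded path segment.
    static String documentUrl(String indexTypeUrl, String documentId) {
        String base = indexTypeUrl.endsWith("/") ? indexTypeUrl : indexTypeUrl + "/";
        // URLEncoder performs form encoding (a space becomes '+'), so '+' is
        // rewritten as %20 to make the result usable inside a URL path.
        String encodedId = URLEncoder.encode(documentId, StandardCharsets.UTF_8)
                .replace("+", "%20");
        return base + encodedId;
    }

    public static void main(String[] args) {
        // Prints:
        // http://localhost:9090/index/generictype/http%3A%2F%2Flocalhost%3A8543%2Fchemistry%2Ffilename
        System.out.println(documentUrl(
                "http://localhost:9090/index/generictype/",
                "http://localhost:8543/chemistry/filename"));
    }
}
```

With this approach the identifier stays unique across repositories, at the cost of less readable URLs.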
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202370#comment-13202370 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

I just built the connector using ant.  First thing I noticed was the dist/connectors.xml file:

    <!-- Add your output connectors here -->
  <outputconnector name="OpenSearchServer" class="org.apache.manifoldcf.agents.output.elasticsearch.OpenSearchServerConnector"/>
  <outputconnector name="OpenSearchServer" class="org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnector"/>

You can't have two connectors with the same name ;-).  Also the class name looks suspicious.
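
A duplicate like this could also be caught mechanically. The following Java sketch, which uses only the JDK's XML parser, lists output connector names that appear more than once. The class name ConnectorNameCheckSketch and the <connectors> root element in the sample are assumptions made for the example; only the outputconnector element and its name attribute come from the snippet above.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical check, not part of the ManifoldCF build.
final class ConnectorNameCheckSketch {

    // Returns every outputconnector name that occurs more than once.
    static Set<String> duplicateOutputConnectorNames(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("outputconnector");
            Set<String> seen = new HashSet<>();
            Set<String> duplicates = new LinkedHashSet<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                String name = ((Element) nodes.item(i)).getAttribute("name");
                if (!seen.add(name)) {
                    duplicates.add(name); // second or later occurrence
                }
            }
            return duplicates;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse connectors XML", e);
        }
    }

    public static void main(String[] args) {
        // The <connectors> root element is an assumption for this sample.
        String xml = "<connectors>"
                + "<outputconnector name=\"OpenSearchServer\" class=\"org.apache.manifoldcf.agents.output.elasticsearch.OpenSearchServerConnector\"/>"
                + "<outputconnector name=\"OpenSearchServer\" class=\"org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnector\"/>"
                + "</connectors>";
        System.out.println(duplicateOutputConnectorNames(xml));
    }
}
```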

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207822#comment-13207822 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

If you want to construct a test that uses File System as the source, you would need to import ../ifs-test-build.xml from your test directory's build.xml to include the right stuff.  (This is used by the Solr Connector's UI test, but there is no actual integration test you can base your test on yet, so you will be blazing a new path.)

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216827#comment-13216827 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

From what I can see, the connector IS called, but it just throws an exception when it sets up its session.  I can instrument the connector if you like in order to prove this to you.

If you want to see this, just browse to localhost:8346/mcf-crawler-ui while the test is running.  View the output connection.  You will see the exception I've already reported.

WorkerThread and StartupThread will not become active until the agents process starts.  In a test, this happens during a @Before method.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214017#comment-13214017 ] 

Luca Stancapiano commented on CONNECTORS-288:
---------------------------------------------

Yes, there is surely something extra in the parameters sent by the test case. I'll take it.
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Updated] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Stancapiano updated CONNECTORS-288:
----------------------------------------

    Attachment: manifold-elasticsearch-patch

In this new patch:

1 - I added a new configuration parameter: indexType. It is mandatory for update operations. Here is an example of an update over HTTP:

curl -XPUT http://localhost:9200/${indexName}/${indexType}/_update -d {}

2 - I parse the output message. Here are two examples of ElasticSearch output messages:

Successful:

{"ok":true,"_index":"index","_type":"aa","_id":"_update","_version":1}

I read the "ok" field. If it is true, the operation was successful.

Error:

{"error":"ElasticSearchParseException[Failed to derive xcontent from (offset=0, length=0): []]","status":500}

If the "ok" field is absent, I read the "error" field and report its value as the error message.
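
The check described in point 2 can be sketched in Java as follows. This is only an illustration under loud assumptions: the class and method names (ResponseCheckSketch, errorMessage) are hypothetical, and regular expressions stand in for whatever JSON parser the patch actually uses; a real implementation would be better served by a proper JSON library.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the patch itself may parse the response differently.
final class ResponseCheckSketch {

    // Matches "ok":true anywhere in the response body.
    private static final Pattern OK_TRUE = Pattern.compile("\"ok\"\\s*:\\s*true");

    // Captures the string value of the "error" field.
    private static final Pattern ERROR =
            Pattern.compile("\"error\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    // Returns an empty Optional on success, otherwise the error message.
    static Optional<String> errorMessage(String responseBody) {
        if (OK_TRUE.matcher(responseBody).find()) {
            return Optional.empty();
        }
        Matcher m = ERROR.matcher(responseBody);
        return Optional.of(m.find()
                ? m.group(1)
                : "Unrecognized response: " + responseBody);
    }

    public static void main(String[] args) {
        String ok = "{\"ok\":true,\"_index\":\"index\",\"_type\":\"aa\",\"_id\":\"_update\",\"_version\":1}";
        String error = "{\"error\":\"ElasticSearchParseException[Failed to derive xcontent from (offset=0, length=0): []]\",\"status\":500}";
        System.out.println(errorMessage(ok).isPresent());
        System.out.println(errorMessage(error).orElse(""));
    }
}
```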
                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206458#comment-13206458 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Committed latest patch.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216830#comment-13216830 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Instrumentation yields the following:

    [junit] org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
    [junit]     at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
    [junit]     at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchAction.<init>(ElasticSearchAction.java:37)
    [junit]     at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.check(ElasticSearchConnector.java:389)

I'm instrumenting the ElasticSearchAction constructor now to see what URL it thinks it is using.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215177#comment-13215177 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Committed fixes that tie elasticsearch download and unpack into the build system, and point the elasticsearch IT tests at the unpacked download.

Test now fails with an API error; expects 201 but sees 400.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Piergiorgio Lucidi (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202374#comment-13202374 ] 

Piergiorgio Lucidi edited comment on CONNECTORS-288 at 2/7/12 1:45 PM:
-----------------------------------------------------------------------

Luca will work on this task this week; I think he will solve all the issues ;)

Here are other issues to solve for this connector, as confirmed by Karl:
{quote}
(1) Indent.  Apache standard is 2 spaces per level.  I think it's good
to stick to that so that people can read diffs without a lot of
extraneous reformatting.

(2) For the Velocity conditional, you use PROTOCOL_BJ when (since this
is evaluated in Velocity) you really want something that isn't escaped
at all (just PROTOCOL).
{quote}
                
      was (Author: piergiorgiolucidi@gmail.com):
    Luca will work on this task during this week, I think that he will solve all the issues ;)
                  
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


       

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Tommaso Teofili (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200692#comment-13200692 ] 

Tommaso Teofili edited comment on CONNECTORS-288 at 2/5/12 8:37 AM:
--------------------------------------------------------------------

Nice thing Luca, looking forward to a patch for it :)
                
      was (Author: teofili):
    Nice thing Luce, looking forward to a patch for it :)
                  
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>             Fix For: ManifoldCF next
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209673#comment-13209673 ] 

Karl Wright edited comment on CONNECTORS-288 at 2/16/12 8:17 PM:
-----------------------------------------------------------------

I modified the ant build to properly build and run the tests.  Unfortunately the compilation of the IT tests fails:

{code}
compile-tests:
    [javac] C:\wip\mcf\CONNECTORS-288\tests\test-build.xml:102: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 4 source files to C:\wip\mcf\CONNECTORS-288\tests\elasticsearch\build\test\classes
    [javac] C:\wip\mcf\CONNECTORS-288\tests\elasticsearch\src\test\java\org\apache\manifoldcf\elasticsearch_tests\BaseDerby.java:22: package org.elasticsearch.node does not exist
    [javac] import org.elasticsearch.node.Node;
    [javac]                               ^
    [javac] C:\wip\mcf\CONNECTORS-288\tests\elasticsearch\src\test\java\org\apache\manifoldcf\elasticsearch_tests\BaseDerby.java:27: package org.elasticsearch.node does not exist
    [javac] import static org.elasticsearch.node.NodeBuilder.*;
    [javac]                                      ^
    [javac] C:\wip\mcf\CONNECTORS-288\tests\elasticsearch\src\test\java\org\apache\manifoldcf\elasticsearch_tests\BaseDerby.java:37: cannot find symbol
    [javac] symbol  : class Node
    [javac] location: class org.apache.manifoldcf.elasticsearch_tests.BaseDerby
    [javac]   protected Node node = null;
    [javac]             ^
    [javac] C:\wip\mcf\CONNECTORS-288\tests\elasticsearch\src\test\java\org\apache\manifoldcf\elasticsearch_tests\BaseDerby.java:92: cannot find symbol
    [javac] symbol  : method nodeBuilder()
    [javac] location: class org.apache.manifoldcf.elasticsearch_tests.BaseDerby
    [javac]     node = nodeBuilder().local(true).node();
    [javac]            ^
    [javac] 4 errors
{code}


                
      was (Author: kwright@metacarta.com):
    I modified the ant build to properly build and run the tests.  Unfortunately the compilation of the IT tests fails.

                  
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216840#comment-13216840 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

So when I added a System.out.println of the URL in ElasticSearchAction, I no longer get any errors; the check() response is "OK".  (That is wrong, by the way; it should return super.check() instead, which is "Connection working".)

The instrumented URL output looks like this:

    [junit] URL is 'http://localhost:9200/index/_optimize'
    [junit] URL is 'http://localhost:9200/index/_status'
    [junit] URL is 'http://localhost:9200/index/_optimize'
    [junit] URL is 'http://localhost:9200/index/_optimize'
    [junit] URL is 'http://localhost:9200/index/_optimize'

... followed by the 120000 ms timeout.

Some conclusions: (1) We should fix the check() method; (2) The fact that check() succeeds sometimes and fails others is quite disconcerting; clearly the connector is doing something pretty wrong.
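
A minimal sketch of what conclusion (1) could look like. The class names below are hypothetical stand-ins, not the real ManifoldCF base class or connector, and the boolean flag only simulates whether the server answered; the two status strings are the ones quoted in this comment and in the earlier stack trace.

```java
// Hypothetical stand-in for the connector base class; the real one has many more members.
abstract class BaseOutputConnectorSketch
{
  // Default status string, as quoted in the comment above.
  public String check()
  {
    return "Connection working";
  }
}

// Hypothetical stand-in for the ElasticSearch connector.
class ElasticSearchConnectorSketch extends BaseOutputConnectorSketch
{
  // Assumption: simulates the outcome of the status request to the server.
  private final boolean serverReachable;

  ElasticSearchConnectorSketch(boolean serverReachable)
  {
    this.serverReachable = serverReachable;
  }

  @Override
  public String check()
  {
    if (!serverReachable)
    {
      // Report the failure text rather than a success string.
      return "Server/page not found";
    }
    // On success, defer to the base class instead of returning a custom "OK".
    return super.check();
  }
}

class CheckSketchDemo
{
  public static void main(String[] args)
  {
    System.out.println(new ElasticSearchConnectorSketch(true).check());
    System.out.println(new ElasticSearchConnectorSketch(false).check());
  }
}
```

Deferring to super.check() keeps the success message identical across connectors, which is presumably what the UI and the tests compare against.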

I also looked more deeply at the code itself.  The addOrReplaceDocument() method uses a synchronizer to permit only one thread to index at a time.  This does not seem correct to me, and it is thus probable that the problem stems from improper understanding of the ManifoldCF threading model.  Each connector instance should be working with its own ElasticSearchIndex object and its own HttpClient method so that all of the threads can operate independently without collision.
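
The per-instance ownership described above can be sketched as follows. This is only an illustration under stated assumptions: IndexClientSketch and ConnectorInstanceSketch are hypothetical names, the counter stands in for real HTTP requests, and the only point shown is that no shared lock is needed when each worker thread uses its own connector instance with its own helper objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-connector index helper; not the real ElasticSearchIndex.
class IndexClientSketch
{
  // Touched only by the thread that owns the enclosing connector instance.
  private int sent = 0;

  void index(String documentURI)
  {
    sent++; // real code would issue the HTTP request here
  }

  int sentCount()
  {
    return sent;
  }
}

// Each connector instance owns its client, so indexing needs no shared lock.
class ConnectorInstanceSketch
{
  private final IndexClientSketch client = new IndexClientSketch();

  void addOrReplaceDocument(String documentURI)
  {
    client.index(documentURI);
  }

  int indexedCount()
  {
    return client.sentCount();
  }
}

class ThreadingSketchDemo
{
  // Runs one connector instance per worker thread and returns the total indexed.
  static int runWorkers(int workers, final int docsPerWorker)
  {
    List<ConnectorInstanceSketch> connectors = new ArrayList<ConnectorInstanceSketch>();
    List<Thread> threads = new ArrayList<Thread>();
    for (int w = 0; w < workers; w++)
    {
      final ConnectorInstanceSketch connector = new ConnectorInstanceSketch();
      final int id = w;
      Thread t = new Thread(new Runnable()
      {
        public void run()
        {
          for (int d = 0; d < docsPerWorker; d++)
            connector.addOrReplaceDocument("doc-" + id + "-" + d);
        }
      });
      connectors.add(connector);
      threads.add(t);
      t.start();
    }
    int total = 0;
    for (int i = 0; i < workers; i++)
    {
      try
      {
        threads.get(i).join(); // join makes the worker's writes visible here
      }
      catch (InterruptedException e)
      {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
      total += connectors.get(i).indexedCount();
    }
    return total;
  }

  public static void main(String[] args)
  {
    System.out.println(runWorkers(4, 1000));
  }
}
```

Because nothing mutable is shared between the workers, all of them can index at the same time; a global synchronizer would only serialize work that does not need to be serialized.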

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211343#comment-13211343 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Committed latest patch.

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222036#comment-13222036 ] 

Karl Wright commented on CONNECTORS-288:
----------------------------------------

Luca,

The BaseITDerby class call to super.setUp() calls ConnectorBase.setUp(), which looks like this:

{code}
  @Before
  public void setUp()
    throws Exception
  {
    try
    {
      localCleanUp();
    }
    catch (Exception e)
    {
      System.out.println("Warning: Preclean failed: "+e.getMessage());
    }
    try
    {
      localSetUp();
    }
    catch (Exception e)
    {
      e.printStackTrace();
      throw e;
    }
  }
{code}

The localCleanUp() call does the following:

{code}
  protected void localCleanUp()
    throws Exception
  {
    initialize();
    if (isInitialized())
    {
      // ... remove connections, jobs, etc.
    }
  }
{code}

There is no code that calls mcfInstance.stop() in the localCleanUp() method.

It is true that the agents process is meant to only run ONCE in a JVM instance.  That means you can start it, then stop it, but you cannot start it again after that.  The reasons are historical and have to do with avoiding race conditions when more than one entity is trying to start the agents process at the same time.

This is one reason that we use fork="true" for both the Maven and Ant invocations of tests.  If we do not do this, the agents process will not start on the second and subsequent tests that are called, because the JVM instance is shared in that model across all test runs.
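
To make the "only once per JVM" constraint concrete, here is a minimal sketch of such a guard. It is not ManifoldCF code: AgentsProcessSketch and its fields are hypothetical, and the actual mechanism in ManifoldCF may differ. It only shows why a second start in the same JVM is refused even after a stop, which is the situation a shared (non-forked) test JVM runs into.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of a "start at most once per JVM" guard; not ManifoldCF code.
class AgentsProcessSketch
{
  // Static, so the flag is shared by every instance loaded in this JVM.
  private static final AtomicBoolean everStarted = new AtomicBoolean(false);

  private boolean running = false;

  void start()
  {
    // compareAndSet lets exactly one caller win, even if several race to start.
    if (!everStarted.compareAndSet(false, true))
    {
      throw new IllegalStateException("agents process was already started in this JVM");
    }
    running = true;
  }

  void stop()
  {
    // Stopping does not reset the static flag, so a later start() is still refused.
    running = false;
  }

  boolean isRunning()
  {
    return running;
  }
}

class AgentsProcessSketchDemo
{
  public static void main(String[] args)
  {
    AgentsProcessSketch first = new AgentsProcessSketch();
    first.start();
    first.stop();
    try
    {
      new AgentsProcessSketch().start();
    }
    catch (IllegalStateException e)
    {
      System.out.println("Second start refused: " + e.getMessage());
    }
  }
}
```

With a forked JVM per test, the static flag starts out false each time, so every test can start the process once.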


                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF


        

[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

Posted by "Luca Stancapiano (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226945#comment-13226945 ] 

Luca Stancapiano edited comment on CONNECTORS-288 at 3/10/12 9:31 PM:
----------------------------------------------------------------------

OK. By deleting properties.xml after each test, I can now run and fix the test. The delete problem is tied to a recent update:

class org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete, line 34:

{code}
String fileName = java.net.URLEncoder.encode(documentURI,"utf-8");
{code}

It must be a file name, not a URI, because it is the id used for the delete. The delete method in the ElasticSearch connector is equivalent to a curl call like the one below:

{code}
curl -XDELETE http://localhost:9200/index/generictype/gtgt
{code}

where gtgt is the value of the 'fileName' variable. With the URL-encoded URI, the call becomes:

{code}
curl -XDELETE http://localhost:9200/index/generictype/http%3A%2F%2Flocalhost%3A9090%2Fchemistry-opencmis-server-inmemory%2Fatom%2F139%2Fnull 
{code}

If you change the line back to what it was before:

{code}
String fileName = FilenameUtils.getName(documentURI);
{code}

the delete works.
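
To illustrate the difference between the two lines, here is a minimal sketch. DeleteIdSketch, encodedUri and lastSegment are hypothetical names; lastSegment is a plain-JDK stand-in for FilenameUtils.getName that assumes forward slashes only, and the sample URI is simply the decoded form of the encoded id shown in the curl command above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class DeleteIdSketch
{
  // Id as produced by the current line 34: the whole URI, URL-encoded.
  static String encodedUri(String documentURI)
  {
    try
    {
      return URLEncoder.encode(documentURI, "utf-8");
    }
    catch (UnsupportedEncodingException e)
    {
      // UTF-8 is always available on the JVM, so this is not expected to happen.
      throw new IllegalStateException(e);
    }
  }

  // Stand-in for FilenameUtils.getName: the text after the last '/'.
  static String lastSegment(String documentURI)
  {
    return documentURI.substring(documentURI.lastIndexOf('/') + 1);
  }

  public static void main(String[] args)
  {
    String uri = "http://localhost:9090/chemistry-opencmis-server-inmemory/atom/139/null";
    System.out.println(encodedUri(uri));
    System.out.println(lastSegment(uri));
  }
}
```

The first call yields the long encoded id from the failing curl command, and the second yields just 'null'. Whichever form is used for the delete has to be the same form that was used as the id when the document was indexed.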

With that done, there is a new problem tied to https://issues.apache.org/jira/browse/CONNECTORS-417:

our test creates multiple versions of the document called 'null'.

When the delete is called, the test succeeds in deleting the first 'null' document, because the delete operation deletes all versions of the document. After that, the 'null' document is no longer there, but the test tries to delete it anyway and goes into a loop.

I suppose this problem occurs because the fix for https://issues.apache.org/jira/browse/CONNECTORS-417 has not been committed to the branch. Let me know.





                
                  
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF
