Posted to commits@manifoldcf.apache.org by kw...@apache.org on 2013/06/02 13:23:29 UTC

svn commit: r1488664 - in /manifoldcf/trunk: ./ connectors/ connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/ connectors/googledrive/ connectors/wiki/ connectors/wiki/connector/src/main/java/org/apache/manifol...

Author: kwright
Date: Sun Jun  2 11:23:28 2013
New Revision: 1488664

URL: http://svn.apache.org/r1488664
Log:
Add Google Drive connector - CONNECTORS-694.  Thanks to Andrew Janowczyk for this contribution.

Added:
    manifoldcf/trunk/connectors/googledrive/
      - copied from r1488663, manifoldcf/branches/CONNECTORS-694/connectors/googledrive/
    manifoldcf/trunk/framework/core/src/main/java/org/apache/manifoldcf/core/common/XThreadStringBuffer.java
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/framework/core/src/main/java/org/apache/manifoldcf/core/common/XThreadStringBuffer.java
    manifoldcf/trunk/site/src/documentation/resources/images/en_US/googledrive-repository-connection-configuration-save.PNG
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/site/src/documentation/resources/images/en_US/googledrive-repository-connection-configuration-save.PNG
    manifoldcf/trunk/site/src/documentation/resources/images/en_US/googledrive-repository-connection-configuration.PNG
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/site/src/documentation/resources/images/en_US/googledrive-repository-connection-configuration.PNG
    manifoldcf/trunk/site/src/documentation/resources/images/en_US/googledrive-repository-connection-job-googledrive-seed-query.PNG
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/site/src/documentation/resources/images/en_US/googledrive-repository-connection-job-googledrive-seed-query.PNG
    manifoldcf/trunk/site/src/documentation/resources/images/en_US/googledrive-repository-setup-1.PNG
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/site/src/documentation/resources/images/en_US/googledrive-repository-setup-1.PNG
    manifoldcf/trunk/site/src/documentation/resources/images/en_US/googledrive-repository-setup-2.PNG
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/site/src/documentation/resources/images/en_US/googledrive-repository-setup-2.PNG
    manifoldcf/trunk/site/src/documentation/resources/images/en_US/googledrive-repository-setup-3.PNG
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/site/src/documentation/resources/images/en_US/googledrive-repository-setup-3.PNG
    manifoldcf/trunk/site/src/documentation/resources/images/en_US/googledrive-repository-setup-4.PNG
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/site/src/documentation/resources/images/en_US/googledrive-repository-setup-4.PNG
    manifoldcf/trunk/site/src/documentation/resources/images/en_US/googledrive-repository-setup-5.PNG
      - copied unchanged from r1488663, manifoldcf/branches/CONNECTORS-694/site/src/documentation/resources/images/en_US/googledrive-repository-setup-5.PNG
Removed:
    manifoldcf/trunk/connectors/wiki/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/wiki/PageBuffer.java
Modified:
    manifoldcf/trunk/   (props changed)
    manifoldcf/trunk/CHANGES.txt
    manifoldcf/trunk/build.xml
    manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxRepositoryConnector.java
    manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxSession.java
    manifoldcf/trunk/connectors/pom.xml
    manifoldcf/trunk/connectors/wiki/   (props changed)
    manifoldcf/trunk/connectors/wiki/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/wiki/WikiConnector.java
    manifoldcf/trunk/dist-license/LICENSE.txt
    manifoldcf/trunk/framework/core/src/main/java/org/apache/manifoldcf/core/common/XThreadInputStream.java
    manifoldcf/trunk/lib-license/LICENSE.txt
    manifoldcf/trunk/site/src/documentation/content/xdocs/en_US/end-user-documentation.xml

Propchange: manifoldcf/trunk/
------------------------------------------------------------------------------
  Merged /manifoldcf/branches/CONNECTORS-694:r1488166-1488663

Modified: manifoldcf/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/CHANGES.txt?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/CHANGES.txt (original)
+++ manifoldcf/trunk/CHANGES.txt Sun Jun  2 11:23:28 2013
@@ -3,6 +3,9 @@ $Id$
 
 ======================= 1.3-dev =====================
 
+CONNECTORS-694: Add Google Drive connector.
+(Andrew Janowczyk, Karl Wright)
+
 CONNECTORS-690: For ElasticSearch connector, include _name and
 _content_type field within "file" portion of JSON, so it will work properly
 with the Mapper Attachment Plugin.

Modified: manifoldcf/trunk/build.xml
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/build.xml?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/build.xml (original)
+++ manifoldcf/trunk/build.xml Sun Jun  2 11:23:28 2013
@@ -59,6 +59,7 @@
         <ant dir="connectors/alfresco" target="clean"/>
         <ant dir="connectors/cmis" target="clean"/>
         <ant dir="connectors/dropbox" target="clean"/>
+        <ant dir="connectors/googledrive" target="clean"/>
         <ant dir="connectors/activedirectory" target="clean"/>
         <ant dir="connectors/ldap" target="clean"/>
         <ant dir="connectors/documentum" target="clean"/>
@@ -110,6 +111,7 @@
         <ant dir="connectors/alfresco" target="clean"/>
         <ant dir="connectors/cmis" target="clean"/>
         <ant dir="connectors/dropbox" target="clean"/>
+        <ant dir="connectors/googledrive" target="clean"/>
         <ant dir="connectors/activedirectory" target="clean"/>
         <ant dir="connectors/ldap" target="clean"/>
         <ant dir="connectors/documentum" target="clean"/>
@@ -281,6 +283,8 @@
     <target name="setup-cmis-connector" depends="build-framework" if="downloaded"/>
     
     <target name="setup-dropbox-connector" depends="build-framework" if="downloaded"/>
+
+    <target name="setup-googledrive-connector" depends="build-framework" if="downloaded"/>
     
     <target name="setup-alfresco-connector-tests" depends="build-tests-framework" if="downloaded"/>
 
@@ -302,10 +306,21 @@
         <ant dir="connectors/dropbox" target="build"/>
     </target>
 
+
+    <target name="build-googledrive-connector" depends="setup-googledrive-connector" if="downloaded">
+        <ant dir="connectors/googledrive" target="build"/>
+    </target>
+
+
     <target name="doc-dropbox-connector" depends="setup-dropbox-connector" if="downloaded">
         <ant dir="connectors/dropbox" target="doc"/>
     </target>
 
+    <target name="doc-googledrive-connector" depends="setup-googledrive-connector" if="downloaded">
+        <ant dir="connectors/googledrive" target="doc"/>
+    </target>
+
+
     <target name="doc-alfresco-connector" depends="setup-alfresco-connector" if="downloaded">
         <ant dir="connectors/alfresco" target="doc"/>
     </target>
@@ -1395,6 +1410,29 @@
         </condition>
     </target>
 
+
+    <target name="calculate-googledrive-condition" depends="build-googledrive-connector">
+        <available file="connectors/googledrive/dist/lib" type="dir" property="googledrive.exists"/>
+        <condition property="googledrive.include">
+            <and>
+                <isset property="googledrive.exists"/>
+                <isset property="downloaded"/>
+            </and>
+        </condition>
+    </target>
+
+    <target name="calculate-googledrive-doc-condition" depends="doc-googledrive-connector">
+        <available file="connectors/googledrive/dist/doc" type="dir" property="googledrive-doc.exists"/>
+        <condition property="googledrive-doc.include">
+            <and>
+                <isset property="googledrive-doc.exists"/>
+                <isset property="downloaded"/>
+            </and>
+        </condition>
+    </target>
+
+
+
     <target name="calculate-cmis-doc-condition" depends="doc-cmis-connector">
         <available file="connectors/cmis/dist/doc" type="dir" property="cmis-doc.exists"/>
         <condition property="cmis-doc.include">
@@ -1437,6 +1475,24 @@
             <param name="connector-name" value="dropbox"/>
         </antcall>
     </target>
+
+    <target name="deliver-googledrive-connector" depends="calculate-googledrive-condition" if="googledrive.include">
+        <antcall target="general-connector-delivery">
+            <param name="connector-name" value="googledrive"/>
+        </antcall>
+        <antcall target="general-add-repository-connector">
+            <param name="connector-name" value="googledrive"/>
+            <param name="connector-label" value="GoogleDrive"/>
+            <param name="connector-class" value="org.apache.manifoldcf.crawler.connectors.googledrive.GoogleDriveRepositoryConnector"/>
+        </antcall>
+    </target>
+
+    <target name="deliver-googledrive-connector-doc" depends="calculate-googledrive-doc-condition" if="googledrive-doc.include">
+        <antcall target="general-connector-doc-delivery">
+            <param name="connector-name" value="googledrive"/>
+        </antcall>
+    </target>
+
     
     <target name="deliver-cmis-connector-doc" depends="calculate-cmis-doc-condition" if="cmis-doc.include">
         <antcall target="general-connector-doc-delivery">
@@ -2590,8 +2646,8 @@
     <target name="end-to-end-loadtests-HSQLDB" depends="run-filesystem-loadtests-HSQLDB,run-rss-loadtests-HSQLDB,run-wiki-loadtests-HSQLDB,run-alfresco-loadtests-HSQLDB,run-cmis-loadtests-HSQLDB,run-sharepoint-loadtests-HSQLDB"/>
 
 
-    <target name="deliver-open-connectors" depends="deliver-dropbox-connector,deliver-nullauthority-connector,deliver-activedirectory-connector,deliver-ldap-connector,deliver-alfresco-connector,deliver-cmis-connector,deliver-filesystem-connector,deliver-rss-connector,deliver-webcrawler-connector,deliver-wiki-connector,deliver-jdbc-connector"/>
-    <target name="deliver-open-connectors-doc" depends="deliver-dropbox-connector-doc,deliver-nullauthority-connector-doc,deliver-activedirectory-connector-doc,deliver-ldap-connector-doc,deliver-alfresco-connector-doc,deliver-cmis-connector-doc,deliver-filesystem-connector-doc,deliver-rss-connector-doc,deliver-webcrawler-connector-doc,deliver-wiki-connector-doc,deliver-jdbc-connector-doc"/>
+    <target name="deliver-open-connectors" depends="deliver-googledrive-connector,deliver-dropbox-connector,deliver-nullauthority-connector,deliver-activedirectory-connector,deliver-ldap-connector,deliver-alfresco-connector,deliver-cmis-connector,deliver-filesystem-connector,deliver-rss-connector,deliver-webcrawler-connector,deliver-wiki-connector,deliver-jdbc-connector"/>
+    <target name="deliver-open-connectors-doc" depends="deliver-googledrive-connector-doc,deliver-dropbox-connector-doc,deliver-nullauthority-connector-doc,deliver-activedirectory-connector-doc,deliver-ldap-connector-doc,deliver-alfresco-connector-doc,deliver-cmis-connector-doc,deliver-filesystem-connector-doc,deliver-rss-connector-doc,deliver-webcrawler-connector-doc,deliver-wiki-connector-doc,deliver-jdbc-connector-doc"/>
     
     <target name="deliver-output-connectors" depends="deliver-gts-connector,deliver-solr-connector,deliver-nulloutput-connector,deliver-opensearchserver-connector,deliver-elasticsearch-connector"/>
     <target name="deliver-output-connectors-doc" depends="deliver-gts-connector-doc,deliver-solr-connector-doc,deliver-nulloutput-connector-doc,deliver-opensearchserver-connector-doc,deliver-elasticsearch-connector-doc"/>
@@ -3599,6 +3655,54 @@ Use Apache Forrest version forrest-0.9-d
         </antcall>
     </target>
     
+
+   <target name="download-google-api-client">
+        <mkdir dir="lib"/>
+        <antcall target="download-via-maven">
+            <param name="target" value="lib"/>
+            <param name="project-path" value="com/google/apis"/>
+            <param name="artifact-version" value="v2-rev64-1.14.1-beta"/>
+            <param name="artifact-name" value="google-api-services-drive"/>
+            <param name="artifact-type" value="jar"/>
+        </antcall>
+        <antcall target="download-via-maven">
+            <param name="target" value="lib"/>
+            <param name="project-path" value="com/google/http-client"/>
+            <param name="artifact-version" value="1.14.1-beta"/>
+            <param name="artifact-name" value="google-http-client"/>
+            <param name="artifact-type" value="jar"/>
+        </antcall>
+        <antcall target="download-via-maven">
+            <param name="target" value="lib"/>
+            <param name="project-path" value="com/google/http-client"/>
+            <param name="artifact-version" value="1.14.1-beta"/>
+            <param name="artifact-name" value="google-http-client-jackson2"/>
+            <param name="artifact-type" value="jar"/>
+        </antcall>
+        <antcall target="download-via-maven">
+            <param name="target" value="lib"/>
+            <param name="project-path" value="com/google/oauth-client"/>
+            <param name="artifact-version" value="1.14.1-beta"/>
+            <param name="artifact-name" value="google-oauth-client"/>
+            <param name="artifact-type" value="jar"/>
+        </antcall>
+        <antcall target="download-via-maven">
+            <param name="target" value="lib"/>
+            <param name="project-path" value="com/fasterxml/jackson/core/"/>
+            <param name="artifact-version" value="2.1.3"/>
+            <param name="artifact-name" value="jackson-core"/>
+            <param name="artifact-type" value="jar"/>
+        </antcall> 
+        <antcall target="download-via-maven">
+            <param name="target" value="lib"/>
+            <param name="project-path" value="com/google/api-client"/>
+            <param name="artifact-version" value="1.14.1-beta"/>
+            <param name="artifact-name" value="google-api-client"/>
+            <param name="artifact-type" value="jar"/>
+        </antcall>
+
+    </target>
+
     <target name="download-sharepoint-plugins">
         <mkdir dir="lib/sharepoint-2007"/>
         <!-- Download and unpack binary artifact -->
@@ -3652,7 +3756,7 @@ Use Apache Forrest version forrest-0.9-d
         </antcall>
     </target>
 
-    <target name="make-core-deps" depends="download-dropbox-client,download-solrj,download-httpcomponents,download-json,download-hsqldb,download-xerces,download-commons,download-elasticsearch-plugin,download-solr-plugins,download-sharepoint-plugins,download-jstl,download-xmlgraphics-commons,download-wstx-asl,download-xmlsec,download-xml-apis,download-wss4j,download-velocity,download-streambuffer,download-stax,download-servlet-api,download-xml-resolver,download-osgi,download-opensaml,download-mimepull,download-mail,download-log4j,download-junit,download-jaxws,download-glassfish,download-jaxb,download-tomcat,download-h2,download-h2-support,download-geronimo-specs,download-fop,download-derby,download-postgresql,download-axis,download-saaj,download-wsdl4j,download-castor,download-jetty,download-slf4j,download-xalan,download-activation,download-avalon-framework,download-poi,download-chemistry,download-ecj">
+    <target name="make-core-deps" depends="download-google-api-client,download-dropbox-client,download-solrj,download-httpcomponents,download-json,download-hsqldb,download-xerces,download-commons,download-elasticsearch-plugin,download-solr-plugins,download-sharepoint-plugins,download-jstl,download-xmlgraphics-commons,download-wstx-asl,download-xmlsec,download-xml-apis,download-wss4j,download-velocity,download-streambuffer,download-stax,download-servlet-api,download-xml-resolver,download-osgi,download-opensaml,download-mimepull,download-mail,download-log4j,download-junit,download-jaxws,download-glassfish,download-jaxb,download-tomcat,download-h2,download-h2-support,download-geronimo-specs,download-fop,download-derby,download-postgresql,download-axis,download-saaj,download-wsdl4j,download-castor,download-jetty,download-slf4j,download-xalan,download-activation,download-avalon-framework,download-poi,download-chemistry,download-ecj">
         <copy todir="lib">
             <fileset dir="lib-license" includes="*.txt"/>
         </copy>
@@ -3682,6 +3786,7 @@ Use Apache Forrest version forrest-0.9-d
         <ant dir="connectors/alfresco" target="download-dependencies"/>
         <ant dir="connectors/cmis" target="download-dependencies"/>
         <ant dir="connectors/dropbox" target="download-dependencies"/>
+        <ant dir="connectors/googledrive" target="download-dependencies"/>
         <ant dir="connectors/activedirectory" target="download-dependencies"/>
         <ant dir="connectors/ldap" target="download-dependencies"/>
         <ant dir="connectors/documentum" target="download-dependencies"/>
@@ -3719,6 +3824,7 @@ Use Apache Forrest version forrest-0.9-d
         <ant dir="connectors/alfresco" target="download-cleanup"/>
         <ant dir="connectors/cmis" target="download-cleanup"/>
         <ant dir="connectors/dropbox" target="download-cleanup"/>        
+        <ant dir="connectors/googledrive" target="download-cleanup"/>
         <ant dir="connectors/activedirectory" target="download-cleanup"/>
         <ant dir="connectors/ldap" target="download-cleanup"/>
         <ant dir="connectors/documentum" target="download-cleanup"/>

Modified: manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxRepositoryConnector.java
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxRepositoryConnector.java?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxRepositoryConnector.java (original)
+++ manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxRepositoryConnector.java Sun Jun  2 11:23:28 2013
@@ -17,12 +17,10 @@
 * limitations under the License.
 */
 
-/*
- * To change this template, choose Tools | Templates
- * and open the template in the editor.
- */
 package org.apache.manifoldcf.crawler.connectors.dropbox;
 
+import org.apache.manifoldcf.core.common.*;
+
 import com.dropbox.client2.DropboxAPI;
 import com.dropbox.client2.exception.DropboxException;
 import java.io.IOException;
@@ -674,20 +672,24 @@ public class DropboxRepositoryConnector 
       i++;
     }
     
-    HashSet<String> seeds = getSeeds(dropboxPath);
-    for (String seed : seeds) {
-      activities.addSeedDocument(seed);
-    }
-
-  }
-
-  protected HashSet<String> getSeeds(String path)
-    throws ManifoldCFException, ServiceInterruption {
     getSession();
-    GetSeedsThread t = new GetSeedsThread(path);
+    XThreadStringBuffer seedBuffer = new XThreadStringBuffer();
+    GetSeedsThread t = new GetSeedsThread(dropboxPath, seedBuffer);
     try {
       t.start();
+      
+      // Pick up the paths, and add them to the activities, before we join with the child thread.
+      while (true) {
+        // The only kind of exceptions this can throw are going to shut the process down.
+        String docPath = seedBuffer.fetch();
+        if (docPath ==  null)
+          break;
+        // Add the pageID to the queue
+        activities.addSeedDocument(docPath);
+      }
+
       t.join();
+
       Throwable thr = t.getException();
       if (thr != null) {
         if (thr instanceof DropboxException) {
@@ -705,35 +707,34 @@ public class DropboxRepositoryConnector 
     } catch (DropboxException e) {
       Logging.connectors.error("DROPBOX: Error adding seed documents: " + e.getMessage(), e);
       handleDropboxException(e);
+    } finally {
+      // Make SURE buffer is dead, otherwise child thread may well hang waiting on it
+      seedBuffer.abandon();
     }
-    return t.getResponse();
   }
 
   protected class GetSeedsThread extends Thread {
 
     protected Throwable exception = null;
-    protected HashSet<String> response = null;
-    protected String path = null;
+    protected final String path;
+    protected final XThreadStringBuffer seedBuffer;
     
-    public GetSeedsThread(String path) {
+    public GetSeedsThread(String path, XThreadStringBuffer seedBuffer) {
       super();
-      this.path=path;
+      this.path = path;
+      this.seedBuffer = seedBuffer;
       setDaemon(true);
     }
 
     @Override
     public void run() {
       try {
-        response = session.getSeeds(path,25000); //upper limit on files to get supported by dropbox api in a single directory
+        session.getSeeds(seedBuffer,path,25000); //upper limit on files to get supported by dropbox api in a single directory
       } catch (Throwable e) {
         this.exception = e;
       }
     }
 
-    public HashSet<String> getResponse() {
-      return response;
-    }
-    
     public Throwable getException() {
       return exception;
     }
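
Note on the change above: the connector thread now consumes seeds through a shared buffer while the child thread is still producing them, and abandons the buffer in a finally block so the child cannot hang on a full buffer. The sketch below illustrates that hand-off pattern only. It is not the XThreadStringBuffer source; the class SeedBufferSketch, its capacity of 4, the signalDone() method, and the drain() helper are all hypothetical names chosen for illustration. In particular, the diff does not show how end-of-data is signalled, so the sketch assumes the producer signals it in its own finally block.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for illustration; NOT the actual XThreadStringBuffer.
class SeedBufferSketch {
  private static final int MAX_SIZE = 4; // assumed capacity, small to exercise blocking
  private final Queue<String> queue = new ArrayDeque<>();
  private boolean done = false;
  private boolean abandoned = false;

  /** Producer side: blocks while the buffer is full; a no-op once abandoned. */
  public synchronized void add(String value) throws InterruptedException {
    while (queue.size() >= MAX_SIZE && !abandoned)
      wait();
    if (abandoned)
      return;
    queue.add(value);
    notifyAll();
  }

  /** Producer side: signals that no more values will arrive (assumed method). */
  public synchronized void signalDone() {
    done = true;
    notifyAll();
  }

  /** Consumer side: gives up on the buffer and releases any blocked producer. */
  public synchronized void abandon() {
    abandoned = true;
    notifyAll();
  }

  /** Consumer side: returns the next value, or null once the producer is done and the buffer is empty. */
  public synchronized String fetch() throws InterruptedException {
    while (queue.isEmpty() && !done && !abandoned)
      wait();
    if (abandoned)
      return null;
    String value = queue.poll(); // null here means "done and drained"
    notifyAll();
    return value;
  }

  /** Runs a producer thread over the given seeds and collects them on the calling thread. */
  static List<String> drain(final List<String> seeds) {
    final SeedBufferSketch buffer = new SeedBufferSketch();
    Thread producer = new Thread(() -> {
      try {
        for (String seed : seeds)
          buffer.add(seed);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        buffer.signalDone();
      }
    });
    producer.setDaemon(true);
    List<String> received = new ArrayList<>();
    try {
      producer.start();
      // Consume before joining, so a full buffer never stalls the producer.
      while (true) {
        String seed = buffer.fetch();
        if (seed == null)
          break;
        received.add(seed);
      }
      producer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while draining seeds", e);
    } finally {
      // Mirrors the finally block in the diff: make sure the producer cannot hang.
      buffer.abandon();
    }
    return received;
  }

  public static void main(String[] args) {
    System.out.println(drain(Arrays.asList("/", "/docs", "/photos")));
  }
}
```

With a single producer the values come out in the order they were added, and the consumer holds at most MAX_SIZE pending entries at a time rather than the whole result set.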

Modified: manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxSession.java
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxSession.java?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxSession.java (original)
+++ manifoldcf/trunk/connectors/dropbox/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/dropbox/DropboxSession.java Sun Jun  2 11:23:28 2013
@@ -23,6 +23,8 @@
  */
 package org.apache.manifoldcf.crawler.connectors.dropbox;
 
+import org.apache.manifoldcf.core.common.*;
+
 import com.dropbox.client2.session.AppKeyPair;
 import java.util.Map;
 import com.dropbox.client2.session.WebAuthSession;
@@ -73,23 +75,22 @@ public class DropboxSession {
     return info;
   }
 
-    public HashSet<String> getSeeds(String path, int max_dirs) throws DropboxException {
-        HashSet<String> ids = new HashSet<String>();
+  public void getSeeds(XThreadStringBuffer idBuffer, String path, int max_dirs)
+    throws DropboxException, InterruptedException {
 
-        ids.add(path); //need to add root dir so that single files such as /file1 will still get read
+    idBuffer.add(path); //need to add root dir so that single files such as /file1 will still get read
         
         
-        DropboxAPI.Entry root_entry = client.metadata(path, max_dirs, null, true, null);
-        List<DropboxAPI.Entry> entries = root_entry.contents; //gets a list of the contents of the entire folder: subfolders + files
+    DropboxAPI.Entry root_entry = client.metadata(path, max_dirs, null, true, null);
+    List<DropboxAPI.Entry> entries = root_entry.contents; //gets a list of the contents of the entire folder: subfolders + files
 
-        // Apply the entries one by one.
-        for (DropboxAPI.Entry e : entries) {
-            if (e.isDir) { //only add the directories as seeds, we'll add the files later
-                ids.add(e.path);
-            }
-        }
-        return ids;
+    // Apply the entries one by one.
+    for (DropboxAPI.Entry e : entries) {
+      if (e.isDir) { //only add the directories as seeds, we'll add the files later
+        idBuffer.add(e.path);
+      }
     }
+  }
   
   public DropboxAPI.Entry getObject(String id) throws DropboxException {
     return client.metadata(id, 25000, null, true, null);

Modified: manifoldcf/trunk/connectors/pom.xml
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/connectors/pom.xml?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/connectors/pom.xml (original)
+++ manifoldcf/trunk/connectors/pom.xml Sun Jun  2 11:23:28 2013
@@ -51,6 +51,7 @@
     <module>alfresco</module>
     <module>elasticsearch</module>
     <module>dropbox</module>
+    <module>googledrive</module>
   </modules>
 
 </project>

Propchange: manifoldcf/trunk/connectors/wiki/
------------------------------------------------------------------------------
  Merged /manifoldcf/branches/CONNECTORS-694/connectors/wiki:r1488166-1488663

Modified: manifoldcf/trunk/connectors/wiki/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/wiki/WikiConnector.java
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/connectors/wiki/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/wiki/WikiConnector.java?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/connectors/wiki/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/wiki/WikiConnector.java (original)
+++ manifoldcf/trunk/connectors/wiki/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/wiki/WikiConnector.java Sun Jun  2 11:23:28 2013
@@ -2145,7 +2145,7 @@ public class WikiConnector extends org.a
       try
       {
 	HttpRequestBase executeMethod = getInitializedGetMethod(getListPagesURL(startPageTitle,namespace,prefix));
-        PageBuffer pageBuffer = new PageBuffer();
+        XThreadStringBuffer pageBuffer = new XThreadStringBuffer();
         ExecuteListPagesThread t = new ExecuteListPagesThread(httpClient,executeMethod,pageBuffer,startPageTitle);
         try
         {
@@ -2275,12 +2275,12 @@ public class WikiConnector extends org.a
     protected HttpClient client;
     protected HttpRequestBase executeMethod;
     protected Throwable exception = null;
-    protected PageBuffer pageBuffer;
+    protected XThreadStringBuffer pageBuffer;
     protected String lastPageTitle = null;
     protected String startPageTitle;
     protected boolean loginNeeded = false;
 
-    public ExecuteListPagesThread(HttpClient client, HttpRequestBase executeMethod, PageBuffer pageBuffer, String startPageTitle)
+    public ExecuteListPagesThread(HttpClient client, HttpRequestBase executeMethod, XThreadStringBuffer pageBuffer, String startPageTitle)
     {
       super();
       setDaemon(true);
@@ -2361,7 +2361,7 @@ public class WikiConnector extends org.a
   *   </query-continue>
   * </api>
   */
-  protected static boolean parseListPagesResponse(InputStream is, PageBuffer buffer, String startPageTitle, ReturnString lastTitle)
+  protected static boolean parseListPagesResponse(InputStream is, XThreadStringBuffer buffer, String startPageTitle, ReturnString lastTitle)
     throws ManifoldCFException, ServiceInterruption
   {
     // Parse the document.  This will cause various things to occur, within the instantiated XMLContext class.
@@ -2393,11 +2393,11 @@ public class WikiConnector extends org.a
   protected static class WikiListPagesAPIContext extends SingleLevelContext
   {
     protected String lastTitle = null;
-    protected PageBuffer buffer;
+    protected XThreadStringBuffer buffer;
     protected String startPageTitle;
     protected boolean loginNeeded = false;
     
-    public WikiListPagesAPIContext(XMLStream theStream, PageBuffer buffer, String startPageTitle)
+    public WikiListPagesAPIContext(XMLStream theStream, XThreadStringBuffer buffer, String startPageTitle)
     {
       super(theStream,"api");
       this.buffer = buffer;
@@ -2434,11 +2434,11 @@ public class WikiConnector extends org.a
   protected static class WikiListPagesQueryContext extends SingleLevelErrorContext
   {
     protected String lastTitle = null;
-    protected PageBuffer buffer;
+    protected XThreadStringBuffer buffer;
     protected String startPageTitle;
     
     public WikiListPagesQueryContext(XMLStream theStream, String namespaceURI, String localName, String qName, Attributes atts,
-      PageBuffer buffer, String startPageTitle)
+      XThreadStringBuffer buffer, String startPageTitle)
     {
       super(theStream,namespaceURI,localName,qName,atts,"query");
       this.buffer = buffer;
@@ -2469,11 +2469,11 @@ public class WikiConnector extends org.a
   protected static class WikiListPagesAllPagesContext extends SingleLevelContext
   {
     protected String lastTitle = null;
-    protected PageBuffer buffer;
+    protected XThreadStringBuffer buffer;
     protected String startPageTitle;
     
     public WikiListPagesAllPagesContext(XMLStream theStream, String namespaceURI, String localName, String qName, Attributes atts,
-      PageBuffer buffer, String startPageTitle)
+      XThreadStringBuffer buffer, String startPageTitle)
     {
       super(theStream,namespaceURI,localName,qName,atts,"allpages");
       this.buffer = buffer;
@@ -2506,11 +2506,11 @@ public class WikiConnector extends org.a
   protected static class WikiListPagesPContext extends BaseProcessingContext
   {
     protected String lastTitle = null;
-    protected PageBuffer buffer;
+    protected XThreadStringBuffer buffer;
     protected String startPageTitle;
     
     public WikiListPagesPContext(XMLStream theStream, String namespaceURI, String localName, String qName, Attributes atts,
-      PageBuffer buffer, String startPageTitle)
+      XThreadStringBuffer buffer, String startPageTitle)
     {
       super(theStream,namespaceURI,localName,qName,atts);
       this.buffer = buffer;

Modified: manifoldcf/trunk/dist-license/LICENSE.txt
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/dist-license/LICENSE.txt?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/dist-license/LICENSE.txt (original)
+++ manifoldcf/trunk/dist-license/LICENSE.txt Sun Jun  2 11:23:28 2013
@@ -299,6 +299,27 @@ License: MIT license (http://opensource.
 This product includes a json-simple-1.1.jar.
 License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
 
+This product includes a jackson-core-2.1.3.jar.
+License: Dual license; we choose to distribute under Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-api-client-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-oauth-client-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-api-services-drive-v2-rev64-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-api-services-drive-v2-rev64-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-http-client-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-http-client-jackson2-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
 This product may include pdf files that embed IPA-licensed fonts.
 License: IPA Font License Agreement v1.0 (http://ossipedia.ipa.go.jp/ipafont/index.html#LicenseEng)
 

Modified: manifoldcf/trunk/framework/core/src/main/java/org/apache/manifoldcf/core/common/XThreadInputStream.java
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/framework/core/src/main/java/org/apache/manifoldcf/core/common/XThreadInputStream.java?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/framework/core/src/main/java/org/apache/manifoldcf/core/common/XThreadInputStream.java (original)
+++ manifoldcf/trunk/framework/core/src/main/java/org/apache/manifoldcf/core/common/XThreadInputStream.java Sun Jun  2 11:23:28 2013
@@ -26,20 +26,27 @@ import java.io.*;
 */
 public class XThreadInputStream extends InputStream
 {
-  private byte[] buffer = new byte[65536];
+  private final byte[] buffer = new byte[65536];
   private int startPoint = 0;
   private int byteCount = 0;
   private boolean streamEnd = false;
   private IOException failureException = null;
-  private InputStream sourceStream;
   private boolean abort = false;
-  
-  /** Constructor */
+
+  private final InputStream sourceStream;
+	
+  /** Constructor, from a given input stream. */
   public XThreadInputStream(InputStream sourceStream)
   {
     this.sourceStream = sourceStream;
   }
   
+  /** Constructor, from another source. */
+  public XThreadInputStream()
+  {
+    this.sourceStream = null;
+  }
+  
   /** Call this method to abort the stuffQueue() method.
   */
   public void abort()
@@ -51,8 +58,64 @@ public class XThreadInputStream extends 
     }
   }
   
+  /** This method is called from the helper thread side, to stuff bytes onto
+  * the queue when there is no input stream.
+  * It exits only when interrupted or done.
+  */
+  public void stuffQueue(byte[] byteBuffer, int offset, int amount)
+    throws InterruptedException
+  {
+    while (amount > 0)
+    {
+      int maxToRead;
+      int readStartPoint;
+      synchronized (this)
+      {
+        if (abort || streamEnd)
+          return;
+        // Calculate amount to read
+        maxToRead = buffer.length - byteCount;
+        if (maxToRead == 0)
+        {
+          wait();
+          continue;
+        }
+        readStartPoint = (startPoint + byteCount) & (buffer.length-1);
+      }
+      if (readStartPoint + maxToRead >= buffer.length)
+        maxToRead = buffer.length - readStartPoint;
+      // Now, copy to buffer
+      int amt;
+      if (amount > maxToRead)
+        amt = maxToRead;
+      else
+        amt = amount;
+      // System.arraycopy takes (source, sourcePos, target, targetPos, length): copy from the caller's array into the ring buffer
+      System.arraycopy(byteBuffer,offset,buffer,readStartPoint,amt);
+      offset += amt;
+      amount -= amt;
+      synchronized (this)
+      {
+        byteCount += amt;
+        notifyAll();
+      }
+    }
+  }
+  
+  /** Call this method when there is no more data to write.
+  */
+  public void doneStuffingQueue()
+  {
+    synchronized (this)
+    {
+      streamEnd = true;
+      notifyAll();
+    }
+  }
+  
   /** This method is called from the helper thread side, to keep the queue
-  * stuffed.  It exits when the stream is empty, or when interrupted.
+  * stuffed from the input stream.
+  * It exits when the stream is empty, or when interrupted.
   */
   public void stuffQueue()
     throws IOException, InterruptedException
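
The byte-array stuffQueue() overload added above relies on a power-of-two ring buffer, where `(startPoint + byteCount) & (buffer.length-1)` gives the next write position and each copy is clipped at the physical end of the array. The following is a minimal, self-contained sketch of that technique only. It is not the ManifoldCF class: the names RingBufferSketch, stuff, done, and roundTrip are hypothetical, and the buffer is deliberately tiny so that wrap-around actually occurs.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; NOT the real XThreadInputStream.
class RingBufferSketch {
  // Capacity must be a power of two so "& (length - 1)" acts as modulo.
  private final byte[] buffer = new byte[8];
  private int startPoint = 0;   // index of the oldest unread byte
  private int byteCount = 0;    // number of unread bytes in the buffer
  private boolean streamEnd = false;

  /** Producer side: copies bytes in, blocking while the buffer is full. */
  synchronized void stuff(byte[] src, int offset, int amount) throws InterruptedException {
    while (amount > 0) {
      if (streamEnd) return;
      int free = buffer.length - byteCount;
      if (free == 0) { wait(); continue; }
      int writePoint = (startPoint + byteCount) & (buffer.length - 1);
      // Never copy past the physical end of the array; the next pass wraps to index 0.
      int amt = Math.min(amount, Math.min(free, buffer.length - writePoint));
      System.arraycopy(src, offset, buffer, writePoint, amt);
      offset += amt;
      amount -= amt;
      byteCount += amt;
      notifyAll();
    }
  }

  /** Producer side: signals that no more data will arrive. */
  synchronized void done() { streamEnd = true; notifyAll(); }

  /** Consumer side: returns the next byte, or -1 once the stream has ended and is drained. */
  synchronized int read() throws InterruptedException {
    while (byteCount == 0) {
      if (streamEnd) return -1;
      wait();
    }
    int b = buffer[startPoint] & 0xff;
    startPoint = (startPoint + 1) & (buffer.length - 1);
    byteCount--;
    notifyAll();
    return b;
  }

  /** Pushes text through the ring from a helper thread and reads it back on the caller's thread. */
  static String roundTrip(String text) {
    final RingBufferSketch ring = new RingBufferSketch();
    final byte[] data = text.getBytes(StandardCharsets.UTF_8);
    Thread producer = new Thread(() -> {
      try { ring.stuff(data, 0, data.length); }
      catch (InterruptedException e) { Thread.currentThread().interrupt(); }
      finally { ring.done(); }
    });
    producer.start();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      int b;
      while ((b = ring.read()) != -1) out.write(b);
      producer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while reading", e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

Unlike the committed code, this sketch holds the monitor during the copy for brevity; the real method copies outside the synchronized block and only updates byteCount under the lock.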

Modified: manifoldcf/trunk/lib-license/LICENSE.txt
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/lib-license/LICENSE.txt?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/lib-license/LICENSE.txt (original)
+++ manifoldcf/trunk/lib-license/LICENSE.txt Sun Jun  2 11:23:28 2013
@@ -293,6 +293,33 @@ License: Common Development and Distribu
 This product includes a jstl-impl-1.2.jar.
 License: Common Development and Distribution License (CDDL) v1.0 (https://glassfish.dev.java.net/public/CDDLv1.0.html)
 
+This product includes a dropbox-client-1.5.3.jar.
+License: MIT license (http://opensource.org/licenses/MIT).
+
+This product includes a json-simple-1.1.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a jackson-core-2.1.3.jar.
+License: Dual license; we choose to distribute under Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-api-client-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-oauth-client-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-api-services-drive-v2-rev64-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-api-services-drive-v2-rev64-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-http-client-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
+This product includes a google-http-client-jackson2-1.14.1-beta.jar.
+License: Apache 2 (http://www.apache.org/licenses/LICENSE-2.0.txt)
+
 This product may include pdf files that embed IPA-licensed fonts.
 License: IPA Font License Agreement v1.0 (http://ossipedia.ipa.go.jp/ipafont/index.html#LicenseEng)
 

Modified: manifoldcf/trunk/site/src/documentation/content/xdocs/en_US/end-user-documentation.xml
URL: http://svn.apache.org/viewvc/manifoldcf/trunk/site/src/documentation/content/xdocs/en_US/end-user-documentation.xml?rev=1488664&r1=1488663&r2=1488664&view=diff
==============================================================================
--- manifoldcf/trunk/site/src/documentation/content/xdocs/en_US/end-user-documentation.xml (original)
+++ manifoldcf/trunk/site/src/documentation/content/xdocs/en_US/end-user-documentation.xml Sun Jun  2 11:23:28 2013
@@ -1708,8 +1708,64 @@ curl -XGET http://localhost:9200/index/_
               <figure src="images/en_US/dropbox-repository-connection-job-save.PNG" alt="CMIS Repository Connection, saving job" width="80%"/>
               <br/><br/>
             </section>
-            
-            <section id="livelinkrepository">
+			
+            <section id="googledriverepository">
+              <title>Google Drive Repository Connection</title>
+              <p>The Google Drive Repository Connection type allows you to index content from <a href="https://drive.google.com">Google Drives</a>.</p>
+              <p>Each Google Drive Connection manages access to a single drive repository. This means that if you have multiple Google Drives (i.e. different users),
+                you need to create a specific connection for each drive repository and provide the associated authentication information.</p>
+              <br/>
+              <p>A Google Drive connection has the following configuration parameters on the repository connection editing screen:</p>
+              <br/><br/>
+              <figure src="images/en_US/googledrive-repository-connection-configuration.PNG" alt="googledrive Repository Connection, configuration parameters" width="80%"/>
+              <br/><br/>
+              <p>As we can see, three pieces of information are needed to create a successful connection. The Client ID and Client Secret are given by Google 
+                when you register your application for a development license. This is typically done through the <a href="https://code.google.com/apis/console/b/0/">Google APIs Console</a>.</p>
+              <br/><br/>
+              <figure src="images/en_US/googledrive-repository-setup-1.PNG" alt="googledrive create project" width="80%"/>
+              <br/><br/>
+			  <p>Once the project has been created, we must enable the Google Drive API:</p>
+			  <br/><br/>
+              <figure src="images/en_US/googledrive-repository-setup-2.PNG" alt="googledrive enable drive api" width="80%"/>
+              <br/><br/>
+			  <p>Then, following the API Access link on the right-hand side, we need to choose to create an OAuth 2.0 client ID:</p>
+			  <br/><br/>
+              <figure src="images/en_US/googledrive-repository-setup-3.PNG" alt="googledrive create oauth client" width="80%"/>
+              <br/><br/>
+			  
+			  <p>After filling in the necessary information, we need to select the type of application. For our purposes, we need to select "Installed application":</p>
+			  <br/><br/>
+              <figure src="images/en_US/googledrive-repository-setup-4.PNG" alt="googledrive create client id" width="80%"/>
+              <br/><br/>
+			  
+			  <p>Afterwards we are presented with the Client ID and Client Secret needed for the connector (marked by the red boxes):</p>
+			  <br/><br/>
+              <figure src="images/en_US/googledrive-repository-setup-5.PNG" alt="googledrive client id and secret" width="80%"/>
+              <br/><br/>
+			  
+             <p>Now each user must confirm that your application is allowed to access their Google Drive. This is done through a standard OAuth
+                approach, but needs to be done beforehand. Once the steps are completed, a long-lived refresh token is presented, which is then used by the connector. For completeness, we present the needed steps below, since they require some manual work.</p>
+				<br/><br/><p>
+				<ol>
+				<li> Browse to: https://accounts.google.com/o/oauth2/auth?scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.readonly&amp;state=%2Fprofile&amp;redirect_uri=https%3A%2F%2Flocalhost&amp;response_type=code&amp;client_id=CLIENT_ID&amp;approval_prompt=force&amp;access_type=offline</li>
+				<li> After acceptance, this redirects to a link of the form: https://localhost/?state=/profile&amp;code=CODE</li>
+				<li> Perform a POST to https://accounts.google.com/o/oauth2/token with the following as the body: grant_type=authorization_code&amp;redirect_uri=https%3A%2F%2Flocalhost&amp;client_secret=CLIENT_SECRET&amp;client_id=CLIENT_ID&amp;code=CODE</li>
+				<li> The response is JSON and contains the refresh_token.</li>
+				</ol>
+				</p>
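
The manual steps above can also be scripted. The sketch below builds the authorization URL from step 1 and the form body from step 3, then posts that body to the token endpoint. The endpoints, scope, state, and redirect URI are taken from the steps above; the class name DriveOAuthSketch and its method names are hypothetical, and the raw JSON response is returned unparsed, so extracting refresh_token is left to the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the connector.
class DriveOAuthSketch {
  static final String AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/auth";
  static final String TOKEN_ENDPOINT = "https://accounts.google.com/o/oauth2/token";
  static final String SCOPE = "https://www.googleapis.com/auth/drive.readonly";
  static final String REDIRECT_URI = "https://localhost";

  // URL-encodes a single parameter value as UTF-8.
  private static String enc(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  /** Step 1: the URL the user opens in a browser to grant access. */
  static String authorizationUrl(String clientId) {
    return AUTH_ENDPOINT
        + "?scope=" + enc(SCOPE)
        + "&state=" + enc("/profile")
        + "&redirect_uri=" + enc(REDIRECT_URI)
        + "&response_type=code"
        + "&client_id=" + enc(clientId)
        + "&approval_prompt=force&access_type=offline";
  }

  /** Step 3: the form body that exchanges the returned code for tokens. */
  static String tokenRequestBody(String clientId, String clientSecret, String code) {
    return "grant_type=authorization_code"
        + "&redirect_uri=" + enc(REDIRECT_URI)
        + "&client_secret=" + enc(clientSecret)
        + "&client_id=" + enc(clientId)
        + "&code=" + enc(code);
  }

  /** Steps 3 and 4: POSTs the body and returns the raw JSON response text. */
  static String exchangeCode(String clientId, String clientSecret, String code) throws IOException {
    byte[] body = tokenRequestBody(clientId, clientSecret, code).getBytes(StandardCharsets.UTF_8);
    HttpURLConnection conn = (HttpURLConnection) new URL(TOKEN_ENDPOINT).openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body);
    }
    try (InputStream in = conn.getInputStream()) {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      byte[] chunk = new byte[4096];
      int n;
      while ((n = in.read(chunk)) != -1) buf.write(chunk, 0, n);
      return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }
  }
}
```

The code value for exchangeCode() is the `code` query parameter copied from the redirect link in step 2.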
+			 <br/><br/>
+              <p>After you click the "Save" button, you will see a connection summary screen, which might look something like this:</p>
+              <br/><br/>
+              <figure src="images/en_US/googledrive-repository-connection-configuration-save.PNG" alt="googledrive Repository Connection, saving configuration" width="80%"/>
+              <br/><br/>
+              <p>When you configure a job to use the Google Drive repository connection, an additional tab is presented. This is the "Google Drive Seed Query" tab:</p>
+              <br/><br/>
+              <figure src="images/en_US/googledrive-repository-connection-job-googledrive-seed-query.PNG" alt="googledrive Repository Connection, seed query" width="80%"/>
+              <br/><br/>
+              <p>This tab allows you to specify the query which will be used to seed documents for the indexing process. The query language is specified on the <a href="https://developers.google.com/drive/search-parameters">Drive Search Parameters</a> site. Directories which match the seed query are fully crawled, as the query only applies to seeds. The default query indexes the entire drive. Lastly, native Google documents such as spreadsheets and word-processing documents are exported to PDF and then ingested.</p>
+            </section>
+			
+			<section id="livelinkrepository">
                 <title>OpenText LiveLink Repository Connection</title>
                 <p>The LiveLink connection type allows you to index content from LiveLink repositories.  LiveLink has a rich variety of different document types and metadata,
                     which include basic documents, as well as compound documents, folders, workspaces, and projects.  A LiveLink connection is able to discover documents