Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2012/07/17 10:47:36 UTC

[jira] [Commented] (CONNECTORS-492) SharePoint connector on SP2010 throws exception when there are too many documents in a library

    [ https://issues.apache.org/jira/browse/CONNECTORS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416016#comment-13416016 ] 

Karl Wright commented on CONNECTORS-492:
----------------------------------------

The following link: http://social.technet.microsoft.com/Forums/en-US/sharepoint2010programming/thread/51a2b8e8-8478-4559-aa49-e21e2f7a2d90/

... has the following explanation:

"It's rather pathetic that the only real solution to the problem is to effectively disable thresholding; trying to use any list under the threshold more than just minimally is almost impossible.

Read the last section of this link (How Does Indexing Affect Throttling?) for an explanation as to the underlying problem.  Basically, in order to query a list with more items than the throttle limit you need to filter on a field that is indexed, and the results of that filter need to be less than the throttle limit.  If the field isn't indexed, or if the first field that is filtered on (by itself) doesn't put you below the limit then you can't perform the query.  The link explains how setting the rowlimit to something below the throttle limit doesn't actually help get around it in all but a few cases."

The link points you to: http://msdn.microsoft.com/en-us/library/ff798465.aspx

Our problem is that, although I believe we are properly sorting using an indexed column (ID), we cannot use rowlimit to control the size of the resultset.  It's not clear whether there is any good alternative, however.
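
For illustration only, here is a minimal sketch of the constraint described in the quote above: filter on an indexed column so that each query, by itself, matches fewer rows than the threshold. It builds CAML queries over fixed-size ID windows. This is not what the connector currently does, and it is not a confirmed fix. The class name IdWindowQuery, both method names, and the window size are hypothetical. The sketch assumes the largest ID in the list can be learned some other way, and that the Lists web service applies the threshold to the filtered result as the quote says. Because IDs can have gaps, a window of N IDs matches at most N items.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ManifoldCF: splits a list into ID windows
// so that each query filters on the indexed ID column and matches no more
// rows than the chosen window size.
public class IdWindowQuery {

    // Builds a CAML query matching startInclusive <= ID < endExclusive,
    // ordered by ID. "Counter" is the CAML value type of the ID column.
    public static String buildQuery(long startInclusive, long endExclusive) {
        return "<Query><Where><And>"
            + "<Geq><FieldRef Name=\"ID\"/><Value Type=\"Counter\">"
            + startInclusive + "</Value></Geq>"
            + "<Lt><FieldRef Name=\"ID\"/><Value Type=\"Counter\">"
            + endExclusive + "</Value></Lt>"
            + "</And></Where>"
            + "<OrderBy><FieldRef Name=\"ID\" Ascending=\"TRUE\"/></OrderBy>"
            + "</Query>";
    }

    // Returns [start, end) pairs covering IDs 1..maxId in steps of windowSize.
    // windowSize should be chosen below the administrator's threshold
    // (assumption: the caller knows or guesses that threshold).
    public static List<long[]> windows(long maxId, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        List<long[]> result = new ArrayList<long[]>();
        for (long start = 1; start <= maxId; start += windowSize) {
            result.add(new long[] { start, start + windowSize });
        }
        return result;
    }
}
```

Each query string would be passed as the query argument of the getListItems call, one window at a time. Whether SP2010 accepts this without raising the same fault would need to be tested against a real list.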


                
> SharePoint connector on SP2010 throws exception when there are too many documents in a library
> ----------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-492
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-492
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: SharePoint connector
>    Affects Versions: ManifoldCF 0.7
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 0.7
>
>
> When a library contains more documents than the list view threshold set by the administrator, no documents for the library are crawled. Instead, the following exception is thrown:
> {code}
> DEBUG 2012-07-16 23:58:04,036 (Worker thread '19') - Mapping Exception to AxisFault
> AxisFault
>  faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server
>  faultSubcode: 
>  faultString: Exception of type 'Microsoft.SharePoint.SoapServer.SoapServerException' was thrown.
>  faultActor: 
>  faultNode: 
>  faultDetail: 
> 	{http://schemas.microsoft.com/sharepoint/soap/}errorstring:The attempted operation is prohibited because it exceeds the list view threshold enforced by the administrator.
> 	{http://schemas.microsoft.com/sharepoint/soap/}errorcode:0x80070024
> Exception of type 'Microsoft.SharePoint.SoapServer.SoapServerException' was thrown.
> 	at org.apache.axis.message.SOAPFaultBuilder.createFault(SOAPFaultBuilder.java:222)
> 	at org.apache.axis.message.SOAPFaultBuilder.endElement(SOAPFaultBuilder.java:129)
> 	at org.apache.axis.encoding.DeserializationContext.endElement(DeserializationContext.java:1087)
> 	at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
> 	at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown Source)
> 	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
> 	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
> 	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> 	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> 	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
> 	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
> 	at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
> 	at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
> 	at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
> 	at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
> 	at org.apache.axis.Message.getSOAPEnvelope(Message.java:435)
> 	at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUnderstandChecker.java:62)
> 	at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
> 	at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
> 	at org.apache.axis.client.Call.invoke(Call.java:2767)
> 	at org.apache.axis.client.Call.invoke(Call.java:2443)
> 	at org.apache.axis.client.Call.invoke(Call.java:2366)
> 	at org.apache.axis.client.Call.invoke(Call.java:1812)
> 	at com.microsoft.schemas.sharepoint.soap.ListsSoapStub.getListItems(ListsSoapStub.java:1841)
> 	at org.apache.manifoldcf.crawler.connectors.sharepoint.SPSProxyHelper.getDocuments(SPSProxyHelper.java:629)
> 	at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:909)
> 	at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
> 	at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:561)
> DEBUG 
> {code}
