Posted to dev@manifoldcf.apache.org by Farzad Valad <ho...@farzad.net> on 2011/06/07 16:20:31 UTC

Aborting State

Lately, when I issue an abort on a crawl job (click Abort in the UI), it 
gets stuck: the UI doesn't show any new info on subsequent refreshes.  
It just says Aborting, with the start time, no end time, and the number 
of documents, active, and processed.  I restarted Tomcat, but the job 
was still stuck in the Aborting state.  Restarting the Agent process 
doesn't have any effect.  But if you kill the agent process, issue a 
lock clean, and then start the Agent process, it shows an Error in the 
Status column, but still no end time.  Ironically, this time the 
problem was a bad transaction ID.  The last time it was a connection 
refusal to my repository.  Thoughts?

PS.  On the previous problem, you were right: dataManager is going null 
for some reason.  I actually ran into this one while debugging the 
dataManager issue : )

ERROR 2011-06-07 08:50:01,416 [Worker thread '64'] (CacheManager.java:621) - Thread[Worker thread '64',5,main]: invalidateKeys: 1307454600471: org.apache.manifoldcf.core.cachemanager.CacheManager@39d7af3: Transaction hash = {}
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
     at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
     at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
     at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
     at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
     at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
     at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
ERROR 2011-06-07 08:50:01,510 [Worker thread '64'] (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
     at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
     at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
     at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
     at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
     at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
     at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)


Re: Aborting State

Posted by Karl Wright <da...@gmail.com>.
The cross-thread issues you were having with your connector would
certainly have affected database access in a significant way, so this
symptom could well be one result of that problem.
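
To make the kind of cross-thread interference concrete, here is a minimal
sketch.  It assumes, as comes up elsewhere in this thread, that the connector
kept its dataManager in a static field.  The class names are hypothetical
stand-ins rather than ManifoldCF classes, a StringBuilder stands in for the
real DataManager, and two connector instances used in sequence stand in for
two worker threads.  It only reproduces the "dataManager is null" symptom,
not the bad transaction ID itself.

```java
// Hypothetical connector that (mistakenly) keeps context-bound state in a static field.
class StaticFieldConnector {
    // Shared by every instance: this is the hazard being illustrated.
    private static StringBuilder dataManager;

    void setThreadContext() {
        dataManager = new StringBuilder();
    }

    void clearThreadContext() {
        // Nulls the field for ALL instances, not just this one.
        dataManager = null;
    }

    void addOrReplaceDocument(String documentURI) {
        // Throws NullPointerException if another instance has cleared the field.
        dataManager.append(documentURI);
    }
}

class StaticFieldDemo {
    // Returns true if the second connector fails because the first one cleared the shared field.
    static boolean secondConnectorBreaks() {
        StaticFieldConnector first = new StaticFieldConnector();
        StaticFieldConnector second = new StaticFieldConnector();
        first.setThreadContext();
        second.setThreadContext();
        first.clearThreadContext();   // the first instance is done with its work
        try {
            second.addOrReplaceDocument("doc");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second connector broke: " + secondConnectorBreaks());
    }
}
```

Making the field an instance field removes this particular interaction,
because each connector instance then clears only its own copy.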

Karl


Re: Data Manager Null

Posted by Karl Wright <da...@gmail.com>.
Sorry, I wasn't clear.

The dataManager initialization can certainly occur in every
setThreadContext method call because that's when you have the
threadContext available.  But as long as you save the thread context
when setThreadContext is called, you can certainly do it in
getSession() instead - perfectly OK.  But setting up dataManager at
setThreadContext time is a fine way to go and ties this occurrence
directly to setThreadContext, which is probably a bit better.
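
A minimal sketch of that first option, assuming nothing about the real
ManifoldCF types: SketchThreadContext and SketchDataManager are hypothetical
stand-ins for the framework's thread context and for the connector's own
DataManager, and the method signatures are simplified.  The point it shows is
only that the context-bound object is created in setThreadContext, dropped in
clearThreadContext, and held in an instance field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the framework's per-thread context object.
final class SketchThreadContext {
    final String owner;
    SketchThreadContext(String owner) { this.owner = owner; }
}

// Hypothetical stand-in for a DataManager that is bound to one thread context.
final class SketchDataManager {
    private final SketchThreadContext context;
    private final List<String> rows = new ArrayList<>();

    SketchDataManager(SketchThreadContext context) { this.context = context; }

    void insertData(String row) { rows.add(context.owner + ":" + row); }

    List<String> rows() { return rows; }
}

// Connector sketch: dataManager lives exactly as long as the thread context does.
class ContextBoundConnector {
    private SketchThreadContext currentContext;   // instance field, not static
    private SketchDataManager dataManager;        // instance field, not static

    void setThreadContext(SketchThreadContext tc) {
        currentContext = tc;
        dataManager = new SketchDataManager(tc);  // built from the new context
    }

    void clearThreadContext() {
        dataManager = null;                       // drop anything tied to the old context
        currentContext = null;
    }

    void addOrReplaceDocument(String documentURI) {
        if (dataManager == null) {
            throw new IllegalStateException("no thread context set");
        }
        dataManager.insertData(documentURI);
    }

    SketchDataManager dataManager() { return dataManager; }
}

class ContextBoundConnectorDemo {
    public static void main(String[] args) {
        ContextBoundConnector connector = new ContextBoundConnector();
        connector.setThreadContext(new SketchThreadContext("worker-1"));
        connector.addOrReplaceDocument("doc-a");
        connector.addOrReplaceDocument("doc-b");
        System.out.println(connector.dataManager().rows());
        connector.clearThreadContext();
    }
}
```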

Now, getSession() is typically used for a different purpose - namely
to set up a connection to your target repository which has some
lifetime.  Thus if (say) you need an HttpClient object, creating that
object (or grabbing it from some HttpClient pool) probably should take
place in getSession().  Since we've stipulated that the HttpClient
object should be released back into the pool when idle, you should
also keep track of the time (in getSession()) when the HttpClient was
last needed, so you know when it should be released during the poll()
operation.
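
The getSession()/poll() pattern described above can be sketched roughly as
follows.  This is an illustration under assumptions, not ManifoldCF code:
PooledSession is a hypothetical stand-in for something like a pooled
HttpClient, the 60-second idle timeout is an arbitrary value, and the clock is
injected only so the idle logic can be exercised without waiting.

```java
import java.util.function.LongSupplier;

// Hypothetical stand-in for an expensive resource such as a pooled HTTP client.
final class PooledSession {
    boolean open = true;
    void release() { open = false; }
}

class SessionTrackingConnector {
    private static final long IDLE_TIMEOUT_MS = 60_000L;  // assumed value

    private final LongSupplier clock;   // returns the current time in milliseconds
    private PooledSession session;      // null while no session is held
    private long lastUsedMs;
    int sessionsCreated = 0;

    SessionTrackingConnector(LongSupplier clock) { this.clock = clock; }

    // Create the session on demand and remember when it was last needed.
    PooledSession getSession() {
        if (session == null) {
            session = new PooledSession();
            sessionsCreated++;
        }
        lastUsedMs = clock.getAsLong();
        return session;
    }

    // Called periodically; releases the session once it has been idle long enough.
    void poll() {
        if (session != null && clock.getAsLong() - lastUsedMs >= IDLE_TIMEOUT_MS) {
            session.release();
            session = null;
        }
    }

    boolean hasSession() { return session != null; }

    void addOrReplaceDocument(String documentURI) {
        PooledSession s = getSession();
        if (!s.open) {
            throw new IllegalStateException("session was released");
        }
        // A real connector would send documentURI through the session here.
    }
}

class SessionTrackingDemo {
    public static void main(String[] args) {
        long[] now = {0L};
        SessionTrackingConnector connector = new SessionTrackingConnector(() -> now[0]);
        connector.addOrReplaceDocument("doc-a");
        now[0] = 90_000L;      // pretend 90 seconds pass with no activity
        connector.poll();
        System.out.println("session held after idle poll: " + connector.hasSession());
    }
}
```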

But I was basically making the case that setThreadContext() operates
on a different time scale than getSession().  setThreadContext() is
tied to when the class instance is pulled out of the class instance
pool, but there is nothing in the contract that says that ManifoldCF
can't call multiple connector methods with the same thread.  So a
connector that expects setThreadContext() to be called before every
addOrReplaceDocument() is making a mistake.  setThreadContext() will
only be called when the connector instance is pulled from the
connector instance pool, which must occur before addOrReplaceDocument()
is called, but many, many connector method invocations may occur after
that single call.
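
A rough model of that call pattern, under loud assumptions: SketchConnectorPool
is a hypothetical, much simplified stand-in for the connector instance pool,
and a StringBuilder stands in for whatever context-bound helper the connector
needs.  The connector saves the thread context when it is set and builds its
helper lazily in getSession(), so it works however many methods are called per
checkout.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class LazyHelperConnector {
    private Object threadContext;   // saved in setThreadContext
    private StringBuilder helper;   // hypothetical context-bound helper, built lazily
    int setCalls = 0;
    int helperBuilds = 0;
    int documents = 0;

    void setThreadContext(Object tc) {
        threadContext = tc;
        setCalls++;
    }

    void clearThreadContext() {
        helper = null;
        threadContext = null;
    }

    // Builds the helper on first use after each setThreadContext call.
    private StringBuilder getSession() {
        if (threadContext == null) {
            throw new IllegalStateException("connector used outside a checkout");
        }
        if (helper == null) {
            helper = new StringBuilder();
            helperBuilds++;
        }
        return helper;
    }

    void addOrReplaceDocument(String documentURI) {
        getSession().append(documentURI).append('\n');
        documents++;
    }
}

// Hypothetical, simplified model of the instance pool: set on grab, clear on release.
class SketchConnectorPool {
    private final Deque<LazyHelperConnector> idle = new ArrayDeque<>();

    LazyHelperConnector grab(Object tc) {
        LazyHelperConnector c = idle.isEmpty() ? new LazyHelperConnector() : idle.pop();
        c.setThreadContext(tc);
        return c;
    }

    void release(LazyHelperConnector c) {
        c.clearThreadContext();
        idle.push(c);
    }
}

class SketchConnectorPoolDemo {
    public static void main(String[] args) {
        SketchConnectorPool pool = new SketchConnectorPool();
        LazyHelperConnector c = pool.grab(new Object());
        for (int i = 0; i < 3; i++) {
            c.addOrReplaceDocument("doc-" + i);   // several calls, one setThreadContext
        }
        pool.release(c);
        System.out.println("setThreadContext calls: " + c.setCalls + ", documents: " + c.documents);
    }
}
```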

The book describes this pictorially in figure 6.3.  You just need to
imagine that every time a thread obtains a connection handle it may
well hold onto it for an extended period of time.

Karl

On Tue, Jun 7, 2011 at 8:49 PM, Farzad Valad <ho...@farzad.net> wrote:
> I think I get it now why you said put getSession in addOrReplaceDocument.
>  This way you construct the dataManager when you need it as opposed to on
> each set and clear pair : )
>
> On 6/7/2011 5:10 PM, Farzad Valad wrote:
>>
>> I don't fully understand how a connector instance can be used by multiple
>> threads without each thread calling setThread.  Here is what I think I know.
>>
>> The contract does say that addOrReplaceDocument is only called after
>> setTC, right?  Because you first have to have connection handle before a
>> particular manifold thread can use it.  So if I create my dataManager in
>> set, all will be well in addOrReplaceDocument.  The other caveat is that
>> I'll make dataManager an instance variable instead of static.  So each object
>> would have its own instance with its TC, and in clearTC they'd be nulling
>> their version and not anyone else's.
>>
>> Do I get it?
>>
>> On 6/7/2011 5:00 PM, Karl Wright wrote:
>>>
>>> The recommendation to have getSession be called in
>>> addOrReplaceDocument is because there is nothing in the contract which
>>> states that the connector instance will switch threads between calls.
>>> Therefore there is no guarantee that
>>> clearThreadContext/setThreadContext will be called right prior to
>>> addOrReplaceDocument.  The two aspects of the interface are therefore
>>> independent of one another, and it would be poor coding to presume
>>> that you could assume something in the contract that was not there.
>>>
>>> Karl
>>>
>>> On Tue, Jun 7, 2011 at 5:52 PM, Farzad Valad<ho...@farzad.net>  wrote:
>>>>
>>>> Thanks for the confirmation.  So if I have code to set dataManager to
>>>> null in clearThreadContext and create a dataManager in setThreadContext,
>>>> why do I need the getSession method in the addOrReplaceDocument method?
>>>> From what I learnt about the ManifoldCF architecture, setThreadContext
>>>> will get called before addOrReplaceDocument.  This was something you
>>>> recommended when I was asking about the third party repository.
>>>>
>>>> Farzad.
>>>>
>>>> On 6/7/2011 4:35 PM, Karl Wright wrote:
>>>>>
>>>>> It sounds like you are on the right track for fixing all of these
>>>>> problems.
>>>>>
>>>>> Karl
>>>>>
>>>>> On Tue, Jun 7, 2011 at 4:38 PM, Farzad Valad<ho...@farzad.net>
>>>>>  wrote:
>>>>>>
>>>>>> I think I found the problem.  I should be tearing down the dataManager
>>>>>> and recreating it between clear and set thread context calls, because it
>>>>>> has a thread context.  I'm not doing that.  I guess I did learn something
>>>>>> reading : ) let me know if you believe otherwise.  Also do you think this
>>>>>> is why the bad transaction id is happening?  Thanks!
>>>>>>
>>>>>>     IDBInterface databaseHandle = DBInterfaceFactory.make(currentContext,
>>>>>>         ManifoldCF.getMasterDatabaseName(),
>>>>>>         ManifoldCF.getMasterDatabaseUsername(),
>>>>>>         ManifoldCF.getMasterDatabasePassword());
>>>>>>     dataManager = new DataManager(currentContext, databaseHandle);
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 6/7/2011 12:42 PM, Farzad Valad wrote:
>>>>>>>
>>>>>>> So I think I figured it out.  For some reason I'm getting a db error,
>>>>>>> bad transaction id, which then kills my dataManager object, or I should
>>>>>>> say the framework is setting it to null.  What does a Bad transaction ID
>>>>>>> mean?  Thoughts?  This happened after I did a LockClean and restarted
>>>>>>> both the agent and Tomcat.  Thanks, Farzad.
>>>>>>>
>>>>>>> ERROR 2011-06-07 11:44:56,365 [Worker thread '90'] (CacheManager.java:621) - Thread[Worker thread '90',5,main]: invalidateKeys: 1307465096157: org.apache.manifoldcf.core.cachemanager.CacheManager@13b0c258: Transaction hash = {1307465096144=org.apache.manifoldcf.core.cachemanager.CacheManager$CacheTransactionHandle@39a72981}
>>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
>>>>>>>    at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>>>>>    at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>>>>>    at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>>>>>    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>>>>>    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>>>>>    at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>>>>>    at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>>>>>    at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>>>>    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>>>>    at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>>>>    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>>>>    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>>>>> FATAL 2011-06-07 11:44:56,583 [Worker thread '32'] (DupFinderConnector.java:155) - DATAMANAGER IS NULL!!!!
>>>>>>> ERROR 2011-06-07 11:44:56,599 [Worker thread '90'] (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
>>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
>>>>>>>    at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>>>>>    at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>>>>>    at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>>>>>    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>>>>>    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>>>>>    at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>>>>>    at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>>>>>    at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>>>>    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>>>>    at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>>>>    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>>>>    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>>>>> FATAL 2011-06-07 11:44:56,614 [Worker thread '32'] (WorkerThread.java:955) - Error tossed: null
>>>>>>> java.lang.NullPointerException
>>>>>>>    at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>>>>    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>>>>    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>>>>    at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>>>>    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>>>>    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>>>>>  INFO 2011-06-07 11:44:56,645 [Worker thread '92'] (DupFinderConnector.java:251) - Attempting to initialize dataManager(null) and ciConnector(null)

Re: Data Manager Null

Posted by Farzad Valad <ho...@farzad.net>.
I think I get it now why you said put getSession in 
addOrReplaceDocument.  This way you construct the dataManager when you 
need it as opposed to on each set and clear pair : )

On 6/7/2011 5:10 PM, Farzad Valad wrote:
> I don't fully understand how a connector instance can be used by 
> multiple threads without each thread calling setThread.  Here is what 
> I think I know.
>
> The contract does say that addOrReplaceDocument is only called after 
> setTC, right?  Because you first have to have connection handle before 
> a particular manifold thread can use it.  So if I create my 
> dataManager in set, all will be well in addOrReplaceDocument.  The 
> other caveat is that I'll make dataManager an instance variable instead 
> of static.  So each object would have its own instance with its TC, 
> and in clearTC they'd be nulling their version and not anyone else's.
>
> Do I get it?
>
> On 6/7/2011 5:00 PM, Karl Wright wrote:
>> The recommendation to have getSession be called in
>> addOrReplaceDocument is because there is nothing in the contract which
>> states that the connector instance will switch threads between calls.
>> Therefore there is no guarantee that
>> clearThreadContext/setThreadContext will be called right prior to
>> addOrReplaceDocument.  The two aspects of the interface are therefore
>> independent of one another, and it would be poor coding to presume
>> that you could assume something in the contract that was not there.
>>
>> Karl
>>
>> On Tue, Jun 7, 2011 at 5:52 PM, Farzad Valad<ho...@farzad.net>  wrote:
>>> Thanks for the confirmation.  So if I have code to set dataManager to
>>> null in clearThreadContext and create a dataManager in setThreadContext,
>>> why do I need the getSession method in the addOrReplaceDocument method?
>>> From what I learnt about the ManifoldCF architecture, setThreadContext
>>> will get called before addOrReplaceDocument.  This was something you
>>> recommended when I was asking about the third party repository.
>>>
>>> Farzad.
>>>
>>> On 6/7/2011 4:35 PM, Karl Wright wrote:
>>>> It sounds like you are on the right track for fixing all of these
>>>> problems.
>>>>
>>>> Karl
>>>>
>>>> On Tue, Jun 7, 2011 at 4:38 PM, Farzad Valad<ho...@farzad.net>    
>>>> wrote:
>>>>> I think I found the problem.  I should be tearing down the dataManager
>>>>> and recreating it between clear and set thread context calls, because it
>>>>> has a thread context.  I'm not doing that.  I guess I did learn something
>>>>> reading : ) let me know if you believe otherwise.  Also do you think this
>>>>> is why the bad transaction id is happening?  Thanks!
>>>>>
>>>>>     IDBInterface databaseHandle = DBInterfaceFactory.make(currentContext,
>>>>>         ManifoldCF.getMasterDatabaseName(),
>>>>>         ManifoldCF.getMasterDatabaseUsername(),
>>>>>         ManifoldCF.getMasterDatabasePassword());
>>>>>     dataManager = new DataManager(currentContext, databaseHandle);
>>>>>
>>>>>
>>>>>
>>>>> On 6/7/2011 12:42 PM, Farzad Valad wrote:
>>>>>> So I think I figured it out.  For some reason I'm getting a db error,
>>>>>> bad transaction id, which then kills my dataManager object, or I should
>>>>>> say the framework is setting it to null.  What does a Bad transaction ID
>>>>>> mean?  Thoughts?  This happened after I did a LockClean and restarted
>>>>>> both the agent and Tomcat.  Thanks, Farzad.
>>>>>>
>>>>>> ERROR 2011-06-07 11:44:56,365 [Worker thread '90']
>>>>>> (CacheManager.java:621)
>>>>>> - Thread[Worker thread '90',5,main]: invalidateKeys: 1307465096157:
>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager@13b0c258:
>>>>>> Transaction
>>>>>> hash =
>>>>>>
>>>>>> {1307465096144=org.apache.manifoldcf.core.cachemanager.CacheManager$CacheTransactionHandle@39a72981} 
>>>>>>
>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>>>> transaction
>>>>>> ID!
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564) 
>>>>>>
>>>>>> FATAL 2011-06-07 11:44:56,583 [Worker thread '32']
>>>>>> (DupFinderConnector.java:155) - DATAMANAGER IS NULL!!!!
>>>>>> ERROR 2011-06-07 11:44:56,599 [Worker thread '90']
>>>>>> (WorkerThread.java:893)
>>>>>> - Exception tossed: Bad transaction ID!
>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>>>> transaction
>>>>>> ID!
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564) 
>>>>>>
>>>>>> FATAL 2011-06-07 11:44:56,614 [Worker thread '32']
>>>>>> (WorkerThread.java:955)
>>>>>> - Error tossed: null
>>>>>> java.lang.NullPointerException
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423) 
>>>>>>
>>>>>>     at
>>>>>>
>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564) 
>>>>>>
>>>>>>   INFO 2011-06-07 11:44:56,645 [Worker thread '92']
>>>>>> (DupFinderConnector.java:251) - Attempting to initialize
>>>>>> dataManager(null)
>>>>>> and ciConnector(null)
>>>>>>
>>>>>>
>>>>>> On 6/7/2011 9:20 AM, Farzad Valad wrote:
>>>>>>> Lately when I issue an abort on a crawl job (click abort in UI), it
>>>>>>> gets
>>>>>>> stuck, meaning the UI doesn't show any new info on subsequent
>>>>>>> refreshes.  It
>>>>>>> just says Aborting, the start time, no end time, shows # of 
>>>>>>> documents,
>>>>>>> active, and processed.  I restarted Tomcat, but still stuck in 
>>>>>>> Aborting
>>>>>>> state.  Restarting the Agent process doesn't have any affect.  
>>>>>>> But now
>>>>>>> if
>>>>>>> you kill the agent process and issue lock clean, then start the 
>>>>>>> Agent
>>>>>>> Process, it will show an Error in the Status column, but no end 
>>>>>>> time.
>>>>>>>   Ironically, this time the problem was a bad transaction id.  
>>>>>>> The last
>>>>>>> time
>>>>>>> it was a connection refusal to my repository.  Thoughts?
>>>>>>>
>>>>>>> PS.  Previous problem, you were right, dataManager is going null 
>>>>>>> for
>>>>>>> some
>>>>>>> reason, actually debugging for dataManager I ran into this one : )
>>>>>>>
>>>>>>> ERROR 2011-06-07 08:50:01,416 [Worker thread '64']
>>>>>>> (CacheManager.java:621) - Thread[Worker thread '64',5,main]:
>>>>>>> invalidateKeys:
>>>>>>> 1307454600471:
>>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager@39d7af3:
>>>>>>> Transaction hash = {}
>>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>>>>> transaction ID!
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564) 
>>>>>>>
>>>>>>> ERROR 2011-06-07 08:50:01,510 [Worker thread '64']
>>>>>>> (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
>>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>>>>> transaction ID!
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423) 
>>>>>>>
>>>>>>>     at
>>>>>>>
>>>>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564) 
>>>>>>>
>>>>>>>
>>>
>


Re: Data Manager Null

Posted by Farzad Valad <ho...@farzad.net>.
I don't fully understand how a connector instance can be used by
multiple threads without each thread calling setThreadContext.  Here is
what I think I know.

The contract does say that addOrReplaceDocument is only called after
setThreadContext, right?  Because you first have to have a connection
handle before a particular ManifoldCF thread can use it.  So if I create
my dataManager in setThreadContext, all will be well in
addOrReplaceDocument.  The other caveat is that I'll make dataManager an
instance variable, instead of static.  So each connector object would
have its own dataManager with its own thread context, and in
clearThreadContext each one would be nulling its own copy and not
anyone else's.
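
To make the static-versus-instance point concrete, here is a minimal
sketch.  The class names (StaticFieldConnector, InstanceFieldConnector,
FieldScopeDemo) are made up for illustration and are not ManifoldCF
classes, and a plain Object stands in for the real DataManager.  It
only shows one way a "DATAMANAGER IS NULL" symptom could arise: with a
static field, one connector clearing its context also wipes the field
for every other connector.

```java
// Hypothetical stand-ins for illustration; not ManifoldCF code.
class StaticFieldConnector {
    // One slot shared by every connector instance in the JVM.
    static Object dataManager;

    void setThreadContext() { dataManager = new Object(); }
    void clearThreadContext() { dataManager = null; }
    boolean hasDataManager() { return dataManager != null; }
}

class InstanceFieldConnector {
    // One slot per connector instance.
    private Object dataManager;

    void setThreadContext() { dataManager = new Object(); }
    void clearThreadContext() { dataManager = null; }
    boolean hasDataManager() { return dataManager != null; }
}

class FieldScopeDemo {
    public static void main(String[] args) {
        StaticFieldConnector s1 = new StaticFieldConnector();
        StaticFieldConnector s2 = new StaticFieldConnector();
        s1.setThreadContext();
        s2.setThreadContext();
        s1.clearThreadContext(); // also clears what s2 sees
        System.out.println("static, s2 has manager: " + s2.hasDataManager());   // false

        InstanceFieldConnector i1 = new InstanceFieldConnector();
        InstanceFieldConnector i2 = new InstanceFieldConnector();
        i1.setThreadContext();
        i2.setThreadContext();
        i1.clearThreadContext(); // only i1 is affected
        System.out.println("instance, i2 has manager: " + i2.hasDataManager()); // true
    }
}
```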

Do I get it?

On 6/7/2011 5:00 PM, Karl Wright wrote:
> I recommended calling getSession in addOrReplaceDocument because
> nothing in the contract states that the connector instance will
> switch threads between calls.  Therefore there is no guarantee that
> clearThreadContext/setThreadContext will be called right prior to
> addOrReplaceDocument.  The two aspects of the interface are therefore
> independent of one another, and it would be poor coding to rely on a
> guarantee that the contract does not actually make.
>
> Karl
>
> On Tue, Jun 7, 2011 at 5:52 PM, Farzad Valad<ho...@farzad.net>  wrote:
>> Thanks for the confirmation.  So if I have code to set dataManager to null
>> in clearThreadContext and create a dataManager in setThreadContext.  Why do
>> I need the getSession method in addOrReplaceDocument method?  From what I
>> learnt about ManifoldCF architecture, setThreadContext will get called
>> before addOrReplaceDocument.  This was something you recommended when I was
>> asking about the third party repository.
>>
>> Farzad.
>>


Re: Data Manager Null

Posted by Karl Wright <da...@gmail.com>.
I recommended calling getSession in addOrReplaceDocument because
nothing in the contract states that the connector instance will
switch threads between calls.  Therefore there is no guarantee that
clearThreadContext/setThreadContext will be called right prior to
addOrReplaceDocument.  The two aspects of the interface are therefore
independent of one another, and it would be poor coding to rely on a
guarantee that the contract does not actually make.
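
As a rough illustration of the getSession pattern, here is a minimal
sketch.  Everything in it is a hypothetical stand-in:
SketchThreadContext, SketchDataManager and LazySessionConnector are
made-up names, not ManifoldCF classes, and the method signatures are
simplified rather than the real connector interface.  It also assumes
that a missing thread context should be treated as an error.  The idea
is only that addOrReplaceDocument checks for, and if needed builds,
what it depends on instead of assuming setThreadContext just ran.

```java
// Hypothetical stand-ins for illustration; not ManifoldCF classes.
final class SketchThreadContext {
    final String owner;
    SketchThreadContext(String owner) { this.owner = owner; }
}

final class SketchDataManager {
    private final SketchThreadContext context;
    SketchDataManager(SketchThreadContext context) { this.context = context; }

    String insertData(String documentUri) {
        return documentUri + " stored via " + context.owner;
    }
}

class LazySessionConnector {
    // Instance fields, so each connector owns its own state.
    private SketchThreadContext currentContext;
    private SketchDataManager dataManager;

    void setThreadContext(SketchThreadContext context) {
        currentContext = context;
    }

    void clearThreadContext() {
        // Drop anything that was built against the old context.
        dataManager = null;
        currentContext = null;
    }

    // Builds the context-dependent helper on demand.
    private void getSession() {
        if (currentContext == null) {
            throw new IllegalStateException("no thread context set");
        }
        if (dataManager == null) {
            dataManager = new SketchDataManager(currentContext);
        }
    }

    String addOrReplaceDocument(String documentUri) {
        getSession(); // safe no matter how many calls follow one setThreadContext
        return dataManager.insertData(documentUri);
    }

    public static void main(String[] args) {
        LazySessionConnector connector = new LazySessionConnector();
        connector.setThreadContext(new SketchThreadContext("worker-90"));
        System.out.println(connector.addOrReplaceDocument("doc-1"));
        System.out.println(connector.addOrReplaceDocument("doc-2"));

        connector.clearThreadContext();
        connector.setThreadContext(new SketchThreadContext("worker-32"));
        System.out.println(connector.addOrReplaceDocument("doc-3"));
    }
}
```

With this shape, the two parts of the interface stay independent: the
context methods only record or discard state, and the document method
never depends on the order in which they were last called.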

Karl

On Tue, Jun 7, 2011 at 5:52 PM, Farzad Valad <ho...@farzad.net> wrote:
> Thanks for the confirmation.  So if I have code to set dataManager to null
> in clearThreadContext and create a dataManager in setThreadContext.  Why do
> I need the getSession method in addOrReplaceDocument method?  From what I
> learnt about ManifoldCF architecture, setThreadContext will get called
> before addOrReplaceDocument.  This was something you recommended when I was
> asking about the third party repository.
>
> Farzad.
>
> On 6/7/2011 4:35 PM, Karl Wright wrote:
>>
>> It sounds like you are on the right track for fixing all of these
>> problems.
>>
>> Karl
>>
>> On Tue, Jun 7, 2011 at 4:38 PM, Farzad Valad<ho...@farzad.net>  wrote:
>>>
>>> I think I found the problem.  I should be tearing down the dataManager
>>> and
>>> recreating it between clear and set thread context calls, because it has
>>> a
>>> thread context.  I'm not doing that.  I guess I did learn something
>>> reading
>>> : ) let me know if you believe otherwise.  Also do you think this is why
>>> the
>>> bad transaction id is happening?  Thanks!
>>>
>>>            IDBInterface databaseHandle =
>>> DBInterfaceFactory.make(currentContext,
>>> ManifoldCF.getMasterDatabaseName(),
>>> ManifoldCF.getMasterDatabaseUsername(),
>>> ManifoldCF.getMasterDatabasePassword());
>>>            dataManager = new DataManager(currentContext, databaseHandle);
>>>
>>>
>>>
>>> On 6/7/2011 12:42 PM, Farzad Valad wrote:
>>>>
>>>> So I think I figured it out.  For some reason I'm getting a db error,
>>>> bad
>>>> transaction id, which then kills my dataManager object, or I should say
>>>> the
>>>> framework is setting it to null.  What does a Bad transaction ID mean?
>>>>  Thoughts?  This happened after I did a LockClean and restart both the
>>>> agent
>>>> and Tomcat.  Thanks, Farzad.
>>>>
>>>> ERROR 2011-06-07 11:44:56,365 [Worker thread '90']
>>>> (CacheManager.java:621)
>>>> - Thread[Worker thread '90',5,main]: invalidateKeys: 1307465096157:
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager@13b0c258:
>>>> Transaction
>>>> hash =
>>>>
>>>> {1307465096144=org.apache.manifoldcf.core.cachemanager.CacheManager$CacheTransactionHandle@39a72981}
>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>> transaction
>>>> ID!
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>> FATAL 2011-06-07 11:44:56,583 [Worker thread '32']
>>>> (DupFinderConnector.java:155) - DATAMANAGER IS NULL!!!!
>>>> ERROR 2011-06-07 11:44:56,599 [Worker thread '90']
>>>> (WorkerThread.java:893)
>>>> - Exception tossed: Bad transaction ID!
>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>> transaction
>>>> ID!
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>> FATAL 2011-06-07 11:44:56,614 [Worker thread '32']
>>>> (WorkerThread.java:955)
>>>> - Error tossed: null
>>>> java.lang.NullPointerException
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>    at
>>>>
>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>>  INFO 2011-06-07 11:44:56,645 [Worker thread '92']
>>>> (DupFinderConnector.java:251) - Attempting to initialize
>>>> dataManager(null)
>>>> and ciConnector(null)
>>>>
>>>>
>>>> On 6/7/2011 9:20 AM, Farzad Valad wrote:
>>>>>
>>>>> Lately when I issue an abort on a crawl job (click abort in UI), it
>>>>> gets
>>>>> stuck, meaning the UI doesn't show any new info on subsequent
>>>>> refreshes.  It
>>>>> just says Aborting, the start time, no end time, shows # of documents,
>>>>> active, and processed.  I restarted Tomcat, but still stuck in Aborting
>>>>> state.  Restarting the Agent process doesn't have any effect.  But now
>>>>> if
>>>>> you kill the agent process and issue lock clean, then start the Agent
>>>>> Process, it will show an Error in the Status column, but no end time.
>>>>>  Ironically, this time the problem was a bad transaction id.  The last
>>>>> time
>>>>> it was a connection refusal to my repository.  Thoughts?
>>>>>
>>>>> PS.  Previous problem, you were right, dataManager is going null for
>>>>> some
>>>>> reason, actually debugging for dataManager I ran into this one : )
>>>>>
>>>>> ERROR 2011-06-07 08:50:01,416 [Worker thread '64']
>>>>> (CacheManager.java:621) - Thread[Worker thread '64',5,main]:
>>>>> invalidateKeys:
>>>>> 1307454600471:
>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager@39d7af3:
>>>>> Transaction hash = {}
>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>>> transaction ID!
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>>> ERROR 2011-06-07 08:50:01,510 [Worker thread '64']
>>>>> (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>>> transaction ID!
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>>    at
>>>>>
>>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>>>
>>>
>
>

Re: Data Manager Null

Posted by Farzad Valad <ho...@farzad.net>.
Thanks for the confirmation.  So if I have code to set dataManager to 
null in clearThreadContext and to create a dataManager in 
setThreadContext, why do I still need the getSession method in the 
addOrReplaceDocument method?  From what I learnt about the ManifoldCF 
architecture, setThreadContext will get called before 
addOrReplaceDocument.  This was something you recommended when I was 
asking about the third party repository.
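
For illustration, the arrangement being asked about could look like the 
sketch below.  It is a minimal sketch only: ThreadContext, DataManager and 
SketchConnector here are hypothetical stand-ins written for this example, 
not the real ManifoldCF classes, and dropSession is an invented hook that 
simulates the helper going away while a context is still set.  It shows 
what a lazy getSession guard buys on top of the set/clear pair.

```java
// Hypothetical stand-ins for illustration; none of these are ManifoldCF classes.
class ThreadContext {
    final String name;
    ThreadContext(String name) { this.name = name; }
}

class DataManager {
    final ThreadContext owner;
    DataManager(ThreadContext owner) { this.owner = owner; }
    String insertData(String doc) { return doc + " stored via " + owner.name; }
}

class SketchConnector {
    private ThreadContext currentContext;
    private DataManager dataManager;

    // Build the context-bound helper when a thread context is attached.
    void setThreadContext(ThreadContext ctx) {
        currentContext = ctx;
        dataManager = new DataManager(ctx);
    }

    // Tear the helper down when the thread context is detached.
    void clearThreadContext() {
        dataManager = null;
        currentContext = null;
    }

    // Invented hook: the helper is lost while the context is still set.
    void dropSession() {
        dataManager = null;
    }

    // Lazy guard: rebuild the helper if it is missing, fail clearly if no context.
    private DataManager getSession() {
        if (currentContext == null) {
            throw new IllegalStateException("no thread context set");
        }
        if (dataManager == null) {
            dataManager = new DataManager(currentContext);
        }
        return dataManager;
    }

    String addOrReplaceDocument(String doc) {
        return getSession().insertData(doc);
    }
}
```

In this sketch the guard is redundant as long as setThreadContext always 
runs first and nothing else clears the field.  It only matters if the 
helper can be dropped in between, and then it turns a 
NullPointerException into either a rebuild or a clear error message.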

Farzad.

On 6/7/2011 4:35 PM, Karl Wright wrote:
> It sounds like you are on the right track for fixing all of these problems.
>
> Karl
>
> On Tue, Jun 7, 2011 at 4:38 PM, Farzad Valad<ho...@farzad.net>  wrote:
>> I think I found the problem.  I should be tearing down the dataManager and
>> recreating it between clear and set thread context calls, because it has a
>> thread context.  I'm not doing that.  I guess I did learn something reading
>> : ) let me know if you believe otherwise.  Also do you think this is why the
>> bad transaction id is happening?  Thanks!
>>
>>             IDBInterface databaseHandle =
>> DBInterfaceFactory.make(currentContext, ManifoldCF.getMasterDatabaseName(),
>> ManifoldCF.getMasterDatabaseUsername(),
>> ManifoldCF.getMasterDatabasePassword());
>>             dataManager = new DataManager(currentContext, databaseHandle);
>>
>>
>>
>> On 6/7/2011 12:42 PM, Farzad Valad wrote:
>>> So I think I figured it out.  For some reason I'm getting a db error, bad
>>> transaction id, which then kills my dataManager object, or I should say the
>>> framework is setting it to null.  What does a Bad transaction ID mean?
>>>   Thoughts?  This happened after I did a LockClean and restarted both the agent
>>> and Tomcat.  Thanks, Farzad.
>>>
>>> ERROR 2011-06-07 11:44:56,365 [Worker thread '90'] (CacheManager.java:621)
>>> - Thread[Worker thread '90',5,main]: invalidateKeys: 1307465096157:
>>> org.apache.manifoldcf.core.cachemanager.CacheManager@13b0c258: Transaction
>>> hash =
>>> {1307465096144=org.apache.manifoldcf.core.cachemanager.CacheManager$CacheTransactionHandle@39a72981}
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction
>>> ID!
>>>     at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>     at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>     at
>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>     at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>     at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>     at
>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>     at
>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>     at
>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>     at
>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>     at
>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>     at
>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>     at
>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>> FATAL 2011-06-07 11:44:56,583 [Worker thread '32']
>>> (DupFinderConnector.java:155) - DATAMANAGER IS NULL!!!!
>>> ERROR 2011-06-07 11:44:56,599 [Worker thread '90'] (WorkerThread.java:893)
>>> - Exception tossed: Bad transaction ID!
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction
>>> ID!
>>>     at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>     at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>     at
>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>     at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>     at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>     at
>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>     at
>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>     at
>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>     at
>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>     at
>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>     at
>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>     at
>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>> FATAL 2011-06-07 11:44:56,614 [Worker thread '32'] (WorkerThread.java:955)
>>> - Error tossed: null
>>> java.lang.NullPointerException
>>>     at
>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>     at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>     at
>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>     at
>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>     at
>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>     at
>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>   INFO 2011-06-07 11:44:56,645 [Worker thread '92']
>>> (DupFinderConnector.java:251) - Attempting to initialize dataManager(null)
>>> and ciConnector(null)
>>>
>>>
>>> On 6/7/2011 9:20 AM, Farzad Valad wrote:
>>>> Lately when I issue an abort on a crawl job (click abort in UI), it gets
>>>> stuck, meaning the UI doesn't show any new info on subsequent refreshes.  It
>>>> just says Aborting, the start time, no end time, shows # of documents,
>>>> active, and processed.  I restarted Tomcat, but still stuck in Aborting
>>>> state.  Restarting the Agent process doesn't have any effect.  But now if
>>>> you kill the agent process and issue lock clean, then start the Agent
>>>> Process, it will show an Error in the Status column, but no end time.
>>>>   Ironically, this time the problem was a bad transaction id.  The last time
>>>> it was a connection refusal to my repository.  Thoughts?
>>>>
>>>> PS.  Previous problem, you were right, dataManager is going null for some
>>>> reason, actually debugging for dataManager I ran into this one : )
>>>>
>>>> ERROR 2011-06-07 08:50:01,416 [Worker thread '64']
>>>> (CacheManager.java:621) - Thread[Worker thread '64',5,main]: invalidateKeys:
>>>> 1307454600471: org.apache.manifoldcf.core.cachemanager.CacheManager@39d7af3:
>>>> Transaction hash = {}
>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>> transaction ID!
>>>>     at
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>>     at
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>>     at
>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>>     at
>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>>     at
>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>>     at
>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>>     at
>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>>     at
>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
>>>>     at
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>     at
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>     at
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>     at
>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>     at
>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>     at
>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>     at
>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>> ERROR 2011-06-07 08:50:01,510 [Worker thread '64']
>>>> (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>>> transaction ID!
>>>>     at
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>>     at
>>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>>     at
>>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>>     at
>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>>     at
>>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>>     at
>>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>>     at
>>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>>     at
>>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
>>>>     at
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>>     at
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>>     at
>>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>>     at
>>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>>     at
>>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>>     at
>>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>     at
>>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>>
>>


Re: Data Manager Null

Posted by Karl Wright <da...@gmail.com>.
It sounds like you are on the right track for fixing all of these problems.

Karl

On Tue, Jun 7, 2011 at 4:38 PM, Farzad Valad <ho...@farzad.net> wrote:
> I think I found the problem.  I should be tearing down the dataManager and
> recreating it between clear and set thread context calls, because it has a
> thread context.  I'm not doing that.  I guess I did learn something reading
> : ) let me know if you believe otherwise.  Also do you think this is why the
> bad transaction id is happening?  Thanks!
>
>            IDBInterface databaseHandle =
> DBInterfaceFactory.make(currentContext, ManifoldCF.getMasterDatabaseName(),
> ManifoldCF.getMasterDatabaseUsername(),
> ManifoldCF.getMasterDatabasePassword());
>            dataManager = new DataManager(currentContext, databaseHandle);
>
>
>
> On 6/7/2011 12:42 PM, Farzad Valad wrote:
>>
>> So I think I figured it out.  For some reason I'm getting a db error, bad
>> transaction id, which then kills my dataManager object, or I should say the
>> framework is setting it to null.  What does a Bad transaction ID mean?
>>  Thoughts?  This happened after I did a LockClean and restarted both the agent
>> and Tomcat.  Thanks, Farzad.
>>
>> ERROR 2011-06-07 11:44:56,365 [Worker thread '90'] (CacheManager.java:621)
>> - Thread[Worker thread '90',5,main]: invalidateKeys: 1307465096157:
>> org.apache.manifoldcf.core.cachemanager.CacheManager@13b0c258: Transaction
>> hash =
>> {1307465096144=org.apache.manifoldcf.core.cachemanager.CacheManager$CacheTransactionHandle@39a72981}
>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction
>> ID!
>>    at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>    at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>    at
>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>    at
>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>    at
>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>    at
>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>    at
>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>    at
>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>    at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>    at
>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>    at
>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>    at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>> FATAL 2011-06-07 11:44:56,583 [Worker thread '32']
>> (DupFinderConnector.java:155) - DATAMANAGER IS NULL!!!!
>> ERROR 2011-06-07 11:44:56,599 [Worker thread '90'] (WorkerThread.java:893)
>> - Exception tossed: Bad transaction ID!
>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction
>> ID!
>>    at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>    at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>    at
>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>    at
>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>    at
>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>    at
>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>    at
>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>    at
>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>    at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>    at
>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>    at
>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>    at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>> FATAL 2011-06-07 11:44:56,614 [Worker thread '32'] (WorkerThread.java:955)
>> - Error tossed: null
>> java.lang.NullPointerException
>>    at
>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>    at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>    at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>    at
>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>    at
>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>    at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>  INFO 2011-06-07 11:44:56,645 [Worker thread '92']
>> (DupFinderConnector.java:251) - Attempting to initialize dataManager(null)
>> and ciConnector(null)
>>
>>
>> On 6/7/2011 9:20 AM, Farzad Valad wrote:
>>>
>>> Lately when I issue an abort on a crawl job (click abort in UI), it gets
>>> stuck, meaning the UI doesn't show any new info on subsequent refreshes.  It
>>> just says Aborting, the start time, no end time, shows # of documents,
>>> active, and processed.  I restarted Tomcat, but still stuck in Aborting
>>> state.  Restarting the Agent process doesn't have any effect.  But now if
>>> you kill the agent process and issue lock clean, then start the Agent
>>> Process, it will show an Error in the Status column, but no end time.
>>>  Ironically, this time the problem was a bad transaction id.  The last time
>>> it was a connection refusal to my repository.  Thoughts?
>>>
>>> PS.  Previous problem, you were right, dataManager is going null for some
>>> reason, actually debugging for dataManager I ran into this one : )
>>>
>>> ERROR 2011-06-07 08:50:01,416 [Worker thread '64']
>>> (CacheManager.java:621) - Thread[Worker thread '64',5,main]: invalidateKeys:
>>> 1307454600471: org.apache.manifoldcf.core.cachemanager.CacheManager@39d7af3:
>>> Transaction hash = {}
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>> transaction ID!
>>>    at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>    at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>    at
>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>    at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>    at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>    at
>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>    at
>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>    at
>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
>>>    at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>    at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>    at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>    at
>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>    at
>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>    at
>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>    at
>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>> ERROR 2011-06-07 08:50:01,510 [Worker thread '64']
>>> (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad
>>> transaction ID!
>>>    at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>>>    at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>>>    at
>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>>>    at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>>>    at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>>>    at
>>> org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>>>    at
>>> org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>>>    at
>>> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
>>>    at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>>>    at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>>>    at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>>>    at
>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>>>    at
>>> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>>>    at
>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>    at
>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>>>
>>
>
>

Re: Data Manager Null

Posted by Farzad Valad <ho...@farzad.net>.
I think I found the problem.  I should be tearing down the dataManager 
in clearThreadContext and recreating it in setThreadContext, because it 
holds a thread context.  I'm not doing that.  I guess I did learn 
something from reading : )  Let me know if you believe otherwise.  Also, 
do you think this is why the bad transaction id is happening?  Thanks!

             IDBInterface databaseHandle = DBInterfaceFactory.make(
                 currentContext,
                 ManifoldCF.getMasterDatabaseName(),
                 ManifoldCF.getMasterDatabaseUsername(),
                 ManifoldCF.getMasterDatabasePassword());
             dataManager = new DataManager(currentContext, databaseHandle);
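
For what it's worth, here is a rough sketch of the teardown/recreate idea 
described above.  It does not use the real ManifoldCF classes: 
ThreadContext, DataManager and Connector below are hypothetical stand-ins 
(the real DataManager also takes a database handle), so treat it as an 
illustration of the pattern under those assumptions, not as the actual fix.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch: none of these types are the real ManifoldCF classes.
class ThreadContextSketch {

    /** Hypothetical stand-in for a per-thread context object. */
    static final class ThreadContext {
        final String owner;
        ThreadContext(String owner) { this.owner = owner; }
    }

    /** Hypothetical stand-in for a helper that is bound to one context. */
    static final class DataManager {
        final ThreadContext context;
        final List<String> rows = new ArrayList<>();
        DataManager(ThreadContext context) { this.context = context; }
        void insertData(String row) { rows.add(context.owner + ":" + row); }
    }

    /** Connector that rebuilds its helper whenever the context changes. */
    static final class Connector {
        private ThreadContext currentContext;
        private DataManager dataManager;

        void setThreadContext(ThreadContext tc) {
            currentContext = tc;
            // Build a fresh helper bound to the context just handed in.
            dataManager = new DataManager(tc);
        }

        void clearThreadContext() {
            // Drop the helper so it cannot be reused under another context.
            dataManager = null;
            currentContext = null;
        }

        void addOrReplaceDocument(String doc) {
            if (dataManager == null) {
                throw new IllegalStateException("no thread context set");
            }
            dataManager.insertData(doc);
        }

        DataManager currentDataManager() { return dataManager; }
    }

    public static void main(String[] args) {
        Connector connector = new Connector();
        connector.setThreadContext(new ThreadContext("worker-64"));
        connector.addOrReplaceDocument("a.txt");
        connector.clearThreadContext();
        // A different thread picks the connector up: the helper is rebuilt.
        connector.setThreadContext(new ThreadContext("worker-90"));
        connector.addOrReplaceDocument("b.txt");
        System.out.println(connector.currentDataManager().rows);
    }
}
```

The point of the sketch is only that the helper never outlives the context 
it was built with; whether that also explains the bad transaction ID is 
still an open question.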



On 6/7/2011 12:42 PM, Farzad Valad wrote:
> So I think I figured it out.  For some reason I'm getting a DB error, 
> bad transaction ID, which then kills my dataManager object, or I 
> should say the framework is setting it to null.  What does a Bad 
> transaction ID mean?  Thoughts?  This happened after I did a LockClean 
> and restarted both the agent and Tomcat.  Thanks, Farzad.
>
> ERROR 2011-06-07 11:44:56,365 [Worker thread '90'] (CacheManager.java:621) - Thread[Worker thread '90',5,main]: invalidateKeys: 1307465096157: org.apache.manifoldcf.core.cachemanager.CacheManager@13b0c258: Transaction hash = {1307465096144=org.apache.manifoldcf.core.cachemanager.CacheManager$CacheTransactionHandle@39a72981}
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>     at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>     at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>     at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>     at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
> FATAL 2011-06-07 11:44:56,583 [Worker thread '32'] (DupFinderConnector.java:155) - DATAMANAGER IS NULL!!!!
> ERROR 2011-06-07 11:44:56,599 [Worker thread '90'] (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>     at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>     at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>     at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>     at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
> FATAL 2011-06-07 11:44:56,614 [Worker thread '32'] (WorkerThread.java:955) - Error tossed: null
> java.lang.NullPointerException
>     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>     at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>  INFO 2011-06-07 11:44:56,645 [Worker thread '92'] (DupFinderConnector.java:251) - Attempting to initialize dataManager(null) and ciConnector(null)
>


Data Manager Null

Posted by Farzad Valad <ho...@farzad.net>.
So I think I figured it out.  For some reason I'm getting a DB error, 
bad transaction ID, which then kills my dataManager object, or I should 
say the framework is setting it to null.  What does a Bad transaction ID 
mean?  Thoughts?  This happened after I did a LockClean and restarted 
both the agent and Tomcat.  Thanks, Farzad.

ERROR 2011-06-07 11:44:56,365 [Worker thread '90'] (CacheManager.java:621) - Thread[Worker thread '90',5,main]: invalidateKeys: 1307465096157: org.apache.manifoldcf.core.cachemanager.CacheManager@13b0c258: Transaction hash = {1307465096144=org.apache.manifoldcf.core.cachemanager.CacheManager$CacheTransactionHandle@39a72981}
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
    at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
    at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
    at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
    at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
    at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
    at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
    at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
FATAL 2011-06-07 11:44:56,583 [Worker thread '32'] (DupFinderConnector.java:155) - DATAMANAGER IS NULL!!!!
ERROR 2011-06-07 11:44:56,599 [Worker thread '90'] (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
    at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
    at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
    at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
    at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
    at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
    at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
    at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
FATAL 2011-06-07 11:44:56,614 [Worker thread '32'] (WorkerThread.java:955) - Error tossed: null
java.lang.NullPointerException
    at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:158)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
    at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
 INFO 2011-06-07 11:44:56,645 [Worker thread '92'] (DupFinderConnector.java:251) - Attempting to initialize dataManager(null) and ciConnector(null)


On 6/7/2011 9:20 AM, Farzad Valad wrote:
> Lately when I issue an abort on a crawl job (click abort in the UI), it 
> gets stuck, meaning the UI doesn't show any new info on subsequent 
> refreshes.  It just says Aborting, with the start time, no end time, and 
> the number of documents, active, and processed.  I restarted Tomcat, but 
> it was still stuck in the Aborting state.  Restarting the Agent process 
> doesn't have any effect.  But if you kill the agent process, issue a lock 
> clean, and then start the Agent process, it shows an Error in the 
> Status column, but no end time.  Ironically, this time the problem was 
> a bad transaction ID.  The last time it was a connection refusal to my 
> repository.  Thoughts?
>
> PS.  On the previous problem, you were right: dataManager is going null 
> for some reason.  I actually ran into this one while debugging the 
> dataManager issue : )
>
> ERROR 2011-06-07 08:50:01,416 [Worker thread '64'] (CacheManager.java:621) - Thread[Worker thread '64',5,main]: invalidateKeys: 1307454600471: org.apache.manifoldcf.core.cachemanager.CacheManager@39d7af3: Transaction hash = {}
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>     at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>     at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>     at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>     at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
> ERROR 2011-06-07 08:50:01,510 [Worker thread '64'] (WorkerThread.java:893) - Exception tossed: Bad transaction ID!
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad transaction ID!
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.invalidateKeys(CacheManager.java:620)
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:175)
>     at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:637)
>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:191)
>     at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>     at org.apache.manifoldcf.agents.output.dupfinder.DataManager.insertData(DataManager.java:115)
>     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:162)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>     at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
>