Posted to user@manifoldcf.apache.org by Roman Šitina <ro...@sitina.cz> on 2015/10/01 11:37:30 UTC

Documentum connector model

Hello,

I found this comment in the Documentum connector:

/** Let the crawler know the completeness of the information we are giving it.
*/
@Override
public int getConnectorModel()
{
  // For documentum, originally we thought it would return the deleted objects when we
  // reseeded.  Later research has shown that documentum simply deletes the whole thing now
  // and doesn't leave a gravemarker around at all.  So we have no choice but to treat this
  // like other stupid repositories and check for deletes by scanning!  UGH.  It also does
  // not accurately provide changes, because the ACL changes are not caught by the query.
  return MODEL_ADD;
}

When you were investigating how to fetch updated and deleted
documents, did you try using the Documentum table dm_audittrail?

It looks like querying it lets a user find the IDs of documents
that were modified or deleted. For example:

select * from dm_audittrail where event_name='dm_destroy' and
object_type='dm_document'

If you did try it, why did it not work?

Thank you very much

Roman

Re: Documentum connector model

Posted by Karl Wright <da...@gmail.com>.
Hi Roman,

Various versions of documentum have done different things, so I cannot say
for sure that using dm_audittrail will always work.  For example, when the
documentum connector was originally developed against 5.2, there were grave
markers left around that our seeding query picked up.  Those went away in
5.3.  (This was ten years ago now, so I may have the exact versions wrong,
but please bear with me.)

For any change like this, it's critical to know the following:

(1) Will it catch ALL changes, or just some of them?  My (vague)
recollection is that we talked about using the audit trail but concluded it
was insufficient because it would not capture ACL changes, which are pretty
important.

(2) Does the record of changes get flushed periodically?  If it does, it's
not reliable enough to use for the purpose of determining deletions.

If you can answer both of these questions in the correct way, you could
consider trying to change the seeding code to run a second query to include
deletions, and see how that worked.  I don't have a documentum instance
here to play with, so this would be up to you to do.
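
As a starting point for that experiment, here is a minimal, untested sketch
of how the second (deletion) query could be built. The class and method
names are hypothetical and not part of ManifoldCF or DFC. The attribute
names audited_obj_id and time_stamp, the use of UTC, and the quote-doubling
escape are assumptions to verify against your Documentum version. It only
builds the DQL string; running it and feeding the results into seeding is
left out, and it does nothing about the ACL and flushing questions above.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper; not part of ManifoldCF or DFC. */
final class AuditTrailDeletionQuery {
  // Java pattern chosen to match the DQL pattern 'mm/dd/yyyy hh:mi:ss' used below.
  // Formatting in UTC is an assumption; the repository may store server-local time.
  private static final DateTimeFormatter DQL_TIME =
      DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss").withZone(ZoneOffset.UTC);

  private AuditTrailDeletionQuery() {}

  /** Wraps a value in single quotes, doubling any embedded single quote. */
  static String quote(String value) {
    return "'" + value.replace("'", "''") + "'";
  }

  /**
   * Builds a DQL query for the ids of objects of the given type that were
   * destroyed at or after the given instant, based on the query in this thread.
   */
  static String build(String objectType, Instant since) {
    return "select audited_obj_id from dm_audittrail"
        + " where event_name='dm_destroy'"
        + " and object_type=" + quote(objectType)
        + " and time_stamp >= DATE(" + quote(DQL_TIME.format(since))
        + ",'mm/dd/yyyy hh:mi:ss')";
  }

  public static void main(String[] args) {
    // Example: deletions of dm_document objects since the start of 2015-10-01 (UTC).
    System.out.println(build("dm_document", Instant.parse("2015-10-01T00:00:00Z")));
  }
}
```

Restricting the query by time keeps each seeding pass incremental, but it
only helps if audit records are still present when the next crawl runs.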

Thanks,
Karl

