Posted to user@manifoldcf.apache.org by Radek Sklenicka <ra...@gmail.com> on 2016/04/04 22:31:54 UTC

Re: Documentum - unable to index metadata

Hi Karl,

The patch (CONNECTORS-1293) resolved the UI issue with duplication of
attr_names.
Thank you!
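
As a rough illustration of the duplication itself: the quoted messages below
trace it to one dmi_dd_attr_info row per installed language (nls_key en, es,
pt) for each attribute. The sketch collapses such rows to one entry per
attr_name. It is not the CONNECTORS-1293 patch, whose contents are not shown
in this thread; the AttrRow class and the sample rows are made up for
illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class AttrNameDedup {
  /** One dmi_dd_attr_info row, reduced to the three columns discussed in this thread. */
  static final class AttrRow {
    final String typeName;
    final String attrName;
    final String nlsKey;

    AttrRow(String typeName, String attrName, String nlsKey) {
      this.typeName = typeName;
      this.attrName = attrName;
      this.nlsKey = nlsKey;
    }
  }

  /** Returns each attr_name of docType once, in first-seen order, ignoring nls_key. */
  public static List<String> distinctAttrNames(List<AttrRow> rows, String docType) {
    Set<String> names = new LinkedHashSet<>();
    for (AttrRow row : rows) {
      if (row.typeName.equals(docType)) {
        names.add(row.attrName);
      }
    }
    return new ArrayList<>(names);
  }

  public static void main(String[] args) {
    // Made-up sample rows: two attributes, each present once per installed language.
    List<AttrRow> rows = new ArrayList<>();
    for (String attr : new String[] {"atr_author", "atr_title"}) {
      for (String nls : new String[] {"en", "es", "pt"}) {
        rows.add(new AttrRow("do_domep_base", attr, nls));
      }
    }
    System.out.println(rows.size() + " rows -> " + distinctAttrNames(rows, "do_domep_base"));
  }
}
```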

It didn't seem to affect attribute collection, though: we're still not able
to get any attribute values.

Interestingly, attribute filters on the Job > DocumentTypes page work as
expected - they filter documents based on the attribute values.

Collecting attributes works for us with Documentum 7.1 (with only 1
language), but with Documentum 6.7 with multiple languages no luck so far.
We're still not sure whether the languages are the core of the issue, but
it’s the most significant difference between these two deployments.

We'll try to dig deeper.

Any idea or suggestion would be greatly appreciated.

Thanks,

Radek
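
One check that the connector debug logs quoted at the bottom of this thread
allow: in the working crawl the version label starts with "11" followed by
eleven attribute names, while in the failing crawl it starts with "0". The
sketch below only extracts that leading number. Reading it as the count of
attributes selected for the document is an assumption drawn from those two
log lines, not from connector documentation; if it holds, a 0 would point at
the job specification rather than at the DFC read. The class and method
names are hypothetical.

```java
final class VersionLabelCheck {
  /** Returns the integer before the first '+' in the label, or -1 if there is none. */
  public static int leadingCount(String versionLabel) {
    int plus = versionLabel.indexOf('+');
    if (plus <= 0) {
      return -1;
    }
    try {
      return Integer.parseInt(versionLabel.substring(0, plus));
    } catch (NumberFormatException e) {
      return -1;
    }
  }

  public static void main(String[] args) {
    // Both labels are copied from the connector debug logs quoted in this thread.
    String working = "11+authors+object_name+owner_name+owner_permit+r_creation_date"
        + "+r_creator_name+r_modifier+r_modify_date+r_object_id+r_object_type+title"
        + "++0+DEAD_AUTHORITY+1.0_0_";
    String failing = "0++0+DEAD_AUTHORITY+_4_";
    System.out.println("working instance: " + leadingCount(working));
    System.out.println("failing instance: " + leadingCount(failing));
  }
}
```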


On 31 March 2016 at 08:44, Radek Sklenicka <ra...@gmail.com>
wrote:

> Hi Karl,
>
> Many thanks for your prompt actions.
> Just checking with our Documentum guys. I'll let you know as soon as I
> have some updates.
>
> Thanks,
> Radek
>
>
> On 31 March 2016 at 07:44, Karl Wright <da...@gmail.com> wrote:
>
>> Hi Radek,
>>
>> A fix for the UI, at least, can be downloaded from the ticket
>> CONNECTORS-1293.  I can find no definitive mechanism for why this would
>> lead to no attributes being collected, but it's worth applying the patch,
>> updating your jobs, and giving it a try nonetheless.  Please let me know
>> what happens.
>>
>> Thanks,
>> Karl
>>
>>
>> On Wed, Mar 30, 2016 at 5:21 PM, Karl Wright <da...@gmail.com> wrote:
>>
>>> Hi Radek,
>>>
>>> The code that reads attribute values from Documentum DFC persistent
>>> objects does use the attribute name, as follows:
>>>
>>> >>>>>>
>>>   /** Get all the values that an attribute has, including multiple ones
>>>    * if present */
>>>   public String[] getAttributeValues(String attribute)
>>>     throws DocumentumException, RemoteException
>>>   {
>>>     try
>>>     {
>>>       int valueCount = object.getValueCount(attribute);
>>>       String[] values = new String[valueCount];
>>>       int y = 0;
>>>       while (y < valueCount)
>>>       {
>>>         // Fetch the attribute.
>>>         // It's supposed to work for all attribute types...
>>>         String value = object.getRepeatingString(attribute,y);
>>>         values[y++] = value;
>>>       }
>>>       return values;
>>>     }
>>>     catch (DfAuthenticationException ex)
>>>     {
>>>       throw new DocumentumException("Bad credentials: "+ex.getMessage(),
>>>         DocumentumException.TYPE_BADCREDENTIALS);
>>>     }
>>>     catch (DfIdentityException ex)
>>>     {
>>>       throw new DocumentumException("Bad docbase name: "+ex.getMessage(),
>>>         DocumentumException.TYPE_BADCONNECTIONPARAMS);
>>>     }
>>>     catch (DfDocbaseUnreachableException e)
>>>     {
>>>       throw new DocumentumException("Docbase unreachable: "+e.getMessage(),
>>>         DocumentumException.TYPE_SERVICEINTERRUPTION);
>>>     }
>>>     catch (DfIOException e)
>>>     {
>>>       throw new DocumentumException("Docbase io exception: "+e.getMessage(),
>>>         DocumentumException.TYPE_SERVICEINTERRUPTION);
>>>     }
>>>     catch (DfException e)
>>>     {
>>>       throw new DocumentumException("Documentum error: "+e.getMessage());
>>>     }
>>>   }
>>> <<<<<<
>>>
>>> This is how the DFC IDfPersistentObject API is structured.  So it
>>> doesn't look like multiple language values are supported in DFC.  So I
>>> don't know why you wouldn't get attribute values unless the UI issue is
>>> causing there to be no specified attributes for whatever type matches the
>>> document.  I'll have to dig into that code next.
>>>
>>> Karl
>>>
>>>
>>> On Wed, Mar 30, 2016 at 9:58 AM, Karl Wright <da...@gmail.com> wrote:
>>>
>>>> Hi Radek,
>>>>
>>>> I will have to check how the connector uses attribute names and get
>>>> back to you.  But I am pretty certain that the connector specifies
>>>> attributes in its dql queries by means of the attribute name, not the
>>>> r_object_id.  If that's the problem, it also implies that there can be a
>>>> different attribute value for each language, which might be why you aren't
>>>> seeing the attributes you are expecting.
>>>>
>>>> This is not an easy problem to address, however.
>>>>
>>>> Can you confirm whether or not documents can have different attribute
>>>> values for each language in Documentum?
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Mar 30, 2016 at 9:41 AM, Radek Sklenicka <
>>>> radek.sklenicka@gmail.com> wrote:
>>>>
>>>>> Hi Karl,
>>>>>
>>>>>
>>>>>
>>>>> We discovered that we get metadata names in triplicate because there
>>>>> are 3 languages installed in Documentum.
>>>>>
>>>>> Multiple attribute records each have the same attr_name and type_name
>>>>> but a unique r_object_id and a different nls_key (en, es, pt).
>>>>>
>>>>>
>>>>>
>>>>> Could this be the reason why metadata doesn’t make it through the
>>>>> pipeline and we can’t get any metadata during crawling?
>>>>>
>>>>> Are unique attr_names required in the Documentum connector?
>>>>>
>>>>>
>>>>>
>>>>> Any suggestions would be greatly appreciated.
>>>>>
>>>>>
>>>>>
>>>>> Thank you,
>>>>>
>>>>>
>>>>> Radek
>>>>>
>>>>> On 23 March 2016 at 18:28, Radek Sklenicka <ra...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks for verification, Karl.
>>>>>>
>>>>>> -Radek
>>>>>>
>>>>>> On 23 March 2016 at 14:01, Karl Wright <da...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Radek,
>>>>>>>
>>>>>>> This log output comes from RMI, apparently, and is not something
>>>>>>> I've ever seen before.  But it does look like it's a complete list of
>>>>>>> what's being returned for a request for the list of attributes (the first
>>>>>>> entry), and for a specific object (the second entry).
>>>>>>>
>>>>>>> Karl
>>>>>>>
>>>>>>> On Wed, Mar 23, 2016 at 8:41 AM, Radek Sklenicka <
>>>>>>> radek.sklenicka@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Karl,
>>>>>>>>
>>>>>>>> "select attr_name FROM dmi_dd_attr_info" really returns duplicates
>>>>>>>> - we're looking into that.
>>>>>>>>
>>>>>>>> Is there also a DQL query (or function) used by ManifoldCF that we
>>>>>>>> can try to check what/if attributes are being returned for a particular
>>>>>>>> record?
>>>>>>>>
>>>>>>>> We have trace logs from DFC and it looks like the attributes are
>>>>>>>> being returned from the content server.
>>>>>>>> Could you please help us decode the logs - where to look/verify if
>>>>>>>> attributes are handed over to ManifoldCF?
>>>>>>>> Can we deduce from the logs attached below that the attributes are
>>>>>>>> transferred from DFC to ManifoldCF?
>>>>>>>>
>>>>>>>> Many thanks,
>>>>>>>> Radek
>>>>>>>>
>>>>>>>>
>>>>>>>> 2016-03-22 13:29:26.008 <US...@14660772>  [RMI
>>>>>>>> TCP Connection(1823)-127.0.0.1] [EXIT]
>>>>>>>>  .com.documentum.fc.client.DfTypedObject@36b9ba.getLiteType ==>
>>>>>>>> AspectedLiteType@110eb5e{name=do_domep_project_hse, typeVersion=0,
>>>>>>>> cacheVStamp=178498, attributes={asp_herencia.atr_isnew,
>>>>>>>> asp_herencia.atr_niveles, asp_herencia.atr_tipo, asp_herencia.i_partition},
>>>>>>>> superType=LiteType@2045f2{name=do_domep_project_hse,
>>>>>>>> typeVersion=32, cacheVStamp=178498, attributes={atr_audit_type,
>>>>>>>> atr_speciality, atr_emergency_related}, superType=LiteType@a67471{name=do_domep_project,
>>>>>>>> typeVersion=32, cacheVStamp=178486, attributes={atr_uwi, atr_well_name,
>>>>>>>> atr_usi, atr_survey_name}, superType=LiteType@f74077{name=do_domep_base,
>>>>>>>> typeVersion=27, cacheVStamp=178438, attributes={atr_confidential_level,
>>>>>>>> atr_owner_area, atr_logical_code, atr_original_reference_id, atr_revision,
>>>>>>>> atr_entity, atr_author, atr_doc_type, atr_category_doc, atr_subcat_doc,
>>>>>>>> atr_discipline, atr_subdiscipline, atr_language, atr_physical_document,
>>>>>>>> atr_physical_code, atr_warehouse, atr_retention, atr_digital_media,
>>>>>>>> atr_internal, atr_country, atr_basin, atr_environment, atr_acreage,
>>>>>>>> atr_abstract, atr_doc_creation_date, atr_title, atr_collection,
>>>>>>>> atr_is_collection, atr_is_principal, atr_is_anexo, atr_id_collection,
>>>>>>>> atr_is_relation, atr_field, atr_original_revision, atr_remarks,
>>>>>>>> atr_keywords, atr_principal_folder_id, atr_original_version, atr_be_name,
>>>>>>>> atr_be_ref, atr_be_short_name, atr_issued_for_code,
>>>>>>>> atr_issued_for_description, atr_subbasin, atr_be_type_id, atr_comment,
>>>>>>>> atr_status, atr_prepared_by, atr_preparation_date, atr_verified_by,
>>>>>>>> atr_verification_date, atr_approved_by, atr_approval_date, atr_workflow},
>>>>>>>> superType=LiteType@19390bd{name=do_general, typeVersion=6,
>>>>>>>> cacheVStamp=167968, attributes={negocio, attr_is_gdcom},
>>>>>>>> superType=LiteType@16658e8{name=dm_document, typeVersion=2,
>>>>>>>> cacheVStamp=52034, attributes={}, superType=LiteType@6aac49{name=dm_sysobject,
>>>>>>>> typeVersion=3, cacheVStamp=0, attributes={object_name, r_object_type,
>>>>>>>> title, subject, authors, keywords, a_application_type, a_status,
>>>>>>>> r_creation_date, r_modify_date, r_modifier, r_access_date, a_is_hidden,
>>>>>>>> i_is_deleted, a_retention_date, a_archive, a_compound_architecture,
>>>>>>>> a_link_resolved, i_reference_cnt, i_has_folder, i_folder_id,
>>>>>>>> r_composite_id, r_composite_label, r_component_label, r_order_no,
>>>>>>>> r_link_cnt, r_link_high_cnt, r_assembled_from_id, r_frzn_assembly_cnt,
>>>>>>>> r_has_frzn_assembly, resolution_label, r_is_virtual_doc, i_contents_id,
>>>>>>>> a_content_type, r_page_cnt, r_content_size, a_full_text, a_storage_type,
>>>>>>>> i_cabinet_id, owner_name, owner_permit, group_name, group_permit,
>>>>>>>> world_permit, i_antecedent_id, i_chronicle_id, i_latest_flag, r_lock_owner,
>>>>>>>> r_lock_date, r_lock_machine, log_entry, r_version_label, i_branch_cnt,
>>>>>>>> i_direct_dsc, r_immutable_flag, r_frozen_flag, r_has_events, acl_domain,
>>>>>>>> acl_name, a_special_app, i_is_reference, r_creator_name, r_is_public,
>>>>>>>> r_policy_id, r_resume_state, r_current_state, r_alias_set_id,
>>>>>>>> a_effective_date, a_expiration_date, a_publish_formats, a_effective_label,
>>>>>>>> a_effective_flag, a_category, language_code, a_is_template,
>>>>>>>> a_controlling_app, r_full_content_size, a_extended_properties, a_is_signed,
>>>>>>>> a_last_review_date, i_retain_until, r_aspect_name, i_retainer_id,
>>>>>>>> i_partition, i_is_replica, i_vstamp}}}}}}}}
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2016-03-22 13:29:26.010 <US...@16366401>  [RMI
>>>>>>>> TCP Connection(1820)-127.0.0.1] [RPC_EXIT]  ......RPC: applyForObject ==>
>>>>>>>> TypedData@144302b[id=098c1b38809921b1,
>>>>>>>> type=do_domep_project_well_wd, readOnly=false, autoFill=true,
>>>>>>>> fetchTimestamp=0, values=[object_name=DSC00683.JPG,
>>>>>>>> r_object_type=do_domep_project_well_wd, title=, subject=, authors=[],
>>>>>>>> keywords=[], a_application_type=, a_status=, r_creation_date=2/11/2016
>>>>>>>> 8:35:45 AM, r_modify_date=3/7/2016 11:17:37 AM, r_modifier=admdcmt,
>>>>>>>> r_access_date=3/16/2016 12:23:59 PM, a_is_hidden=F, i_is_deleted=F,
>>>>>>>> a_retention_date=nulldate, a_archive=F, a_compound_architecture=,
>>>>>>>> a_link_resolved=F, i_reference_cnt=1, i_has_folder=T,
>>>>>>>> i_folder_id=[0b8c1b3880991ad3], r_composite_id=[], r_composite_label=[],
>>>>>>>> r_component_label=[], r_order_no=[], r_link_cnt=0, r_link_high_cnt=0,
>>>>>>>> r_assembled_from_id=0000000000000000, r_frzn_assembly_cnt=0,
>>>>>>>> r_has_frzn_assembly=F, resolution_label=, r_is_virtual_doc=0,
>>>>>>>> i_contents_id=068c1b388064051a, a_content_type=jpeg, r_page_cnt=1,
>>>>>>>> r_content_size=949228, a_full_text=T, a_storage_type=repo,
>>>>>>>> i_cabinet_id=0c8c1b38806aee30, owner_name=Domep controlador 00010,
>>>>>>>> owner_permit=7, group_name=, group_permit=1, world_permit=1,
>>>>>>>> i_antecedent_id=0000000000000000, i_chronicle_id=098c1b38809921b1,
>>>>>>>> i_latest_flag=T, r_lock_owner=, r_lock_date=nulldate, r_lock_machine=,
>>>>>>>> log_entry=, r_version_label=[1.0, CURRENT], i_branch_cnt=0, i_direct_dsc=F,
>>>>>>>> r_immutable_flag=F, r_frozen_flag=F, r_has_events=F, acl_domain=admdocum,
>>>>>>>> acl_name=domep_ac_en_02080, a_special_app=, i_is_reference=F,
>>>>>>>> r_creator_name=Domep controlador 00010, r_is_public=F,
>>>>>>>> r_policy_id=468c1b38809c6e47, r_resume_state=-1, r_current_state=0,
>>>>>>>> r_alias_set_id=0000000000000000, a_effective_date=[], a_expiration_date=[],
>>>>>>>> a_publish_formats=[], a_effective_label=[], a_effective_flag=[],
>>>>>>>> a_category=, language_code=, a_is_template=F, a_controlling_app=,
>>>>>>>> r_full_content_size=949228, a_extended_properties=[], a_is_signed=F,
>>>>>>>> a_last_review_date=nulldate, i_retain_until=nulldate,
>>>>>>>> r_aspect_name=[asp_herencia], i_retainer_id=[], i_partition=0,
>>>>>>>> i_is_replica=F, i_vstamp=4, negocio=E&P, attr_is_gdcom=F,
>>>>>>>> atr_confidential_level=Internal Use, atr_owner_area=OFICINA DE E,
>>>>>>>> atr_logical_code=IQEXPEOMKUR000WEL2016000063, atr_original_reference_id=[],
>>>>>>>> atr_revision=, atr_entity=[], atr_author=[Domep controlador 00010],
>>>>>>>> atr_doc_type=Reporting, atr_category_doc=Geology, atr_subcat_doc=Progress
>>>>>>>> Report, atr_discipline=GEOLOGY, atr_subdiscipline=[],
>>>>>>>> atr_language=[ENGLISH], atr_physical_document=F, atr_physical_code=[],
>>>>>>>> atr_warehouse=, atr_retention=YES, atr_digital_media=F, atr_internal=YES,
>>>>>>>> atr_country=[XYZ], atr_basin=[XYZZ], atr_environment=, atr_acreage=[],
>>>>>>>> atr_abstract=, atr_doc_creation_date=5/22/2006 8:31:43 AM, atr_title=WELL
>>>>>>>> BARAM 1 PHOTOS FIELD 06, atr_collection=[], atr_is_collection=F,
>>>>>>>> atr_is_principal=F, atr_is_anexo=F, atr_id_collection=[],
>>>>>>>> atr_is_relation=F, atr_field=[], atr_original_revision=, atr_remarks=,
>>>>>>>> atr_keywords=[], atr_principal_folder_id=0b8c1b3880991ad3,
>>>>>>>> atr_original_version=, atr_be_name=BARAM 1, atr_be_ref=IQWEL000008,
>>>>>>>> atr_be_short_name=BA 1, atr_issued_for_code=, atr_issued_for_description=,
>>>>>>>> atr_subbasin=[], atr_be_type_id=8, atr_comment=, atr_status=Draft,
>>>>>>>> atr_prepared_by=[], atr_preparation_date=nulldate, atr_verified_by=[],
>>>>>>>> atr_verification_date=nulldate, atr_approved_by=[],
>>>>>>>> atr_approval_date=nulldate, atr_workflow=, atr_well_name=BARAM 1,
>>>>>>>> atr_uwi=IQ010004432, atr_borehole_name=[BARAM 1], atr_ubhi=[IQ01000443200],
>>>>>>>> atr_borehole_alias=[], atr_borehole_short_name=[], atr_sample_type=[],
>>>>>>>> atr_analysis_type=[], asp_herencia.atr_isnew=F, asp_herencia.atr_niveles=0,
>>>>>>>> asp_herencia.atr_tipo=[], asp_herencia.i_partition=0,
>>>>>>>> r_object_id=098c1b38809921b1, _KEEP_LOCK_=F, _FREEZE_COMPONENTS_=F,
>>>>>>>> _THAW_COMPONENTS_=F, _CONTENTS_CHANGED_=F, _DIST_SAVE_AS_NEW_=F]]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11 March 2016 at 15:33, Radek Sklenicka <
>>>>>>>> radek.sklenicka@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks Karl, we'll verify that.
>>>>>>>>>
>>>>>>>>> -Radek
>>>>>>>>>
>>>>>>>>> On 11 March 2016 at 14:21, Karl Wright <da...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Radek,
>>>>>>>>>>
>>>>>>>>>> This is the DQL query that is run:
>>>>>>>>>>
>>>>>>>>>>       String strDQL = "select attr_name FROM dmi_dd_attr_info"
>>>>>>>>>>         + " where type_name = '" + docType + "' order by attr_name asc";
>>>>>>>>>>
>>>>>>>>>> Karl
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Mar 11, 2016 at 8:19 AM, Karl Wright <da...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Radek,
>>>>>>>>>>>
>>>>>>>>>>> The Document Types page runs a DQL query to populate the
>>>>>>>>>>> document types.  The fact that you get duplicates means that something may
>>>>>>>>>>> be corrupt with your Documentum instance.  It's possible that for some reason
>>>>>>>>>>> the instance is set up with multiple records that each have the same name
>>>>>>>>>>> but different key values.
>>>>>>>>>>>
>>>>>>>>>>> Documentum used to have a little web app that allowed you to
>>>>>>>>>>> execute DQL queries.  I'd experiment to see what was leading to the
>>>>>>>>>>> duplication.  The fact that you can't get any metadata during crawling is
>>>>>>>>>>> almost certainly related.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Karl
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 11, 2016 at 8:10 AM, Radek Sklenicka <
>>>>>>>>>>> radek.sklenicka@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> We are not able to pull metadata from one of our Documentum
>>>>>>>>>>>> instances (it is 6.7).
>>>>>>>>>>>> Interestingly, on the Job > Document Types page each metadata
>>>>>>>>>>>> field is displayed 3 times in the metadata boxes - could this be an issue?
>>>>>>>>>>>> Screenshots:
>>>>>>>>>>>> http://take.ms/mJhPh
>>>>>>>>>>>> http://take.ms/AMZF0
>>>>>>>>>>>> We have quite a long list of document types and it takes
>>>>>>>>>>>> minutes to load the Document Types page.
>>>>>>>>>>>>
>>>>>>>>>>>> Also, we can successfully pull metadata from our testing
>>>>>>>>>>>> Documentum (it is 7.1), and I noticed that there is a difference in
>>>>>>>>>>>> connector logs between the two:
>>>>>>>>>>>>
>>>>>>>>>>>> 1.) here we are able to pull metadata:
>>>>>>>>>>>>
>>>>>>>>>>>> DEBUG 2016-03-10 03:50:08,051 (Worker thread '3') - DCTM:
>>>>>>>>>>>> Document 090007c28000569d has version label:
>>>>>>>>>>>> 11+authors+object_name+owner_name+owner_permit+r_creation_date+r_creator_name+r_modifier+r_modify_date+r_object_id+r_object_type+title++0+DEAD_AUTHORITY+1.0_0_
>>>>>>>>>>>> http://localhost/webtop/
>>>>>>>>>>>> DEBUG 2016-03-10 03:50:08,052 (Worker thread '3') - DCTM:
>>>>>>>>>>>> Inside processDocuments
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2.) NOT able to pull metadata:
>>>>>>>>>>>>
>>>>>>>>>>>> DEBUG 2016-03-10 14:58:22,908 (Worker thread '22') - DCTM:
>>>>>>>>>>>> Document 098c1b3880991f48 has version label: 0++0+DEAD_AUTHORITY+_4_
>>>>>>>>>>>> http://localhost/webtop
>>>>>>>>>>>> DEBUG 2016-03-10 14:58:22,908 (Worker thread '22') - DCTM:
>>>>>>>>>>>> Inside processDocuments
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Any ideas will be appreciated.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>
>>>>>>>>>>>> Radek
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Documentum - unable to index metadata

Posted by Karl Wright <da...@gmail.com>.
Hi Radek,

The connector relies on DFC to fetch the attributes, and gets them from a
DFC document object.  At this point I don't know why the DFC document
object would not contain the expected attributes.  This is what I can think
of:

(1) You might have a DFC version that is incompatible with the Documentum
version you are trying to connect to.
(2) There might need to be a special way of obtaining attribute values from
DFC on a multilanguage system -- e.g. a special prefix.
(3) The attributes are actually being fetched but you aren't noticing
because of an output connector misconfiguration of some kind.

Wish I knew which it was...

Karl
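
To help tell these cases apart, a small diagnostic along the following lines
could read the selected attributes by name and print what comes back, using
the same two calls (getValueCount and getRepeatingString) as the connector
code quoted earlier in this thread. This is only a sketch: TypedObject is a
hypothetical stand-in interface declared so the example compiles without
DFC, and fromMap is a test double. Against a real system the object would be
the DFC document object, and the behaviour for unknown attribute names may
differ from the test double's.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AttributeDump {
  /** Hypothetical stand-in for the two DFC calls that getAttributeValues relies on. */
  interface TypedObject {
    int getValueCount(String attribute) throws Exception;
    String getRepeatingString(String attribute, int index) throws Exception;
  }

  /** Test double backed by a map; unknown attribute names report zero values. */
  public static TypedObject fromMap(final Map<String, List<String>> data) {
    return new TypedObject() {
      public int getValueCount(String attribute) {
        return data.getOrDefault(attribute, Collections.<String>emptyList()).size();
      }
      public String getRepeatingString(String attribute, int index) {
        return data.get(attribute).get(index);
      }
    };
  }

  /** Reads every value of every requested attribute by name, recording failures inline. */
  public static Map<String, List<String>> dump(TypedObject object, List<String> names) {
    Map<String, List<String>> out = new LinkedHashMap<>();
    for (String name : names) {
      List<String> values = new ArrayList<>();
      try {
        int count = object.getValueCount(name);
        for (int i = 0; i < count; i++) {
          values.add(object.getRepeatingString(name, i));
        }
      } catch (Exception e) {
        values.add("<error: " + e.getMessage() + ">");
      }
      out.put(name, values);
    }
    return out;
  }

  public static void main(String[] args) {
    // Sample values taken from the DFC trace quoted in this thread.
    Map<String, List<String>> data = new LinkedHashMap<>();
    data.put("atr_language", List.of("ENGLISH"));
    data.put("atr_title", List.of("WELL BARAM 1 PHOTOS FIELD 06"));
    data.put("atr_entity", List.of());
    System.out.println(dump(fromMap(data), List.of("atr_language", "atr_title", "atr_entity")));
  }
}
```

If such a dump shows values while the index still has none, that would point
towards (3); empty results for names that the DFC trace shows populated
would point towards (1) or (2).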


>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11 March 2016 at 15:33, Radek Sklenicka <
>>>>>>>>> radek.sklenicka@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Karl, we'll verify that.
>>>>>>>>>>
>>>>>>>>>> -Radek
>>>>>>>>>>
>>>>>>>>>> On 11 March 2016 at 14:21, Karl Wright <da...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Radek,
>>>>>>>>>>>
>>>>>>>>>>> This is the DQL query that is run:
>>>>>>>>>>>
>>>>>>>>>>>       String strDQL = "select attr_name FROM dmi_dd_attr_info
>>>>>>>>>>> where type_name = '" + docType + "' order by attr_name asc";
>>>>>>>>>>>
>>>>>>>>>>> Karl
>>>>>>>>>>>
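The query quoted above returns one row per matching record in dmi_dd_attr_info, so any repeated records for the same type show up as repeated attribute names in the UI. A minimal sketch of two ways to collapse them follows. It assumes the repeats are whole-row duplicates of attr_name (for example, one record per data dictionary locale, which is a guess and not confirmed in this thread). The class and method names are hypothetical and are not part of the ManifoldCF connector.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not connector code: illustrates de-duplicating attribute names.
class AttrNameQuery {
    /** Builds the attribute-name query with DISTINCT so repeated rows collapse server-side. */
    static String buildAttrNameDql(String docType) {
        // Double any single quote so the type name stays inside the string literal.
        String safeType = docType.replace("'", "''");
        return "select distinct attr_name from dmi_dd_attr_info"
            + " where type_name = '" + safeType + "' order by attr_name asc";
    }

    /** Client-side alternative: drops repeated names while keeping first-seen order. */
    static List<String> dedupe(List<String> names) {
        return new ArrayList<>(new LinkedHashSet<>(names));
    }

    public static void main(String[] args) {
        System.out.println(buildAttrNameDql("dm_document"));
        List<String> raw = new ArrayList<>();
        raw.add("title");
        raw.add("title");
        raw.add("title");
        raw.add("authors");
        System.out.println(dedupe(raw));
    }
}
```

Either approach only hides the repetition in the list of names; it does not explain why attribute values are missing during crawling.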
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 11, 2016 at 8:19 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Radek,
>>>>>>>>>>>>
>>>>>>>>>>>> The Document Types page runs a DQL query to populate the
>>>>>>>>>>>> document types.  The fact that you get duplicates means that something may
>>>>>>>>>>>> be corrupt with your Documentum instance.  It's possible that for some reason
>>>>>>>>>>>> the instance is set up with multiple records that each have the same name
>>>>>>>>>>>> but different key values.
>>>>>>>>>>>>
>>>>>>>>>>>> Documentum used to have a little web app that allowed you to
>>>>>>>>>>>> execute DQL queries.  I'd experiment to see what was leading to the
>>>>>>>>>>>> duplication.  The fact that you can't get any metadata during crawling is
>>>>>>>>>>>> almost certainly related.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Karl
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 11, 2016 at 8:10 AM, Radek Sklenicka <
>>>>>>>>>>>> radek.sklenicka@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> We are not able to pull metadata from one of our Documentum
>>>>>>>>>>>>> instances (it is 6.7).
>>>>>>>>>>>>> Interestingly, on the Job > Document Types page each metadata
>>>>>>>>>>>>> field is displayed 3 times in the metadata boxes - could this be an issue?
>>>>>>>>>>>>> Screenshots:
>>>>>>>>>>>>> http://take.ms/mJhPh
>>>>>>>>>>>>> http://take.ms/AMZF0
>>>>>>>>>>>>> We have quite a long list of document types and it takes
>>>>>>>>>>>>> minutes to load the Document Types page.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, we can successfully pull metadata from our testing
>>>>>>>>>>>>> Documentum (it is 7.1), and I noticed that there is a difference in
>>>>>>>>>>>>> connector logs between the two:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1.) here we are able to pull metadata:
>>>>>>>>>>>>>
>>>>>>>>>>>>> DEBUG 2016-03-10 03:50:08,051 (Worker thread '3') - DCTM:
>>>>>>>>>>>>> Document 090007c28000569d has version label:
>>>>>>>>>>>>> 11+authors+object_name+owner_name+owner_permit+r_creation_date+r_creator_name+r_modifier+r_modify_date+r_object_id+r_object_type+title++0+DEAD_AUTHORITY+1.0_0_
>>>>>>>>>>>>> http://localhost/webtop/
>>>>>>>>>>>>> DEBUG 2016-03-10 03:50:08,052 (Worker thread '3') - DCTM:
>>>>>>>>>>>>> Inside processDocuments
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2.) NOT able to pull metadata:
>>>>>>>>>>>>>
>>>>>>>>>>>>> DEBUG 2016-03-10 14:58:22,908 (Worker thread '22') - DCTM:
>>>>>>>>>>>>> Document 098c1b3880991f48 has version label: 0++0+DEAD_AUTHORITY+_4_
>>>>>>>>>>>>> http://localhost/webtop
>>>>>>>>>>>>> DEBUG 2016-03-10 14:58:22,908 (Worker thread '22') - DCTM:
>>>>>>>>>>>>> Inside processDocuments
>>>>>>>>>>>>>
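Comparing the two logged version labels, the leading number appears to be the count of attribute names packed into the label: 11 names follow it in the working case, and none follow the 0 in the failing case. This is an inference from the two log lines above, not a documented format. Under that assumption, a small sketch for pulling the count out of a logged label (useful when scanning many log lines) could look like this. The class and method names are hypothetical.

```java
// Hypothetical log-inspection helper; the label layout is inferred from the log lines above.
class VersionLabelCount {
    /**
     * Returns the number before the first '+' in a version label,
     * or -1 if the label does not start with an integer followed by '+'.
     */
    static int leadingAttributeCount(String versionLabel) {
        int plus = versionLabel.indexOf('+');
        if (plus <= 0) {
            return -1;
        }
        try {
            return Integer.parseInt(versionLabel.substring(0, plus));
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(leadingAttributeCount("0++0+DEAD_AUTHORITY+_4_"));
    }
}
```

A count of 0 would suggest the job specification reached the crawler with no attributes selected for that document type, rather than the attributes being selected but failing to be read.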
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any ideas will be appreciated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Radek
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>