Posted to dev@tinkerpop.apache.org by "Marko A. Rodriguez (JIRA)" <ji...@apache.org> on 2017/02/01 17:35:51 UTC

[jira] [Closed] (TINKERPOP-1315) HadoopConfiguration will not allow an ArrayList to be serialized in vertexProgram configuration unless setProperty is overridden

     [ https://issues.apache.org/jira/browse/TINKERPOP-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marko A. Rodriguez closed TINKERPOP-1315.
-----------------------------------------
    Resolution: Won't Fix
      Assignee: Marko A. Rodriguez

Closing this, as the behavior simply reflects how Apache Commons Configuration handles list values. Please reopen if you believe that TinkerPop should do something different.

> HadoopConfiguration will not allow an ArrayList to be serialized in vertexProgram configuration unless setProperty is overridden
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1315
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1315
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: hadoop
>    Affects Versions: 3.2.1
>            Reporter: Dylan Bethune-Waddell
>            Assignee: Marko A. Rodriguez
>            Priority: Minor
>
> I have been implementing a "PrecisionBulkLoader" class that takes a ScriptTraversal with bindings and executes it against the target graph to getOrCreate vertices/edges with more precision. This follows from my realization that IncrementalBulkLoader currently overwrites the first edge with the same label between the two vertex endpoints in the target graph, which is an issue for self-loops and multi-edges:
> https://issues.apache.org/jira/browse/TINKERPOP-1099
> I finally got it to work with the script bindings being propagated to workers, but to do so without keeping only the last value of the array I had to override the setProperty method in org.apache.tinkerpop.gremlin.hadoop.structure.HadoopConfiguration. Before I did that, whenever ConfigurationUtils.copy(conf1, conf2) was called with a HadoopConfiguration on either end (conf1 or conf2), any multi-valued / list properties were clobbered, and only the last value remained after the first storeState/loadState cycle in BulkLoaderVertexProgram. This had been bugging me for a while: with multiple hosts configured for TitanGraph in the config, the HadoopConfiguration only opened a connection against the last host in the list. With this change to HadoopConfiguration, the Spark executor logs read standardtitangraph[cassandrathrift:[host1, host2, ...]], as you might expect, and the bindings for the ScriptTraversal survive storeState/loadState and are applied to the traversal.
> Is this dangerous or bad somehow? I know that in a few places I saw the values of the configuration being explicitly converted with toString()...
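
The clobbering described above can be illustrated with a toy model. This is only a sketch under assumptions: ToyConfiguration, ListPreservingConfiguration, copy, and the key "example.hosts" are hypothetical names, not TinkerPop or Commons Configuration classes. It assumes that the base setProperty adds list elements one at a time into a map that holds a single value per key, which would explain why only the last element survives a copy. The actual HadoopConfiguration and ConfigurationUtils code may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only; none of these types come from TinkerPop or Commons Configuration. */
public class ListPropertySketch {

    /** Hypothetical map-backed configuration that holds one value per key. */
    static class ToyConfiguration {
        protected final Map<String, Object> properties = new LinkedHashMap<>();

        /** Stores a single value, replacing whatever the key held before. */
        protected void addPropertyDirect(String key, Object value) {
            properties.put(key, value);
        }

        /** Assumed base behavior: clear the key, then add list elements one by one. */
        public void setProperty(String key, Object value) {
            properties.remove(key);
            if (value instanceof List) {
                for (Object element : (List<?>) value) {
                    addPropertyDirect(key, element); // each element overwrites the previous one
                }
            } else {
                addPropertyDirect(key, value);
            }
        }

        public Object getProperty(String key) {
            return properties.get(key);
        }
    }

    /** Variant with setProperty overridden so that a list is stored whole. */
    static class ListPreservingConfiguration extends ToyConfiguration {
        @Override
        public void setProperty(String key, Object value) {
            Object stored = value instanceof List ? new ArrayList<>((List<?>) value) : value;
            properties.put(key, stored);
        }
    }

    /** Copies every property from source to target through target.setProperty. */
    static void copy(ToyConfiguration source, ToyConfiguration target) {
        for (String key : source.properties.keySet()) {
            target.setProperty(key, source.getProperty(key));
        }
    }

    public static void main(String[] args) {
        ToyConfiguration source = new ListPreservingConfiguration();
        source.setProperty("example.hosts", Arrays.asList("host1", "host2"));

        ToyConfiguration lossy = new ToyConfiguration();
        copy(source, lossy);
        System.out.println(lossy.getProperty("example.hosts")); // prints host2

        ToyConfiguration preserving = new ListPreservingConfiguration();
        copy(source, preserving);
        System.out.println(preserving.getProperty("example.hosts")); // prints [host1, host2]
    }
}
```

In this model the override trades one behavior for another: the list survives the copy, but any caller that expects a scalar for that key (for example, code that calls toString() on the value) now receives a list.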



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)