Posted to dev@tinkerpop.apache.org by "Marko A. Rodriguez (JIRA)" <ji...@apache.org> on 2017/02/01 17:35:51 UTC
[jira] [Closed] (TINKERPOP-1315) HadoopConfiguration will not allow
an ArrayList to be serialized in vertexProgram configuration unless
setProperty is overridden
[ https://issues.apache.org/jira/browse/TINKERPOP-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marko A. Rodriguez closed TINKERPOP-1315.
-----------------------------------------
Resolution: Won't Fix
Assignee: Marko A. Rodriguez
Closing this, as this is simply how Apache Commons Configuration treats lists. Please reopen if you believe that TinkerPop should behave differently.
> HadoopConfiguration will not allow an ArrayList to be serialized in vertexProgram configuration unless setProperty is overridden
> --------------------------------------------------------------------------------------------------------------------------------
>
> Key: TINKERPOP-1315
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1315
> Project: TinkerPop
> Issue Type: Improvement
> Components: hadoop
> Affects Versions: 3.2.1
> Reporter: Dylan Bethune-Waddell
> Assignee: Marko A. Rodriguez
> Priority: Minor
>
> I have been implementing a "PrecisionBulkLoader" class that takes a ScriptTraversal with bindings and executes it against the target graph to getOrCreate vertices/edges with more precision. This follows from my realization that IncrementalBulkLoader currently overwrites the first edge of the same label that it finds between the two endpoint vertices in the target graph, which is a problem for self-loops and multi-edges:
> https://issues.apache.org/jira/browse/TINKERPOP-1099
> I finally got it to work, with the script bindings being propagated to the workers. To do so without ending up with only the last value of the array, I had to override the setProperty method in org.apache.tinkerpop.gremlin.hadoop.structure.HadoopConfiguration. Before I did that, whenever ConfigurationUtils.copy(conf1, conf2) was called with a HadoopConfiguration on either end (conf1 or conf2), any multi-valued / list property was clobbered, and only the last value remained after storeState/loadState went through the first cycle in BulkLoaderVertexProgram.
> This had been bugging me for a while: with multiple hosts configured for TitanGraph in the config, the HadoopConf would only open a connection against the last host in the list. With this change to HadoopConfiguration, the Spark executor logs read standardtitangraph[cassandrathrift:[host1, host2, ...]], as you might expect, and the bindings for the ScriptTraversal survive storeState/loadState and are applied to the traversal.
> Is this dangerous or problematic in some way? I noticed that in a few places the values of the configuration are explicitly toString()'d...
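For readers following the thread, here is a toy model of the behaviour described above. It is only a sketch under stated assumptions: the classes ListConfigSketch, LossyConfig and ListPreservingConfig and the copy helper are hypothetical, and the code uses neither Apache Commons Configuration nor TinkerPop. It assumes that setProperty flattens a collection and adds each element separately, while the backing map overwrites on every add, so only the last element survives a copy. The override shown is one possible workaround, not necessarily the one the reporter wrote.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: every class and method here is hypothetical and merely
// imitates the behaviour described in the issue.
class ListConfigSketch {

    /** A map-backed configuration that loses all but the last list element. */
    static class LossyConfig {
        protected final Map<String, Object> properties = new HashMap<>();

        public void setProperty(String key, Object value) {
            properties.remove(key);
            if (value instanceof Iterable) {
                // Assumption: a collection is flattened and added element by element.
                for (Object element : (Iterable<?>) value) {
                    addPropertyDirect(key, element);
                }
            } else {
                addPropertyDirect(key, value);
            }
        }

        // Overwrites rather than appends, so each element replaces the previous one.
        protected void addPropertyDirect(String key, Object value) {
            properties.put(key, value);
        }

        public Object getProperty(String key) {
            return properties.get(key);
        }
    }

    /** Workaround sketch: store a list as a single value instead of flattening it. */
    static class ListPreservingConfig extends LossyConfig {
        @Override
        public void setProperty(String key, Object value) {
            if (value instanceof List) {
                properties.put(key, new ArrayList<>((List<?>) value));
            } else {
                super.setProperty(key, value);
            }
        }
    }

    /** Imitates a copy utility that calls setProperty on the target for each key. */
    static void copy(LossyConfig from, LossyConfig to) {
        for (String key : from.properties.keySet()) {
            to.setProperty(key, from.getProperty(key));
        }
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("host1", "host2");

        LossyConfig lossy = new LossyConfig();
        lossy.setProperty("hosts", hosts);
        System.out.println("lossy: " + lossy.getProperty("hosts"));

        ListPreservingConfig source = new ListPreservingConfig();
        source.setProperty("hosts", hosts);
        ListPreservingConfig target = new ListPreservingConfig();
        copy(source, target);
        System.out.println("preserving: " + target.getProperty("hosts"));
    }
}
```

In this model the lossy variant ends up holding only host2, while the list-preserving variant keeps both hosts across the copy. The reporter's closing question still applies to a workaround of this kind: any code that calls toString() on a configuration value, or expects a single scalar, would now see the whole list.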
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)