Posted to issues@geode.apache.org by "Ashvin (JIRA)" <ji...@apache.org> on 2015/09/09 02:15:45 UTC

[jira] [Comment Edited] (GEODE-310) HDFSStore API should be more consistent

    [ https://issues.apache.org/jira/browse/GEODE-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735867#comment-14735867 ] 

Ashvin edited comment on GEODE-310 at 9/9/15 12:14 AM:
-------------------------------------------------------

I think there are two issues reported in this jira.
# Use of units in names
# Use of Buffer instead of Queue in name (consistency with other components)

In general, units have not been used in names in Geode, so removing the units from the HDFS members would be preferable. All sizes are specified in MB, so the MB suffix can easily be removed for consistency. However, HdfsStore currently uses 3 different units for its interval configurations, which makes it susceptible to configuration errors. I am thinking of using a single unit for all intervals, i.e. all intervals would be configured in milliseconds. Milliseconds, because batchInterval is configured in milliseconds in other components. Does that sound ok? Other alternatives:
# use Java size-style configuration (1s for 1 second, 1m for 1 minute, 1h for 1 hour, default milliseconds). This would be the first of its kind in Geode, but maybe it's time for this kind of change
# use cron string
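
To make the first alternative concrete, here is a rough sketch of how such suffixed interval strings could be normalized to milliseconds. IntervalParser and parseMillis are hypothetical names used only for illustration; nothing like this exists in Geode or HDFSStore today, and the accepted suffixes (s, m, h, none for milliseconds) are just the ones listed above.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of the Geode API. */
final class IntervalParser {
    private IntervalParser() {}

    /**
     * Converts strings such as "1s", "1m", "1h" or "500" to milliseconds.
     * A value without a suffix is interpreted as milliseconds.
     */
    static long parseMillis(String text) {
        if (text == null) {
            throw new IllegalArgumentException("interval must not be null");
        }
        String s = text.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new IllegalArgumentException("interval must not be empty");
        }

        // Default unit when no suffix is given.
        TimeUnit unit = TimeUnit.MILLISECONDS;
        char last = s.charAt(s.length() - 1);
        if (!Character.isDigit(last)) {
            switch (last) {
                case 's': unit = TimeUnit.SECONDS; break;
                case 'm': unit = TimeUnit.MINUTES; break;
                case 'h': unit = TimeUnit.HOURS; break;
                default:
                    throw new IllegalArgumentException("unknown unit in: " + text);
            }
            // Drop the suffix so only the numeric part remains.
            s = s.substring(0, s.length() - 1).trim();
        }

        long value;
        try {
            value = Long.parseLong(s);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a valid interval: " + text, e);
        }
        if (value < 0) {
            throw new IllegalArgumentException("interval must not be negative: " + text);
        }
        return unit.toMillis(value);
    }
}
```

With something like this, the setters could keep taking a plain millisecond value internally, and the string form would only be a convenience at the configuration layer.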

Regarding the other issue: while HDFS persistence depends on AsyncQueues, the use of queue in the configuration is confusing. The order of events in the queue is different from the order of events on HDFS, so the queue is actually used as a buffer and event ordering is not guaranteed. For correctness, using Buffer in the name is more consistent with the behavior than Queue. However, that also means the API would become component specific and the stats names would not be consistent. So I will replace buffer with queue. Does that sound ok?


was (Author: ashvin):
I think there are two issues reported in this jira.
# Use of units in names
# Use of Buffer instead of Queue in name (consistency with other components)

In general, units have not been used in names in Geode, so removing the units from the HDFS members would be preferable. All sizes are specified in MB, so the MB suffix can easily be removed for consistency. However, HdfsStore currently uses 3 different units for its interval configurations, which makes it susceptible to configuration errors. I am thinking of using a single unit for all intervals, i.e. all intervals would be configured in seconds. Does that sound ok?

Regarding the other issue: while HDFS persistence depends on AsyncQueues, the use of queue in the configuration is confusing. The order of events in the queue is different from the order of events on HDFS, so the queue is actually used as a buffer and event ordering is not guaranteed. For correctness, using Buffer in the name is more consistent with the behavior than Queue. However, that also means the API would become component specific and the stats names would not be consistent. So I will replace buffer with queue. Does that sound ok?

> HDFSStore API should be more consistent
> ---------------------------------------
>
>                 Key: GEODE-310
>                 URL: https://issues.apache.org/jira/browse/GEODE-310
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Kirk Lund
>            Assignee: Ashvin
>
> The public interface for HDFSStore has inconsistencies between method names and constants giving default values and between HDFSStore and other entities such as Region and GatewaySender. These need to be made consistent. A decision also needs to be made whether to include units in the attribute names.
> Mismatches between method names and default values:
> -- HDFSStoreFactory.setBatchInterval HDFSStore.DEFAULT_BATCH_INTERVAL_MILLIS
> -- HDFSStoreFactory.setBatchSize HDFSStore.DEFAULT_BATCH_SIZE_MB
> -- HDFSStoreFactory.setBufferPersistent GatewaySender.DEFAULT_PERSISTENCE_ENABLED
> -- HDFSStoreFactory.setDispatcherThreads GatewaySender.DEFAULT_HDFS_DISPATCHER_THREADS
> -- HDFSStoreFactory.setInputFileSizeMax HDFSStore.DEFAULT_INPUT_FILE_SIZE_MAX_MB
> -- HDFSStoreFactory.setMajorCompactionInterval HDFSStore.DEFAULT_MAJOR_COMPACTION_INTERVAL_MINS
> -- HDFSStoreFactory.setMaxMemory GatewaySender.DEFAULT_MAXIMUM_QUEUE_MEMORY
> -- HDFSStoreFactory.setPurgeInterval HDFSStore.DEFAULT_OLD_FILE_CLEANUP_INTERVAL_MINS
> -- HDFSStoreFactory.setWriteOnlyFileRolloverSize HDFSStore.DEFAULT_WRITE_ONLY_FILE_SIZE_LIMIT
> HDFSStore uses "synchronousDiskWrite" where Region and GatewaySender use "diskSynchronous"
> This came up during some of the testing and hydra support for HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)