Posted to dev@gora.apache.org by John Mora <jh...@gmail.com> on 2019/08/03 22:59:03 UTC

Re: Kudu datastore reports

Hi all.

I have updated my report in the Wiki[1].

Also, I have sent a PR with my last commits for review [2]. Please give it
a look if you have time.

This week, I will continue working on the documentation of the kudu
datastore.

Please let me know if you have suggestions.

[1]
https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
[2] https://github.com/apache/gora/pull/178

Best,
John.

On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<ca...@gmail.com>) wrote:

> Hi John,
>
> Thanks for the update. I reviewed your code a little bit, and it is looking
> good. I think that you should send a PR in order to receive feedback from
> other community members.
>
> Best,
> Carlos
>
> On Sun, Jul 28, 2019 at 23:20, John Mora (<jh...@gmail.com>) wrote:
>
>> Hi all.
>>
>> I updated my report in the Wiki[1]. Also, I pushed my last commits to my
>> branch [2]. Please give it a look if you have time.
>>
>> This week, I will give a look to the documentation of datastores.
>>
>> Please let me know if you have suggestions.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>
>> Cheers,
>> John
>>
>> On Wed, Jul 24, 2019 at 11:34, John Mora (<jh...@gmail.com>) wrote:
>>
>>> Hi Alfonso,
>>>
>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I will
>>> find an alternative, since it is only a utility class.
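One possible alternative, as a minimal sketch: the JDK class java.util.AbstractMap.SimpleImmutableEntry can stand in for javafx.util.Pair and compiles on OpenJDK. The class name PairExample, the helper pairOf, and the sample values are made up for illustration and are not taken from the gora-kudu code.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

public class PairExample {

  // Builds an immutable key/value pair using only java.util classes,
  // so no javafx module is needed at compile time.
  public static <K, V> Map.Entry<K, V> pairOf(K key, V value) {
    return new SimpleImmutableEntry<>(key, value);
  }

  public static void main(String[] args) {
    // Hypothetical usage: a field name paired with a column type name.
    Map.Entry<String, String> column = pairOf("dateOfBirth", "INT64");
    System.out.println(column.getKey() + " -> " + column.getValue());
  }
}
```

Map.Entry exposes getKey() and getValue(), the same accessor names as javafx.util.Pair, so call sites would mostly only need the type changed.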
>>>
>>> Thanks,
>>> John
>>>
>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>> alfonso.nishikawa@gmail.com>) wrote:
>>>
>>>> Hi, John.
>>>>
>>>> I checked out your code and it looks good :)
>>>> I found that you use javafx, but that is not present in OpenJDK, so the
>>>> code fails to compile there. Since we don't stick to the Oracle JVM, I would
>>>> suggest changing it.
>>>>
>>>> Good job, keep it going :)
>>>>
>>>> Regards,
>>>>
>>>> Alfonso Nishikawa
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<jh...@gmail.com>) wrote:
>>>>
>>>>> Hi.
>>>>>
>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits to
>>>>> my branch [2]. Please give it a look if you have time.
>>>>>
>>>>> This week, I will give a look to the map reduce tests for DataStores.
>>>>>
>>>>> Please let me know if you have suggestions.
>>>>>
>>>>> [1]
>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>
>>>>> Thanks,
>>>>> John
>>>>>
>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<jh...@gmail.com>) wrote:
>>>>>
>>>>>> Hi all
>>>>>>
>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits to
>>>>>> my branch [2]. Please give it a look if you have time.
>>>>>>
>>>>>> This week, I will be working on the getPartitions and deleteByQuery
>>>>>> methods and running the other tests in the DataStoreTestBase class.
>>>>>>
>>>>>> Please let me know if you have suggestions.
>>>>>>
>>>>>> [1]
>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>
>>>>>> Best,
>>>>>> John.
>>>>>>
>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<jh...@gmail.com>) wrote:
>>>>>>
>>>>>>> Hi Alfonso,
>>>>>>>
>>>>>>> Thanks so much for your time and support for this project. I will
>>>>>>> work on your comments. Responses inline :)
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>
>>>>>>>> Hi, John.
>>>>>>>>
>>>>>>>> Sorry for the delay, I am changing jobs and I have been very busy
>>>>>>>> :( I will try to answer your questions :)
>>>>>>>>
>>>>>>>> *> In the Employee example there is a field called 'dateOfBirth'. I
>>>>>>>> tried to map that field with the UNIXTIME_MICROS datatype of Kudu (I
>>>>>>>> intuitively assumed this is a date.). However, in the java world the
>>>>>>>> Employee field is a Long value and the kudu datatype is a Timestamp. So, I
>>>>>>>> was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>
>>>>>>>> Avro 1.8 introduced "Logical Types", so there is a "date" type with an
>>>>>>>> underlying "int" [1]. This is the first time I have read about it, because
>>>>>>>> it wasn't there until the last Avro version upgrade. I would suggest
>>>>>>>> ignoring "dates" and mapping dateOfBirth as a long, since in any case -in avro-
>>>>>>>> the value is the unix epoch. After this first approach, a design
>>>>>>>> improvement would be great, though :)
>>>>>>>>
>>>>>>>> - Would it be good to have a "timestamp" type in the mapping so that
>>>>>>>> KuduStore converts between the Entity long field <-> Kudu timestamp storage?
>>>>>>>> - Is there any other approach?
>>>>>>>>
>>>>>>>
>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion is
>>>>>>> the best alternative right now, because it would add more compatible
>>>>>>> datatypes to the mapping parameters that users can use. This
>>>>>>> conversion should not be difficult to implement, in my opinion. Also, the new
>>>>>>> Date datatype of Avro could be implemented in newer versions, because it
>>>>>>> would need further analysis in other datastores too. I will work on that.
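The conversion discussed above could look roughly like the sketch below. It assumes the entity's long field holds epoch milliseconds (the thread does not state the unit) and relies on Kudu's UNIXTIME_MICROS type storing microseconds since the Unix epoch. The class and method names are hypothetical and no Kudu classes are used, so this only shows the arithmetic a KuduStore would apply before writing or after reading.

```java
import java.util.concurrent.TimeUnit;

public class TimestampMapping {

  // Entity long (assumed epoch milliseconds) -> value for a UNIXTIME_MICROS column.
  public static long toKuduMicros(long epochMillis) {
    return TimeUnit.MILLISECONDS.toMicros(epochMillis);
  }

  // UNIXTIME_MICROS value -> entity long; sub-millisecond precision is dropped.
  public static long toEntityMillis(long epochMicros) {
    return TimeUnit.MICROSECONDS.toMillis(epochMicros);
  }

  public static void main(String[] args) {
    long dateOfBirth = 1563000000000L; // sample value in epoch milliseconds
    long stored = toKuduMicros(dateOfBirth);
    System.out.println(stored + " -> " + toEntityMillis(stored));
  }
}
```

Going from micros back to millis loses precision, so a value written by another Kudu client with microsecond detail would not round-trip exactly through the entity.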
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *> What is the Gora's policy regarding flush()? *
>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html>and
>>>>>>>> also can set time interval
>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>> for automatic flush.*
>>>>>>>> *> Should these behaviors be configurable using the gora.properties
>>>>>>>> file, or should we just use the default configuration?*
>>>>>>>>
>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>> gora.properties [2], which is used when instantiating the Table, but at the same
>>>>>>>> time we implement the flush() method to force the flush [3]. I would
>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>> What flushing mode (and time interval, if it applies) do you suggest?
>>>>>>>>
>>>>>>>
>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do the job
>>>>>>> for most use cases. But I will add a configuration in gora.properties for
>>>>>>> selecting the other modes and specifying an autoflush interval if needed by
>>>>>>> the user.
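A minimal sketch of how such a setting might be read, assuming two made-up property keys (gora.datastore.kudu.flushMode and gora.datastore.kudu.flushInterval are illustrative, not actual Gora keys). The local enum only mirrors the constant names of Kudu's SessionConfiguration.FlushMode so the example compiles without the kudu-client dependency; a real store would map the result onto the client's own enum.

```java
import java.util.Locale;
import java.util.Properties;

public class KuduFlushConfig {

  // Mirrors the constant names of org.apache.kudu.client.SessionConfiguration.FlushMode.
  public enum FlushMode { AUTO_FLUSH_SYNC, AUTO_FLUSH_BACKGROUND, MANUAL_FLUSH }

  // Hypothetical property keys, chosen only for this sketch.
  public static final String FLUSH_MODE_KEY = "gora.datastore.kudu.flushMode";
  public static final String FLUSH_INTERVAL_KEY = "gora.datastore.kudu.flushInterval";

  // Falls back to AUTO_FLUSH_SYNC when the key is absent or blank.
  public static FlushMode readFlushMode(Properties props) {
    String raw = props.getProperty(FLUSH_MODE_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return FlushMode.AUTO_FLUSH_SYNC;
    }
    return FlushMode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }

  // Returns -1 when no interval is configured, meaning "keep the client default".
  public static int readFlushIntervalMillis(Properties props) {
    String raw = props.getProperty(FLUSH_INTERVAL_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return -1;
    }
    return Integer.parseInt(raw.trim());
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(FLUSH_MODE_KEY, "manual_flush");
    System.out.println(readFlushMode(props) + ", interval=" + readFlushIntervalMillis(props));
  }
}
```

An unknown mode name makes valueOf throw IllegalArgumentException; a store would probably want to catch that and report the bad property value.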
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> *> Also, while reviewing the datastore interface I noticed this
>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>> this method? Should I use the partition definition in the xml mapping file
>>>>>>>> for this?*
>>>>>>>>
>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache Gora
>>>>>>>> integrates with Hadoop by implementing a custom Map and Reduce that allow
>>>>>>>> getting/writing Entities directly.
>>>>>>>> You can take a look at HBase's implementation [4], which relies on o.a.h.hbase.mapreduce.TableInputFormatBase
>>>>>>>> [5] to compute the splits (start key---end key) with the location of the
>>>>>>>> split to create a collection of partitions [6].
>>>>>>>>
>>>>>>>> So, if Kudu allows performing computation using local kudu
>>>>>>>> splits, then this method does the preparation needed to "send the
>>>>>>>> computation to where the data is locally".
>>>>>>>>
>>>>>>>> In any case, you can see that:
>>>>>>>>
>>>>>>>>    - MongoDB store implementation does not implement splitting [7]
>>>>>>>>    - Cassandra store implementation does not implement splitting
>>>>>>>>    [8]
>>>>>>>>    - Aerospike store implementation does not implement splitting
>>>>>>>>    [9]
>>>>>>>>    - Accumulo store implementation* does* implement splitting [10]
>>>>>>>>
>>>>>>>> If Kudu has a method to get the different splits for a table and
>>>>>>>> its locations, then you will be able to implement the full feature.
>>>>>>>>
>>>>>>>> This is Hadoop related and it is not trivial. I haven't elaborated
>>>>>>>> much, so if you find you need more information let me know :)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> I will check whether Kudu has these features in order to implement
>>>>>>> this method. If not I will use the default implementation found in other
>>>>>>> backends.
>>>>>>>
>>>>>>>
>>>>>>>> About Queries, what I can tell is that HBase only implements "Start
>>>>>>>> key" + "End key" because it has only 2 operations: "get" and "scan", and
>>>>>>>> the querying is for the "scan" operation, where you want an interval (or all) of
>>>>>>>> the rows. Does Kudu have more querying functionality?
>>>>>>>>
>>>>>>>>
>>>>>>> Yes, Kudu implements a Scanner for querying data, along with
>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>
>>>>>>>
>>>>>>>> On another topic, I am trying to install Kudu in standalone mode (all
>>>>>>>> in 1 node). Do you use a Cloudera installation or do you have a standalone
>>>>>>>> installation? How do you do it? I found some instructions, but they talk
>>>>>>>> about compiling Kudu [11]. I was looking for something like HBase, where it
>>>>>>>> is unzip + execute "hbase start".
>>>>>>>>
>>>>>>>>
>>>>>>> I am using an embedded mini-cluster which comes with compiled
>>>>>>> binaries and can be used with maven[1] for testing my code. Once I get it
>>>>>>> mature enough I think I will be testing the datastore with a docker
>>>>>>> container [2]. I could not find an unzip+execute bundle either and I am
>>>>>>> kinda noob for compiling it myself.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>
>>>>>>>
>>>>>>>> Good job and thank you!! :)
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Alfonso Nishikawa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] - https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>> [2] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>> [3] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>> [4] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>> [5] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>> [6] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>> [7] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>> [8] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>> [9] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>> [10] -
>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<jh...@gmail.com>) wrote:
>>>>>>>>
>>>>>>>>> Hi all.
>>>>>>>>>
>>>>>>>>> As every week I updated my report in the Wiki[1]. Also, I pushed
>>>>>>>>> my last commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>
>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>
>>>>>>>>> Also, while reviewing the datastore interface I noticed this
>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>> this method? Should I use the partition definition in the xml mapping file
>>>>>>>>> for this?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> John.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi all.
>>>>>>>>>>
>>>>>>>>>> I received my first evaluation from the Google Summer of Code
>>>>>>>>>> program with a positive result. Thanks so much for your support and
>>>>>>>>>> confidence in the project and in me.
>>>>>>>>>>
>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I pushed
>>>>>>>>>> my last commits to my branch [2].
>>>>>>>>>>
>>>>>>>>>> This week, I will be reviewing the serialization/
>>>>>>>>>> deserialization process in order to identify optimizations specific to
>>>>>>>>>> Kudu, because I used generic methods from other backends which probably
>>>>>>>>>> could be better tuned for Kudu. Also, I will start working on the Queries
>>>>>>>>>> implementation.
>>>>>>>>>>
>>>>>>>>>> BTW, I added a question to the wiki about Date types. Please give
>>>>>>>>>> it a look if you have time.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> John
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday. :D
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> John.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi John
>>>>>>>>>>>>
>>>>>>>>>>>> The first Google Summer of Code evaluation is due on June 28th.
>>>>>>>>>>>> Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Carlos
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>
>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1]. Also, I
>>>>>>>>>>>>> pushed my last commits to my branch [2].
>>>>>>>>>>>>>
>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
>>>>>>>>>>>>> datastores deal with flush(). Should it always be executed manually?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Finally, this week I will be implementing object
>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, exists. Do
>>>>>>>>>>>>> you have any suggestions on how to proceed with this task?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the problem.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> John
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But I noticed
>>>>>>>>>>>>>> that you have created an Enumeration for datatypes, which is very similar
>>>>>>>>>>>>>> to the kudu-client's [2]. You should probably replace [1] with [2] in order
>>>>>>>>>>>>>> to avoid code duplication.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>> [2] https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I noticed
>>>>>>>>>>>>>>> that my code is lacking some javadoc documentation, so I think I will be
>>>>>>>>>>>>>>> working on that this week. I would also like to enable and check the schema
>>>>>>>>>>>>>>> management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on your
>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations on the
>>>>>>>>>>>>>>>>>    mapping file.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with 3.6.0,
>>>>>>>>>>>>>>>>> far from 3.1.1, and I assume everyone uses Maven 3 in a quite new version :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - The version of a maven dependency [1] should go in the
>>>>>>>>>>>>>>>>>> Dependency Management of the root pom [2]; the module pom should not
>>>>>>>>>>>>>>>>>> set the version. Same for [3] and the dependencies that follow it.
>>>>>>>>>>>>>>>>>> - Set test dependencies' scope to test, at [4] and the ones that
>>>>>>>>>>>>>>>>>> follow it.
>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is "Daytime Protocol" RFC 867
>>>>>>>>>>>>>>>>>> and you will need root permission to run it. The default port for kudu is
>>>>>>>>>>>>>>>>>> 7051, isn't it?
>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to load
>>>>>>>>>>>>>>>>>> the mapping from configuration as in HBase's store [7] in your KuduStore
>>>>>>>>>>>>>>>>>> [8]. This will have implications on your readMapping at [9], so take a look
>>>>>>>>>>>>>>>>>> at the one for HBase at [10]
>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java since we have the checked ones) like in
>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12]
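The last point can be sketched as follows. StoreException here is a local stand-in for Gora's GoraException, defined only so the example compiles on its own; MappingLoader and parsePort are hypothetical names and do not come from KuduMappingBuilder. The idea is that a failure inside the mapping code surfaces as a checked exception with the original cause attached, instead of escaping as a bare RuntimeException.

```java
public class MappingLoader {

  // Stand-in for GoraException so the sketch has no Gora dependency.
  public static class StoreException extends Exception {
    public StoreException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Parses a port from a mapping/configuration value. Unchecked failures
  // (NumberFormatException, NullPointerException) are wrapped in the checked type.
  public static int parsePort(String raw) throws StoreException {
    try {
      return Integer.parseInt(raw.trim());
    } catch (RuntimeException e) {
      throw new StoreException("Invalid port value: " + raw, e);
    }
  }

  public static void main(String[] args) {
    try {
      System.out.println(parsePort("7051"));
      parsePort("not-a-port");
    } catch (StoreException e) {
      System.out.println(e.getMessage() + " (cause: " + e.getCause().getClass().getSimpleName() + ")");
    }
  }
}
```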
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki [1]. This
>>>>>>>>>>>>>>>>>>> next week I think I should be focusing on the create schema operation and
>>>>>>>>>>>>>>>>>>> solving the issue of the partitioning configurations in the mapping file.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my last
>>>>>>>>>>>>>>>>>>> commits are available here [2]
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>

Re: Kudu datastore reports

Posted by Kevin Ratnasekera <dj...@gmail.com>.
Hi John,

Thank you for the clarification. If the current implementation has no
issues with the build in those environments, I think there is *no* need to
invest time in other approaches, e.g. Docker. Simply run the build and test
whether the tests pass on those operating systems.

Regards
Kevin

On Wed, Aug 7, 2019 at 3:09 AM John Mora <jh...@gmail.com> wrote:

> Hi Kevin.
>
> KuduTestHarness should, theoretically, detect the environment through the
> "os-maven-plugin" plugin and download the corresponding kudu binaries [1],
> and it has worked fine for me.
>
> Nonetheless, docker is a good idea. I will give a look to testcontainers
> and docker.
>
> [1]
> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>
> Regards,
> John
>
> On Mon, Aug 5, 2019 at 23:58, Kevin Ratnasekera (<
> djkevincr1989@gmail.com>) wrote:
>
> > Hi John,
> >
> > Can't we spin up a Kudu docker [1] instance for testing purposes? We have
> > used Test containers [2] for some data stores, like CouchDB. The Gora build
> > should work in both Linux and non-Linux environments, e.g. Windows. Does the
> > classifier [3] depend on the environment the build is running in?
> >
> > Kudu is based on C/C++, so to spin up a server instance we need to check an
> > approach like docker. Using such an approach allows us to keep these OS and
> > dependency related issues from coming into play in builds.
> >
> > [1] https://hub.docker.com/r/usuresearch/apache-kudu/
> > [2] https://www.testcontainers.org/
> > [3] <classifier>linux-x86_64</classifier>
> >
> > Regards
> > Kevin
> >
> > On Tue, Aug 6, 2019 at 9:56 AM John Mora <jh...@gmail.com> wrote:
> >
> > > Hi Alfonso,
> > >
> > > > Unfortunately, I have not been able to reproduce the issue. Maybe it is
> > > > related to my Java version (Oracle); I will try with OpenJDK.
> > > Some details about my development environment:
> > >
> > > os.detected.name: linux
> > > os.detected.arch: x86_64
> > > os.detected.version: 4.10
> > > os.detected.version.major: 4
> > > os.detected.version.minor: 10
> > > os.detected.release: linuxmint
> > > os.detected.release.version: 18.3
> > > os.detected.release.like.linuxmint: true
> > > os.detected.release.like.ubuntu: true
> > > os.detected.classifier: linux-x86_64
> > >
> > > Java
> > > java version "1.8.0_171"
> > > Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
> > > Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
> > >
> > > Maven
> > > Apache Maven 3.3.9
> > > Maven home: /usr/share/maven
> > > Java version: 1.8.0_171, vendor: Oracle Corporation
> > > Java home: /usr/lib/jvm/java-8-oracle/jre
> > > Default locale: en_US, platform encoding: UTF-8
> > > OS name: "linux", version: "4.10.0-38-generic", arch: "amd64", family:
> > > "unix"
> > >
> > >
> > > Best,
> > > John.
> > >
> > > On Mon, Aug 5, 2019 at 16:48, Alfonso Nishikawa (<
> > > alfonso.nishikawa@gmail.com>) wrote:
> > >
> > >> Hi,
> > >>
> > >> I am now using the following pom configuration, which I got from executing
> > >> `mvn dependency:tree`:
> > >>
> > >>     <dependency>
> > >>       <groupId>org.apache.kudu</groupId>
> > >>       <artifactId>kudu-binary</artifactId>
> > >>       <classifier>linux-x86_64</classifier>
> > >>       <version>1.9.0</version>
> > >>       <scope>test</scope>
> > >>     </dependency>
> > >>
> > >> When I execute `mvn clean package` on gora-kudu I find that it spawns the
> > >> following command:
> > >>
> > >> kudu-master
> > >> --fs_wal_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/wal
> > >> --fs_data_dirs=/tmp/mini-kudu-cluster8989984398759938222/master-0/data
> > >> --block_manager=log --webserver_interface=localhost
> > >> --ipki_ca_key_size=1024
> > >> --tsk_num_rsa_bits=512 --rpc_bind_addresses=*127.26.116.190*:39535
> > >> --webserver_interface=*127.26.116.190* --webserver_port=0
> > >> --never_fsync
> > >> --ipki_server_key_size=1024 --enable_minidumps=false --redact=none
> > >> --metrics_log_interval_ms=1000 --logtostderr --logbuflevel=-1
> > >> --log_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/logs
> > >> --server_dump_info_path=/tmp/mini-kudu-cluster8989984398759938222/master-0/data/info.pb
> > >> --server_dump_info_format=pb --rpc_server_allow_ephemeral_ports
> > >> --unlock_experimental_flags --unlock_unsafe_flags --rpc_reuseport=true
> > >> --master_addresses=*127.26.116.190*:39535,*127.26.116.189*:33913,
> > >> *127.26.116.188*:42253
> > >>
> > >>
> > >> I highlight the IP addresses because they clearly are not my computer's,
> > >> and I guess that is why the tests can't connect to the database.
> > >>
> > >> Any idea on how to solve this?
> > >>
> > >> Thank you!
> > >>
> > >>
> > >> Best Regards,
> > >>
> > >> Alfonso Nishikawa
> > >>
> > >>
> > >>
> > >> On Mon, Aug 5, 2019 at 8:39, Alfonso Nishikawa (<
> > >> alfonso.nishikawa@gmail.com>) wrote:
> > >>
> > >>> Hi, John.
> > >>>
> > >>> I get a core dump from the binary kudu server when trying to run the
> > >>> tests. I didn't find a log file, but I will search thoroughly later. Has
> > >>> this ever happened to you? Does it happen to anyone else?
> > >>>
> > >>> I am using Ubuntu 18.04
> > >>>
> > >>> Thank you!
> > >>>
> > >>> Regards,
> > >>>
> > >>> Alfonso Nishikawa
> > >>>
> > >>> On Sun, Aug 4, 2019 at 20:10, Furkan KAMACI <fu...@gmail.com> wrote:
> > >>>
> > >>>> Hi John,
> > >>>>
> > >>>> I've already made my comments on your PR. Please check them carefully
> > >>>> and ask me if you need help.
> > >>>>
> > >>>> For the documentation, I've checked what you've done. In addition,
> > >>>> I would like to encourage you to write a blog post about your Kudu
> > >>>> implementation and demonstrate an example of Kudu integration with Gora,
> > >>>> like a tutorial.
> > >>>>
> > >>>> Kind Regards,
> > >>>> Furkan KAMACI
> > >>>>
> > >>>> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com> wrote:
> > >>>>
> > >>>>> Hi all.
> > >>>>>
> > >>>>> I have updated my report in the Wiki[1].
> > >>>>>
> > >>>>> Also, I have sent a PR with my last commits for review [2]. Please
> > >>>>> give it a look if you have time.
> > >>>>>
> > >>>>> This week, I will continue working on the documentation of the kudu
> > >>>>> datastore.
> > >>>>>
> > >>>>> Please let me know if you have suggestions.
> > >>>>>
> > >>>>> [1]
> > >>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>> [2] https://github.com/apache/gora/pull/178
> > >>>>>
> > >>>>> Best,
> > >>>>> John.
> > >>>>>
> > >>>>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<carlosrmng@gmail.com>)
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hi John,
> > >>>>>>
> > >>>>>> Thanks for the update. I reviewed your code a little bit, and it is
> > >>>>>> looking good. I think that you should send a PR in order to receive
> > >>>>>> feedback from other community members.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Carlos
> > >>>>>>
> > >>>>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<jhnmora000@gmail.com>)
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi all.
> > >>>>>>>
> > >>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
> > >>>>>>> to my branch [2]. Please give it a look if you have time.
> > >>>>>>>
> > >>>>>>> This week, I will give a look to the documentation of datastores.
> > >>>>>>>
> > >>>>>>> Please let me know if you have suggestions.
> > >>>>>>>
> > >>>>>>> [1]
> > >>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> > >>>>>>>
> > >>>>>>> Cheers,
> > >>>>>>> John
> > >>>>>>>
> > >>>>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<jhnmora000@gmail.com>)
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Alfonso,
> > >>>>>>>>
> > >>>>>>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I
> > >>>>>>>> will find an alternative, since it is only a utility class.
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> John
> > >>>>>>>>
> > >>>>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
> > >>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi, John.
> > >>>>>>>>>
> > >>>>>>>>> I checked out your code and it looks good :)
> > >>>>>>>>> I found that you use javafx, but that is not present in OpenJDK, so
> > >>>>>>>>> the code fails to compile there. Since we don't stick to the Oracle JVM,
> > >>>>>>>>> I would suggest changing it.
> > >>>>>>>>>
> > >>>>>>>>> Good job, keep it going :)
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>>
> > >>>>>>>>> Alfonso Nishikawa
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<
> > >>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi.
> > >>>>>>>>>>
> > >>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
> > >>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
> > >>>>>>>>>>
> > >>>>>>>>>> This week, I will give a look to the map reduce tests for
> > >>>>>>>>>> DataStores.
> > >>>>>>>>>>
> > >>>>>>>>>> Please let me know if you have suggestions.
> > >>>>>>>>>>
> > >>>>>>>>>> [1]
> > >>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> John
> > >>>>>>>>>>
> > >>>>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<
> > >>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi all
> > >>>>>>>>>>>
> > >>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
> > >>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
> > >>>>>>>>>>>
> > >>>>>>>>>>> This week, I will be working on the getPartitions and
> > >>>>>>>>>>> deleteByQuery methods and running the remaining tests in the
> > >>>>>>>>>>> DataStoreTestBase class.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Please let me know if you have suggestions.
> > >>>>>>>>>>>
> > >>>>>>>>>>> [1]
> > >>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>> John.
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<
> > >>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi Alfonso,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks so much for your time and support for this project. I
> > >>>>>>>>>>>> will work on your comments. Responses inline :)
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
> > >>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi, John.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Sorry for the delay, I am changing jobs and I have been very
> > >>>>>>>>>>>>> busy :( I will try to answer your questions :)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> *> In the Employee example there is a field called
> > >>>>>>>>>>>>> 'dateOfBirth'. I tried to map that field with the UNIXTIME_MICROS
> > >>>>>>>>>>>>> datatype of Kudu (I intuitively assumed this is a date.). However,
> > >>>>>>>>>>>>> in the java world the Employee field is a Long value and the kudu
> > >>>>>>>>>>>>> datatype is a Timestamp. So, I was wondering whether I should force
> > >>>>>>>>>>>>> the usage of the UNIXTIME_MICROS datatype for this field or just
> > >>>>>>>>>>>>> use a LONG datatype in Kudu.*
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Avro 1.8 introduced "Logical Types", so there is a "date" type
> > >>>>>>>>>>>>> with an underlying "int" [1]. It is the first time I have read
> > >>>>>>>>>>>>> about them, because they were not there until the last Avro
> > >>>>>>>>>>>>> version upgrade. I would suggest ignoring "dates" and mapping
> > >>>>>>>>>>>>> dateOfBirth as a long, since in any case, in Avro, the value is
> > >>>>>>>>>>>>> the Unix epoch. After this first approach, a design improvement
> > >>>>>>>>>>>>> would be great, though :)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> - Would it be good to have a "timestamp" type in the mapping, so
> > >>>>>>>>>>>>> that KuduStore converts between the Entity long field <-> Kudu
> > >>>>>>>>>>>>> timestamp storage?
> > >>>>>>>>>>>>> - Is there any other approach?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion
> > >>>>>>>>>>>> is the best alternative right now, because it would add more
> > >>>>>>>>>>>> compatible datatypes to the mapping parameters that users can use.
> > >>>>>>>>>>>> This conversion should not be difficult to implement, in my
> > >>>>>>>>>>>> opinion. Also, support for the new Avro Date datatype could be
> > >>>>>>>>>>>> implemented in a later version, because it would need further
> > >>>>>>>>>>>> analysis in other datastores too. I will work on that.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> *> What is Gora's policy regarding flush()? *
> > >>>>>>>>>>>>> *> KuduClient has multiple flushing modes
> > >>>>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html>
> > >>>>>>>>>>>>> and also can set time interval
> > >>>>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
> > >>>>>>>>>>>>> for automatic flush.*
> > >>>>>>>>>>>>> *> Should these behaviors be configurable using the
> > >>>>>>>>>>>>> gora.properties file, or should I just use the default
> > >>>>>>>>>>>>> configurations?*
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> What we do in HBase is configure an autoflush option in
> > >>>>>>>>>>>>> gora.properties [2], which is used when instantiating the Table;
> > >>>>>>>>>>>>> at the same time, we implement the flush() method to force the
> > >>>>>>>>>>>>> flush [3]. I would suggest following that example, but adding the
> > >>>>>>>>>>>>> flushing options of Kudu. What flushing mode (and time interval,
> > >>>>>>>>>>>>> if it applies) do you suggest?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do
> > >>>>>>>>>>>> the job for most use cases. But I will add a configuration in
> > >>>>>>>>>>>> gora.properties for selecting the other modes and specifying an
> > >>>>>>>>>>>> autoflush interval if the user needs it.
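
To make the idea concrete, here is a minimal sketch of reading such settings from a properties object, falling back to auto flush sync when nothing is configured. The property keys and the class name are hypothetical; the enum is a local stand-in that only mirrors the value names of Kudu's SessionConfiguration.FlushMode, so that the example does not need the Kudu client on the classpath.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical reader for Kudu flush settings in gora.properties. */
final class FlushSettingsSketch {

  /** Local stand-in mirroring the names of Kudu's SessionConfiguration.FlushMode values. */
  enum FlushMode { AUTO_FLUSH_SYNC, AUTO_FLUSH_BACKGROUND, MANUAL_FLUSH }

  // Assumed key names, chosen only for this illustration.
  static final String FLUSH_MODE_KEY = "gora.datastore.kudu.flushMode";
  static final String FLUSH_INTERVAL_KEY = "gora.datastore.kudu.flushInterval";

  /** Returns the configured flush mode, or AUTO_FLUSH_SYNC when the key is absent or blank. */
  static FlushMode flushMode(Properties properties) {
    String raw = properties.getProperty(FLUSH_MODE_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return FlushMode.AUTO_FLUSH_SYNC;
    }
    return FlushMode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }

  /** Returns the flush interval in milliseconds, or -1 when the key is absent or blank. */
  static int flushIntervalMillis(Properties properties) {
    String raw = properties.getProperty(FLUSH_INTERVAL_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return -1;
    }
    return Integer.parseInt(raw.trim());
  }

  public static void main(String[] args) {
    Properties properties = new Properties();
    properties.setProperty(FLUSH_MODE_KEY, "manual_flush");
    properties.setProperty(FLUSH_INTERVAL_KEY, "2000");
    System.out.println(flushMode(properties) + " / " + flushIntervalMillis(properties));
  }
}
```

A store could then pass the parsed values to the session's flush-mode and flush-interval setters, and still expose flush() for callers who want to force a flush.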
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed
> > >>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the
> > >>>>>>>>>>>>> expected behavior of this method? Should I use the partition
> > >>>>>>>>>>>>> definition in the xml mapping file for this?*
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache
> > >>>>>>>>>>>>> Gora integrates with Hadoop by implementing a custom Map and
> > >>>>>>>>>>>>> Reduce that allow reading/writing Entities directly.
> > >>>>>>>>>>>>> You can take a look at HBase's implementation [4], which
> > >>>>>>>>>>>>> relies on o.a.h.hbase.mapreduce.TableInputFormatBase [5] to
> > >>>>>>>>>>>>> compute the splits (start key---end key) with the location of
> > >>>>>>>>>>>>> each split, in order to create a collection of partitions [6].
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> So, if Kudu is allowed to perform computation using local kudu
> > >>>>>>>>>>>>> splits, then this method does the needed preparation to allow to
> > >>>>>>>>>>>>> "send the computation to where the data is locally".
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> In any case, you can see that:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>    - MongoDB store implementation does not implement
> > >>>>>>>>>>>>>    splitting [7]
> > >>>>>>>>>>>>>    - Cassandra store implementation does not implement
> > >>>>>>>>>>>>>    splitting [8]
> > >>>>>>>>>>>>>    - Aerospike store implementation does not implement
> > >>>>>>>>>>>>>    splitting [9]
> > >>>>>>>>>>>>>    - Accumulo store implementation *does* implement
> > >>>>>>>>>>>>>    splitting [10]
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> If Kudu has a method to get the different splits for a table
> > >>>>>>>>>>>>> and its locations, then you will be able to implement the full
> > >>>>>>>>>>>>> feature.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
> > >>>>>>>>>>>>> elaborated much, so if you find you need more information let me
> > >>>>>>>>>>>>> know :)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> I will check whether Kudu has these features in order to
> > >>>>>>>>>>>> implement this method. If not, I will use the default
> > >>>>>>>>>>>> implementation found in other backends.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> About Queries, what I can tell is that HBase only implements
> > >>>>>>>>>>>>> "Start key" + "End key" because it has only 2 operations: "get"
> > >>>>>>>>>>>>> and "scan", and the querying is for the "scan" operation, where
> > >>>>>>>>>>>>> you want an interval (or all) of the rows. Does Kudu have more
> > >>>>>>>>>>>>> querying functionality?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> Yes, Kudu implements a Scanner for querying data, along with
> > >>>>>>>>>>>> conditional predicates for filtering. I am using those classes.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> On another topic, I am trying to install Kudu in standalone
> > >>>>>>>>>>>>> mode (all in 1 node). Do you use a Cloudera installation or do
> > >>>>>>>>>>>>> you have a standalone installation? How do you do it? I found
> > >>>>>>>>>>>>> some instructions, but they talk about compiling Kudu [11]. I was
> > >>>>>>>>>>>>> looking for something like HBase, where it is unzip + execute
> > >>>>>>>>>>>>> "hbase start".
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> I am using an embedded mini-cluster which comes with compiled
> > >>>>>>>>>>>> binaries and can be used with maven[1] for testing my code. Once
> > >>>>>>>>>>>> it is mature enough, I think I will test the datastore with a
> > >>>>>>>>>>>> docker container [2]. I could not find an unzip+execute bundle
> > >>>>>>>>>>>> either, and I am kind of a noob at compiling it myself.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1]
> > >>>>>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
> > >>>>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Good job and thank you!! :)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Alfonso Nishikawa
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> [1] - https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
> > >>>>>>>>>>>>> [2] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
> > >>>>>>>>>>>>> [3] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
> > >>>>>>>>>>>>> [4] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
> > >>>>>>>>>>>>> [5] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
> > >>>>>>>>>>>>> [6] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
> > >>>>>>>>>>>>> [7] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
> > >>>>>>>>>>>>> [8] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
> > >>>>>>>>>>>>> [9] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
> > >>>>>>>>>>>>> [10] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
> > >>>>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<
> > >>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi all.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> As every week, I updated my report in the Wiki[1]. Also, I
> > >>>>>>>>>>>>>> pushed my last commits to my branch [2]. Please give it a look if
> > >>>>>>>>>>>>>> you have time.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> This week, I will continue working on the Queries
> > >>>>>>>>>>>>>> implementation; please reach out to me if you have any
> > >>>>>>>>>>>>>> suggestions.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Also, while reviewing the datastore interface I noticed this
> > >>>>>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected
> > >>>>>>>>>>>>>> behavior of this method? Should I use the partition definition in
> > >>>>>>>>>>>>>> the xml mapping file for this?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>> John.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
> > >>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Hi all.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I received my first evaluation from the Google Summer of
> > >>>>>>>>>>>>>>> Code program with a positive result. Thanks so much for your
> > >>>>>>>>>>>>>>> support and confidence in the project and me.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I
> > >>>>>>>>>>>>>>> pushed my last commits to my branch [2].
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> This week, I will be reviewing the serialization/
> > >>>>>>>>>>>>>>> deserialization process in order to identify optimizations
> > >>>>>>>>>>>>>>> specific to Kudu, because I used generic methods from other
> > >>>>>>>>>>>>>>> backends which could probably be better tuned for Kudu. Also, I
> > >>>>>>>>>>>>>>> will start working on the Queries implementation.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> BTW, I added a question to the wiki about Date types. Please
> > >>>>>>>>>>>>>>> give it a look if you have time.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>> John
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
> > >>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Hi Carlos.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday. :D
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>> John.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
> > >>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Hi John
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on June
> > >>>>>>>>>>>>>>>>> 28th. Please make sure you submit your Mentors' evaluation on
> > >>>>>>>>>>>>>>>>> time.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>> Carlos
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
> > >>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Hi all.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1].
> > >>>>>>>>>>>>>>>>>> Also, I pushed my last commits to my branch [2].
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
> > >>>>>>>>>>>>>>>>>> datastores deal with flush(): should it always be executed
> > >>>>>>>>>>>>>>>>>> manually?
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Finally, this week I will be implementing object
> > >>>>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete
> > >>>>>>>>>>>>>>>>>> and exists. Do you have any suggestions on how to proceed with
> > >>>>>>>>>>>>>>>>>> this task?
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
> > >>>>>>>>>>>>>>>>>> problem.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>>>>> John
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
> > >>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Hi John
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But, I
> > >>>>>>>>>>>>>>>>>>> noticed that you have created an Enumeration for datatypes,
> > >>>>>>>>>>>>>>>>>>> which is very similar to the kudu-client's [2]. Probably you
> > >>>>>>>>>>>>>>>>>>> should replace [1] with [2] in order to avoid code duplication.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
> > >>>>>>>>>>>>>>>>>>> [2] https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>> Carlos
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
> > >>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Hi all.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I
> > >>>>>>>>>>>>>>>>>>>> noticed that my code is lacking some javadoc documentation, so
> > >>>>>>>>>>>>>>>>>>>> I think I will be working on that this week. I would also like
> > >>>>>>>>>>>>>>>>>>>> to enable and check the schema management tests (createSchema,
> > >>>>>>>>>>>>>>>>>>>> existsSchema, etc.).
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>>>>>>> John.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
> > >>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Hi Alfonso.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on your
> > >>>>>>>>>>>>>>>>>>>>> comments.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>> John
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa (<
> > >>>>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Hi, John.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations on
> > >>>>>>>>>>>>>>>>>>>>>>    the mapping file.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
> > >>>>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+. Is it a problem
> > >>>>>>>>>>>>>>>>>>>>>>    for Apache Gora?
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with
> > >>>>>>>>>>>>>>>>>>>>>> 3.6.0, well past 3.1.1, and I assume everyone uses a fairly
> > >>>>>>>>>>>>>>>>>>>>>> recent version of Maven 3 :)
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> [1] -
> > >>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa
> > >>>>>>>>>>>>>>>>>>>>>> (<al...@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Hi, John.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Thank you!
> > >>>>>>>>>>>>>>>>>>>>>>> Things I have seen:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> - The version of a Maven dependency [1] should go in
> > >>>>>>>>>>>>>>>>>>>>>>> the Dependency Management section of the root pom [2]. The
> > >>>>>>>>>>>>>>>>>>>>>>> same applies from [3] onward: the version should not be set
> > >>>>>>>>>>>>>>>>>>>>>>> there.
> > >>>>>>>>>>>>>>>>>>>>>>> - Set the scope of test dependencies to test, from [4]
> > >>>>>>>>>>>>>>>>>>>>>>> onward.
> > >>>>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
> > >>>>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
> > >>>>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is "Daytime Protocol" RFC
> > >>>>>>>>>>>>>>>>>>>>>>> 867 and you will need root permission to run it. The default
> > >>>>>>>>>>>>>>>>>>>>>>> port for kudu is 7051, isn't it?
> > >>>>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to
> > >>>>>>>>>>>>>>>>>>>>>>> load the mapping from configuration as in HBase's store [7]
> > >>>>>>>>>>>>>>>>>>>>>>> in your KuduStore [8]. This will have implications on your
> > >>>>>>>>>>>>>>>>>>>>>>> readMapping at [9], so take a look at the one for HBase at
> > >>>>>>>>>>>>>>>>>>>>>>> [10]
> > >>>>>>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
> > >>>>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java since we have the checked
> > >>>>>>>>>>>>>>>>>>>>>>> ones) like in [11]. You can wrap them in GoraException. An
> > >>>>>>>>>>>>>>>>>>>>>>> example is [12]
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> And nothing more :)
> > >>>>>>>>>>>>>>>>>>>>>>> Keep going, good job.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> [1] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
> > >>>>>>>>>>>>>>>>>>>>>>> [2] - https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
> > >>>>>>>>>>>>>>>>>>>>>>> [3] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
> > >>>>>>>>>>>>>>>>>>>>>>> [4] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
> > >>>>>>>>>>>>>>>>>>>>>>> [5] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
> > >>>>>>>>>>>>>>>>>>>>>>> [6] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
> > >>>>>>>>>>>>>>>>>>>>>>> [7] - https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
> > >>>>>>>>>>>>>>>>>>>>>>> [8] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
> > >>>>>>>>>>>>>>>>>>>>>>> [9] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
> > >>>>>>>>>>>>>>>>>>>>>>> [10] - https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
> > >>>>>>>>>>>>>>>>>>>>>>> [11] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
> > >>>>>>>>>>>>>>>>>>>>>>> [12] - https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
> > >>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Hi all.
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki [1].
> > >>>>>>>>>>>>>>>>>>>>>>>> This next week I think I should be focusing on the create
> > >>>>>>>>>>>>>>>>>>>>>>>> schema operation and solving the issue of the partitioning
> > >>>>>>>>>>>>>>>>>>>>>>>> configurations in the mapping file.
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my last
> > >>>>>>>>>>>>>>>>>>>>>>>> commits are available here [2]
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> > >>>>>>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>>>>> John
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> >
>

Re: Kudu datastore reports

Posted by John Mora <jh...@gmail.com>.
Hi Kevin.

KuduTestHarness should, in theory, detect the environment through the
"os-maven-plugin" plugin and download the corresponding Kudu binaries [1],
and it has worked fine for me.
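
As a rough illustration of that detection step (not the plugin's actual code), the classifier appears to be a normalized OS name joined to a normalized architecture, for example linux-x86_64. The class name, method names and the simplified normalization rules below are assumptions made for this sketch.

```java
import java.util.Locale;

/** Illustrative approximation of how an OS classifier such as "linux-x86_64" can be derived. */
final class OsClassifierSketch {

  /** Maps an os.name value to a short family name (simplified, assumed rules). */
  static String normalizeOs(String osName) {
    String value = osName.toLowerCase(Locale.ROOT);
    if (value.startsWith("linux")) {
      return "linux";
    }
    if (value.startsWith("mac") || value.startsWith("osx")) {
      return "osx";
    }
    if (value.startsWith("windows")) {
      return "windows";
    }
    return "unknown";
  }

  /** Maps an os.arch value to a canonical architecture name (simplified, assumed rules). */
  static String normalizeArch(String osArch) {
    String value = osArch.toLowerCase(Locale.ROOT);
    if (value.equals("amd64") || value.equals("x86_64") || value.equals("x64")) {
      return "x86_64";
    }
    return "unknown";
  }

  /** Joins both parts the way the classifier is written in the pom, e.g. "linux-x86_64". */
  static String classifier(String osName, String osArch) {
    return normalizeOs(osName) + "-" + normalizeArch(osArch);
  }

  public static void main(String[] args) {
    // Prints the classifier computed for the machine running this sketch.
    System.out.println(classifier(System.getProperty("os.name"), System.getProperty("os.arch")));
  }
}
```

On a platform for which no matching kudu-binary artifact is published, dependency resolution would presumably fail, which is one reason the Docker route is worth exploring.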

Nonetheless, Docker is a good idea. I will take a look at Testcontainers
and Docker.

[1]
https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing

Regards,
John

On Mon, Aug 5, 2019 at 23:58, Kevin Ratnasekera (<
djkevincr1989@gmail.com>) wrote:

> Hi John,
>
> Can't we spin up a Kudu Docker [1] instance for testing purposes? We have
> used Testcontainers [2] for some data stores like CouchDB. The Gora build
> should work in both Linux and non-Linux environments, e.g. Windows. Does the
> classifier [3] depend on the environment the build is running in?
>
> Kudu is based on C/C++, so to spin up a server instance we need to consider
> an approach like Docker; such an approach allows us to keep these OS and
> dependency related issues out of the builds.
>
> [1] https://hub.docker.com/r/usuresearch/apache-kudu/
> [2] https://www.testcontainers.org/
> [3] <classifier>linux-x86_64</classifier>
>
> Regards
> Kevin
>
> On Tue, Aug 6, 2019 at 9:56 AM John Mora <jh...@gmail.com> wrote:
>
> > Hi Alfonso,
> >
> > Unfortunately, I have not been able to reproduce the issue. Maybe it is
> > related to my Java version (Oracle); I will try with OpenJDK.
> > Some details about my development environment:
> >
> > os.detected.name: linux
> > os.detected.arch: x86_64
> > os.detected.version: 4.10
> > os.detected.version.major: 4
> > os.detected.version.minor: 10
> > os.detected.release: linuxmint
> > os.detected.release.version: 18.3
> > os.detected.release.like.linuxmint: true
> > os.detected.release.like.ubuntu: true
> > os.detected.classifier: linux-x86_64
> >
> > Java
> > java version "1.8.0_171"
> > Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
> > Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
> >
> > Maven
> > Apache Maven 3.3.9
> > Maven home: /usr/share/maven
> > Java version: 1.8.0_171, vendor: Oracle Corporation
> > Java home: /usr/lib/jvm/java-8-oracle/jre
> > Default locale: en_US, platform encoding: UTF-8
> > OS name: "linux", version: "4.10.0-38-generic", arch: "amd64", family:
> > "unix"
> >
> >
> > Best,
> > John.
> >
> > On Mon, Aug 5, 2019 at 16:48, Alfonso Nishikawa (<
> > alfonso.nishikawa@gmail.com>) wrote:
> >
> >> Hi,
> >>
> >> I am using now the following pom configuration I got from executing `mvn
> >> dependency:tree`:
> >>
> >>     <dependency>
> >>       <groupId>org.apache.kudu</groupId>
> >>       <artifactId>kudu-binary</artifactId>
> >>       <classifier>linux-x86_64</classifier>
> >>       <version>1.9.0</version>
> >>       <scope>test</scope>
> >>     </dependency>
> >>
> >> When I execute `mvn clean package` on gora-kudu I find that it spawns the
> >> following command:
> >>
> >> kudu-master
> >> --fs_wal_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/wal
> >> --fs_data_dirs=/tmp/mini-kudu-cluster8989984398759938222/master-0/data
> >> --block_manager=log --webserver_interface=localhost --ipki_ca_key_size=1024
> >> --tsk_num_rsa_bits=512 --rpc_bind_addresses=*127.26.116.190*:39535
> >> --webserver_interface=*127.26.116.190* --webserver_port=0 --never_fsync
> >> --ipki_server_key_size=1024 --enable_minidumps=false --redact=none
> >> --metrics_log_interval_ms=1000 --logtostderr --logbuflevel=-1
> >> --log_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/logs
> >> --server_dump_info_path=/tmp/mini-kudu-cluster8989984398759938222/master-0/data/info.pb
> >> --server_dump_info_format=pb --rpc_server_allow_ephemeral_ports
> >> --unlock_experimental_flags --unlock_unsafe_flags --rpc_reuseport=true
> >> --master_addresses=*127.26.116.190*:39535,*127.26.116.189*:33913,*127.26.116.188*:42253
> >>
> >>
> >> I highlight the IP addresses because they clearly are not my computer,
> >> and I guess that is why the tests can't connect to the database.
> >>
> >> Any idea on how to solve this?
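
One thing that may be worth checking here: every address in the 127.0.0.0/8 block is a loopback address, so the highlighted IPs would normally refer to the local machine rather than to another host. The small check below uses only the JDK; the class name is hypothetical, and whether the mini cluster can actually bind to those addresses still depends on the operating system's network configuration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Checks whether a literal IP address belongs to the loopback range. */
final class LoopbackCheck {

  /** Returns true when the given literal IP address is a loopback address. */
  static boolean isLoopback(String literalIp) {
    try {
      // For a literal IP address, getByName only validates the format; no DNS lookup is made.
      return InetAddress.getByName(literalIp).isLoopbackAddress();
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException("Not a valid IP address: " + literalIp, e);
    }
  }

  public static void main(String[] args) {
    String[] addresses = {"127.26.116.190", "127.26.116.189", "127.26.116.188"};
    for (String address : addresses) {
      System.out.println(address + " is loopback: " + isLoopback(address));
    }
  }
}
```

If the addresses are indeed local, the connection failure probably has another cause, such as the core dump mentioned earlier in the thread.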
> >>
> >> Thank you!
> >>
> >>
> >> Best Regards,
> >>
> >> Alfonso Nishikawa
> >>
> >>
> >>
> >> On Mon, Aug 5, 2019 at 8:39, Alfonso Nishikawa (<
> >> alfonso.nishikawa@gmail.com>) wrote:
> >>
> >>> Hi, John.
> >>>
> >>> I get a core dump from the binary kudu server when trying to run the
> >>> tests. I didn't find a log file, but I will search thoroughly later.
> >>> Has this ever happened to you? Does it happen to anyone else?
> >>>
> >>> I am using Ubuntu 18.04
> >>>
> >>> Thank you!
> >>>
> >>> Regards,
> >>>
> >>> Alfonso Nishikawa
> >>>
> >>> On Sun, Aug 4, 2019 at 20:10, Furkan KAMACI <fu...@gmail.com>
> >>> wrote:
> >>>
> >>>> Hi John,
> >>>>
> >>>> I've already made my comments at your PR. Please check them carefully
> >>>> and ask me if you need help.
> >>>>
> >>>> For the documentation, I've checked what you've done. On the other
> >>>> hand, I would like to encourage you to write a blog post about your
> >>>> Kudu implementation and demonstrate an example of Kudu integration
> >>>> with Gora, like a tutorial.
> >>>>
> >>>> Kind Regards,
> >>>> Furkan KAMACI
> >>>>
> >>>> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com> wrote:
> >>>>
> >>>>> Hi all.
> >>>>>
> >>>>> I have updated my report in the Wiki[1].
> >>>>>
> >>>>> Also, I have sent a PR with my last commits for review [2]. Please
> >>>>> give it a look if you have time.
> >>>>>
> >>>>> This week, I will continue working on the documentation of the kudu
> >>>>> datastore.
> >>>>>
> >>>>> Please let me know if you have suggestions.
> >>>>>
> >>>>> [1]
> >>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>> [2] https://github.com/apache/gora/pull/178
> >>>>>
> >>>>> Best,
> >>>>> John.
> >>>>>
> >>>>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<carlosrmng@gmail.com>)
> >>>>> wrote:
> >>>>>
> >>>>>> Hi John,
> >>>>>>
> >>>>>> Thanks for the update. I reviewed your code a little bit, it is
> >>>>>> looking good. I think that you should send a PR in order to receive
> >>>>>> feedback from other community members.
> >>>>>>
> >>>>>> Best,
> >>>>>> Carlos
> >>>>>>
> >>>>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<jhnmora000@gmail.com>)
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi all.
> >>>>>>>
> >>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
> >>>>>>> to my branch [2]. Please give it a look if you have time.
> >>>>>>>
> >>>>>>> This week, I will take a look at the documentation of datastores.
> >>>>>>>
> >>>>>>> Please let me know if you have suggestions.
> >>>>>>>
> >>>>>>> [1]
> >>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> John
> >>>>>>>
> >>>>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<jhnmora000@gmail.com>)
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Alfonso,
> >>>>>>>>
> >>>>>>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I
> >>>>>>>> will find an alternative, as it is only a utility class.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> John
> >>>>>>>>
> >>>>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
> >>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
> >>>>>>>>
> >>>>>>>>> Hi, John.
> >>>>>>>>>
> >>>>>>>>> I checked out your code and it looks good :)
> >>>>>>>>> I found that you use javafx, but that is not present in OpenJDK
> >>>>>>>>> and fails to compile, and since we don't stick to the Oracle JVM I
> >>>>>>>>> would suggest changing it.
> >>>>>>>>>
> >>>>>>>>> Good job, keep it going :)
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> Alfonso Nishikawa
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<
> >>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi.
> >>>>>>>>>>
> >>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
> >>>>>>>>>> commits to my branch [2]. Please give it a look if you have
> time.
> >>>>>>>>>>
> >>>>>>>>>> This week, I will take a look at the map reduce tests for
> >>>>>>>>>> DataStores.
> >>>>>>>>>>
> >>>>>>>>>> Please let me know if you have suggestions.
> >>>>>>>>>>
> >>>>>>>>>> [1]
> >>>>>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> John
> >>>>>>>>>>
> >>>>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<
> >>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi all
> >>>>>>>>>>>
> >>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
> >>>>>>>>>>> commits to my branch [2]. Please give it a look if you have
> time.
> >>>>>>>>>>>
> >>>>>>>>>>> This week, I will be working on the getPartitions and
> >>>>>>>>>>> deleteByQuery methods and running the other tests in the
> >>>>>>>>>>> DataStoreTestBase class.
> >>>>>>>>>>>
> >>>>>>>>>>> Please let me know if you have suggestions.
> >>>>>>>>>>>
> >>>>>>>>>>> [1]
> >>>>>>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> John.
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<
> >>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Alfonso,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks so much for your time and support for this project. I
> >>>>>>>>>>>> will work on your comments. Responses inline :)
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
> >>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi, John.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Sorry for the delay, I am changing work and I have been very
> >>>>>>>>>>>>> busy :( I will try to answer your questions :)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> *> In the Employee example there is a field called
> >>>>>>>>>>>>> 'dateOfBirth'. I tried to map that field to the UNIXTIME_MICROS
> >>>>>>>>>>>>> datatype of Kudu (I intuitively assumed this is a date). However,
> >>>>>>>>>>>>> in the Java world the Employee field is a Long value and the Kudu
> >>>>>>>>>>>>> datatype is a Timestamp. So, I was wondering whether I should
> >>>>>>>>>>>>> force the usage of the UNIXTIME_MICROS datatype for this field or
> >>>>>>>>>>>>> just use a LONG datatype in Kudu.*
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> "Logical Types" were introduced in Avro 1.8, so there is a
> >>>>>>>>>>>>> "date" type with an underlying "int" [1]. It's the first time I
> >>>>>>>>>>>>> have read about it, because it wasn't there until the last Avro
> >>>>>>>>>>>>> version upgrade. I would suggest ignoring "dates" and mapping
> >>>>>>>>>>>>> dateOfBirth as long, since in any case (in Avro) the value is
> >>>>>>>>>>>>> relative to the Unix epoch. After this first approach, a design
> >>>>>>>>>>>>> improvement would be great, though :)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> - Would it be good to have a "timestamp" type in the mapping, so
> >>>>>>>>>>>>> that KuduStore converts between the Entity long field <-> Kudu
> >>>>>>>>>>>>> timestamp storage?
> >>>>>>>>>>>>> - Is there any other approach?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion
> >>>>>>>>>>>> is the best alternative right now, because I would add more
> >>>>>>>>>>>> compatible datatypes to the mapping parameters for users to use.
> >>>>>>>>>>>> This conversion should not be difficult to implement, in my
> >>>>>>>>>>>> opinion. Also, the new Date datatype of Avro could be implemented
> >>>>>>>>>>>> in newer versions, because it would need further analysis in
> >>>>>>>>>>>> other datastores too. I will work on that.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> *> What is Gora's policy regarding flush()?*
> >>>>>>>>>>>>> *> KuduClient has multiple flushing modes
> >>>>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html>
> >>>>>>>>>>>>> and can also set a time interval
> >>>>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
> >>>>>>>>>>>>> for automatic flush.*
> >>>>>>>>>>>>> *> Should these behaviors be configurable using the
> >>>>>>>>>>>>> gora.properties file, or should we just use the default
> >>>>>>>>>>>>> configuration?*
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What we do in HBase is configure an autoflush option in
> >>>>>>>>>>>>> gora.properties [2], which is used when instantiating the Table,
> >>>>>>>>>>>>> but at the same time we implement the flush() method to force
> >>>>>>>>>>>>> the flush [3]. I would suggest following that example, but adding
> >>>>>>>>>>>>> the flushing options of Kudu. What flushing mode (and time
> >>>>>>>>>>>>> interval, if it applies) do you suggest?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do
> >>>>>>>>>>>> the job for most use cases. But I will add a configuration in
> >>>>>>>>>>>> gora.properties for selecting the other modes and specifying an
> >>>>>>>>>>>> autoflush interval if the user needs it.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed
> >>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the
> >>>>>>>>>>>>> expected behavior of this method? Should I use the partition
> >>>>>>>>>>>>> definition in the XML mapping file for this?*
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache
> >>>>>>>>>>>>> Gora integrates with Hadoop by implementing a custom Map and
> >>>>>>>>>>>>> Reduce that allow getting/writing Entities directly.
> >>>>>>>>>>>>> You can take a look at HBase's implementation [4], which
> >>>>>>>>>>>>> relies on o.a.h.hbase.mapreduce.TableInputFormatBase [5] to
> >>>>>>>>>>>>> compute the splits (start key---end key) with the location of
> >>>>>>>>>>>>> each split to create a collection of partitions [6].
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So, if Kudu allows performing computation using local Kudu
> >>>>>>>>>>>>> splits, then this method does the preparation needed to "send
> >>>>>>>>>>>>> the computation to where the data is locally".
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In any case, you can see that:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>    - MongoDB store implementation does not implement
> >>>>>>>>>>>>>    splitting [7]
> >>>>>>>>>>>>>    - Cassandra store implementation does not implement
> >>>>>>>>>>>>>    splitting [8]
> >>>>>>>>>>>>>    - Aerospike store implementation does not implement
> >>>>>>>>>>>>>    splitting [9]
> >>>>>>>>>>>>>    - Accumulo store implementation* does* implement splitting
> >>>>>>>>>>>>>    [10]
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If Kudu has a method to get the different splits of a table
> >>>>>>>>>>>>> and their locations, then you will be able to implement the full
> >>>>>>>>>>>>> feature.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
> >>>>>>>>>>>>> elaborated much, so if you find you need more information let me
> >>>>>>>>>>>>> know :)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>> I will check whether Kudu has these features in order to
> >>>>>>>>>>>> implement this method. If not, I will use the default
> >>>>>>>>>>>> implementation found in other backends.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> About Queries, what I can tell is that HBase only implements
> >>>>>>>>>>>>> "Start key" + "End key" because it has only 2 operations: "get"
> >>>>>>>>>>>>> and "scan", and the querying is for the "scan" operation, where
> >>>>>>>>>>>>> you want an interval (or all) of the rows. Does Kudu have more
> >>>>>>>>>>>>> querying functionality?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>> Yes, Kudu implements a Scanner for querying data along with
> >>>>>>>>>>>> conditional predicates for filtering. I am using those classes.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On another topic, I am trying to install Kudu in standalone
> >>>>>>>>>>>>> mode (all in 1 node). Do you use a Cloudera installation or do
> >>>>>>>>>>>>> you have a standalone installation? How do you do it? I found
> >>>>>>>>>>>>> some instructions, but they talk about compiling Kudu [11]. I was
> >>>>>>>>>>>>> looking for something like HBase, where you just unzip and
> >>>>>>>>>>>>> execute "hbase start".
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>> I am using an embedded mini-cluster which comes with compiled
> >>>>>>>>>>>> binaries and can be used with Maven [1] for testing my code. Once
> >>>>>>>>>>>> it is mature enough, I think I will test the datastore with a
> >>>>>>>>>>>> Docker container [2]. I could not find an unzip+execute bundle
> >>>>>>>>>>>> either, and I am kind of a noob at compiling it myself.
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1]
> >>>>>>>>>>>>
> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
> >>>>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Good job and thank you!! :)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Alfonso Nishikawa
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [1] -
> >>>>>>>>>>>>> https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
> >>>>>>>>>>>>> [2] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
> >>>>>>>>>>>>> [3] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
> >>>>>>>>>>>>> [4] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
> >>>>>>>>>>>>> [5] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
> >>>>>>>>>>>>> [6] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
> >>>>>>>>>>>>> [7] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
> >>>>>>>>>>>>> [8] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
> >>>>>>>>>>>>> [9] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
> >>>>>>>>>>>>> [10] -
> >>>>>>>>>>>>>
> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
> >>>>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<
> >>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi all.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> As every week I updated my report in the Wiki[1]. Also, I
> >>>>>>>>>>>>>> pushed my last commits to my branch [2]. Please give it a
> look if you have
> >>>>>>>>>>>>>> time.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This week, I will continue working on the Queries
> >>>>>>>>>>>>>> implementation; please reach out to me if you have any
> >>>>>>>>>>>>>> suggestions.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Also, while reviewing the datastore interface I noticed this
> >>>>>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected
> >>>>>>>>>>>>>> behavior of this method? Should I use the partition definition
> >>>>>>>>>>>>>> in the XML mapping file for this?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>> John.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
> >>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi all.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I received my first evaluation from the Google Summer of
> >>>>>>>>>>>>>>> Code program with a positive result. Thanks so much for
> your support and
> >>>>>>>>>>>>>>> confidence to the project and me.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I
> >>>>>>>>>>>>>>> pushed my last commits to my branch [2].
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This week, I will be reviewing the
> >>>>>>>>>>>>>>> serialization/deserialization process in order to identify
> >>>>>>>>>>>>>>> optimizations specific to Kudu, because I used generic methods
> >>>>>>>>>>>>>>> from other backends which could probably be better tuned for
> >>>>>>>>>>>>>>> Kudu. Also, I will start working on the Queries implementation.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> BTW, I added a question to the wiki about Date types.
> Please
> >>>>>>>>>>>>>>> give it a look if you have time.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>> John
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
> >>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi Carlos.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday.
> :D
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>> John.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
> >>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi John
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on June
> >>>>>>>>>>>>>>>>> 28th. Please make sure you submit your Mentors'
> evaluation on time.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>> Carlos
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
> >>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi all.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1].
> >>>>>>>>>>>>>>>>>> Also, I pushed my last commits to my branch [2].
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
> >>>>>>>>>>>>>>>>>> datastores deal with flush(): should it always be executed
> >>>>>>>>>>>>>>>>>> manually?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Finally, this week I will be implementing object
> >>>>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete
> >>>>>>>>>>>>>>>>>> and exists. Do you have any suggestions on how to proceed with
> >>>>>>>>>>>>>>>>>> this task?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
> >>>>>>>>>>>>>>>>>> problem.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>> John
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
> >>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Hi John
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But I
> >>>>>>>>>>>>>>>>>>> noticed that you have created an Enumeration for datatypes,
> >>>>>>>>>>>>>>>>>>> which is very similar to the kudu-client's [2]. You should
> >>>>>>>>>>>>>>>>>>> probably replace [1] with [2] in order to avoid code
> >>>>>>>>>>>>>>>>>>> duplication.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
> >>>>>>>>>>>>>>>>>>> [2]
> >>>>>>>>>>>>>>>>>>>
> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>> Carlos
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
> >>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi all.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I
> >>>>>>>>>>>>>>>>>>>> noticed that my code is lacking some javadoc
> documentation I think I will
> >>>>>>>>>>>>>>>>>>>> be working on that this week, also I would like to
> enable and check schema
> >>>>>>>>>>>>>>>>>>>> management tests (createSchema, existsSchema, etc.).
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>> John.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
> >>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Hi Alfonso.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on
> your
> >>>>>>>>>>>>>>>>>>>>> comments.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>> John
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa (<
> >>>>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Hi, John.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations on
> >>>>>>>>>>>>>>>>>>>>>>    the mapping file.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
> >>>>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+. Is it
> >>>>>>>>>>>>>>>>>>>>>>    a problem for Apache Gora?
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with
> >>>>>>>>>>>>>>>>>>>>>> 3.6.0, well past 3.1.1, and I assume everyone uses
> >>>>>>>>>>>>>>>>>>>>>> a fairly recent version of Maven 3 :)
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> [1] -
> >>>>>>>>>>>>>>>>>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa
> >>>>>>>>>>>>>>>>>>>>>> (<al...@gmail.com>) wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Hi, John.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Thank you!
> >>>>>>>>>>>>>>>>>>>>>>> Things I have seen:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> - The version of a Maven dependency [1] should go in
> >>>>>>>>>>>>>>>>>>>>>>> the Dependency Management section of the root pom [2].
> >>>>>>>>>>>>>>>>>>>>>>> Same for [3] and the dependencies after it: the version
> >>>>>>>>>>>>>>>>>>>>>>> should not be set there.
> >>>>>>>>>>>>>>>>>>>>>>> - Set test dependencies' scope to test, at [4] and
> >>>>>>>>>>>>>>>>>>>>>>> the ones after it.
> >>>>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
> >>>>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
> >>>>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is the "Daytime Protocol"
> >>>>>>>>>>>>>>>>>>>>>>> (RFC 867) and you will need root permission to run it.
> >>>>>>>>>>>>>>>>>>>>>>> The default port for Kudu is 7051, isn't it?
> >>>>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to
> >>>>>>>>>>>>>>>>>>>>>>> load the mapping from configuration as in HBase's
> >>>>>>>>>>>>>>>>>>>>>>> store [7] to your KuduStore [8]. This will have
> >>>>>>>>>>>>>>>>>>>>>>> implications for your readMapping at [9], so take a look
> >>>>>>>>>>>>>>>>>>>>>>> at the one for HBase at [10]
> >>>>>>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
> >>>>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java, since we have the
> >>>>>>>>>>>>>>>>>>>>>>> checked ones) like in [11]. You can wrap them in
> >>>>>>>>>>>>>>>>>>>>>>> GoraException. An example is [12]
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> And nothing more :)
> >>>>>>>>>>>>>>>>>>>>>>> Keep going, good job.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> [1] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
> >>>>>>>>>>>>>>>>>>>>>>> [2] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
> >>>>>>>>>>>>>>>>>>>>>>> [3] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
> >>>>>>>>>>>>>>>>>>>>>>> [4] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
> >>>>>>>>>>>>>>>>>>>>>>> [5] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
> >>>>>>>>>>>>>>>>>>>>>>> [6] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
> >>>>>>>>>>>>>>>>>>>>>>> [7] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
> >>>>>>>>>>>>>>>>>>>>>>> [8] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
> >>>>>>>>>>>>>>>>>>>>>>> [9] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
> >>>>>>>>>>>>>>>>>>>>>>> [10] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
> >>>>>>>>>>>>>>>>>>>>>>> [11] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
> >>>>>>>>>>>>>>>>>>>>>>> [12] -
> >>>>>>>>>>>>>>>>>>>>>>>
> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
> >>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Hi all.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki
> [1].
> >>>>>>>>>>>>>>>>>>>>>>>> This next week I think I should be focusing on
> the create schema operation
> >>>>>>>>>>>>>>>>>>>>>>>> and solving the issue of the partitioning
> configurations in the mapping
> >>>>>>>>>>>>>>>>>>>>>>>> file.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my
> last
> >>>>>>>>>>>>>>>>>>>>>>>> commits are available here [2]
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>>>>>>
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> >>>>>>>>>>>>>>>>>>>>>>>> [2]
> >>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/tree/GORA-485
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>> John
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
>

Re: Kudu datastore reports

Posted by Kevin Ratnasekera <dj...@gmail.com>.
Hi John,

Can't we spin up a Kudu Docker [1] instance for testing purposes? We have
used Testcontainers [2] for some data stores, such as CouchDB. The Gora build
should work in both Linux and non-Linux environments, e.g. Windows. Does the
classifier [3] depend on the environment the build is running in?

Kudu is written in C/C++, so to spin up a server instance we should consider
an approach like Docker. Such an approach lets us keep OS- and
dependency-related issues out of the builds.

[1] https://hub.docker.com/r/usuresearch/apache-kudu/
[2] https://www.testcontainers.org/
[3] <classifier>linux-x86_64</classifier>
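
On the classifier question: the linux-x86_64 value in [3] matches what os-maven-plugin reports as os.detected.classifier in John's environment dump quoted in this thread (os.detected.name: linux, os.detected.arch: x86_64), so it appears to be derived from the OS name and architecture of the machine running the build. A minimal Java sketch of that idea follows. The class name and the simplified normalization rules are assumptions made for illustration and are not the plugin's actual code:

```java
import java.util.Locale;

// Hypothetical helper, not part of Gora, Kudu or os-maven-plugin.
// It mimics, in a simplified way, how an "<os>-<arch>" classifier such as
// linux-x86_64 can be derived from the JVM's os.name and os.arch properties.
class OsClassifierSketch {

  // Simplified OS normalization; a real plugin handles many more cases.
  static String normalizeOs(String osName) {
    String v = osName.toLowerCase(Locale.ROOT);
    if (v.startsWith("linux")) return "linux";
    if (v.startsWith("mac") || v.startsWith("osx")) return "osx";
    if (v.startsWith("windows")) return "windows";
    return "unknown";
  }

  // Simplified architecture normalization (assumed mapping).
  static String normalizeArch(String osArch) {
    String v = osArch.toLowerCase(Locale.ROOT);
    if (v.equals("amd64") || v.equals("x86_64") || v.equals("x64")) return "x86_64";
    if (v.equals("aarch64") || v.equals("arm64")) return "aarch_64";
    return "unknown";
  }

  // Joins both normalized parts into the classifier string.
  static String classifier(String osName, String osArch) {
    return normalizeOs(osName) + "-" + normalizeArch(osArch);
  }

  public static void main(String[] args) {
    // Prints the classifier for the machine this code runs on.
    System.out.println(
        classifier(System.getProperty("os.name"), System.getProperty("os.arch")));
  }
}
```

Under these assumptions a hard-coded linux-x86_64 classifier would only match Linux builds, which is the concern raised above for Windows.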

Regards
Kevin

On Tue, Aug 6, 2019 at 9:56 AM John Mora <jh...@gmail.com> wrote:

> Hi Alfonso,
>
> Unfortunately, I have not been able to reproduce the issue. Maybe it is
> related to my Java version (Oracle); I will try with OpenJDK.
> Some details about my development environment:
>
> os.detected.name: linux
> os.detected.arch: x86_64
> os.detected.version: 4.10
> os.detected.version.major: 4
> os.detected.version.minor: 10
> os.detected.release: linuxmint
> os.detected.release.version: 18.3
> os.detected.release.like.linuxmint: true
> os.detected.release.like.ubuntu: true
> os.detected.classifier: linux-x86_64
>
> Java
> java version "1.8.0_171"
> Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
> Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
>
> Maven
> Apache Maven 3.3.9
> Maven home: /usr/share/maven
> Java version: 1.8.0_171, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/java-8-oracle/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "4.10.0-38-generic", arch: "amd64", family:
> "unix"
>
>
> Best,
> John.
>
> On Mon, Aug 5, 2019 at 16:48, Alfonso Nishikawa (<
> alfonso.nishikawa@gmail.com>) wrote:
>
>> Hi,
>>
>> I am now using the following pom configuration, which I got from running
>> `mvn dependency:tree`:
>>
>>     <dependency>
>>       <groupId>org.apache.kudu</groupId>
>>       <artifactId>kudu-binary</artifactId>
>>       <classifier>linux-x86_64</classifier>
>>       <version>1.9.0</version>
>>       <scope>test</scope>
>>     </dependency>
>>
>> When I execute `mvn clean package` on gora-kudu I find that it spawns the
>> following command:
>>
>> kudu-master
>> --fs_wal_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/wal
>> --fs_data_dirs=/tmp/mini-kudu-cluster8989984398759938222/master-0/data
>> --block_manager=log --webserver_interface=localhost --ipki_ca_key_size=1024
>> --tsk_num_rsa_bits=512 --rpc_bind_addresses=*127.26.116.190*:39535
>> --webserver_interface=*127.26.116.190* --webserver_port=0 --never_fsync
>> --ipki_server_key_size=1024 --enable_minidumps=false --redact=none
>> --metrics_log_interval_ms=1000 --logtostderr --logbuflevel=-1
>> --log_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/logs
>> --server_dump_info_path=/tmp/mini-kudu-cluster8989984398759938222/master-0/data/info.pb
>> --server_dump_info_format=pb --rpc_server_allow_ephemeral_ports
>> --unlock_experimental_flags --unlock_unsafe_flags --rpc_reuseport=true
>> --master_addresses=*127.26.116.190*:39535,*127.26.116.189*:33913,
>> *127.26.116.188*:42253
>>
>>
>> I highlight the IP addresses because they are clearly not my computer's,
>> and I guess that is why the tests can't connect to the database.
>>
>> Any idea on how to solve this?
>>
>> Thank you!
>>
>>
>> Best Regards,
>>
>> Alfonso Nishikawa
>>
>>
>>
>> On Mon, Aug 5, 2019 at 8:39, Alfonso Nishikawa (<
>> alfonso.nishikawa@gmail.com>) wrote:
>>
>>> Hi, John.
>>>
>>> I get a core dump from the Kudu server binary when trying to run the
>>> tests. I didn't find a log file, but will search thoroughly later. Has this
>>> ever happened to you? Does it happen to anyone else?
>>>
>>> I am using Ubuntu 18.04
>>>
>>> Thank you!
>>>
>>> Regards,
>>>
>>> Alfonso Nishikawa
>>>
>>> On Sun, Aug 4, 2019 at 20:10, Furkan KAMACI <fu...@gmail.com>
>>> wrote:
>>>
>>>> Hi John,
>>>>
>>>> I've already made my comments at your PR. Please check them carefully
>>>> and ask me if you need help.
>>>>
>>>> For the documentation, I've checked what you've done. In addition,
>>>> I would encourage you to write a blog post about your Kudu
>>>> implementation and demonstrate an example of Kudu integration with Gora
>>>> as a tutorial.
>>>>
>>>> Kind Regards,
>>>> Furkan KAMACI
>>>>
>>>> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com> wrote:
>>>>
>>>>> Hi all.
>>>>>
>>>>> I have updated my report in the Wiki[1].
>>>>>
>>>>> Also, I have sent a PR with my last commits for review [2]. Please
>>>>> give it a look if you have time.
>>>>>
>>>>> This week, I will continue working on the documentation of the kudu
>>>>> datastore.
>>>>>
>>>>> Please let me know if you have suggestions.
>>>>>
>>>>> [1]
>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>> [2] https://github.com/apache/gora/pull/178
>>>>>
>>>>> Best,
>>>>> John.
>>>>>
>>>>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<ca...@gmail.com>)
>>>>> wrote:
>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> Thanks for the update. I reviewed your code a little bit, and it is
>>>>>> looking good. I think that you should send a PR in order to receive
>>>>>> feedback from other community members.
>>>>>>
>>>>>> Best,
>>>>>> Carlos
>>>>>>
>>>>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<jh...@gmail.com>)
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all.
>>>>>>>
>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
>>>>>>> to my branch [2]. Please give it a look if you have time.
>>>>>>>
>>>>>>> This week, I will take a look at the documentation of datastores.
>>>>>>>
>>>>>>> Please let me know if you have suggestions.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>
>>>>>>> Cheers,
>>>>>>> John
>>>>>>>
>>>>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<jh...@gmail.com>)
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Alfonso,
>>>>>>>>
>>>>>>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I
>>>>>>>> will find an alternative, since it is only a utility class.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> John
>>>>>>>>
>>>>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>
>>>>>>>>> Hi, John.
>>>>>>>>>
>>>>>>>>> I checked out your code and it looks good :)
>>>>>>>>> I found that you use javafx, but it is not present in OpenJDK
>>>>>>>>> and fails to compile there. Since we don't stick to the Oracle JVM, I would
>>>>>>>>> suggest changing it.
>>>>>>>>>
>>>>>>>>> Good job, keep it going :)
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<
>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi.
>>>>>>>>>>
>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>
>>>>>>>>>> This week, I will take a look at the MapReduce tests for
>>>>>>>>>> DataStores.
>>>>>>>>>>
>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> John
>>>>>>>>>>
>>>>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<
>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all
>>>>>>>>>>>
>>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>
>>>>>>>>>>> This week, I will be working on the getPartitions and
>>>>>>>>>>> deleteByQuery methods and running the other tests in the DataStoreTestBase
>>>>>>>>>>> class.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> John.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<
>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks so much for your time and support for this project. I
>>>>>>>>>>>> will work on your comments. Responses inline :)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry for the delay, I am changing jobs and I have been very
>>>>>>>>>>>>> busy :( I will try to answer your questions :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> *> In the Employee example there is a field called
>>>>>>>>>>>>> 'dateOfBirth'. I tried to map that field with the UNIXTIME_MICROS datatype
>>>>>>>>>>>>> of Kudu (I intuitively assumed this is a date.). However, in the java world
>>>>>>>>>>>>> the Employee field is a Long value and the kudu datatype is a Timestamp.
>>>>>>>>>>>>> So, I was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>>>>>>
>>>>>>>>>>>>> "Logical Types" were introduced in Avro 1.8, so there is a
>>>>>>>>>>>>> "date" type with an underlying "int" [1]. It's the first time I have read about
>>>>>>>>>>>>> them, because they weren't there until the last version upgrade of Avro. I would
>>>>>>>>>>>>> suggest ignoring "dates" and mapping dateOfBirth as long, since in any case
>>>>>>>>>>>>> -in avro- the value is the unix epoch. After this first approach, a design
>>>>>>>>>>>>> improvement would be great, though :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Would it be good to have a "timestamp" type in the mapping, so that
>>>>>>>>>>>>> KuduStore converts between the Entity long field <-> Kudu timestamp storage?
>>>>>>>>>>>>> - Is there any other approach?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion
>>>>>>>>>>>> is the best alternative right now, because it would add more compatible
>>>>>>>>>>>> datatypes to the mapping parameters that users can use. Also, this
>>>>>>>>>>>> conversion should not be difficult to implement, in my opinion. The new
>>>>>>>>>>>> Date datatype of Avro could be implemented in newer versions, because it
>>>>>>>>>>>> would need further analysis in other datastores too. I will work on that.
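
The long <-> timestamp conversion discussed above can be sketched with the JDK alone. This is only an illustration under stated assumptions: the entity field is assumed to hold milliseconds since the Unix epoch (the thread does not say which unit the Employee field uses), Kudu's UNIXTIME_MICROS is taken as microseconds since the epoch, and the class and method names are hypothetical, not part of Gora or kudu-client.

```java
/** Hypothetical helper: converts between an epoch-millisecond long and epoch microseconds. */
final class TimestampMapping {

  private TimestampMapping() {
  }

  /** Entity long (assumed epoch milliseconds) to microseconds; throws ArithmeticException on overflow. */
  static long millisToMicros(long epochMillis) {
    return Math.multiplyExact(epochMillis, 1000L);
  }

  /** Microseconds back to milliseconds, rounding toward negative infinity so pre-1970 values stay consistent. */
  static long microsToMillis(long epochMicros) {
    return Math.floorDiv(epochMicros, 1000L);
  }

  public static void main(String[] args) {
    long dateOfBirth = 631152000000L; // assumed sample value, in epoch milliseconds
    long stored = millisToMicros(dateOfBirth);
    System.out.println(stored + " -> " + microsToMillis(stored));
  }
}
```

Sub-millisecond precision is lost when reading back into a millisecond field, which seems acceptable for a field such as dateOfBirth.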
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *> What is Gora's policy regarding flush()? *
>>>>>>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html> and
>>>>>>>>>>>>> can also set a time interval
>>>>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>>>>>>> for automatic flush.*
>>>>>>>>>>>>> *> Should these behaviors be configurable using the
>>>>>>>>>>>>> gora.properties file, or should I just use the default configurations?*
>>>>>>>>>>>>>
>>>>>>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>>>>>>> gora.properties [2], which is used when instantiating the Table, but at the same
>>>>>>>>>>>>> time we implement the flush() method to force the flush [3]. I would
>>>>>>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>>>>>>> What flushing mode (and time interval, if it applies) do you suggest?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do
>>>>>>>>>>>> the job for most use cases. But I will add a configuration in
>>>>>>>>>>>> gora.properties for selecting the other modes and specifying an autoflush
>>>>>>>>>>>> interval if the user needs it.
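
A rough sketch of how such a configuration could be read, assuming nothing about the final implementation: the property keys below are hypothetical, and the enum is a local stand-in that only mirrors the constant names of Kudu's SessionConfiguration.FlushMode so that the example runs with the JDK alone.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical sketch of reading flush settings from gora.properties. */
final class FlushSettings {

  /** Local stand-in mirroring the names in Kudu's SessionConfiguration.FlushMode. */
  enum Mode { AUTO_FLUSH_SYNC, AUTO_FLUSH_BACKGROUND, MANUAL_FLUSH }

  // Hypothetical property keys, chosen for illustration only.
  static final String MODE_KEY = "gora.datastore.kudu.flushMode";
  static final String INTERVAL_KEY = "gora.datastore.kudu.flushInterval";

  final Mode mode;
  final Integer intervalMillis; // null means "keep the client default"

  private FlushSettings(Mode mode, Integer intervalMillis) {
    this.mode = mode;
    this.intervalMillis = intervalMillis;
  }

  /** Falls back to AUTO_FLUSH_SYNC when no mode is configured. */
  static FlushSettings from(Properties props) {
    String rawMode = props.getProperty(MODE_KEY);
    Mode mode = rawMode == null
        ? Mode.AUTO_FLUSH_SYNC
        : Mode.valueOf(rawMode.trim().toUpperCase(Locale.ROOT));
    String rawInterval = props.getProperty(INTERVAL_KEY);
    Integer interval = rawInterval == null ? null : Integer.valueOf(rawInterval.trim());
    return new FlushSettings(mode, interval);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(MODE_KEY, "manual_flush");
    props.setProperty(INTERVAL_KEY, "2000");
    FlushSettings settings = from(props);
    System.out.println(settings.mode + " every " + settings.intervalMillis + " ms");
  }
}
```

In a real store the parsed values would presumably be handed to the Kudu session when it is created, with flush() still forcing a flush on demand as in the HBase store.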
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed
>>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the expected
>>>>>>>>>>>>> behavior of this method? Should I use the partition definition in the xml
>>>>>>>>>>>>> mapping file for this?*
>>>>>>>>>>>>>
>>>>>>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache
>>>>>>>>>>>>> Gora integrates with Hadoop by implementing a custom Map and Reduce that
>>>>>>>>>>>>> allow getting/writing Entities directly.
>>>>>>>>>>>>> You can take a look at HBase's implementation [4], which
>>>>>>>>>>>>> relies on o.a.h.hbase.mapreduce.TableInputFormatBase [5] to
>>>>>>>>>>>>> compute the splits (start key---end key) with the location of each split, to
>>>>>>>>>>>>> create a collection of partitions [6].
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, if Kudu allows computation to be performed on local Kudu
>>>>>>>>>>>>> splits, then this method does the preparation needed to "send the
>>>>>>>>>>>>> computation to where the data is locally".
>>>>>>>>>>>>>
>>>>>>>>>>>>> In any case, you can see that:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    - MongoDB store implementation does not implement
>>>>>>>>>>>>>    splitting [7]
>>>>>>>>>>>>>    - Cassandra store implementation does not implement
>>>>>>>>>>>>>    splitting [8]
>>>>>>>>>>>>>    - Aerospike store implementation does not implement
>>>>>>>>>>>>>    splitting [9]
>>>>>>>>>>>>>    - Accumulo store implementation *does* implement splitting
>>>>>>>>>>>>>    [10]
>>>>>>>>>>>>>
>>>>>>>>>>>>> If Kudu has a method to get the different splits for a table
>>>>>>>>>>>>> and its locations, then you will be able to implement the full feature.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
>>>>>>>>>>>>> elaborated much, so if you find you need more information let me know :)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> I will check whether Kudu has these features in order to
>>>>>>>>>>>> implement this method. If not I will use the default implementation found
>>>>>>>>>>>> in other backends.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> About Queries, what I can tell is that HBase only implements
>>>>>>>>>>>>> "Start key" + "End key" because it has only 2 operations: "get" and "scan",
>>>>>>>>>>>>> and the querying is for the "scan" operation, where you want an interval (or
>>>>>>>>>>>>> all) of the rows. Does Kudu have more querying functionality?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> Yes, Kudu implements a Scanner for querying data, along with
>>>>>>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> On another topic, I am trying to install Kudu in standalone mode
>>>>>>>>>>>>> (all in 1 node). Do you use a Cloudera installation or do you have a
>>>>>>>>>>>>> standalone installation? How do you do it? I found some instructions, but
>>>>>>>>>>>>> they talk about compiling Kudu [11]. I was looking for something like
>>>>>>>>>>>>> HBase, where it is unzip + execute "hbase start".
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> I am using an embedded mini-cluster which comes with compiled
>>>>>>>>>>>> binaries and can be used with maven[1] for testing my code. Once I get it
>>>>>>>>>>>> mature enough I think I will be testing the datastore with a docker
>>>>>>>>>>>> container [2]. I could not find a unzip+execute bundle either and I am
>>>>>>>>>>>> kinda noob for compiling it myself.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Good job and thank you!! :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>> https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<
>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As every week I updated my report in the Wiki[1]. Also, I
>>>>>>>>>>>>>> pushed my last commits to my branch [2]. Please give it a look if you have
>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, while reviewing the datastore interface I noticed this
>>>>>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>>>>>>> this method? Should I use the partition definition in the xml mapping file
>>>>>>>>>>>>>> for this?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I received my first evaluation from the Google Summer of
>>>>>>>>>>>>>>> Code program with a positive result. Thanks so much for your support and
>>>>>>>>>>>>>>> confidence in the project and in me.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I
>>>>>>>>>>>>>>> pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This week, I will be reviewing the serialization/
>>>>>>>>>>>>>>> deserialization process in order to identify optimizations specific to
>>>>>>>>>>>>>>> Kudu, because I used generic methods from other backends which probably
>>>>>>>>>>>>>>> could be better tuned for Kudu. Also, I will start working on the Queries
>>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> BTW, I added a question to the wiki about Date types. Please
>>>>>>>>>>>>>>> give it a look if you have time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday. :D
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on June
>>>>>>>>>>>>>>>>> 28th. Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1].
>>>>>>>>>>>>>>>>>> Also, I pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
>>>>>>>>>>>>>>>>>> datastores deal with flush(): should it always be executed manually?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Finally, this week I will be implementing object
>>>>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, exists. Do
>>>>>>>>>>>>>>>>>> you have any suggestions on how to proceed with this task?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
>>>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But I
>>>>>>>>>>>>>>>>>>> noticed that you have created an Enumeration for datatypes [1], which is very
>>>>>>>>>>>>>>>>>>> similar to the kudu-client's [2]. Probably you should replace [1] with [2]
>>>>>>>>>>>>>>>>>>> in order to avoid code duplication.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I
>>>>>>>>>>>>>>>>>>>> noticed that my code is lacking some javadoc documentation, so I think I will
>>>>>>>>>>>>>>>>>>>> be working on that this week. I would also like to enable and check the schema
>>>>>>>>>>>>>>>>>>>> management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on your
>>>>>>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations on
>>>>>>>>>>>>>>>>>>>>>>    the mapping file.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with
>>>>>>>>>>>>>>>>>>>>>> 3.6.0, well past 3.1.1, and I assume everyone uses a fairly recent
>>>>>>>>>>>>>>>>>>>>>> version of Maven 3 :)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>> (<al...@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - The version of a maven dependency [1] should go in
>>>>>>>>>>>>>>>>>>>>>>> the Dependency Management of the root pom [2]. The same applies to [3] and
>>>>>>>>>>>>>>>>>>>>>>> the dependencies after it; the module pom should not set the version.
>>>>>>>>>>>>>>>>>>>>>>> - Set the scope of test dependencies to test, at [4] and
>>>>>>>>>>>>>>>>>>>>>>> the ones after it.
>>>>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5].
>>>>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is the "Daytime Protocol" (RFC
>>>>>>>>>>>>>>>>>>>>>>> 867) and you will need root permission to run on it. The default port for Kudu
>>>>>>>>>>>>>>>>>>>>>>> is 7051, isn't it?
>>>>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to
>>>>>>>>>>>>>>>>>>>>>>> load the mapping from configuration as in HBase's store [7] to your
>>>>>>>>>>>>>>>>>>>>>>> KuduStore [8]. This will have implications for your readMapping at [9], so
>>>>>>>>>>>>>>>>>>>>>>> take a look at the one for HBase at [10].
>>>>>>>>>>>>>>>>>>>>>>> - I know it is done in other backends, but avoid
>>>>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java, since we have checked ones) like in
>>>>>>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12].
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki [1].
>>>>>>>>>>>>>>>>>>>>>>>> This next week I think I should be focusing on the create schema operation
>>>>>>>>>>>>>>>>>>>>>>>> and solving the issue of the partitioning configurations in the mapping
>>>>>>>>>>>>>>>>>>>>>>>> file.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my last
>>>>>>>>>>>>>>>>>>>>>>>> commits are available here [2]
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>

Re: Kudu datastore reports

Posted by Alfonso Nishikawa <al...@gmail.com>.
Hi, John.

I finally managed to pass the tests!!!
It seems it needs the packages ntp-wait and/or chronyc, so in order to fix
the error:

java.io.IOException: failed to start masters: Unable to start Master at
index 0:
/tmp/kudu-binary-jar2880948780300354108/kudu-binary-1.9.0-linux-x86_64/bin/kudu-master:
process exited on signal 6 (core dumped)

I just installed:

sudo apt-get install chrony

And the minicluster started working :)

I will take a further look at the PR soon.

Thank you!!

Regards,

Alfonso Nishikawa



On Fri, Aug 9, 2019 at 20:35, John Mora (<jh...@gmail.com>)
wrote:

> Hi.
>
> I updated my PR[2] with some improvements.
>
> Also, I tested my code with this docker image [3] and it worked fine.
>
> This is what I did:
> Run docker container:
> docker run --hostname localhost -d --rm --name apache-kudu -p 7051:7051 -p
> 7050:7050 -p 8051:8051 -p 8050:8050 usuresearch/apache-kudu:1.8.0
>
> Change master address within tests [1]:
>
> @Override
> public void setUpClass() throws Exception {
>   // KuduTestHarness start-up is commented out so the tests use the Docker container
>   //harness.before();
>   conf.set(KuduBackendConstants.PROP_MASTERADDRESSES, "localhost:7051");
> }
>
> @Override
> public void tearDownClass() throws Exception {
>   // KuduTestHarness shutdown is commented out as well
>   //harness.after();
> }
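
Instead of commenting the harness in and out, the choice between the embedded mini cluster and an external Kudu (such as the Docker container above) could be driven by a setting. The following is only a sketch with hypothetical names: the system property key and the helper class are invented for illustration, and the supplier stands in for whatever starts the harness and returns its master addresses.

```java
import java.util.function.Supplier;

/** Hypothetical helper that picks the Kudu master addresses for the tests. */
final class MasterAddresses {

  // Hypothetical system property; not an existing Gora or Kudu setting.
  static final String OVERRIDE_PROPERTY = "gora.kudu.test.masterAddresses";

  private MasterAddresses() {
  }

  /**
   * Returns the override when one is given (external Kudu, e.g. Docker);
   * otherwise asks the supplier, which would start the mini cluster lazily.
   */
  static String resolve(String override, Supplier<String> harnessAddresses) {
    if (override != null && !override.trim().isEmpty()) {
      return override.trim();
    }
    return harnessAddresses.get();
  }

  public static void main(String[] args) {
    // The lambda is a placeholder for "start the harness and return its addresses".
    String chosen = resolve(System.getProperty(OVERRIDE_PROPERTY), () -> "127.0.0.1:7051");
    System.out.println(chosen);
  }
}
```

With something like this, running the tests with -Dgora.kudu.test.masterAddresses=localhost:7051 would target the container, and omitting the property would keep the current harness behaviour.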
>
>
> If you agree, I could create a second version of the tests using
> testcontainers and this docker image.
>
>
> [1]
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/java/org/apache/gora/kudu/GoraKuduTestDriver.java#L41
> [2] https://github.com/apache/gora/pull/178
> [3] https://hub.docker.com/r/usuresearch/apache-kudu/
>
> Cheers,
> John
>
> On Fri, Aug 9, 2019 at 13:33, John Mora (<jh...@gmail.com>)
> wrote:
>
>> Hi Alfonso,
>>
>> Please take into consideration that the property
>> 'gora.datastore.kudu.masterAddress' is overridden in the class
>> GoraKuduTestDriver [1], because KuduTestHarness generates random ports
>> which need to be configured at runtime. You should probably change the
>> property there too.
>>
>> I will test my code with a docker container, in order to figure out the
>> origin of the issue. Please let me know if someone faces this issue when
>> building the project.
>>
>>
>> [1]
>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/java/org/apache/gora/kudu/GoraKuduTestDriver.java#L41
>>
>>
>> Cheers,
>> John.
>>
>>
>> On Fri, Aug 9, 2019 at 12:22, Alfonso Nishikawa (<
>> alfonso.nishikawa@gmail.com>) wrote:
>>
>>> Hi,
>>>
>>> I installed a standalone Kudu server (compiled from sources) in a
>>> virtual machine and configured test/resources/gora.properties to use
>>> "nosql:8051" master.
>>>
>>> The tests freeze waiting for a Kudu response, and tons of connections to
>>> the master get created.
>>>
>>> *To the community:* Can someone please clone
>>> https://github.com/jhnmora000/gora/tree/GORA-485 and build it to see if
>>> the problem can be reproduced?
>>>
>>> I attach snapshots showing how the connections stay waiting.
>>>
>>> Thank you!
>>>
>>> Best Regards,
>>>
>>> Alfonso Nishikawa
>>>
>>>
>>>
>>> On Thu, Aug 8, 2019 at 20:18, Alfonso Nishikawa (<
>>> alfonso.nishikawa@gmail.com>) wrote:
>>>
>>>> Hi, John.
>>>>
>>>> Tried using Oracle jdk 1.8 and found the same core dump:
>>>>
>>>> [INFO] Running org.apache.gora.kudu.store.TestKuduStore
>>>> [ERROR] Tests run: 44, Failures: 0, Errors: 40, Skipped: 4, Time
>>>> elapsed: 52.466 s <<< FAILURE! - in org.apache.gora.kudu.store.TestKuduStore
>>>> [ERROR] testNewInstance(org.apache.gora.kudu.store.TestKuduStore)  Time
>>>> elapsed: 1.834 s  <<< ERROR!
>>>> java.io.IOException: failed to start masters: Unable to start Master at
>>>> index 0:
>>>> /tmp/kudu-binary-jar4319751617646651391/kudu-binary-1.9.0-linux-x86_64/bin/kudu-master:
>>>> process exited on signal 6 (core dumped)
>>>>
>>>> (I expected it to fail too, since the problem doesn't look like it is
>>>> related to the JVM).
>>>>
>>>> Thanks for giving it a look. I don't know what the problem might be :\
>>>>
>>>> Best Regards,
>>>>
>>>> Alfonso Nishikawa
>>>>
>>>>
>>>> On Tue, Aug 6, 2019 at 4:26, John Mora (<jh...@gmail.com>)
>>>> wrote:
>>>>
>>>>> Hi Alfonso,
>>>>>
>>>>> Unfortunately, I have not been able to reproduce the issue. Maybe it
>>>>> is related to my Java version (Oracle); I will try with OpenJDK.
>>>>> Some details about my development environment:
>>>>>
>>>>> os.detected.name: linux
>>>>> os.detected.arch: x86_64
>>>>> os.detected.version: 4.10
>>>>> os.detected.version.major: 4
>>>>> os.detected.version.minor: 10
>>>>> os.detected.release: linuxmint
>>>>> os.detected.release.version: 18.3
>>>>> os.detected.release.like.linuxmint: true
>>>>> os.detected.release.like.ubuntu: true
>>>>> os.detected.classifier: linux-x86_64
>>>>>
>>>>> Java
>>>>> java version "1.8.0_171"
>>>>> Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
>>>>>
>>>>> Maven
>>>>> Apache Maven 3.3.9
>>>>> Maven home: /usr/share/maven
>>>>> Java version: 1.8.0_171, vendor: Oracle Corporation
>>>>> Java home: /usr/lib/jvm/java-8-oracle/jre
>>>>> Default locale: en_US, platform encoding: UTF-8
>>>>> OS name: "linux", version: "4.10.0-38-generic", arch: "amd64", family:
>>>>> "unix"
>>>>>
>>>>>
>>>>> Best,
>>>>> John.
>>>>>
>>>>> On Mon, Aug 5, 2019 at 16:48, Alfonso Nishikawa (<
>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am now using the following pom configuration, which I got from executing
>>>>>> `mvn dependency:tree`:
>>>>>>
>>>>>>     <dependency>
>>>>>>       <groupId>org.apache.kudu</groupId>
>>>>>>       <artifactId>kudu-binary</artifactId>
>>>>>>       <classifier>linux-x86_64</classifier>
>>>>>>       <version>1.9.0</version>
>>>>>>       <scope>test</scope>
>>>>>>     </dependency>
>>>>>>
>>>>>> When I execute `mvn clean package` on gora-kudu I find that it spawns
>>>>>> the following command:
>>>>>>
>>>>>> kudu-master
>>>>>> --fs_wal_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/wal
>>>>>> --fs_data_dirs=/tmp/mini-kudu-cluster8989984398759938222/master-0/data
>>>>>> --block_manager=log --webserver_interface=localhost --ipki_ca_key_size=1024
>>>>>> --tsk_num_rsa_bits=512 --rpc_bind_addresses=*127.26.116.190*:39535
>>>>>> --webserver_interface=*127.26.116.190* --webserver_port=0
>>>>>> --never_fsync --ipki_server_key_size=1024 --enable_minidumps=false
>>>>>> --redact=none --metrics_log_interval_ms=1000 --logtostderr --logbuflevel=-1
>>>>>> --log_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/logs
>>>>>> --server_dump_info_path=/tmp/mini-kudu-cluster8989984398759938222/master-0/data/info.pb
>>>>>> --server_dump_info_format=pb --rpc_server_allow_ephemeral_ports
>>>>>> --unlock_experimental_flags --unlock_unsafe_flags --rpc_reuseport=true
>>>>>> --master_addresses=*127.26.116.190*:39535,*127.26.116.189*:33913,
>>>>>> *127.26.116.188*:42253
>>>>>>
>>>>>>
>>>>>> I highlight the IP addresses because they clearly are not my
>>>>>> computer's, and I guess that is why the tests can't connect to the
>>>>>> database.
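
One detail that may help here: the whole 127.0.0.0/8 block is reserved for loopback, so addresses like the highlighted ones should still refer to the local machine and may not be the cause of the connection failure. A minimal check using only the JDK (the class name is hypothetical) could look like this:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper that tells whether a literal IP address is a loopback address. */
final class LoopbackCheck {

  private LoopbackCheck() {
  }

  static boolean isLoopback(String ipLiteral) {
    try {
      // For a literal IP address, getByName only validates the format; no DNS lookup is made.
      return InetAddress.getByName(ipLiteral).isLoopbackAddress();
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException("Not a valid IP literal: " + ipLiteral, e);
    }
  }

  public static void main(String[] args) {
    // Addresses taken from the kudu-master command line quoted above.
    for (String ip : new String[] {"127.26.116.190", "127.26.116.189", "127.26.116.188"}) {
      System.out.println(ip + " loopback=" + isLoopback(ip));
    }
  }
}
```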
>>>>>>
>>>>>> Any idea on how to solve this?
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>>
>>>>>> Best Regards,
>>>>>>
>>>>>> Alfonso Nishikawa
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Aug 5, 2019 at 8:39, Alfonso Nishikawa (<
>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>
>>>>>>> Hi, John.
>>>>>>>
>>>>>>> I get a core dump from the binary kudu server when trying to run the
>>>>>>> tests. I didn't find a log file, but I will search more thoroughly later.
>>>>>>> Has this ever happened to you? Does it happen to anyone else?
>>>>>>>
>>>>>>> I am using Ubuntu 18.04
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Alfonso Nishikawa
>>>>>>>
>>>>>>> On Sun, Aug 4, 2019, 20:10 Furkan KAMACI <fu...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> I've already made my comments at your PR. Please check them
>>>>>>>> carefully and ask me if you need help.
>>>>>>>>
>>>>>>>> For the documentation, I've checked what you've done. In addition,
>>>>>>>> I would like to encourage you to write a blog post about your Kudu
>>>>>>>> implementation that demonstrates an example of Kudu integration with Gora
>>>>>>>> as a tutorial.
>>>>>>>>
>>>>>>>> Kind Regards,
>>>>>>>> Furkan KAMACI
>>>>>>>>
>>>>>>>> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all.
>>>>>>>>>
>>>>>>>>> I have updated my report in the Wiki[1].
>>>>>>>>>
>>>>>>>>> Also, I have sent a PR with my last commits for review [2]. Please
>>>>>>>>> give it a look if you have time.
>>>>>>>>>
>>>>>>>>> This week, I will continue working on the documentation of the
>>>>>>>>> kudu datastore.
>>>>>>>>>
>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>> [2] https://github.com/apache/gora/pull/178
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> John.
>>>>>>>>>
>>>>>>>>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<
>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> Thanks for the update. I reviewed your code a little bit, and it is
>>>>>>>>>> looking good. I think that you should send a PR in order to receive feedback
>>>>>>>>>> from other community members.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Carlos
>>>>>>>>>>
>>>>>>>>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<
>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all.
>>>>>>>>>>>
>>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>
>>>>>>>>>>> This week, I will give a look to the documentation of datastores.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> John
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<
>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, I was using the class javafx.util.Pair. It is not a
>>>>>>>>>>>> problem; I will find an alternative, since it is only a utility class.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> John
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I checked out your code and it looks good :)
>>>>>>>>>>>>> I found that you use javafx, but that is not present in
>>>>>>>>>>>>> OpenJDK and fails to compile, and since we don't stick to Oracle JVM I
>>>>>>>>>>>>> would suggest to change it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good job, keep it going :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<
>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This week, I will give a look to the map reduce tests for
>>>>>>>>>>>>>> DataStores.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<
>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This week, I will be working on the getPartitions and
>>>>>>>>>>>>>>> deleteByQuery methods and running the other tests in the DataStoreTestBase
>>>>>>>>>>>>>>> class.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<
>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks so much for your time and support for this project.
>>>>>>>>>>>>>>>> I will work on your comments. Responses inline :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sorry for the delay, I am changing work and I have been
>>>>>>>>>>>>>>>>> very busy :( I will try to answer your questions :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> *> In the Employee example there is a field called
>>>>>>>>>>>>>>>>> 'dateOfBirth'. I tried to map that field with the UNIXTIME_MICROS datatype
>>>>>>>>>>>>>>>>> of Kudu (I intuitively assumed this is a date.). However, in the java world
>>>>>>>>>>>>>>>>> the Employee field is a Long value and the kudu datatype is a Timestamp.
>>>>>>>>>>>>>>>>> So, I was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>>>>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Avro 1.8 introduced "Logical Types", so there is a
>>>>>>>>>>>>>>>>> "date" type with an underlying "int" [1]. It's the first time I have read about
>>>>>>>>>>>>>>>>> them, because they weren't there until the last Avro version upgrade. I would
>>>>>>>>>>>>>>>>> suggest ignoring "dates" and mapping dateOfBirth as long, since in any case
>>>>>>>>>>>>>>>>> -in avro- the value is the unix epoch. After this first approach, a design
>>>>>>>>>>>>>>>>> improvement would be great, though :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Would be good to have in the mapping a "timestamp" type
>>>>>>>>>>>>>>>>> so KuduStore converts between the Entity long field <-> Kudu timestamp
>>>>>>>>>>>>>>>>> storage?
>>>>>>>>>>>>>>>>> - Is there any other approach?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think that the Entity long field <-> Kudu timestamp
>>>>>>>>>>>>>>>> conversion is the best alternative right now, because I would add more
>>>>>>>>>>>>>>>> compatible datatypes to the mapping parameters that users can use. Also,
>>>>>>>>>>>>>>>> this conversion should not be difficult to implement, in my opinion. The
>>>>>>>>>>>>>>>> new Date datatype of Avro could be implemented in newer versions,
>>>>>>>>>>>>>>>> because it would need further analysis in other datastores too. I will work
>>>>>>>>>>>>>>>> on that.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> *> What is the Gora's policy regarding flush()? *
>>>>>>>>>>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html>and
>>>>>>>>>>>>>>>>> also can set time interval
>>>>>>>>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>>>>>>>>>>> for automatic flush.*
>>>>>>>>>>>>>>>>> *> Should theses behaviors be configurable using
>>>>>>>>>>>>>>>>> gora.properties file? or just use the default configurations.*
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>>>>>>>>>>> gora.properties [2], which is used when the Table is instantiated, but at the same
>>>>>>>>>>>>>>>>> time we implement the flush() method to force the flush [3]. I would
>>>>>>>>>>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>>>>>>>>>>> What flushing mode (and time interval, if it applies) do you suggest?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will
>>>>>>>>>>>>>>>> do the job for most use cases. But I will add a configuration in
>>>>>>>>>>>>>>>> gora.properties for selecting the other modes and specifying an autoflush
>>>>>>>>>>>>>>>> interval if needed by the user.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed
>>>>>>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the expected
>>>>>>>>>>>>>>>>> behavior of this method?, should I use the partition definition in the xml
>>>>>>>>>>>>>>>>> mapping file for this?.*
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The method getPartitions(Query) is related to Hadoop.
>>>>>>>>>>>>>>>>> Apache Gora integrates with Hadoop by implementing a custom Map and Reduce
>>>>>>>>>>>>>>>>> that allow getting/writing Entities directly.
>>>>>>>>>>>>>>>>> You can take a look at HBase's implementation [4], which
>>>>>>>>>>>>>>>>> relies on o.a.h.hbase.mapreduce.TableInputFormatBase [5] to
>>>>>>>>>>>>>>>>> compute the splits (start key---end key) with the location of each split to
>>>>>>>>>>>>>>>>> create a collection of partitions [6].
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So, if Kudu is allowed to perform computation using local
>>>>>>>>>>>>>>>>> kudu splits, then this method does the needed preparation to allow to "send
>>>>>>>>>>>>>>>>> the computation to where the data is locally".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In any case, you can see that:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    - MongoDB store implementation does not implement
>>>>>>>>>>>>>>>>>    splitting [7]
>>>>>>>>>>>>>>>>>    - Cassandra store implementation does not implement
>>>>>>>>>>>>>>>>>    splitting [8]
>>>>>>>>>>>>>>>>>    - Aerospike store implementation does not implement
>>>>>>>>>>>>>>>>>    splitting [9]
>>>>>>>>>>>>>>>>>    - Accumulo store implementation* does* implement
>>>>>>>>>>>>>>>>>    splitting [10]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If Kudu has a method to get the different splits for a
>>>>>>>>>>>>>>>>> table and its locations, then you will be able to implement the full
>>>>>>>>>>>>>>>>> feature.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
>>>>>>>>>>>>>>>>> elaborated much, so if you find you need more information let me know :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I will check whether Kudu has these features in order to
>>>>>>>>>>>>>>>> implement this method. If not I will use the default implementation found
>>>>>>>>>>>>>>>> in other backends.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> About Queries, what I can tell is that HBase only
>>>>>>>>>>>>>>>>> implements "Start key" + "End key" because it has only 2 operations: "get"
>>>>>>>>>>>>>>>>> and "scan", and the querying is for the "scan" operation, where you want an
>>>>>>>>>>>>>>>>> interval (or all) of the rows. Does Kudu have more querying functionality?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, Kudu implements a Scanner for querying data along with
>>>>>>>>>>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On another topic, I am trying to install Kudu in
>>>>>>>>>>>>>>>>> standalone mode (all in 1 node). Do you use a Cloudera installation or do you
>>>>>>>>>>>>>>>>> have a standalone installation? How do you do it? I found some
>>>>>>>>>>>>>>>>> instructions, but they talk about compiling Kudu [11]. I was looking for
>>>>>>>>>>>>>>>>> something like HBase, where it is just unzip + execute "hbase start".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am using an embedded mini-cluster which comes with
>>>>>>>>>>>>>>>> compiled binaries and can be used with maven[1] for testing my code. Once I
>>>>>>>>>>>>>>>> get it mature enough I think I will be testing the datastore with a docker
>>>>>>>>>>>>>>>> container [2]. I could not find a unzip+execute bundle either and I am
>>>>>>>>>>>>>>>> kinda noob for compiling it myself.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>>>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Good job and thank you!! :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>> https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<
>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As every week I updated my report in the Wiki[1]. Also, I
>>>>>>>>>>>>>>>>>> pushed my last commits to my branch [2]. Please give it a look if you have
>>>>>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>>>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Also, while reviewing the datastore interface I noticed
>>>>>>>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the expected
>>>>>>>>>>>>>>>>>> behavior of this method? Should I use the partition definition in the xml
>>>>>>>>>>>>>>>>>> mapping file for this?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I received my first evaluation from the Google Summer of
>>>>>>>>>>>>>>>>>>> Code program with a positive result. Thanks so much for your support and
>>>>>>>>>>>>>>>>>>> confidence to the project and me.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I
>>>>>>>>>>>>>>>>>>> pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This week, I will be reviewing the serialization/
>>>>>>>>>>>>>>>>>>> deserialization process in order to identify optimizations specific to
>>>>>>>>>>>>>>>>>>> Kudu, because I used generic methods from other backends which probably
>>>>>>>>>>>>>>>>>>> could be better tuned for Kudu. Also, I will start working on the Queries
>>>>>>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> BTW, I added a question to the wiki about Date types.
>>>>>>>>>>>>>>>>>>> Please give it a look if you have time.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks for the reminder. I submitted the form
>>>>>>>>>>>>>>>>>>>> yesterday. :D
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on
>>>>>>>>>>>>>>>>>>>>> June 28th. Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1].
>>>>>>>>>>>>>>>>>>>>>> Also, I pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know
>>>>>>>>>>>>>>>>>>>>>> how datastores deal with flush(): should it always be executed manually?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Finally, this week I will be implementing object
>>>>>>>>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, exists. Do
>>>>>>>>>>>>>>>>>>>>>> you have any suggestions on how to proceed with this task?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
>>>>>>>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But
>>>>>>>>>>>>>>>>>>>>>>> I noticed that you have created an Enumeration for datatypes, which is very
>>>>>>>>>>>>>>>>>>>>>>> similar to the kudu-client's [2]. You should probably replace [1] with [2]
>>>>>>>>>>>>>>>>>>>>>>> in order to avoid code duplication.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I
>>>>>>>>>>>>>>>>>>>>>>>> noticed that my code is lacking some javadoc documentation, so I think I will
>>>>>>>>>>>>>>>>>>>>>>>> be working on that this week. I would also like to enable and check the schema
>>>>>>>>>>>>>>>>>>>>>>>> management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on
>>>>>>>>>>>>>>>>>>>>>>>>> your comments.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso
>>>>>>>>>>>>>>>>>>>>>>>>> Nishikawa (<al...@gmail.com>)
>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>    - How to represent partitioning
>>>>>>>>>>>>>>>>>>>>>>>>>>    configurations on the mapping file.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes
>>>>>>>>>>>>>>>>>>>>>>>>>> with 3.6.0, far from 3.1.1, and I assume everyone uses Maven 3 in a quite
>>>>>>>>>>>>>>>>>>>>>>>>>> new version :)
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso
>>>>>>>>>>>>>>>>>>>>>>>>>> Nishikawa (<al...@gmail.com>)
>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> - The version of a maven dependency [1] should
>>>>>>>>>>>>>>>>>>>>>>>>>>> go on the Dependency Management of the root pom [2]. Same for [3] and from
>>>>>>>>>>>>>>>>>>>>>>>>>>> there, should not set the version there.
>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set test dependencies' scope to test, at [4]
>>>>>>>>>>>>>>>>>>>>>>>>>>> and from there.
>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
>>>>>>>>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is "Daytime Protocol"
>>>>>>>>>>>>>>>>>>>>>>>>>>> RFC 867 and you will need root permission to run it. The default port for
>>>>>>>>>>>>>>>>>>>>>>>>>>> kudu is 7051, isn't it?
>>>>>>>>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality
>>>>>>>>>>>>>>>>>>>>>>>>>>> to load the mapping from configuration as in HBase's store [7] in your
>>>>>>>>>>>>>>>>>>>>>>>>>>> KuduStore [8]. This will have implications on your readMapping at [9], so
>>>>>>>>>>>>>>>>>>>>>>>>>>> take a look at the one for HBase at [10]
>>>>>>>>>>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
>>>>>>>>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java since we have the checked ones) like in
>>>>>>>>>>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12]
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki
>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]. This next week I think I should be focusing on the create schema
>>>>>>>>>>>>>>>>>>>>>>>>>>>> operation and solving the issue of the partitioning configurations in the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> mapping file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my
>>>>>>>>>>>>>>>>>>>>>>>>>>>> last commits are available here [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Kudu datastore reports

Posted by John Mora <jh...@gmail.com>.
Hi.

I updated my PR[2] with some improvements.

Also, I tested my code with this docker image [3] and it worked fine.

This is what I did:

Run the docker container:

docker run --hostname localhost -d --rm --name apache-kudu -p 7051:7051 \
  -p 7050:7050 -p 8051:8051 -p 8050:8050 usuresearch/apache-kudu:1.8.0
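
Before pointing the tests at the container, it can help to confirm that the published master RPC port (7051 in the command above) accepts connections. The following is only a minimal sketch using the JDK; the class name KuduPortCheck and the 2-second timeout are assumptions made for illustration and are not part of gora-kudu. It checks that a TCP connection can be opened, not that Kudu itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of gora-kudu): checks that a TCP port accepts connections. */
final class KuduPortCheck {

  private KuduPortCheck() {
  }

  /** Returns true if a TCP connection to host:port can be opened within timeoutMillis. */
  static boolean isReachable(String host, int port, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException e) {
      // Connection refused, timeout, or unresolved host: all treated as "not reachable".
      return false;
    }
  }

  public static void main(String[] args) {
    // 7051 is the master RPC port published by the docker command above.
    System.out.println("master reachable: " + isReachable("localhost", 7051, 2000));
  }
}
```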

Change master address within tests [1]:

@Override
public void setUpClass() throws Exception {
  // KuduTestHarness start-up is commented out so the tests use the docker container
  // harness.before();
  conf.set(KuduBackendConstants.PROP_MASTERADDRESSES, "localhost:7051");
}

@Override
public void tearDownClass() throws Exception {
  // KuduTestHarness shutdown is commented out as well
  // harness.after();
}
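
Instead of commenting the harness calls in and out by hand, the choice between the mini cluster and an external master could be driven by a property. This is only a sketch of that idea under stated assumptions: the class TestMasterAddress and the property name gora.kudu.test.masterAddresses are hypothetical and do not exist in gora-kudu, and the Supplier stands in for starting the KuduTestHarness and reading its address.

```java
import java.util.Properties;
import java.util.function.Supplier;

/** Hypothetical helper (not part of gora-kudu): picks the Kudu master address for the tests. */
final class TestMasterAddress {

  /** Assumed property name, chosen for this sketch only. */
  static final String EXTERNAL_MASTER_PROP = "gora.kudu.test.masterAddresses";

  private TestMasterAddress() {
  }

  /** True when an external cluster (for example the docker container) was configured. */
  static boolean useExternalCluster(Properties props) {
    String value = props.getProperty(EXTERNAL_MASTER_PROP);
    return value != null && !value.trim().isEmpty();
  }

  /**
   * Returns the external address if one is configured; otherwise asks the supplier,
   * which stands in for starting the mini cluster and reading its master address.
   */
  static String resolve(Properties props, Supplier<String> miniClusterAddress) {
    if (useExternalCluster(props)) {
      return props.getProperty(EXTERNAL_MASTER_PROP).trim();
    }
    return miniClusterAddress.get();
  }

  public static void main(String[] args) {
    // Example: java -Dgora.kudu.test.masterAddresses=localhost:7051 TestMasterAddress
    String address = resolve(System.getProperties(), () -> "<mini-cluster address>");
    System.out.println(address);
  }
}
```

With something like this, setUpClass would call harness.before() only when useExternalCluster(...) is false, so the same test driver could serve both setups.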


If you agree, I could create a second version of the tests using
testcontainers and this docker image.


[1]
https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/java/org/apache/gora/kudu/GoraKuduTestDriver.java#L41
[2] https://github.com/apache/gora/pull/178
[3] https://hub.docker.com/r/usuresearch/apache-kudu/

Cheers,
John

On Fri, Aug 9, 2019 at 13:33, John Mora (<jh...@gmail.com>)
wrote:

> Hi Alfonso,
>
> Please take into consideration that the property
> 'gora.datastore.kudu.masterAddress' is overridden in the class
> GoraKuduTestDriver [1], because KuduTestHarness generates random ports
> which need to be configured at runtime. You should probably change the
> property there too.
>
> I will test my code with a docker container, in order to figure out the
> origin of the issue. Please let me know if someone faces this issue when
> building the project.
>
>
> [1]
> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/java/org/apache/gora/kudu/GoraKuduTestDriver.java#L41
>
>
> Cheers,
> John.
>
>
> On Fri, Aug 9, 2019 at 12:22, Alfonso Nishikawa (<
> alfonso.nishikawa@gmail.com>) wrote:
>
>> Hi,
>>
>> I installed a standalone Kudu server (compiled from sources) in a virtual
>> machine and configured test/resources/gora.properties to use "nosql:8051"
>> master.
>>
>> The tests freeze waiting for a Kudu response, and tons of connections to
>> the master get created.
>>
>> *To the community:* Can someone please clone
>> https://github.com/jhnmora000/gora/tree/GORA-485 and build it to see if
>> the problem can be reproduced?
>>
>> I attach snapshots showing how the connections stay waiting.
>>
>> Thank you!
>>
>> Best Regards,
>>
>> Alfonso Nishikawa
>>
>>
>>
>> On Thu, Aug 8, 2019 at 20:18, Alfonso Nishikawa (<
>> alfonso.nishikawa@gmail.com>) wrote:
>>
>>> Hi, John.
>>>
>>> I tried using Oracle JDK 1.8 and got the same core dump:
>>>
>>> [INFO] Running org.apache.gora.kudu.store.TestKuduStore
>>> [ERROR] Tests run: 44, Failures: 0, Errors: 40, Skipped: 4, Time
>>> elapsed: 52.466 s <<< FAILURE! - in org.apache.gora.kudu.store.TestKuduStore
>>> [ERROR] testNewInstance(org.apache.gora.kudu.store.TestKuduStore)  Time
>>> elapsed: 1.834 s  <<< ERROR!
>>> java.io.IOException: failed to start masters: Unable to start Master at
>>> index 0:
>>> /tmp/kudu-binary-jar4319751617646651391/kudu-binary-1.9.0-linux-x86_64/bin/kudu-master:
>>> process exited on signal 6 (core dumped)
>>>
>>> (I expected it to fail too, since the problem doesn't look like it is
>>> related to the JVM).
>>>
>>> Thanks for giving it a look. I don't know what the problem might be :\
>>>
>>> Best Regards,
>>>
>>> Alfonso Nishikawa
>>>
>>>
>>> On Tue, Aug 6, 2019 at 4:26, John Mora (<jh...@gmail.com>)
>>> wrote:
>>>
>>>> Hi Alfonso,
>>>>
>>>> Unfortunately, I have not been able to reproduce the issue. Maybe it is
>>>> related to my Java version (Oracle); I will try with OpenJDK.
>>>> Some details about my development environment:
>>>>
>>>> os.detected.name: linux
>>>> os.detected.arch: x86_64
>>>> os.detected.version: 4.10
>>>> os.detected.version.major: 4
>>>> os.detected.version.minor: 10
>>>> os.detected.release: linuxmint
>>>> os.detected.release.version: 18.3
>>>> os.detected.release.like.linuxmint: true
>>>> os.detected.release.like.ubuntu: true
>>>> os.detected.classifier: linux-x86_64
>>>>
>>>> Java
>>>> java version "1.8.0_171"
>>>> Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
>>>> Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
>>>>
>>>> Maven
>>>> Apache Maven 3.3.9
>>>> Maven home: /usr/share/maven
>>>> Java version: 1.8.0_171, vendor: Oracle Corporation
>>>> Java home: /usr/lib/jvm/java-8-oracle/jre
>>>> Default locale: en_US, platform encoding: UTF-8
>>>> OS name: "linux", version: "4.10.0-38-generic", arch: "amd64", family:
>>>> "unix"
>>>>
>>>>
>>>> Best,
>>>> John.
>>>>
>>>> On Mon, Aug 5, 2019 at 16:48, Alfonso Nishikawa (<
>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am now using the following pom configuration, which I got from
>>>>> executing `mvn dependency:tree`:
>>>>>
>>>>>     <dependency>
>>>>>       <groupId>org.apache.kudu</groupId>
>>>>>       <artifactId>kudu-binary</artifactId>
>>>>>       <classifier>linux-x86_64</classifier>
>>>>>       <version>1.9.0</version>
>>>>>       <scope>test</scope>
>>>>>     </dependency>
>>>>>
>>>>> When I execute `mvn clean package` on gora-kudu I find that it spawns
>>>>> the following command:
>>>>>
>>>>> kudu-master
>>>>> --fs_wal_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/wal
>>>>> --fs_data_dirs=/tmp/mini-kudu-cluster8989984398759938222/master-0/data
>>>>> --block_manager=log --webserver_interface=localhost --ipki_ca_key_size=1024
>>>>> --tsk_num_rsa_bits=512 --rpc_bind_addresses=*127.26.116.190*:39535
>>>>> --webserver_interface=*127.26.116.190* --webserver_port=0
>>>>> --never_fsync --ipki_server_key_size=1024 --enable_minidumps=false
>>>>> --redact=none --metrics_log_interval_ms=1000 --logtostderr --logbuflevel=-1
>>>>> --log_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/logs
>>>>> --server_dump_info_path=/tmp/mini-kudu-cluster8989984398759938222/master-0/data/info.pb
>>>>> --server_dump_info_format=pb --rpc_server_allow_ephemeral_ports
>>>>> --unlock_experimental_flags --unlock_unsafe_flags --rpc_reuseport=true
>>>>> --master_addresses=*127.26.116.190*:39535,*127.26.116.189*:33913,
>>>>> *127.26.116.188*:42253
>>>>>
>>>>>
>>>>> I highlight the IP addresses because they clearly are not my computer,
>>>>> and I guess that is why the tests can't connect to the database.
>>>>>
>>>>> Any idea on how to solve this?
>>>>>
>>>>> Thank you!
>>>>>
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Alfonso Nishikawa
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Aug 5, 2019 at 8:39, Alfonso Nishikawa (<
>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>
>>>>>> Hi, John.
>>>>>>
>>>>>> I get a core dump from the binary kudu server when trying to run the
>>>>>> tests. I didn't find a log file, but I will search more thoroughly later.
>>>>>> Has this ever happened to you? Does it happen to anyone else?
>>>>>>
>>>>>> I am using Ubuntu 18.04
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Alfonso Nishikawa
>>>>>>
>>>>>> On Sun, Aug 4, 2019 at 20:10, Furkan KAMACI <fu...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi John,
>>>>>>>
>>>>>>> I've already made my comments at your PR. Please check them
>>>>>>> carefully and ask me if you need help.
>>>>>>>
>>>>>>> For the documentation, I've checked what you've done. In addition,
>>>>>>> I would like to encourage you to write a blog post about your Kudu
>>>>>>> implementation that demonstrates an example of Kudu integration with
>>>>>>> Gora, like a tutorial.
>>>>>>>
>>>>>>> Kind Regards,
>>>>>>> Furkan KAMACI
>>>>>>>
>>>>>>> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi all.
>>>>>>>>
>>>>>>>> I have updated my report in the Wiki[1].
>>>>>>>>
>>>>>>>> Also, I have sent a PR with my last commits for review [2]. Please
>>>>>>>> give it a look if you have time.
>>>>>>>>
>>>>>>>> This week, I will continue working on the documentation of the kudu
>>>>>>>> datastore.
>>>>>>>>
>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>> [2] https://github.com/apache/gora/pull/178
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> John.
>>>>>>>>
>>>>>>>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<
>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>
>>>>>>>>> Hi John,
>>>>>>>>>
>>>>>>>>> Thanks for the update. I reviewed your code a little bit, and it is
>>>>>>>>> looking good. I think that you should send a PR in order to receive feedback
>>>>>>>>> from other community members.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Carlos
>>>>>>>>>
>>>>>>>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<
>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi all.
>>>>>>>>>>
>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>
>>>>>>>>>> This week, I will give a look to the documentation of datastores.
>>>>>>>>>>
>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> John
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<
>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>>
>>>>>>>>>>> Yes, I was using the class javafx.util.Pair. It is not a
>>>>>>>>>>> problem; I will find an alternative, since it is only a utility class.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> John
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>
>>>>>>>>>>>> I checked out your code and it looks good :)
>>>>>>>>>>>> I found that you use javafx, but that is not present in OpenJDK
>>>>>>>>>>>> and fails to compile, and since we don't stick to the Oracle JVM I would
>>>>>>>>>>>> suggest changing it.
>>>>>>>>>>>>
>>>>>>>>>>>> Good job, keep it going :)
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<
>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This week, I will give a look to the map reduce tests for
>>>>>>>>>>>>> DataStores.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> John
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<
>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This week, I will be working on the getPartitions and
>>>>>>>>>>>>>> deleteByQuery methods and running the other tests in the DataStoreTestBase
>>>>>>>>>>>>>> class.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<
>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks so much for your time and support for this project. I
>>>>>>>>>>>>>>> will work on your comments. Responses inline :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sorry for the delay, I am changing work and I have been
>>>>>>>>>>>>>>>> very busy :( I will try to answer your questions :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *> In the Employee example there is a field called
>>>>>>>>>>>>>>>> 'dateOfBirth'. I tried to map that field with the UNIXTIME_MICROS datatype
>>>>>>>>>>>>>>>> of Kudu (I intuitively assumed this is a date.). However, in the java world
>>>>>>>>>>>>>>>> the Employee field is a Long value and the kudu datatype is a Timestamp.
>>>>>>>>>>>>>>>> So, I was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>>>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "Logical Types" were introduced in Avro 1.8, so there is a
>>>>>>>>>>>>>>>> "date" type with an underlying "int" [1]. It's the first time I have read
>>>>>>>>>>>>>>>> about them, because they weren't there until the last Avro version upgrade.
>>>>>>>>>>>>>>>> I would suggest ignoring "dates" and mapping dateOfBirth as a long, since in
>>>>>>>>>>>>>>>> any case -in avro- the value is the unix epoch. After this first approach, a
>>>>>>>>>>>>>>>> design improvement would be great, though :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Would it be good to have a "timestamp" type in the mapping,
>>>>>>>>>>>>>>>> so that KuduStore converts between the Entity long field <-> Kudu timestamp
>>>>>>>>>>>>>>>> storage?
>>>>>>>>>>>>>>>> - Is there any other approach?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion
>>>>>>>>>>>>>>> is the best alternative right now, because I would add more compatible
>>>>>>>>>>>>>>> datatypes to the mapping parameters which users can use. Also, this
>>>>>>>>>>>>>>> conversion should not be difficult to implement in my opinion. The new
>>>>>>>>>>>>>>> Date datatype of Avro could be implemented in newer versions, because it
>>>>>>>>>>>>>>> would need further analysis in other datastores too. I will work on that.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *> What is Gora's policy regarding flush()?*
>>>>>>>>>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html> and
>>>>>>>>>>>>>>>> can also set a time interval
>>>>>>>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>>>>>>>>>> for automatic flushing.*
>>>>>>>>>>>>>>>> *> Should these behaviors be configurable using the
>>>>>>>>>>>>>>>> gora.properties file, or should I just use the default configuration?*
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>>>>>>>>>> gora.properties [2], which is used when instantiating the Table, but at the
>>>>>>>>>>>>>>>> same time we implement the flush() method to force the flush [3]. I would
>>>>>>>>>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>>>>>>>>>> What flushing mode (and time interval, if it applies) do you suggest?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do
>>>>>>>>>>>>>>> the job for most use cases. But I will add a configuration in
>>>>>>>>>>>>>>> gora.properties for selecting the other modes and specifying an autoflush
>>>>>>>>>>>>>>> time if needed by the user.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed
>>>>>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the expected
>>>>>>>>>>>>>>>> behavior of this method? Should I use the partition definition in the xml
>>>>>>>>>>>>>>>> mapping file for this?*
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The method getPartitions(Query) is related to Hadoop.
>>>>>>>>>>>>>>>> Apache Gora integrates with Hadoop by implementing a custom Map and Reduce
>>>>>>>>>>>>>>>> that allow getting/writing Entities directly.
>>>>>>>>>>>>>>>> You can take a look at HBase's implementation [4], which
>>>>>>>>>>>>>>>> relies on o.a.h.hbase.mapreduce.TableInputFormatBase [5] to
>>>>>>>>>>>>>>>> compute the splits (start key---end key) with the location of each split to
>>>>>>>>>>>>>>>> create a collection of partitions [6].
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So, if Kudu is allowed to perform computation using local
>>>>>>>>>>>>>>>> kudu splits, then this method does the needed preparation to allow to "send
>>>>>>>>>>>>>>>> the computation to where the data is locally".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In any case, you can see that:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    - MongoDB store implementation does not implement
>>>>>>>>>>>>>>>>    splitting [7]
>>>>>>>>>>>>>>>>    - Cassandra store implementation does not implement
>>>>>>>>>>>>>>>>    splitting [8]
>>>>>>>>>>>>>>>>    - Aerospike store implementation does not implement
>>>>>>>>>>>>>>>>    splitting [9]
>>>>>>>>>>>>>>>>    - Accumulo store implementation *does* implement
>>>>>>>>>>>>>>>>    splitting [10]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If Kudu has a method to get the different splits for a
>>>>>>>>>>>>>>>> table and its locations, then you will be able to implement the full
>>>>>>>>>>>>>>>> feature.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
>>>>>>>>>>>>>>>> elaborated much, so if you find you need more information let me know :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will check whether Kudu has these features in order to
>>>>>>>>>>>>>>> implement this method. If not I will use the default implementation found
>>>>>>>>>>>>>>> in other backends.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> About Queries, what I can tell is that HBase only
>>>>>>>>>>>>>>>> implements "Start key" + "End key" because it has only 2 operations: "get"
>>>>>>>>>>>>>>>> and "scan", and the querying is for the "scan" operation, where you want an
>>>>>>>>>>>>>>>> interval (or all) of the rows. Does Kudu have more querying functionality?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, Kudu implements a Scanner for querying data, along with
>>>>>>>>>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On another topic, I am trying to install Kudu in
>>>>>>>>>>>>>>>> standalone mode (all in 1 node). Do you use a Cloudera installation or do
>>>>>>>>>>>>>>>> you have a standalone installation? How do you do it? I found some
>>>>>>>>>>>>>>>> instructions, but they talk about compiling Kudu [11]. I was looking for
>>>>>>>>>>>>>>>> something like HBase, where it is just unzip + execute "hbase start".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am using an embedded mini-cluster, which comes with
>>>>>>>>>>>>>>> compiled binaries and can be used with maven[1], for testing my code. Once
>>>>>>>>>>>>>>> it is mature enough I think I will test the datastore with a docker
>>>>>>>>>>>>>>> container [2]. I could not find an unzip+execute bundle either, and I am
>>>>>>>>>>>>>>> kind of a noob at compiling it myself.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Good job and thank you!! :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>> https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<
>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As every week I updated my report in the Wiki[1]. Also, I
>>>>>>>>>>>>>>>>> pushed my last commits to my branch [2]. Please give it a look if you have
>>>>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also, while reviewing the datastore interface I noticed
>>>>>>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the expected
>>>>>>>>>>>>>>>>> behavior of this method? Should I use the partition definition in the xml
>>>>>>>>>>>>>>>>> mapping file for this?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I received my first evaluation from the Google Summer of
>>>>>>>>>>>>>>>>>> Code program with a positive result. Thanks so much for your support and
>>>>>>>>>>>>>>>>>> confidence in the project and me.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I
>>>>>>>>>>>>>>>>>> pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This week, I will be reviewing the serialization/deserialization
>>>>>>>>>>>>>>>>>> process in order to identify optimizations specific to
>>>>>>>>>>>>>>>>>> Kudu, because I used generic methods from other backends which could
>>>>>>>>>>>>>>>>>> probably be better tuned for Kudu. Also, I will start working on the Queries
>>>>>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> BTW, I added a question to the wiki about Date types.
>>>>>>>>>>>>>>>>>> Please give it a look if you have time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday.
>>>>>>>>>>>>>>>>>>> :D
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on
>>>>>>>>>>>>>>>>>>>> June 28th. Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1].
>>>>>>>>>>>>>>>>>>>>> Also, I pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
>>>>>>>>>>>>>>>>>>>>> datastores deal with flush(): should it always be executed manually?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Finally, this week I will be implementing object
>>>>>>>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, exists. Do
>>>>>>>>>>>>>>>>>>>>> you have any suggestions on how to proceed with this task?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
>>>>>>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But I
>>>>>>>>>>>>>>>>>>>>>> noticed that you have created an Enumeration for datatypes, which is very
>>>>>>>>>>>>>>>>>>>>>> similar to the kudu-client's [2]. You should probably replace [1] with [2]
>>>>>>>>>>>>>>>>>>>>>> in order to avoid code duplication.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I
>>>>>>>>>>>>>>>>>>>>>>> noticed that my code is lacking some javadoc documentation, so I think I
>>>>>>>>>>>>>>>>>>>>>>> will be working on that this week. I would also like to enable and check
>>>>>>>>>>>>>>>>>>>>>>> the schema management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on
>>>>>>>>>>>>>>>>>>>>>>>> your comments.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso
>>>>>>>>>>>>>>>>>>>>>>>> Nishikawa (<al...@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations
>>>>>>>>>>>>>>>>>>>>>>>>>    on the mapping file.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes
>>>>>>>>>>>>>>>>>>>>>>>>> with 3.6.0, well past 3.1.1, and I assume everyone uses a fairly
>>>>>>>>>>>>>>>>>>>>>>>>> recent version of Maven 3 :)
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso
>>>>>>>>>>>>>>>>>>>>>>>>> Nishikawa (<al...@gmail.com>)
>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> - The version of a maven dependency [1] should go
>>>>>>>>>>>>>>>>>>>>>>>>>> in the Dependency Management of the root pom [2]. Same for [3] and from
>>>>>>>>>>>>>>>>>>>>>>>>>> there on; the version should not be set there.
>>>>>>>>>>>>>>>>>>>>>>>>>> - Set test dependencies' scope to test, at [4]
>>>>>>>>>>>>>>>>>>>>>>>>>> and from there on.
>>>>>>>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
>>>>>>>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is "Daytime Protocol"
>>>>>>>>>>>>>>>>>>>>>>>>>> RFC 867 and you will need root permission to run it. The default port for
>>>>>>>>>>>>>>>>>>>>>>>>>> kudu is 7051, isn't it?
>>>>>>>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality
>>>>>>>>>>>>>>>>>>>>>>>>>> to load the mapping from configuration as in HBase's store [7] in your
>>>>>>>>>>>>>>>>>>>>>>>>>> KuduStore [8]. This will have implications on your readMapping at [9], so
>>>>>>>>>>>>>>>>>>>>>>>>>> take a look at the one for HBase at [10]
>>>>>>>>>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
>>>>>>>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java since we have the checked ones) like in
>>>>>>>>>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12]
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki
>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]. This next week I think I should be focusing on the create schema
>>>>>>>>>>>>>>>>>>>>>>>>>>> operation and solving the issue of the partitioning configurations in the
>>>>>>>>>>>>>>>>>>>>>>>>>>> mapping file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my
>>>>>>>>>>>>>>>>>>>>>>>>>>> last commits are available here [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Kudu datastore reports

Posted by John Mora <jh...@gmail.com>.
Hi Alfonso,

Please take into consideration that the property
'gora.datastore.kudu.masterAddress' is overridden in the class
GoraKuduTestDriver [1], because KuduTestHarness generates random ports
that need to be configured at runtime. You should probably change the
property there too.
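
As a rough sketch of what such a runtime override looks like (this is not
the actual code of GoraKuduTestDriver; the class MasterAddressOverride and
the method withMasterAddress are hypothetical names used only for
illustration, and 127.0.0.1:39535 stands in for whatever address the test
harness reports at runtime):

```java
import java.util.Properties;

// Hypothetical illustration only: not taken from the gora-kudu sources.
public class MasterAddressOverride {

  // Property key mentioned in this thread.
  static final String MASTER_KEY = "gora.datastore.kudu.masterAddress";

  // Returns a copy of the base properties in which the master address
  // is replaced by the address obtained at runtime.
  static Properties withMasterAddress(Properties base, String runtimeAddress) {
    Properties copy = new Properties();
    copy.putAll(base);
    copy.setProperty(MASTER_KEY, runtimeAddress);
    return copy;
  }

  public static void main(String[] args) {
    // Value as it would be read from test/resources/gora.properties.
    Properties fromFile = new Properties();
    fromFile.setProperty(MASTER_KEY, "nosql:8051");

    // Assumed placeholder for the random address chosen by the harness.
    Properties effective = withMasterAddress(fromFile, "127.0.0.1:39535");

    // The store would see the runtime value, not the one from the file.
    System.out.println(effective.getProperty(MASTER_KEY));
  }
}
```

The point of the sketch is that a value set in gora.properties has no
effect as long as the test driver replaces it afterwards, so the change
has to be made where the override happens.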

I will test my code with a docker container, in order to figure out the
origin of the issue. Please let me know if someone faces this issue when
building the project.


[1]
https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/java/org/apache/gora/kudu/GoraKuduTestDriver.java#L41


Cheers,
John.


On Fri, Aug 9, 2019 at 12:22, Alfonso Nishikawa (<
alfonso.nishikawa@gmail.com>) wrote:

> Hi,
>
> I installed a standalone Kudu server (compiled from source) in a virtual
> machine and configured test/resources/gora.properties to use the
> "nosql:8051" master.
>
> The tests freeze waiting for a response from Kudu, and tons of connections
> to the master get created.
>
> *To the community:* Can someone please clone
> https://github.com/jhnmora000/gora/tree/GORA-485 and build it to see if
> the problem can be reproduced?
>
> I attach snapshots showing how the connections stay waiting.
>
> Thank you!
>
> Best Regards,
>
> Alfonso Nishikawa
>
>
>
> On Thu, Aug 8, 2019 at 20:18, Alfonso Nishikawa (<
> alfonso.nishikawa@gmail.com>) wrote:
>
>> Hi, John.
>>
>> I tried using Oracle JDK 1.8 and found the same core dump:
>>
>> [INFO] Running org.apache.gora.kudu.store.TestKuduStore
>> [ERROR] Tests run: 44, Failures: 0, Errors: 40, Skipped: 4, Time elapsed:
>> 52.466 s <<< FAILURE! - in org.apache.gora.kudu.store.TestKuduStore
>> [ERROR] testNewInstance(org.apache.gora.kudu.store.TestKuduStore)  Time
>> elapsed: 1.834 s  <<< ERROR!
>> java.io.IOException: failed to start masters: Unable to start Master at
>> index 0:
>> /tmp/kudu-binary-jar4319751617646651391/kudu-binary-1.9.0-linux-x86_64/bin/kudu-master:
>> process exited on signal 6 (core dumped)
>>
>> (I expected it to fail too, since the problem doesn't look related to
>> the JVM.)
>>
>> Thanks for giving it a look. I don't know what the problem might be :\
>>
>> Best Regards,
>>
>> Alfonso Nishikawa
>>
>>
>> On Tue, Aug 6, 2019 at 4:26, John Mora (<jh...@gmail.com>)
>> wrote:
>>
>>> Hi Alfonso,
>>>
>>> Unfortunately, I have not been able to reproduce the issue. Maybe it is
>>> related to my Java version (Oracle); I will try with OpenJDK.
>>> Some details about my development environment:
>>>
>>> os.detected.name: linux
>>> os.detected.arch: x86_64
>>> os.detected.version: 4.10
>>> os.detected.version.major: 4
>>> os.detected.version.minor: 10
>>> os.detected.release: linuxmint
>>> os.detected.release.version: 18.3
>>> os.detected.release.like.linuxmint: true
>>> os.detected.release.like.ubuntu: true
>>> os.detected.classifier: linux-x86_64
>>>
>>> Java
>>> java version "1.8.0_171"
>>> Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
>>> Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
>>>
>>> Maven
>>> Apache Maven 3.3.9
>>> Maven home: /usr/share/maven
>>> Java version: 1.8.0_171, vendor: Oracle Corporation
>>> Java home: /usr/lib/jvm/java-8-oracle/jre
>>> Default locale: en_US, platform encoding: UTF-8
>>> OS name: "linux", version: "4.10.0-38-generic", arch: "amd64", family:
>>> "unix"
>>>
>>>
>>> Best,
>>> John.
>>>
>>> On Mon, Aug 5, 2019 at 16:48, Alfonso Nishikawa (<
>>> alfonso.nishikawa@gmail.com>) wrote:
>>>
>>>> Hi,
>>>>
>>>> I am now using the following pom configuration, which I got from
>>>> executing `mvn dependency:tree`:
>>>>
>>>>     <dependency>
>>>>       <groupId>org.apache.kudu</groupId>
>>>>       <artifactId>kudu-binary</artifactId>
>>>>       <classifier>linux-x86_64</classifier>
>>>>       <version>1.9.0</version>
>>>>       <scope>test</scope>
>>>>     </dependency>
>>>>
>>>> When I execute `mvn clean package` on gora-kudu, I find that it spawns
>>>> the following command:
>>>>
>>>> kudu-master
>>>> --fs_wal_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/wal
>>>> --fs_data_dirs=/tmp/mini-kudu-cluster8989984398759938222/master-0/data
>>>> --block_manager=log --webserver_interface=localhost --ipki_ca_key_size=1024
>>>> --tsk_num_rsa_bits=512 --rpc_bind_addresses=*127.26.116.190*:39535
>>>> --webserver_interface=*127.26.116.190* --webserver_port=0
>>>> --never_fsync --ipki_server_key_size=1024 --enable_minidumps=false
>>>> --redact=none --metrics_log_interval_ms=1000 --logtostderr --logbuflevel=-1
>>>> --log_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/logs
>>>> --server_dump_info_path=/tmp/mini-kudu-cluster8989984398759938222/master-0/data/info.pb
>>>> --server_dump_info_format=pb --rpc_server_allow_ephemeral_ports
>>>> --unlock_experimental_flags --unlock_unsafe_flags --rpc_reuseport=true
>>>> --master_addresses=*127.26.116.190*:39535,*127.26.116.189*:33913,
>>>> *127.26.116.188*:42253
>>>>
>>>>
>>>> I highlighted the IP addresses because they clearly are not my
>>>> computer's, and I guess that is why the tests can't connect to the database.
>>>>
>>>> Any idea on how to solve this?
>>>>
>>>> Thank you!
>>>>
>>>>
>>>> Best Regards,
>>>>
>>>> Alfonso Nishikawa
>>>>
>>>>
>>>>
>>>> On Mon, Aug 5, 2019 at 8:39, Alfonso Nishikawa (<
>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>
>>>>> Hi, John.
>>>>>
>>>>> I get a core dump from the binary Kudu server when trying to run the
>>>>> tests. I didn't find a log file, but I will search more thoroughly later.
>>>>> Has this ever happened to you? Does it happen to anyone else?
>>>>>
>>>>> I am using Ubuntu 18.04
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Regards,
>>>>>
>>>>> Alfonso Nishikawa
>>>>>
>>>>> On Sun, Aug 4, 2019 at 20:10, Furkan KAMACI <fu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> I've already made my comments at your PR. Please check them carefully
>>>>>> and ask me if you need help.
>>>>>>
>>>>>> For the documentation, I've checked what you've done. In addition, I
>>>>>> would like to encourage you to write a blog post about your Kudu
>>>>>> implementation and demonstrate an example of Kudu integration with Gora
>>>>>> as a tutorial.
>>>>>>
>>>>>> Kind Regards,
>>>>>> Furkan KAMACI
>>>>>>
>>>>>> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all.
>>>>>>>
>>>>>>> I have updated my report in the Wiki[1].
>>>>>>>
>>>>>>> Also, I have sent a PR with my last commits for review [2]. Please
>>>>>>> give it a look if you have time.
>>>>>>>
>>>>>>> This week, I will continue working on the documentation of the kudu
>>>>>>> datastore.
>>>>>>>
>>>>>>> Please let me know if you have suggestions.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>> [2] https://github.com/apache/gora/pull/178
>>>>>>>
>>>>>>> Best,
>>>>>>> John.
>>>>>>>
>>>>>>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<
>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> Thanks for the update. I reviewed your code a little bit, and it is
>>>>>>>> looking good. I think that you should send a PR in order to receive
>>>>>>>> feedback from other community members.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Carlos
>>>>>>>>
>>>>>>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<jh...@gmail.com>)
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all.
>>>>>>>>>
>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
>>>>>>>>> to my branch [2]. Please give it a look if you have time.
>>>>>>>>>
>>>>>>>>> This week, I will take a look at the documentation of datastores.
>>>>>>>>>
>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> John
>>>>>>>>>
>>>>>>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<
>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>
>>>>>>>>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I
>>>>>>>>>> will find an alternative, since it is only a utility class.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> John
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, John.
>>>>>>>>>>>
>>>>>>>>>>> I checked out your code and it looks good :)
>>>>>>>>>>> I found that you use javafx, but that is not present in OpenJDK
>>>>>>>>>>> and fails to compile, and since we don't stick to the Oracle JVM I
>>>>>>>>>>> would suggest changing it.
>>>>>>>>>>>
>>>>>>>>>>> Good job, keep it going :)
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<
>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi.
>>>>>>>>>>>>
>>>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>>
>>>>>>>>>>>> This week, I will take a look at the MapReduce tests for
>>>>>>>>>>>> DataStores.
>>>>>>>>>>>>
>>>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> John
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<
>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all
>>>>>>>>>>>>>
>>>>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This week, I will be working on the getPartitions and
>>>>>>>>>>>>> deleteByQuery methods and checking the other tests in the DataStoreTestBase
>>>>>>>>>>>>> class.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> John.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<
>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks so much for your time and support for this project. I
>>>>>>>>>>>>>> will work on your comments. Responses inline :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sorry for the delay, I am changing work and I have been very
>>>>>>>>>>>>>>> busy :( I will try to answer your questions :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *> In the Employee example there is a field called
>>>>>>>>>>>>>>> 'dateOfBirth'. I tried to map that field with the UNIXTIME_MICROS datatype
>>>>>>>>>>>>>>> of Kudu (I intuitively assumed this is a date.). However, in the java world
>>>>>>>>>>>>>>> the Employee field is a Long value and the kudu datatype is a Timestamp.
>>>>>>>>>>>>>>> So, I was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In Avro 1.8, "Logical Types" were introduced, so there is a
>>>>>>>>>>>>>>> "date" type with an underlying "int" [1]. It's the first time I have read about
>>>>>>>>>>>>>>> them, because they weren't there until the last Avro version upgrade. I would
>>>>>>>>>>>>>>> suggest ignoring "dates" and mapping dateOfBirth as a long, since in any case
>>>>>>>>>>>>>>> (in Avro) the value is based on the Unix epoch. After this first approach, a design
>>>>>>>>>>>>>>> improvement would be great, though :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Would be good to have in the mapping a "timestamp" type so
>>>>>>>>>>>>>>> KuduStore converts between the Entity long field <-> Kudu timestamp storage?
>>>>>>>>>>>>>>> - Is there any other approach?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion
>>>>>>>>>>>>>> is the best alternative right now, because it would add more compatible
>>>>>>>>>>>>>> datatypes to the mapping parameters that users can use. This
>>>>>>>>>>>>>> conversion should not be difficult to implement, in my opinion. Also, the new
>>>>>>>>>>>>>> Date datatype of Avro could be implemented in later versions, because it
>>>>>>>>>>>>>> would need further analysis in other datastores too. I will work on that.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *> What is Gora's policy regarding flush()?*
>>>>>>>>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html>
>>>>>>>>>>>>>>> and can also set a time interval
>>>>>>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>>>>>>>>> for automatic flush.*
>>>>>>>>>>>>>>> *> Should these behaviors be configurable using the
>>>>>>>>>>>>>>> gora.properties file, or should we just use the default configuration?*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>>>>>>>>> gora.properties [2], which is used when the Table is instantiated, but at the same
>>>>>>>>>>>>>>> time we implement the flush() method to force the flush [3]. I would
>>>>>>>>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>>>>>>>>> What flushing mode (and time interval, if it applies) do you suggest?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do
>>>>>>>>>>>>>> the job for most use cases. But I will add a configuration in
>>>>>>>>>>>>>> gora.properties for selecting the other modes and specifying an autoflush
>>>>>>>>>>>>>> time if needed by the user.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed
>>>>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the expected
>>>>>>>>>>>>>>> behavior of this method?, should I use the partition definition in the xml
>>>>>>>>>>>>>>> mapping file for this?.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache
>>>>>>>>>>>>>>> Gora integrates with Hadoop implementing a custom Map and Reduce that
>>>>>>>>>>>>>>> allows to get/write Entities directly.
>>>>>>>>>>>>>>> You can take a look at HBase's implementation [4], which
>>>>>>>>>>>>>>> relies on o.a.h.hbase.mapreduce.TableInputFormatBase [5] to
>>>>>>>>>>>>>>> compute the splits (start key---end key) with the location of each split to
>>>>>>>>>>>>>>> create a collection of partitions [6].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, if Kudu is allowed to perform computation using local
>>>>>>>>>>>>>>> kudu splits, then this method does the needed preparation to allow to "send
>>>>>>>>>>>>>>> the computation to where the data is locally".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In any case, you can see that:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    - MongoDB store implementation does not implement
>>>>>>>>>>>>>>>    splitting [7]
>>>>>>>>>>>>>>>    - Cassandra store implementation does not implement
>>>>>>>>>>>>>>>    splitting [8]
>>>>>>>>>>>>>>>    - Aerospike store implementation does not implement
>>>>>>>>>>>>>>>    splitting [9]
>>>>>>>>>>>>>>>    - Accumulo store implementation *does* implement
>>>>>>>>>>>>>>>    splitting [10]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If Kudu has a method to get the different splits for a table
>>>>>>>>>>>>>>> and its locations, then you will be able to implement the full feature.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
>>>>>>>>>>>>>>> elaborated much, so if you find you need more information let me know :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I will check whether Kudu has these features in order to
>>>>>>>>>>>>>> implement this method. If not I will use the default implementation found
>>>>>>>>>>>>>> in other backends.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> About Queries, what I can tell is that HBase only implements
>>>>>>>>>>>>>>> "Start key" + "End key" because it has only 2 operations: "get" and "scan",
>>>>>>>>>>>>>>> and the querying is for the "scan" operation, where you want an interval (or
>>>>>>>>>>>>>>> all) of the rows. Does Kudu have more querying functionality?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, Kudu implements a Scanner for querying data along with
>>>>>>>>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On another topic, I am trying to install Kudu in standalone mode
>>>>>>>>>>>>>>> (all in 1 node). Do you use a Cloudera installation or do you have a
>>>>>>>>>>>>>>> standalone installation? How do you do it? I found some instructions, but
>>>>>>>>>>>>>>> they talk about compiling Kudu [11]. I was looking for something like
>>>>>>>>>>>>>>> HBase, where it is just unzip + execute "hbase start".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am using an embedded mini-cluster which comes with compiled
>>>>>>>>>>>>>> binaries and can be used with maven[1] for testing my code. Once I get it
>>>>>>>>>>>>>> mature enough I think I will be testing the datastore with a docker
>>>>>>>>>>>>>> container [2]. I could not find a unzip+execute bundle either and I am
>>>>>>>>>>>>>> kinda noob for compiling it myself.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Good job and thank you!! :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>> https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<
>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As every week I updated my report in the Wiki[1]. Also, I
>>>>>>>>>>>>>>>> pushed my last commits to my branch [2]. Please give it a look if you have
>>>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also, while reviewing the datastore interface I noticed
>>>>>>>>>>>>>>>> this method 'getPartitions(Query<K, T> query)'. What is the expected
>>>>>>>>>>>>>>>> behavior of this method?, should I use the partition definition in the xml
>>>>>>>>>>>>>>>> mapping file for this?.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I received my first evaluation from the Google Summer of
>>>>>>>>>>>>>>>>> Code program with a positive result. Thanks so much for your support and
>>>>>>>>>>>>>>>>> confidence to the project and me.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I
>>>>>>>>>>>>>>>>> pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This week, I will be reviewing the serialization/
>>>>>>>>>>>>>>>>> deserialization process in order to identify optimizations specific to
>>>>>>>>>>>>>>>>> Kudu, because I used generic methods from other backends which probably
>>>>>>>>>>>>>>>>> could be better tuned for Kudu. Also, I will start working on the Queries
>>>>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> BTW, I added a question to the wiki about Date types.
>>>>>>>>>>>>>>>>> Please give it a look if you have time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday.
>>>>>>>>>>>>>>>>>> :D
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on
>>>>>>>>>>>>>>>>>>> June 28th. Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1].
>>>>>>>>>>>>>>>>>>>> Also, I pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
>>>>>>>>>>>>>>>>>>>> datastores deal with flush(): should it always be executed manually?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Finally, This week I will be implementing object
>>>>>>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, exists. Do
>>>>>>>>>>>>>>>>>>>> you have any suggestions on how to proceed with this task?.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
>>>>>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But, I
>>>>>>>>>>>>>>>>>>>>> noticed that you have created an Enumeration for datatypes, which is very
>>>>>>>>>>>>>>>>>>>>> similar to the kudu-client's [2]. Probably you should replace [1] for [2]
>>>>>>>>>>>>>>>>>>>>> in order to avoid code duplication.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I
>>>>>>>>>>>>>>>>>>>>>> noticed that my code is lacking some javadoc documentation, so I think I will
>>>>>>>>>>>>>>>>>>>>>> be working on that this week. I would also like to enable and check the schema
>>>>>>>>>>>>>>>>>>>>>> management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on
>>>>>>>>>>>>>>>>>>>>>>> your comments.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>> (<al...@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations
>>>>>>>>>>>>>>>>>>>>>>>>    on the mapping file.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with
>>>>>>>>>>>>>>>>>>>>>>>> 3.6.0, far from 3.1.1, and I assume everyone uses Maven 3 in a quite new
>>>>>>>>>>>>>>>>>>>>>>>> version :)
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso
>>>>>>>>>>>>>>>>>>>>>>>> Nishikawa (<al...@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> - The version of a Maven dependency [1] should go
>>>>>>>>>>>>>>>>>>>>>>>>> in the Dependency Management section of the root pom [2]. Same for [3] and the
>>>>>>>>>>>>>>>>>>>>>>>>> ones after it; the version should not be set there.
>>>>>>>>>>>>>>>>>>>>>>>>> - Set the scope of test dependencies to test, at [4] and
>>>>>>>>>>>>>>>>>>>>>>>>> the ones after it.
>>>>>>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
>>>>>>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is the "Daytime Protocol" (RFC
>>>>>>>>>>>>>>>>>>>>>>>>> 867) and you will need root permission to run it. The default port for Kudu
>>>>>>>>>>>>>>>>>>>>>>>>> is 7051, isn't it?
>>>>>>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to
>>>>>>>>>>>>>>>>>>>>>>>>> load the mapping from configuration as in HBase's store [7] to your
>>>>>>>>>>>>>>>>>>>>>>>>> KuduStore [8]. This will have implications for your readMapping at [9], so
>>>>>>>>>>>>>>>>>>>>>>>>> take a look at the one for HBase at [10]
>>>>>>>>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
>>>>>>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java, since we have the checked ones) like in
>>>>>>>>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12]
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki
>>>>>>>>>>>>>>>>>>>>>>>>>> [1]. This next week I think I should be focusing on the create schema
>>>>>>>>>>>>>>>>>>>>>>>>>> operation and solving the issue of the partitioning configurations in the
>>>>>>>>>>>>>>>>>>>>>>>>>> mapping file.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my
>>>>>>>>>>>>>>>>>>>>>>>>>> last commits are available here [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Kudu datastore reports

Posted by John Mora <jh...@gmail.com>.
Hi Alfonso,

Unfortunately, I have not been able to reproduce the issue. Maybe it is
related to my Java version (Oracle); I will try with OpenJDK.
Some details about my development environment:

os.detected.name: linux
os.detected.arch: x86_64
os.detected.version: 4.10
os.detected.version.major: 4
os.detected.version.minor: 10
os.detected.release: linuxmint
os.detected.release.version: 18.3
os.detected.release.like.linuxmint: true
os.detected.release.like.ubuntu: true
os.detected.classifier: linux-x86_64

Java
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
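
For anyone comparing setups, the JVM and OS details listed here can also be
read from inside Java through standard system properties. A minimal sketch
(the class name EnvironmentReport is made up for illustration and is not part
of Gora):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EnvironmentReport {
  // Standard JVM system properties; the selection is just an example.
  private static final String[] KEYS = {
    "java.version", "java.vendor", "java.vm.name", "java.home",
    "os.name", "os.arch", "os.version"
  };

  /** Collects the properties in a stable order so reports are easy to diff. */
  static Map<String, String> collect() {
    Map<String, String> report = new LinkedHashMap<>();
    for (String key : KEYS) {
      report.put(key, System.getProperty(key));
    }
    return report;
  }

  public static void main(String[] args) {
    collect().forEach((key, value) -> System.out.println(key + ": " + value));
  }
}
```

The java.vendor value is what separates an Oracle JDK from an OpenJDK build
when two people compare results.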

Maven
Apache Maven 3.3.9
Maven home: /usr/share/maven
Java version: 1.8.0_171, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-8-oracle/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.10.0-38-generic", arch: "amd64", family:
"unix"


Best,
John.

On Mon, Aug 5, 2019 at 16:48, Alfonso Nishikawa (<
alfonso.nishikawa@gmail.com>) wrote:

> Hi,
>
> I am now using the following pom configuration, which I got from executing
> `mvn dependency:tree`:
>
>     <dependency>
>       <groupId>org.apache.kudu</groupId>
>       <artifactId>kudu-binary</artifactId>
>       <classifier>linux-x86_64</classifier>
>       <version>1.9.0</version>
>       <scope>test</scope>
>     </dependency>
>
> When I execute `mvn clean package` on gora-kudu I find that it spawns the
> following command:
>
> kudu-master
> --fs_wal_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/wal
> --fs_data_dirs=/tmp/mini-kudu-cluster8989984398759938222/master-0/data
> --block_manager=log --webserver_interface=localhost --ipki_ca_key_size=1024
> --tsk_num_rsa_bits=512 --rpc_bind_addresses=*127.26.116.190*:39535
> --webserver_interface=*127.26.116.190* --webserver_port=0 --never_fsync
> --ipki_server_key_size=1024 --enable_minidumps=false --redact=none
> --metrics_log_interval_ms=1000 --logtostderr --logbuflevel=-1
> --log_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/logs
> --server_dump_info_path=/tmp/mini-kudu-cluster8989984398759938222/master-0/data/info.pb
> --server_dump_info_format=pb --rpc_server_allow_ephemeral_ports
> --unlock_experimental_flags --unlock_unsafe_flags --rpc_reuseport=true
> --master_addresses=*127.26.116.190*:39535,*127.26.116.189*:33913,
> *127.26.116.188*:42253
>
>
> I highlighted the IP addresses because they clearly are not my computer's,
> and I guess that is why the tests can't connect to the database.
>
> Any idea on how to solve this?
>
> Thank you!
>
>
> Best Regards,
>
> Alfonso Nishikawa
>
>
>
> On Mon, Aug 5, 2019 at 8:39, Alfonso Nishikawa (<
> alfonso.nishikawa@gmail.com>) wrote:
>
>> Hi, John.
>>
>> I get a core dump from the Kudu server binary when trying to run the
>> tests. I didn't find a log file, but I will search more thoroughly later. Has
>> this ever happened to you? Does it happen to anyone else?
>>
>> I am using Ubuntu 18.04
>>
>> Thank you!
>>
>> Regards,
>>
>> Alfonso Nishikawa
>>
>> On Sun, Aug 4, 2019 at 20:10, Furkan KAMACI <fu...@gmail.com>
>> wrote:
>>
>>> Hi John,
>>>
>>> I've already made my comments at your PR. Please check them carefully
>>> and ask me if you need help.
>>>
>>> For the documentation, I've checked what you've done. In addition,
>>> I would encourage you to write a blog post about your Kudu
>>> implementation and demonstrate an example of Kudu integration with Gora,
>>> as a tutorial.
>>>
>>> Kind Regards,
>>> Furkan KAMACI
>>>
>>> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com> wrote:
>>>
>>>> Hi all.
>>>>
>>>> I have updated my report in the Wiki[1].
>>>>
>>>> Also, I have sent a PR with my last commits for review [2]. Please give
>>>> it a look if you have time.
>>>>
>>>> This week, I will continue working on the documentation of the kudu
>>>> datastore.
>>>>
>>>> Please let me know if you have suggestions.
>>>>
>>>> [1]
>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>> [2] https://github.com/apache/gora/pull/178
>>>>
>>>> Best,
>>>> John.
>>>>
>>>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<ca...@gmail.com>)
>>>> wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> Thanks for the update. I reviewed your code a little bit, and it is
>>>>> looking good. I think that you should send a PR in order to receive feedback
>>>>> from other community members.
>>>>>
>>>>> Best,
>>>>> Carlos
>>>>>
>>>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<jh...@gmail.com>)
>>>>> wrote:
>>>>>
>>>>>> Hi all.
>>>>>>
>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits to
>>>>>> my branch [2]. Please give it a look if you have time.
>>>>>>
>>>>>> This week, I will give a look to the documentation of datastores.
>>>>>>
>>>>>> Please let me know if you have suggestions.
>>>>>>
>>>>>> [1]
>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>
>>>>>> Cheers,
>>>>>> John
>>>>>>
>>>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<jh...@gmail.com>)
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Alfonso,
>>>>>>>
>>>>>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I
>>>>>>> will find an alternative, since it is only a utility class.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> John
>>>>>>>
>>>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>
>>>>>>>> Hi, John.
>>>>>>>>
>>>>>>>> I checked out your code and it looks good :)
>>>>>>>> I found that you use javafx, but that is not present in OpenJDK and
>>>>>>>> fails to compile, and since we don't stick to Oracle JVM I would suggest to
>>>>>>>> change it.
>>>>>>>>
>>>>>>>> Good job, keep it going :)
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Alfonso Nishikawa
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<jh...@gmail.com>)
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi.
>>>>>>>>>
>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
>>>>>>>>> to my branch [2]. Please give it a look if you have time.
>>>>>>>>>
>>>>>>>>> This week, I will give a look to the map reduce tests for
>>>>>>>>> DataStores.
>>>>>>>>>
>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> John
>>>>>>>>>
>>>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<
>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi all
>>>>>>>>>>
>>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last
>>>>>>>>>> commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>
>>>>>>>>>> This week, I will be working in the getPartitions and
>>>>>>>>>> deleteByQuery methods and testing the other tests in the DataStoreTestBase
>>>>>>>>>> class.
>>>>>>>>>>
>>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> John.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<
>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>>
>>>>>>>>>>> Thanks so much for your time and support for this project. I
>>>>>>>>>>> will work on your comments. Responses inline :)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for the delay, I am changing work and I have been very
>>>>>>>>>>>> busy :( I will try to answer your questions :)
>>>>>>>>>>>>
>>>>>>>>>>>> *> In the Employee example there is a field called
>>>>>>>>>>>> 'dateOfBirth'. I tried to map that field with the UNIXTIME_MICROS datatype
>>>>>>>>>>>> of Kudu (I intuitively assumed this is a date.). However, in the java world
>>>>>>>>>>>> the Employee field is a Long value and the kudu datatype is a Timestamp.
>>>>>>>>>>>> So, I was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>>>>>
>>>>>>>>>>>> Avro 1.8 introduced "Logical Types", so there is a
>>>>>>>>>>>> "date" type with an underlying "int" [1]. This is the first time I have read
>>>>>>>>>>>> about them, because they weren't there until the last Avro version upgrade.
>>>>>>>>>>>> I would suggest ignoring "dates" and mapping dateOfBirth as a long, since in
>>>>>>>>>>>> any case (in Avro) the value is relative to the Unix epoch. After this first
>>>>>>>>>>>> approach, a design improvement would be great, though :)
>>>>>>>>>>>>
>>>>>>>>>>>> - Would be good to have in the mapping a "timestamp" type so
>>>>>>>>>>>> KuduStore converts between the Entity long field <-> Kudu timestamp storage?
>>>>>>>>>>>> - Is there any other approach?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion
>>>>>>>>>>> is the best alternative right now, because it would add more compatible
>>>>>>>>>>> datatypes to the mapping parameters that users can use. This
>>>>>>>>>>> conversion should not be difficult to implement, in my opinion. Also, the new
>>>>>>>>>>> Date datatype of Avro could be implemented in later versions, because it
>>>>>>>>>>> would need further analysis in other datastores too. I will work on that.
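
A minimal sketch of the long <-> timestamp conversion discussed above. It
assumes the entity field holds milliseconds since the Unix epoch (the thread
does not state the unit) and that the Kudu column is UNIXTIME_MICROS, which
stores microseconds since the epoch. The class and method names are
hypothetical and are not taken from the gora-kudu code:

```java
import java.util.concurrent.TimeUnit;

class TimestampMicros {
  /** Entity value (assumed epoch milliseconds) to a UNIXTIME_MICROS value. */
  static long millisToMicros(long epochMillis) {
    // TimeUnit saturates at Long.MIN_VALUE / Long.MAX_VALUE instead of overflowing.
    return TimeUnit.MILLISECONDS.toMicros(epochMillis);
  }

  /** UNIXTIME_MICROS value back to epoch milliseconds for the entity field. */
  static long microsToMillis(long epochMicros) {
    // floorDiv rounds toward negative infinity, so dates before 1970 stay consistent.
    return Math.floorDiv(epochMicros, 1000L);
  }
}
```

Going from microseconds back to milliseconds drops the sub-millisecond part,
so the round trip is only exact when it starts from the entity side.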
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *> What is the Gora's policy regarding flush()? *
>>>>>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html>and
>>>>>>>>>>>> also can set time interval
>>>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>>>>>> for automatic flush.*
>>>>>>>>>>>> *> Should theses behaviors be configurable using
>>>>>>>>>>>> gora.properties file? or just use the default configurations.*
>>>>>>>>>>>>
>>>>>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>>>>>> gora.properties [2], which is used when instantiating the Table, but at the
>>>>>>>>>>>> same time we implement the flush() method to force the flush [3]. I would
>>>>>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>>>>>> What flushing mode (and time interval if it applies) do you suggest?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do the
>>>>>>>>>>> job for most use cases. But I will add a configuration in gora.properties
>>>>>>>>>>> for selecting the other modes and specifying an autoflush interval if
>>>>>>>>>>> needed by the user.
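
A rough sketch of how such a setting could be read from gora.properties. The
property keys, the class name and the 1000 ms fallback are assumptions made
for this example; only the three mode names mirror the constants of Kudu's
SessionConfiguration.FlushMode. The enum is redeclared locally so that the
example compiles without the Kudu client on the classpath:

```java
import java.util.Locale;
import java.util.Properties;

class KuduFlushSettings {
  /** Same constant names as Kudu's SessionConfiguration.FlushMode. */
  enum Mode { AUTO_FLUSH_SYNC, AUTO_FLUSH_BACKGROUND, MANUAL_FLUSH }

  // Hypothetical property keys, chosen only for this sketch.
  static final String MODE_KEY = "gora.datastore.kudu.flushMode";
  static final String INTERVAL_KEY = "gora.datastore.kudu.flushIntervalMillis";

  final Mode mode;
  final int intervalMillis;

  KuduFlushSettings(Mode mode, int intervalMillis) {
    this.mode = mode;
    this.intervalMillis = intervalMillis;
  }

  /** Falls back to auto flush sync when the user configures nothing. */
  static KuduFlushSettings from(Properties props) {
    String rawMode = props.getProperty(MODE_KEY, Mode.AUTO_FLUSH_SYNC.name());
    Mode mode = Mode.valueOf(rawMode.trim().toUpperCase(Locale.ROOT));
    // 1000 ms is an assumed fallback for this sketch.
    int interval = Integer.parseInt(props.getProperty(INTERVAL_KEY, "1000").trim());
    if (interval <= 0) {
      throw new IllegalArgumentException(INTERVAL_KEY + " must be positive: " + interval);
    }
    return new KuduFlushSettings(mode, interval);
  }
}
```

In a real store the parsed values would be handed to the Kudu session when
it is created, and flush() would still force a flush regardless of the mode,
as in the HBase example.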
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed this
>>>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>>>>> this method?, should I use the partition definition in the xml mapping file
>>>>>>>>>>>> for this?.*
>>>>>>>>>>>>
>>>>>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache
>>>>>>>>>>>> Gora integrates with Hadoop implementing a custom Map and Reduce that
>>>>>>>>>>>> allows to get/write Entities directly.
>>>>>>>>>>>> You can take a look at HBase's implementation [4], which relies on
>>>>>>>>>>>> o.a.h.hbase.mapreduce.TableInputFormatBase [5] to compute the splits (start
>>>>>>>>>>>> key to end key) together with the location of each split, and creates a
>>>>>>>>>>>> collection of partitions [6].
>>>>>>>>>>>>
>>>>>>>>>>>> So, if Kudu is allowed to perform computation using local kudu
>>>>>>>>>>>> splits, then this method does the needed preparation to allow to "send the
>>>>>>>>>>>> computation to where the data is locally".
>>>>>>>>>>>>
>>>>>>>>>>>> In any case, you can see that:
>>>>>>>>>>>>
>>>>>>>>>>>>    - MongoDB store implementation does not implement splitting
>>>>>>>>>>>>    [7]
>>>>>>>>>>>>    - Cassandra store implementation does not implement
>>>>>>>>>>>>    splitting [8]
>>>>>>>>>>>>    - Aerospike store implementation does not implement
>>>>>>>>>>>>    splitting [9]
>>>>>>>>>>>>    - Accumulo store implementation* does* implement splitting
>>>>>>>>>>>>    [10]
>>>>>>>>>>>>
>>>>>>>>>>>> If Kudu has a method to get the different splits for a table
>>>>>>>>>>>> and its locations, then you will be able to implement the full feature.
>>>>>>>>>>>>
>>>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
>>>>>>>>>>>> elaborated much, so if you find you need more information let me know :)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> I will check whether Kudu has these features in order to
>>>>>>>>>>> implement this method. If not I will use the default implementation found
>>>>>>>>>>> in other backends.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> About Queries, what I can tell is that HBase only implements
>>>>>>>>>>>> "Start key" + "End key" because it has only 2 operations: "get" and "scan",
>>>>>>>>>>>> and the querying is for the "scan" operation, where you want an interval (or
>>>>>>>>>>>> all) of the rows. Does Kudu have more querying functionality?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Yes, Kudu implements a Scanner for querying data, along with
>>>>>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> About other topic, I am trying to install Kudu in standalone
>>>>>>>>>>>> (all in 1 node). Do you use a Cloudera installation or do you have a
>>>>>>>>>>>> standalone installation? How do you do it? I found some instructions, but
>>>>>>>>>>>> they talk about compiling Kudu [11]. I was looking for something like
>>>>>>>>>>>> HBase, that it is unzip + execute "hbase start".
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> I am using an embedded mini-cluster which comes with compiled
>>>>>>>>>>> binaries and can be used with maven[1] for testing my code. Once I get it
>>>>>>>>>>> mature enough I think I will be testing the datastore with a docker
>>>>>>>>>>> container [2]. I could not find a unzip+execute bundle either and I am
>>>>>>>>>>> kinda noob for compiling it myself.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Good job and thank you!! :)
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [1] -
>>>>>>>>>>>> https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>>>>>> [2] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>>>>>> [3] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>>>>>> [4] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>>>>>> [5] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>>>>>> [6] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>>>>>> [7] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>>>>>> [8] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>>>>>> [9] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>>>>>> [10] -
>>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<
>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As every week I updated my report in the Wiki[1]. Also, I
>>>>>>>>>>>>> pushed my last commits to my branch [2]. Please give it a look if you have
>>>>>>>>>>>>> time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, while reviewing the datastore interface I noticed this
>>>>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>>>>>> this method?, should I use the partition definition in the xml mapping file
>>>>>>>>>>>>> for this?.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> John.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I received my first evaluation from the Google Summer of Code
>>>>>>>>>>>>>> program with a positive result. Thanks so much for your support and
>>>>>>>>>>>>>> confidence to the project and me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I
>>>>>>>>>>>>>> pushed my last commits to my branch [2].
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This week, I will be reviewing the serialization/
>>>>>>>>>>>>>> deserialization process in order to identify optimizations specific to
>>>>>>>>>>>>>> Kudu, because I used generic methods from other backends which could
>>>>>>>>>>>>>> probably be better tuned for Kudu. Also, I will start working on the Queries
>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> BTW, I added a question to the wiki about Date types. Please
>>>>>>>>>>>>>> give it a look if you have time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday. :D
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on June
>>>>>>>>>>>>>>>> 28th. Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1].
>>>>>>>>>>>>>>>>> Also, I pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As I mentioned in the reports I would like to know how
>>>>>>>>>>>>>>>>> datastores deal with flush(), should it work always manually executed?.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Finally, This week I will be implementing object
>>>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, exists. Do
>>>>>>>>>>>>>>>>> you have any suggestions on how to proceed with this task?.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
>>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But, I
>>>>>>>>>>>>>>>>>> noticed that you have created an Enumeration for datatypes, which is very
>>>>>>>>>>>>>>>>>> similar to the kudu-client's [2]. Probably you should replace [1] for [2]
>>>>>>>>>>>>>>>>>> in order to avoid code duplication.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I
>>>>>>>>>>>>>>>>>>> noticed that my code is lacking some javadoc documentation I think I will
>>>>>>>>>>>>>>>>>>> be working on that this week, also I would like to enable and check schema
>>>>>>>>>>>>>>>>>>> management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on your
>>>>>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations on
>>>>>>>>>>>>>>>>>>>>>    the mapping file.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This was discussed in other emails, isn't it? :)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with
>>>>>>>>>>>>>>>>>>>>> 3.6.0, far from 3.1.1, and I assume everyone uses Maven 3 in a quite new
>>>>>>>>>>>>>>>>>>>>> version :)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - The version of a maven dependency [1] should go on
>>>>>>>>>>>>>>>>>>>>>> the Dependency Management of the root pom [2]. Same for [3] and from there,
>>>>>>>>>>>>>>>>>>>>>> should not set the version there.
>>>>>>>>>>>>>>>>>>>>>> - Set test dependencies' scope to test, at [4] and
>>>>>>>>>>>>>>>>>>>>>> from there.
>>>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
>>>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is "Daytime Protocol" RFC
>>>>>>>>>>>>>>>>>>>>>> 867 and you will need root permission to run it. The default port for kudu
>>>>>>>>>>>>>>>>>>>>>> is 7051, isn't it?
>>>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to
>>>>>>>>>>>>>>>>>>>>>> load the mapping from configuration as in HBase's store [7] in your
>>>>>>>>>>>>>>>>>>>>>> KuduStore [8]. This will have implications on your readMapping at [9], so
>>>>>>>>>>>>>>>>>>>>>> take a look at the one for HBase at [10]
>>>>>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
>>>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java since we have the checked ones) like in
>>>>>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12]
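
The last point in the list above (wrap unchecked failures in a checked
exception instead of letting a RuntimeException escape) could look roughly
like this. StoreException is a stand-in declared here so that the example
compiles on its own; in Gora the wrapper would be GoraException. The class,
method and attribute names are invented for illustration:

```java
import java.io.IOException;

class MappingAttributeParser {
  /** Stand-in for Gora's checked GoraException, so the sketch has no dependencies. */
  static class StoreException extends IOException {
    StoreException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Parses a numeric mapping attribute and reports bad input as a checked exception. */
  static int parseIntAttribute(String attributeName, String rawValue) throws StoreException {
    try {
      return Integer.parseInt(rawValue.trim());
    } catch (RuntimeException e) {
      // Covers both NumberFormatException and a null value; the cause is kept for debugging.
      throw new StoreException(
          "Invalid value for mapping attribute '" + attributeName + "': " + rawValue, e);
    }
  }
}
```

Callers are then forced by the compiler to handle or declare the failure,
which is the benefit of checked exceptions mentioned in the review.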
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki [1].
>>>>>>>>>>>>>>>>>>>>>>> This next week I think I should be focusing on the create schema operation
>>>>>>>>>>>>>>>>>>>>>>> and solving the issue of the partitioning configurations in the mapping
>>>>>>>>>>>>>>>>>>>>>>> file.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my last
>>>>>>>>>>>>>>>>>>>>>>> commits are available here [2]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>

Re: Kudu datastore reports

Posted by Alfonso Nishikawa <al...@gmail.com>.
Hi,

I am now using the following pom configuration, which I got from executing
`mvn dependency:tree`:

    <dependency>
      <groupId>org.apache.kudu</groupId>
      <artifactId>kudu-binary</artifactId>
      <classifier>linux-x86_64</classifier>
      <version>1.9.0</version>
      <scope>test</scope>
    </dependency>

When I execute `mvn clean package` on gora-kudu I find that it spawns the
following command:

kudu-master
--fs_wal_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/wal
--fs_data_dirs=/tmp/mini-kudu-cluster8989984398759938222/master-0/data
--block_manager=log --webserver_interface=localhost --ipki_ca_key_size=1024
--tsk_num_rsa_bits=512 --rpc_bind_addresses=*127.26.116.190*:39535
--webserver_interface=*127.26.116.190* --webserver_port=0 --never_fsync
--ipki_server_key_size=1024 --enable_minidumps=false --redact=none
--metrics_log_interval_ms=1000 --logtostderr --logbuflevel=-1
--log_dir=/tmp/mini-kudu-cluster8989984398759938222/master-0/logs
--server_dump_info_path=/tmp/mini-kudu-cluster8989984398759938222/master-0/data/info.pb
--server_dump_info_format=pb --rpc_server_allow_ephemeral_ports
--unlock_experimental_flags --unlock_unsafe_flags --rpc_reuseport=true
--master_addresses=*127.26.116.190*:39535,*127.26.116.189*:33913,
*127.26.116.188*:42253


I highlight the IP addresses because they clearly are not my computer's, and
I guess that is why the tests can't connect to the database.
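
One thing worth ruling out before blaming the addresses: every IPv4 address in 127.0.0.0/8 is a loopback address, so 127.26.116.190 and its siblings may still resolve to the local machine. A minimal sketch of that check using only the JDK follows. The class name LoopbackCheck and the 192.168.1.10 comparison address are hypothetical, and the check says nothing about why the tests actually fail.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: reports whether an IP literal is a loopback address.
class LoopbackCheck {

  static boolean isLoopback(String ipLiteral) {
    try {
      // For an IP literal, getByName only validates the format; no DNS lookup is made.
      return InetAddress.getByName(ipLiteral).isLoopbackAddress();
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException("Not a valid IP literal: " + ipLiteral, e);
    }
  }

  public static void main(String[] args) {
    // The three master addresses from the spawned command, plus one
    // made-up LAN address for contrast.
    String[] addresses = {
      "127.26.116.190", "127.26.116.189", "127.26.116.188", "192.168.1.10"
    };
    for (String address : addresses) {
      System.out.println(address + " loopback=" + isLoopback(address));
    }
  }
}
```

If the addresses do turn out to be loopback, the connection problem probably lies elsewhere, for example in the core dump mentioned in the quoted message below.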

Any idea on how to solve this?

Thank you!


Best Regards,

Alfonso Nishikawa



On Mon, Aug 5, 2019 at 8:39, Alfonso Nishikawa (<
alfonso.nishikawa@gmail.com>) wrote:

> Hi, John.
>
> I get a core dump from the binary kudu server when trying to run the
> tests. I didn't find a log file, but I will search more thoroughly later.
> Has this ever happened to you? Does it happen to anyone else?
>
> I am using Ubuntu 18.04
>
> Thank you!
>
> Regards,
>
> Alfonso Nishikawa
>
> On Sun, Aug 4, 2019 at 20:10, Furkan KAMACI <fu...@gmail.com>
> wrote:
>
>> Hi John,
>>
>> I've already made my comments at your PR. Please check them carefully and
>> ask me if you need help.
>>
>> For the documentation, I've checked what you've done. In addition,
>> I would like to encourage you to write a blog post about your Kudu
>> implementation and demonstrate an example of Kudu integration with Gora
>> as a tutorial.
>>
>> Kind Regards,
>> Furkan KAMACI
>>
>> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com> wrote:
>>
>>> Hi all.
>>>
>>> I have updated my report in the Wiki[1].
>>>
>>> Also, I have sent a PR with my last commits for review [2]. Please give
>>> it a look if you have time.
>>>
>>> This week, I will continue working on the documentation of the kudu
>>> datastore.
>>>
>>> Please let me know if you have suggestions.
>>>
>>> [1]
>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>> [2] https://github.com/apache/gora/pull/178
>>>
>>> Best,
>>> John.
>>>
>>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<ca...@gmail.com>)
>>> wrote:
>>>
>>>> Hi John,
>>>>
>>>> Thanks for the update. I reviewed your code a little bit, and it is looking
>>>> good. I think that you should send a PR in order to receive feedback from
>>>> other community members.
>>>>
>>>> Best,
>>>> Carlos
>>>>
>>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<jh...@gmail.com>)
>>>> wrote:
>>>>
>>>>> Hi all.
>>>>>
>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits to
>>>>> my branch [2]. Please give it a look if you have time.
>>>>>
>>>>> This week, I will give a look to the documentation of datastores.
>>>>>
>>>>> Please let me know if you have suggestions.
>>>>>
>>>>> [1]
>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>
>>>>> Cheers,
>>>>> John
>>>>>
>>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<jh...@gmail.com>)
>>>>> wrote:
>>>>>
>>>>>> Hi Alfonso,
>>>>>>
>>>>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I
>>>>>> will find an alternative, since it is only a utility class.
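
One JDK-only stand-in for javafx.util.Pair is java.util.AbstractMap.SimpleImmutableEntry, which is available on OpenJDK without JavaFX. This is a sketch of one possible alternative, not necessarily what the gora-kudu code ended up using. The class name PairExample and the helper pairOf are hypothetical.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical example: a key/value holder built only from java.util types.
class PairExample {

  // Returns an immutable key/value pair; getKey()/getValue() mirror Pair's accessors.
  static <K, V> Map.Entry<K, V> pairOf(K key, V value) {
    return new SimpleImmutableEntry<>(key, value);
  }

  public static void main(String[] args) {
    Map.Entry<String, Integer> column = pairOf("dateOfBirth", 3);
    System.out.println(column.getKey() + " -> " + column.getValue());
  }
}
```

Because the accessors have the same names as those of javafx.util.Pair, call sites mostly need only the type and constructor swapped.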
>>>>>>
>>>>>> Thanks,
>>>>>> John
>>>>>>
>>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>
>>>>>>> Hi, John.
>>>>>>>
>>>>>>> I checked out your code and it looks good :)
>>>>>>> I found that you use javafx, which is not present in OpenJDK, so the
>>>>>>> code fails to compile there. Since we don't tie ourselves to the Oracle
>>>>>>> JVM, I would suggest changing it.
>>>>>>>
>>>>>>> Good job, keep it going :)
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Alfonso Nishikawa
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<jh...@gmail.com>)
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi.
>>>>>>>>
>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
>>>>>>>> to my branch [2]. Please give it a look if you have time.
>>>>>>>>
>>>>>>>> This week, I will give a look to the map reduce tests for
>>>>>>>> DataStores.
>>>>>>>>
>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> John
>>>>>>>>
>>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<jh...@gmail.com>)
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all
>>>>>>>>>
>>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
>>>>>>>>> to my branch [2]. Please give it a look if you have time.
>>>>>>>>>
>>>>>>>>> This week, I will be working on the getPartitions and
>>>>>>>>> deleteByQuery methods and running the other tests in the DataStoreTestBase
>>>>>>>>> class.
>>>>>>>>>
>>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> John.
>>>>>>>>>
>>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<
>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi Alfonso,
>>>>>>>>>>
>>>>>>>>>> Thanks so much for your time and support for this project. I will
>>>>>>>>>> work on your comments. Responses inline :)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, John.
>>>>>>>>>>>
>>>>>>>>>>> Sorry for the delay, I am changing work and I have been very
>>>>>>>>>>> busy :( I will try to answer your questions :)
>>>>>>>>>>>
>>>>>>>>>>> *> In the Employee example there is a field called
>>>>>>>>>>> 'dateOfBirth'. I tried to map that field with the UNIXTIME_MICROS datatype
>>>>>>>>>>> of Kudu (I intuitively assumed this is a date.). However, in the java world
>>>>>>>>>>> the Employee field is a Long value and the kudu datatype is a Timestamp.
>>>>>>>>>>> So, I was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>>>>
>>>>>>>>>>> Avro 1.8 introduced "Logical Types", so there is a "date"
>>>>>>>>>>> type with an underlying "int" [1]. It's the first time I have read about
>>>>>>>>>>> them, because they weren't there until the last version upgrade of Avro.
>>>>>>>>>>> I would suggest ignoring "dates" and mapping dateOfBirth as long, since in
>>>>>>>>>>> any case -in avro- the value is the unix epoch. After this first approach,
>>>>>>>>>>> a design improvement would be great, though :)
>>>>>>>>>>>
>>>>>>>>>>> - Would be good to have in the mapping a "timestamp" type so
>>>>>>>>>>> KuduStore converts between the Entity long field <-> Kudu timestamp storage?
>>>>>>>>>>> - Is there any other approach?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion is
>>>>>>>>>> the best alternative right now, because it would let me add more compatible
>>>>>>>>>> datatypes to the mapping parameters for users. And this
>>>>>>>>>> conversion should not be difficult to implement, in my opinion. Also, the new
>>>>>>>>>> Date datatype of avro could be implemented in newer versions because it
>>>>>>>>>> would need further analysis in other datastores too. I will work on that.
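
The long <-> timestamp conversion discussed here can be sketched with the JDK alone. Kudu's UNIXTIME_MICROS holds microseconds since the Unix epoch. The sketch assumes the entity's long field holds epoch milliseconds, which the thread does not confirm, and the class name EpochMicros and its method names are hypothetical. No Kudu client calls are shown.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for converting between epoch milliseconds (assumed
// entity representation) and epoch microseconds (Kudu UNIXTIME_MICROS).
class EpochMicros {

  // Entity long (assumed milliseconds) -> microseconds; fails loudly on overflow.
  static long millisToMicros(long millis) {
    return Math.multiplyExact(millis, 1_000L);
  }

  // Microseconds -> milliseconds, rounding toward negative infinity so that
  // pre-1970 values stay consistent.
  static long microsToMillis(long micros) {
    return Math.floorDiv(micros, 1_000L);
  }

  // Microseconds since the epoch -> java.time.Instant.
  static Instant fromMicros(long micros) {
    return Instant.EPOCH.plus(micros, ChronoUnit.MICROS);
  }

  // java.time.Instant -> microseconds since the epoch (sub-microsecond part dropped).
  static long toMicros(Instant instant) {
    long wholeSeconds = Math.multiplyExact(instant.getEpochSecond(), 1_000_000L);
    return Math.addExact(wholeSeconds, instant.getNano() / 1_000);
  }

  public static void main(String[] args) {
    long dateOfBirthMillis = 631_152_000_000L; // 1990-01-01T00:00:00Z
    long micros = millisToMicros(dateOfBirthMillis);
    System.out.println(micros + " -> " + fromMicros(micros));
  }
}
```

Going from microseconds back to milliseconds loses precision, so a round trip is only exact when the stored value started out as milliseconds.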
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *> What is the Gora's policy regarding flush()? *
>>>>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html> and
>>>>>>>>>>> can also set a time interval
>>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>>>>> for automatic flush.*
>>>>>>>>>>> *> Should these behaviors be configurable using the gora.properties
>>>>>>>>>>> file, or should we just use the default configurations?*
>>>>>>>>>>>
>>>>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>>>>> gora.properties [2], which is used when instantiating the Table, but at the
>>>>>>>>>>> same time we implement the flush() method to force the flush [3]. I would
>>>>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>>>>> What flushing mode (and time interval, if it applies) do you suggest?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do the
>>>>>>>>>> job for most use cases. But I will add a configuration in gora.properties
>>>>>>>>>> for selecting the other modes and specifying an autoflush time if needed
>>>>>>>>>> by the user.
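
A rough sketch of how such a gora.properties setting could be parsed. The property keys gora.kudu.flushMode and gora.kudu.flushInterval, the class name FlushSettings, and the 1000 ms fallback are all assumptions made for illustration. The nested enum only mirrors the constant names of Kudu's SessionConfiguration.FlushMode so that the sketch compiles without kudu-client on the classpath.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical holder for flush settings read from gora.properties.
class FlushSettings {

  // Stand-in for org.apache.kudu.client.SessionConfiguration.FlushMode.
  enum FlushMode { AUTO_FLUSH_SYNC, AUTO_FLUSH_BACKGROUND, MANUAL_FLUSH }

  final FlushMode mode;
  final int intervalMs;

  FlushSettings(FlushMode mode, int intervalMs) {
    this.mode = mode;
    this.intervalMs = intervalMs;
  }

  // Reads the (hypothetical) keys, falling back to auto flush sync when unset.
  static FlushSettings fromProperties(Properties props) {
    String rawMode = props.getProperty("gora.kudu.flushMode", "AUTO_FLUSH_SYNC");
    String rawInterval = props.getProperty("gora.kudu.flushInterval", "1000");
    // valueOf throws IllegalArgumentException for an unknown mode name.
    FlushMode mode = FlushMode.valueOf(rawMode.trim().toUpperCase(Locale.ROOT));
    int intervalMs = Integer.parseInt(rawInterval.trim());
    if (intervalMs < 0) {
      throw new IllegalArgumentException("Flush interval must not be negative: " + intervalMs);
    }
    return new FlushSettings(mode, intervalMs);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("gora.kudu.flushMode", "manual_flush");
    FlushSettings settings = fromProperties(props);
    System.out.println(settings.mode + " every " + settings.intervalMs + " ms");
  }
}
```

In a real store the parsed values would be handed to KuduSession.setFlushMode and KuduSession.setFlushInterval when the session is created, while flush() on the datastore would still force a flush on demand.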
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed this
>>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>>>> this method?, should I use the partition definition in the xml mapping file
>>>>>>>>>>> for this?.*
>>>>>>>>>>>
>>>>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache
>>>>>>>>>>> Gora integrates with Hadoop by implementing a custom Map and Reduce that
>>>>>>>>>>> allow Entities to be read and written directly.
>>>>>>>>>>> You can take a look at HBase's implementation [4], which relies on
>>>>>>>>>>> o.a.h.hbase.mapreduce.TableInputFormatBase
>>>>>>>>>>> [5] to compute the splits (start key---end key) with the location of the
>>>>>>>>>>> split to create a collection of partitions [6].
>>>>>>>>>>>
>>>>>>>>>>> So, if Kudu is allowed to perform computation using local kudu
>>>>>>>>>>> splits, then this method does the needed preparation to allow to "send the
>>>>>>>>>>> computation to where the data is locally".
>>>>>>>>>>>
>>>>>>>>>>> In any case, you can see that:
>>>>>>>>>>>
>>>>>>>>>>>    - MongoDB store implementation does not implement splitting
>>>>>>>>>>>    [7]
>>>>>>>>>>>    - Cassandra store implementation does not implement
>>>>>>>>>>>    splitting [8]
>>>>>>>>>>>    - Aerospike store implementation does not implement
>>>>>>>>>>>    splitting [9]
>>>>>>>>>>>    - Accumulo store implementation* does* implement splitting
>>>>>>>>>>>    [10]
>>>>>>>>>>>
>>>>>>>>>>> If Kudu has a method to get the different splits for a table and
>>>>>>>>>>> its locations, then you will be able to implement the full feature.
>>>>>>>>>>>
>>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
>>>>>>>>>>> elaborated much, so if you find you need more information let me know :)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> I will check whether Kudu has these features in order to
>>>>>>>>>> implement this method. If not I will use the default implementation found
>>>>>>>>>> in other backends.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> About Queries, what I can tell is that HBase only implements
>>>>>>>>>>> "Start key" + "End key" because it has only 2 operations: "get" and "scan",
>>>>>>>>>>> and the querying is for the "scan" operation, where you want an interval (or
>>>>>>>>>>> all) of the rows. Does Kudu have more querying functionality?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Yes, Kudu implements a Scanner for querying data, along with
>>>>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On another topic, I am trying to install Kudu in standalone mode
>>>>>>>>>>> (all in 1 node). Do you use a Cloudera installation or do you have a
>>>>>>>>>>> standalone installation? How do you do it? I found some instructions, but
>>>>>>>>>>> they talk about compiling Kudu [11]. I was looking for something like
>>>>>>>>>>> HBase, where it is just unzip + execute "hbase start".
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> I am using an embedded mini-cluster which comes with compiled
>>>>>>>>>> binaries and can be used with maven[1] for testing my code. Once I get it
>>>>>>>>>> mature enough I think I will be testing the datastore with a docker
>>>>>>>>>> container [2]. I could not find a unzip+execute bundle either and I am
>>>>>>>>>> kinda noob for compiling it myself.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Good job and thank you!! :)
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1] - https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>>>>> [2] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>>>>> [3] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>>>>> [4] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>>>>> [5] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>>>>> [6] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>>>>> [7] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>>>>> [8] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>>>>> [9] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>>>>> [10] -
>>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<
>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>
>>>>>>>>>>>> As every week I updated my report in the Wiki[1]. Also, I
>>>>>>>>>>>> pushed my last commits to my branch [2]. Please give it a look if you have
>>>>>>>>>>>> time.
>>>>>>>>>>>>
>>>>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> Also, while reviewing the datastore interface I noticed this
>>>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>>>>> this method?, should I use the partition definition in the xml mapping file
>>>>>>>>>>>> for this?.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> John.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I received my first evaluation from the Google Summer of Code
>>>>>>>>>>>>> program with a positive result. Thanks so much for your support and
>>>>>>>>>>>>> confidence to the project and me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I
>>>>>>>>>>>>> pushed my last commits to my branch [2].
>>>>>>>>>>>>>
>>>>>>>>>>>>> This week, I will be reviewing the serialization/deserialization
>>>>>>>>>>>>> process in order to identify optimizations specific to
>>>>>>>>>>>>> Kudu, because I used generic methods from other backends which probably
>>>>>>>>>>>>> could be better tuned for kudu. Also, I will start working on the Queries
>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> BTW, I added a question to the wiki about Date types. Please
>>>>>>>>>>>>> give it a look if you have time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> John
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday. :D
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on June
>>>>>>>>>>>>>>> 28th. Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1]. Also,
>>>>>>>>>>>>>>>> I pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
>>>>>>>>>>>>>>>> datastores deal with flush(): should it always be executed manually?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Finally, this week I will be implementing object
>>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, exists. Do
>>>>>>>>>>>>>>>> you have any suggestions on how to proceed with this task?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But I
>>>>>>>>>>>>>>>>> noticed that you have created an Enumeration for datatypes [1], which is very
>>>>>>>>>>>>>>>>> similar to the kudu-client's [2]. You should probably replace [1] with [2]
>>>>>>>>>>>>>>>>> in order to avoid code duplication.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I
>>>>>>>>>>>>>>>>>> noticed that my code is lacking some javadoc documentation; I think I will
>>>>>>>>>>>>>>>>>> be working on that this week. Also, I would like to enable and check the schema
>>>>>>>>>>>>>>>>>> management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on your
>>>>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations on
>>>>>>>>>>>>>>>>>>>>    the mapping file.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with
>>>>>>>>>>>>>>>>>>>> 3.6.0, well past 3.1.1, and I assume everyone uses a fairly recent
>>>>>>>>>>>>>>>>>>>> version of Maven 3 :)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> - The version of a maven dependency [1] should go in
>>>>>>>>>>>>>>>>>>>>> the Dependency Management of the root pom [2]. The same applies to [3] and
>>>>>>>>>>>>>>>>>>>>> onwards; the version should not be set there.
>>>>>>>>>>>>>>>>>>>>> - Set the scope of test dependencies to test, at [4] and
>>>>>>>>>>>>>>>>>>>>> onwards.
>>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5].
>>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is the "Daytime Protocol" (RFC 867),
>>>>>>>>>>>>>>>>>>>>> and you will need root permission to run on it. The default port for kudu is
>>>>>>>>>>>>>>>>>>>>> 7051, isn't it?
>>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to
>>>>>>>>>>>>>>>>>>>>> load the mapping from configuration as in HBase's store [7] to your
>>>>>>>>>>>>>>>>>>>>> KuduStore [8]. This will have implications for your readMapping at [9], so
>>>>>>>>>>>>>>>>>>>>> take a look at the one for HBase at [10].
>>>>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
>>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java, since we have the checked ones) like in
>>>>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12].
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki [1].
>>>>>>>>>>>>>>>>>>>>>> This next week I think I should be focusing on the create schema operation
>>>>>>>>>>>>>>>>>>>>>> and solving the issue of the partitioning configurations in the mapping
>>>>>>>>>>>>>>>>>>>>>> file.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions, my last
>>>>>>>>>>>>>>>>>>>>>> commits are available here [2]
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>

Re: Kudu datastore reports

Posted by Alfonso Nishikawa <al...@gmail.com>.
Hi, John.

I get a core dump from the binary kudu server when trying to run the tests.
I didn't find a log file, but I will search more thoroughly later. Has this
ever happened to you? Does it happen to anyone else?

I am using Ubuntu 18.04

Thank you!

Regards,

Alfonso Nishikawa

On Sun, Aug 4, 2019 at 20:10, Furkan KAMACI <fu...@gmail.com> wrote:

> Hi John,
>
> I've already made my comments at your PR. Please check them carefully and
> ask me if you need help.
>
> For the documentation, I've checked what you've done. In addition, I
> would like to encourage you to write a blog post about your Kudu
> implementation and demonstrate an example of Kudu integration with Gora
> as a tutorial.
>
> Kind Regards,
> Furkan KAMACI
>
> On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com> wrote:
>
>> Hi all.
>>
>> I have updated my report in the Wiki[1].
>>
>> Also, I have sent a PR with my last commits for review [2]. Please give
>> it a look if you have time.
>>
>> This week, I will continue working on the documentation of the kudu
>> datastore.
>>
>> Please let me know if you have suggestions.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>> [2] https://github.com/apache/gora/pull/178
>>
>> Best,
>> John.
>>
>> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<ca...@gmail.com>)
>> wrote:
>>
>>> Hi John,
>>>
>>> Thanks for the update. I reviewed your code a little bit, and it is looking
>>> good. I think that you should send a PR in order to receive feedback from
>>> other community members.
>>>
>>> Best,
>>> Carlos
>>>
>>> On Sun, Jul 28, 2019 at 23:20, John Mora (<jh...@gmail.com>)
>>> wrote:
>>>
>>>> Hi all.
>>>>
>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits to
>>>> my branch [2]. Please give it a look if you have time.
>>>>
>>>> This week, I will give a look to the documentation of datastores.
>>>>
>>>> Please let me know if you have suggestions.
>>>>
>>>> [1]
>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>
>>>> Cheers,
>>>> John
>>>>
>>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<jh...@gmail.com>)
>>>> wrote:
>>>>
>>>>> Hi Alfonso,
>>>>>
>>>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I
>>>>> will find an alternative, since it is only a utility class.
>>>>>
>>>>> Thanks,
>>>>> John
>>>>>
>>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>
>>>>>> Hi, John.
>>>>>>
>>>>>> I checked out your code and it looks good :)
>>>>>> I found that you use javafx, which is not present in OpenJDK, so the
>>>>>> code fails to compile there. Since we don't tie ourselves to the Oracle
>>>>>> JVM, I would suggest changing it.
>>>>>>
>>>>>> Good job, keep it going :)
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Alfonso Nishikawa
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<jh...@gmail.com>)
>>>>>> wrote:
>>>>>>
>>>>>>> Hi.
>>>>>>>
>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
>>>>>>> to my branch [2]. Please give it a look if you have time.
>>>>>>>
>>>>>>> This week, I will give a look to the map reduce tests for DataStores.
>>>>>>>
>>>>>>> Please let me know if you have suggestions.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>
>>>>>>> Thanks,
>>>>>>> John
>>>>>>>
>>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<jh...@gmail.com>)
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi all
>>>>>>>>
>>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
>>>>>>>> to my branch [2]. Please give it a look if you have time.
>>>>>>>>
>>>>>>>> This week, I will be working on the getPartitions and deleteByQuery
>>>>>>>> methods and running the other tests in the DataStoreTestBase class.
>>>>>>>>
>>>>>>>> Please let me know if you have suggestions.
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> John.
>>>>>>>>
>>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<jh...@gmail.com>)
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Alfonso,
>>>>>>>>>
>>>>>>>>> Thanks so much for your time and support for this project. I will
>>>>>>>>> work on your comments. Responses inline :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi, John.
>>>>>>>>>>
>>>>>>>>>> Sorry for the delay, I am changing work and I have been very busy
>>>>>>>>>> :( I will try to answer your questions :)
>>>>>>>>>>
>>>>>>>>>> *> In the Employee example there is a field called 'dateOfBirth'.
>>>>>>>>>> I tried to map that field with the UNIXTIME_MICROS datatype of Kudu (I
>>>>>>>>>> intuitively assumed this is a date.). However, in the java world the
>>>>>>>>>> Employee field is a Long value and the kudu datatype is a Timestamp. So, I
>>>>>>>>>> was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>>>
>>>>>>>>>> Avro 1.8 introduced "Logical Types", so there is a "date"
>>>>>>>>>> type with an underlying "int" [1]. It's the first time I have read about
>>>>>>>>>> them, because they weren't there until the last version upgrade of Avro.
>>>>>>>>>> I would suggest ignoring "dates" and mapping dateOfBirth as long, since in
>>>>>>>>>> any case -in avro- the value is the unix epoch. After this first approach,
>>>>>>>>>> a design improvement would be great, though :)
>>>>>>>>>>
>>>>>>>>>> - Would it be good to have a "timestamp" type in the mapping, so that
>>>>>>>>>> KuduStore converts between the Entity long field <-> Kudu timestamp storage?
>>>>>>>>>> - Is there any other approach?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion is
>>>>>>>>> the best alternative right now, because I would add more compatible
>>>>>>>>> datatypes to the mapping parameters which users can use. And this
>>>>>>>>> conversion should not be difficult to implement in my opinion. Also, the new
>>>>>>>>> Date datatype of Avro could be implemented in newer versions because it
>>>>>>>>> would need further analysis in other datastores too. I will work on that.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *> What is Gora's policy regarding flush()? *
>>>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html> and
>>>>>>>>>> can also set a time interval
>>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>>>> for automatic flush.*
>>>>>>>>>> *> Should these behaviors be configurable using the gora.properties
>>>>>>>>>> file, or should I just use the default configuration?*
>>>>>>>>>>
>>>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>>>> gora.properties [2], which is used when the Table is instantiated, but at the same
>>>>>>>>>> time we implement the flush() method to force the flush [3]. I would
>>>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>>>> What flushing mode (and time interval, if it applies) do you suggest?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do the
>>>>>>>>> job for most use cases. But I will add a configuration in gora.properties
>>>>>>>>> for selecting the other modes and specifying an autoflush time if needed
>>>>>>>>> by the user.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *> Also, while reviewing the datastore interface I noticed this
>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>>> this method? Should I use the partition definition in the xml mapping file
>>>>>>>>>> for this?*
>>>>>>>>>>
>>>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache Gora
>>>>>>>>>> integrates with Hadoop by implementing a custom Map and Reduce that allow
>>>>>>>>>> Entities to be read and written directly.
>>>>>>>>>> You can take a look at HBase's implementation [4], which relies on o.a.h.hbase.mapreduce.TableInputFormatBase
>>>>>>>>>> [5] to compute the splits (start key---end key) together with the location of each
>>>>>>>>>> split, in order to create a collection of partitions [6].
>>>>>>>>>>
>>>>>>>>>> So, if Kudu allows computation to be performed on local Kudu
>>>>>>>>>> splits, then this method does the preparation needed to "send the
>>>>>>>>>> computation to where the data is locally".
>>>>>>>>>>
>>>>>>>>>> In any case, you can see that:
>>>>>>>>>>
>>>>>>>>>>    - MongoDB store implementation does not implement splitting
>>>>>>>>>>    [7]
>>>>>>>>>>    - Cassandra store implementation does not implement splitting
>>>>>>>>>>    [8]
>>>>>>>>>>    - Aerospike store implementation does not implement splitting
>>>>>>>>>>    [9]
>>>>>>>>>>    - Accumulo store implementation *does* implement splitting
>>>>>>>>>>    [10]
>>>>>>>>>>
>>>>>>>>>> If Kudu has a method to get the different splits for a table and
>>>>>>>>>> its locations, then you will be able to implement the full feature.
>>>>>>>>>>
>>>>>>>>>> This is Hadoop related and it is not trivial. I haven't
>>>>>>>>>> elaborated much, so if you find you need more information let me know :)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I will check whether Kudu has these features in order to implement
>>>>>>>>> this method. If not, I will use the default implementation found in other
>>>>>>>>> backends.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> About Queries, what I can tell is that HBase only implements
>>>>>>>>>> "Start key" + "End key" because it has only 2 operations: "get" and "scan",
>>>>>>>>>> and the querying is for the "scan" operation, where you want an interval (or
>>>>>>>>>> all) of the rows. Does Kudu have more querying functionality?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Yes, Kudu implements a Scanner for querying data, along with
>>>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On another topic, I am trying to install Kudu in standalone mode (all
>>>>>>>>>> on 1 node). Do you use a Cloudera installation or do you have a standalone
>>>>>>>>>> installation? How do you do it? I found some instructions, but they talk
>>>>>>>>>> about compiling Kudu [11]. I was looking for something like HBase, where it
>>>>>>>>>> is just unzip + execute "hbase start".
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I am using an embedded mini-cluster which comes with compiled
>>>>>>>>> binaries and can be used with Maven [1] for testing my code. Once I get it
>>>>>>>>> mature enough I think I will be testing the datastore with a Docker
>>>>>>>>> container [2]. I could not find an unzip+execute bundle either, and I am
>>>>>>>>> kind of a noob at compiling it myself.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Good job and thank you!! :)
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] - https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>>>> [2] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>>>> [3] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>>>> [4] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>>>> [5] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>>>> [6] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>>>> [7] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>>>> [8] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>>>> [9] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>>>> [10] -
>>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<jh...@gmail.com>)
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all.
>>>>>>>>>>>
>>>>>>>>>>> As every week, I updated my report in the Wiki[1]. Also, I pushed
>>>>>>>>>>> my last commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>>
>>>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Also, while reviewing the datastore interface I noticed this
>>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>>>> this method? Should I use the partition definition in the xml mapping file
>>>>>>>>>>> for this?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> John.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>
>>>>>>>>>>>> I received my first evaluation from the Google Summer of Code
>>>>>>>>>>>> program with a positive result. Thanks so much for your support and
>>>>>>>>>>>> confidence in the project and in me.
>>>>>>>>>>>>
>>>>>>>>>>>> I updated my report of this week in the Wiki[1]. Also, I pushed
>>>>>>>>>>>> my last commits to my branch [2].
>>>>>>>>>>>>
>>>>>>>>>>>> This week, I will be reviewing the serialization/deserialization
>>>>>>>>>>>> process in order to identify optimizations specific to
>>>>>>>>>>>> Kudu, because I used generic methods from other backends which probably
>>>>>>>>>>>> could be better tuned for Kudu. Also, I will start working on the Queries
>>>>>>>>>>>> implementation.
>>>>>>>>>>>>
>>>>>>>>>>>> BTW, I added a question to the wiki about Date types. Please
>>>>>>>>>>>> give it a look if you have time.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> John
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday. :D
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> John.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on June
>>>>>>>>>>>>>> 28th. Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1]. Also,
>>>>>>>>>>>>>>> I pushed my last commits to my branch [2].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
>>>>>>>>>>>>>>> datastores deal with flush(): should it always be executed manually?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Finally, this week I will be implementing object
>>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, exists. Do
>>>>>>>>>>>>>>> you have any suggestions on how to proceed with this task?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Footnote: Thanks for the feedback Carlos, I fixed the
>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Your last changes look good to me. Keep it up. But I
>>>>>>>>>>>>>>>> noticed that you have created an Enumeration for datatypes [1], which is very
>>>>>>>>>>>>>>>> similar to the kudu-client's [2]. You should probably replace [1] with [2]
>>>>>>>>>>>>>>>> in order to avoid code duplication.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I updated my report of this week on the Wiki[1]. I noticed
>>>>>>>>>>>>>>>>> that my code is lacking some javadoc documentation, so I think I will be
>>>>>>>>>>>>>>>>> working on that this week. Also, I would like to enable and check the schema
>>>>>>>>>>>>>>>>> management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on your
>>>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations on
>>>>>>>>>>>>>>>>>>>    the mapping file.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with
>>>>>>>>>>>>>>>>>>> 3.6.0, well above 3.1.1, and I assume everyone uses a fairly recent
>>>>>>>>>>>>>>>>>>> version of Maven 3 :)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - The version of a maven dependency [1] should go in
>>>>>>>>>>>>>>>>>>>> the Dependency Management section of the root pom [2]. The same applies to [3]
>>>>>>>>>>>>>>>>>>>> and the dependencies after it; the version should not be set there.
>>>>>>>>>>>>>>>>>>>> - Set the scope of the test dependencies to test, at [4] and the ones
>>>>>>>>>>>>>>>>>>>> after it.
>>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
>>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is the "Daytime Protocol" (RFC 867),
>>>>>>>>>>>>>>>>>>>> and you will need root permission to run on it. The default port for Kudu is
>>>>>>>>>>>>>>>>>>>> 7051, isn't it?
>>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to load
>>>>>>>>>>>>>>>>>>>> the mapping from configuration as in HBase's store [7] to your KuduStore
>>>>>>>>>>>>>>>>>>>> [8]. This will have implications for your readMapping at [9], so take a look
>>>>>>>>>>>>>>>>>>>> at the one for HBase at [10]
>>>>>>>>>>>>>>>>>>>> - I know it is done in other backends, but avoid
>>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java, since we have the checked ones) like in
>>>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12]
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki [1].
>>>>>>>>>>>>>>>>>>>>> This next week I think I should be focusing on the create schema operation
>>>>>>>>>>>>>>>>>>>>> and solving the issue of the partitioning configurations in the mapping
>>>>>>>>>>>>>>>>>>>>> file.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions; my last
>>>>>>>>>>>>>>>>>>>>> commits are available here [2]
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>

Re: Kudu datastore reports

Posted by Furkan KAMACI <fu...@gmail.com>.
Hi John,

I've already made my comments at your PR. Please check them carefully and
ask me if you need help.

For the documentation, I've checked what you've done. In addition, I
would like to encourage you to write a blog post about your Kudu
implementation and demonstrate an example of Kudu integration with Gora,
like a tutorial.

Kind Regards,
Furkan KAMACI

On Sun, Aug 4, 2019 at 1:59 AM John Mora <jh...@gmail.com> wrote:

> Hi all.
>
> I have updated my report in the Wiki[1].
>
> Also, I have sent a PR with my last commits for review [2]. Please give it
> a look if you have time.
>
> This week, I will continue working on the documentation of the kudu
> datastore.
>
> Please let me know if you have suggestions.
>
> [1]
> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> [2] https://github.com/apache/gora/pull/178
>
> Best,
> John.
>
> On Wed, Jul 31, 2019 at 11:17, carlos muñoz (<ca...@gmail.com>)
> wrote:
>
>> Hi John,
>>
>> Thanks for the update. I reviewed your code a little bit, and it is looking
>> good. I think that you should send a PR in order to receive feedback from
>> other community members.
>>
>> Best,
>> Carlos
>>
>> On Sun, Jul 28, 2019 at 23:20, John Mora (<jh...@gmail.com>)
>> wrote:
>>
>>> Hi all.
>>>
>>> I updated my report in the Wiki[1]. Also, I pushed my last commits to my
>>> branch [2]. Please give it a look if you have time.
>>>
>>> This week, I will give a look to the documentation of datastores.
>>>
>>> Please let me know if you have suggestions.
>>>
>>> [1]
>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>
>>> Cheers,
>>> John
>>>
>>> On Wed, Jul 24, 2019 at 11:34, John Mora (<jh...@gmail.com>)
>>> wrote:
>>>
>>>> Hi Alfonso,
>>>>
>>>> Yes, I was using the class javafx.util.Pair. It is not a problem; I
>>>> will find an alternative, since it is only a utility class.
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> On Tue, Jul 23, 2019 at 12:36, Alfonso Nishikawa (<
>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>
>>>>> Hi, John.
>>>>>
>>>>> I checked out your code and it looks good :)
>>>>> I found that you use javafx, but that is not present in OpenJDK and
>>>>> fails to compile, and since we don't stick to the Oracle JVM I would suggest
>>>>> changing it.
>>>>>
>>>>> Good job, keep it going :)
>>>>>
>>>>> Regards,
>>>>>
>>>>> Alfonso Nishikawa
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jul 20, 2019 at 22:25, John Mora (<jh...@gmail.com>)
>>>>> wrote:
>>>>>
>>>>>> Hi.
>>>>>>
>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits to
>>>>>> my branch [2]. Please give it a look if you have time.
>>>>>>
>>>>>> This week, I will give a look to the map reduce tests for DataStores.
>>>>>>
>>>>>> Please let me know if you have suggestions.
>>>>>>
>>>>>> [1]
>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>
>>>>>> Thanks,
>>>>>> John
>>>>>>
>>>>>> On Sat, Jul 13, 2019 at 19:31, John Mora (<jh...@gmail.com>)
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all
>>>>>>>
>>>>>>> I updated my report in the Wiki[1]. Also, I pushed my last commits
>>>>>>> to my branch [2]. Please give it a look if you have time.
>>>>>>>
>>>>>>> This week, I will be working on the getPartitions and deleteByQuery
>>>>>>> methods and running the other tests in the DataStoreTestBase class.
>>>>>>>
>>>>>>> Please let me know if you have suggestions.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>
>>>>>>> Best,
>>>>>>> John.
>>>>>>>
>>>>>>> On Wed, Jul 10, 2019 at 16:17, John Mora (<jh...@gmail.com>)
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Alfonso,
>>>>>>>>
>>>>>>>> Thanks so much for your time and support for this project. I will
>>>>>>>> work on your comments. Responses inline :)
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa (<
>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>
>>>>>>>>> Hi, John.
>>>>>>>>>
>>>>>>>>> Sorry for the delay, I am changing jobs and I have been very busy
>>>>>>>>> :( I will try to answer your questions :)
>>>>>>>>>
>>>>>>>>> *> In the Employee example there is a field called 'dateOfBirth'.
>>>>>>>>> I tried to map that field with the UNIXTIME_MICROS datatype of Kudu (I
>>>>>>>>> intuitively assumed this is a date.). However, in the java world the
>>>>>>>>> Employee field is a Long value and the kudu datatype is a Timestamp. So, I
>>>>>>>>> was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>>>>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>>>>>>
>>>>>>>>> Avro 1.8 introduced "Logical Types", so there is a "date"
>>>>>>>>> type with an underlying "int" [1]. It's the first time I have read about them, because
>>>>>>>>> they weren't there until the last Avro version upgrade. I would suggest
>>>>>>>>> ignoring "dates" and mapping dateOfBirth as a long, since in any case (in Avro)
>>>>>>>>> the value is the Unix epoch. After this first approach, a design
>>>>>>>>> improvement would be great, though :)
>>>>>>>>>
>>>>>>>>> - Would it be good to have a "timestamp" type in the mapping, so that
>>>>>>>>> KuduStore converts between the Entity long field <-> Kudu timestamp storage?
>>>>>>>>> - Is there any other approach?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think that the Entity long field <-> Kudu timestamp conversion is
>>>>>>>> the best alternative right now, because I would add more compatible
>>>>>>>> datatypes to the mapping parameters which users can use. And this
>>>>>>>> conversion should not be difficult to implement in my opinion. Also, the new
>>>>>>>> Date datatype of Avro could be implemented in newer versions because it
>>>>>>>> would need further analysis in other datastores too. I will work on that.
>>>>>>>>
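
A minimal sketch of the Entity long field <-> Kudu timestamp conversion discussed above. Kudu's UNIXTIME_MICROS type stores microseconds since the Unix epoch; the sketch assumes the entity's long field holds epoch milliseconds, which the thread does not state. The class TimestampMapping and its method names are hypothetical and are not part of Gora or the Kudu client.

```java
/**
 * Hypothetical helper (not part of Gora or the Kudu client) that converts
 * between an entity's long field and a Kudu UNIXTIME_MICROS value.
 * Assumption: the entity long holds milliseconds since the Unix epoch.
 */
final class TimestampMapping {

  private static final long MICROS_PER_MILLI = 1000L;

  private TimestampMapping() {
  }

  /** Entity long (epoch millis) -> microseconds for a UNIXTIME_MICROS column. */
  static long millisToMicros(long epochMillis) {
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    return Math.multiplyExact(epochMillis, MICROS_PER_MILLI);
  }

  /** UNIXTIME_MICROS value -> entity long (epoch millis); sub-millisecond digits are dropped. */
  static long microsToMillis(long epochMicros) {
    // floorDiv rounds toward negative infinity, so dates before 1970 round consistently.
    return Math.floorDiv(epochMicros, MICROS_PER_MILLI);
  }
}
```

A store could call millisToMicros when writing the field and microsToMillis when reading it back, so that the entity keeps its plain long while the column uses the timestamp type.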
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *> What is Gora's policy regarding flush()? *
>>>>>>>>> *> KuduClient has multiple flushing modes
>>>>>>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html> and
>>>>>>>>> can also set a time interval
>>>>>>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>>>>>>> for automatic flush.*
>>>>>>>>> *> Should these behaviors be configurable using the gora.properties
>>>>>>>>> file, or should I just use the default configuration?*
>>>>>>>>>
>>>>>>>>> What we do in HBase is configure an autoflush option in
>>>>>>>>> gora.properties [2], which is used when the Table is instantiated, but at the same
>>>>>>>>> time we implement the flush() method to force the flush [3]. I would
>>>>>>>>> suggest following that example, but adding the flushing options of Kudu.
>>>>>>>>> What flushing mode (and time interval, if it applies) do you suggest?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Well, IMHO the default flush mode (auto flush sync) will do the
>>>>>>>> job for most use cases. But I will add a configuration in gora.properties
>>>>>>>> for selecting the other modes and specifying an autoflush time if needed
>>>>>>>> by the user.
>>>>>>>>
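
A rough sketch of how the flush settings proposed above could be read from gora.properties, defaulting to AUTO_FLUSH_SYNC when nothing is configured. The property keys and the class KuduFlushSettings are made up for illustration; the nested enum is only a local stand-in that mirrors the constant names of Kudu's SessionConfiguration.FlushMode so that the example compiles without the Kudu client.

```java
import java.util.Locale;
import java.util.Properties;

/**
 * Hypothetical holder for Kudu flush settings read from gora.properties.
 * The property keys are assumptions, not existing Gora keys.
 */
final class KuduFlushSettings {

  /** Local stand-in mirroring the constant names of Kudu's SessionConfiguration.FlushMode. */
  enum FlushMode { AUTO_FLUSH_SYNC, AUTO_FLUSH_BACKGROUND, MANUAL_FLUSH }

  static final String FLUSH_MODE_KEY = "gora.datastore.kudu.flushMode";         // assumed key
  static final String FLUSH_INTERVAL_KEY = "gora.datastore.kudu.flushInterval"; // assumed key, in ms

  final FlushMode mode;
  /** Flush interval in milliseconds, or null when the user did not set one. */
  final Integer flushIntervalMillis;

  private KuduFlushSettings(FlushMode mode, Integer flushIntervalMillis) {
    this.mode = mode;
    this.flushIntervalMillis = flushIntervalMillis;
  }

  /** Reads both settings; an unknown mode or a non-numeric interval raises an exception. */
  static KuduFlushSettings fromProperties(Properties props) {
    String rawMode = props.getProperty(FLUSH_MODE_KEY);
    FlushMode mode = (rawMode == null)
        ? FlushMode.AUTO_FLUSH_SYNC
        : FlushMode.valueOf(rawMode.trim().toUpperCase(Locale.ROOT));
    String rawInterval = props.getProperty(FLUSH_INTERVAL_KEY);
    Integer interval = (rawInterval == null) ? null : Integer.valueOf(rawInterval.trim());
    return new KuduFlushSettings(mode, interval);
  }
}
```

In a real store the parsed mode would be mapped to the Kudu client's own enum and applied to the session, with the interval applied only when it is present; flush() would still force a flush regardless of the mode.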
>>>>>>>>
>>>>>>>>>
>>>>>>>>> *> Also, while reviewing the datastore interface I noticed this
>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>> this method? Should I use the partition definition in the xml mapping file
>>>>>>>>> for this?*
>>>>>>>>>
>>>>>>>>> The method getPartitions(Query) is related to Hadoop. Apache Gora
>>>>>>>>> integrates with Hadoop by implementing a custom Map and Reduce that allow
>>>>>>>>> Entities to be read and written directly.
>>>>>>>>> You can take a look at HBase's implementation [4], which relies on o.a.h.hbase.mapreduce.TableInputFormatBase
>>>>>>>>> [5] to compute the splits (start key---end key) together with the location of each
>>>>>>>>> split, in order to create a collection of partitions [6].
>>>>>>>>>
>>>>>>>>> So, if Kudu allows computation to be performed on local Kudu
>>>>>>>>> splits, then this method does the preparation needed to "send the
>>>>>>>>> computation to where the data is locally".
>>>>>>>>>
>>>>>>>>> In any case, you can see that:
>>>>>>>>>
>>>>>>>>>    - MongoDB store implementation does not implement splitting [7]
>>>>>>>>>    - Cassandra store implementation does not implement splitting
>>>>>>>>>    [8]
>>>>>>>>>    - Aerospike store implementation does not implement splitting
>>>>>>>>>    [9]
>>>>>>>>>    - Accumulo store implementation *does* implement splitting [10]
>>>>>>>>>
>>>>>>>>> If Kudu has a method to get the different splits for a table and
>>>>>>>>> its locations, then you will be able to implement the full feature.
>>>>>>>>>
>>>>>>>>> This is Hadoop related and it is not trivial. I haven't elaborated
>>>>>>>>> much, so if you find you need more information let me know :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I will check whether Kudu has these features in order to implement
>>>>>>>> this method. If not, I will use the default implementation found in other
>>>>>>>> backends.
>>>>>>>>
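
A simplified sketch of the fallback mentioned above, under the assumption that a store without splitting support returns a single partition that covers the whole query and carries no locality information. KeyRangeQuery and Partition are hypothetical stand-ins so that the example is self-contained; they are not Gora's own query and partition types.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical, self-contained illustration of a "no splitting" getPartitions fallback. */
final class SinglePartitionSketch {

  /** Stand-in for a query limited to a key range; a null bound means "unbounded". */
  static final class KeyRangeQuery {
    final String startKey;
    final String endKey;

    KeyRangeQuery(String startKey, String endKey) {
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  /** Stand-in for a partition: the sub-query to run plus the hosts that hold its data. */
  static final class Partition {
    final KeyRangeQuery query;
    final List<String> locations;

    Partition(KeyRangeQuery query, List<String> locations) {
      this.query = query;
      this.locations = locations;
    }
  }

  private SinglePartitionSketch() {
  }

  /** No splitting: one partition wrapping the original query, with no location hints. */
  static List<Partition> getPartitions(KeyRangeQuery query) {
    return Collections.singletonList(new Partition(query, Collections.<String>emptyList()));
  }
}
```

With this fallback a MapReduce job would get a single input split for the whole query; an implementation with real splitting would instead return one partition per key range, each with the hosts that serve it.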
>>>>>>>>
>>>>>>>>> About Queries, what I can tell is that HBase only implements
>>>>>>>>> "Start key" + "End key" because it has only 2 operations: "get" and "scan",
>>>>>>>>> and the querying is for the "scan" operation, where you want an interval (or
>>>>>>>>> all) of the rows. Does Kudu have more querying functionality?
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Yes, Kudu implements a Scanner for querying data, along with
>>>>>>>> conditional predicates for filtering. I am using those classes.
>>>>>>>>
>>>>>>>>
>>>>>>>>> On another topic, I am trying to install Kudu in standalone mode (all
>>>>>>>>> on 1 node). Do you use a Cloudera installation or do you have a standalone
>>>>>>>>> installation? How do you do it? I found some instructions, but they talk
>>>>>>>>> about compiling Kudu [11]. I was looking for something like HBase, where it
>>>>>>>>> is just unzip + execute "hbase start".
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I am using an embedded mini-cluster which comes with compiled
>>>>>>>> binaries and can be used with Maven [1] for testing my code. Once I get it
>>>>>>>> mature enough I think I will be testing the datastore with a Docker
>>>>>>>> container [2]. I could not find an unzip+execute bundle either, and I am
>>>>>>>> kind of a noob at compiling it myself.
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>>>>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>>>>>>
>>>>>>>>
>>>>>>>>> Good job and thank you!! :)
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] - https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>>>>>>> [2] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>>>>>>> [3] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>>>>>>> [4] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>>>>>>> [5] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>>>>>>> [6] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>>>>>>> [7] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>>>>>>> [8] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>>>>>>> [9] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>>>>>>> [10] -
>>>>>>>>> https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>>>>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<jh...@gmail.com>)
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all.
>>>>>>>>>>
>>>>>>>>>> As every week, I updated my report in the Wiki[1]. Also, I pushed
>>>>>>>>>> my last commits to my branch [2]. Please give it a look if you have time.
>>>>>>>>>>
>>>>>>>>>> This week, I will continue working on the Queries
>>>>>>>>>> implementation; please reach out to me if you have any suggestions.
>>>>>>>>>>
>>>>>>>>>> Also, while reviewing the datastore interface I noticed the
>>>>>>>>>> method 'getPartitions(Query<K, T> query)'. What is the expected behavior of
>>>>>>>>>> this method? Should I use the partition definition in the XML mapping file
>>>>>>>>>> for this?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> John.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<
>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all.
>>>>>>>>>>>
>>>>>>>>>>> I received my first evaluation from the Google Summer of Code
>>>>>>>>>>> program with a positive result. Thank you so much for your support and
>>>>>>>>>>> your confidence in the project and in me.
>>>>>>>>>>>
>>>>>>>>>>> I updated this week's report in the Wiki [1]. I also pushed
>>>>>>>>>>> my latest commits to my branch [2].
>>>>>>>>>>>
>>>>>>>>>>> This week, I will be reviewing the serialization/deserialization
>>>>>>>>>>> process in order to identify optimizations specific to Kudu, because
>>>>>>>>>>> I used generic methods from other backends that could probably be
>>>>>>>>>>> better tuned for Kudu. Also, I will start working on the Queries
>>>>>>>>>>> implementation.
>>>>>>>>>>>
>>>>>>>>>>> BTW, I added a question to the wiki about Date types. Please
>>>>>>>>>>> take a look if you have time.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> John
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<
>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Carlos.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the reminder. I submitted the form yesterday. :D
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> John.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 27, 2019 at 17:34, carlos muñoz (<
>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>
>>>>>>>>>>>>> The first Google Summer of Code evaluation is due on June
>>>>>>>>>>>>> 28th. Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<
>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> FYI, I updated my report of this week on the Wiki[1]. Also, I
>>>>>>>>>>>>>> pushed my last commits to my branch [2].
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As I mentioned in the reports, I would like to know how
>>>>>>>>>>>>>> datastores deal with flush(). Should it always be executed manually?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Finally, this week I will be implementing object
>>>>>>>>>>>>>> serialization/deserialization in the methods put, get, delete, and exists.
>>>>>>>>>>>>>> Do you have any suggestions on how to proceed with this task?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Footnote: Thanks for the feedback, Carlos. I fixed the
>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<
>>>>>>>>>>>>>> carlosrmng@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Your latest changes look good to me. Keep it up. However, I
>>>>>>>>>>>>>>> noticed that you have created an enumeration for datatypes [1], which is
>>>>>>>>>>>>>>> very similar to the kudu-client's [2]. You should probably replace [1]
>>>>>>>>>>>>>>> with [2] in order to avoid code duplication.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>> https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Carlos
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<
>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I updated this week's report on the Wiki [1]. I noticed
>>>>>>>>>>>>>>>> that my code is lacking some Javadoc documentation, so I think I will be
>>>>>>>>>>>>>>>> working on that this week. I would also like to enable and check the
>>>>>>>>>>>>>>>> schema management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> John.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<
>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks so much for your feedback. I am working on your
>>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>    - How to represent partitioning configurations on the
>>>>>>>>>>>>>>>>>>    mapping file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>    - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>>>>>>>    os-maven-plugin, which needs Maven 3.1.1+, is it a problem for Apache Gora?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I believe it is not a problem. My Ubuntu comes with
>>>>>>>>>>>>>>>>>> 3.6.0, well past 3.1.1, and I assume everyone uses a fairly recent
>>>>>>>>>>>>>>>>>> version of Maven 3 :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa (<
>>>>>>>>>>>>>>>>>> alfonso.nishikawa@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - The version of a Maven dependency [1] should go in the
>>>>>>>>>>>>>>>>>>> Dependency Management section of the root pom [2]. The same applies
>>>>>>>>>>>>>>>>>>> from [3] onward; the version should not be set there.
>>>>>>>>>>>>>>>>>>> - Set the test dependencies' scope to test, from [4]
>>>>>>>>>>>>>>>>>>> onward.
>>>>>>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5]
>>>>>>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>>>>>>> - Port 13 for Kudu? That is the "Daytime Protocol" (RFC 867),
>>>>>>>>>>>>>>>>>>> and you will need root permission to use it. The default port for Kudu is
>>>>>>>>>>>>>>>>>>> 7051, isn't it?
>>>>>>>>>>>>>>>>>>> - I would ask you to add the same functionality to load
>>>>>>>>>>>>>>>>>>> the mapping from configuration as in HBase's store [7] to your KuduStore
>>>>>>>>>>>>>>>>>>> [8]. This will have implications for your readMapping at [9], so take a
>>>>>>>>>>>>>>>>>>> look at the one for HBase at [10].
>>>>>>>>>>>>>>>>>>> - I know it is in other backends, but avoid
>>>>>>>>>>>>>>>>>>> RuntimeExceptions (at least in Java since we have the checked ones) like in
>>>>>>>>>>>>>>>>>>> [11]. You can wrap them in GoraException. An example is [12]
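
The last point above (wrapping unchecked exceptions in a checked one) could look roughly like the following minimal sketch. It is not taken from the Gora code base: `StoreException` is a hypothetical stand-in for Gora's checked `GoraException`, defined locally so the example compiles on its own, and the `readPort` helper and the `port` property key are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class MappingLoaderSketch {

  // Hypothetical stand-in for Gora's checked GoraException (assumption),
  // declared here only so the sketch has no external dependencies.
  static class StoreException extends Exception {
    StoreException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Hypothetical helper: reads the invented "port" key from properties text.
  // Low-level failures are wrapped in the checked StoreException instead of
  // escaping as (or being rethrown as) a RuntimeException.
  static int readPort(String propertiesText) throws StoreException {
    try {
      Properties props = new Properties();
      props.load(new StringReader(propertiesText));
      return Integer.parseInt(props.getProperty("port"));
    } catch (IOException | NumberFormatException e) {
      // Keep the original exception as the cause so no detail is lost.
      throw new StoreException("Invalid mapping configuration", e);
    }
  }

  public static void main(String[] args) {
    try {
      System.out.println(readPort("port=7051"));
      readPort("port=not-a-number");
    } catch (StoreException e) {
      System.out.println(e.getMessage() + " caused by "
          + e.getCause().getClass().getSimpleName());
    }
  }
}
```

Callers are then forced by the compiler to handle or declare the failure, and the original exception remains reachable through `getCause()`.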
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>>>>>>> Keep going, good job.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>>>>>>> [2] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>>>>>>> [3] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>>>>>>> [4] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>>>>>>> [5] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>>>>>>> [6] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>>>>>>> [7] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>>>>>>> [8] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>>>>>>> [9] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>>>>>>> [10] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>>>>>>> [11] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>>>>>>> [12] -
>>>>>>>>>>>>>>>>>>> https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<
>>>>>>>>>>>>>>>>>>> jhnmora000@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki [1].
>>>>>>>>>>>>>>>>>>>> This coming week I think I should focus on the create schema operation
>>>>>>>>>>>>>>>>>>>> and on solving the issue of the partitioning configurations in the
>>>>>>>>>>>>>>>>>>>> mapping file.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please let me know if you have suggestions. My latest
>>>>>>>>>>>>>>>>>>>> commits are available here [2].
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>