Posted to dev@predictionio.apache.org by "Pat Ferrel (JIRA)" <ji...@apache.org> on 2016/11/21 01:30:58 UTC

[jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

     [ https://issues.apache.org/jira/browse/PIO-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pat Ferrel updated PIO-45:
--------------------------
    Description: 
As integrated into the UR, in the integration test, the SelfCleaningDataSource erases all data. This feature works fine in the AML version of PIO.

Although not tested, one could assume that this would be true with any other Datasource in other templates.

[~emergentorder] can you check to see if the PIO merge was done correctly?

  was:
As integrated into the UR, in the integration test, the SelfCleaningDataSource erases all data. This feature works fine in the AML version of PIO.

Although not tested, one could assume that this would be true with any other Datasource in other templates.

[~amerritt] can you check to see if the PIO merge was done correctly?


> SelfCleaningDatasource erases all data
> --------------------------------------
>
>                 Key: PIO-45
>                 URL: https://issues.apache.org/jira/browse/PIO-45
>             Project: PredictionIO
>          Issue Type: Bug
>    Affects Versions: 0.10.0-incubating
>            Reporter: Pat Ferrel
>            Assignee: Alexander Merritt
>            Priority: Critical
>             Fix For: 0.11.0
>
>
> As integrated into the UR, in the integration test, the SelfCleaningDataSource erases all data. This feature works fine in the AML version of PIO.
> Although not tested, one could assume that this would be true with any other Datasource in other templates.
> [~emergentorder] can you check to see if the PIO merge was done correctly?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: [jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Cool.

BTW, since we have a source release, should we tag major bug fixes in master so it will be easier to advise users how to get a fix? This would also mean pushing them to master, of course. Sort of a very lightweight, unofficial release? We can reference commit numbers, but if a fix doesn’t get into master, it would be hard to point users to develop with much confidence.

Anyone have an opinion? Donald?


On Nov 24, 2016, at 8:44 AM, Alex Merritt <em...@apache.org> wrote:

So it looks like earlier in the process of fixing this for JDBC I broke it for HBase. Still not quite sure why, but it appears that inserting events without eventIds is the cause of the deletion. Regardless, I just moved the event id stripping to JDBCPEvents (to fix insert errors in JDBC). Also added a test case which fails before this fix.
Committed and pushed. Tests passed locally, Travis is running right now.
Will close the JIRA when I see it complete.

On Wed, Nov 23, 2016 at 11:42 AM, Alex Merritt <lecaran@gmail.com> wrote:
I first took a quick look at the merge, and it looked like the only (minor) divergence is in JDBC. And yet, I assume you are using HBase here.
As was I, when I was later able to reproduce the issue (using SelfCleaningDataSourceTest).

Will aim to track down & attempt a fix today / tomorrow.

Alex

On Mon, Nov 21, 2016 at 5:16 PM, Alex Merritt <lecaran@gmail.com> wrote:
Sure, I can try to reproduce this / take a look tomorrow.

Alex


On Nov 21, 2016 12:05 PM, "Pat Ferrel" <pat@occamsmachete.com> wrote:
Do you have time to look at this Alex? I may have made a mistake in merging this feature. At present any use of it erases all data. Since it is only used from templates we haven’t had one that used it except your integration test that should be merged with Apache-PIO. Can you at least run those to see if the problem is reproducible? Or tell me how to run those? It’s included in one of the example templates, right?


Re: [jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

Posted by Alex Merritt <em...@apache.org>.
So it looks like earlier in the process of fixing this for JDBC I broke it
for HBase. Still not quite sure why, but it appears that inserting events
without eventIds is the cause of the deletion. Regardless, I just moved the
event id stripping to JDBCPEvents (to fix insert errors in JDBC). Also
added a test case which fails before this fix.
Committed and pushed. Tests passed locally, Travis is running right now.
Will close the JIRA when I see it complete.

On Wed, Nov 23, 2016 at 11:42 AM, Alex Merritt <le...@gmail.com> wrote:

> I first took a quick look at the merge, and it looked like the only
> (minor) divergence is in JDBC. And yet, I assume you are using HBase here.
> As was I, when I was later able to reproduce the issue (using
> SelfCleaningDataSourceTest).
>
> Will aim to track down &
> attempt a fix today / tomorrow.
>
> Alex
>
> On Mon, Nov 21, 2016 at 5:16 PM, Alex Merritt <le...@gmail.com> wrote:
>
>> Sure, I can try to reproduce this / take a look tomorrow.
>>
>> Alex
>>
>> On Nov 21, 2016 12:05 PM, "Pat Ferrel" <pa...@occamsmachete.com> wrote:
>>
>>> Do you have time to look at this Alex? I may have made a mistake in
>>> merging this feature. At present any use of it erases all data. Since it is
>>> only used from templates we haven’t had one that used it except your
>>> integration test that should be merged with Apache-PIO. Can you at least
>>> run those to see if the problem is reproducible? Or tell me how to run
>>> those? It’s included in one of the example templates, right?

Re: [jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

Posted by Pat Ferrel <pa...@occamsmachete.com>.
True, I use HBase and saw the problem there. We’ve used the SelfCleaningDatasource in the ActionML fork successfully. It’s in the integration test for the Universal Recommender, which I was porting when I saw the test failure. I assume you are using the test integrated with the PIO release?

I guess eventually it would be nice to integrate the test into the new framework so it's done on builds. Chan and Marcin did the framework. I’m trying to figure out how to do the UR tests using the framework too.

Chan, Marcin, is there a how-to for running the test and integrating templates into the framework? Alex’s test is currently in the PIO release bits as a script that uses a built-in template in examples.


On Nov 23, 2016, at 9:42 AM, Alex Merritt <le...@gmail.com> wrote:

I first took a quick look at the merge, and it looked like the only (minor)
divergence is in JDBC. And yet, I assume you are using HBase here.
As was I, when I was later able to reproduce the issue (using
SelfCleaningDataSourceTest).

Will aim to track down &
attempt a fix today / tomorrow.

Alex

On Mon, Nov 21, 2016 at 5:16 PM, Alex Merritt <le...@gmail.com> wrote:

> Sure, I can try to reproduce this / take a look tomorrow.
> 
> Alex
> 
> On Nov 21, 2016 12:05 PM, "Pat Ferrel" <pa...@occamsmachete.com> wrote:
> 
>> Do you have time to look at this Alex? I may have made a mistake in
>> merging this feature. At present any use of it erases all data. Since it is
>> only used from templates we haven’t had one that used it except your
>> integration test that should be merged with Apache-PIO. Can you at least
>> run those to see if the problem is reproducible? Or tell me how to run
>> those? It’s included in one of the example templates, right?


Re: [jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

Posted by Alex Merritt <le...@gmail.com>.
I first took a quick look at the merge, and it looked like the only (minor)
divergence is in JDBC. And yet, I assume you are using HBase here.
As was I, when I was later able to reproduce the issue (using
SelfCleaningDataSourceTest).

Will aim to track down &
attempt a fix today / tomorrow.

Alex

On Mon, Nov 21, 2016 at 5:16 PM, Alex Merritt <le...@gmail.com> wrote:

> Sure, I can try to reproduce this / take a look tomorrow.
>
> Alex
>
> On Nov 21, 2016 12:05 PM, "Pat Ferrel" <pa...@occamsmachete.com> wrote:
>
>> Do you have time to look at this Alex? I may have made a mistake in
>> merging this feature. At present any use of it erases all data. Since it is
>> only used from templates we haven’t had one that used it except your
>> integration test that should be merged with Apache-PIO. Can you at least
>> run those to see if the problem is reproducible? Or tell me how to run
>> those? It’s included in one of the example templates, right?

Re: [jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

Posted by Alex Merritt <le...@gmail.com>.
Sure, I can try to reproduce this / take a look tomorrow.

Alex

On Nov 21, 2016 12:05 PM, "Pat Ferrel" <pa...@occamsmachete.com> wrote:

> Do you have time to look at this Alex? I may have made a mistake in
> merging this feature. At present any use of it erases all data. Since it is
> only used from templates we haven’t had one that used it except your
> integration test that should be merged with Apache-PIO. Can you at least
> run those to see if the problem is reproducible? Or tell me how to run
> those? It’s included in one of the example templates, right?

Re: [jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Do you have time to look at this Alex? I may have made a mistake in merging this feature. At present any use of it erases all data. Since it is only used from templates we haven’t had one that used it except your integration test that should be merged with Apache-PIO. Can you at least run those to see if the problem is reproducible? Or tell me how to run those? It’s included in one of the example templates, right?

