Posted to dev@phoenix.apache.org by Gabriel Reid <ga...@gmail.com> on 2014/05/26 10:46:44 UTC

CurrentSCN in Phoenix connections

Hi,

I've been trying out the CurrentSCN connection parameter recently, and
its behavior wasn't quite what I expected. I'd like to double-check
that the current behavior is in accordance with other people's
expectations, and if it isn't then I'll work on updating it.

There are two main issues I'm having. The first is that the existence
of a table (and I assume other table metadata) takes the CurrentSCN
into account. This means that if I connect to Phoenix, create a table,
and then re-connect with a CurrentSCN in the past, my newly-created
table won't be visible. My expectation would be that table metadata
(or at least table existence) is not versioned.

The second issue is that data updated on a connection with a given
CurrentSCN is not visible using the same CurrentSCN. For example, if a
connection has a CurrentSCN and I upsert a row, that row won't be
visible with a select statement unless I reconnect with a larger
CurrentSCN. I'm assuming that this is because time ranges in HBase are
end-exclusive, but my general expectation would be that within
Phoenix, an updated row would be visible under the same SCN.
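
The end-exclusive reasoning in the paragraph above can be sketched in a few lines of Java. This is only a model, not Phoenix or HBase code: the class and method names are hypothetical, and it assumes that an upsert on a connection with a CurrentSCN is stamped with that SCN, while a select on the same connection reads the half-open range [0, SCN).

```java
final class ScnVisibilitySketch {
    // Assumption: a write on an SCN connection is stamped with the SCN itself.
    static long writeTimestamp(long scn) {
        return scn;
    }

    // Assumption: a read at readScn covers [0, readScn), so the upper bound
    // is excluded, mirroring an end-exclusive time range.
    static boolean visible(long cellTimestamp, long readScn) {
        return cellTimestamp >= 0 && cellTimestamp < readScn;
    }

    public static void main(String[] args) {
        long scn = 1000L;
        long cellTs = writeTimestamp(scn);

        // Same SCN: the cell sits exactly on the excluded upper bound.
        System.out.println(visible(cellTs, scn));     // prints "false"

        // Larger SCN: the cell now falls inside the range.
        System.out.println(visible(cellTs, scn + 1)); // prints "true"
    }
}
```

Under these assumptions, the row only becomes visible once the reading connection uses an SCN strictly greater than the one used for the upsert.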

Could anyone give any insight as to whether this current
behavior is by design?

Thanks,

Gabriel

Re: CurrentSCN in Phoenix connections

Posted by James Taylor <ja...@apache.org>.
Yes, that'll work, but use MetaDataProtocol.MIN_SYSTEM_TABLE_TIMESTAMP to
ensure that you see the latest SYSTEM.CATALOG table.


On Thu, May 29, 2014 at 10:01 AM, Gabriel Reid <ga...@gmail.com>
wrote:

> Thanks for the info James. Indeed, it makes sense (once you know it).
>
> @Dan The idea of having two separate points in time sounds
> interesting, but I think you're also right that it might be getting
> pretty close to being overly complicated.
>
> @James Any idea on how much impact it would have if table creation was
> done with timestamp=0 by default, but all the rest was left as-is?
> This would allow seeing that the table exists if you connected with a
> CurrentSCN in the past, as well as allowing insertion of historical
> data.
>
> I was thinking of doing this manually (just connect with CurrentSCN=0
> to create tables, and then re-connect normally after that), but I'm
> wondering if it's at all possible (and at all useful) to have this as
> default behavior.
>
> - Gabriel

Re: CurrentSCN in Phoenix connections

Posted by Gabriel Reid <ga...@gmail.com>.
Thanks for the info James. Indeed, it makes sense (once you know it).

@Dan The idea of having two separate points in time sounds
interesting, but I think you're also right that it might be getting
pretty close to being overly complicated.

@James Any idea on how much impact it would have if table creation was
done with timestamp=0 by default, but all the rest was left as-is?
This would allow seeing that the table exists if you connected with a
CurrentSCN in the past, as well as allowing insertion of historical
data.

I was thinking of doing this manually (just connect with CurrentSCN=0
to create tables, and then re-connect normally after that), but I'm
wondering if it's at all possible (and at all useful) to have this as
default behavior.
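
A minimal Java sketch of the manual workaround described above, for illustration only. It builds the two sets of JDBC connection properties without connecting, since connecting needs the Phoenix driver on the classpath. The class and method names are hypothetical, as is the JDBC URL in the comment. The creation timestamp is a variable because another reply in this thread recommends MetaDataProtocol.MIN_SYSTEM_TABLE_TIMESTAMP rather than 0.

```java
import java.util.Properties;

final class ScnPropertiesSketch {
    // Connection property name as used in this thread.
    static final String CURRENT_SCN = "CurrentSCN";

    // Properties for a connection pinned to the given timestamp.
    static Properties withScn(long scn) {
        Properties props = new Properties();
        props.setProperty(CURRENT_SCN, Long.toString(scn));
        return props;
    }

    // Properties for a normal connection with no CurrentSCN set.
    static Properties withoutScn() {
        return new Properties();
    }

    public static void main(String[] args) {
        // 0 is the value proposed in this message; a reply in the thread
        // suggests MetaDataProtocol.MIN_SYSTEM_TABLE_TIMESTAMP instead.
        long creationTs = 0L;

        Properties ddlProps = withScn(creationTs); // step 1: create tables
        Properties normalProps = withoutScn();     // step 2: reconnect normally

        System.out.println(ddlProps.getProperty(CURRENT_SCN));    // prints "0"
        System.out.println(normalProps.containsKey(CURRENT_SCN)); // prints "false"

        // With the Phoenix JDBC driver available, each Properties object would
        // be passed to java.sql.DriverManager.getConnection(url, props), e.g.
        // with a URL such as "jdbc:phoenix:localhost" (hypothetical host).
    }
}
```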

- Gabriel




On Tue, May 27, 2014 at 4:38 AM, Dan Di Spaltro <da...@gmail.com> wrote:
> Does it make sense to potentially have an option for Table Metadata to be
> read at TS1 and the data to be read at TS2 or is that just too complicated
> as a user interface?
>
> -Dan

Re: CurrentSCN in Phoenix connections

Posted by Dan Di Spaltro <da...@gmail.com>.
Does it make sense to potentially have an option for Table Metadata to be
read at TS1 and the data to be read at TS2 or is that just too complicated
as a user interface?

-Dan


On Mon, May 26, 2014 at 5:49 PM, James Taylor <ja...@apache.org> wrote:

> Hi Gabriel,
> Yes, this is all WAD. We correlate the timestamp of table creation and
> alteration with the connection timestamp so that you can see the schema as
> it was at that point in time. That's what the data would match against (and
> is validated against) as well. So, yes, table metadata is essentially
> versioned.
>
> It's somewhat confusing when you're inserting data, but if a user
> understands how timestamps on KeyValues work, then this matches that
> behavior. The main use case is really for back-in-time querying.
>
> Thanks,
> James



-- 
Dan Di Spaltro

Re: CurrentSCN in Phoenix connections

Posted by James Taylor <ja...@apache.org>.
Hi Gabriel,
Yes, this is all WAD (working as designed). We correlate the timestamp of
table creation and alteration with the connection timestamp so that you can
see the schema as it was at that point in time. That's what the data would
match against (and is validated against) as well. So, yes, table metadata is
essentially versioned.

It's somewhat confusing when you're inserting data, but if a user
understands how timestamps on KeyValues work, then this matches that
behavior. The main use case is really for back-in-time querying.
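
As a rough illustration of versioned metadata and back-in-time querying, here is a small Java model. It is a sketch and not how Phoenix stores SYSTEM.CATALOG: the class, its methods, and the example column names are hypothetical, and it assumes that a schema version is visible only to connections whose timestamp is strictly greater than the version's timestamp.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SchemaHistorySketch {
    // Schema versions keyed by the timestamp of the CREATE or ALTER.
    private final TreeMap<Long, List<String>> versions = new TreeMap<>();

    // Record the full column list as of a CREATE TABLE or ALTER TABLE.
    void record(long timestamp, List<String> columns) {
        versions.put(timestamp, columns);
    }

    // Columns as seen by a connection at the given SCN, or null if the
    // table did not exist yet. lowerEntry is strictly-less-than, so a
    // version stamped exactly at the SCN is not visible (assumption).
    List<String> columnsAsOf(long scn) {
        Map.Entry<Long, List<String>> entry = versions.lowerEntry(scn);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        SchemaHistorySketch table = new SchemaHistorySketch();
        table.record(100L, Arrays.asList("ID"));         // table created
        table.record(200L, Arrays.asList("ID", "NAME")); // column added

        System.out.println(table.columnsAsOf(50L));  // prints "null"
        System.out.println(table.columnsAsOf(150L)); // prints "[ID]"
        System.out.println(table.columnsAsOf(250L)); // prints "[ID, NAME]"
    }
}
```

In this model, a connection with a timestamp before the table's creation sees no table at all, which corresponds to the first issue raised in the original message.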

Thanks,
James


On Mon, May 26, 2014 at 1:46 AM, Gabriel Reid <ga...@gmail.com> wrote:

> Hi,
>
> I've been trying out the CurrentSCN connection parameter recently, and
> its behavior wasn't quite what I expected. I'd like to double-check
> that the current behavior is in accordance with other people's
> expectations, and if it isn't then I'll work on updating it.
>
> There are two main issues I'm having. The first is that the existence
> of a table (and I assume other table metadata) takes the CurrentSCN
> into account. This means that if I connect to Phoenix, create a table,
> and then re-connect with a CurrentSCN in the past, my newly-created
> table won't be visible. My expectation would be that table metadata
> (or at least table existence) is not versioned.
>
> The second issue is that data updated on a connection with a given
> CurrentSCN is not visible using the same CurrentSCN. For example, if a
> connection has a CurrentSCN and I upsert a row, that row won't be
> visible with a select statement unless I reconnect with a larger
> CurrentSCN. I'm assuming that this is because time ranges in HBase are
> end-exclusive, but my general expectation would be that within
> Phoenix, an updated row would be visible under the same SCN.
>
> Could anyone give any insight as to whether this current
> behavior is by design?
>
> Thanks,
>
> Gabriel
>