Posted to derby-dev@db.apache.org by Kristian Waagan <Kr...@Sun.COM> on 2006/08/02 18:43:05 UTC

Enforcing length restrictions for streams of unknown length

Hello,

My initial work on the new JDBC4 length-less overloads is approaching 
completion, but I still have one issue that must be solved.

Currently, streams of unknown length are materialized to determine the 
length. This is the approach I have implemented in the client driver, 
for lack of a better solution at the moment. However, the approach is 
also used in the embedded driver, and there it is simply not good enough.

If I pass the stream down to the storage layer, bypassing the length 
checks done by the data type classes (SQLBinary, SQLBlob, etc.), the 
storage layer will insert all the data it can get. For instance, I can 
insert 3KB into a 2KB Blob column.
To solve this, I plan to wrap the user/application stream in a 
limit-stream. This stream will throw an exception if more data is read 
from it than the target column allows.
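
Roughly, the wrapper could look like the minimal sketch below. The class 
name and the use of a plain IOException are illustrative choices only, 
not existing Derby code; the real implementation would surface the error 
through Derby's usual exception machinery.

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Throws as soon as more bytes than the column maximum have been read.
 */
public class LimitInputStream extends FilterInputStream {

    private final long maxLength; // column maximum, in bytes
    private long bytesRead;       // bytes handed out so far

    public LimitInputStream(InputStream in, long maxLength) {
        super(in);
        this.maxLength = maxLength;
    }

    public int read() throws IOException {
        int b = super.read();
        if (b != -1 && ++bytesRead > maxLength) {
            throw new IOException(
                "stream exceeds column maximum of " + maxLength + " bytes");
        }
        return b;
    }

    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0 && (bytesRead += n) > maxLength) {
            throw new IOException(
                "stream exceeds column maximum of " + maxLength + " bytes");
        }
        return n;
    }
}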

In addition to the maximum-length issue, there is also the question of 
truncating trailing blanks. I don't yet fully understand what I have 
to change. Much of the functionality I need is already in place, but 
some changes might be required. For instance, the column width and 
whether truncation is allowed might need to be passed down to the 
limit-mechanism.
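
As a strawman, the blank-truncation case might hang off the same 
mechanism, along these lines. This is purely illustrative; the actual 
character-stream handling will depend on what is already in place:

import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

/**
 * Characters past the declared width are silently dropped if truncation
 * is allowed and they are all blanks; anything else is an error.
 */
public class LimitReader extends FilterReader {

    private final long width;            // declared column width
    private final boolean truncateBlanks;
    private long charsRead;

    public LimitReader(Reader in, long width, boolean truncateBlanks) {
        super(in);
        this.width = width;
        this.truncateBlanks = truncateBlanks;
    }

    public int read() throws IOException {
        int c = super.read();
        if (c == -1 || ++charsRead <= width) {
            return c;
        }
        // Past the column width: only trailing blanks may be dropped.
        while (truncateBlanks && c == ' ') {
            c = super.read();
        }
        if (c != -1) {
            throw new IOException(
                "value too long for column of width " + width);
        }
        return -1;
    }

    public int read(char[] buf, int off, int len) throws IOException {
        // Funnel bulk reads through read() so the check above applies.
        int i = 0;
        while (i < len) {
            int c = read();
            if (c == -1) {
                return i == 0 ? -1 : i;
            }
            buf[off + i++] = (char) c;
        }
        return i;
    }
}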

Questions, suggestions, and other feedback are appreciated!



Related issues are DERBY-1473 and DERBY-1417.
I plan to finish this for 10.2.



Thanks,
-- 
Kristian

Re: Enforcing length restrictions for streams of unknown length

Posted by Kristian Waagan <Kr...@Sun.COM>.
Mike Matrigali wrote:
> 
> 
> Mike Matrigali wrote:
>> This may or may not work, not sure.  Here is stuff to be aware of.
>> If you do this approach, the store will go ahead and insert and log
>> data into the database.  For it to work correctly you will
>> have to make sure that the resulting error from the limit at least
>> aborts the statement which is doing the insert/update.
>>
>> My guess is that you are going to get some sort of STORE exception
>> with the limit exception wrapped below it.  I would not be surprised
>> if the current store exception is more severe than you want it to
>> be as the current code does not expect this error - you may have
>> to define a new less severe error in this case.  There may be more
>> than one exception path.  Make sure to test the case where
>> the inserted blob exceeds the page size and the case where the
>> inserted blob is less than the page size. 
> 
> By severity I meant that store may currently raise a transaction or
> system level exception.  A good test would be to have a multi-statement
> transaction and make sure this error does not back out an earlier
> statement in the uncommitted transaction.
> 

Hello Mike,

Thank you very much for the information.
I think I can implement the upper layers of what I plan pretty fast.
I'll go ahead and see what happens. I was hoping to avoid making changes
in the store, but based on your information I fear that has to be done.

I appreciate the help on which test scenarios I should write tests for.
If you, or anyone else, have more of them, keep 'em coming :)

I'll report back on my findings as soon as possible, hopefully in a few
days.



Regards,
-- 
Kristian



Re: Enforcing length restrictions for streams of unknown length

Posted by Mike Matrigali <mi...@sbcglobal.net>.

Mike Matrigali wrote:
> This may or may not work, not sure.  Here is stuff to be aware of.
> If you do this approach, the store will go ahead and insert and log data 
> into the database.  For it to work correctly you will
> have to make sure that the resulting error from the limit at least
> aborts the statement which is doing the insert/update.
> 
> My guess is that you are going to get some sort of STORE exception
> with the limit exception wrapped below it.  I would not be surprised
> if the current store exception is more severe than you want it to
> be as the current code does not expect this error - you may have
> to define a new less severe error in this case.  There may be more
> than one exception path.  Make sure to test the case where
> the inserted blob exceeds the page size and the case where the
> inserted blob is less than the page size. 

By severity I meant that store may currently raise a transaction or
system level exception.  A good test would be to have a multi-statement
transaction and make sure this error does not back out an earlier 
statement in the uncommitted transaction.
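
Spelled out as a test, the scenario could look something like the sketch 
below. The table name, column size, and the reporting are made up for 
the example; it assumes the embedded driver and the statement-severity 
behavior we are after:

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class OverflowAbortsStatementOnly {

    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
        Connection conn =
            DriverManager.getConnection("jdbc:derby:testdb;create=true");
        conn.setAutoCommit(false);

        Statement s = conn.createStatement();
        s.execute("CREATE TABLE t (id INT, data BLOB(2K))");
        conn.commit();

        // Statement 1: a plain insert that must survive the failure below.
        s.execute("INSERT INTO t (id) VALUES (1)");

        // Statement 2: stream 3KB into the 2KB column without a length;
        // this should fail with a statement-level error only.
        PreparedStatement ps = conn.prepareStatement(
            "INSERT INTO t (id, data) VALUES (2, ?)");
        InputStream threeKB = new ByteArrayInputStream(new byte[3 * 1024]);
        ps.setBinaryStream(1, threeKB); // length-less JDBC4 overload
        try {
            ps.executeUpdate();
            System.out.println("FAIL: oversized insert succeeded");
        } catch (SQLException expected) {
            // Only the failed statement should have been rolled back.
        }

        conn.commit();
        ResultSet rs = s.executeQuery("SELECT COUNT(*) FROM t WHERE id = 1");
        rs.next();
        System.out.println(rs.getInt(1) == 1
            ? "PASS: earlier statement survived"
            : "FAIL: earlier statement was backed out");
        conn.close();
    }
}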


Re: Enforcing length restrictions for streams of unknown length

Posted by Mike Matrigali <mi...@sbcglobal.net>.
This may or may not work, not sure.  Here is stuff to be aware of.
If you do this approach, the store will go ahead and insert and log data 
into the database.  For it to work correctly you will
have to make sure that the resulting error from the limit at least
aborts the statement which is doing the insert/update.

My guess is that you are going to get some sort of STORE exception
with the limit exception wrapped below it.  I would not be surprised
if the current store exception is more severe than you want it to
be as the current code does not expect this error - you may have
to define a new less severe error in this case.  There may be more
than one exception path.  Make sure to test the case where
the inserted blob exceeds the page size and the case where the
inserted blob is less than the page size.  Offhand, I think this
is a new path for store; I can't think of any case where we expect
to get an exception while reading a stream for insert/update.
With user-defined types there used to be an exercised code path if
the user tried to READ more data than existed in the database -
again, this path is no longer exercised since those datatypes were
removed.
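
To make the severity point concrete, here is an illustrative sketch of 
rethrowing the limit violation at statement severity. None of this is 
existing Derby code; the constants merely mirror the severity ladder, 
and the copy loop stands in for wherever the store actually drains the 
stream:

import java.io.IOException;
import java.io.InputStream;

public class LimitViolation {

    // Severity ladder, lowest to highest (modeled on Derby's levels).
    static final int STATEMENT_SEVERITY   = 20000;
    static final int TRANSACTION_SEVERITY = 30000;
    static final int SYSTEM_SEVERITY      = 50000;

    static class StoreStreamException extends Exception {
        final int severity;
        StoreStreamException(String msg, Throwable cause, int severity) {
            super(msg, cause);
            this.severity = severity;
        }
    }

    // Where the store drains the column stream, a limit violation is
    // rethrown at statement severity so only the insert/update aborts,
    // not the whole transaction or system.
    static void copyColumn(InputStream in) throws StoreStreamException {
        try {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // ... write the bytes to the page here ...
            }
        } catch (IOException limit) {
            throw new StoreStreamException(
                "column value exceeds declared maximum",
                limit, STATEMENT_SEVERITY);
        }
    }
}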

This means that after your change a lot more data on disk may be
allocated to the file and the log than before your change.  Some of
this space may never be reclaimed if there are no subsequent inserts
or no explicit compresses.  Probably the worst case would be a user
bug where they defined a blob column with the default 2 gig length, and
somehow generated an infinite loop in the feeding stream - this would
then use 2+ gig of log file, grow the table to 2 gig, and then return
the error.

This is not a new problem; it is similar to how unique key violations
work today.  An insert into a table with many indexes will insert the
row, and may update several indexes before hitting the uniqueness
problem.  At that point all the work is aborted.

I do think that when there is a length we should do the checking up
front, rather than pay the abort and possible space-reclamation
penalty.  Unfortunately, that would mean multiple code paths through
the datatype insert/update stream handling.
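
Concretely, the split might look something like this (illustrative names 
only; LimitInputStream refers to the wrapper sketched earlier in the 
thread, and 22001 is the standard SQLSTATE for string data right 
truncation):

import java.io.InputStream;
import java.sql.SQLException;

public class StreamBinding {

    static InputStream bind(InputStream in, long declaredLength,
                            long columnMax) throws SQLException {
        if (declaredLength >= 0) {
            // Length known: reject up front, before any data hits the store.
            if (declaredLength > columnMax) {
                throw new SQLException("value too long for column", "22001");
            }
            return in;
        }
        // Length unknown: enforce the limit while the store reads.
        return new LimitInputStream(in, columnMax);
    }
}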

Kristian Waagan wrote:
> Hello,
> 
> My initial work on the new JDBC4 length-less overloads is approaching 
> completion, but I still have one issue that must be solved.
> 
> Currently, streams of unknown length are materialized to determine the 
> length. This is the approach I have implemented in the client driver, 
> for lack of a better solution at the moment. However, the approach is 
> also used in the embedded driver, and there it is simply not good enough.
> 
> If I pass the stream down to the storage layer, bypassing the length 
> checks done by the data type classes (SQLBinary, SQLBlob, etc.), the 
> storage layer will insert all the data it can get. For instance, I can 
> insert 3KB into a 2KB Blob column.
> To solve this, I plan to wrap the user/application stream in a 
> limit-stream. This stream will throw an exception if more data is read 
> from it than the target column allows.
> 
> In addition to the maximum-length issue, there is also the question of 
> truncating trailing blanks. I don't yet fully understand what I have 
> to change. Much of the functionality I need is already in place, but 
> some changes might be required. For instance, the column width and 
> whether truncation is allowed might need to be passed down to the 
> limit-mechanism.
> 
> Questions, suggestions, and other feedback are appreciated!
> 
> 
> 
> Related issues are DERBY-1473 and DERBY-1417.
> I plan to finish this for 10.2.
> 
> 
> 
> Thanks,